《python数据分析基础》之数据库
一、向MySQL中插入CSV文件
1、首先安装本地MySQL
此步骤不再赘述
2、启动本地MySQL
在contana的搜索框张收入cmd,然后右键单击,并选择以管理员身份运行!
切换到C盘根目录:
C:\WINDOWS\system32>cd .. C:\Windows>cd .. C:\> C:\>D: D:\>cd D:\navicat\mysql-5.6.44-winx64\bin\ D:\navicat\mysql-5.6.44-winx64\bin>net start mysql MySQL 服务正在启动 . MySQL 服务已经启动成功。 D:\navicat\mysql-5.6.44-winx64\bin>mysql -uroot -p #这里直接enter键,进入MySQL
mysql>
进入MySQL后,进行以下操作:
创建数据库my_suppliers,选择创建的数据库,在新创建的数据库下新建一个数据表Suppliers,描述数据表,创建一个新用户clinton,向新用户授予所有权限。
CREATE DATABASE my_suppliers; Query OK, 1 row affected (0.05 sec) mysql> use my_suppliers; Database changed mysql> CREATE TABLE IF NOT EXISTS Suppliers -> (Suppliers VARCHAR(20), -> Invoice_Number VARCHAR(20), -> Part_Number VARCHAR(20), -> Cost FLOAT, -> Purchase_date date); Query OK, 0 rows affected (0.24 sec) mysql> describe Suppliers; +----------------+-------------+------+-----+---------+-------+ | Field | Type | Null | Key | Default | Extra | +----------------+-------------+------+-----+---------+-------+ | Suppliers | varchar(20) | YES | | NULL | | | Invoice_Number | varchar(20) | YES | | NULL | | | Part_Number | varchar(20) | YES | | NULL | | | Cost | float | YES | | NULL | | | Purchase_date | date | YES | | NULL | | +----------------+-------------+------+-----+---------+-------+ 5 rows in set (0.01 sec) mysql> CREATE USER 'clinton'@'localhost' IDENTIFIED BY 'secret_password'; Query OK, 0 rows affected (0.00 sec) mysql> GRANT ALL PRIVILEGES ON my_suppliers.* TO 'clinton'@'localhost'; Query OK, 0 rows affected (0.00 sec) mysql> FLUSH PRIVILEGES; Query OK, 0 rows affected (0.00 sec)
3、使用python向MySQL表中插入新记录
该脚本会将CSV文件的数据插入到MySQL中的Suppliers表中:
#!/usr/bin/env python3
import csv
import MySQLdb
from datetime import datetime,date
input_file = "F://python入门//数据1//CSV测试数据.csv"
#连接到数据库
con = MySQLdb.connect(host='localhost',port=3306,db='my_suppliers',user='clinton',\
passwd='secret_password')
#创建了一个光标
c = con.cursor()
#从CSV格式的输入文件中读取要加载到数据库中的数据,创建file_reader对象,用于存储CSV中的数据集
file_reader = csv.reader(open(input_file,'r',newline=''))
#从输入文件中读入第一行
header = next(file_reader)
#将输入的所有数据进行循环,先是每行循环,再是每列循环
for row in file_reader:
data = []
for column_index in range(len(header)):
if column_index < 4:
data.append(str(row[column_index]).lstrip('$').replace(',','').strip())
else:
a_date = datetime.date(datetime.strptime(str(row[column_index]),'%Y/%m/%d'))
a_date = a_date.strftime('%Y-%m-%d')
data.append(a_date)
print(data)
c.execute("""INSERT INTO Suppliers VALUES(%s,%s,%s,%s,%s)""",data)
#将修改保存到数据库
con.commit()
print('………………')
#执行选择所有数据的SQL
c.execute("SELECT * FROM Suppliers")
#返回结果集中的所有行,返回的是一个大的列表
rows = c.fetchall()
print(rows)
print('………………')
for row in rows:
row_list_output = []
for column_index in range(len(row)):
row_list_output.append(str(row[column_index]))
print(row_list_output)
MySQL中的结果集:
mysql> select * from Suppliers; +-------------+----------------+-------------+------+---------------+ | Suppliers | Invoice_Number | Part_Number | Cost | Purchase_date | +-------------+----------------+-------------+------+---------------+ | Suppliers X | 001-1001 | 2341 | 500 | 2019-01-20 | | Suppliers X | 001-1001 | 2341 | 500 | 2019-01-20 | | Suppliers X | 001-1001 | 5467 | 750 | 2019-01-20 | | Suppliers X | 001-1001 | 5467 | 750 | 2019-01-20 | | Suppliers Y | 50-9501 | 7009 | 250 | 2019-01-30 | | Suppliers Y | 50-9501 | 7009 | 250 | 2019-01-30 | | Suppliers Y | 50-9505 | 6650 | 125 | 2019-02-03 | | Suppliers Y | 50-9505 | 6650 | 125 | 2019-02-03 | | Suppliers Z | 920-4803 | 3321 | 615 | 2019-02-03 | | Suppliers Z | 920-4804 | 3321 | 615 | 2019-02-10 | | Suppliers Z | 920-4805 | 3321 | 615 | 2019-02-17 | | Suppliers Z | 920-4806 | 3321 | 615 | 2019-02-24 | +-------------+----------------+-------------+------+---------------+ 12 rows in set (0.00 sec)
Spyder右下角的输出:
['Suppliers X', '001-1001', '2341', '500.00', '2019-01-20'] ['Suppliers X', '001-1001', '2341', '500.00', '2019-01-20'] ['Suppliers X', '001-1001', '5467', '750.00', '2019-01-20'] ['Suppliers X', '001-1001', '5467', '750.00', '2019-01-20'] ['Suppliers Y', '50-9501', '7009', '250.00', '2019-01-30'] ['Suppliers Y', '50-9501', '7009', '250.00', '2019-01-30'] ['Suppliers Y', '50-9505', '6650', '125.00', '2019-02-03'] ['Suppliers Y', '50-9505', '6650', '125.00', '2019-02-03'] ['Suppliers Z', '920-4803', '3321', '615.00', '2019-02-03'] ['Suppliers Z', '920-4804', '3321', '615.00', '2019-02-10'] ['Suppliers Z', '920-4805', '3321', '615.00', '2019-02-17'] ['Suppliers Z', '920-4806', '3321', '615.00', '2019-02-24'] ……………… (('Suppliers X', '001-1001', '2341', 500.0, datetime.date(2019, 1, 20)), ('Suppliers X', '001-1001', '2341', 500.0, datetime.date(2019, 1, 20)), ('Suppliers X', '001-1001', '5467', 750.0, datetime.date(2019, 1, 20)), ('Suppliers X', '001-1001', '5467', 750.0, datetime.date(2019, 1, 20)), ('Suppliers Y', '50-9501', '7009', 250.0, datetime.date(2019, 1, 30)), ('Suppliers Y', '50-9501', '7009', 250.0, datetime.date(2019, 1, 30)), ('Suppliers Y', '50-9505', '6650', 125.0, datetime.date(2019, 2, 3)), ('Suppliers Y', '50-9505', '6650', 125.0, datetime.date(2019, 2, 3)), ('Suppliers Z', '920-4803', '3321', 615.0, datetime.date(2019, 2, 3)), ('Suppliers Z', '920-4804', '3321', 615.0, datetime.date(2019, 2, 10)), ('Suppliers Z', '920-4805', '3321', 615.0, datetime.date(2019, 2, 17)), ('Suppliers Z', '920-4806', '3321', 615.0, datetime.date(2019, 2, 24))) ……………… ['Suppliers X', '001-1001', '2341', '500.0', '2019-01-20'] ['Suppliers X', '001-1001', '2341', '500.0', '2019-01-20'] ['Suppliers X', '001-1001', '5467', '750.0', '2019-01-20'] ['Suppliers X', '001-1001', '5467', '750.0', '2019-01-20'] ['Suppliers Y', '50-9501', '7009', '250.0', '2019-01-30'] ['Suppliers Y', '50-9501', '7009', '250.0', '2019-01-30'] ['Suppliers Y', '50-9505', '6650', '125.0', '2019-02-03'] ['Suppliers Y', '50-9505', '6650', '125.0', '2019-02-03'] ['Suppliers Z', '920-4803', '3321', '615.0', '2019-02-03'] ['Suppliers Z', '920-4804', '3321', '615.0', '2019-02-10'] ['Suppliers Z', '920-4805', '3321', '615.0', '2019-02-17'] ['Suppliers Z', '920-4806', '3321', '615.0', '2019-02-24']
4、关于报错
在脚本中导入import MySQLdb,会报错,ModuleNotFoundError: No module named 'MySQLdb'
针对这一报错,解决方法:
打开anaconda prompt
pip install mysqlclient --安装mysqlclient
完成后,再执行以上脚本,就不会再报错了~
二、从MySQL表中将数据输出写入CSV文件
将目标数据写入CSV中:
#!/usr/bin/env python3 import csv import MySQLdb output_file = "F://python入门//数据1//CSV测试数据_yuan.csv" #连接到数据库 con = MySQLdb.connect(host='localhost',port=3306,db='my_suppliers',user='clinton',\ passwd='secret_password') #创建了一个光标 c = con.cursor() #打开将要写入的文件 filewriter = csv.writer(open(output_file,'w',newline=''),delimiter=',') #设置CSV文件列标题 header = ['Suppliers','Invoice_Number','Part_Number','Cost','Purchase_date'] #将列标题写入文件中 filewriter.writerow(header) #查询Suppliers表,并大于600美元的目标数据取出 c.execute("""select * from Suppliers where Cost>600.00;""") #返回结果集中的所有行 rows = c.fetchall() #将每行写入到目标文件 for row in rows: filewriter.writerow(row)
"CSV测试数据_yuan.csv"结果集:
三、更新MySQL表中的记录
"CSV测试数据.csv"源数据:
以源数据更新MySQL表中的内容:
#!/usr/bin/env python3 import csv import MySQLdb input_file = "F://python入门//数据1//CSV测试数据.csv" #连接到数据库 con = MySQLdb.connect(host='localhost',port=3306,db='my_suppliers',user='clinton',\ passwd='secret_password') #创建了一个光标 c = con.cursor() #打开要读取的文件 file_reader = csv.reader(open(input_file,'r',newline=''),delimiter=',') #读取标题行 header = next(file_reader,None) #遍历CSV文件的内容 for row in file_reader: data = [] for column_index in range(len(header)): data.append(str(row[column_index]).strip()) print(data) c.execute("""update Suppliers set Cost=%s,Purchase_Date=%s where Suppliers=%s;""",data) #将修改保存到数据库 con.commit() print('………………') c.execute("""select * from Suppliers;""") #返回结果集中的所有行 rows = c.fetchall() #将修改后的内容打印出来 for row in rows: output = [] for column_index in range(len(row)): output.append(str(row[column_index])) print(output)
MySQL中的表内容:
mysql> select * from Suppliers; +-------------+----------------+-------------+------+---------------+ | Suppliers | Invoice_Number | Part_Number | Cost | Purchase_date | +-------------+----------------+-------------+------+---------------+ | Suppliers X | 001-1001 | 2341 | 600 | 2019-01-22 | | Suppliers X | 001-1001 | 2341 | 600 | 2019-01-22 | | Suppliers X | 001-1001 | 5467 | 600 | 2019-01-22 | | Suppliers X | 001-1001 | 5467 | 600 | 2019-01-22 | | Suppliers Y | 50-9501 | 7009 | 200 | 2019-02-01 | | Suppliers Y | 50-9501 | 7009 | 200 | 2019-02-01 | | Suppliers Y | 50-9505 | 6650 | 200 | 2019-02-01 | | Suppliers Y | 50-9505 | 6650 | 200 | 2019-02-01 | | Suppliers Z | 920-4803 | 3321 | 615 | 2019-02-03 | | Suppliers Z | 920-4804 | 3321 | 615 | 2019-02-10 | | Suppliers Z | 920-4805 | 3321 | 615 | 2019-02-17 | | Suppliers Z | 920-4806 | 3321 | 615 | 2019-02-24 | +-------------+----------------+-------------+------+---------------+ 12 rows in set (0.00 sec)
Spyder右下角打印的内容:
['600.00', '2019-01-22', 'Suppliers X'] ['200.00', '2019-02-01', 'Suppliers Y'] ……………… ['Suppliers X', '001-1001', '2341', '600.0', '2019-01-22'] ['Suppliers X', '001-1001', '2341', '600.0', '2019-01-22'] ['Suppliers X', '001-1001', '5467', '600.0', '2019-01-22'] ['Suppliers X', '001-1001', '5467', '600.0', '2019-01-22'] ['Suppliers Y', '50-9501', '7009', '200.0', '2019-02-01'] ['Suppliers Y', '50-9501', '7009', '200.0', '2019-02-01'] ['Suppliers Y', '50-9505', '6650', '200.0', '2019-02-01'] ['Suppliers Y', '50-9505', '6650', '200.0', '2019-02-01'] ['Suppliers Z', '920-4803', '3321', '615.0', '2019-02-03'] ['Suppliers Z', '920-4804', '3321', '615.0', '2019-02-10'] ['Suppliers Z', '920-4805', '3321', '615.0', '2019-02-17'] ['Suppliers Z', '920-4806', '3321', '615.0', '2019-02-24']