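Reading a CSV file with Pandas and replacing ASCII commas

Background: I have a large CSV file with several million rows. The data quality is poor, so it cannot be written directly to a database. Sample rows: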

1,HR00001,bigolin-03,03,,,,,"*,可以",100,,,,

2,HR00002,bigolin-06,06,,,,,"12.23,备份",340,,,,

 

Goal: replace the ASCII commas inside the double-quoted fields with full-width Chinese commas (,).

The code is as follows:

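import pandas as pd

# The file is GBK-encoded; its header row must provide the id, name and school columns
data = pd.read_csv('20220324.csv', encoding='gbk')
p = data[['id']]
q = data[['name']]
r = data[['school']]
# Use a regular expression to replace each ASCII comma with a full-width comma
s = r.replace('[,]', ',', regex=True)
g = p.join(q)
o = g.join(r)  # original school column, kept for comparison
w = g.join(s)  # school column with the commas replaced
w.head()
# w.to_csv('20220324_new.csv')  # export the modified CSV file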

 

 

 

Shared example 1:

Reading and writing a CSV file with Python's built-in csv module:

The code is as follows:

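import csv

# Read the CSV file, keeping only the first two columns of each row
mydatalist = []
# encoding='gbk' matches the pandas example; the csv module expects newline=''
with open('20220324.csv', encoding='gbk', newline='') as f:
    reader = csv.reader(f)
    for row in reader:
        templist = [row[0], row[1]]
        mydatalist.append(templist)
print(mydatalist)

# Write the rows to a new file so that the source file is not overwritten
with open('20220324_new.csv', 'w', encoding='gbk', newline='') as f:
    writer = csv.writer(f)
    for row in mydatalist:
        writer.writerow(row)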

 

Shared example 2: converting a CSV file to a list with Python

The code is as follows:

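import csv

# 'r' opens the file for reading; 'w' opens it for writing and truncates any existing content
with open('20220324.csv', 'r', encoding='gbk', newline='') as data:
    reader_data = csv.reader(data)
    data_list = list(reader_data)

print(data_list)

print(data_list[0])
print(data_list[1])
print(data_list[2])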
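Calling `list()` on the reader loads every row into memory, which can be costly for a file with millions of rows. If only the first few rows are needed for inspection, something like the following sketch may be enough. The helper name `head_rows` is hypothetical, and the sample is one of the rows shown at the top of the post.

```python
import csv
import io
from itertools import islice

def head_rows(f, n=3):
    """Return the first n parsed rows without reading the rest of the file."""
    return list(islice(csv.reader(f), n))

sample = io.StringIO('1,HR00001,bigolin-03,03,,,,,"*,可以",100,,,,\n')
rows = head_rows(sample, n=1)
# The quoted field is parsed as a single value, comma included
print(rows[0][8])
```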

 

 

 

A few good tutorial videos that I also recommend:

Reading a CSV file with Python: https://www.bilibili.com/video/BV1Ti4y197Kd?spm_id_from=333.337.search-card.all.click

Reading a CSV file with Python and writing it back: https://www.bilibili.com/video/BV1dq4y1A7Yc?spm_id_from=333.999.0.0

Reading a very large file (100 GB) in chunks with Pandas: https://www.bilibili.com/video/BV1qR4y1g787?spm_id_from=333.337.search-card.all.click

Processing a CSV file with Python's replace function: https://www.bilibili.com/video/BV11741157iP?spm_id_from=333.337.search-card.all.click

posted @ 2022-03-24 21:57  明明就-