Character encoding error when reading a CSV file
"D:\Program Files\Python36-32\python.exe" D:/PyCharm_Project/bishe/process/read_csv.py
Traceback (most recent call last):
File "pandas\_libs\parsers.pyx", line 1130, in pandas._libs.parsers.TextReader._convert_tokens
File "pandas\_libs\parsers.pyx", line 1254, in pandas._libs.parsers.TextReader._convert_with_dtype
File "pandas\_libs\parsers.pyx", line 1269, in pandas._libs.parsers.TextReader._string_convert
File "pandas\_libs\parsers.pyx", line 1459, in pandas._libs.parsers._string_box_utf8
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xc0 in position 2: invalid start byte
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "D:/PyCharm_Project/bishe/process/read_csv.py", line 12, in <module>
df = pd.read_csv(csv_path)
File "D:\Program Files\Python36-32\lib\site-packages\pandas\io\parsers.py", line 676, in parser_f
return _read(filepath_or_buffer, kwds)
File "D:\Program Files\Python36-32\lib\site-packages\pandas\io\parsers.py", line 454, in _read
data = parser.read(nrows)
File "D:\Program Files\Python36-32\lib\site-packages\pandas\io\parsers.py", line 1133, in read
ret = self._engine.read(nrows)
File "D:\Program Files\Python36-32\lib\site-packages\pandas\io\parsers.py", line 2037, in read
data = self._reader.read(nrows)
File "pandas\_libs\parsers.pyx", line 860, in pandas._libs.parsers.TextReader.read
File "pandas\_libs\parsers.pyx", line 875, in pandas._libs.parsers.TextReader._read_low_memory
File "pandas\_libs\parsers.pyx", line 952, in pandas._libs.parsers.TextReader._read_rows
File "pandas\_libs\parsers.pyx", line 1084, in pandas._libs.parsers.TextReader._convert_column_data
File "pandas\_libs\parsers.pyx", line 1137, in pandas._libs.parsers.TextReader._convert_tokens
File "pandas\_libs\parsers.pyx", line 1254, in pandas._libs.parsers.TextReader._convert_with_dtype
File "pandas\_libs\parsers.pyx", line 1269, in pandas._libs.parsers.TextReader._string_convert
File "pandas\_libs\parsers.pyx", line 1459, in pandas._libs.parsers._string_box_utf8
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xc0 in position 2: invalid start byte
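The traceback says byte 0xc0 cannot start a UTF-8 sequence, which usually means the file was saved in some other encoding. On a Chinese-locale Windows machine GBK is a common candidate, but that is an assumption here, since the original file is not shown. A minimal sketch that reproduces the same message with made-up sample bytes (the `raw` value and the `try_decode` helper are hypothetical, for illustration only):

```python
# Hypothetical sample bytes, NOT taken from the original CSV:
# two ASCII letters followed by a two-byte sequence whose lead byte is 0xC0.
raw = b"ab\xc0\xb4"


def try_decode(data, encoding):
    """Return (text, None) on success or (None, error) on failure."""
    try:
        return data.decode(encoding), None
    except UnicodeDecodeError as exc:
        return None, exc


# UTF-8 rejects 0xC0 as a start byte, so this attempt fails at position 2.
text_utf8, err_utf8 = try_decode(raw, "utf-8")

# GB18030 (a superset of GBK) accepts the same bytes.
text_gbk, err_gbk = try_decode(raw, "gb18030")

print(err_utf8)              # the UnicodeDecodeError from the UTF-8 attempt
print(text_gbk is not None)  # whether the GB18030 attempt succeeded
```

If the real file behaves the same way, the problem is the encoding of the data rather than anything in the path or the parser.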
Here's the fix:
import pandas as pd

if __name__ == '__main__':
    csv_path = r'E:/data_backup/shuju/1540880931324.csv'
    # df = pd.read_csv(csv_path)  # raises UnicodeDecodeError
    df = pd.read_csv(csv_path, engine="python")  # no error
    print(df.head(10))
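Switching the engine makes the error go away here, but if the underlying cause is a non-UTF-8 file, naming the encoding explicitly may be more portable across machines. The sketch below is one possible approach, not the method from the referenced articles: it tries a list of candidate encodings and passes the first one that decodes to `pd.read_csv` via its `encoding` parameter. The helper names `pick_encoding` and `read_csv_with_fallback`, and the candidate list itself, are assumptions made for illustration.

```python
from pathlib import Path


def pick_encoding(path, candidates=("utf-8", "gb18030")):
    """Return the first candidate encoding that decodes the whole file.

    The candidate list is an assumption; adjust it to the encodings
    you actually expect. The file is read fully into memory.
    """
    raw = Path(path).read_bytes()
    for enc in candidates:
        try:
            raw.decode(enc)
        except UnicodeDecodeError:
            continue  # this encoding does not fit, try the next one
        return enc
    raise ValueError("none of the candidate encodings could decode " + str(path))


def read_csv_with_fallback(path, candidates=("utf-8", "gb18030")):
    """Read a CSV with pandas using the first encoding that fits."""
    import pandas as pd  # imported here so pick_encoding works without pandas

    return pd.read_csv(path, encoding=pick_encoding(path, candidates))
```

Usage would look like `df = read_csv_with_fallback(csv_path)`. Note that a successful decode only shows the bytes are valid in that encoding, not that it is the one the file was written in, so it is worth checking a few rows by eye.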
Aside: the version of "Jasmine Flower" (《茉莉花》) by the group 黑鸭子 (Black Duck) is lovely. Give it a listen when you're tired!
References
https://www.jb51.net/article/142060.htm
https://www.cnblogs.com/zhanshan/archive/2018/07/26/9370032.html