pandas.errors.ParserError: Error tokenizing data. C error: Expected 1 fields in line 4, saw 2
"D:\Program Files\Python36-32\python.exe" D:/PyCharm_Project/bishe/process/read_csv.py
Traceback (most recent call last):
File "D:/PyCharm_Project/bishe/process/read_csv.py", line 11, in <module>
df = pd.read_csv(csv_path)  # raises the error
File "D:\Program Files\Python36-32\lib\site-packages\pandas\io\parsers.py", line 676, in parser_f
return _read(filepath_or_buffer, kwds)
File "D:\Program Files\Python36-32\lib\site-packages\pandas\io\parsers.py", line 454, in _read
data = parser.read(nrows)
File "D:\Program Files\Python36-32\lib\site-packages\pandas\io\parsers.py", line 1133, in read
ret = self._engine.read(nrows)
File "D:\Program Files\Python36-32\lib\site-packages\pandas\io\parsers.py", line 2037, in read
data = self._reader.read(nrows)
File "pandas\_libs\parsers.pyx", line 860, in pandas._libs.parsers.TextReader.read
File "pandas\_libs\parsers.pyx", line 875, in pandas._libs.parsers.TextReader._read_low_memory
File "pandas\_libs\parsers.pyx", line 929, in pandas._libs.parsers.TextReader._read_rows
File "pandas\_libs\parsers.pyx", line 916, in pandas._libs.parsers.TextReader._tokenize_rows
File "pandas\_libs\parsers.pyx", line 2071, in pandas._libs.parsers.raise_parser_error
pandas.errors.ParserError: Error tokenizing data. C error: Expected 1 fields in line 4, saw 2
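The last line of the traceback says that the parser had settled on one column from the earlier lines and then met two fields on line 4. A minimal sketch that reproduces the same message, assuming made-up in-memory data (the `raw` string is hypothetical and is not the file from this post):

```python
import io

import pandas as pd

# Hypothetical data: the header and the next two lines hold one field each,
# but line 4 contains a comma and therefore splits into two fields.
raw = "a\nb\nc\nd,e\n"

try:
    pd.read_csv(io.StringIO(raw))
    message = ""
except pd.errors.ParserError as exc:
    # Keep the text of the error so it can be inspected.
    message = str(exc)

print(message)
```

The same mismatch appears whenever a later line contains more separators than the lines before it, whatever the reason.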
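After some searching, the cause appears to be a character-encoding problem introduced when I forced the file from one format to another (xlsx -> csv).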
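If "forcing" the format means that the .xlsx file was simply renamed to .csv (an assumption, since the post does not say how the conversion was done), the file is still a ZIP-based workbook and not text. A rough sketch of a check for that case, using a hypothetical helper name `looks_like_xlsx`:

```python
import zipfile


def looks_like_xlsx(path):
    """Return True if the file at `path` is a ZIP container, as .xlsx files are.

    Hypothetical helper: it only checks the container format, so any other
    ZIP file would also return True.
    """
    return zipfile.is_zipfile(path)
```

Calling `looks_like_xlsx(csv_path)` before `pd.read_csv(csv_path)` would flag a renamed workbook, which should then be re-saved as a real CSV.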
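Here is a short summary of ways to resolve errors caused by this kind of encoding problem:
- Save the file as CSV again (Save As).
- If the error was not caused by a forced conversion like mine, add the separator parameter. Change
df = pd.read_csv(csv_path)
to
df = pd.read_csv(csv_path, encoding='utf-8', sep='\t')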
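To illustrate the separator fix, here is a small sketch with made-up tab-separated data (the `raw` string is hypothetical). With the default comma separator, the comma inside one value makes that line look like two fields; with `sep="\t"` the columns are split where intended.

```python
import io

import pandas as pd

# Hypothetical tab-separated data; one value contains a comma.
raw = "name\tnote\nAlice\thello\nBob\thi, there\n"

# Default separator (comma): line 3 splits into two fields and parsing fails.
try:
    pd.read_csv(io.StringIO(raw))
    default_ok = True
except pd.errors.ParserError:
    default_ok = False

# Explicit tab separator: two columns, two rows.
df = pd.read_csv(io.StringIO(raw), sep="\t")
print(default_ok)
print(df)
```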
Or add this parameter:
df = pd.read_csv(csv_path, error_bad_lines=False)  # skip malformed lines; in pandas 1.3+ use on_bad_lines='skip'
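The `error_bad_lines` parameter was deprecated in pandas 1.3 in favour of `on_bad_lines` and removed in pandas 2.0. A sketch of a version-tolerant wrapper follows; the helper name `read_skipping_bad_lines` and the sample data are hypothetical.

```python
import io

import pandas as pd


def read_skipping_bad_lines(path_or_buffer):
    """Read CSV data, dropping rows that have too many fields."""
    try:
        # pandas 1.3 and later
        return pd.read_csv(path_or_buffer, on_bad_lines="skip")
    except TypeError:
        # Older pandas rejects the unknown keyword, so fall back
        # to the parameter used in this post.
        return pd.read_csv(path_or_buffer, error_bad_lines=False)


# Hypothetical data: the third line has three fields instead of two.
raw = "a,b\n1,2\n3,4,5\n6,7\n"
df = read_skipping_bad_lines(io.StringIO(raw))
print(df)
```

Skipping rows silently discards data, so it is worth checking how many lines were dropped before relying on the result.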
Or, alternatively, add this parameter:
df = pd.read_csv(csv_path, engine="python")  # use the Python parsing engine instead of the C engine
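One thing the Python engine can do that the C engine cannot is detect the separator when `sep=None` is passed. The sketch below goes slightly beyond the call shown in this post (adding `sep=None` is my assumption about a useful combination) and uses hypothetical semicolon-separated data.

```python
import io

import pandas as pd

# Hypothetical data that uses ";" instead of "," as the separator.
raw = "name;score\nAlice;1\nBob;2\n"

# sep=None asks the Python engine to sniff the delimiter from the data.
df = pd.read_csv(io.StringIO(raw), sep=None, engine="python")
print(df.columns.tolist())
```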
References
https://www.jianshu.com/p/be233bdb4dbf
https://blog.csdn.net/shuiyixin/article/details/88930359