Python: removing data outside the UTF-8 range

Error: SyntaxError: Non-UTF-8 code starting with '\x..' in file ...

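# Take the unrecognized bytes from the error "Incorrect string value: '\\xF0\\xAB\\x96\\xAF\\xE7\\x9A...'"
# and replace them with ?. Only the leading 4-byte character is matched: the trailing
# \xE7\x9A in the message is the start of the next, valid character and must stay intact.
# data is assumed to be a pandas DataFrame with a string column 'intro'.
errorbytes = [b'\xF0\xAB\x96\xAF', b'\xF0\xA8\xA8\x97']
for eb in errorbytes:
    data['intro'] = [x.encode('utf8', errors='replace').replace(eb, b'?').decode('utf8', errors='replace')
                     for x in list(data['intro'])]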
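
Listing every offending byte sequence by hand gets tedious once more than a few rows fail. Below is a minimal sketch of a more general alternative. It assumes the "Incorrect string value" error comes from a storage layer that cannot hold 4-byte UTF-8 characters (for example a MySQL column using the 3-byte utf8 charset); that cause is an assumption, not something the error text above confirms. The helper name replace_non_bmp is hypothetical.

```python
def replace_non_bmp(text, placeholder="?"):
    """Replace every character above U+FFFF with placeholder.

    Characters above U+FFFF are exactly the ones that need 4 bytes in UTF-8.
    """
    return "".join(ch if ord(ch) <= 0xFFFF else placeholder for ch in text)


# U+2B5AF encodes to b'\xF0\xAB\x96\xAF', the first sequence from the error above.
sample = "a\U0002B5AF的b"
cleaned = replace_non_bmp(sample)
print(cleaned)  # a?的b
```

Applied to the column from the snippet above, this would read data['intro'] = [replace_non_bmp(x) for x in data['intro']]. Working on decoded characters rather than raw bytes avoids cutting a multi-byte character in half.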

 

posted @ 2019-09-19 10:18  糖醋排骨加辣椒