Extracting text from TXT files in multiple encodings!
Enough talk, straight to the code:
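import chardet

# Extract the contents of a txt file
def parseTxt(filename):
    # Read the raw bytes and close the file before detecting the encoding
    with open(filename, "rb") as f:
        raw = f.read()
    # chardet sets 'encoding' to None when it cannot decide,
    # so .get('encoding', 'utf-8') alone would not fall back to utf-8
    encoding = chardet.detect(raw).get("encoding") or "utf-8"
    with open(filename, "r", encoding=encoding) as f:
        texts = f.readlines()
    return {
        # Guard against empty files, where texts[0] would raise IndexError
        "title": texts[0][:100] if texts else "",
        "content": texts
    }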
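Detection by chardet is a statistical guess, so it can be wrong or inconclusive, especially on very short files. As a rough sketch of an alternative that needs only the standard library, the raw bytes can be tried against a fixed list of candidate encodings in order. The names `CANDIDATE_ENCODINGS`, `decode_with_fallback`, and `read_text_with_fallback` are made up for illustration and are not part of the original code; the candidate list is an assumption and should match the encodings you actually expect.

```python
# Hypothetical candidate list: order matters, stricter encodings come first.
# "utf-8-sig" also accepts plain UTF-8 and strips a leading BOM if present.
CANDIDATE_ENCODINGS = ("utf-8-sig", "gb18030")


def decode_with_fallback(raw, encodings=CANDIDATE_ENCODINGS):
    """Return (text, encoding) for the first candidate that decodes cleanly."""
    for enc in encodings:
        try:
            return raw.decode(enc), enc
        except UnicodeDecodeError:
            continue
    # Last resort: latin-1 maps every byte to a character, so decoding
    # never fails, although the result may not be the intended text.
    return raw.decode("latin-1"), "latin-1"


def read_text_with_fallback(filename):
    """Read a file as bytes and decode it with the fallback chain above."""
    with open(filename, "rb") as f:
        raw = f.read()
    return decode_with_fallback(raw)
```

Something like `text, used = read_text_with_fallback("demo.txt")` would then return both the decoded text and the encoding that worked, which can be handy for logging files that needed the fallback.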
This article is from cnblogs (博客园), author: 数据驱动. When reposting, please cite the original link: https://www.cnblogs.com/shun7man/p/14341650.html