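Extracting keywords from a file with the third-party jieba library
The previously crawled text of the novel 斗破苍穹 (Doupo Cangqiong) is stored as a TXT file.
Code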
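import jieba.analyse

# Path to the crawled novel text
path = 'C:/Users/Administrator/Desktop/bishe/doupo.text'
fp = open(path, 'r')
try:
    content = fp.read()
    # Load a custom stop-word list so that common filler words are ignored
    jieba.analyse.set_stop_words('C:/Users/Administrator/Desktop/bishe/aa.txt')
    # Take the 15 highest-weighted keywords together with their weights
    tags = jieba.analyse.extract_tags(content, topK=15, withWeight=True)
    for item in tags:
        # Print each keyword and its weight, scaled by 1000 and truncated to an integer
        print(item[0] + '\t' + str(int(item[1] * 1000)))
finally:
    fp.close()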
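To make the weights printed above easier to read: jieba.analyse.extract_tags ranks words by TF-IDF, that is, how often a word occurs in the text multiplied by how rare it is in general. The following is only a rough sketch of that idea, not jieba's actual implementation. The function tfidf_top_k, the token list, the IDF values and default_idf are all hypothetical and chosen for illustration; jieba itself segments the text and ships its own IDF table.

```python
from collections import Counter


def tfidf_top_k(words, idf, top_k=3, default_idf=10.0):
    """Rank words by term frequency times inverse document frequency.

    `idf` maps a word to its IDF value; `default_idf` is an assumed
    fallback for words missing from that table.
    """
    counts = Counter(words)
    total = sum(counts.values())
    scores = {
        word: (count / total) * idf.get(word, default_idf)
        for word, count in counts.items()
    }
    # Highest score first, keep only the top_k entries
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)[:top_k]


# Hypothetical, already-segmented tokens and made-up IDF values
words = ["萧炎", "斗气", "萧炎", "的", "斗气", "萧炎", "的", "的"]
idf = {"萧炎": 12.0, "斗气": 9.0, "的": 0.5}

for word, weight in tfidf_top_k(words, idf):
    # Same output format as the script above: word, tab, weight * 1000
    print(word + "\t" + str(int(weight * 1000)))
```

In this toy input "的" occurs as often as "萧炎" but ends up last because its IDF is low. The stop-word list in the script above goes one step further and removes such words entirely.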
Result