Python tools: wordcloud
Generating word clouds
Installing the wordcloud module
pip install -i https://pypi.tuna.tsinghua.edu.cn/simple/ wordcloud
Building a word cloud from a single repeated word
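import numpy as np
from wordcloud import WordCloud

text = "square"

# Build a 300x300 mask: pixels outside a circle of radius 130 are set to 255 (masked out)
x, y = np.ogrid[:300, :300]
mask = (x - 150) ** 2 + (y - 150) ** 2 > 130 ** 2
mask = 255 * mask.astype(int)

# repeat=True reuses the word until max_words or min_font_size is reached
wc = WordCloud(background_color="white", repeat=True, mask=mask)
wc.generate(text)
wc.to_file('wc.png')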
Generating a word cloud from a single sentence
from wordcloud import WordCloud

wc = WordCloud()  # create the word cloud object
# generate the word cloud from the text
wc.generate('This is not the end. It is not even the beginning of the end. But it is, perhaps, the end of the beginning.')
wc.to_file('wc.png')  # save the word cloud as an image
Generating a word cloud from a txt file
import os
from os import path
from wordcloud import WordCloud
import matplotlib.pyplot as plt

# Use the script's directory, or the current working directory when run interactively
d = path.dirname(__file__) if "__file__" in locals() else os.getcwd()

# Read the whole text file
with open(path.join(d, 'test.txt')) as f:
    text = f.read()

wordcloud = WordCloud(max_font_size=40).generate(text)

# Display the word cloud without axes
plt.figure()
plt.imshow(wordcloud, interpolation="bilinear")
plt.axis("off")
plt.show()
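Generating a word cloud file takes three steps:
1. Configure the object's parameters
2. Load the text for the word cloud
3. Output the word cloud file (if no size is specified, the image defaults to 400 x 200 pixels)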
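wordcloud computes word frequencies in the following steps:
1. Split: separate the text into words on spaces
2. Count: count how often each word occurs, then filter
3. Font: assign each word a font size based on its count
4. Layout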
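The first three steps above can be sketched with the standard library alone. This is only an illustration, not wordcloud's actual implementation: the stop-word set, the function names word_frequencies and font_sizes, and the linear scaling of font sizes are all assumptions made for this example, and the layout step is left out.

```python
import string
from collections import Counter

# Hypothetical stop-word set for illustration only (an assumption, not wordcloud's list)
STOPWORDS = {"the", "is", "it", "of", "not", "but", "this", "even"}


def word_frequencies(text, stopwords=STOPWORDS):
    """Steps 1 and 2: split on whitespace, count the words, drop stop words."""
    words = (w.strip(string.punctuation) for w in text.lower().split())
    return Counter(w for w in words if w and w not in stopwords)


def font_sizes(freqs, min_size=10, max_size=40):
    """Step 3: scale each count linearly against the most frequent word (assumed rule)."""
    if not freqs:
        return {}
    top = max(freqs.values())
    return {w: max(min_size, round(max_size * n / top)) for w, n in freqs.items()}


text = ("This is not the end. It is not even the beginning of the end. "
        "But it is, perhaps, the end of the beginning.")
freqs = word_frequencies(text)
sizes = font_sizes(freqs)

# Print each word with its count and font size, largest first
for word, size in sorted(sizes.items(), key=lambda kv: -kv[1]):
    print(word, freqs[word], size)
```

With the sentence used earlier, "end" is the most frequent remaining word, so it receives the largest font size and the other words are scaled down in proportion.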
Common parameters
Example:
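import os
from os import path
from wordcloud import WordCloud
import matplotlib.pyplot as plt

# Use the script's directory, or the current working directory when run interactively
d = path.dirname(__file__) if "__file__" in locals() else os.getcwd()

# Read the text and lowercase it so words are counted case-insensitively
with open(path.join(d, 'test.txt')) as f:
    text = f.read()
text = text.lower()

# White background, 800 x 660 pixel image
wordcloud = WordCloud(background_color="white", width=800, height=660).generate(text)

# Save the image (the original called to_file on an undefined name, wc)
wordcloud.to_file('test.png')

plt.imshow(wordcloud)
plt.axis("off")
plt.show()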
Getting test.txt
Link: https://pan.baidu.com/s/1zfuK9-W5tyq1P8ftlQJuJQ
Extraction code: iet4