Natural Language Processing 01: Word Clouds (wordcloud)
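wordcloud counts word frequencies automatically and renders them as a word cloud:
1. For English text, the standard usage works as-is.
2. For Chinese text, segment it with jieba first, and pass the font_path parameter to WordCloud so that it uses a font containing Chinese glyphs.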
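To make the two cases above concrete, here is a rough sketch of frequency counting using only the standard library. It is an approximation for illustration, not wordcloud's actual tokenizer; `count_words` is a hypothetical helper, and the Chinese sentence is segmented by hand to stand in for what a segmenter such as jieba would produce.

```python
import re
from collections import Counter

def count_words(text, stopwords=frozenset()):
    """Split on non-word characters, drop stopwords, and count what is left."""
    tokens = re.findall(r"\w+", text.lower())
    return Counter(t for t in tokens if t not in stopwords)

# English: spaces already separate the words.
english = "the cat and the hat and the bat"
print(count_words(english, {"the", "and"}))

# Unsegmented Chinese has no spaces, so the whole sentence counts as one "word".
raw_chinese = "我喜欢自然语言处理"
print(len(count_words(raw_chinese)))

# Hand-segmented (assumption: a segmenter would give a similar split),
# then joined with spaces so that each word is counted separately.
segmented = "我 喜欢 自然语言 处理"
print(len(count_words(segmented)))
```

This is why the Chinese case needs a segmentation step before the text reaches WordCloud: without it there are no word boundaries to count on.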
from wordcloud import WordCloud, STOPWORDS
import numpy
import PIL.Image as Image
import json


def startBuildWords():
    text = ''
    try:  # Read the local JSON file and extract the text
        with open("../file/result_clear.json", 'r', encoding='utf-8') as load_f:
            load_dict = json.load(load_f)
            text = json.dumps(load_dict['content'])
    except (OSError, ValueError, KeyError, TypeError):
        # Missing file, invalid JSON, or no 'content' field
        return False
    # print(text)
    # 2. Image mask layer
    mask_pic = numpy.array(Image.open("../picture/world.jpg"))
    # Set the stopwords: the built-in list plus a few extra words
    m_stopwords = ['and', 'to', 'the', 'with', 'in', 'by', 'its', 'for', 'of', 'an']
    stopwords = STOPWORDS.union(m_stopwords)
    wordcloud = WordCloud(font_path="../font/TimesNewRoman.ttf",
                          width=1500,   # width and height are ignored when a mask is given
                          height=1000,
                          # min_font_size=20,
                          mask=mask_pic,  # 3. Pass mask_pic as the mask parameter
                          stopwords=stopwords,
                          # background_color="white",
                          ).generate(text)
    image = wordcloud.to_image()
    image.show()
    wordcloud.to_file("../ciyuntu/ciyun.png")
    return True


startBuildWords()
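One caveat about the `json.dumps` call in the listing above: by default it escapes non-ASCII characters as `\uXXXX` sequences. That is harmless for English text, but for Chinese content the word cloud would receive escape codes instead of characters. A small sketch of the difference, where `record` is a made-up stand-in for the contents of the JSON file:

```python
import json

# Hypothetical record; the real file's structure is assumed to have a "content" field.
record = {"content": "词云 word cloud"}

escaped = json.dumps(record["content"])                       # non-ASCII becomes \uXXXX
readable = json.dumps(record["content"], ensure_ascii=False)  # characters kept as-is
direct = record["content"]                                    # no re-serialization at all

print(escaped)
print(readable)
print(direct)
```

If the `content` field is already a plain string, using it directly is the simplest option; `ensure_ascii=False` is the alternative when the field is a list or nested structure that still has to be serialized.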
This article comes from cnblogs (博客园), author: 相对维度. When reposting, please cite the original link: https://www.cnblogs.com/wangjirui/articles/15935876.html
