Natural Language Processing - 01 - Word Clouds (wordcloud)

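The wordcloud library counts word frequencies automatically and renders them as a word cloud:

1. For English text, the standard usage works as-is.

2. For Chinese text, segment the text with jieba first, and pass font_path to WordCloud so that it uses a font containing Chinese glyphs.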
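For the Chinese case in point 2, a minimal sketch might look like the following. It assumes the `jieba` package is installed. The helper names `segment_chinese` and `build_chinese_cloud` and the font path `../font/simhei.ttf` are hypothetical placeholders, not part of the original post. The tokens are joined with spaces because WordCloud expects words to be separated the way they are in English text.

```python
def segment_chinese(text, cut=None):
    """Split Chinese text into words and join them with spaces.

    `cut` is any callable that returns an iterable of tokens. When it is
    omitted, jieba.lcut is used (assumes the jieba package is installed).
    """
    if cut is None:
        import jieba  # imported lazily so another tokenizer can be plugged in
        cut = jieba.lcut
    tokens = (token.strip() for token in cut(text))
    return " ".join(token for token in tokens if token)


def build_chinese_cloud(text, font_path="../font/simhei.ttf"):
    """Render a word cloud from Chinese text (hypothetical helper).

    font_path is a placeholder: it must point to a font file that contains
    Chinese glyphs, otherwise the words are drawn as empty boxes.
    """
    from wordcloud import WordCloud
    return WordCloud(font_path=font_path).generate(segment_chinese(text))
```

Calling `build_chinese_cloud(text).to_file("ciyun.png")` would then save the image, in the same way as the English script in this post.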


from wordcloud import WordCloud, STOPWORDS
import numpy
import PIL.Image as Image
import json

def startBuildWords():
    # 1. Read the local JSON file and extract the text
    try:
        with open("../file/result_clear.json", 'r', encoding='utf-8') as load_f:
            load_dict = json.load(load_f)
            # Serialize the 'content' value into a single string
            text = json.dumps(load_dict['content'])
    except (OSError, ValueError, KeyError):
        # Missing file, invalid JSON, or no 'content' key
        return False
    # 2. Load the image used as the mask layer
    mask_pic = numpy.array(Image.open("../picture/world.jpg"))
    # Add extra stop words on a copy, so the shared STOPWORDS set is left unchanged
    m_stopwords = {'and', 'to', 'the', 'with', 'in', 'by', 'its', 'for', 'of', 'an'}
    stopwords = STOPWORDS | m_stopwords
    wordcloud = WordCloud(font_path="../font/TimesNewRoman.ttf",
                          width=1500,   # width and height are ignored when a mask is given
                          height=1000,
#                         min_font_size=20,
                          mask=mask_pic,  # 3. Pass mask_pic as the mask parameter
                          stopwords=stopwords,
#                         background_color="white",
                          ).generate(text)
    image = wordcloud.to_image()
    image.show()
    wordcloud.to_file("../ciyuntu/ciyun.png")
    return True

startBuildWords()
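
The script leaves the counting to WordCloud: `generate(text)` tokenizes the text, drops the stop words, and sizes each remaining word by how often it occurs. As a rough illustration of that step only (not the library's actual implementation, which does more normalization), the sketch below counts words with the standard library. The function name `count_words` and the regular expression are assumptions made for this example.

```python
import re
from collections import Counter


def count_words(text, stopwords=()):
    """Count case-insensitive word frequencies, skipping stop words."""
    stop = {word.lower() for word in stopwords}
    # Simplified tokenizer: runs of letters and apostrophes count as words.
    words = re.findall(r"[a-z']+", text.lower())
    return Counter(word for word in words if word not in stop)


frequencies = count_words(
    "The cat and the dog chased the other cat.",
    stopwords=["the", "and"],
)
print(frequencies.most_common(1))  # [('cat', 2)]
```

A mapping like this can be passed to `WordCloud.generate_from_frequencies` instead of calling `generate` on raw text. In that case the `stopwords` parameter no longer applies, so any filtering has to happen before the call.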

![image](https://img2022.cnblogs.com/blog/2757887/202202/2757887-20220225141659551-2122596691.png)

Posted @ 2022-02-25 14:18 by 相对维度
