A Complete Final Project
Source code:
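import re
import urllib.request as ur

import jieba
import matplotlib.pyplot as plt
from bs4 import BeautifulSoup
from wordcloud import WordCloud

# Fetch the news page and parse the HTML
url = 'http://news.scnu.edu.cn/16635'
res = ur.urlopen(url).read().decode('utf-8')
soup = BeautifulSoup(res, 'lxml')

# Collect the article paragraphs and strip all whitespace from their text
content = soup.select('section[class="maintext"] p')
data = ''
for i in content:
    data += re.sub(r'\s+', '', i.get_text())

# Segment the Chinese text with jieba and join the tokens with spaces
word = ' '.join(jieba.cut(data))

# font_path must point to a font that contains Chinese glyphs
my_wordcloud = WordCloud(font_path='C:\\Windows\\Fonts\\AdobeKaitiStd-Regular.otf').generate(word)

# Display the word cloud without axes
plt.imshow(my_wordcloud)
plt.axis("off")
plt.show()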
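The script joins the jieba tokens with spaces because WordCloud, as far as I understand it, splits the text it receives into words and sizes each word by how often it occurs. The sketch below illustrates that idea using only the standard library. The helper names `clean_text` and `token_frequencies`, the sample paragraphs, and the hand-written token list are all hypothetical stand-ins; in the real script the tokens come from `jieba.cut(data)`.

```python
import re
from collections import Counter


def clean_text(paragraphs):
    """Concatenate paragraphs and drop all whitespace, as the script does."""
    return ''.join(re.sub(r'\s+', '', p) for p in paragraphs)


def token_frequencies(tokens):
    """Count how often each non-empty token occurs."""
    return Counter(t for t in tokens if t.strip())


# Hypothetical sample input standing in for the scraped <p> texts.
paragraphs = ['数据 分析\n', ' 数据 挖掘 ']
data = clean_text(paragraphs)

# Hand-written stand-in for list(jieba.cut(data)); real output may differ.
tokens = ['数据', '分析', '数据', '挖掘']

# Space-separated string, the same shape the script passes to generate().
word = ' '.join(tokens)
freqs = token_frequencies(tokens)

print(data)
print(word)
print(freqs.most_common(1))
```

A frequency mapping like `freqs` could also be passed to `WordCloud.generate_from_frequencies` instead of a space-joined string, which makes it easier to filter out unwanted tokens first.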
Word cloud: