这爷们真的丑

Word segmentation of Journey to the West (西游记) with jieba

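import jieba


def takeSecond(elem):
    # Sort key: the count in a (word, count) pair.
    return elem[1]


def main():
    path = "xiyouji.txt"
    # Read the whole novel; the with-block closes the file automatically.
    with open(path, "r", encoding="utf-8") as file:
        text = file.read()

    # Segment the text into a list of words.
    words = jieba.lcut(text)
    # Count how many times each word occurs.
    counts = {}
    for word in words:
        counts[word] = counts.get(word, 0) + 1

    # Sort by count, most frequent first.
    items = list(counts.items())
    items.sort(key=takeSecond, reverse=True)

    # Print the 20 most frequent words (fewer if the text has fewer distinct words).
    for keyWord, count in items[:20]:
        print("Occurrences of ({0:<1}): {1:>1}".format(keyWord, count))


main()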
Output:

 

 

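The counting loop above keeps every token that jieba.lcut returns, so punctuation marks and single-character words are likely to crowd the top 20. A minimal sketch of one way to filter them out, using collections.Counter from the standard library. The helper name top_words and the min_len parameter are assumptions for illustration and are not part of the original script; the sample list is hand-made and merely stands in for real jieba output.

```python
from collections import Counter


def top_words(tokens, n=20, min_len=2):
    """Return the n most frequent tokens that are at least min_len characters long."""
    # Drop whitespace-only tokens and short tokens (punctuation, single characters).
    kept = (t for t in tokens if len(t.strip()) >= min_len)
    # most_common sorts by count, highest first.
    return Counter(kept).most_common(n)


# Hand-made token list standing in for the result of jieba.lcut(text).
sample = ["行者", "，", "八戒", "行者", "的", "师父", "行者", "八戒", "。"]
for word, count in top_words(sample, n=3):
    print("Occurrences of ({0}): {1}".format(word, count))
```

In the script above, this would be called as top_words(jieba.lcut(text)) in place of the manual dictionary and sort.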
posted on 2021-11-14 09:24  这爷们真的丑