A complete English word-frequency count example, using a file

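1. Read in the string to be analyzed

2. Split the text and extract the words

3. Build a dictionary of counts

4. Exclude grammatical words (stop words)

5. Sort

6. Print the top 20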
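
The six steps above can be sketched as one small function. This is only a minimal sketch, not the code from this post: the name `top_words`, its parameters, and the choice of `re.findall` plus `collections.Counter` are assumptions made for illustration.

```python
import re
from collections import Counter

def top_words(text, stop_words, n=20):
    """Return the n most frequent words in text, ignoring stop_words (hypothetical helper)."""
    # Step 2: lowercase the text and pull out runs of letters/apostrophes as words
    words = re.findall(r"[a-z']+", text.lower())
    # Steps 3-4: count only the words that are not stop words
    counts = Counter(word for word in words if word not in stop_words)
    # Steps 5-6: most_common orders by count, descending, and keeps the first n
    return counts.most_common(n)

sample = "Twinkle, twinkle, little star, How I wonder what you are."
print(top_words(sample, {"the", "a"}, n=3))
```

Extracting words with a pattern avoids having to list every punctuation character by hand, at the cost of treating digits and hyphens as separators.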

# Setup: write the sample text to 123.txt so there is a file to analyze
with open('123.txt', 'w', encoding='utf-8') as fo:
    fo.write('''Twinkle, twinkle, little star, How I wonder what you are.
Up above the world so high, Like a diamond in the sky.
Twinkle, twinkle, little star, How I wonder what you are!
When the blazing sun is gone,
When he nothing shines upon,
Then you show your little light,
Twinkle, twinkle, all the night.
Twinkle, twinkle, little star,
How I wonder what you are!
Then the traveler in the dark Thanks you for your tiny spark;
He could not see which way to go, If you did not twinkle so.
Twinkle, twinkle, little star, How I wonder what you are!
Twinkle Twinkle Little Star''')


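# 1. Read in the text to be analyzed
with open('123.txt', 'r', encoding='utf-8') as fo:
    A = fo.read()

# 2. Extract the words: turn punctuation into spaces, lowercase, split on whitespace
for i in ',.?!;:\n"':   # ';' added so that "spark;" is counted as "spark"
    A = A.replace(i, ' ')
A = A.lower().split()   # split() with no argument never yields empty strings

# 3-4. Build the count dictionary, leaving out grammatical words
exc = {'the', 'and', 'to', 'of', 'in', 'a', 'for', 'with'}
keys = set(A) - exc     # set of words that occur; these become the dictionary keys
dic = {}
for i in keys:
    dic[i] = A.count(i)

# 5. Sort by count, highest first
w = list(dic.items())
w.sort(key=lambda x: x[1], reverse=True)

# 6. Print the top 20 (slicing avoids an IndexError when there are fewer than 20 words)
for item in w[:20]:
    print(item)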
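
Calling `count` once per distinct word rescans the whole word list each time, which is fine for a nursery rhyme but slow for long texts. Counting in a single pass gives the same numbers. The following is a minimal sketch under assumptions: `count_words` and its parameter names are hypothetical, and breaking ties alphabetically is an added choice, since iterating over a set of strings does not guarantee the same order from run to run.

```python
def count_words(words, stop_words):
    """Count each word in one pass, skipping stop words (hypothetical helper)."""
    counts = {}
    for word in words:
        if word in stop_words:
            continue
        # dict.get returns 0 the first time a word is seen
        counts[word] = counts.get(word, 0) + 1
    return counts

words = "twinkle twinkle little star how i wonder what you are".split()
counts = count_words(words, {"the", "a"})
# Sort by count (descending), then alphabetically so ties come out in a repeatable order
ranked = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
for word, count in ranked[:3]:
    print(word, count)
```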

 

posted @ 2017-09-26 19:23  013洪辉