A Complete English Word-Frequency Counting Example Using Files

1. Read in the string to be analyzed
# Create test.txt and write the sample text into it
fo = open('test.txt', 'w')
fo.write('''It's amazing how you can speak right to my heart
Without saying a word,
you can light up the dark
Try as I may I could never explain
What I hear when you don't say a thing
The smile on your face lets me know that you need me
There's a truth in your eyes saying you'll never leave me
The touch of your hand says you'll catch me whenever I fall
You say it best.. when you say nothing at all
All day long I can hear people talking out loud
But when you hold me near, you drown out the crowd
Try as they may they can never define
What's been said between your heart and mine
The smile on your face lets me know that you need me
There's a truth in your eyes saying you'll never leave me
The touch of your hand says you'll catch me whenever I fall
You say it best.. when you say nothing at all
The smile on your face lets me know that you need me
There's a truth in your eyes saying you'll never leave me
The touch of your hand says you'll catch me whenever I fall
You say it best.. when you say nothing at all''')
fo.close()
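
The write step above can also be sketched with a `with` block, which closes the file automatically even if the write raises an exception. This is only an illustration: the file name `sample.txt`, the temporary directory, and the two-line placeholder text are assumptions and do not come from the original post.

```python
import os
import tempfile

# Placeholder text; the original post writes its own sample text instead.
sample_text = "Hello world, hello file.\nThe file is closed automatically."

# Assumption: use a temporary directory so this sketch does not overwrite test.txt.
path = os.path.join(tempfile.mkdtemp(), 'sample.txt')

# The file is closed as soon as the with block ends.
with open(path, 'w', encoding='utf-8') as fo:
    fo.write(sample_text)

# Read the text back to confirm the round trip.
with open(path, 'r', encoding='utf-8') as fo:
    read_back = fo.read()

print(read_back == sample_text)
```

With this form there is no separate `fo.close()` call to forget.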

 


2. Split the text and extract the words
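fo = open('test.txt', 'r')
song = fo.read()
fo.close()
exc = {'the', 'a', 'in'}  # grammatical words to exclude from the count
# Replace punctuation and line breaks with spaces, then lowercase everything
song = song.replace(',', ' ')
song = song.replace('.', ' ')
song = song.replace('\n', ' ')
song = song.lower()
# split() with no argument collapses runs of whitespace, so no empty strings end up in the list
words = song.split()
dic = {}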
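
The chained `replace` calls above handle only commas, periods, and line breaks. As an alternative sketch (not the approach used in this post), a regular expression can extract runs of letters and apostrophes in one pass. The function name `tokenize` and the sample sentence are made up for illustration.

```python
import re

def tokenize(text):
    """Return the lowercase words in text, keeping apostrophes such as in "it's"."""
    return re.findall(r"[a-z']+", text.lower())

# Hypothetical sample input, not the text used in the post.
sample = "Hello, world.. It's a test!\nHello again?"
words_demo = tokenize(sample)
print(words_demo)
```

Because the pattern also accepts a bare apostrophe, single quotation marks around a word would be kept as part of that word; whether that matters depends on the input text.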

 

 

3. Build the count dictionary

4. Exclude grammatical words

5. Sort

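words.sort()  # sort the words alphabetically
keys = set(words)  # the distinct words
keys = keys - exc  # drop the excluded grammatical words
for i in keys:
    dic[i] = words.count(i)  # dictionary mapping each word to its number of occurrences
items = list(dic.items())
print(items)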
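
The loop above calls `words.count(i)` once per distinct word, so the whole list is rescanned each time. A possible alternative, shown here only as a sketch, is `collections.Counter` from the standard library, which tallies everything in a single pass. The names `demo_words` and `stop_words` and their contents are hypothetical stand-ins for `words` and `exc`.

```python
from collections import Counter

# Hypothetical sample data standing in for the `words` list and the `exc` set.
demo_words = ['red', 'blue', 'the', 'red', 'a', 'blue', 'red', 'in']
stop_words = {'the', 'a', 'in'}

# Count every word in one pass, skipping the excluded ones.
counts = Counter(w for w in demo_words if w not in stop_words)
print(counts)
```

For a short text the difference is negligible; it starts to matter as the number of distinct words grows.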

6. Output the top 20

  

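items.sort(key=lambda x: x[1], reverse=True)  # sort by count, highest first
for item in items[:20]:  # slicing avoids an IndexError when there are fewer than 20 distinct words
    print(item)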
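
If the counts are held in a `collections.Counter` instead of a plain dictionary, the sort-and-slice step can be written with its `most_common` method. This is a sketch under that assumption; `demo_counts` and its values are invented for illustration.

```python
from collections import Counter

# Hypothetical counts standing in for the `dic` dictionary built above.
demo_counts = Counter({'red': 3, 'blue': 2, 'green': 5})

# most_common(n) returns up to n (word, count) pairs, highest count first.
top = demo_counts.most_common(20)
for word, count in top:
    print(word, count)
```

Asking for 20 entries when fewer exist simply returns all of them, so no length check is needed.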

posted @ 2017-09-26 12:26  yishhaoo