A complete English word-frequency counting example using a file
1. Read in the string to be analyzed
fo = open('test.txt', 'w')
fo.write('''It's amazing how you can speak right to my heart
Without saying a word, you can light up the dark
Try as I may I could never explain
What I hear when you don't say a thing
The smile on your face lets me know that you need me
There's a truth in your eyes saying you'll never leave me
The touch of your hand says you'll catch me whenever I fall
You say it best.. when you say nothing at all
All day long I can hear people talking out loud
But when you hold me near, you drown out the crowd
Try as they may they can never define
What's been said between your heart and mine
The smile on your face lets me know that you need me
There's a truth in your eyes saying you'll never leave me
The touch of your hand says you'll catch me whenever I fall
You say it best.. when you say nothing at all
The smile on your face lets me know that you need me
There's a truth in your eyes saying you'll never leave me
The touch of your hand says you'll catch me whenever I fall
You say it best.. when you say nothing at all''')
fo.close()
2. Split the text and extract the words
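fo = open('test.txt', 'r')
song = fo.read()
fo.close()
exc = {'the', 'a', 'in'}  # function words to leave out of the count
# Replace punctuation and line breaks with spaces, then normalize case
song = song.replace(',', ' ')
song = song.replace('.', ' ')
song = song.replace('\n', ' ')
song = song.lower()
# split() with no argument splits on any run of whitespace, so the
# doubled spaces left by the replacements do not produce empty strings
words = song.split()
dic = {}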
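As an alternative to chaining replace calls, a regular expression can pull the words out in one pass. The following is a minimal sketch and not the approach used in this post; the helper name tokenize is hypothetical, and the pattern assumes plain ASCII English text in which apostrophes only appear inside contractions such as "don't".

```python
import re

def tokenize(text):
    """Return the lowercase words in text, keeping inner apostrophes."""
    # A run of letters, optionally followed by apostrophe-plus-letters groups
    return re.findall(r"[a-z]+(?:'[a-z]+)*", text.lower())

print(tokenize("You say it best.. when you don't say a thing"))
```

Because the pattern matches words rather than removing separators, other punctuation (question marks, semicolons and so on) is skipped without needing an extra replace call for each character.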
3. Build the count dictionary
4. Exclude function words
5. Sort
words.sort()
keys = set(words)
keys = keys - exc  # drop the excluded function words
for i in keys:
    dic[i] = words.count(i)  # dictionary mapping each word to its number of occurrences
items = list(dic.items())
print(items)
6. Output the top 20
items.sort(key=lambda x: x[1], reverse=True)  # most frequent first
# Slicing avoids an IndexError if there are fewer than 20 distinct words
for item in items[:20]:
    print(item)
Screenshot: