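Assignment 7
A complete English word-frequency counting example using file I/O
1. Read in the string to be analyzed
2. Split the text and extract the words
3. Count the words with a dictionary
4. Exclude grammatical (function) words
5. Sort
6. Output the TOP 20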
>>> fo = open('123.txt', 'w')
>>> fo.write('''I messed up tonight, I lost another fight
... I still mess up but I'll just start again
... I keep falling down, I keep on hitting the ground
... I always get up now to see what's next
... Birds don't just fly, they fall down and get up
... Nobody learns without getting it won
... I won't give up, no I won't give in
... Til I reach the end, then I'll start again
... No I won't leave, I wanna try everything
... I wanna try even though I could fail
... I won't give up, no I won't give in
... Til I reach the end and then I'll start again
... No I won't leave, I wanna try everything
... I wanna try even though I could fail''')
574
>>> fo.close()
fo = open('123.txt', 'r')
song = fo.read()
song = song.lower()  # convert to lowercase
for i in ',.':
    song = song.replace(i, ' ')  # replace punctuation with spaces
words = song.split()  # split into words
#print(words)
dic = {}  # dictionary of word counts
keys = set(words)  # set of unique words
#print(keys)
for i in keys:
    dic[i] = words.count(i)  # number of times each word occurs
#print(dic)
wc = list(dic.items())  # list of (word, count) pairs
wc.sort(key=lambda x: x[1], reverse=True)  # sort by count, descending
#print(wc)
for i in range(20):
    print(wc[i])
fo.close()
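The script above counts every word, so step 4 (excluding grammatical words) is not actually applied. Below is a minimal sketch of one way to add it. It assumes a small hand-picked stop-word set; `STOP_WORDS` and `top_words` are illustrative names that do not appear in the original. It also swaps the dict and `words.count` loop for `collections.Counter`, which counts in a single pass.

```python
from collections import Counter

# Hypothetical stop-word set (an assumption, not from the original post);
# extend it to suit the text being analyzed.
STOP_WORDS = {"i", "the", "and", "to", "it", "on", "in", "but", "no", "then", "they"}


def top_words(text, n=20, stop_words=STOP_WORDS):
    """Return the n most frequent words in text, skipping stop words."""
    text = text.lower()
    for ch in ",.":
        text = text.replace(ch, " ")  # same punctuation handling as the script
    words = [w for w in text.split() if w not in stop_words]
    # most_common(n) returns at most n pairs, so short texts do not raise IndexError
    return Counter(words).most_common(n)


if __name__ == "__main__":
    sample = "I won't give up, no I won't give in"
    for word, count in top_words(sample, n=3):
        print(word, count)
```

To apply this to the file from the assignment, read `123.txt` with `with open('123.txt') as fo: song = fo.read()` and pass `song` to `top_words`. Because `most_common(n)` simply returns fewer items when there are fewer than `n` distinct words, this also avoids the `IndexError` that `for i in range(20)` would raise on a short text.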