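A complete English word-frequency counting example using a file
1. Read in the string to be analyzed
2. Split the text and extract the words
3. Count the words with a dictionary
4. Exclude grammatical (function) words
5. Sort
6. Output the top 20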
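The steps above can be sketched as one small function that works on an in-memory string. This is only a minimal sketch: the name `top_words`, its parameters, and the sample sentence are assumptions made for illustration, not part of the original script.

```python
def top_words(text, stop_words, n=20):
    """Return the n most frequent (word, count) pairs in text, skipping stop_words."""
    # Steps 1-2: normalize case, turn punctuation into spaces, split on whitespace
    text = text.lower()
    for ch in ',.?!-':
        text = text.replace(ch, ' ')
    words = text.split()

    # Steps 3-4: count each word in a single pass, ignoring excluded words
    counts = {}
    for w in words:
        if w not in stop_words:
            counts[w] = counts.get(w, 0) + 1

    # Steps 5-6: sort by count, largest first, and keep the first n entries
    ranked = sorted(counts.items(), key=lambda kv: kv[1], reverse=True)
    return ranked[:n]


# Hypothetical sample text, used only to show the call
sample = "The cat sat. The cat ran to the mat!"
result = top_words(sample, stop_words={'to'}, n=2)
print(result)  # [('the', 3), ('cat', 2)]
```

Slicing with `ranked[:n]` returns fewer entries when the text has fewer than `n` distinct words, so short inputs do not raise an error.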
# Write the string to be analyzed to the file 'text'
with open('text', 'w') as fo:
    fo.write('''Beat It - Michael Jackson They Told Him Don't You Ever Come Around Here Don't Wanna See Your Face You Better Disappear The Fire's In Their Eyes And Their Words Are Really Clear So Beat It Just Beat It You Better Run You Better Do What You Can Don't Wanna See No Blood Don't Be A Macho Man You Wanna Be Tough Better Do What You Can So Beat It But You Wanna Be Bad Just Beat It Beat It Beat It Beat It No One Wants To Be Defeated Showin' How Funky Strong Is Your Fighter It Doesn't Matter Who's Wrong Or Right Just Beat It Beat It Just Beat It Beat It Just Beat It Beat It Just Beat It Beat It They're Out To Get You Better Leave While You Can Don't Wanna Be A Boy You Wanna Be A Man You Wanna Stay Alive Better Do What You Can So Beat It Just Beat It You Have To Show Them That You're Really Not Scared You're Playin' With Your Life This Ain't No Truth Or Dare They'll Kick You Then They Beat You Then They'll Tell You It's Fair So Beat It But You Wanna Be Bad Just Beat It Beat It Beat It Beat It No One Wants To Be Defeated Showin' How Funky Strong Is Your Fighter It Doesn't Matter Who's Wrong Or Right Just Beat It Beat It Beat It Beat It No One Wants To Be Defeated Showin' How Funky Strong Is Your Fighter It Doesn't Matter Who's Wrong Or Right Just Beat It Beat It Beat It Beat It Beat It... 
Beat It Beat It Beat It Beat It No One Wants To Be Defeated Showin' How Funky Strong Is Your Fighter It Doesn't Matter Who's Wrong Or Right Who Just Beat It Beat It Beat It Beat It No One Wants To Be Defeated Showin' How Funky Strong Is Your Fighter It Doesn't Matter Who's Wrong Or Who's Right Just Beat It Beat It Beat It Beat It No One Wants To Be Defeated Showin' How Funky Strong Is Your Fighter It Doesn't Matter Who's Wrong Or Right Just Beat It Beat It Beat It Beat It No One Wants To Be Defeated Showin' How Funky Strong Is Your Fighter It Doesn't Matter Who's Wrong Or Right Just Beat It -''')

# 1. Read the string to be analyzed back from the file
with open('text', 'r') as fo:
    s = fo.read()

# 2. Split the text and extract the words
s = s.lower()  # convert to lowercase
for ch in ',.?!-':
    s = s.replace(ch, ' ')  # replace punctuation with spaces
words = s.split()  # split on any whitespace, newlines included, without producing empty strings
print(words)

# 3. Build the counting dictionary (named counts so the built-in dict is not shadowed)
counts = {}
# 4. Exclude grammatical (function) words
exc = {'it', 'be', 'no', 'to', 'or'}
keys = set(words) - exc  # the remaining distinct words become the dictionary keys
for w in keys:
    counts[w] = words.count(w)  # iterate over the keys and assign each its count
print(counts)

# 5. Sort
wc = list(counts.items())  # convert the dictionary to a list of (word, count) tuples
wc.sort(key=lambda x: x[1], reverse=True)  # sort by count, largest first
print(wc)

# 6. Output the top 20 (slicing is safe even with fewer than 20 distinct words)
for item in wc[:20]:
    print(item)
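The counting loop above calls `count` on the word list once per distinct word, so the list is rescanned for every key. One possible alternative, assuming the standard library's `collections.Counter` is acceptable for this exercise, counts everything in a single pass and also handles the sorting and top-N steps through `most_common`. The function name `top_words_counter` and the sample sentence are hypothetical.

```python
from collections import Counter


def top_words_counter(words, stop_words, n=20):
    """Count words in one pass, skipping stop_words, and return the n most common."""
    counts = Counter(w for w in words if w not in stop_words)
    # most_common(n) returns (word, count) pairs sorted by count, largest first
    return counts.most_common(n)


# Hypothetical sample input, already lowercased and free of punctuation
words = "run fast run far or run home to rest".split()
top = top_words_counter(words, {'it', 'be', 'no', 'to', 'or'}, n=3)
print(top[0])  # ('run', 3)
```

For a text as short as the one in this example the difference in speed is negligible; the main gain is that counting, sorting, and truncating collapse into two lines.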