English Word Frequency Count

song = '''
I never knew
When the clock stopped and I'm looking at you

I never thought I'll miss someone like you

Someone I thought that I knew

I never knew

I should have known something wouldn't be true

Baby you know that I'm so into you

More than I know I should do

So why why why

Why should we waited

And I I I

I should be waiting

Waiting for someone new

Even though that it wasn't you

But I know that it's

Wonderful

Incredible

Baby irrational

I never knew it was obsessional

And I never knew it was with you oooh

Baby if it's just

Wonderful

Incredible

Baby irrational

I never knew it was so sad

Just so sad

I'm so sorry

Even now I just cannot feel you feel me

Hmmm

So why why why

Why should we waited

And I I I

I should be waiting

Waiting for someone new

Even though that it wasn't you

But I know that it's

Wonderful

Incredible

Baby irrational

I never knew it was obsessional

And I never knew it was with you oooh

Baby if it's just

Wonderful

Incredible

Baby irrational

I never knew it was so sad

Just so sad

I'm so sorry

Even now I just cannot feel you fall

I don't even know now

I'm sure you'll wait for me

Even now I just cannot deny

I just hold on so tight

Until you and I never could breathe

Oh

Wonderful

Incredible

Baby irrational

I never knew it was obsessional

And I never knew it was with you until you tell me to

Baby if it's just

Wonderful

Incredible

Baby irrational

I never knew it was so sad

Just so sad

I'm so sorry

Even now I just cannot feel you feel me
'''

UnusefulWords = ['on', 'was', 'I', 'i', 'at']  # words to exclude from the count
UnusefulSymbol = [".", "'", "(", ")"]  # punctuation to replace with spaces

NewWords = song
for symbol in UnusefulSymbol:
    NewWords = NewWords.replace(symbol, ' ')  # replace punctuation in the text with spaces
NewWords = NewWords.lower()  # convert everything to lowercase

WordsList = NewWords.split()  # split the string into individual words

Count = {}

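for word in WordsList:
    # Count whole words in the list; str.count on the text would also match substrings
    Count[word] = WordsList.count(word)  # record each word and its number of occurrences in a dict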


for word in UnusefulWords:
    Count.pop(word, None)  # drop excluded words; ignore any that never appeared

CountWords = sorted(Count.items(),key=lambda x:x[1],reverse = True)

for item in CountWords[:10]:
    print(item)  # print the 10 most frequent words
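
For comparison, the same kind of count can be sketched with the standard library's collections.Counter and re modules. This is an alternative, not the approach used in the script above. The helper name top_words and its parameters are hypothetical, and unlike the script it keeps contractions such as "i'm" as a single token instead of splitting them at the apostrophe.

```python
import re
from collections import Counter


def top_words(text, stopwords=(), n=10):
    """Return the n most common words in text, skipping stopwords.

    Hypothetical helper for illustration; not part of the original script.
    """
    # Lowercase first, then extract runs of letters and apostrophes,
    # so contractions stay together as one token.
    words = re.findall(r"[a-z']+", text.lower())
    skip = {w.lower() for w in stopwords}
    counts = Counter(w for w in words if w not in skip)
    return counts.most_common(n)


sample = "So why why why\nWhy should we waited\nAnd I I I"
print(top_words(sample, stopwords=['i'], n=1))  # [('why', 4)]
```

Calling top_words(song, UnusefulWords) would give a top-10 list comparable to the one printed above, although the exact counts may differ because of the different handling of apostrophes.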

posted @ 2018-03-25 21:26  157-符致伟