pandas: counting corpus word frequencies with NLTK when each row is a list of tokens

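import nltk

def frequent_words(tdf, k):
    # Each row of tdf['OriginalTweet'] is a list of tokens; flatten all rows into one list.
    # (Joining on ' ' and splitting again would break tokens containing spaces
    # and turn empty rows into '' tokens.)
    words = [word for tokens in tdf['OriginalTweet'] for word in tokens]
    # FreqDist counts how often each token occurs.
    freq = nltk.FreqDist(words)
    return freq.most_common(k)  # the k most frequent (token, count) pairs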
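
To see what kind of result the function above returns, here is a minimal sketch on made-up data. It is not the original code: it uses `collections.Counter` as a stand-in for `nltk.FreqDist` (in NLTK 3, `FreqDist` is a `Counter` subclass, so `most_common(k)` behaves the same way), and a plain list of lists stands in for the `OriginalTweet` column. The sample rows and the name `frequent_words_from_rows` are hypothetical.

```python
from collections import Counter
from itertools import chain

# Hypothetical sample data: each inner list stands in for one row of
# tdf['OriginalTweet'] (one list of tokens per tweet).
rows = [
    ["stay", "home", "stay", "safe"],
    ["stay", "safe"],
    [],  # an empty row contributes no tokens
]

def frequent_words_from_rows(rows, k):
    # Flatten the per-row token lists into one stream of tokens.
    words = chain.from_iterable(rows)
    # Counter stands in for nltk.FreqDist here; both provide most_common(k).
    return Counter(words).most_common(k)

top = frequent_words_from_rows(rows, 2)
print(top)  # [('stay', 3), ('safe', 2)]
```

Iterating over a pandas Series yields its values, so passing `tdf['OriginalTweet']` as `rows` should work the same way, assuming every cell really is a list of strings.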

 

posted @ 2023-03-04 09:24  cup_leo