Basic NLTK usage

Removing stop words (stopwords) with NLTK

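import nltk
from nltk.corpus import stopwords

# Fetch the stop word corpus if it is not installed yet
nltk.download('stopwords', quiet=True)

tokens = ['my', 'dog', 'has', 'flea', 'problems', 'help', 'please',
          'maybe', 'not', 'take', 'him', 'to', 'dog', 'park', 'stupid',
          'my', 'dalmation', 'is', 'so', 'cute', 'I', 'love', 'him']

# NLTK's English stop words are all lowercase; a set makes lookups fast
stwords = set(stopwords.words('english'))

# Keep only the tokens that are not stop words, comparing case-insensitively
# so that 'I' is filtered out as well
clean_tokens = [token for token in tokens if token.lower() not in stwords]

print(clean_tokens)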
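
The filtering step above can also be wrapped in a small reusable function. The following is only a sketch: `remove_stopwords` and `SAMPLE_STOPWORDS` are hypothetical names introduced here, and the tiny hand-written set merely stands in for NLTK's real list so that the example runs without downloading the corpus.

```python
# Hypothetical stand-in for stopwords.words('english'); NOT the real NLTK list.
SAMPLE_STOPWORDS = {'my', 'has', 'not', 'him', 'to', 'is', 'so', 'i'}


def remove_stopwords(tokens, stop_words):
    """Return the tokens that are not stop words, keeping order and duplicates."""
    # Normalize the stop words once so the comparison is case-insensitive
    stop_words = {word.lower() for word in stop_words}
    return [token for token in tokens if token.lower() not in stop_words]


sample = ['my', 'dog', 'has', 'flea', 'problems', 'I', 'love', 'him']
filtered = remove_stopwords(sample, SAMPLE_STOPWORDS)
print(filtered)  # ['dog', 'flea', 'problems', 'love']
```

To use it with NLTK, pass `stopwords.words('english')` as the second argument instead of the sample set.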

 

Posted on 2021-12-02 20:40 by 季昂