Basic NLTK Usage

Removing stopwords with NLTK

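import nltk
from nltk.corpus import stopwords

# Download the stopword corpus on first use (does nothing if already present)
nltk.download('stopwords', quiet=True)

tokens = ['my', 'dog', 'has', 'flea', 'problems', 'help', 'please',
          'maybe', 'not', 'take', 'him', 'to', 'dog', 'park', 'stupid',
          'my', 'dalmation', 'is', 'so', 'cute', 'I', 'love', 'him']

# Use a set so each membership test is O(1) instead of scanning a list
stwords = set(stopwords.words('english'))

# Keep only the tokens that are not stopwords. Lowercase before comparing:
# the NLTK English list is all lowercase, so 'I' would otherwise be kept.
clean_tokens = [token for token in tokens if token.lower() not in stwords]

print(clean_tokens)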
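
The same filtering step can be wrapped in a small reusable function. The sketch below is only an illustration: `remove_stopwords` is a hypothetical helper name, not part of NLTK, and `demo_stop_words` is a tiny hand-picked stand-in list so the example runs without downloading the corpus. In real use you would pass `stopwords.words('english')` instead.

```python
def remove_stopwords(tokens, stop_words):
    """Return the tokens that are not stopwords, preserving their order.

    Hypothetical helper for illustration; matching is case-insensitive.
    """
    stop_set = {word.lower() for word in stop_words}
    return [token for token in tokens if token.lower() not in stop_set]


# Assumption: a tiny stand-in stopword list, NOT the real NLTK list.
demo_stop_words = ['my', 'has', 'not', 'him', 'to', 'is', 'so', 'i']
demo_tokens = ['my', 'dog', 'has', 'flea', 'problems', 'I', 'love', 'him']

filtered = remove_stopwords(demo_tokens, demo_stop_words)
print(filtered)
```

Taking the stopword list as a parameter makes it easy to swap in another language's list or to add domain-specific words, and building the set once keeps the filter linear in the number of tokens.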

Posted on 2021-12-02 20:40 by 季昂
