nltk

 

 

Tokenize text while keeping URLs intact as single tokens.

import nltk

# The URL survives as one token; the surrounding parentheses are split off.
nltk.casual_tokenize('Enigma (now part of PTC http://www.linkedin.com/company/ptc)')
['Enigma', '(', 'now', 'part', 'of', 'PTC', 'http://www.linkedin.com/company/ptc', ')']
# A slash between two words becomes its own token.
nltk.casual_tokenize('ROSS Corporate/Pivotal Retail Services')
['ROSS', 'Corporate', '/', 'Pivotal', 'Retail', 'Services']
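
To make the behaviour above easier to reason about, here is a minimal standard-library sketch of the same idea: try to match a URL first, then fall back to words and single punctuation characters. The pattern and the function name `simple_url_tokenize` are assumptions made for illustration only. This is not the pattern NLTK uses internally, and it handles far fewer cases (no emoticons, handles, or hashtags).

```python
import re

# Illustrative pattern (an assumption, not NLTK's): alternatives are tried
# left to right, so a URL wins over the plain word rule.
TOKEN_RE = re.compile(
    r"""
    https?://[^\s()<>]+   # a URL, ending at whitespace or a bracket
  | \w+                   # a run of word characters
  | \S                    # any other single non-space character
    """,
    re.VERBOSE,
)


def simple_url_tokenize(text):
    """Split text into tokens, keeping http(s) URLs as single tokens."""
    return TOKEN_RE.findall(text)


if __name__ == "__main__":
    print(simple_url_tokenize(
        "Enigma (now part of PTC http://www.linkedin.com/company/ptc)"
    ))
    print(simple_url_tokenize("ROSS Corporate/Pivotal Retail Services"))
```

Putting the URL alternative first is the design choice that matters here: if the word rule came first, `http` would be consumed as a word and the rest of the address would be broken apart at `:`, `/`, and `.`.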

posted on 2019-01-25 09:32  bingwork