Abstract:
Definitions from Wikipedia: Tokenization is the process of breaking a stream of text up into words, phrases, symbols, or other meaningful elements called tokens. The list of tokens becomes input for further processing such as parsing or text mining. Parsing or syntactic analysis is the process of analysing a string of symbols, either in natural language or in computer languages, conforming to the rules of a formal grammar.
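A minimal sketch of what tokenization means in practice (not from the original post; a simple regex-based toy tokenizer, assuming word characters and punctuation are the only token classes of interest):

```python
import re

def tokenize(text: str) -> list[str]:
    # Split a stream of text into tokens: runs of word characters
    # (words, numbers) or single punctuation marks. Real tokenizers,
    # e.g. in NLP toolkits, handle contractions, URLs, etc. much more
    # carefully; this only illustrates the concept.
    return re.findall(r"\w+|[^\w\s]", text)

print(tokenize("Tokenization breaks text into tokens, doesn't it?"))
# ['Tokenization', 'breaks', 'text', 'into', 'tokens', ',',
#  'doesn', "'", 't', 'it', '?']
```

The resulting token list is what a parser would then analyse against a grammar, as the definitions above describe.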
posted @ 2014-01-30 01:00 wintor12