Summary:
Introduction: Through the following, you will learn what edit distance is. Target: "you are cute and I love u." Source: "I am cute I don't love u." Given these two sentences, you can edit the source with insertions, deletions, and substitutions until it matches the target. You will substitute "you" ... Read more
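The insert, delete, and substitute operations described above can be sketched with the standard dynamic-programming table. This is a minimal sketch, not necessarily the method used in the full post: the function name `edit_distance` and the `sub_cost` parameter are assumptions made for illustration, and the sentences are compared word by word after a plain whitespace split.

```python
def edit_distance(source, target, sub_cost=1):
    """Minimum number of edits turning `source` into `target`.

    Works on any two sequences (strings or lists of words).
    Insertions and deletions cost 1; substitutions cost `sub_cost`
    (an assumed parameter; some textbooks use 2 instead of 1).
    """
    m, n = len(source), len(target)
    # dp[i][j] = cost of editing source[:i] into target[:j]
    dp = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        dp[i][0] = i  # delete everything in source[:i]
    for j in range(1, n + 1):
        dp[0][j] = j  # insert everything in target[:j]
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            same = source[i - 1] == target[j - 1]
            dp[i][j] = min(
                dp[i - 1][j] + 1,                             # delete
                dp[i][j - 1] + 1,                             # insert
                dp[i - 1][j - 1] + (0 if same else sub_cost)  # substitute or match
            )
    return dp[m][n]


source = "I am cute I don't love u".split()
target = "you are cute and I love u".split()
print(edit_distance(source, target))
```

With unit costs, the word-level distance between the two example sentences is 4: two substitutions ("I" to "you", "am" to "are"), plus two more edits to turn "I don't" into "and I".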
Summary:
Our final topic in basic text processing is segmenting text into sentences. We use a decision tree to solve this problem. But that is not enough on its own; we should use more sophisticated decision tree features to improve the classifier. For example, you can get the probability of a word at the end of ... Read more
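To make the decision-tree idea concrete, here is a small hand-written tree that decides whether a punctuation mark ends a sentence. It is only a sketch under stated assumptions: the tree is coded by hand rather than learned from data, the `ABBREVIATIONS` set is a hypothetical toy list, and the function names are made up for illustration. The probability-based features mentioned above are not modeled here.

```python
# Hypothetical toy list; a real system would use a much larger one.
ABBREVIATIONS = {"dr", "mr", "mrs", "ms", "prof", "inc", "etc"}


def is_sentence_end(text, i):
    """Hand-coded decision tree for the punctuation mark at text[i]."""
    ch = text[i]
    if ch in "!?":
        return True           # "!" and "?" are treated as unambiguous
    if ch != ".":
        return False
    # Feature 1: is the word before the period a known abbreviation?
    j = i
    while j > 0 and text[j - 1].isalpha():
        j -= 1
    if text[j:i].lower() in ABBREVIATIONS:
        return False
    # Feature 2: is the period inside a number such as 3.5?
    if 0 < i < len(text) - 1 and text[i - 1].isdigit() and text[i + 1].isdigit():
        return False
    # Feature 3: does the next word start with an uppercase letter?
    rest = text[i + 1:].lstrip()
    return not rest or rest[0].isupper()


def split_sentences(text):
    """Split text at every mark that the tree labels as a sentence end."""
    sentences, start = [], 0
    for i, ch in enumerate(text):
        if ch in ".!?" and is_sentence_end(text, i):
            sentences.append(text[start:i + 1].strip())
            start = i + 1
    tail = text[start:].strip()
    if tail:
        sentences.append(tail)
    return sentences


print(split_sentences("Dr. Smith paid 3.5 dollars. He left! Why?"))
```

Each `if` corresponds to one node of the tree. A learned classifier would pick the features and their order from labeled data instead of hard-coding them.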
Summary:
Well, today I learned about word normalization and stemming. After word tokenization, we should stem the words to map them to a normal form. For example, you should map "are" and "is" to "be", "windows" to "window", and so on. Afterwards, we can use Linux tools to impl ... Read more
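The mapping described above ("are" and "is" to "be", "windows" to "window") can be illustrated with a deliberately naive normalizer. This is a sketch, not the Linux-tool approach the post goes on to describe and not a real stemming algorithm such as Porter's: the `IRREGULAR` table and the suffix rules are assumptions chosen only to cover the examples.

```python
# Hypothetical lookup table for irregular forms of "be".
IRREGULAR = {"am": "be", "are": "be", "is": "be", "was": "be", "were": "be"}


def normalize(token):
    """Map a token to a rough normal form (lowercase, naive plural stripping)."""
    word = token.lower()
    if word in IRREGULAR:
        return IRREGULAR[word]
    # "cities" -> "city"
    if word.endswith("ies") and len(word) > 4:
        return word[:-3] + "y"
    # "windows" -> "window", but leave "class", "status", "basis" alone
    if word.endswith("s") and not word.endswith(("ss", "us", "is")) and len(word) > 3:
        return word[:-1]
    return word


tokens = "The windows are open".split()
print([normalize(t) for t in tokens])
```

Rules this simple over-strip and under-strip in many cases, which is why practical systems rely on an established stemmer or a dictionary-based lemmatizer.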