HMM Part-of-Speech Tagging
An HMM is specified by the following components:
Q = q1 q2 ... qN | a set of N states
A = a11 a12 ... an1 ... ann | a transition probability matrix A, each a_ij representing the probability of moving from state i to state j, s.t. Σ_j a_ij = 1 ∀i
O = o1 o2 ... oT | a sequence of T observations, each one drawn from a vocabulary V = v1, v2, ..., vV
B = b_i(o_t) | a sequence of observation likelihoods, also called emission probabilities, each expressing the probability of an observation o_t being generated from a state i
q0, qF | a special start state and end (final) state that are not associated with observations, together with transition probabilities a01 a02 ... a0n out of the start state and a1F a2F ... anF into the end state
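As a minimal sketch of these components (the toy tag set, vocabulary, and all probabilities below are illustrative assumptions, not from the text), they can be written as plain Python dictionaries:

```python
# A toy HMM for POS tagging. Tags, words, and probabilities are invented
# for illustration only.

states = ["DET", "NOUN", "VERB"]          # Q: the set of N states (tags)

# A: transition probabilities a_ij = P(tag_j | tag_i);
# "<s>" plays the role of the special start state q0.
A = {
    "<s>":  {"DET": 0.8, "NOUN": 0.1, "VERB": 0.1},
    "DET":  {"DET": 0.0, "NOUN": 0.9, "VERB": 0.1},
    "NOUN": {"DET": 0.1, "NOUN": 0.3, "VERB": 0.6},
    "VERB": {"DET": 0.5, "NOUN": 0.3, "VERB": 0.2},
}

# B: emission probabilities b_i(o_t) = P(word | tag).
B = {
    "DET":  {"the": 0.7, "a": 0.3},
    "NOUN": {"dog": 0.4, "walk": 0.3, "park": 0.3},
    "VERB": {"walk": 0.6, "runs": 0.4},
}
```

Note that each row of A sums to 1, matching the constraint Σ_j a_ij = 1 ∀i above.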
Viterbi: a dynamic programming algorithm, structurally similar to the minimum edit distance algorithm (see the sketch below).
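Here is a sketch of a Viterbi decoder over the toy model above; the trellis of path probabilities and backpointers mirrors the DP table in minimum edit distance. The function name and the log-space handling are my choices, not from the text, and the special end state qF is omitted for simplicity:

```python
import math

def viterbi(words, states, A, B, start="<s>"):
    """Return the most probable tag sequence for `words` (log-space DP).
    The end state q_F is omitted for simplicity."""
    def logp(p):
        return math.log(p) if p > 0 else float("-inf")

    # Initialization: transitions out of the start state times first emission.
    trellis = [{s: logp(A[start].get(s, 0)) + logp(B[s].get(words[0], 0))
                for s in states}]
    backptr = [{}]

    # Recursion: each cell keeps the best path probability into that state.
    for t in range(1, len(words)):
        trellis.append({})
        backptr.append({})
        for s in states:
            best_prev = max(states,
                            key=lambda r: trellis[t - 1][r] + logp(A[r].get(s, 0)))
            trellis[t][s] = (trellis[t - 1][best_prev]
                             + logp(A[best_prev].get(s, 0))
                             + logp(B[s].get(words[t], 0)))
            backptr[t][s] = best_prev

    # Termination and backtrace.
    last = max(states, key=lambda s: trellis[-1][s])
    tags = [last]
    for t in range(len(words) - 1, 0, -1):
        tags.append(backptr[t][tags[-1]])
    return list(reversed(tags))

print(viterbi(["the", "dog", "walk"], states, A, B))  # ['DET', 'NOUN', 'VERB']
```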
Extending the HMM to Trigrams:
See the paper "TnT -- A Statistical Part-of-Speech Tagger" (Brants, 2000).
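The core idea in TnT is to condition each transition on the two previous tags rather than one, and to smooth the sparse trigram estimates by deleted interpolation of the unigram, bigram, and trigram maximum likelihood estimates:

```latex
% Trigram transition probability, smoothed by deleted interpolation (TnT):
P(t_i \mid t_{i-1}, t_{i-2}) =
    \lambda_1 \hat{P}(t_i)
  + \lambda_2 \hat{P}(t_i \mid t_{i-1})
  + \lambda_3 \hat{P}(t_i \mid t_{i-1}, t_{i-2}),
\qquad \lambda_1 + \lambda_2 + \lambda_3 = 1
```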
Note
The HMM taggers described above are trained on hand-tagged data. A tagger using the EM algorithm can instead start from untagged data, but even a small amount of hand-tagged training data works better than EM. The EM-trained "pure HMM" tagger is probably best suited to cases for which no training data is available.
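As a sketch of what training on hand-tagged data means in practice (the corpus format and function name below are assumptions), the transition and emission probabilities are simply normalized counts, i.e. maximum likelihood estimates:

```python
from collections import Counter, defaultdict

def train_hmm(tagged_sentences):
    """MLE estimates from a hand-tagged corpus:
      P(tag_j | tag_i) = C(tag_i, tag_j) / C(tag_i)
      P(word  | tag)   = C(tag, word)    / C(tag)
    Each sentence is a list of (word, tag) pairs (an assumed format)."""
    trans, emit = defaultdict(Counter), defaultdict(Counter)
    for sent in tagged_sentences:
        prev = "<s>"                      # special start state q0
        for word, tag in sent:
            trans[prev][tag] += 1
            emit[tag][word] += 1
            prev = tag
    A = {p: {t: c / sum(cnt.values()) for t, c in cnt.items()}
         for p, cnt in trans.items()}
    B = {t: {w: c / sum(cnt.values()) for w, c in cnt.items()}
         for t, cnt in emit.items()}
    return A, B

A, B = train_hmm([[("the", "DET"), ("dog", "NOUN"), ("runs", "VERB")]])
```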