N-gram Language Models

Consider an arbitrary sequence of \(m\) words, \(w_1, w_2, ..., w_m\).

Joint Probability

\(P(w_1, w_2, ..., w_m)\)

Conditional Probability (Chain Rule)

\(P(w_1, w_2, ..., w_m)=P(w_1)P(w_2|w_1)P(w_3|w_1, w_2)...P(w_m|w_1, ..., w_{m-1})\)

Markov Assumption

A Markov chain or Markov process is a stochastic model describing a sequence of possible events in which the probability of each event depends only on the state attained in the previous event. Applied to language modeling, this means we approximate the full history of a word by only its previous \(n-1\) words:

\(P(w_i|w_1, ..., w_{i-1})\approx P(w_i|w_{i-n+1}, ..., w_{i-1})\)

Unigram Model (n = 1)

\(P(w_1, w_2, ..., w_m)=\prod_{i=1}^{m}P(w_i)\)
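A unigram model ignores all context, so each word's probability is just its relative frequency in the training data. A minimal sketch, using a hypothetical toy corpus and maximum-likelihood counts:

```python
from collections import Counter

# Hypothetical toy corpus; a real model would train on a large text collection.
corpus = "the cat sat on the mat the cat ate".split()

counts = Counter(corpus)
total = sum(counts.values())

def unigram_prob(word):
    # MLE estimate: count(w) / total number of tokens.
    return counts[word] / total

def sentence_prob(sentence):
    # Product over per-word probabilities, per the unigram formula above.
    p = 1.0
    for w in sentence.split():
        p *= unigram_prob(w)
    return p

print(unigram_prob("the"))       # 3/9
print(sentence_prob("the cat"))  # (3/9) * (2/9)
```

Note that an unseen word gets probability zero and drives the whole product to zero; in practice this is handled with smoothing (e.g. add-one/Laplace).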

Bigram Model (n = 2)

\(P(w_1, w_2, ..., w_m)=\prod_{i=1}^{m}P(w_i|w_{i-1})\)
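A bigram model conditions each word on the one before it, estimated as \(P(w_i|w_{i-1}) = \frac{count(w_{i-1}, w_i)}{count(w_{i-1})}\). A sketch with a hypothetical toy corpus, using the common convention of sentence-boundary markers `<s>` and `</s>` so the first real word also has a left context:

```python
from collections import Counter

# Hypothetical toy corpus with sentence-boundary markers.
sentences = [
    "<s> the cat sat </s>",
    "<s> the cat ate </s>",
    "<s> the dog sat </s>",
]

bigram_counts = Counter()
context_counts = Counter()
for sent in sentences:
    words = sent.split()
    for prev, cur in zip(words, words[1:]):
        bigram_counts[(prev, cur)] += 1
        context_counts[prev] += 1

def bigram_prob(prev, cur):
    # MLE estimate: count(prev, cur) / count(prev).
    return bigram_counts[(prev, cur)] / context_counts[prev]

def sentence_prob(sentence):
    # Product of bigram probabilities over adjacent word pairs.
    words = sentence.split()
    p = 1.0
    for prev, cur in zip(words, words[1:]):
        p *= bigram_prob(prev, cur)
    return p

print(bigram_prob("the", "cat"))              # 2/3
print(sentence_prob("<s> the cat sat </s>"))  # 1 * 2/3 * 1/2 * 1 = 1/3
```

For long sentences the product underflows; implementations usually sum log-probabilities instead.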

Trigram Model (n = 3)

\(P(w_1, w_2, ..., w_m)=\prod_{i=1}^{m}P(w_i|w_{i-2}w_{i-1})\)
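The same counting scheme generalizes to any \(n\): an n-gram probability is the n-gram count divided by the count of its \((n-1)\)-word context. A sketch with a generic counting helper and a hypothetical toy corpus:

```python
from collections import Counter

def ngram_counts(tokens, n):
    # Count every contiguous n-gram in the token list.
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

# Hypothetical toy corpus.
tokens = "the cat sat on the mat and the cat sat down".split()

tri = ngram_counts(tokens, 3)
bi = ngram_counts(tokens, 2)

def trigram_prob(w1, w2, w3):
    # MLE estimate: count(w1, w2, w3) / count(w1, w2).
    return tri[(w1, w2, w3)] / bi[(w1, w2)]

print(trigram_prob("the", "cat", "sat"))  # 2/2 = 1.0
```

Larger \(n\) captures more context but makes counts sparser, which is why trigram models in practice rely heavily on smoothing or back-off to bigram and unigram estimates.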

posted @ 2022-06-20 18:04 千心