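The two algorithms for the prediction (decoding) problem of hidden Markov models, the Viterbi algorithm and the approximate algorithm, produce probability values that underflow to 0 on very long sequences, which makes the program fail. This article solves the problem by simply rescaling the intermediate variables at every step.
Original Viterbi algorithm
(1) Initialization (the initial state distribution multiplied by the emission probability of the first observation $ o_{1} $):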
\[\begin{array}{l}
\delta_{1}(i)=\pi_{i} b_{i}\left(o_{1}\right), \quad i=1,2, \cdots, N \\
\psi_{1}(i)=0, \quad i=1,2, \ldots, N
\end{array}
\]
(2) Recursion. For $ t=2,3, \ldots, T $:
\[\delta_{t}(i)=\max _{1 \leqslant j \leqslant N}\left[\delta_{t-1}(j) a_{j i}\right] b_{i}\left(o_{t}\right), \quad i=1,2, \ldots, N
\]
Record the best predecessor state:
\[\psi_{t}(i)=\arg \max _{1 \leqslant j \leqslant N}\left[\delta_{t-1}(j) a_{j i}\right], \quad i=1,2, \ldots, N
\]
(3) Termination:
\[\begin{array}{l}
P^{*}=\max _{1 \leqslant i \leqslant N} \delta_{T}(i) \\
i_{T}^{*}=\arg \max _{1 \leq i \leq N}\left[\delta_{T}(i)\right]
\end{array}
\]
(4) Backtracking of the optimal path. For $ t=T-1, T-2, \cdots, 1 $:
\[i_{t}^{*}=\psi_{t+1}\left(i_{t+1}^{*}\right)
\]
The resulting optimal path, i.e. the most likely state sequence, is:
\[I^{*}=\left(i_{1}^{*}, i_{2}^{*}, \cdots, i_{T}^{*}\right)
\]
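The four steps above can be sketched in plain Python as follows. This is a minimal illustration, not the author's code: the function name `viterbi`, the 0-based state indices, and the small 3-state, 2-symbol model are assumptions introduced here. The last call shows the underflow problem on a sequence of length 2000.

```python
def viterbi(pi, A, B, obs):
    """Plain Viterbi decoding; returns (P*, best path) with 0-based states."""
    N, T = len(pi), len(obs)
    # (1) Initialization: delta_1(i) = pi_i * b_i(o_1), psi_1(i) = 0
    delta = [pi[i] * B[i][obs[0]] for i in range(N)]
    psi = [[0] * N]
    # (2) Recursion for t = 2, ..., T
    for t in range(1, T):
        new_delta, back = [], []
        for i in range(N):
            j_best = max(range(N), key=lambda j: delta[j] * A[j][i])
            back.append(j_best)
            new_delta.append(delta[j_best] * A[j_best][i] * B[i][obs[t]])
        delta = new_delta
        psi.append(back)
    # (3) Termination
    last = max(range(N), key=lambda i: delta[i])
    p_star = delta[last]
    # (4) Backtracking: i_t = psi_{t+1}(i_{t+1})
    path = [last]
    for t in range(T - 1, 0, -1):
        path.append(psi[t][path[-1]])
    path.reverse()
    return p_star, path


# Illustrative model (an assumption, not taken from the text above).
pi = [0.2, 0.4, 0.4]
A = [[0.5, 0.2, 0.3], [0.3, 0.5, 0.2], [0.2, 0.3, 0.5]]
B = [[0.5, 0.5], [0.4, 0.6], [0.7, 0.3]]

p_short, path_short = viterbi(pi, A, B, [0, 1, 0])
print(path_short)  # [2, 2, 2]

p_long, path_long = viterbi(pi, A, B, [0, 1] * 1000)  # T = 2000
print(p_long)  # 0.0: every delta has underflowed, so path_long is meaningless
```

Each step multiplies by at most 0.5 * 0.7 here, so after 2000 steps the values are far below the smallest positive double and collapse to exactly 0.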
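The original Viterbi algorithm is modified as follows: after the local state vector \(\boldsymbol{\delta_t}\) at time \(t\) has been computed, \(\boldsymbol{\delta_t}\) is rescaled by its largest entry, so that the largest entry is always 1 and the values cannot shrink toward 0 from step to step:
\[\boldsymbol{\delta_t}=\boldsymbol{\delta_t}/\max(\boldsymbol{\delta_t})
\]
Improved Viterbi algorithm
(1) Initialization (the initial state distribution multiplied by the emission probability of the first observation $ o_{1} $):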
\[\begin{array}{l}
\delta_{1}(i)=\pi_{i} b_{i}\left(o_{1}\right), \quad i=1,2, \cdots, N \\
\psi_{1}(i)=0, \quad i=1,2, \ldots, N
\end{array}
\]
Rescale the intermediate state variable \(\boldsymbol{\delta_1}\):
\[\boldsymbol{\delta_1}=\boldsymbol{\delta_1}/\max(\boldsymbol{\delta_1})
\]
(2) Recursion. For $ t=2,3, \ldots, T $:
\[\delta_{t}(i)=\max _{1 \leqslant j \leqslant N}\left[\delta_{t-1}(j) a_{j i}\right] b_{i}\left(o_{t}\right), \quad i=1,2, \ldots, N
\]
Record the best predecessor state:
\[\psi_{t}(i)=\arg \max _{1 \leqslant j \leqslant N}\left[\delta_{t-1}(j) a_{j i}\right], \quad i=1,2, \ldots, N
\]
Rescale the intermediate state variable \(\boldsymbol{\delta_t}\):
\[\boldsymbol{\delta_t}=\boldsymbol{\delta_t}/\max(\boldsymbol{\delta_t})
\]
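(3) Termination. After rescaling, $ \max_{i} \delta_{T}(i)=1 $, so the rescaled value is no longer the path probability $ P^{*} $; only the arg max is used:
\[i_{T}^{*}=\arg \max _{1 \leqslant i \leqslant N}\left[\delta_{T}(i)\right]
\]
(4) Backtracking of the optimal path. For $ t=T-1, T-2, \cdots, 1 $: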
\[i_{t}^{*}=\psi_{t+1}\left(i_{t+1}^{*}\right)
\]
The resulting optimal path, i.e. the most likely state sequence, is:
\[I^{*}=\left(i_{1}^{*}, i_{2}^{*}, \cdots, i_{T}^{*}\right)
\]
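A minimal Python sketch of the rescaled Viterbi steps above follows. The name `viterbi_scaled`, the 0-based state indices, and the small model are assumptions made for illustration. The sketch also accumulates the logarithms of the scaling factors, which is an addition not described in the text: their sum equals $\log P^{*}$, because the product of the factors divided out is $\max(\boldsymbol{\delta_T})$ of the unscaled recursion. It assumes every scaling factor is positive, i.e. the observation sequence has non-zero probability.

```python
import math


def viterbi_scaled(pi, A, B, obs):
    """Viterbi with per-step max-rescaling; returns (log P*, best path)."""
    N, T = len(pi), len(obs)
    # (1) Initialization followed by rescaling
    delta = [pi[i] * B[i][obs[0]] for i in range(N)]
    c = max(delta)  # scaling factor, assumed > 0
    delta = [d / c for d in delta]
    log_p = math.log(c)
    psi = [[0] * N]
    # (2) Recursion, rescaling after every step
    for t in range(1, T):
        new_delta, back = [], []
        for i in range(N):
            j_best = max(range(N), key=lambda j: delta[j] * A[j][i])
            back.append(j_best)
            new_delta.append(delta[j_best] * A[j_best][i] * B[i][obs[t]])
        c = max(new_delta)
        delta = [d / c for d in new_delta]
        log_p += math.log(c)
        psi.append(back)
    # (3) Termination: only the arg max is meaningful after rescaling
    last = max(range(N), key=lambda i: delta[i])
    # (4) Backtracking
    path = [last]
    for t in range(T - 1, 0, -1):
        path.append(psi[t][path[-1]])
    path.reverse()
    return log_p, path


# Illustrative model (an assumption, not taken from the text above).
pi = [0.2, 0.4, 0.4]
A = [[0.5, 0.2, 0.3], [0.3, 0.5, 0.2], [0.2, 0.3, 0.5]]
B = [[0.5, 0.5], [0.4, 0.6], [0.7, 0.3]]

log_p, path = viterbi_scaled(pi, A, B, [0, 1, 0])
print(path)  # [2, 2, 2]

log_long, path_long = viterbi_scaled(pi, A, B, [0, 1] * 1000)  # T = 2000
print(math.isfinite(log_long))  # True: no underflow on the long sequence
```

Dividing by the maximum keeps the largest entry at exactly 1, so the entries stay in a representable range no matter how long the sequence is.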
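The optimal path found by this algorithm is identical to the one found by the original algorithm.
Proof:
Let the intermediate variables of the original algorithm be \(\delta,\psi\) with optimal path \(I^*\), and those of the improved algorithm be \(\delta',\psi'\) with optimal path \(I^{*'}\).
At \(t=1\):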
\[\begin{array}{l}
\delta_{1}(i)=\pi_{i} b_{i}\left(o_{1}\right), \quad i=1,2, \cdots, N \\
\psi_{1}(i)=0, \quad i=1,2, \ldots, N
\end{array}
\]
Let \(a=\max(\boldsymbol{\delta_1})\). Then
\[\delta_{1}'(i)=\delta_1(i)/a=\delta_1(i)/\max(\boldsymbol{\delta_1}), \quad i=1,2, \ldots, N
\]
\[\psi_1'(i)=\psi_{1}(i)
\]
At \(t=2\):
\[\delta_{2}(i)=\max _{1 \leqslant j \leqslant N}\left[\delta_{1}(j) a_{j i}\right] b_{i}\left(o_{2}\right), \quad i=1,2, \ldots, N
\]
\[\psi_{2}(i)=\arg \max _{1 \leqslant j \leqslant N}\left[\delta_{1}(j) a_{j i}\right]
\]
Before rescaling, the improved algorithm gives
\[\begin{aligned}
\delta_{2}'(i)&=\max _{1 \leqslant j \leqslant N}\left[\delta_{1}'(j) a_{j i}\right] b_{i}\left(o_{2}\right) \\
&=\max _{1 \leqslant j \leqslant N}\left[\delta_{1}(j)/\max(\boldsymbol{\delta_{1}})\, a_{j i}\right] b_{i}\left(o_{2}\right)\\
&=\max _{1 \leqslant j \leqslant N}\left[\delta_{1}(j) a_{j i}\right] b_{i}\left(o_{2}\right)/\max(\boldsymbol{\delta_{1}})\\
&=\delta_{2}(i)/\max(\boldsymbol{\delta_{1}}), \quad i=1,2, \ldots, N
\end{aligned}
\]
Because \(\max(\boldsymbol{\delta_1})\) is a positive constant that does not depend on \(j\), it does not change the arg max:
\[\begin{aligned}
\psi_2'(i)&=\arg \max _{1 \leqslant j \leqslant N}\left[\delta_{1}'(j) a_{j i}\right]\\
&=\arg \max _{1 \leqslant j \leqslant N}\left[\delta_1(j)/\max(\boldsymbol{\delta_1})\, a_{j i}\right]\\
&=\arg \max _{1 \leqslant j \leqslant N}\left[\delta_1(j) a_{j i}\right]\\
&=\psi_2(i)
\end{aligned}
\]
Let
\[\begin{aligned}
a &=\max(\boldsymbol{\delta_{2}'})\\
&=\max_{1\leqslant i\leqslant N} \left[ \delta_{2}(i)/\max(\boldsymbol{\delta_{1}})\right]\\
&=\max(\boldsymbol{\delta_{2}})/\max(\boldsymbol{\delta_{1}})
\end{aligned}
\]
Rescaling then gives
\[\begin{aligned}
\delta_{2}'(i)&\leftarrow\delta_{2}'(i)/a\\
&=\left\{\delta_{2}(i)/\max(\boldsymbol{\delta_{1}})\right\}/\left\{\max(\boldsymbol{\delta_{2}})/\max(\boldsymbol{\delta_{1}})\right\}\\
&=\delta_{2}(i)/\max(\boldsymbol{\delta_{2}})
\end{aligned}
\]
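Repeating the same argument step by step (induction on \(t\)) gives
\[\delta_{t}'(i)=\delta_{t}(i)/\max(\boldsymbol{\delta_{t}}), \quad i=1,2, \ldots, N, \quad t=2,3, \ldots, T
\]
\[\psi_t'(i)=\psi_t(i), \quad i=1,2, \ldots, N, \quad t=2,3, \ldots, T
\]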
Termination (assuming \(P^{*}>0\)). The rescaled maximum is
\[\begin{aligned}
P^{*'}&=\max _{1 \leqslant i \leqslant N} \delta_{T}'(i) \\
&= \max _{1 \leqslant i \leqslant N}\left[\delta_{T}(i)/\max(\boldsymbol{\delta_{T}})\right]\\
&= P^{*}/P^{*}=1
\end{aligned}
\]
so \(P^{*'}\) differs from \(P^{*}\) only by the positive factor \(1/\max(\boldsymbol{\delta_{T}})\), which leaves the arg max unchanged:
\[\begin{aligned}
i_{T}^{*'}&=\arg \max _{1 \leqslant i \leqslant N}\left[\delta_{T}'(i)\right]\\
&=\arg \max _{1 \leqslant i \leqslant N}\left[\delta_{T}(i)/\max(\boldsymbol{\delta_{T}})\right]\\
&=\arg \max _{1 \leqslant i \leqslant N}\left[\delta_{T}(i)\right]\\
&=i_{T}^{*}
\end{aligned}
\]
Backtracking of the optimal path. For $ t=T-1, T-2, \cdots, 1 $:
\[\begin{aligned}
i_{t}^{*'}&=\psi_{t+1}'\left(i_{t+1}^{*'}\right)\\
&=\psi_{t+1}\left(i_{t+1}^{*}\right)\\
&=i_{t}^{*}
\end{aligned}
\]
The resulting optimal path, i.e. the most likely state sequence, is:
\[\begin{aligned}
I^{*'}&=\left(i_{1}^{*'}, i_{2}^{*'}, \cdots, i_{T}^{*'}\right) \\
&=\left(i_{1}^{*}, i_{2}^{*}, \cdots, i_{T}^{*}\right)\\
&=I^{*}
\end{aligned}
\]
This completes the proof.
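As a side remark, the same bookkeeping suggests how the path probability could be recovered from the rescaled run. The symbol \(c_t\) is introduced here for illustration and does not appear in the algorithm above: let \(c_t\) be the value divided out at step \(t\), that is, the maximum of the step-\(t\) vector just before it is rescaled. From the proof, \(c_1=\max(\boldsymbol{\delta_1})\) and \(c_t=\max(\boldsymbol{\delta_t})/\max(\boldsymbol{\delta_{t-1}})\) for \(t \geqslant 2\), so the product telescopes:
\[\prod_{t=1}^{T} c_{t}=\max(\boldsymbol{\delta_{1}}) \prod_{t=2}^{T} \frac{\max(\boldsymbol{\delta_{t}})}{\max(\boldsymbol{\delta_{t-1}})}=\max(\boldsymbol{\delta_{T}})=P^{*}
\]
\[\log P^{*}=\sum_{t=1}^{T} \log c_{t}
\]
Summing logarithms rather than multiplying the factors keeps the result representable even when \(P^{*}\) itself would underflow.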
Approximate algorithm
Forward algorithm
- Compute the initial values:
\[\alpha_{1}(i)=\pi_{i} b_{i}\left(o_{1}\right), \quad i=1,2, \cdots, N
\]
- Recursion: for $ t=1,2, \cdots, T-1 $, compute the forward probability of being in state $ q_{i} $ at time $ t+1 $:
\[\alpha_{t+1}(i)=\left[\sum_{j=1}^{N} \alpha_{t}(j) a_{j i}\right] b_{i}\left(o_{t+1}\right), \quad i=1,2, \cdots, N
\]
Backward algorithm
- Compute the initial values:
\[\beta_{T}(i)=1, \quad i=1,2, \cdots, N
\]
- Backward recursion, for $ t=T-1, T-2, \cdots, 1 $:
\[\beta_{t}(i)=\sum_{j=1}^{N} a_{i j} b_{j}\left(o_{t+1}\right) \beta_{t+1}(j), \quad i=1,2, \cdots, N
\]
Approximate algorithm
\[\gamma_{t}(i)=\frac{\alpha_{t}(i) \beta_{t}(i)}{P(O \mid \lambda)}=\frac{\alpha_{t}(i) \beta_{t}(i)}{\sum_{j=1}^{N} \alpha_{t}(j) \beta_{t}(j)}
\]
At each time $ t $, the most likely state \(i_{t}^{*}\) is:
\[i_{t}^{*}=\arg \max _{1 \leqslant i \leqslant N}\left[\gamma_{t}(i)\right], \quad t=1,2, \cdots, T
\]
which gives the state sequence $ I^{*} $:
\[I^{*}=\left(i_{1}^{*}, i_{2}^{*}, \cdots, i_{T}^{*}\right)
\]
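The unscaled forward pass, backward pass, and per-time arg max above can be sketched in plain Python as follows. The name `posterior_decode`, the 0-based state indices, and the small model are assumptions made for illustration. On a long sequence the denominator underflows to 0, and the division fails.

```python
def posterior_decode(pi, A, B, obs):
    """Unscaled forward-backward; returns (gamma, path) with 0-based states."""
    N, T = len(pi), len(obs)
    # Forward pass: alpha_1(i) = pi_i * b_i(o_1), then the recursion
    alpha = [[pi[i] * B[i][obs[0]] for i in range(N)]]
    for t in range(1, T):
        prev = alpha[-1]
        alpha.append([sum(prev[j] * A[j][i] for j in range(N)) * B[i][obs[t]]
                      for i in range(N)])
    # Backward pass: beta_T(i) = 1, then the recursion for t = T-1, ..., 1
    beta = [[1.0] * N for _ in range(T)]
    for t in range(T - 2, -1, -1):
        beta[t] = [sum(A[i][j] * B[j][obs[t + 1]] * beta[t + 1][j]
                       for j in range(N)) for i in range(N)]
    # gamma_t(i) and the most likely state at each time
    gamma, path = [], []
    for t in range(T):
        w = [alpha[t][i] * beta[t][i] for i in range(N)]
        total = sum(w)  # equals P(O | lambda); becomes 0.0 after underflow
        gamma.append([x / total for x in w])
        path.append(max(range(N), key=lambda i: w[i]))
    return gamma, path


# Illustrative model (an assumption, not taken from the text above).
pi = [0.2, 0.4, 0.4]
A = [[0.5, 0.2, 0.3], [0.3, 0.5, 0.2], [0.2, 0.3, 0.5]]
B = [[0.5, 0.5], [0.4, 0.6], [0.7, 0.3]]

gamma, path = posterior_decode(pi, A, B, [0, 1, 0])
print(path)  # [2, 1, 2]

try:
    posterior_decode(pi, A, B, [0, 1] * 1000)  # T = 2000
    failed = False
except ZeroDivisionError:
    failed = True
print(failed)  # True: the denominator has underflowed to 0.0
```

For this model the result `[2, 1, 2]` differs from the Viterbi path `[2, 2, 2]` on the same observations, since the approximate algorithm picks the best state at each time independently rather than the best whole path.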
As in the improved Viterbi algorithm, the intermediate forward probabilities \(\alpha_t(i)\) and backward probabilities \(\beta_t(i)\) are rescaled as soon as they are computed:
\[\begin{array}{l}
\alpha_t(i)=\alpha_t(i)/\max(\boldsymbol{\alpha_t})\\
\beta_t(i)=\beta_t(i)/\max(\boldsymbol{\beta_t})
\end{array}
\]
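The forward and backward passes of the improved approximate algorithm then proceed as follows:
Forward algorithm
- Compute the initial values: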
\[\alpha_{1}(i)=\pi_{i} b_{i}\left(o_{1}\right), \quad i=1,2, \cdots, N
\]
\[\alpha_1(i)=\alpha_1(i)/\max(\boldsymbol{\alpha_1})
\]
- Forward recursion, for $ t=1,2, \cdots, T-1 $:
\[\alpha_{t+1}(i)=\left[\sum_{j=1}^{N} \alpha_{t}(j) a_{j i}\right] b_{i}\left(o_{t+1}\right), \quad i=1,2, \cdots, N
\]
\[\alpha_{t+1}(i)=\alpha_{t+1}(i)/\max(\boldsymbol{\alpha_{t+1}})
\]
Backward algorithm
- Compute the initial values:
\[\beta_{T}(i)=1, \quad i=1,2, \cdots, N
\]
\[\beta_T(i)=\beta_T(i)/\max(\boldsymbol{\beta_T})
\]
- Backward recursion, for $ t=T-1, T-2, \cdots, 1 $:
\[\beta_{t}(i)=\sum_{j=1}^{N} a_{i j} b_{j}\left(o_{t+1}\right) \beta_{t+1}(j), \quad i=1,2, \cdots, N
\]
\[\beta_t(i)=\beta_t(i)/\max(\boldsymbol{\beta_t})
\]
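Approximate algorithm
With the rescaled \(\alpha_t\) and \(\beta_t\), the sum \(\sum_{j} \alpha_{t}(j) \beta_{t}(j)\) no longer equals \(P(O \mid \lambda)\), so \(\gamma_t(i)\) is obtained by normalizing over the states:
\[\gamma_{t}(i)=\frac{\alpha_{t}(i) \beta_{t}(i)}{\sum_{j=1}^{N} \alpha_{t}(j) \beta_{t}(j)}
\]
At each time $ t $, the most likely state \(i_{t}^{*}\) is: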
\[i_{t}^{*}=\arg \max _{1 \leqslant i \leqslant N}\left[\gamma_{t}(i)\right], \quad t=1,2, \cdots, T
\]
which gives the state sequence $ I^{*} $:
\[I^{*}=\left(i_{1}^{*}, i_{2}^{*}, \cdots, i_{T}^{*}\right)
\]
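The rescaled forward-backward procedure above can be sketched in plain Python as follows. The names `posterior_decode_scaled` and `rescale`, the 0-based state indices, and the small model are assumptions made for illustration. The sketch assumes every vector has a positive maximum, i.e. the observation sequence has non-zero probability.

```python
def posterior_decode_scaled(pi, A, B, obs):
    """Forward-backward with max-rescaling; returns (gamma, path)."""
    N, T = len(pi), len(obs)

    def rescale(v):
        m = max(v)  # assumed > 0
        return [x / m for x in v]

    # Forward pass, rescaling alpha after every step
    alpha = [rescale([pi[i] * B[i][obs[0]] for i in range(N)])]
    for t in range(1, T):
        prev = alpha[-1]
        alpha.append(rescale(
            [sum(prev[j] * A[j][i] for j in range(N)) * B[i][obs[t]]
             for i in range(N)]))
    # Backward pass, rescaling beta after every step (beta_T is already 1)
    beta = [[1.0] * N for _ in range(T)]
    for t in range(T - 2, -1, -1):
        beta[t] = rescale(
            [sum(A[i][j] * B[j][obs[t + 1]] * beta[t + 1][j]
                 for j in range(N)) for i in range(N)])
    # gamma_t(i) by normalizing over states; the scale factors cancel
    gamma, path = [], []
    for t in range(T):
        w = [alpha[t][i] * beta[t][i] for i in range(N)]
        total = sum(w)
        gamma.append([x / total for x in w])
        path.append(max(range(N), key=lambda i: w[i]))
    return gamma, path


# Illustrative model (an assumption, not taken from the text above).
pi = [0.2, 0.4, 0.4]
A = [[0.5, 0.2, 0.3], [0.3, 0.5, 0.2], [0.2, 0.3, 0.5]]
B = [[0.5, 0.5], [0.4, 0.6], [0.7, 0.3]]

gamma, path = posterior_decode_scaled(pi, A, B, [0, 1, 0])
print(path)  # [2, 1, 2]

gamma_long, path_long = posterior_decode_scaled(pi, A, B, [0, 1] * 1000)
print(len(path_long))  # 2000
```

Because \(\gamma_t\) is a ratio, the per-time scale factors of \(\alpha_t\) and \(\beta_t\) cancel, so no record of them needs to be kept for decoding.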
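The improved approximate algorithm yields the same state sequence as the original algorithm.
Proof:
Let the intermediate forward and backward probabilities of the original algorithm be \(\alpha,\beta\), and those of the improved approximate algorithm be \(\alpha',\beta'\).
Forward algorithm
At \(t=1\):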
\[\alpha_{1}(i)=\pi_{i} b_{i}\left(o_{1}\right), \quad i=1,2, \cdots, N
\]
\[\alpha_1'(i)=\alpha_1(i)/\max(\boldsymbol{\alpha_1}) \quad i=1,2, \cdots, N
\]
At \(t=2\):
\[\alpha_{2}(i)=\left[\sum_{j=1}^{N} \alpha_{1}(j) a_{j i}\right] b_{i}\left(o_{2}\right), \quad i=1,2, \cdots, N
\]
Before rescaling, the improved algorithm gives
\[\begin{aligned}
\alpha_{2}'(i)&=\left[\sum_{j=1}^{N} \alpha_{1}'(j) a_{j i}\right] b_{i}\left(o_{2}\right)\\
&=\left[\sum_{j=1}^{N} \alpha_1(j)/\max(\boldsymbol{\alpha_1})\, a_{j i}\right] b_{i}\left(o_{2}\right)\\
&=\left[\sum_{j=1}^{N} \alpha_1(j) a_{j i}\right] b_{i}\left(o_{2}\right)/\max(\boldsymbol{\alpha_1})\\
&=\alpha_2(i)/\max(\boldsymbol{\alpha_1}), \quad i=1,2, \cdots, N
\end{aligned}
\]
Let
\[\begin{aligned}
a &=\max(\boldsymbol{\alpha_{2}'})\\
&=\max_{1\leqslant i\leqslant N} \left[ \alpha_{2}(i)/\max(\boldsymbol{\alpha_{1}})\right]\\
&=\max(\boldsymbol{\alpha_{2}})/\max(\boldsymbol{\alpha_{1}})
\end{aligned}
\]
Rescaling then gives
\[\begin{aligned}
\alpha_{2}'(i)&\leftarrow\alpha_{2}'(i)/a\\
&=\left\{\alpha_2(i)/\max(\boldsymbol{\alpha_1})\right\}/\left\{\max(\boldsymbol{\alpha_{2}})/\max(\boldsymbol{\alpha_{1}})\right\}\\
&=\alpha_2(i)/\max(\boldsymbol{\alpha_{2}})
\end{aligned}
\]
By induction on \(t\),
\[\alpha_{t}'(i)=\alpha_t(i)/\max(\boldsymbol{\alpha_{t}}), \quad t=2,3,\cdots, T
\]
Backward algorithm
At \(t=T\):
\[\beta_{T}(i)=1 \quad i=1,2, \cdots, N
\]
\[\beta_T'(i)=\beta_T(i)/\max(\boldsymbol{\beta_T}) \quad i=1,2, \cdots, N
\]
At \(t=T-1\):
\[\beta_{T-1}(i)=\sum_{j=1}^{N} a_{i j} b_{j}\left(o_{T}\right) \beta_{T}(j), \quad i=1,2, \cdots, N
\]
Before rescaling, the improved algorithm gives
\[\begin{aligned}
\beta_{T-1}'(i)&=\sum_{j=1}^{N} a_{i j} b_{j}\left(o_{T}\right) \beta_{T}'(j)\\
&=\sum_{j=1}^{N} a_{i j} b_{j}\left(o_{T}\right)\beta_T(j)/\max(\boldsymbol{\beta_T})\\
&=\beta_{T-1}(i)/\max(\boldsymbol{\beta_T}), \quad i=1,2, \cdots, N
\end{aligned}
\]
Let
\[\begin{aligned}
a &=\max(\boldsymbol{\beta_{T-1}'})\\
&=\max_{1\leqslant i\leqslant N} \left[ \beta_{T-1}(i)/\max(\boldsymbol{\beta_{T}})\right]\\
&=\max(\boldsymbol{\beta_{T-1}})/\max(\boldsymbol{\beta_{T}})
\end{aligned}
\]
Rescaling then gives
\[\begin{aligned}
\beta_{T-1}'(i)&\leftarrow\beta_{T-1}'(i)/a\\
&=\left\{\beta_{T-1}(i)/\max(\boldsymbol{\beta_T})\right\}/\left\{\max(\boldsymbol{\beta_{T-1}})/\max(\boldsymbol{\beta_{T}})\right\}\\
&=\beta_{T-1}(i)/\max(\boldsymbol{\beta_{T-1}})
\end{aligned}
\]
By induction on \(t\),
\[\beta_{t}'(i)=\beta_t(i)/\max(\boldsymbol{\beta_{t}}), \quad t=1,2,\cdots, T-1
\]
Improved approximate algorithm
The scale factors \(\max(\boldsymbol{\alpha_t})\) and \(\max(\boldsymbol{\beta_t})\) do not depend on the state, so they cancel between numerator and denominator:
\[\begin{aligned}
\gamma_{t}'(i)&=\frac{\alpha_{t}'(i) \beta_{t}'(i)}{\sum_{j=1}^{N} \alpha_{t}'(j) \beta_{t}'(j)}\\
&=\frac{\alpha_t(i)\beta_t(i)/\left\{\max(\boldsymbol{\alpha_{t}})\max(\boldsymbol{\beta_{t}})\right\}}{\sum_{j=1}^{N} \alpha_t(j)\beta_t(j)/\left\{\max(\boldsymbol{\alpha_{t}})\max(\boldsymbol{\beta_{t}})\right\}}\\
&=\frac{\alpha_{t}(i) \beta_{t}(i)}{\sum_{j=1}^{N} \alpha_{t}(j) \beta_{t}(j)}\\
&=\gamma_{t}(i), \quad i=1,2,\cdots,N
\end{aligned}
\]
At each time $ t $, the most likely state \(i_{t}^{*'}\) is:
\[\begin{aligned}
i_{t}^{*'}&=\arg \max _{1 \leqslant i \leqslant N}\left[\gamma_{t}'(i)\right]\\
&=\arg \max _{1 \leqslant i \leqslant N}\left[\gamma_{t}(i)\right]\\
&=i_{t}^{*}, \quad t=1,2, \cdots, T
\end{aligned}
\]
which gives the state sequence $ I^{*'} $:
\[I^{*'}=I^{*}=\left(i_{1}^{*}, i_{2}^{*}, \cdots, i_{T}^{*}\right)
\]
This completes the proof.