HMM Study Notes (3) - Patterns generated by a hidden process

2007-12-18 20:31

3. Patterns generated by a hidden process

3.1 When a Markov process may not be powerful enough

In some cases the patterns that we wish to find are not described sufficiently by a Markov process. Returning to the weather example, a hermit may perhaps not have access to direct weather observations, but does have a piece of seaweed. Folklore tells us that the state of the seaweed is probabilistically related to the state of the weather - the weather and seaweed states are closely linked. In this case we have two sets of states, the observable states (the state of the seaweed) and the hidden states (the state of the weather). We wish to devise an algorithm for the hermit to forecast weather from the seaweed and the Markov assumption without actually ever seeing the weather.

 

A more realistic problem is that of recognizing speech; the sound that we hear is the product of the vocal cords, size of throat, position of tongue and several other things. Each of these factors interacts to produce the sound of a word, and the sounds that a speech recognition system detects are the changing sound generated from the internal physical changes in the person speaking.

Some speech recognition devices work by considering the internal speech production to be a sequence of hidden states, and the resulting sound to be a sequence of observable states generated by the speech process that at best approximates the true (hidden) states. In both examples it is important to note that the number of states in the hidden process and the number of observable states may be different. In a three-state weather system (sunny, cloudy, rainy) it may be possible to observe four grades of seaweed dampness (dry, dryish, damp, soggy); pure speech may be described by (say) 80 phonemes, while a physical speech system may generate a number of distinguishable sounds that is either greater or smaller than 80.

 

 

In such cases the observed sequence of states is probabilistically related to the hidden process. We model such processes using a hidden Markov model where there is an underlying hidden Markov process changing over time, and a set of observable states which are related somehow to the hidden states.
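
As a rough illustration of the idea above, the following sketch samples a hidden weather sequence and the seaweed sequence it gives rise to. All probabilities, the `INITIAL` distribution, and the helper names `draw` and `sample` are assumptions made up for this example; none of them come from the text.

```python
import random

# Hypothetical numbers, chosen only for illustration.
HIDDEN = ["sunny", "cloudy", "rainy"]
OBSERVABLE = ["dry", "dryish", "damp", "soggy"]

# Assumed distribution over the first hidden state.
INITIAL = {"sunny": 0.5, "cloudy": 0.3, "rainy": 0.2}

# First-order Markov process on the hidden states:
# TRANSITION[a][b] = Pr(tomorrow is b | today is a).
TRANSITION = {
    "sunny":  {"sunny": 0.6, "cloudy": 0.3, "rainy": 0.1},
    "cloudy": {"sunny": 0.3, "cloudy": 0.4, "rainy": 0.3},
    "rainy":  {"sunny": 0.2, "cloudy": 0.3, "rainy": 0.5},
}

# Link between hidden and observable states:
# EMISSION[a][o] = Pr(seaweed is o | weather is a).
EMISSION = {
    "sunny":  {"dry": 0.55, "dryish": 0.25, "damp": 0.15, "soggy": 0.05},
    "cloudy": {"dry": 0.25, "dryish": 0.25, "damp": 0.25, "soggy": 0.25},
    "rainy":  {"dry": 0.05, "dryish": 0.15, "damp": 0.30, "soggy": 0.50},
}


def draw(distribution, rng):
    """Draw one outcome from a {outcome: probability} mapping."""
    outcomes = list(distribution)
    weights = [distribution[o] for o in outcomes]
    return rng.choices(outcomes, weights=weights, k=1)[0]


def sample(length, seed=0):
    """Return (hidden_sequence, observed_sequence), each with `length` items."""
    rng = random.Random(seed)
    hidden, observed = [], []
    state = draw(INITIAL, rng)
    for _ in range(length):
        hidden.append(state)
        # The observation depends only on the current hidden state.
        observed.append(draw(EMISSION[state], rng))
        # The next hidden state depends only on the current one.
        state = draw(TRANSITION[state], rng)
    return hidden, observed


weather, seaweed = sample(7)
```

The hermit would only ever see `seaweed`; `weather` is the sequence to be inferred.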

 

3.2 Hidden Markov Models

The diagram below shows the hidden and observable states in the weather example. It is assumed that the hidden states (the true weather) are modeled by a simple first order Markov process, and so they are all connected to each other.

 

 

 

The connections between the hidden states and the observable states represent the probability of generating a particular observed state given that the Markov process is in a particular hidden state. It should thus be clear that all probabilities leaving a hidden state will sum to 1, since for the Sun state, for example, it would be the sum of Pr(Dry|Sun), Pr(Dryish|Sun), Pr(Damp|Sun) and Pr(Soggy|Sun). The probabilities `entering' an observable state, Pr(Obs|Sun), Pr(Obs|Cloud) and Pr(Obs|Rain), are conditioned on different hidden states and so need not sum to 1.

 

Translator's note: how does this differ from the matrix below?

 

In addition to the probabilities defining the Markov process, we therefore have another matrix, termed the confusion matrix, which contains the probabilities of the observable states given a particular hidden state. For the weather example the confusion matrix might be:

 

Translator's note: is the confusion matrix a set of prior probabilities?

 

 

Notice that the sum of each matrix row is 1.
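
The row-sum property can be checked with a short sketch. The matrix values here are invented for illustration and are not the figures from the original tutorial; rows are hidden states and columns are observable states.

```python
import math

HIDDEN = ["sun", "cloud", "rain"]
OBSERVABLE = ["dry", "dryish", "damp", "soggy"]

# Hypothetical confusion matrix: CONFUSION[i][k] = Pr(OBSERVABLE[k] | HIDDEN[i]).
CONFUSION = [
    [0.55, 0.25, 0.15, 0.05],  # sun
    [0.25, 0.25, 0.25, 0.25],  # cloud
    [0.05, 0.15, 0.30, 0.50],  # rain
]

# Each row is a probability distribution over the observable states.
row_sums = [sum(row) for row in CONFUSION]

# Each column mixes probabilities conditioned on different hidden states,
# so a column is not a distribution and has no reason to sum to 1.
column_sums = [sum(column) for column in zip(*CONFUSION)]

rows_are_distributions = all(math.isclose(s, 1.0) for s in row_sums)
columns_are_distributions = all(math.isclose(s, 1.0) for s in column_sums)
```

With these numbers `rows_are_distributions` is true while `columns_are_distributions` is false, and the matrix has three rows but four columns, matching the point that the two sets of states can differ in size.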

3.3 Summary

We have seen that there are some processes where an observed sequence is probabilistically related to an underlying Markov process. In such cases, the number of observable states may be different from the number of hidden states.

 

We model such cases using a hidden Markov model (HMM). This is a model containing two sets of states and three sets of probabilities:

Translator's note: which three? The transition probabilities, the confusion matrix, and perhaps the initial probabilities?

  • hidden states : the (TRUE) states of a system that may be described by a Markov process (e.g., the weather).
  • observable states : the states of the process that are `visible' (e.g., seaweed dampness).
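
One way to hold such a model in code is sketched below. The excerpt breaks off before listing the three sets of probabilities, so the field names `initial`, `transition` and `confusion` follow the usual HMM formulation and are an assumption here, as are the class name, the `validate` helper and every number in the example.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class HiddenMarkovModel:
    """Two sets of states plus three sets of probabilities (assumed layout)."""
    hidden_states: tuple
    observable_states: tuple
    initial: tuple      # initial[i] = Pr(first hidden state is i)
    transition: tuple   # transition[i][j] = Pr(next hidden is j | current is i)
    confusion: tuple    # confusion[i][k] = Pr(observe k | hidden state is i)

    def validate(self):
        """Raise ValueError if a shape is wrong or a distribution is not normalized."""
        n, m = len(self.hidden_states), len(self.observable_states)
        if len(self.initial) != n or not math.isclose(sum(self.initial), 1.0):
            raise ValueError("initial must have one entry per hidden state and sum to 1")
        if len(self.transition) != n or len(self.confusion) != n:
            raise ValueError("transition and confusion need one row per hidden state")
        for row in self.transition:
            if len(row) != n or not math.isclose(sum(row), 1.0):
                raise ValueError("each transition row must have n entries summing to 1")
        for row in self.confusion:
            if len(row) != m or not math.isclose(sum(row), 1.0):
                raise ValueError("each confusion row must have m entries summing to 1")


# Hypothetical weather model; the numbers are illustrative only.
model = HiddenMarkovModel(
    hidden_states=("sunny", "cloudy", "rainy"),
    observable_states=("dry", "dryish", "damp", "soggy"),
    initial=(0.5, 0.3, 0.2),
    transition=((0.6, 0.3, 0.1), (0.3, 0.4, 0.3), (0.2, 0.3, 0.5)),
    confusion=(
        (0.55, 0.25, 0.15, 0.05),
        (0.25, 0.25, 0.25, 0.25),
        (0.05, 0.15, 0.30, 0.50),
    ),
)
model.validate()
```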

 

posted @ 2014-03-24 10:58