Conditional Probability and the Chain Rule
The conditional probability formula:
\[p(y|x) = \frac{p(x,y)}{p(x)} \tag{1}
\]
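As a small numerical illustration of (1), the sketch below computes \(p(y|x)\) from a joint table. The table values and the names `joint` and `conditional_y_given_x` are hypothetical, chosen only for this example.

```python
import numpy as np

# Hypothetical joint distribution p(x, y): rows index x, columns index y.
joint = np.array([[0.10, 0.30],
                  [0.20, 0.40]])


def conditional_y_given_x(joint):
    """Return the table p(y|x) = p(x, y) / p(x), one row per value of x."""
    p_x = joint.sum(axis=1, keepdims=True)  # marginal p(x)
    return joint / p_x


cond = conditional_y_given_x(joint)
print(cond)  # each row is a distribution over y and sums to 1
```

Dividing each row by its own sum is exactly the division by \(p(x)\) in (1), so this assumes every value of \(x\) has nonzero probability.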
The chain rule:
\[\begin{align}
p(x_1,x_2,...,x_n) &= p(x_n)\prod_{i=1}^{n-1}{p(x_i|x_{i+1},x_{i+2},...,x_{n})}
\nonumber\\ &= p(x_1)\prod_{i=2}^{n}{p(x_i|x_{1},x_{2},...,x_{i-1})} \tag{2}
\end{align}
\]
In fact, the chain rule follows from applying the conditional probability formula repeatedly. For example:
\[\begin{align}
p(a,b,c,d) &= p(a|b,c,d)p(b,c,d) \nonumber\\
&= p(a|b,c,d)p(b|c,d)p(c,d) \nonumber \\
&= p(a|b,c,d)p(b|c,d)p(c|d)p(d) \tag{3}
\end{align}
\]
\[\begin{align}
p(a,b,c,d) &= p(d|a,b,c)p(a,b,c) \nonumber \\
&= p(d|a,b,c)p(c|a,b)p(a,b) \nonumber \\
&= p(d|a,b,c)p(c|a,b)p(b|a)p(a) \tag{4}
\end{align}
\]
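The factorization in (4) can be checked numerically. The following is a minimal sketch, assuming four binary variables and a randomly generated joint table (the table and all variable names are illustrative, not part of the original derivation).

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical joint distribution p(a, b, c, d) over four binary variables.
joint = rng.random((2, 2, 2, 2))
joint /= joint.sum()

# Marginals needed for the factorization in (4).
p_a = joint.sum(axis=(1, 2, 3))   # p(a)
p_ab = joint.sum(axis=(2, 3))     # p(a, b)
p_abc = joint.sum(axis=3)         # p(a, b, c)

# Conditionals obtained from the conditional probability formula (1).
p_b_given_a = p_ab / p_a[:, None]              # p(b|a)
p_c_given_ab = p_abc / p_ab[:, :, None]        # p(c|a,b)
p_d_given_abc = joint / p_abc[:, :, :, None]   # p(d|a,b,c)

# Right-hand side of (4): p(d|a,b,c) p(c|a,b) p(b|a) p(a).
rebuilt = (p_d_given_abc
           * p_c_given_ab[:, :, :, None]
           * p_b_given_a[:, :, None, None]
           * p_a[:, None, None, None])

matches = np.allclose(rebuilt, joint)
print(matches)
```

The product telescopes: each denominator cancels the next numerator, which is why the order of the variables in (3) and (4) does not matter.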
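As an example, suppose we have samples \(\{(x_1, y_1),...,(x_n, y_n)\}\), where \(x_i\) and \(y_i\) denote the features and the class of the \(i\)-th sample, \(1 \le i \le n\). If, for \(i \ne j\), \(x_i\) is independent of \(x_j\) and \(x_i\) is independent of \(y_j\), then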
\[\begin{align}
p(x_1,x_2,...,x_n,y_1,y_2,...,y_n) &= p(y_1,y_2,...,y_n)p(x_1,x_2,...,x_n|y_1,y_2,...,y_n) \nonumber \\
&= p(y_1,y_2,...,y_n) \cdot \prod_{i=1}^{n}{p(x_i|y_i)} \tag{5}
\end{align}
\]
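Further assume that each sample depends only on the values of the two samples before it (a common assumption in NLP and related fields). Then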
\[p(y_1,y_2,...,y_n) = \prod_{i=1}^{n}p(y_i|y_{i-1},y_{i-2}) \tag{6}
\]
where \(y_{0} = y_{-1} = *\) is a special start symbol.
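To make (6) concrete, here is a minimal sketch that evaluates the product for a short label sequence. The `trigram` table, its probabilities, and the labels `"N"` and `"V"` are hypothetical values chosen for illustration.

```python
from math import prod

START = "*"  # padding symbol standing in for y_0 and y_{-1}

# Hypothetical table: (y_{i-2}, y_{i-1}, y_i) -> p(y_i | y_{i-1}, y_{i-2}).
trigram = {
    ("*", "*", "N"): 0.6,
    ("*", "N", "V"): 0.5,
    ("N", "V", "N"): 0.4,
}


def sequence_probability(labels, trigram):
    """Product over i of p(y_i | y_{i-1}, y_{i-2}), as in (6)."""
    padded = [START, START] + list(labels)
    return prod(trigram[(padded[i], padded[i + 1], padded[i + 2])]
                for i in range(len(labels)))


p = sequence_probability(["N", "V", "N"], trigram)
print(p)
```

Padding the sequence with two start symbols lets the first two factors use the same lookup as every other position.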
By the chain rule, the conditional term expands in general as
\[p(x_1,x_2,...,x_n|y_1,y_2,...,y_n) = \prod_{i=1}^{n}{p(x_i|x_{i-1},...,x_1,y_1,...,y_n)} \tag{7}
\]
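Here \(p(x_i|x_{i-1},...,x_1,y_1,...,y_n)\) is the probability that the \(i\)-th feature takes the value \(x_i\), given \(x_{i-1},...,x_1,y_1,...,y_n\). Under the independence assumptions above, each factor reduces to \(p(x_i|y_i)\), which gives the product in (5).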
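Combining (5) and (6), the joint probability of a labelled sequence becomes a product of one label term and one feature term per position. The sketch below evaluates it in log space to avoid underflow on long sequences. The `trigram` and `emission` tables, the words, and the labels are all hypothetical.

```python
import math

START = "*"  # padding symbol standing in for y_0 and y_{-1}

# Hypothetical tables, for illustration only.
trigram = {("*", "*", "N"): 0.6, ("*", "N", "V"): 0.5}  # p(y_i | y_{i-1}, y_{i-2})
emission = {("N", "dog"): 0.2, ("V", "runs"): 0.1}      # (y_i, x_i) -> p(x_i | y_i)


def joint_log_probability(features, labels, trigram, emission):
    """log p(x_1..x_n, y_1..y_n) under the assumptions behind (5) and (6)."""
    padded = [START, START] + list(labels)
    log_p = 0.0
    for i, (x, y) in enumerate(zip(features, labels)):
        # padded[i] is y_{i-2} and padded[i + 1] is y_{i-1} for this position.
        log_p += math.log(trigram[(padded[i], padded[i + 1], y)])
        log_p += math.log(emission[(y, x)])
    return log_p


log_p = joint_log_probability(["dog", "runs"], ["N", "V"], trigram, emission)
print(math.exp(log_p))
```

A lookup that fails with `KeyError` here corresponds to an event the tables assign no probability to; a practical model would need smoothing, which is outside the scope of this sketch.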