Understanding the Chain Rule of Conditional Probability
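- Conditional probability:
The probability that the event \(\text{y}=y\) occurs given that the event \(\text{x}=x\) has occurred (defined only when \(P(\text{x}=x)>0\)):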
\[P(\text{y}=y|\text{x}=x)=\frac{P(\text{x}=x,\text{y}=y)}{P(\text{x}=x)}
\]
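As a small illustration of the definition above, the sketch below computes \(P(\text{y}=y|\text{x}=x)\) from a joint table. The joint distribution and the helper names `marginal_x` and `conditional_y_given_x` are hypothetical and chosen only for this example.

```python
from fractions import Fraction

# Hypothetical joint distribution P(x, y) over two binary variables.
# Keys are (x, y) pairs; the values sum to 1.
joint = {
    (0, 0): Fraction(1, 10),
    (0, 1): Fraction(3, 10),
    (1, 0): Fraction(2, 10),
    (1, 1): Fraction(4, 10),
}

def marginal_x(x):
    """P(x = x): sum the joint over every value of y."""
    return sum(p for (xi, _), p in joint.items() if xi == x)

def conditional_y_given_x(y, x):
    """P(y = y | x = x) = P(x = x, y = y) / P(x = x)."""
    px = marginal_x(x)
    if px == 0:
        raise ValueError("P(x = x) is zero, so the conditional is undefined")
    return joint[(x, y)] / px

print(conditional_y_given_x(1, 0))  # (3/10) / (4/10) = 3/4
```

Exact fractions are used instead of floats so that the ratio can be compared for equality without rounding error.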
- Chain rule of conditional probability
Also known as the product (multiplication) rule of conditional probability:
\[\begin{aligned}
P(a,b,c) &= P(a|b,c)P(b,c) \\
&= P(a|b,c)P(b|c)P(c)
\end{aligned}
\]
- Generalizing to \(n\) variables:
\[\begin{aligned}
P(\text{x}^{(1)},\text{x}^{(2)},\cdots,\text{x}^{(n)}) &= P(\text{x}^{(n)}|\text{x}^{(n-1)},\cdots,\text{x}^{(1)})P(\text{x}^{(1)},\cdots,\text{x}^{(n-1)})\\
&=P(\text{x}^{(n)}|\text{x}^{(n-1)},\cdots,\text{x}^{(1)})P(\text{x}^{(n-1)}|\text{x}^{(n-2)},\cdots,\text{x}^{(1)})P(\text{x}^{(1)},\cdots,\text{x}^{(n-2)})\\
&=P(\text{x}^{(n)}|\text{x}^{(n-1)},\cdots,\text{x}^{(1)})P(\text{x}^{(n-1)}|\text{x}^{(n-2)},\cdots,\text{x}^{(1)})\cdots P(\text{x}^{(2)}|\text{x}^{(1)})P(\text{x}^{(1)})\\
&=P(\text{x}^{(1)})\prod_{i=2}^{n}P(\text{x}^{(i)}|\text{x}^{(1)},\cdots,\text{x}^{(i-1)})
\end{aligned}
\]
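Informally, the chain rule of conditional probability can be understood as follows:
Take \(P(\text{x}^{(1)},\text{x}^{(2)},\cdots,\text{x}^{(n)})\) as an example. Start from \(P(\text{x}^{(1)})\), the probability that \(\text{x}^{(1)}\) occurs. Then \(P(\text{x}^{(2)}|\text{x}^{(1)})P(\text{x}^{(1)})\) is the probability that \(\text{x}^{(1)},\text{x}^{(2)}\) both occur, and \(P(\text{x}^{(3)}|\text{x}^{(1)},\text{x}^{(2)})P(\text{x}^{(2)}|\text{x}^{(1)})P(\text{x}^{(1)})\) is the probability that \(\text{x}^{(1)},\text{x}^{(2)},\text{x}^{(3)}\) all occur. Continuing in this way, with one conditional factor per additional variable, yields the chain rule formula.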
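The step-by-step reading above can be checked numerically. The following is a minimal sketch, assuming a hypothetical joint distribution over three binary variables with strictly positive probabilities, so that every conditional is defined. The names `prefix_prob` and `chain_rule_product` are illustrative only.

```python
from fractions import Fraction
from itertools import product

# Hypothetical joint distribution over n = 3 binary variables.
# The weights 1..8 are arbitrary; they are normalized so the table sums to 1.
n = 3
outcomes = list(product([0, 1], repeat=n))
weights = [k + 1 for k in range(len(outcomes))]
total = sum(weights)
joint = {o: Fraction(w, total) for o, w in zip(outcomes, weights)}

def prefix_prob(prefix):
    """P(x1, ..., xk) for the given prefix: marginalize out the remaining variables."""
    k = len(prefix)
    return sum(p for o, p in joint.items() if o[:k] == prefix)

def chain_rule_product(outcome):
    """Multiply P(x1) * P(x2 | x1) * ... * P(xn | x1, ..., x(n-1))."""
    result = Fraction(1)
    for i in range(len(outcome)):
        # P(xi | x1..x(i-1)) = P(x1..xi) / P(x1..x(i-1)).
        # For i = 0 the denominator is the empty-prefix probability, which is 1.
        result *= prefix_prob(outcome[:i + 1]) / prefix_prob(outcome[:i])
    return result

# The product of conditionals matches the joint probability for every outcome.
for o in outcomes:
    assert chain_rule_product(o) == joint[o]
```

Each factor is a ratio of two consecutive prefix probabilities, so the product telescopes back to the full joint probability, which is exactly what the derivation expresses.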
Reference: Deep Learning, Section 3.5