Exercises for Chapter 4 of 《统计学习方法》 (Statistical Learning Methods)

Exercise 4.1

Derive the prior probability and the conditional probability of naive Bayes by maximum likelihood estimation.

Assume the dataset is \(T = \{(x^{(1)}, y^{(1)}), (x^{(2)}, y^{(2)}), \dots, (x^{(M)}, y^{(M)})\}\).

Let \(P(Y = c_k) = \theta_k\), so \(P(Y \ne c_k) = 1 - \theta_k\). Let \(m_k\) be the number of samples in the dataset with label \(c_k\).

The likelihood function is \(L(\theta_k) = P(y^{(1)}, y^{(2)}, \dots, y^{(M)}) = \prod_{i=1}^{M} P(y^{(i)}) = \theta_k^{m_k} (1 - \theta_k)^{M - m_k}\).

Setting the derivative of the log-likelihood to zero: \(\frac{\partial \log L(\theta_k)}{\partial \theta_k} = \frac{m_k}{\theta_k} - \frac{M - m_k}{1 - \theta_k} = 0\)

Solving gives \(P(Y = c_k) = \theta_k = \frac{m_k}{M} = \frac{\sum_{i=1}^{M} I(y^{(i)} = c_k)}{M}\)

This proves formula (4.8).
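As a quick illustration (a minimal sketch, not code from the book), the MLE of formula (4.8) is just each class count divided by the sample size \(M\):

```python
# Sketch of formula (4.8): P(Y=c_k) = m_k / M, estimated by counting labels.
from collections import Counter

def prior_mle(y):
    """Return {c_k: m_k / M} for a list of labels y."""
    M = len(y)
    return {c: m / M for c, m in Counter(y).items()}

y = [1, 1, -1, 1, -1]
print(prior_mle(y))  # {1: 0.6, -1: 0.4}
```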

Let \(P(X_j = a_{jl} \mid Y = c_k) = \theta_{kjl}\), so \(P(X_j \ne a_{jl} \mid Y = c_k) = 1 - \theta_{kjl}\). Let \(m_{kjl}\) be the number of samples in the dataset with \((x_j = a_{jl},\ y = c_k)\).

The likelihood function, taken over the \(m_k\) samples with label \(c_k\), is \(L(\theta_{kjl}) = P(x_j^{(1)}, x_j^{(2)}, \dots, x_j^{(m_k)} \mid Y = c_k) = \prod_{i=1}^{m_k} P(x_j^{(i)} \mid Y = c_k) = \theta_{kjl}^{m_{kjl}} (1 - \theta_{kjl})^{m_k - m_{kjl}}\).

Setting the derivative of the log-likelihood to zero: \(\frac{\partial \log L(\theta_{kjl})}{\partial \theta_{kjl}} = \frac{m_{kjl}}{\theta_{kjl}} - \frac{m_k - m_{kjl}}{1 - \theta_{kjl}} = 0\)

Solving gives \(P(X_j = a_{jl} \mid Y = c_k) = \theta_{kjl} = \frac{m_{kjl}}{m_k} = \frac{\sum_{i=1}^{M} I(x_j^{(i)} = a_{jl},\ y^{(i)} = c_k)}{\sum_{i=1}^{M} I(y^{(i)} = c_k)}\)

This proves formula (4.9).
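Formula (4.9) can likewise be sketched as a ratio of counts (a hypothetical helper, not from the book; `j` indexes the feature, `a_jl` its value, `c_k` the class):

```python
# Sketch of formula (4.9): P(X_j = a_jl | Y = c_k) = m_kjl / m_k.
def cond_mle(X, y, j, a_jl, c_k):
    """Fraction of the samples with label c_k whose j-th feature equals a_jl."""
    m_k = sum(1 for yi in y if yi == c_k)
    m_kjl = sum(1 for xi, yi in zip(X, y) if yi == c_k and xi[j] == a_jl)
    return m_kjl / m_k

X = [["S"], ["M"], ["S"], ["L"]]
y = [1, 1, -1, 1]
print(cond_mle(X, y, 0, "S", 1))  # 1/3: one "S" among the three samples labeled 1
```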

Exercise 4.2

Derive the conditional probability and the prior probability of naive Bayes by Bayesian estimation.

Following the idea of Exercise 1.1, assume the parameters to be estimated follow a Beta distribution.
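Here "Bayesian estimation" means taking the posterior mode; the general Beta–binomial result used below (a standard computation, sketched for completeness) is: with \(m\) successes in \(n\) trials and prior \(\mathrm{Be}(\alpha, \beta)\),

\[ p(\theta \mid \text{data}) \propto \theta^{m}(1-\theta)^{n-m} \cdot \theta^{\alpha-1}(1-\theta)^{\beta-1}, \qquad \hat{\theta} = \arg\max_{\theta} p(\theta \mid \text{data}) = \frac{m + \alpha - 1}{n + \alpha + \beta - 2} \]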

Assume \(\theta_{kjl}\) follows a \(\mathrm{Be}(\lambda+1,\ (S_j - 1)\lambda + 1)\) distribution, where both Beta parameters are positive. By Bayesian estimation,

\(P(X_j = a_{jl} \mid Y = c_k) = \theta_{kjl} = \frac{m_{kjl} + \alpha - 1}{m_k + \alpha + \beta - 2} = \frac{\sum_{i=1}^{M} I(x_j^{(i)} = a_{jl},\ y^{(i)} = c_k) + \lambda}{\sum_{i=1}^{M} I(y^{(i)} = c_k) + S_j\lambda}\)

This proves formula (4.10).

Similarly, assume \(\theta_k\) follows a \(\mathrm{Be}(\lambda+1,\ (K - 1)\lambda + 1)\) distribution, where both Beta parameters are positive. By Bayesian estimation,

\(P(Y = c_k) = \theta_k = \frac{m_k + \alpha - 1}{M + \alpha + \beta - 2} = \frac{\sum_{i=1}^{M} I(y^{(i)} = c_k) + \lambda}{M + K\lambda}\)

This proves formula (4.11).
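The smoothed estimates (4.10) and (4.11) can be sketched as follows (hypothetical helpers, not from the book; with \(\lambda = 1\) this is Laplace smoothing, `K` is the number of classes and `S_j` the number of distinct values of feature `j`):

```python
# Sketch of formulas (4.11) and (4.10) with smoothing parameter lam (λ).
from collections import Counter

def prior_bayes(y, K, lam=1.0):
    """P(Y=c_k) = (m_k + λ) / (M + Kλ)."""
    M = len(y)
    return {c: (m + lam) / (M + K * lam) for c, m in Counter(y).items()}

def cond_bayes(X, y, j, a_jl, c_k, S_j, lam=1.0):
    """P(X_j=a_jl | Y=c_k) = (m_kjl + λ) / (m_k + S_jλ)."""
    m_k = sum(1 for yi in y if yi == c_k)
    m_kjl = sum(1 for xi, yi in zip(X, y) if yi == c_k and xi[j] == a_jl)
    return (m_kjl + lam) / (m_k + S_j * lam)

y = [1, 1, -1]
print(prior_bayes(y, K=2))  # {1: 0.6, -1: 0.4}
```

Note that unlike the MLE, an unseen value \(a_{jl}\) gets probability \(\lambda / (m_k + S_j\lambda) > 0\), which avoids zeroing out the product of conditionals.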

posted @ 2021-06-28 16:51 by 程劼