Naive Bayes (a generative model)

Basic assumptions of Naive Bayes

  1. The training data are generated independently and identically distributed (i.i.d.) from the joint distribution $P\left( {X,Y} \right)$.
  2. Conditional independence: once the class is fixed, the features are mutually independent: \[P\left( {X = x|Y = {c_k}} \right) = P\left( {{X^{\left( 1 \right)}} = {x^{\left( 1 \right)}},{X^{\left( 2 \right)}} = {x^{\left( 2 \right)}}, \ldots ,{X^{\left( n \right)}} = {x^{\left( n \right)}}|Y = {c_k}} \right) = \prod\limits_{j = 1}^n {P\left( {{X^{\left( j \right)}} = {x^{\left( j \right)}}|Y = {c_k}} \right)} \]
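
One practical consequence of the conditional independence assumption is that far fewer probabilities have to be stored. The sketch below is only an illustration: it assumes discrete features, and the function name `table_entries` and the example sizes are made up here. It compares the number of entries in a full table for $P(X = x|Y = c_k)$ with the number of entries in the per-feature tables.

```python
from math import prod


def table_entries(num_classes, values_per_feature):
    """Count stored conditional probabilities with and without the assumption.

    values_per_feature[j] is the number of distinct values feature j can take.
    """
    # Full table: one entry per (class, combination of all feature values).
    full = num_classes * prod(values_per_feature)
    # Naive Bayes: one entry per (class, feature, single feature value).
    naive = num_classes * sum(values_per_feature)
    return full, naive


# Hypothetical sizes: 2 classes, 4 features with 3 values each.
full, naive = table_entries(2, [3, 3, 3, 3])
print(full, naive)
```

The full table grows exponentially with the number of features, while the factorized form grows linearly.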

Algorithm idea

Given an input $x$, the learned model is used to compute the posterior distribution $P\left( {Y = {c_k}|X = x} \right)$, and the class with the largest posterior probability is taken as the class of $x$. The posterior follows from Bayes' theorem combined with the conditional independence assumption: \[P\left( {Y = {c_k}|X = x} \right) = \frac{{P\left( {Y = {c_k}} \right)\prod\limits_j {P\left( {{X^{\left( j \right)}} = {x^{\left( j \right)}}|Y = {c_k}} \right)} }}{{\sum\limits_i {P\left( {Y = {c_i}} \right)\prod\limits_j {P\left( {{X^{\left( j \right)}} = {x^{\left( j \right)}}|Y = {c_i}} \right)} } }}\]
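
The posterior formula can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the function name `posterior`, the dictionary layout of the probability tables, and the numbers in them are all assumptions made for the example.

```python
def posterior(x, prior, cond):
    """Return {class: P(Y=class | X=x)} under the naive Bayes factorization.

    prior: {class: P(Y=class)}
    cond:  {class: [table_0, table_1, ...]} where table_j maps a value v
           to P(X^(j)=v | Y=class)
    """
    joint = {}
    for c, p in prior.items():
        # Numerator: prior times the product of per-feature conditionals.
        for j, v in enumerate(x):
            p *= cond[c][j][v]
        joint[c] = p
    # Denominator: the same quantity summed over all classes.
    z = sum(joint.values())
    return {c: p / z for c, p in joint.items()}


# Hypothetical tables for two classes and two binary features.
prior = {"a": 0.5, "b": 0.5}
cond = {
    "a": [{0: 0.8, 1: 0.2}, {0: 0.5, 1: 0.5}],
    "b": [{0: 0.4, 1: 0.6}, {0: 0.5, 1: 0.5}],
}
post = posterior((0, 1), prior, cond)
print(post)
```

For this input the unnormalized scores are 0.2 for class `a` and 0.1 for class `b`, so the posterior is roughly 2/3 versus 1/3.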

The naive Bayes classifier can therefore be written as: \[y = \arg {\max _{{c_k}}}P\left( {Y = {c_k}|X = x} \right) = \arg {\max _{{c_k}}}\frac{{P\left( {Y = {c_k}} \right)\prod\limits_j {P\left( {{X^{\left( j \right)}} = {x^{\left( j \right)}}|Y = {c_k}} \right)} }}{{\sum\limits_i {P\left( {Y = {c_i}} \right)\prod\limits_j {P\left( {{X^{\left( j \right)}} = {x^{\left( j \right)}}|Y = {c_i}} \right)} } }}\]

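Because the denominator is the same for every class ${c_k}$, this is equivalent to: \[y = \arg {\max _{{c_k}}}P\left( {Y = {c_k}} \right)\prod\limits_j {P\left( {{X^{\left( j \right)}} = {x^{\left( j \right)}}|Y = {c_k}} \right)} \]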
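
A minimal sketch of this decision rule follows. It works with sums of logarithms instead of products, which is a common trick (not something stated in the text) to avoid numerical underflow when there are many features; since the logarithm is monotone, the arg max is unchanged. The name `predict` and the probability tables are hypothetical, and the sketch assumes every probability in the tables is strictly positive.

```python
import math


def predict(x, prior, cond):
    """Return the class maximizing P(Y=c) * prod_j P(X^(j)=x^(j) | Y=c)."""
    def log_score(c):
        # log of the unnormalized posterior; the shared denominator is dropped.
        return math.log(prior[c]) + sum(
            math.log(cond[c][j][v]) for j, v in enumerate(x)
        )
    return max(prior, key=log_score)


# Hypothetical tables for two classes and two binary features.
prior = {"a": 0.5, "b": 0.5}
cond = {
    "a": [{0: 0.8, 1: 0.2}, {0: 0.5, 1: 0.5}],
    "b": [{0: 0.4, 1: 0.6}, {0: 0.5, 1: 0.5}],
}
label_1 = predict((0, 1), prior, cond)
label_2 = predict((1, 0), prior, cond)
print(label_1, label_2)
```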

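Naive Bayes assigns each instance to the class with the largest posterior probability. This is equivalent to minimizing the expected risk when the loss function is the 0-1 loss.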
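
A short sketch of why this equivalence holds, using notation introduced here for illustration: $f$ is the classifier, $L$ is the 0-1 loss (equal to 1 when its two arguments differ and 0 otherwise), and ${R_{\exp }}$ is the expected risk. Taking the expectation over the joint distribution and conditioning on $X$ gives \[{R_{\exp }}\left( f \right) = {E_X}\sum\limits_k {L\left( {{c_k},f\left( X \right)} \right)P\left( {Y = {c_k}|X} \right)} \] so the risk can be minimized separately for each $X = x$: \[f\left( x \right) = \arg {\min _y}\sum\limits_k {L\left( {{c_k},y} \right)P\left( {Y = {c_k}|X = x} \right)} = \arg {\min _y}P\left( {Y \ne y|X = x} \right) = \arg {\min _y}\left( {1 - P\left( {Y = y|X = x} \right)} \right) = \arg {\max _y}P\left( {Y = y|X = x} \right)\] The last expression is exactly the maximum-posterior rule.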

Parameter estimation
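
One common way to estimate the prior $P\left( {Y = {c_k}} \right)$ and the conditionals $P\left( {{X^{\left( j \right)}} = {x^{\left( j \right)}}|Y = {c_k}} \right)$ is to count frequencies in the training data (maximum likelihood), optionally adding a smoothing constant to every count (Laplace smoothing when the constant is 1). The sketch below assumes discrete features; the function name `fit`, the parameter `lam`, and the toy data are hypothetical choices for this example.

```python
from collections import Counter, defaultdict


def fit(X, y, lam=1.0):
    """Estimate naive Bayes tables by counting.

    lam=0 gives plain maximum likelihood; lam=1 gives Laplace smoothing.
    Returns (prior, cond) where cond[c][j][v] estimates P(X^(j)=v | Y=c).
    """
    n = len(X[0])
    classes = sorted(set(y))
    # Distinct values observed for each feature.
    values = [sorted({row[j] for row in X}) for j in range(n)]
    class_count = Counter(y)
    pair_count = defaultdict(int)  # (class, feature index, value) -> count
    for row, c in zip(X, y):
        for j, v in enumerate(row):
            pair_count[(c, j, v)] += 1

    N, K = len(y), len(classes)
    prior = {c: (class_count[c] + lam) / (N + K * lam) for c in classes}
    cond = {
        c: [
            {
                v: (pair_count[(c, j, v)] + lam)
                / (class_count[c] + len(values[j]) * lam)
                for v in values[j]
            }
            for j in range(n)
        ]
        for c in classes
    }
    return prior, cond


# Toy data: four samples, two binary features, two classes.
X = [(0, 0), (0, 1), (1, 1), (1, 1)]
y = ["a", "a", "b", "b"]
prior, cond = fit(X, y, lam=1.0)
print(prior)
print(cond["a"][0])
```

Without smoothing (`lam=0`), a feature value never seen together with a class gets probability 0, which zeroes out the whole product for that class; a positive `lam` avoids this.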

posted @ 2019-06-17 21:48  xd_xumaomao