Notes on the Naive Bayes Algorithm

Algorithm description:

Input: training data $T=\{(x_{1},y_{1}),(x_{2},y_{2}),...,(x_{N},y_{N})\}$, where $x_{i}=(x_{i}^{(1)},x_{i}^{(2)},...,x_{i}^{(n)})^{T}$, $x_{i}^{(j)}$ is the $j$-th feature of the $i$-th sample, $x_{i}^{(j)}\in \{ a_{j1},a_{j2},...,a_{jS_{j}} \}$, $a_{jl}$ denotes the $l$-th possible value of the $j$-th feature, $j=1,2,...,n$, $l=1,2,...,S_{j}$, and $y_{i} \in \{ c_{1},c_{2},...,c_{K} \}$; an instance $x$.

Output: the class of instance $x$.

(1) Compute the prior probabilities and the conditional probabilities

   $P(Y=c_{k})=\frac{\sum_{i=1}^{N}I(y_{i}=c_{k})}{N},k=1,2,...,K$

   $P(X^{(j)}=a_{jl}|Y=c_{k})=\frac{\sum_{i=1}^{N}I(x_{i}^{(j)}=a_{jl},y_{i}=c_{k})}{\sum_{i=1}^{N}I(y_{i}=c_{k})},$

    $j=1,2,...,n;l=1,2,...,S_{j};k=1,2,...,K$

(2) For the given instance $x=(x^{(1)},x^{(2)},...,x^{(n)})^{T}$, compute

    $P(Y=c_{k})\prod_{j=1}^{n} P(X^{(j)}=x^{(j)}|Y=c_{k}),k=1,2,...,K$

(3) Determine the class of instance $x$

    $y=\arg\max_{c_{k}}P(Y=c_{k})\prod_{j=1}^{n}P(X^{(j)}=x^{(j)}|Y=c_{k})$
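
Steps (1) to (3) can be sketched in a few lines of Python. This is only a minimal illustration, assuming categorical features stored as tuples; the function names `fit` and `predict` and the toy data set are made up for this sketch and are not part of the original notes. No smoothing is applied, so a feature value never seen with a class gives that class a score of zero, just as the counting estimates in step (1) would.

```python
from collections import Counter
from math import prod  # available in Python 3.8+


def fit(X, y):
    """Step (1): estimate P(Y=c_k) and P(X^(j)=a_jl | Y=c_k) by counting."""
    n_samples = len(y)
    class_counts = Counter(y)  # sum_i I(y_i = c_k)
    prior = {c: class_counts[c] / n_samples for c in class_counts}

    # joint_counts[(j, value, c)] = sum_i I(x_i^(j) = value, y_i = c)
    joint_counts = Counter()
    for x_i, y_i in zip(X, y):
        for j, value in enumerate(x_i):
            joint_counts[(j, value, y_i)] += 1

    def cond(j, value, c):
        # Unseen (j, value, c) combinations count as 0 (no smoothing).
        return joint_counts[(j, value, c)] / class_counts[c]

    return prior, cond


def predict(prior, cond, x):
    """Steps (2)-(3): score every class, then take the arg max."""
    scores = {
        c: prior[c] * prod(cond(j, v, c) for j, v in enumerate(x))
        for c in prior
    }
    return max(scores, key=scores.get), scores


# Hypothetical toy data: two categorical features, two classes.
X = [("a", "x"), ("a", "y"), ("b", "x"), ("b", "y"),
     ("a", "y"), ("b", "y"), ("a", "x"), ("b", "x")]
y = ["pos", "pos", "pos", "neg", "neg", "neg", "pos", "neg"]

prior, cond = fit(X, y)
label, scores = predict(prior, cond, ("a", "x"))
print(label, scores)
```

For the query `("a", "x")` the score of `pos` is $0.5 \times 0.75 \times 0.75 = 0.28125$ and the score of `neg` is $0.5 \times 0.25 \times 0.25 = 0.03125$, so the sketch returns `pos`. With many features the product can underflow, and a common variant (not covered in these notes) sums log-probabilities instead.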

posted @ 2018-02-27 23:25  blackx