分层贝叶斯模型——结构

分层贝叶斯模型


对于一个随机变量序列$Y_{1},...,Y_{n} $,如果在任意排列顺序$\pi $下,其概率密度都满足$p(y_{1},...,y_{n})=p(y_{\pi_{1}},...,y_{\pi_{n}}) $,那么称这些变量是可交换的。当我们缺乏区分这些随机变量的信息时,可交换性是$p(y_{1},...,y_{n}) $的一个合理属性。在这种情况下,各个随机变量可以看作是从一个群体中独立采样的结果,群体的属性可以用一个固定的未知参数$\phi $来描述,即:

$$
\phi\sim p(\phi)
$$

$$
\{Y_{1},...,Y_{n}|\phi\}\sim^{i.i.d.}p(y|\phi)
$$

考虑分层数据$\{Y_{1},...,Y_{n}\} $,其中$Y_{j}=\{Y_{1,j},...,Y_{n_{j},j}\} $,那么有

$$
\{Y_{1,j},...,Y_{n_{j},j}|\phi_{j}\}\sim^{i.i.d.}p(y|\phi_{j})
$$

但是我们该如何表示组参数$\phi_{1},...,\phi_{m} $呢?若这些组本身归属于更大的组群,那么这些组参数变量同样满足可交换性,因此有

$$
\{\phi_{1},...,\phi_{m}|\phi\}\sim^{i.i.d.}p(\phi|\psi)
$$

 综上,我们能够得到三个概率分布

组内采样:$\{y_{1,j},...,y_{n_{j},j}|\phi_{j}\}\sim^{i.i.d.}p(y|\phi_{j}) $

组间采样:$\{\phi_{1},...,\phi_{m}|\phi\}\sim^{i.i.d.}p(\phi|\psi) $

先验分布:$\psi \sim p(\psi) $

 

分层正态分布模型


下面利用分层正态分布模型来描述几个群体之间的均值异质性,设组内和组间采样都服从正态分布:

组内模型:$\phi_{j}=\{\theta_{j},\sigma^2\},\; p(y|\phi_{j})=normal(\theta_{j},\sigma^2) $

组间模型:$\psi=\{\mu,\tau^2\},\; p(\theta_{j}|\psi)=normal(\mu,\tau^2) $

模型中固定的未知参数是,和,为了方便,我们假设这些参数服从标准半共轭正态分布和逆伽马分布:

$1/\sigma^2 \sim gamma(\nu_{0}/2,\nu_{0}\sigma_{0}^2/2) $

$1/\tau^2 \sim gamma(\eta_{0}/2,\eta_{0} \tau_{0}^2/2) $

$\mu \sim normal(\mu_{0}, \gamma_{0}^2) $

模型结构如下:

后验推断:

一元正态模型重要结论:

结论1 假设抽样模型为$\{Y_{1},...,Y_{n}|\theta,\sigma^2\}\sim^{i.i.d.}normal(\theta,\sigma^2) $,如果$\theta \sim normal(\mu_{0},\tau_{0}^2) $,$1/\sigma^2 \sim gamma(\nu_{0}/2,\nu_{0}\sigma_{0}^2/2)$,那么$p(\theta|\sigma^2,y_{1},...,y_{n})\sim normal(\mu_{n},\tau_{n}^2) $,其中$\mu_{n}=\frac{\mu_{0}/\tau{0}^2+n\bar{y}/\sigma^2}{1/\tau^2+n/\sigma^2} $,$\tau_{n}^2=\Big (\frac{1}{\tau_{0}^2}+\frac{n}{\sigma^2}\Big)^{-1} $

结论2假设抽样模型为$\{Y_{1},...,Y_{n}|\theta,\sigma^2\}\sim^{i.i.d.}normal(\theta,\sigma^2) $,如果$\theta \sim normal(\mu_{0},\tau_{0}^2) $,$1/\sigma^2 \sim gamma(\nu_{0}/2,\nu_{0}\sigma_{0}^2/2)$,那么$p(\sigma^2|\theta,y_{1},...,y_{n})\sim inverse-gamma(\nu_{n}/2,\nu_{n}\sigma_{n}^2(\theta)/2) $,其中$\nu_{n}=\nu_{0}+n $,$\sigma_{n}^2(\theta)=\frac{1}{\nu_{n}[\nu_{0}\sigma_{0}^2+ns_{n}^2(\theta)]} $,$s_{n}^2(\theta)=\sum (y_{i}-\theta)^2/n $

系统中的未知参数包括组内均值$(\theta_{1},...,\theta_{m}) $和方差$\sigma^2 $以及组间均值$\mu $和方差$\tau^2 $,参数的联合后验推理可以通过构造$Gibbs $采样器来进行估计$p(\theta,...,\theta,\mu,\tau^2,\sigma^2|y_{1},...,y_{m}) $,$Gibbs $采样器通过从每个参数的全条件分布迭代采样来进行计算。

$$
\begin{aligned}
&p(\theta_{1},...,\theta_{m},\mu,\tau^2,\sigma^2|y_{1},...,y_{m})\\
&\propto p(\mu,\tau^2,\sigma^2)\times p(\theta_{1},...,\theta_{m}|\mu,\tau^2,\sigma^2)\times p(y_{1},...,y_{m}|\theta_{1},...,\theta_{m},\mu,\tau^2,\sigma^2)\\
&=p(\mu)p(\tau^2)p(\sigma^2)\Big \{\prod_{j=1}^m p(\theta_j|\mu,\tau^2)\Big\} \Big\{\prod_{j=1}^m \prod_{i=1}^n p(y_{i,j}|\theta_j,\sigma^2) \Big\}
\end{aligned}
$$

 根据随机变量之间的依赖关系,我们能够得到各个变量的全条件分布

$$
p(\mu|\theta_{1},...,\theta_{m},\tau^2,\sigma^2,y_{1},...,y_{m})\propto p(\mu)\prod p(\theta_{j}|\mu,\tau^2)
$$

$$
p(\tau^2|\theta_{1},...,\theta_{m},\tau^2,\sigma^2,y_{1},...,y_{m})\propto p(\tau^2)\prod p(\theta_{j}|\mu,\tau^2)
$$

$$
p(\theta_{j}|\mu,\tau^2,\sigma^2,y_{1},...,y_{m})\propto p(\theta_{j}|\mu,\tau^2)\prod_{i=1}^{n_{j}} p(y_{i,j}|\theta_{j},\sigma^2)
$$

$$
\begin{aligned}
p(\sigma^2|\theta_{1},...,\theta_{m},y_{1},...,y_{m})&\propto p(\sigma^2)\prod_{j=1}^m \prod_{i=1}^{n_{j}}p(y_{i,j}|\theta_{j},\sigma^2)\\
&\propto (\sigma^2)^{-\nu_{0}/2+1}e^{-\frac{\nu_{0}\sigma_{0}^2}{2\sigma^2}}(\sigma^2)^{-\sum n_{j}/2}e^{-\frac{\sum \sum (y_{i,j}-\theta_{j})^2}{2\sigma^2}}
\end{aligned}
$$

从而根据上面两个结论,我们可得:

$$
\{\mu|\theta_{1},...,\theta_{m},\tau^2\}\sim normal \Big(\frac{m\bar{\theta}/\tau^2+\mu_{0}/\gamma_{0}^2}{m/\tau^2+1/\gamma_{0}^2},[m/\tau^2+1/\gamma_{0}^2]^{-1} \Big)
$$

$$
\{1/\tau^2|\theta_{1},...,\theta_{m},\mu\}\sim gamma \Big(\frac{\eta_{0}+m}{2},\frac{\eta_{0}\tau_{0}^2+\sum(\theta_{j}-\mu)^2}{2}\Big)
$$

$$
\{\theta_{j}|y_{1,j},...,y_{n,j},\sigma^2\}\sim normal\Big(\frac{n_{j}\bar{y}_{j}/\sigma^2+1/\tau^2}{n_{j}/\sigma^2+1/\tau^2},[n_{j}/\sigma^2+1/\tau^2]^{-1}\Big)
$$

$$
\{1/\sigma^2|\theta,y_{1},...,y_{n}\sim gamma \Big(\frac{1}{2}[\nu_{0}+\sum_{j=1}^m n_{j}],\frac{1}{2}[\nu_{0}\sigma_{0}^2+\sum_{j=1}^m \sum_{i=1}^{n_{j}}(y_{i,j}-\theta_{j})^2]\Big)\}
$$

计算流程:

  1. 设定先验分布参数

    $(\nu_{0},\sigma_{0}^2)\rightarrow p(\sigma^2) $

    $(\eta_{0},\tau_{0}^2) \rightarrow p(\tau_{0}^2) $

    $(\mu_{0},\gamma_{0}^2)\rightarrow p(\mu) $

  2.从全条件分布中迭代采样每个未知参数进行参数后验估计,即给定参数当前状态$\{\theta_{1}^{(s)},...,\theta_{m}^{(s)},\mu^{(s)},\tau^{2(s)},\sigma^{2(s)}\} $,新状态按下列方式获得:

    $sample:\;\mu^{(s+1)}\sim p(\mu|\theta_{1}^{(s)},...,\theta_{m}^{(s)},\tau^{2(s)}) $

    $sample:\;\tau^{2(s+1)}\sim p(\tau^2|\theta_{1}^{(s)},...,\theta_{m}^{(s)},\mu^{(s+1)})  $

    $sample:\;\sigma^{2(s+1)}\sim p(\sigma^2|\theta_{1}^{(s)},...,\theta_{m}^{(s)},y_{1},...,y_{m})  $

    $for\;each\;j\in\{1,...,m\},\;sample\;\theta_{j}^{(s+1)}\sim p(\theta_{j}|\mu^{(s+1)},\tau^{2(s+1)},\sigma^{2(s+1)},y_{i}) $

  直到参数收敛,从而得到系统参数。

进一步推广,如果组间的均值不同,组间的方差同样是不同的,此时令$\sigma_{j}^2 $为第$j $组的方差,那么我们的采样模型变为:$\{Y_{1,j},...,Y_{n_{j},j}\}\sim^{i.i.d.} normal(\theta_{j},\sigma_{j}^2) $,$\theta_{j} $的全条件分布为:$\{\theta_{j}|y_{1,j},...,y_{n_{j},j},\sigma_{j}^2\}\sim normal \Big(\frac{n_{j}\bar{y}_{j}/\sigma_{j}^2+1/\tau^2}{n_{j}/\sigma_{j}^2+1/\tau^2},[n_{j}/\sigma_{j}^2+1/\tau^2]^{-1}\Big) $

如何估计$\sigma_{j}^2 $呢?我们首先假设:

$$
\sigma_{1}^2,...,\sigma_{m}^2\sim^{i.i.d.}gamma(\nu_{0}/2,\nu_{0}\sigma_{0}^2/2)
$$

其全条件分布为:

$$
\{1/\sigma_{j}^2|y_{1,j},...,y_{n_{j},j},\theta_{j}\}\sim gamma \Big([\nu_{0}+n_{j}]/2,[\nu_{0}\sigma_{0}^2+\sum(y_{i,j}-\theta_{j})^2]/2)\Big )
$$

$\sigma_{1}^2,...,\sigma_{m}^2 $的值同样可以利用$Gibbs $采样迭代求解。

如果$\nu_{0} $和$\sigma_{0}^2 $是固定的话, $\sigma_{j}^2 $之间相互独立,也就是说$\sigma_{m}^2 $的值不能由$\sigma_{1}^2,...,\sigma_{m-1}^2 $来进行估计,但如果$\sigma_{m}^2 $所在的组别样本量很少的话,我们应该考虑采用$\sigma_{1}^2,...,\sigma_{m-1}^2 $的数据来提高对$\sigma_{m}^2 $的估计,那该如何做呢?其实我们要做的是可以把$\nu_{0} $和$\sigma_{0}^2 $作为估计值,系统整体的结构为:

从而我们的未知参数有:组内采样分布$\{(\theta_{1},\sigma_{1}^2),...,(\theta_{m},\sigma_{m}^2)\} $,组间均值异质性参数$\{\mu,\tau^2\} $,组间方差异质性参数$\{\nu_{0},\sigma_{0}^2\} $,$\{\mu,\tau^2\} $和$\{(\theta_{1},\sigma_{1}^2),...,(\theta_{m},\sigma_{m}^2)\} $的求法都已给出,现在讨论对$\{\nu_{0},\sigma_{0}^2\} $的估计。假定$\sigma_{0}^2 $服从共轭类的先验分布,$p(\sigma_{0}^2)\sim gamma(a,b) $,那么就有

$$
p(\sigma_{0}^2|\sigma_{1}^2,...,\sigma_{m}^2,\nu_{0})=dgamma(a+\frac{1}{2}m\nu_{0},b+\frac{1}{2}\sum_{j=1}^m (1/\sigma_{j}^2))
$$

简单$\nu_{0} $的共轭先验分布是不存在的,但如果我们将其限制为一个整数,问题就变得简单了。假定$\nu_{0} $的先验服从$\{1,2,...\} $上的几何分布使得$p(\nu_{0})\propto e^{-\alpha \nu_{0}} $,则

$$
\begin{aligned}
& p(\nu_{0}|\sigma_{0}^2,\sigma_{1}^2,...,\sigma_{m}^2)\\
& \propto p(\nu_{0})\times p(\sigma_{1}^2,...,\sigma_{m}^2|\nu_{0},\sigma_{0}^2)\\
& \propto \Big(\frac{(\nu_{0}\sigma_{0}^2/2)^{\nu_{0}/2}}{\Gamma(\nu_{0}/2)}\Big)^m \Big(\prod_{j=1}^m \frac{1}{\sigma_{j}^2}\Big)^{\nu_{0}/2-1}\times exp\{-\nu_{0}(\alpha+\frac{1}{2}\sigma_{0}^2\sum (1/\sigma_{j}^2))\}
\end{aligned}
$$

从而问题得到求解。

 

参考文献:Hoff, Peter D. A first course in Bayesian statistical methods. Springer Science & Business Media, 2009.

posted @ 2016-07-13 18:53  PhHuangXiao  阅读(6278)  评论(0编辑  收藏  举报