[5] Some Properties of the Multivariate Normal Distribution

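In the previous section we defined a multivariate normal random vector in four different ways. In this section we turn to the independence and the conditional distributions of random vectors.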

  • Partition a \(p\)-dimensional random vector \(X\sim N_p(\mu,\Sigma)\) as follows:

\[X= \left[ \begin{array}{c} X^{(1)}_r\\ X^{(2)}_{p-r} \end{array} \right], \mu= \left[ \begin{array}{c} \mu^{(1)}_r\\ \mu^{(2)}_{p-r} \end{array} \right], \Sigma= \left[ \begin{array}{c|c} \Sigma_{11} &\Sigma_{12}\\ \hline \Sigma_{21} &\Sigma_{22} \end{array} \right]>0,\quad(\Sigma_{11}\text{ is an }r\times r\text{ matrix}) \]

I. Independence

Let \(X\sim N_p(\mu,\Sigma)\) be a \(p\)-dimensional random vector, partitioned as

\[X= \left[ \begin{array}{c} X^{(1)}\\ X^{(2)} \end{array} \right]\sim N_p\left( \left[ \begin{array}{c} \mu^{(1)}\\ \mu^{(2)} \end{array} \right], \left[ \begin{array}{cc} \Sigma_{11} &\Sigma_{12}\\ \Sigma_{21} &\Sigma_{22} \end{array} \right] \right) \]

\[X^{(1)}\text{ and }X^{(2)}\text{ are independent}\iff\Sigma_{12}=O \]

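  • This necessary and sufficient condition says that if a normally distributed random vector is partitioned into two parts, the two subvectors are independent if and only if their cross-covariance matrix \(\Sigma_{12}\) is \(O\). In other words, for jointly normal subvectors, being uncorrelated is equivalent to being independent.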

(Proof)

If \(\Sigma_{12}=O\), the joint density of \(X\) is:

\[\begin{align} f(x^{(1)},x^{(2)})=& \frac1{(2\pi)^{p/2}|\Sigma|^{1/2}}\exp\left(-\frac12(x-\mu)' \left[ \begin{array}{cc} \Sigma_{11}&O\\ O&\Sigma_{22} \end{array} \right]^{-1} (x-\mu) \right)\\ =& \frac1{(2\pi)^{r/2}|\Sigma_{11}|^{1/2}}\exp\left(-\frac12(x^{(1)}-\mu^{(1)})' \Sigma_{11}^{-1} (x^{(1)}-\mu^{(1)}) \right)\\ &\cdot \frac1{(2\pi)^{(p-r)/2}|\Sigma_{22}|^{1/2}}\exp\left(-\frac12(x^{(2)}-\mu^{(2)})' \Sigma_{22}^{-1} (x^{(2)}-\mu^{(2)}) \right)\\ =&f_1(x^{(1)})\cdot f_2(x^{(2)}) \end{align} \]

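Hence \(X^{(1)}\) and \(X^{(2)}\) are independent. Conversely, if \(X^{(1)}\) and \(X^{(2)}\) are independent, then \(\Sigma_{12}=\mathrm{Cov}(X^{(1)},X^{(2)})=O\), which gives the other direction.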
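
The second equality in the density above rests on two standard facts about block-diagonal matrices. They are spelled out here as a short sketch; nothing beyond elementary linear algebra is assumed:

\[\begin{align} \left|\begin{array}{cc}\Sigma_{11}&O\\ O&\Sigma_{22}\end{array}\right|=&|\Sigma_{11}|\cdot|\Sigma_{22}|,\qquad \left[\begin{array}{cc}\Sigma_{11}&O\\ O&\Sigma_{22}\end{array}\right]^{-1}=\left[\begin{array}{cc}\Sigma_{11}^{-1}&O\\ O&\Sigma_{22}^{-1}\end{array}\right],\\ (x-\mu)'\Sigma^{-1}(x-\mu)=&(x^{(1)}-\mu^{(1)})'\Sigma_{11}^{-1}(x^{(1)}-\mu^{(1)})+(x^{(2)}-\mu^{(2)})'\Sigma_{22}^{-1}(x^{(2)}-\mu^{(2)}) \end{align} \]

Combined with \((2\pi)^{p/2}=(2\pi)^{r/2}(2\pi)^{(p-r)/2}\), the normalizing constant and the exponential each split into a factor in \(x^{(1)}\) and a factor in \(x^{(2)}\).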

(Corollaries)

  • Let \(r_i\geq1\) \((i=1,\dots,k)\) with \(r_1+r_2+\dots+r_k=p\), and let

\[X= \left[ \begin{array}{c} X^{(1)}\\ \vdots\\ X^{(k)} \end{array} \right]\sim N_p \left( \left[ \begin{array}{c} \mu^{(1)}\\ \vdots\\ \mu^{(k)} \end{array} \right], \left[ \begin{array}{ccc} \Sigma_{11} &\cdots &\Sigma_{1k}\\ \vdots&&\vdots\\ \Sigma_{k1} &\cdots &\Sigma_{kk} \end{array} \right]_{p\times p} \right) \]

Then \(X^{(1)},X^{(2)},\dots,X^{(k)}\) are mutually independent \(\iff\) \(\Sigma_{ij}=O\) for all \(i\neq j\).

  • Let \(X=(X_1,\dots,X_p)'\sim N_p(\mu,\Sigma)\). If \(\Sigma\) is a diagonal matrix, then \(X_1,\dots,X_p\) are mutually independent.
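
As a small illustration of how the criterion is read off a covariance matrix (the numbers below are chosen for this example and do not come from the text), take \(p=3\) and

\[X=\left[\begin{array}{c}X_1\\X_2\\X_3\end{array}\right]\sim N_3\left(\left[\begin{array}{c}0\\0\\0\end{array}\right],\left[\begin{array}{cc|c}2&1&0\\1&3&0\\\hline 0&0&4\end{array}\right]\right) \]

With \(X^{(1)}=(X_1,X_2)'\) and \(X^{(2)}=X_3\) we have \(\Sigma_{12}=O\), so \((X_1,X_2)'\) is independent of \(X_3\). The matrix is not diagonal, however: \(\mathrm{Cov}(X_1,X_2)=1\neq0\), so \(X_1\) and \(X_2\) are not independent of each other, and the diagonal-matrix corollary does not apply to all three components.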

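II. Conditional Distribution

For a bivariate normal distribution, the definition of a conditional distribution tells us that, given \(X_2\), the conditional density of \(X_1\) is: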

\[f_1(x_1|x_2)=\frac{f(x_1,x_2)}{f_2(x_2)} \]

We do not yet know the general form of \(f(x_1|x_2)\), but from the joint density of the bivariate normal distribution we have:

\[\begin{align} f(x_1,x_2)=&(*)\\ =&\frac{1}{2\pi\sigma_1\sigma_2\sqrt{1-\rho^2}}\exp\left\{-\frac{1}{2(1-\rho^2)}\left[\left(\frac{x_1-\mu_1}{\sigma_1}\right)^2-2\rho\left(\frac{x_1-\mu_1}{\sigma_1}\right)\left(\frac{x_2-\mu_2}{\sigma_2}\right)+\left(\frac{x_2-\mu_2}{\sigma_2}\right)^2\right]\right\} \end{align} \]

A simple manipulation: adding and subtracting \(\rho^2\left(\frac{x_2-\mu_2}{\sigma_2}\right)^2\) inside the exponent gives:

\[(*)=\frac{1}{2\pi\sigma_1\sigma_2\sqrt{1-\rho^2}}\exp\left\{-\frac{1}{2(1-\rho^2)}\left[\left(\frac{x_1-\mu_1}{\sigma_1}\right)^2-2\rho\left(\frac{x_1-\mu_1}{\sigma_1}\right)\left(\frac{x_2-\mu_2}{\sigma_2}\right)+(1-\rho^2)\left(\frac{x_2-\mu_2}{\sigma_2}\right)^2+\rho^2\left(\frac{x_2-\mu_2}{\sigma_2}\right)^2\right]\right\} \]

By the properties of the exponential function, we can factor out the term \(\exp\left[-\frac1{2(1-\rho^2)}(1-\rho^2)\left(\frac{x_2-\mu_2}{\sigma_2}\right)^2\right]\):

\[\begin{align} (*)=&\frac{1}{\sqrt{2\pi}\sigma_2}\exp\left\{-\frac{1}{2}\left(\frac{x_2-\mu_2}{\sigma_2}\right)^2\right\}\\ &\cdot\frac{1}{\sqrt{2\pi}\sigma_1\sqrt{1-\rho^2}}\exp\left\{-\frac{1}{2(1-\rho^2)}\left[\left(\frac{x_1-\mu_1}{\sigma_1}\right)^2-2\rho\left(\frac{x_1-\mu_1}{\sigma_1}\right)\left(\frac{x_2-\mu_2}{\sigma_2}\right)+\rho^2\left(\frac{x_2-\mu_2}{\sigma_2}\right)^2\right]\right\} \end{align} \]

The first factor is exactly the univariate density \(f_2(x_2)\) of \(X_2\sim N(\mu_2,\sigma_2^2)\). Completing the square in the second factor gives:

\[(*)=f_2(x_2)\cdot\frac{1}{\sqrt{2\pi}\sigma_1\sqrt{1-\rho^2}}\cdot\exp\left\{-\frac{1}{2(1-\rho^2)}\left[\left(\frac{x_1-\mu_1}{\sigma_1}\right)-\rho\left(\frac{x_2-\mu_2}{\sigma_2}\right)\right]^2\right\} \]

Since \(k^2(a+\frac{b}{k})^2=(ka+b)^2\) (applied here with \(k=\sigma_1\)), a simple rearrangement gives:

\[(*)=f_2(x_2)\cdot\frac{1}{\sqrt{2\pi}\sigma_1\sqrt{1-\rho^2}}\cdot\exp\left\{-\frac{1}{2(1-\rho^2)\sigma_1^2}\left[x_1-\mu_1-\rho\frac{\sigma_1}{\sigma_2}(x_2-\mu_2)\right]^2\right\} \]

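We have thus factored the bivariate normal density into a marginal density times a conditional density (the multiplication rule for densities):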

\[f(x_1,x_2)=f_2(x_2)\cdot f(x_1|x_2) \]

Here \(f(x_1|x_2)\) is the conditional density of \(X_1\) given \(X_2=x_2\):

\[f(x_1|x_2)=\frac{1}{\sqrt{2\pi}\sigma_1\sqrt{1-\rho^2}}\cdot\exp\left\{ -\frac{1}{2(1-\rho^2)\sigma_1^2}\left[x_1-\left(\mu_1 +\rho\frac{\sigma_1}{\sigma_2}(x_2-\mu_2)\right)\right]^2 \right\} \]

It follows that \((X_1|X_2)\) is normally distributed, with:

\[(X_1|X_2)\sim N_1\left(\mu_1+\rho\frac{\sigma_1}{\sigma_2}(x_2-\mu_2),\sigma_1^2(1-\rho^2)\right) \]
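
The factorization can be checked numerically. The following is a minimal sketch using only the Python standard library. The function names and parameter values are illustrative assumptions, not part of the original derivation. It compares \(f(x_1,x_2)/f_2(x_2)\) with the normal density that has the conditional mean and variance derived above.

```python
import math

def normal_pdf(x, mean, var):
    """Density of the univariate normal N(mean, var) at x."""
    return math.exp(-(x - mean) ** 2 / (2 * var)) / math.sqrt(2 * math.pi * var)

def bivariate_pdf(x1, x2, mu1, mu2, s1, s2, rho):
    """Joint density of the bivariate normal, written as in (*)."""
    z1 = (x1 - mu1) / s1
    z2 = (x2 - mu2) / s2
    q = (z1 * z1 - 2 * rho * z1 * z2 + z2 * z2) / (1 - rho * rho)
    return math.exp(-q / 2) / (2 * math.pi * s1 * s2 * math.sqrt(1 - rho * rho))

def conditional_params(x2, mu1, mu2, s1, s2, rho):
    """Mean and variance of X1 given X2 = x2."""
    mean = mu1 + rho * s1 / s2 * (x2 - mu2)
    var = s1 * s1 * (1 - rho * rho)
    return mean, var

# Hypothetical parameter values, chosen only for illustration.
mu1, mu2, s1, s2, rho = 1.0, -2.0, 2.0, 0.5, 0.6
x1, x2 = 0.3, -1.7

cond_mean, cond_var = conditional_params(x2, mu1, mu2, s1, s2, rho)

# Left side: the definition f(x1, x2) / f2(x2).
lhs = bivariate_pdf(x1, x2, mu1, mu2, s1, s2, rho) / normal_pdf(x2, mu2, s2 * s2)
# Right side: the normal density with the derived conditional parameters.
rhs = normal_pdf(x1, cond_mean, cond_var)

print(cond_mean, cond_var, abs(lhs - rhs) < 1e-12)
```

The two sides agree up to floating-point rounding for any valid parameter choice with \(|\rho|<1\). A single evaluation point is of course only a sanity check, not a proof.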

We now generalize this to the multivariate case. Let

\[X= \left[ \begin{array}{c} X^{(1)}_r\\ X^{(2)}_{p-r} \end{array} \right]\sim N_p(\mu,\Sigma),(\Sigma>0) \]

Then, given \(X^{(2)}\), the conditional distribution of \(X^{(1)}\) is:

\[(X^{(1)}|X^{(2)})\sim N_r(\mu_{1\cdot2},\Sigma_{11\cdot2}) \]

where

\[\begin{align} \mu_{1\cdot2}=&\mu^{(1)}+\Sigma_{12}\Sigma_{22}^{-1}(x^{(2)}-\mu^{(2)})\\ \Sigma_{11\cdot2}=&\Sigma_{11}-\Sigma_{12}\Sigma_{22}^{-1}\Sigma_{21} \end{align} \]
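
As a consistency check, these formulas reduce to the bivariate result when \(p=2\) and \(r=1\). In that case \(\Sigma_{11}=\sigma_1^2\), \(\Sigma_{12}=\Sigma_{21}=\rho\sigma_1\sigma_2\) and \(\Sigma_{22}=\sigma_2^2\), so that:

\[\begin{align} \mu_{1\cdot2}=&\mu_1+\frac{\rho\sigma_1\sigma_2}{\sigma_2^2}(x_2-\mu_2)=\mu_1+\rho\frac{\sigma_1}{\sigma_2}(x_2-\mu_2)\\ \Sigma_{11\cdot2}=&\sigma_1^2-\frac{(\rho\sigma_1\sigma_2)^2}{\sigma_2^2}=\sigma_1^2(1-\rho^2) \end{align} \]

These are exactly the conditional mean and variance obtained in the bivariate derivation.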

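The proof follows. It is in fact quite instructive for solving problems; an exercise from the textbook will be attached later:

(Lemma: block inversion formula for \(\Sigma\))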

\[\left[ \begin{array}{c|c} \Sigma_{11}&\Sigma_{12}\\\hline \Sigma_{21}&\Sigma_{22} \end{array} \right]^{-1} =\Sigma^{-1}=\left[ \begin{array}{c|c} \Sigma_{11.2}^{-1}&-\Sigma_{11.2}^{-1}\Sigma_{12}\Sigma_{22}^{-1}\\\hline -\Sigma_{22}^{-1}\Sigma_{21}\Sigma_{11.2}^{-1}&\Sigma_{22}^{-1}+\Sigma_{22}^{-1}\Sigma_{21}\Sigma_{11.2}^{-1}\Sigma_{12}\Sigma_{22}^{-1} \end{array} \right] \]

where \(\Sigma_{11.2}=\Sigma_{11}-\Sigma_{12}\Sigma_{22}^{-1}\Sigma_{21}\) is the same matrix that was written \(\Sigma_{11\cdot2}\) above.
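
The lemma can be verified on a concrete matrix. The sketch below uses exact rational arithmetic from the Python standard library. The \(3\times3\) matrix, the choice \(r=1\) and the helper names are all assumptions made for this illustration. It assembles \(\Sigma^{-1}\) block by block from the formula and checks that \(\Sigma\Sigma^{-1}=I\).

```python
from fractions import Fraction as F

def matmul(A, B):
    """Product of two matrices stored as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def add(A, B):
    """Entrywise sum of two matrices of the same shape."""
    return [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

def neg(A):
    """Entrywise negation of a matrix."""
    return [[-a for a in row] for row in A]

def inv2(M):
    """Inverse of a 2x2 matrix (sufficient here because Sigma_22 is 2x2)."""
    (a, b), (c, d) = M
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

# Hypothetical positive definite covariance matrix, partitioned with r = 1.
Sigma = [[F(4), F(2), F(1)],
         [F(2), F(3), F(1)],
         [F(1), F(1), F(2)]]
S11 = [[Sigma[0][0]]]
S12 = [Sigma[0][1:]]
S21 = [[Sigma[1][0]], [Sigma[2][0]]]
S22 = [row[1:] for row in Sigma[1:]]

S22_inv = inv2(S22)
# Sigma_{11.2} is a scalar here because r = 1.
S11_2 = S11[0][0] - matmul(matmul(S12, S22_inv), S21)[0][0]
A = [[1 / S11_2]]  # Sigma_{11.2}^{-1}

# The three remaining blocks of the lemma.
top_right = neg(matmul(matmul(A, S12), S22_inv))
bottom_left = neg(matmul(matmul(S22_inv, S21), A))
bottom_right = add(
    S22_inv,
    matmul(matmul(matmul(matmul(S22_inv, S21), A), S12), S22_inv),
)

# Assemble the full inverse and multiply back.
Sigma_inv = [A[0] + top_right[0]] + [bl + br for bl, br in zip(bottom_left, bottom_right)]
identity = matmul(Sigma, Sigma_inv)
print(Sigma_inv)
print(identity)
```

Using `Fraction` avoids rounding, so the product is exactly the identity matrix rather than approximately so.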

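(Proof)

To find the distribution of \((X^{(1)}|X^{(2)})\) we only need to construct its probability density function. By the definition of a conditional distribution: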

\[f(x^{(1)},x^{(2)})=f_1(x^{(1)}|x^{(2)})f_2(x^{(2)}) \]

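In the same spirit as the bivariate derivation, where the joint density was split into a factor depending only on \(x_2\) and a remaining factor, we construct a nonsingular linear transformation: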

\[\begin{align} Z=\left[\begin{array}{c}Z^{(1)}\\Z^{(2)}\end{array}\right]=&\left[\begin{array}{c}X^{(1)}-\Sigma_{12}\Sigma_{22}^{-1}X^{(2)}\\X^{(2)}\end{array}\right]\\ =&\left[\begin{array}{c|c}I_r&-\Sigma_{12}\Sigma_{22}^{-1}\\\hline O&I_{p-r}\end{array}\right]\left[\begin{array}{c}X^{(1)}\\X^{(2)}\end{array}\right]\\ =&BX \end{align} \]

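Then \(Z\sim N_p(B\mu,B\Sigma B')\). The first block of the mean \(B\mu\) is \(\mu^{(1)}-\Sigma_{12}\Sigma_{22}^{-1}\mu^{(2)}\) and the second block is \(\mu^{(2)}\). The covariance matrix is: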

\[\begin{align} B\Sigma B'=&\left[\begin{array}{c|c}I_r&-\Sigma_{12}\Sigma_{22}^{-1}\\\hline O&I_{p-r}\end{array}\right]\left[ \begin{array}{c|c} \Sigma_{11}&\Sigma_{12}\\\hline \Sigma_{21}&\Sigma_{22} \end{array} \right]\left[\begin{array}{c|c}I_r&O\\\hline -\Sigma_{22}^{-1}\Sigma_{21}&I_{p-r}\end{array}\right]\\ =&\left[\begin{array}{c|c}\Sigma_{11.2}&O\\\hline O&\Sigma_{22}\end{array}\right] \end{align} \]
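
For readers who want the intermediate step, the product can be carried out in two stages. This is a sketch that uses \(B'=\left[\begin{array}{c|c}I_r&O\\\hline -\Sigma_{22}^{-1}\Sigma_{21}&I_{p-r}\end{array}\right]\), which follows from the symmetry of \(\Sigma\) (so that \((\Sigma_{12}\Sigma_{22}^{-1})'=\Sigma_{22}^{-1}\Sigma_{21}\)):

\[\begin{align} B\Sigma=&\left[\begin{array}{c|c}\Sigma_{11}-\Sigma_{12}\Sigma_{22}^{-1}\Sigma_{21}&\Sigma_{12}-\Sigma_{12}\Sigma_{22}^{-1}\Sigma_{22}\\\hline \Sigma_{21}&\Sigma_{22}\end{array}\right]=\left[\begin{array}{c|c}\Sigma_{11.2}&O\\\hline \Sigma_{21}&\Sigma_{22}\end{array}\right]\\ (B\Sigma)B'=&\left[\begin{array}{c|c}\Sigma_{11.2}&O\\\hline \Sigma_{21}-\Sigma_{22}\Sigma_{22}^{-1}\Sigma_{21}&\Sigma_{22}\end{array}\right]=\left[\begin{array}{c|c}\Sigma_{11.2}&O\\\hline O&\Sigma_{22}\end{array}\right] \end{align} \]

Multiplying by \(B\) on the left clears the upper-right block, and multiplying by \(B'\) on the right clears the lower-left block.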

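By the independence criterion of Part I, \(Z^{(1)}\) and \(Z^{(2)}\) are therefore independent, so we can write down the joint density \(g(z^{(1)},z^{(2)})\) of \(Z\). Note also that \(Z^{(2)}=X^{(2)}\):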

\[g(z^{(1)},z^{(2)})=g_1(z^{(1)})g_2(z^{(2)})=g_1(z^{(1)})f_2(z^{(2)}) \]

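In addition, since \(Z=BX\), we can use the Jacobian to express the density \(f(x)\) of \(X\) in terms of \(g(z)\). Here the Jacobian factor is \(J(z\to x)=|\det B|=1\), because \(B\) is block upper triangular with identity matrices on the diagonal: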

\[\begin{align} f(x^{(1)},x^{(2)})=&g(Bx)\cdot J(z\to x)\\ =&g_1(x^{(1)}-\Sigma_{12}\Sigma_{22}^{-1}x^{(2)})f_2(x^{(2)}) \end{align} \]

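Let us summarize at this point:

  • We constructed a nonsingular linear transformation and showed that \(Z\) is a normally distributed random vector, with \(Z^{(1)}\) and \(Z^{(2)}=X^{(2)}\) independent;
  • Again using the properties of the linear transformation, we used the Jacobian to relate the density of \(X\) to the density of \(Z\).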

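By the definition of a conditional distribution, we can now write down the density of \((X^{(1)}|X^{(2)})\). Note that subtracting the mean of \(Z^{(1)}\) gives \(x^{(1)}-\Sigma_{12}\Sigma_{22}^{-1}x^{(2)}-(\mu^{(1)}-\Sigma_{12}\Sigma_{22}^{-1}\mu^{(2)})=x^{(1)}-\mu_{1.2}\):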

\[\begin{align} f_1(x^{(1)}|x^{(2)})=&\frac{f(x^{(1)},x^{(2)})}{f_2(x^{(2)})}=g_1(x^{(1)}-\Sigma_{12}\Sigma_{22}^{-1}x^{(2)})\\ =&\frac{1}{(2\pi)^{r/2}|\Sigma_{11.2}|^{1/2}}\exp\left[-\frac12(x^{(1)}-\mu_{1.2})'\Sigma_{11.2}^{-1}(x^{(1)}-\mu_{1.2})\right] \end{align} \]

By definition, this is the density of a normal distribution, that is:

\[(X^{(1)}|X^{(2)})\sim N_r(\mu_{1.2},\Sigma_{11.2}) \]

Important corollaries!

  • \(X^{(1)}-\Sigma_{12}\Sigma_{22}^{-1}X^{(2)}\) and \(X^{(2)}\) are independent;
  • \(X^{(2)}-\Sigma_{21}\Sigma_{11}^{-1}X^{(1)}\) and \(X^{(1)}\) are independent;
  • \((X^{(2)}|X^{(1)})\sim N_{p-r}(\mu_{2.1},\Sigma_{22.1})\), where

\[\begin{align} \mu_{2.1}=&\mu^{(2)}+\Sigma_{21}\Sigma_{11}^{-1}(x^{(1)}-\mu^{(1)})\\ \Sigma_{22.1}=&\Sigma_{22}-\Sigma_{21}\Sigma_{11}^{-1}\Sigma_{12} \end{align} \]
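
To make the conditional formulas and the first corollary concrete, here is a minimal sketch in exact rational arithmetic using only the Python standard library. The mean vector, covariance matrix, observed value and helper names are assumptions chosen for illustration, with \(p=3\) and \(r=1\). It computes \(\Sigma_{12}\Sigma_{22}^{-1}\), the conditional mean and covariance of \(X^{(1)}\) given \(X^{(2)}=x^{(2)}\), and the cross-covariance \(\Sigma_{12}-\Sigma_{12}\Sigma_{22}^{-1}\Sigma_{22}\) of \(X^{(1)}-\Sigma_{12}\Sigma_{22}^{-1}X^{(2)}\) with \(X^{(2)}\).

```python
from fractions import Fraction as F

def matmul(A, B):
    """Product of two matrices stored as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def sub(A, B):
    """Entrywise difference of two matrices of the same shape."""
    return [[a - b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

def inv2(M):
    """Inverse of a 2x2 matrix (sufficient here because Sigma_22 is 2x2)."""
    (a, b), (c, d) = M
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

# Hypothetical example: p = 3, r = 1, so X^(1) = X_1 and X^(2) = (X_2, X_3)'.
mu = [F(1), F(0), F(-1)]
Sigma = [[F(4), F(2), F(1)],
         [F(2), F(3), F(1)],
         [F(1), F(1), F(2)]]
r = 1

# Blocks of the partitioned covariance matrix.
S11 = [row[:r] for row in Sigma[:r]]
S12 = [row[r:] for row in Sigma[:r]]
S21 = [row[:r] for row in Sigma[r:]]
S22 = [row[r:] for row in Sigma[r:]]

coef = matmul(S12, inv2(S22))        # Sigma_12 Sigma_22^{-1}
S11_2 = sub(S11, matmul(coef, S21))  # Sigma_11 - Sigma_12 Sigma_22^{-1} Sigma_21

# Conditional mean at a hypothetical observed value x2 of X^(2).
x2 = [F(1), F(0)]
dev = [[x - m] for x, m in zip(x2, mu[r:])]  # column vector x2 - mu2
mu_1_2 = [m + d[0] for m, d in zip(mu[:r], matmul(coef, dev))]

# Cov(X^(1) - coef X^(2), X^(2)) = Sigma_12 - coef Sigma_22; it should vanish.
cross_cov = sub(S12, matmul(coef, S22))

print(coef, S11_2, mu_1_2, cross_cov)
```

The vanishing cross-covariance, together with joint normality, is what makes \(X^{(1)}-\Sigma_{12}\Sigma_{22}^{-1}X^{(2)}\) and \(X^{(2)}\) independent. The other two corollaries follow by swapping the roles of the two blocks.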

posted @ 2020-02-22 15:17  ExplodedVegetable