【8】Maximum Likelihood Estimation from a Multivariate Normal Sample
【8】MLE for the p-Variate Normal Distribution
Consider a \(p\)-variate normal population \(X\sim N_p(\mu,\Sigma)\), and let \(X_{(i)}=(x_{i1},\dots,x_{ip})'\), \(i=1,\dots,n\), be a simple random sample from it. The observed data matrix is:
\[X=
\left(
\begin{array}
{cccc}
x_{11} & x_{12} & \dots & x_{1p}\\
x_{21} & x_{22} & \dots & x_{2p}\\
\vdots & \vdots & & \vdots \\
x_{n1} & x_{n2} & \dots & x_{np}\\
\end{array}
\right)
\]
which is a random matrix.
Lemma 1
For matrices \(A_{m\times n}\) and \(B_{n\times m}\): \(tr(AB)=tr(BA)\).
\[\begin{align}
tr(AB)=&\sum_{i=1}^m(AB)_{ii}\\
=&\sum_{i=1}^m(\sum_{j=1}^na_{ij}b_{ji})\\
\text{Similarly: }tr(BA)=&\sum_{i=1}^n(\sum_{j=1}^mb_{ij}a_{ji})
\end{align}
\]
Because the order of two finite sums can be interchanged (and the summation indices relabeled), it follows that:
\[tr(AB)=\sum_{i=1}^m(\sum_{j=1}^na_{ij}b_{ji})=\sum_{i=1}^n(\sum_{j=1}^mb_{ij}a_{ji})=tr(BA)
\]
Applying this with \(A=\prod_{i=1}^{n-1}A_i\) and \(B=A_n\) extends the result to a product of several matrices:
\[tr(\prod_{i=1}^nA_i)=tr(A_n\prod_{i=1}^{n-1}A_i)
\]
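As a quick numerical sanity check of Lemma 1, the sketch below compares \(tr(AB)\) with \(tr(BA)\) for random rectangular matrices, and also checks one three-factor case. It uses only the Python standard library; the helper names (`matmul`, `trace`, `random_matrix`) and the matrix sizes are illustrative assumptions, not part of the derivation.

```python
import random

def matmul(A, B):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def trace(M):
    """Sum of the diagonal entries of a square matrix."""
    return sum(M[i][i] for i in range(len(M)))

def random_matrix(rows, cols, rng):
    """Matrix with independent entries drawn uniformly from [-1, 1]."""
    return [[rng.uniform(-1.0, 1.0) for _ in range(cols)] for _ in range(rows)]

rng = random.Random(0)          # fixed seed so the check is reproducible
A = random_matrix(3, 5, rng)    # m x n
B = random_matrix(5, 3, rng)    # n x m
C = random_matrix(3, 3, rng)    # m x m, used for the three-factor case

tr_AB = trace(matmul(A, B))     # trace of a 3 x 3 product
tr_BA = trace(matmul(B, A))     # trace of a 5 x 5 product

# Cyclic version: moving the last factor to the front leaves the trace unchanged.
tr_ABC = trace(matmul(matmul(A, B), C))
tr_CAB = trace(matmul(C, matmul(A, B)))

print(abs(tr_AB - tr_BA), abs(tr_ABC - tr_CAB))
```

Both printed differences should be at the level of floating-point rounding error, even though \(AB\) and \(BA\) have different sizes.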
The likelihood function \(L(\mu,\Sigma)\)
For the sample \(X_{(i)}=(x_{i1},\dots,x_{ip})'\), \(i=1,\dots,n\), each observation has the density:
\[f(x_{(i)})=\frac1{(2\pi)^{p/2}|\Sigma|^{1/2}}exp\left\{-\frac12(x_{(i)}-\mu)'\Sigma^{-1}(x_{(i)}-\mu)\right\}
\]
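By the definition of the likelihood function (the observations are independent, so their joint density is the product):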
\[\begin{align}
L(\mu,\Sigma)=&\prod_{i=1}^nf(x_{(i)})\\
=&\prod_{i=1}^n\frac1{(2\pi)^{p/2}|\Sigma|^{1/2}}exp\left\{-\frac12(x_{(i)}-\mu)'\Sigma^{-1}(x_{(i)}-\mu)\right\}\\
=&\frac1{(2\pi)^{np/2}|\Sigma|^{n/2}}exp\left\{-\frac12\sum_{i=1}^n(x_{(i)}-\mu)'\Sigma^{-1}(x_{(i)}-\mu)\right\}
\end{align}
\]
Since the quadratic form is a scalar:
\[(x_{(i)}-\mu)'\Sigma^{-1}(x_{(i)}-\mu)=C_0\quad\text{(a scalar)}
\]
it equals its own trace:
\[(x_{(i)}-\mu)'\Sigma^{-1}(x_{(i)}-\mu)=tr\{(x_{(i)}-\mu)'\Sigma^{-1}(x_{(i)}-\mu)\}=tr(C_0)
\]
Hence:
\[\begin{align}
L(\mu,\Sigma)
=&\frac1{(2\pi)^{np/2}|\Sigma|^{n/2}}exp\left\{-\frac12\sum_{i=1}^n(x_{(i)}-\mu)'\Sigma^{-1}(x_{(i)}-\mu)\right\}\\
=&\frac1{(2\pi)^{np/2}|\Sigma|^{n/2}}exp\left\{-\frac12\sum_{i=1}^ntr[(x_{(i)}-\mu)'\Sigma^{-1}(x_{(i)}-\mu)]\right\}\\
\text{(by Lemma 1)}=&\frac1{(2\pi)^{np/2}|\Sigma|^{n/2}}exp\left\{-\frac12\sum_{i=1}^ntr[\Sigma^{-1}(x_{(i)}-\mu)(x_{(i)}-\mu)']\right\}\\
=&\frac1{(2\pi)^{np/2}|\Sigma|^{n/2}}exp\left\{tr(-\frac12\Sigma^{-1}\sum_{i=1}^n(x_{(i)}-\mu)(x_{(i)}-\mu)')\right\}
\end{align}
\]
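The trace rewriting above can be checked numerically. The following is a minimal sketch for \(p=2\), assuming an arbitrary symmetric positive definite matrix `prec` standing in for \(\Sigma^{-1}\) and an arbitrary trial mean `mu`; both values are made up for illustration.

```python
import random

rng = random.Random(4)
mu = [0.3, -0.7]                       # arbitrary trial mean (assumption)
prec = [[2.0, 0.5], [0.5, 1.0]]        # stands in for Sigma^{-1} (assumption)
samples = [[rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)] for _ in range(10)]

quad_sum = 0.0                         # sum of the scalars (x - mu)' prec (x - mu)
S = [[0.0, 0.0], [0.0, 0.0]]           # accumulates sum of (x - mu)(x - mu)'
for x in samples:
    d = [x[0] - mu[0], x[1] - mu[1]]
    quad_sum += sum(d[i] * prec[i][j] * d[j] for i in range(2) for j in range(2))
    for i in range(2):
        for j in range(2):
            S[i][j] += d[i] * d[j]

# tr(prec S) = sum over i, j of prec[i][j] * S[j][i]
trace_form = sum(prec[i][j] * S[j][i] for i in range(2) for j in range(2))

print(quad_sum, trace_form)
```

The two printed numbers should agree up to rounding, which is exactly the statement that the sum of quadratic forms equals \(tr(\Sigma^{-1}\sum_i(x_{(i)}-\mu)(x_{(i)}-\mu)')\).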
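Here, writing \(\overline{X}=\frac1n\sum_{i=1}^nX_{(i)}\) for the sample mean and \(A=\sum_{i=1}^n(x_{(i)}-\overline{X})(x_{(i)}-\overline{X})'\) for the sample scatter matrix, the sum of outer products decomposes as follows (the cross terms vanish because \(\sum_{i=1}^n(x_{(i)}-\overline{X})=0\)):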
\[\begin{align}
&\sum_{i=1}^n(x_{(i)}-\mu)(x_{(i)}-\mu)'\\
=&\sum_{i=1}^n(x_{(i)}-\overline{X}+\overline{X}-\mu)(x_{(i)}-\overline{X}+\overline{X}-\mu)'\\
=&\sum_{i=1}^n(x_{(i)}-\overline{X})(x_{(i)}-\overline{X})'+n(\overline{X}-\mu)(\overline{X}-\mu)'\\
=&A+n(\overline{X}-\mu)(\overline{X}-\mu)'
\end{align}
\]
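The decomposition can be verified on simulated data. This is a small sketch under assumed values (a sample of size 6 in dimension 3 and an arbitrary trial mean `mu`); the helpers `outer`, `add`, and `scatter` are hypothetical names introduced only for this check.

```python
import random

def outer(u, v):
    """Outer product u v' of two vectors."""
    return [[ui * vj for vj in v] for ui in u]

def add(M, N):
    """Entrywise sum of two matrices of the same shape."""
    return [[m + n for m, n in zip(rm, rn)] for rm, rn in zip(M, N)]

def scatter(samples, center):
    """Sum of (x - center)(x - center)' over all sample vectors x."""
    p = len(center)
    total = [[0.0] * p for _ in range(p)]
    for x in samples:
        d = [xi - ci for xi, ci in zip(x, center)]
        total = add(total, outer(d, d))
    return total

rng = random.Random(1)
n, p = 6, 3
samples = [[rng.gauss(0.0, 1.0) for _ in range(p)] for _ in range(n)]
mu = [0.5, -1.0, 2.0]                            # arbitrary trial mean
xbar = [sum(col) / n for col in zip(*samples)]   # sample mean vector

lhs = scatter(samples, mu)                       # sum of (x - mu)(x - mu)'
A = scatter(samples, xbar)                       # sample scatter matrix
d = [xb - m for xb, m in zip(xbar, mu)]
rhs = add(A, [[n * e for e in row] for row in outer(d, d)])

# Largest entrywise difference between the two sides of the identity.
max_gap = max(abs(l - r) for rl, rr in zip(lhs, rhs) for l, r in zip(rl, rr))
print(max_gap)
```

The printed gap should be at rounding level for any choice of `mu`, since the identity does not depend on the trial mean.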
Substituting back into the likelihood, with the shorthand \(etr\{\cdot\}=exp\{tr(\cdot)\}\), gives:
\[\begin{align}
L(\mu,\Sigma)=&\frac1{(2\pi)^{np/2}|\Sigma|^{n/2}}etr\left\{-\frac12\Sigma^{-1}\sum_{i=1}^n(x_{(i)}-\mu)(x_{(i)}-\mu)'\right\}\\
=&\frac1{(2\pi)^{np/2}|\Sigma|^{n/2}}etr\left\{-\frac12\Sigma^{-1}(A+n(\overline{X}-\mu)(\overline{X}-\mu)')\right\}\\
\text{(taking logarithms)}\quad\ln{L(\mu,\Sigma)}=&
-\frac{np}2\ln{(2\pi)}-\frac{n}2\ln{|\Sigma|}-\frac12tr\left[\Sigma^{-1}(A+n(\overline{X}-\mu)(\overline{X}-\mu)')\right]\\
=&
-\frac{np}2\ln{(2\pi)}-\frac{n}2\ln{|\Sigma|}-\frac12tr\left[\Sigma^{-1}A+\Sigma^{-1}n(\overline{X}-\mu)(\overline{X}-\mu)'\right]\\
=&
-\frac{np}2\ln{(2\pi)}-\frac{n}2\ln{|\Sigma|}-\frac12tr[\Sigma^{-1}A]-\frac12tr[\Sigma^{-1}n(\overline{X}-\mu)(\overline{X}-\mu)']\\
\text{(equality iff }\mu=\overline{X}\text{)}\quad\leq&-\frac{np}2\ln{(2\pi)}-\frac{n}2\ln{|\Sigma|}-\frac12tr[\Sigma^{-1}A]
\end{align}
\]
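Maximizing the likelihood:
Method 1
- (Lemma 2)
Let \(B\) be a \(p\times p\) positive definite matrix. Then \(tr(B)-\ln{|B|}\geq p\). Proof: let \(\lambda_1,\dots,\lambda_p>0\) be the eigenvalues of \(B\); since \(\ln(1+x)\leq x\) for \(x>-1\),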
\[\ln{|B|}=\sum_{i=1}^p\ln{\lambda_i}=\sum_{i=1}^p\ln{(1+\lambda_i-1)}\leq\sum_{i=1}^p(\lambda_i-1)=tr(B)-p
\]
Hence:
\[tr{\,(B)}-\ln{|B|}\ge p\tag{Lemma 2, Q.E.D.}
\]
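Lemma 2 can be probed numerically. The sketch below evaluates \(tr(B)-\ln|B|-p\) for many random \(2\times2\) symmetric positive definite matrices; the construction \(B=MM'+0.1I\), the sample count, and the function name `lemma_gap` are assumptions chosen for illustration.

```python
import math
import random

def lemma_gap(B):
    """Return tr(B) - ln|B| - p for a 2 x 2 matrix B given as nested lists."""
    tr = B[0][0] + B[1][1]
    det = B[0][0] * B[1][1] - B[0][1] * B[1][0]
    return tr - math.log(det) - 2

rng = random.Random(2)
gaps = []
for _ in range(1000):
    a, b, c, d = (rng.uniform(-2.0, 2.0) for _ in range(4))
    # B = M M' + 0.1 I is symmetric positive definite for any real M = [[a, b], [c, d]].
    B = [[a * a + b * b + 0.1, a * c + b * d],
         [a * c + b * d, c * c + d * d + 0.1]]
    gaps.append(lemma_gap(B))

smallest_gap = min(gaps)                              # should not be negative
identity_gap = lemma_gap([[1.0, 0.0], [0.0, 1.0]])    # equality case B = I

print(smallest_gap, identity_gap)
```

The gap is zero at \(B=I_p\), where every eigenvalue equals 1, and should stay nonnegative for every other sampled matrix.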
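Setting \(\mu=\overline{X}\) and applying Lemma 2 with \(B=\Sigma^{-1}\frac An\) (its eigenvalues are positive because it is similar to the positive definite matrix \(\Sigma^{-1/2}\frac An\Sigma^{-1/2}\), assuming \(A\) is positive definite, which holds with probability 1 when \(n>p\)) gives:
\[\begin{align}
\ln{L(\overline{X},\Sigma)}=&
-\frac{np}2\ln{(2\pi)}-\frac{n}2\ln{|\Sigma|}-\frac12tr[\Sigma^{-1}A]\\
=&-\frac{np}2\ln{(2\pi)}-\frac{n}2\left[\ln{|\Sigma|}+tr[\Sigma^{-1}\frac{A}n]\right]\\
=&-\frac{np}2\ln{(2\pi)}-\frac{n}2\left[tr[\Sigma^{-1}\frac{A}n]-\ln{|\Sigma^{-1}\frac An|}+\ln{|\frac{A}{n}|}\right]\\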
\leq&-\frac{np}2\ln{(2\pi)}-\frac n2(p+\ln{|\frac An|})
\end{align}
\]
Equality holds if and only if \(\Sigma^{-1}\frac{A}n=I_p\), that is, \(\Sigma=\frac{A}n\). Therefore:
\[\ln{L(\overline{X},\frac1nA)}=\max_{\mu,\Sigma>0}\ln{L(\mu,\Sigma)}=-\frac{np}2(1+\ln(2\pi))-\frac n2\ln{|\frac{A}n|}
\]
so the maximum likelihood estimates are
\[\begin{align}
\hat\mu=&\overline{X}=\frac1n\sum_{i=1}^nX_{(i)}\\
\hat\Sigma=&\frac1nA=\frac1n\sum_{i=1}^n(x_{(i)}-\overline{X})(x_{(i)}-\overline{X})'
\end{align}
\]
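To tie the result together, here is a minimal sketch for \(p=2\) that computes \(\overline{X}\) and \(\frac1nA\) from simulated data, compares the log-likelihood at these estimates with the closed-form maximum, and checks that two perturbed parameter choices give a smaller value. The simulated data, the perturbation sizes, and the function names `log_likelihood` and `mle` are assumptions for illustration only.

```python
import math
import random

def log_likelihood(samples, mu, sigma):
    """Log-likelihood of a bivariate normal N_2(mu, sigma) for the given samples."""
    n = len(samples)
    (a, b), (c, d) = sigma
    det = a * d - b * c
    inv = [[d / det, -b / det], [-c / det, a / det]]   # closed-form 2 x 2 inverse
    quad = 0.0
    for x in samples:
        u, v = x[0] - mu[0], x[1] - mu[1]
        quad += u * (inv[0][0] * u + inv[0][1] * v) + v * (inv[1][0] * u + inv[1][1] * v)
    # With p = 2 the constant term -(n p / 2) ln(2 pi) becomes -n ln(2 pi).
    return -n * math.log(2 * math.pi) - 0.5 * n * math.log(det) - 0.5 * quad

def mle(samples):
    """Return the sample mean and A / n (note the divisor n, not n - 1)."""
    n = len(samples)
    xbar = [sum(col) / n for col in zip(*samples)]
    s = [[0.0, 0.0], [0.0, 0.0]]
    for x in samples:
        d = [x[0] - xbar[0], x[1] - xbar[1]]
        for i in range(2):
            for j in range(2):
                s[i][j] += d[i] * d[j] / n
    return xbar, s

rng = random.Random(3)
samples = []
for _ in range(200):
    z1, z2 = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
    samples.append([1.0 + 2.0 * z1, -1.0 + 0.6 * z1 + 0.8 * z2])   # correlated pair

mu_hat, sigma_hat = mle(samples)
n, p = len(samples), 2
det_hat = sigma_hat[0][0] * sigma_hat[1][1] - sigma_hat[0][1] * sigma_hat[1][0]
closed_form = -0.5 * n * p * (1 + math.log(2 * math.pi)) - 0.5 * n * math.log(det_hat)
at_mle = log_likelihood(samples, mu_hat, sigma_hat)

# Perturbed parameters should give a strictly smaller log-likelihood.
shifted_mean = log_likelihood(samples, [mu_hat[0] + 0.1, mu_hat[1]], sigma_hat)
scaled_cov = log_likelihood(samples, mu_hat, [[1.2 * e for e in row] for row in sigma_hat])

print(at_mle, closed_form, shifted_mean, scaled_cov)
```

The first two printed values should agree up to rounding, and the last two should be smaller. The divisor \(n\) in `mle` matches the maximum likelihood estimate derived here, which differs from the unbiased sample covariance that divides by \(n-1\).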