Gaussian distribution
Univariate normal distribution
Common names: univariate Gaussian, normal distribution, Gaussian distribution, etc.
Definition
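If a random variable \(X\) has the probability density function
\[{\displaystyle f(x)={\frac {1}{\sqrt {2\pi \sigma ^{2}}}}\exp \left(-{\frac {(x-\mu )^{2}}{2\sigma ^{2}}}\right)}\]
then \(X\) is said to follow a normal distribution, written \({\displaystyle X\sim N(\mu ,\sigma ^{2})}\), where \(\mu\) and \(\sigma^{2}\) are called the mean and the variance, respectively.
When \(\mu = 0\) and \(\sigma ^ 2 = 1\), the distribution is called the standard normal distribution.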
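As a small illustration, the density of \(N(\mu ,\sigma ^{2})\) can be evaluated directly from its closed form. The sketch below is not tied to any particular library; the function name `normal_pdf` and the test values are illustrative assumptions, and the result is cross-checked against Python's standard-library `statistics.NormalDist`.

```python
import math
from statistics import NormalDist


def normal_pdf(x: float, mu: float = 0.0, sigma2: float = 1.0) -> float:
    """Density of N(mu, sigma2) at x. Note: sigma2 is the variance, not the standard deviation."""
    if sigma2 <= 0:
        raise ValueError("sigma2 must be positive")
    # exp(-(x - mu)^2 / (2 sigma^2)) / sqrt(2 pi sigma^2)
    return math.exp(-((x - mu) ** 2) / (2.0 * sigma2)) / math.sqrt(2.0 * math.pi * sigma2)


# Cross-check against the standard library, which is parameterised by the
# standard deviation sigma rather than the variance sigma^2.
reference = NormalDist(mu=1.0, sigma=2.0).pdf(0.5)
ours = normal_pdf(0.5, mu=1.0, sigma2=4.0)
print(ours, reference)
```

With the default arguments the function gives the standard normal density.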
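Simple properties
- Any normal distribution can be obtained from the standard normal distribution by a linear transformation: if \(Z\sim N(0,1)\), then \(\mu +\sigma Z\sim N(\mu ,\sigma ^{2})\).
- If \({\displaystyle X\sim N(\mu _{X},\sigma _{X}^{2})}\) and \({\displaystyle Y\sim N(\mu _{Y},\sigma _{Y}^{2})}\) are two independent normal random variables, then:
  - \({\displaystyle U=X+Y\sim N(\mu _{X}+\mu _{Y},\sigma _{X}^{2}+\sigma _{Y}^{2})}\)
  - \({\displaystyle V=X-Y\sim N(\mu _{X}-\mu _{Y},\sigma _{X}^{2}+\sigma _{Y}^{2})}\)
  - \(U\) and \(V\) are also independent of each other, provided \(X\) and \(Y\) have equal variances.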
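A short worked check of the last item, using only the bilinearity of covariance and the independence of \(X\) and \(Y\) (so that \(\operatorname {Cov} [X,Y]=0\)):

\[
\begin{aligned}
\operatorname {Cov} [U,V] &= \operatorname {Cov} [X+Y,\,X-Y] \\
&= \operatorname {Var} [X]-\operatorname {Cov} [X,Y]+\operatorname {Cov} [Y,X]-\operatorname {Var} [Y] \\
&= \sigma _{X}^{2}-\sigma _{Y}^{2}.
\end{aligned}
\]

So \(U\) and \(V\) are uncorrelated exactly when \(\sigma _{X}^{2}=\sigma _{Y}^{2}\). Because \((U,V)\) is a linear transformation of the independent normal pair \((X,Y)\), it is jointly normal, and for jointly normal variables being uncorrelated is equivalent to being independent.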
Multivariate normal distribution
Definition
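If the random vector \({\displaystyle \mathbf{X}=[X_{1},\dots ,X_{k}]^{T}}\) has a probability density function of the form
\[{\displaystyle f_{\mathbf {X} }(\mathbf {x} )={\frac {1}{\sqrt {(2\pi )^{k}|{\boldsymbol {\Sigma }}|}}}\exp \left(-{\frac {1}{2}}(\mathbf {x} -{\boldsymbol {\mu }})^{\rm {T}}{\boldsymbol {\Sigma }}^{-1}(\mathbf {x} -{\boldsymbol {\mu }})\right)}\]
then \(\mathbf{X}\) is said to follow a multivariate normal distribution, written \(\displaystyle \mathbf {X} \ \sim \ {\mathcal {N}}({\boldsymbol {\mu }},\,{\boldsymbol {\Sigma }})\), where the covariance matrix \({\boldsymbol {\Sigma }}\) is required to be positive definite and \(\mathbf{x}, {\boldsymbol {\mu }} \in \mathbb{R}^k\).
- Here \({\displaystyle {\boldsymbol {\mu }}=\operatorname {E} [\mathbf {X} ]=[\operatorname {E} [X_{1}],\operatorname {E} [X_{2}],\ldots ,\operatorname {E} [X_{k}]]^{\rm {T}}}\) is called the mean vector.
- \({\displaystyle {\boldsymbol {\Sigma }}:=\operatorname {E} [(\mathbf {X} -{\boldsymbol {\mu }})(\mathbf {X} -{\boldsymbol {\mu }})^{\rm {T}}]=[\operatorname {Cov} [X_{i},X_{j}];1\leq i,j\leq k]}\) is called the covariance matrix, with entries \({\displaystyle \Sigma _{i,j}:=\operatorname {E} [(X_{i}-\mu _{i})(X_{j}-\mu _{j})]=\operatorname {Cov} [X_{i},X_{j}]}\).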
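The multivariate density with mean vector \({\boldsymbol {\mu }}\) and positive definite covariance matrix \({\boldsymbol {\Sigma }}\) can be evaluated numerically as follows. This is a minimal standard-library sketch, assuming matrices are plain lists of lists; the helper names `cholesky`, `forward_solve` and `mvn_logpdf` are illustrative, not part of any library. It uses the factorisation \({\boldsymbol {\Sigma }}=LL^{\rm {T}}\), so that the quadratic form equals \(\|L^{-1}(\mathbf {x} -{\boldsymbol {\mu }})\|^{2}\) and \(\ln |{\boldsymbol {\Sigma }}|=2\sum _{i}\ln L_{ii}\).

```python
import math


def cholesky(S):
    """Lower-triangular L with L L^T = S, for a symmetric positive definite S."""
    k = len(S)
    L = [[0.0] * k for _ in range(k)]
    for i in range(k):
        for j in range(i + 1):
            s = S[i][j] - sum(L[i][m] * L[j][m] for m in range(j))
            if i == j:
                if s <= 0.0:
                    raise ValueError("matrix is not positive definite")
                L[i][j] = math.sqrt(s)
            else:
                L[i][j] = s / L[j][j]
    return L


def forward_solve(L, b):
    """Solve L y = b for a lower-triangular matrix L."""
    y = []
    for i, row in enumerate(L):
        y.append((b[i] - sum(row[m] * y[m] for m in range(i))) / row[i])
    return y


def mvn_logpdf(x, mu, Sigma):
    """Log-density of N(mu, Sigma) at the point x."""
    k = len(mu)
    L = cholesky(Sigma)
    y = forward_solve(L, [xi - mi for xi, mi in zip(x, mu)])
    quad = sum(v * v for v in y)  # (x - mu)^T Sigma^{-1} (x - mu)
    log_det = 2.0 * sum(math.log(L[i][i]) for i in range(k))  # ln |Sigma|
    return -0.5 * (k * math.log(2.0 * math.pi) + log_det + quad)


# Illustrative values only.
value = math.exp(mvn_logpdf([1.0, -1.0], [0.0, 0.0], [[2.0, 1.0], [1.0, 2.0]]))
print(value)
```

Working with the Cholesky factor avoids forming \({\boldsymbol {\Sigma }}^{-1}\) explicitly, and the factorisation fails with an error when \({\boldsymbol {\Sigma }}\) is not positive definite, which matches the requirement in the definition.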
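Additional definitions
Standard normal random vector
A random vector \({\displaystyle \mathbf {X} =(X_{1},\ldots ,X_{k})^{\mathrm {T} }}\) is called a standard normal random vector if its components are mutually independent and each follows the standard normal distribution, i.e. \({\displaystyle X_{n}\sim \ {\mathcal {N}}(0,1)}\) for \(n=1,\ldots ,k\).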
Centered normal random vector
A real random vector \({\displaystyle \mathbf {X} =(X_{1},\ldots ,X_{k})^{\mathrm {T} }}\) is called a centered normal random vector if there exists a deterministic \({\displaystyle k\times \ell }\) matrix \({\displaystyle {\boldsymbol {A}}}\) such that \({\displaystyle {\boldsymbol {A}}\mathbf {Z} }\) has the same distribution as \({\displaystyle \mathbf {X} }\), where \({\displaystyle \mathbf {Z} }\) is a standard normal random vector with \({\displaystyle \ell }\) components.
Normal random vector
The following theorem holds:
\[{\displaystyle \mathbf {X} \ \sim \ {\mathcal {N}}({\boldsymbol {\mu }},{\boldsymbol {\Sigma }})\quad \iff \quad {\text{there exist }}{\boldsymbol {\mu }}\in \mathbb {R} ^{k},{\boldsymbol {A}}\in \mathbb {R} ^{k\times \ell }{\text{ such that }}\mathbf {X} ={\boldsymbol {A}}\mathbf {Z} +{\boldsymbol {\mu }}{\text{ for }}Z_{n}\sim \ {\mathcal {N}}(0,1),\ n=1,\ldots ,\ell ,\ {\text{i.i.d.}}} \]
and the covariance matrix is \({\displaystyle {\boldsymbol {\Sigma }}={\boldsymbol {A}}{\boldsymbol {A}}^{\mathrm {T} }}\).
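The representation \(\mathbf {X} ={\boldsymbol {A}}\mathbf {Z} +{\boldsymbol {\mu }}\) doubles as a sampling recipe: draw \(\ell\) independent standard normal values and apply the affine map. The sketch below uses only Python's standard library; the matrix `A`, the vector `mu`, the sample size and the helper names are illustrative assumptions. It compares the empirical mean and covariance of the samples with \({\boldsymbol {\mu }}\) and \({\boldsymbol {A}}{\boldsymbol {A}}^{\mathrm {T} }\).

```python
import random


def sample_normal_vector(A, mu, rng):
    """Draw one sample of X = A Z + mu, where Z has l = len(A[0]) i.i.d. N(0, 1) components."""
    ell = len(A[0])
    z = [rng.gauss(0.0, 1.0) for _ in range(ell)]
    return [m + sum(a * zj for a, zj in zip(row, z)) for row, m in zip(A, mu)]


def a_a_transpose(A):
    """Covariance matrix Sigma = A A^T implied by the matrix A."""
    return [[sum(p * q for p, q in zip(ri, rj)) for rj in A] for ri in A]


rng = random.Random(0)  # fixed seed so that the run is repeatable
A = [[1.0, 0.0, 2.0],
     [0.5, 1.0, 0.0]]   # k = 2 rows, l = 3 columns (illustrative values)
mu = [1.0, -2.0]
Sigma = a_a_transpose(A)  # [[5.0, 0.5], [0.5, 1.25]]

samples = [sample_normal_vector(A, mu, rng) for _ in range(50_000)]
n = len(samples)

# Empirical mean vector and (unbiased) empirical covariance matrix.
mean = [sum(s[i] for s in samples) / n for i in range(2)]
cov = [[sum((s[i] - mean[i]) * (s[j] - mean[j]) for s in samples) / (n - 1)
        for j in range(2)] for i in range(2)]
print(mean, cov)
```

Note that \({\boldsymbol {A}}\) does not have to be square. When \({\boldsymbol {A}}{\boldsymbol {A}}^{\mathrm {T} }\) is singular, \(\mathbf {X}\) is still a normal random vector in this sense, but it does not have a density on \(\mathbb {R} ^{k}\).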
Properties
Property 1:
Theorem 1. Let \(X \sim \mathcal{N}(\mu, \Sigma)\) for some \(\mu \in \mathbf{R}^{n}\) and \(\Sigma \in \mathbf{S}_{++}^{n}\). Then there exists a matrix \(B \in \mathbf{R}^{n \times n}\) such that if we define \(Z=B^{-1}(X-\mu)\), then \(Z \sim \mathcal{N}(0, I)\).
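One way to see why such a \(B\) exists is sketched here, assuming \(B\) is taken to be the Cholesky factor of \(\Sigma\). This is one valid choice among many; the symmetric square root \(\Sigma ^{1/2}\) would work equally well. Since \(\Sigma\) is positive definite, \(B\) exists and is invertible, and

\[
\Sigma =BB^{T},\qquad \operatorname {E} [Z]=B^{-1}(\operatorname {E} [X]-\mu )=0,\qquad \operatorname {Cov} [Z]=B^{-1}\Sigma B^{-T}=B^{-1}BB^{T}B^{-T}=I.
\]

Because \(Z\) is an affine function of the normal vector \(X\), it is itself normal, so \(Z \sim \mathcal{N}(0, I)\).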
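Property 2: the Kullback-Leibler divergence between two \(k\)-dimensional normal distributions \({\mathcal {N}}_{0}={\mathcal {N}}({\boldsymbol {\mu }}_{0},{\boldsymbol {\Sigma }}_{0})\) and \({\mathcal {N}}_{1}={\mathcal {N}}({\boldsymbol {\mu }}_{1},{\boldsymbol {\Sigma }}_{1})\), both with positive definite covariance matrices, is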
\({\displaystyle D_{\text{KL}}({\mathcal {N}}_{0}\|{\mathcal {N}}_{1})={1 \over 2}\left\{\operatorname {tr} \left({\boldsymbol {\Sigma }}_{1}^{-1}{\boldsymbol {\Sigma }}_{0}\right)+\left({\boldsymbol {\mu }}_{1}-{\boldsymbol {\mu }}_{0}\right)^{\rm {T}}{\boldsymbol {\Sigma }}_{1}^{-1}({\boldsymbol {\mu }}_{1}-{\boldsymbol {\mu }}_{0})-k+\ln {|{\boldsymbol {\Sigma }}_{1}| \over |{\boldsymbol {\Sigma }}_{0}|}\right\}}\)
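The closed form above can be computed term by term. The following is a minimal standard-library sketch, assuming both covariance matrices are positive definite and stored as lists of lists; the names `cholesky`, `forward_solve` and `kl_mvn` are illustrative. With \({\boldsymbol {\Sigma }}_{i}=L_{i}L_{i}^{\rm {T}}\), it uses \(\operatorname {tr} ({\boldsymbol {\Sigma }}_{1}^{-1}{\boldsymbol {\Sigma }}_{0})=\|L_{1}^{-1}L_{0}\|_{F}^{2}\) and \(\ln |{\boldsymbol {\Sigma }}_{i}|=2\sum _{j}\ln (L_{i})_{jj}\).

```python
import math


def cholesky(S):
    """Lower-triangular L with L L^T = S, for a symmetric positive definite S."""
    k = len(S)
    L = [[0.0] * k for _ in range(k)]
    for i in range(k):
        for j in range(i + 1):
            s = S[i][j] - sum(L[i][m] * L[j][m] for m in range(j))
            if i == j:
                if s <= 0.0:
                    raise ValueError("matrix is not positive definite")
                L[i][j] = math.sqrt(s)
            else:
                L[i][j] = s / L[j][j]
    return L


def forward_solve(L, b):
    """Solve L y = b for a lower-triangular matrix L."""
    y = []
    for i, row in enumerate(L):
        y.append((b[i] - sum(row[m] * y[m] for m in range(i))) / row[i])
    return y


def kl_mvn(mu0, Sigma0, mu1, Sigma1):
    """D_KL( N(mu0, Sigma0) || N(mu1, Sigma1) ) for positive definite covariances."""
    k = len(mu0)
    L0, L1 = cholesky(Sigma0), cholesky(Sigma1)
    # tr(Sigma1^{-1} Sigma0) = squared Frobenius norm of L1^{-1} L0, column by column.
    trace = 0.0
    for j in range(k):
        col = forward_solve(L1, [L0[i][j] for i in range(k)])
        trace += sum(v * v for v in col)
    # (mu1 - mu0)^T Sigma1^{-1} (mu1 - mu0) = || L1^{-1} (mu1 - mu0) ||^2
    d = forward_solve(L1, [a - b for a, b in zip(mu1, mu0)])
    quad = sum(v * v for v in d)
    # ln( |Sigma1| / |Sigma0| )
    log_det_ratio = 2.0 * sum(math.log(L1[i][i]) - math.log(L0[i][i]) for i in range(k))
    return 0.5 * (trace + quad - k + log_det_ratio)


# Illustrative values only.
d01 = kl_mvn([0.0, 0.0], [[1.0, 0.0], [0.0, 1.0]], [1.0, 0.0], [[2.0, 0.5], [0.5, 1.0]])
d10 = kl_mvn([1.0, 0.0], [[2.0, 0.5], [0.5, 1.0]], [0.0, 0.0], [[1.0, 0.0], [0.0, 1.0]])
print(d01, d10)
```

The two printed values differ in general, since the divergence is not symmetric in its arguments. For \(k=1\) the expression reduces to the familiar univariate form \(\ln (\sigma _{1}/\sigma _{0})+\frac {\sigma _{0}^{2}+(\mu _{0}-\mu _{1})^{2}}{2\sigma _{1}^{2}}-\frac {1}{2}\).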