
Deriving the Marginal Means and Variances of a Class of Joint Probability Density Functions in Variational Inference

A joint probability density function of the following form arises frequently in variational inference:

\[f\left(z_{m}, \mathbf{x}\right) {\;\propto\;} {\delta}\left(z_{m} - \mathbf{a}_{m}\mathbf{x}\right) \mathcal{CN}\left( {z}_{m}; {\mu}_{z, m}, {\tau}_{z, m} \right) \prod_{n} \mathcal{CN}\left( {x}_{n}; {\mu}_{x, n}, {\tau}_{x, n}\right) \]

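Here \(\mathbf{a}_{m} = \left[a_{m,1}, a_{m,2}, {\dots}, a_{m,N}\right]\) is a row vector and \(\mathbf{x} = \left[x_{1}, x_{2}, {\dots}, x_{N}\right]^{T}\) is a column vector. For a joint density of this form, the means and variances of \(x_{n}\) and \(z_{m}\) are the key quantities to compute; this post derives both in turn.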

Preliminaries

Gaussian product lemma:

\[\mathcal{CN}\left(x; a, A\right)\mathcal{CN}\left(x; b, B\right) = \mathcal{CN}\left(0; a - b, A + B\right)\mathcal{CN}\left(x; \frac{\frac{a}{A}+\frac{b}{B}}{\frac{1}{A}+\frac{1}{B}}, \frac{1}{\frac{1}{A}+\frac{1}{B}}\right) \]

The proof is straightforward (it follows by completing the square in the exponent) and is omitted here.
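A quick numerical check of the lemma can be reassuring before it is used repeatedly. The sketch below assumes the standard density \(\mathcal{CN}\left(x; \mu, \tau\right) = \frac{1}{\pi\tau}\exp\left(-\left|x-\mu\right|^{2}/\tau\right)\), which the post uses implicitly; the helper names `cn_pdf` and `product_lemma_rhs` and all test values are hypothetical.

```python
import numpy as np

def cn_pdf(x, mean, var):
    """Density of a circularly symmetric complex Gaussian CN(x; mean, var)."""
    return np.exp(-np.abs(x - mean) ** 2 / var) / (np.pi * var)

def product_lemma_rhs(x, a, A, b, B):
    """Right-hand side of the Gaussian product lemma."""
    var = 1.0 / (1.0 / A + 1.0 / B)
    mean = var * (a / A + b / B)
    return cn_pdf(0.0, a - b, A + B) * cn_pdf(x, mean, var)

rng = np.random.default_rng(0)
x = rng.normal(size=5) + 1j * rng.normal(size=5)  # arbitrary test points
a, A = 0.4 - 0.3j, 0.8                            # hypothetical means / variances
b, B = -0.2 + 0.5j, 1.5

lhs = cn_pdf(x, a, A) * cn_pdf(x, b, B)
rhs = product_lemma_rhs(x, a, A, b, B)
max_rel_err = np.max(np.abs(lhs - rhs) / lhs)
print("max relative error:", max_rel_err)
```

The two sides should agree to floating-point precision at every test point, including the scale factor \(\mathcal{CN}\left(0; a - b, A + B\right)\).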

Mean and variance of \(x_{n}\)

The first step in computing \(\mathsf{E}\left\{x_{n} \mid f\left(z_{m}, \mathbf{x}\right)\right\}\) and \(\mathsf{Var}\left\{x_{n} \mid f\left(z_{m}, \mathbf{x}\right)\right\}\) is to obtain the marginal probability density function:

\[f\left(x_{n}\right) = \int f\left(z_{m}, \mathbf{x}\right) \mathrm{d}z_{m}\mathrm{d}\mathbf{x}_{\backslash x_{n}} \]

Substituting the joint density \(f\left(z_{m}, \mathbf{x}\right)\) into the expression above gives:

\[\begin{aligned} f\left(x_{n}\right) &= \int {\delta}\left(z_{m} - \mathbf{a}_{m}\mathbf{x}\right) \mathcal{CN}\left( {z}_{m}; {\mu}_{z, m}, {\tau}_{z, m} \right) \prod_{p} \mathcal{CN}\left( {x}_{p}; {\mu}_{x, p}, {\tau}_{x, p} \right) \mathrm{d}z_{m}\mathrm{d}\mathbf{x}_{\backslash x_{n}} \nonumber \\ &\overset{\left(a\right)}{=} \int \mathcal{CN}\left( \sum_{q}a_{m, q}x_{q}; {\mu}_{z, m}, {\tau}_{z, m} \right) \prod_{p} \mathcal{CN}\left( {x}_{p}; {\mu}_{x, p}, {\tau}_{x, p} \right) \mathrm{d}\mathbf{x}_{\backslash x_{n}} \nonumber \\ &= \mathcal{CN}\left( {x}_{n}; {\mu}_{x, n}, {\tau}_{x, n} \right) \underbrace{\int \mathcal{CN}\left( \sum_{q}a_{m, q}x_{q}; {\mu}_{z, m}, {\tau}_{z, m} \right) \prod_{p{\neq}n} \mathcal{CN}\left( {x}_{p}; {\mu}_{x, p}, {\tau}_{x, p}\right) \mathrm{d}\mathbf{x}_{\backslash x_{n}}}_{{\;\triangleq\;}g\left(x_{n}\right)} \end{aligned}\]

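where \(\left(a\right)\) uses the sifting property of the delta function to integrate out \(z_{m}\). The term \(g\left(x_{n}\right)\) can be expanded by picking any index \(k \neq n\) and integrating over \(x_{k}\) first: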

\[\begin{aligned} g\left(x_{n}\right) &= \int \mathcal{CN}\left( \sum_{q}a_{m, q}x_{q}; {\mu}_{z, m}, {\tau}_{z, m} \right) \prod_{p{\neq}n} \mathcal{CN}\left( {x}_{p}; {\mu}_{x, p}, {\tau}_{x, p}\right) \mathrm{d}x_{k} \mathrm{d}\mathbf{x}_{\backslash \left\{x_{n}, x_{k}\right\}} \nonumber \\ &= \int \mathcal{CN}\left( x_{k}; \frac{{\mu}_{z, m} - \sum\limits_{q{\neq}k}a_{m, q}x_{q}}{a_{m, k}}, \frac{{\tau}_{z, m}}{\left|a_{m, k}\right|^{2}} \right) \prod_{p{\neq}n} \mathcal{CN}\left( {x}_{p}; {\mu}_{x, p}, {\tau}_{x, p} \right) \mathrm{d}x_{k} \mathrm{d}\mathbf{x}_{\backslash \left\{x_{n}, x_{k}\right\}} \nonumber \\ &\overset{\left(a\right)}{=} \int \mathcal{CN}\left(0; \frac{{\mu}_{z, m} - \sum\limits_{q{\neq}k}a_{m, q}x_{q}}{a_{m, k}} - {\mu}_{x, k}, \frac{{\tau}_{z, m}}{\left|a_{m, k}\right|^{2}} + {\tau}_{x, k} \right) \prod_{p{\neq}\left\{n, k\right\}} \mathcal{CN}\left( {x}_{p}; {\mu}_{x, p}, {\tau}_{x, p} \right) \mathrm{d}\mathbf{x}_{\backslash \left\{x_{n}, x_{k}\right\}} \nonumber \\ &= \int \mathcal{CN}\left( \sum_{q{\neq}k}a_{m,q}x_{q}; {\mu}_{z,m}-a_{m,k}{\mu}_{x,k}, {\tau}_{z, m} + \left|a_{m, k}\right|^{2}{\tau}_{x, k} \right) \prod_{p{\neq}\left\{n, k\right\}} \mathcal{CN}\left( {x}_{p}; {\mu}_{x, p}, {\tau}_{x, p} \right) \mathrm{d}\mathbf{x}_{\backslash \left\{x_{n}, x_{k}\right\}} \end{aligned}\]

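where \(\left(a\right)\) pulls the factor that depends on \(x_{k}\) out of the product and applies the Gaussian product lemma, after which the remaining Gaussian in \(x_{k}\) integrates to one. The rescalings of the Gaussian argument in the second and last lines hold up to constant factors of \(\left|a_{m, k}\right|^{2}\), which do not depend on \(x_{n}\) and therefore do not affect the resulting mean and variance. After this step, the integrand of \(g\left(x_{n}\right)\) has the same form as before, with only the mean and variance of the leading Gaussian updated, so the step can be applied recursively. After \(N-1\) rounds, this gives: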
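A single elimination step can be checked numerically by integrating over the complex plane of \(x_{k}\) on a grid. The sketch below assumes the standard \(\mathcal{CN}\) density, holds the remaining sum \(\sum_{q \neq k} a_{m,q}x_{q}\) fixed at a hypothetical value `c`, and uses made-up parameters; it is only a sanity check, not part of the derivation.

```python
import numpy as np

def cn_pdf(x, mean, var):
    """Density of a circularly symmetric complex Gaussian CN(x; mean, var)."""
    return np.exp(-np.abs(x - mean) ** 2 / var) / (np.pi * var)

# Hypothetical values for one elimination step
a_k = 1.2 - 0.5j                  # a_{m,k}
mu_k, tau_k = 0.3 - 0.2j, 0.7     # prior mean / variance of x_k
mu_z, tau_z = 0.5 + 1.0j, 0.9     # mean / variance of the leading Gaussian
c = -0.4 + 0.6j                   # sum over q != k of a_{m,q} x_q, held fixed

# Riemann sum over the real and imaginary parts of x_k, centred on mu_k
h = 0.02
u = np.arange(-6.0, 6.0 + h / 2, h)
re, im = np.meshgrid(u, u, indexing="ij")
x_k = mu_k + re + 1j * im

integrand = cn_pdf(a_k * x_k + c, mu_z, tau_z) * cn_pdf(x_k, mu_k, tau_k)
numeric = integrand.sum() * h * h

# Last line of the derivation: mean shifts by a_k*mu_k, variance grows by |a_k|^2*tau_k
closed_form = cn_pdf(c, mu_z - a_k * mu_k, tau_z + abs(a_k) ** 2 * tau_k)
print(numeric, closed_form)
```

The grid integral and the closed form should agree closely, which illustrates that the first and last lines of the step are equal and that the constant factors introduced by the intermediate rescalings cancel.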

\[\begin{aligned} g\left(x_{n}\right) &= \mathcal{CN}\left(a_{m, n}x_{n}; {\mu}_{z, m} - \sum_{n^{\prime}{\neq}n}a_{m,n^{\prime}}{\mu}_{x, n^{\prime}}, {\tau}_{z, m} + \sum_{n^{\prime}{\neq}n}\left|a_{m,n^{\prime}}\right|^{2}{\tau}_{x, n^{\prime}}\right) \nonumber \\ &= \mathcal{CN}\left(x_{n}; \frac{{\mu}_{z, m} - \sum\limits_{n^{\prime}{\neq}n}a_{m,n^{\prime}}{\mu}_{x, n^{\prime}}}{a_{m, n}}, \frac{{\tau}_{z, m} + \sum\limits_{n^{\prime}{\neq}n}\left|a_{m,n^{\prime}}\right|^{2}{\tau}_{x, n^{\prime}}}{\left|a_{m, n}\right|^{2}}\right) \end{aligned}\]

Substituting this back into the expression for \(f\left(x_{n}\right)\) gives:

\[\begin{aligned} f\left(x_{n}\right) &= \mathcal{CN}\left( {x}_{n}; {\mu}_{x, n}, {\tau}_{x, n}\right) \mathcal{CN}\left(x_{n}; \frac{{\mu}_{z, m} - \sum\limits_{n^{\prime}{\neq}n}a_{m,n^{\prime}}{\mu}_{x, n^{\prime}}}{a_{m, n}}, \frac{{\tau}_{z, m} + \sum\limits_{n^{\prime}{\neq}n}\left|a_{m,n^{\prime}}\right|^{2}{\tau}_{x, n^{\prime}}}{\left|a_{m, n}\right|^{2}}\right) \nonumber \\ &{\;\propto\;} \mathcal{CN}\left(x_{n}; \frac{\frac{{\mu}_{x, n}}{{\tau}_{x, n}} + a_{m, n}^{*}\frac{{\mu}_{z, m} - \sum\limits_{n^{\prime}{\neq}n}a_{m,n^{\prime}}{\mu}_{x, n^{\prime}}}{{\tau}_{z, m} + \sum\limits_{n^{\prime}{\neq}n}\left|a_{m,n^{\prime}}\right|^{2}{\tau}_{x, n^{\prime}}}}{\frac{1}{\tau_{x, n}} + \frac{\left|a_{m, n}\right|^{2}}{{\tau}_{z, m} + \sum\limits_{n^{\prime}{\neq}n}\left|a_{m,n^{\prime}}\right|^{2}{\tau}_{x, n^{\prime}}}} , \frac{1}{\frac{1}{\tau_{x, n}} + \frac{\left|a_{m, n}\right|^{2}}{{\tau}_{z, m} + \sum\limits_{n^{\prime}{\neq}n}\left|a_{m,n^{\prime}}\right|^{2}{\tau}_{x, n^{\prime}}}}\right) \end{aligned}\]

From this, it follows directly that:

\[\mathsf{E}\left\{x_{n} \mid f\left(z_{m}, \mathbf{x}\right)\right\} = \frac{\frac{{\mu}_{x, n}}{{\tau}_{x, n}} + a_{m, n}^{*}\frac{{\mu}_{z, m} - \sum\limits_{n^{\prime}{\neq}n}a_{m,n^{\prime}}{\mu}_{x, n^{\prime}}}{{\tau}_{z, m} + \sum\limits_{n^{\prime}{\neq}n}\left|a_{m,n^{\prime}}\right|^{2}{\tau}_{x, n^{\prime}}}}{\frac{1}{\tau_{x, n}} + \frac{\left|a_{m, n}\right|^{2}}{{\tau}_{z, m} + \sum\limits_{n^{\prime}{\neq}n}\left|a_{m,n^{\prime}}\right|^{2}{\tau}_{x, n^{\prime}}}} \]

\[\mathsf{Var}\left\{x_{n} \mid f\left(z_{m}, \mathbf{x}\right)\right\} = \frac{1}{\frac{1}{\tau_{x, n}} + \frac{\left|a_{m, n}\right|^{2}}{{\tau}_{z, m} + \sum\limits_{n^{\prime}{\neq}n}\left|a_{m,n^{\prime}}\right|^{2}{\tau}_{x, n^{\prime}}}} \]
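These two expressions can be cross-checked against a direct computation. Once \(z_{m}\) is integrated out, the density over \(\mathbf{x}\) is proportional to \(\mathcal{CN}\left(\mathbf{a}_{m}\mathbf{x}; {\mu}_{z,m}, {\tau}_{z,m}\right)\prod_{n}\mathcal{CN}\left(x_{n}; {\mu}_{x,n}, {\tau}_{x,n}\right)\), which is jointly Gaussian in \(\mathbf{x}\), so its marginals can be read off from the precision matrix. The following is a minimal sketch under that observation; the function name `x_marginal` and the random test values are hypothetical.

```python
import numpy as np

def x_marginal(n, a, mu_x, tau_x, mu_z, tau_z):
    """Mean and variance of x_n from the closed-form expressions above."""
    others = np.arange(a.size) != n
    resid = mu_z - np.sum(a[others] * mu_x[others])
    spread = tau_z + np.sum(np.abs(a[others]) ** 2 * tau_x[others])
    precision = 1.0 / tau_x[n] + np.abs(a[n]) ** 2 / spread
    mean = (mu_x[n] / tau_x[n] + np.conj(a[n]) * resid / spread) / precision
    return mean, 1.0 / precision

rng = np.random.default_rng(1)
N = 4
a = rng.normal(size=N) + 1j * rng.normal(size=N)      # row vector a_m
mu_x = rng.normal(size=N) + 1j * rng.normal(size=N)
tau_x = rng.uniform(0.5, 2.0, size=N)
mu_z, tau_z = 0.7 - 0.4j, 1.3

# Reference: precision matrix and mean of the joint Gaussian over x
Lambda = np.diag(1.0 / tau_x) + np.outer(a.conj(), a) / tau_z
Sigma = np.linalg.inv(Lambda)
m = Sigma @ (mu_x / tau_x + a.conj() * mu_z / tau_z)

for n in range(N):
    mean_n, var_n = x_marginal(n, a, mu_x, tau_x, mu_z, tau_z)
    print(n, abs(mean_n - m[n]), abs(var_n - Sigma[n, n].real))
```

Both printed differences should be at the level of floating-point round-off for every \(n\). The scalar formulas avoid the \(N \times N\) inverse that the reference computation needs.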

Mean and variance of \(z_{m}\)

Similarly, the first step in computing \(\mathsf{E}\left\{z_{m} \mid f\left(z_{m}, \mathbf{x}\right)\right\}\) and \(\mathsf{Var}\left\{z_{m} \mid f\left(z_{m}, \mathbf{x}\right)\right\}\) is to obtain the marginal probability density function:

\[f\left(z_{m}\right) = \int f\left(z_{m}, \mathbf{x}\right) \mathrm{d}\mathbf{x} \]

Substituting the joint density \(f\left(z_{m}, \mathbf{x}\right)\) into the expression above gives:

\[\begin{aligned} f\left(z_{m}\right) &= \int {\delta}\left(z_{m} - \mathbf{a}_{m}\mathbf{x}\right) \mathcal{CN}\left( {z}_{m}; {\mu}_{z, m}, {\tau}_{z, m} \right) \prod_{p} \mathcal{CN}\left( {x}_{p}; {\mu}_{x, p}, {\tau}_{x, p} \right) \mathrm{d}\mathbf{x} \nonumber \\ &= \mathcal{CN}\left( {z}_{m}; {\mu}_{z, m}, {\tau}_{z, m} \right) \underbrace{\int {\delta}\left(z_{m} - \mathbf{a}_{m}\mathbf{x}\right) \prod_{p} \mathcal{CN}\left( {x}_{p}; {\mu}_{x, p}, {\tau}_{x, p}\right) \mathrm{d}\mathbf{x}}_{{\;\triangleq\;}g\left(z_{m}\right)} \end{aligned}\]

The term \(g\left(z_{m}\right)\) in the expression above can be expanded as:

\[\begin{aligned} g&\left(z_{m}\right) = \nonumber \\ &\int {\delta}\left(a_{m, k}x_{k} - \left(z_{m} - \sum_{n{\neq}k}a_{m, n}x_{n}\right)\right) \mathcal{CN}\left(a_{m, k}x_{k}; a_{m, k}{\mu}_{x, k}, \left|a_{m, k}\right|^{2}{\tau}_{x, k}\right) \prod_{p{\neq}k} \mathcal{CN}\left( {x}_{p}; {\mu}_{x, p}, {\tau}_{x, p} \right) \mathrm{d}\mathbf{x} \nonumber \\ &\overset{\left(a\right)}{=} \int \mathcal{CN}\left(z_{m} - \sum_{n{\neq}k}a_{m, n}x_{n}; a_{m, k}{\mu}_{x, k}, \left|a_{m, k}\right|^{2}{\tau}_{x, k} \right) \prod_{p{\neq}k} \mathcal{CN}\left( {x}_{p}; {\mu}_{x, p}, {\tau}_{x, p} \right) \mathrm{d}\mathbf{x}_{\backslash x_{k}} \nonumber \\ &= \int \mathcal{CN}\left(\sum_{n{\neq}k}a_{m,n}x_{n}; z_{m}-a_{m,k}{\mu}_{x,k}, \left|a_{m,k}\right|^{2}{\tau}_{x,k}\right) \prod_{p{\neq}k} \mathcal{CN}\left( {x}_{p}; {\mu}_{x, p}, {\tau}_{x, p}\right) \mathrm{d}\mathbf{x}_{\backslash x_{k}} \end{aligned}\]

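where \(\left(a\right)\) uses the sifting property of the delta function to integrate out \(x_{k}\). Proceeding as in the derivation of \(g\left(x_{n}\right)\), a further \(N-1\) rounds of processing give: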

\[\begin{aligned} g\left(z_{m}\right) &= \mathcal{CN}\left(0; z_{m} - \sum_{n}a_{m,n}{\mu}_{x,n}, \sum_{n}\left|a_{m,n}\right|^{2}{\tau}_{x,n}\right) \nonumber \\ &= \mathcal{CN}\left(z_{m}; \sum_{n}a_{m,n}{\mu}_{x,n}, \sum_{n}\left|a_{m,n}\right|^{2}{\tau}_{x,n}\right) \end{aligned}\]
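Read this way, \(g\left(z_{m}\right)\) is the density of \(\mathbf{a}_{m}\mathbf{x}\) when the entries of \(\mathbf{x}\) are drawn independently from \(\mathcal{CN}\left(x_{n}; {\mu}_{x,n}, {\tau}_{x,n}\right)\). A rough Monte Carlo sketch of that interpretation follows; the sample size, seed, and parameter values are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(2)
N, S = 3, 200_000                                    # hypothetical sizes
a = np.array([0.8 - 0.3j, -0.5 + 0.9j, 1.1 + 0.2j])  # row vector a_m
mu_x = np.array([0.2 + 0.1j, -0.4 + 0.3j, 0.6 - 0.5j])
tau_x = np.array([0.5, 1.0, 1.5])

# Draw x_n ~ CN(mu_x[n], tau_x[n]): variance tau/2 per real and imaginary part
noise = rng.normal(size=(S, N)) + 1j * rng.normal(size=(S, N))
x = mu_x + np.sqrt(tau_x / 2.0) * noise
z = x @ a                                            # z = a_m x for each sample

pred_mean = np.sum(a * mu_x)                         # sum_n a_{m,n} mu_{x,n}
pred_var = np.sum(np.abs(a) ** 2 * tau_x)            # sum_n |a_{m,n}|^2 tau_{x,n}
emp_mean = z.mean()
emp_var = np.mean(np.abs(z - emp_mean) ** 2)
print(pred_mean, emp_mean)
print(pred_var, emp_var)
```

The empirical mean and variance should match the predicted values up to Monte Carlo error, which shrinks as `S` grows.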

Substituting this back into the expression for \(f\left(z_{m}\right)\) gives:

\[\begin{aligned} f\left(z_{m}\right) &= \mathcal{CN}\left( {z}_{m}; {\mu}_{z, m}, {\tau}_{z, m} \right) \mathcal{CN}\left(z_{m}; \sum_{n}a_{m,n}{\mu}_{x,n}, \sum_{n}\left|a_{m,n}\right|^{2}{\tau}_{x,n}\right) \nonumber \\ &{\;\propto\;} \mathcal{CN}\left( z_{m}; \frac{\frac{{\mu}_{z,m}}{{\tau}_{z,m}} + \frac{\sum_{n}a_{m,n}{\mu}_{x,n}}{\sum_{n}\left|a_{m,n}\right|^{2}{\tau}_{x,n}}}{\frac{1}{{\tau}_{z, m}} + \frac{1}{\sum_{n}\left|a_{m,n}\right|^{2}{\tau}_{x,n}}}, \frac{1}{\frac{1}{{\tau}_{z, m}} + \frac{1}{\sum_{n}\left|a_{m,n}\right|^{2}{\tau}_{x,n}}} \right) \end{aligned}\]

From this, it follows directly that:

\[\mathsf{E}\left\{z_{m} \mid f\left(z_{m}, \mathbf{x}\right)\right\} = \frac{\frac{{\mu}_{z,m}}{{\tau}_{z,m}} + \frac{\sum_{n}a_{m,n}{\mu}_{x,n}}{\sum_{n}\left|a_{m,n}\right|^{2}{\tau}_{x,n}}}{\frac{1}{{\tau}_{z, m}} + \frac{1}{\sum_{n}\left|a_{m,n}\right|^{2}{\tau}_{x,n}}} \]

\[\mathsf{Var}\left\{z_{m} \mid f\left(z_{m}, \mathbf{x}\right) \right\} = \frac{1}{\frac{1}{{\tau}_{z, m}} + \frac{1}{\sum_{n}\left|a_{m,n}\right|^{2}{\tau}_{x,n}}} \]
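As a cross-check, the delta function ties \(z_{m}\) to \(\mathbf{a}_{m}\mathbf{x}\), so the mean and variance of \(z_{m}\) should equal \(\mathbf{a}_{m}\mathbf{m}\) and \(\mathbf{a}_{m}\boldsymbol{\Sigma}\mathbf{a}_{m}^{H}\), where \(\mathbf{m}\) and \(\boldsymbol{\Sigma}\) are the mean and covariance of the joint Gaussian over \(\mathbf{x}\). The sketch below compares the two routes; `z_marginal` and the test values are hypothetical names and numbers chosen for illustration.

```python
import numpy as np

def z_marginal(a, mu_x, tau_x, mu_z, tau_z):
    """Mean and variance of z_m from the closed-form expressions above."""
    p_mean = np.sum(a * mu_x)                  # sum_n a_{m,n} mu_{x,n}
    p_var = np.sum(np.abs(a) ** 2 * tau_x)     # sum_n |a_{m,n}|^2 tau_{x,n}
    var = 1.0 / (1.0 / tau_z + 1.0 / p_var)
    mean = var * (mu_z / tau_z + p_mean / p_var)
    return mean, var

rng = np.random.default_rng(3)
N = 5
a = rng.normal(size=N) + 1j * rng.normal(size=N)
mu_x = rng.normal(size=N) + 1j * rng.normal(size=N)
tau_x = rng.uniform(0.5, 2.0, size=N)
mu_z, tau_z = -0.3 + 0.8j, 0.6

# Reference: joint Gaussian over x, then push through z = a x
Lambda = np.diag(1.0 / tau_x) + np.outer(a.conj(), a) / tau_z
Sigma = np.linalg.inv(Lambda)
m = Sigma @ (mu_x / tau_x + a.conj() * mu_z / tau_z)
ref_mean = a @ m
ref_var = (a @ Sigma @ a.conj()).real

z_mean, z_var = z_marginal(a, mu_x, tau_x, mu_z, tau_z)
print(abs(z_mean - ref_mean), abs(z_var - ref_var))
```

Both differences should be at round-off level. The resulting variance is also smaller than both \({\tau}_{z,m}\) and \(\sum_{n}\left|a_{m,n}\right|^{2}{\tau}_{x,n}\), as expected when two Gaussian factors are multiplied.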

posted @ 2023-03-23 15:10  Infinity-SEU