Algebra Basics

Matrix Differentials

Let \(X \in \mathbb{R}^{m \times n}\) and \(Y \in \mathbb{R}^{n \times p}\).
Definition: the differential of \(X\) is the matrix of entrywise differentials,

\[\mathrm{d}X = \left [ \begin{array}{ccc} \mathrm{d}X_{11} & \ldots & \mathrm{d}X_{1n} \\ \vdots & \ddots & \vdots \\ \mathrm{d}X_{m1} & \cdots & \mathrm{d}X_{mn} \end{array} \right ] \]

\(\mathrm{d}(X+Y) = \mathrm{d}X + \mathrm{d}Y\)

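Here \(X\) and \(Y\) are assumed to have the same shape, so that \(X+Y\) is defined. The rule follows entrywise from \(\mathrm{d}(X_{ij}+Y_{ij}) = \mathrm{d}X_{ij} + \mathrm{d}Y_{ij}\).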

\(\mathrm{d}(XY) = (\mathrm{d}X)Y + X\mathrm{d}Y\)

Writing \(X_i^T\) for the \(i\)-th row of \(X\) and \(Y^j\) for the \(j\)-th column of \(Y\):

\[\mathrm{d}[XY]_{ij} = \mathrm{d}(X_i^TY^j) = \sum \limits_{k=1}^n \mathrm{d}(X_{ik}Y_{kj})=\sum \limits_{k=1}^n [\mathrm{d}X_{ik} Y_{kj} + X_{ik}\mathrm{d}Y_{kj}] = [(\mathrm{d}X)Y + X\mathrm{d}Y]_{ij} \]
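The product rule can be sanity-checked numerically. The following is a minimal NumPy sketch, not part of the original derivation: the shapes, the seed, and the step size `eps` are arbitrary choices, and the small matrices `dX`, `dY` stand in for the differentials.

```python
import numpy as np

rng = np.random.default_rng(0)
m, n, p = 3, 4, 2
X = rng.standard_normal((m, n))
Y = rng.standard_normal((n, p))

# Small perturbations playing the role of dX and dY (eps is an arbitrary choice).
eps = 1e-6
dX = eps * rng.standard_normal((m, n))
dY = eps * rng.standard_normal((n, p))

# Exact change of the product versus the first-order prediction (dX)Y + X dY.
actual = (X + dX) @ (Y + dY) - X @ Y
predicted = dX @ Y + X @ dY

# The difference equals dX @ dY up to rounding, i.e. it is second order in eps.
gap = np.max(np.abs(actual - predicted))
print(gap)
```

The first-order terms are of size `eps`, while the remaining gap is of size `eps**2`, which is what the differential identity asserts.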

\(\mathrm{d}X^T = (\mathrm{d}X)^T\)

\[[\mathrm{d}X^T]_{ij} = \mathrm{d}X_{ji} = [(\mathrm{d}X)^T]_{ij} \]

\(\mathrm{dTr}(X) = \mathrm{Tr}(\mathrm{d}X)\)

Here we assume \(X \in \mathbb{R}^{n \times n}\).

\[\mathrm{dTr}(X) = \sum \limits_{i=1}^n \mathrm{d}X_{ii} = \mathrm{Tr}(\mathrm{d}X) \]

\(\mathrm{d}X^{-1} = -X^{-1}\mathrm{d}X X^{-1}\)

Here we assume \(X \in \mathbb{R}^{n \times n}\) is invertible.

\[XX^{-1} = I \]

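Differentiating both sides with the product rule gives:

\[(\mathrm{d}X)X^{-1}+X\mathrm{d}X^{-1} = 0. \]

Left-multiplying by \(X^{-1}\) and rearranging, we obtain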

\[\mathrm{d}X^{-1} = -X^{-1}\mathrm{d}X X^{-1} \]
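As a numerical illustration of this identity, the sketch below compares the actual change in the inverse with the first-order prediction. The matrices are hypothetical examples chosen for this check (`X` is strictly diagonally dominant, hence invertible), and the perturbation size is an arbitrary choice.

```python
import numpy as np

# A strictly diagonally dominant matrix, so it is guaranteed to be invertible.
X = np.array([[4.0, 1.0, 0.0],
              [2.0, 5.0, 1.0],
              [0.0, 1.0, 3.0]])

# A small perturbation standing in for dX.
dX = 1e-6 * np.array([[1.0, -2.0, 3.0],
                      [0.0, 1.0, -1.0],
                      [2.0, 0.0, 1.0]])

X_inv = np.linalg.inv(X)

# Actual change in the inverse versus the prediction -X^{-1} dX X^{-1}.
actual = np.linalg.inv(X + dX) - X_inv
predicted = -X_inv @ dX @ X_inv

err = np.max(np.abs(actual - predicted))
print(err)
```

The residual `err` is second order in the size of `dX`, so it should be many orders of magnitude smaller than the entries of `predicted`.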

\(\mathrm{d}|X| = \mathrm{Tr}(X^*\mathrm{d}X)\)

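Here we assume \(X \in \mathbb{R}^{n \times n}\), where \(|\cdot|\) denotes the determinant and \(X^*\) is the adjugate matrix of \(X\); when \(X\) is invertible, \(X^{-1}|X| = X^*\).
Let \(x_{ij}\) denote the cofactor of \(X_{ij}\). For any entry \(X_{ij}\), expanding the determinant along column \(j\) gives: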

\[|X| = \sum \limits_{k=1}^n X_{kj} x_{kj} \]

The cofactors \(x_{kj}\) do not involve any entry of column \(j\), so only the term with \(k=i\) depends on \(X_{ij}\).
Hence

\[\frac{\partial |X|}{\partial X_{ij}} = x_{ij} \]

\[[X^{*}]_{ij} = x_{ji} \]

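Since \([X^{*}]_{ji} = x_{ij}\), it follows that \(\mathrm{Tr}(X^*\mathrm{d}X) =\sum \limits_{i,j=1}^n [X^*]_{ji} \mathrm{d}X_{ij} = \sum \limits_{i,j=1}^n x_{ij} \mathrm{d}X_{ij} = \mathrm{d}|X|\), which completes the proof.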
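The determinant formula can be checked with a short NumPy sketch. The helper `adjugate` below is a hypothetical illustration that builds \(X^*\) directly from cofactors, mirroring the definition used in the proof; it is not meant to be efficient, and the test matrices are arbitrary.

```python
import numpy as np

def adjugate(X):
    """Adjugate of a square matrix: entry (i, j) is the cofactor x_{ji}."""
    n = X.shape[0]
    cof = np.empty_like(X, dtype=float)
    for i in range(n):
        for j in range(n):
            # Delete row i and column j, then take the signed minor.
            minor = np.delete(np.delete(X, i, axis=0), j, axis=1)
            cof[i, j] = (-1) ** (i + j) * np.linalg.det(minor)
    return cof.T  # the adjugate is the transpose of the cofactor matrix

X = np.array([[2.0, 1.0, 0.0],
              [1.0, 3.0, 1.0],
              [0.0, 1.0, 4.0]])
dX = 1e-6 * np.array([[1.0, -2.0, 0.5],
                      [0.0, 1.0, -1.0],
                      [2.0, 0.0, 1.0]])

# Actual change in the determinant versus the prediction Tr(X^* dX).
actual = np.linalg.det(X + dX) - np.linalg.det(X)
predicted = np.trace(adjugate(X) @ dX)
print(actual, predicted)
```

Because the adjugate is computed from cofactors rather than from \(X^{-1}\), the same check also applies when \(X\) is singular.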

\((A+B)^{-1} = A^{-1}(A^{-1}+B^{-1})^{-1}B^{-1}\)

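where \(A\), \(B\) and \(A+B\) are invertible (equivalently, \(A^{-1}+B^{-1}\) is invertible, since \(A^{-1}+B^{-1} = B^{-1}(A+B)A^{-1}\)).
Proof: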

\[(A+B)A^{-1}(A^{-1}+B^{-1})^{-1}B^{-1} = (I+BA^{-1})(BA^{-1}+I)^{-1}=I \]

This completes the proof.
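A quick numerical check of this identity, as a sketch with hypothetical \(2 \times 2\) matrices chosen so that \(A\), \(B\) and \(A+B\) are all invertible:

```python
import numpy as np

A = np.array([[3.0, 1.0],
              [0.0, 2.0]])   # det = 6
B = np.array([[1.0, 0.0],
              [1.0, 4.0]])   # det = 4; A + B has det = 23

A_inv = np.linalg.inv(A)
B_inv = np.linalg.inv(B)

# Left-hand side: invert the sum directly.
lhs = np.linalg.inv(A + B)
# Right-hand side: A^{-1} (A^{-1} + B^{-1})^{-1} B^{-1}.
rhs = A_inv @ np.linalg.inv(A_inv + B_inv) @ B_inv

max_diff = np.max(np.abs(lhs - rhs))
print(max_diff)
```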

\((A+UCV)^{-1}=A^{-1}-A^{-1}U(C^{-1}+VA^{-1}U)^{-1}VA^{-1}\)

Let \(A \in \mathbb{R}^{n \times n}\) be nonsingular, \(U \in \mathbb{R}^{n \times m}, V \in \mathbb{R}^{m \times n}\), and let \(C \in \mathbb{R}^{m \times m }\) be nonsingular. Then \(A+UCV\) is invertible if and only if \(C^{-1}+VA^{-1}U\) is invertible, in which case

\[(A+UCV)^{-1}=A^{-1}-A^{-1}U(C^{-1}+VA^{-1}U)^{-1}VA^{-1}. \]

Proof:
\(\Leftarrow\)
Suppose \(C^{-1}+VA^{-1}U\) is invertible, and let \(X:=A+UCV\), \(Y:=A^{-1}-A^{-1}U(C^{-1}+VA^{-1}U)^{-1}VA^{-1}.\) Then

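\[\begin{array}{ll} XY&= I+UCVA^{-1}-(U+UCVA^{-1}U)(C^{-1}+VA^{-1}U)^{-1}VA^{-1} \\ &= I+UCVA^{-1}-UC(C^{-1}+VA^{-1}U)(C^{-1}+VA^{-1}U)^{-1}VA^{-1} \\ &= I+UCVA^{-1}-UCVA^{-1} \\ &= I. \end{array} \]

\[\begin{array}{ll} YX&= I+A^{-1}UCV-A^{-1}U(C^{-1}+VA^{-1}U)^{-1}(V+VA^{-1}UCV) \\ &= I+A^{-1}UCV-A^{-1}U(C^{-1}+VA^{-1}U)^{-1}(C^{-1}+VA^{-1}U)CV \\ &= I+A^{-1}UCV-A^{-1}UCV \\ &= I. \end{array} \]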

\(\Rightarrow\)

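We argue by contraposition. Suppose \(C^{-1}+VA^{-1}U\) is not invertible. Then there exists a nonzero vector \(x\) (an eigenvector for the eigenvalue \(0\)) such that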

\[(C^{-1}+VA^{-1}U)x=0, \]

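\[(A+UCV)(A^{-1}Ux)=Ux+UC(VA^{-1}Ux)=Ux+UC(-C^{-1}x)=Ux-Ux=0. \]

Moreover \(A^{-1}Ux \neq 0\): otherwise \(Ux=0\), and then \(C^{-1}x=-VA^{-1}Ux=0\) would force \(x=0\). Hence \(A+UCV\) has a nontrivial kernel and is not invertible.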
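The identity is useful in practice when \(A^{-1}\) is already known and \(m \ll n\), since only an \(m \times m\) system has to be solved. The sketch below illustrates this with NumPy; the function name `woodbury_inverse` and the example matrices are hypothetical, chosen so that \(C^{-1}+VA^{-1}U\) is invertible.

```python
import numpy as np

def woodbury_inverse(A_inv, U, C_inv, V):
    """Return (A + U C V)^{-1} from A^{-1} and C^{-1} via the identity above."""
    small = C_inv + V @ A_inv @ U            # the m x m matrix C^{-1} + V A^{-1} U
    # Solve the small system instead of forming its inverse explicitly.
    return A_inv - A_inv @ U @ np.linalg.solve(small, V @ A_inv)

# Example with n = 4 and m = 2.
A = np.diag([2.0, 3.0, 4.0, 5.0])
U = np.array([[1.0, 0.0],
              [0.0, 1.0],
              [1.0, 1.0],
              [0.0, 0.0]])
C = np.diag([1.0, 2.0])
V = np.array([[1.0, 0.0, 1.0, 0.0],
              [0.0, 2.0, 1.0, 0.0]])

via_woodbury = woodbury_inverse(np.linalg.inv(A), U, np.linalg.inv(C), V)
direct = np.linalg.inv(A + U @ C @ V)
max_diff = np.max(np.abs(via_woodbury - direct))
print(max_diff)
```

Using `np.linalg.solve` on the small matrix rather than inverting it is a design choice of this sketch; it computes the same quantity as the formula.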

Special case

Take \(m=1\), \(C=1, U=u \in \mathbb{R}^n, V^T=v \in \mathbb{R}^n\). If \(1+v^TA^{-1}u \neq 0\), then

\[(A+uv^T)^{-1}=A^{-1}-\frac{A^{-1}uv^TA^{-1}}{1+v^TA^{-1}u}. \]
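The rank-one formula lets one update a known inverse without refactoring the whole matrix. Below is a minimal NumPy sketch; the function name `sherman_morrison`, the tolerance `tol`, and the example data are assumptions made for illustration.

```python
import numpy as np

def sherman_morrison(A_inv, u, v, tol=1e-12):
    """Return (A + u v^T)^{-1} from A^{-1} using the rank-one formula above."""
    Au = A_inv @ u            # A^{-1} u
    vA = v @ A_inv            # v^T A^{-1}
    denom = 1.0 + v @ Au      # 1 + v^T A^{-1} u
    if abs(denom) < tol:      # tol is an arbitrary cutoff chosen for this sketch
        raise ValueError("1 + v^T A^{-1} u is numerically zero; A + u v^T is singular")
    return A_inv - np.outer(Au, vA) / denom

A = np.array([[2.0, 0.0, 1.0],
              [0.0, 3.0, 0.0],
              [1.0, 0.0, 4.0]])
u = np.array([1.0, 2.0, 0.0])
v = np.array([0.0, 1.0, 1.0])

updated = sherman_morrison(np.linalg.inv(A), u, v)
direct = np.linalg.inv(A + np.outer(u, v))
max_err = np.max(np.abs(updated - direct))
print(max_err)
```

For this example \(1+v^TA^{-1}u = 32/21\), so the update is well defined.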

posted @ 2019-05-08 22:00  馒头and花卷