Derivative of a scalar with respect to a matrix
Let the matrix $X$ be
$$X = \begin{bmatrix}
x_{11} & x_{12} & \cdots & x_{1n} \\
x_{21} & x_{22} & \cdots & x_{2n} \\
\vdots & \vdots & \ddots & \vdots \\
x_{m1} & x_{m2} & \cdots & x_{mn}
\end{bmatrix}$$
The derivative of a scalar $y$ with respect to a matrix $X_{m \times n}$ is again an $m \times n$ matrix:
$$\frac{dy}{dX} = \begin{bmatrix}
\frac{\partial y}{\partial x_{11}} & \frac{\partial y}{\partial x_{12}} & \cdots & \frac{\partial y}{\partial x_{1n}} \\
\frac{\partial y}{\partial x_{21}} & \frac{\partial y}{\partial x_{22}} & \cdots & \frac{\partial y}{\partial x_{2n}} \\
\vdots & \vdots & \ddots & \vdots \\
\frac{\partial y}{\partial x_{m1}} & \frac{\partial y}{\partial x_{m2}} & \cdots & \frac{\partial y}{\partial x_{mn}}
\end{bmatrix}$$
Shape rule: differentiate the scalar $y$ with respect to each element of $X$, then arrange the results in the same shape as $X$.
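The shape rule can be checked numerically. The following is a minimal sketch, not a definitive implementation: the helper `numerical_gradient` and the test function (the sum of squared entries, whose derivative is $2X$) are illustrative assumptions that do not appear in the original text. It perturbs one entry of $X$ at a time and stores each partial derivative at the matching position.

```python
import numpy as np

def numerical_gradient(f, X, eps=1e-6):
    """Central-difference estimate of dy/dX, one entry of X at a time."""
    grad = np.zeros_like(X, dtype=float)
    for i in range(X.shape[0]):
        for j in range(X.shape[1]):
            # Perturb only the (i, j) entry of X.
            step = np.zeros_like(X, dtype=float)
            step[i, j] = eps
            # The partial derivative dy/dx_ij goes into position (i, j).
            grad[i, j] = (f(X + step) - f(X - step)) / (2 * eps)
    return grad

# Hypothetical example: y = sum of squared entries, so dy/dX = 2X.
X = np.arange(6, dtype=float).reshape(2, 3)
G = numerical_gradient(lambda A: np.sum(A ** 2), X)

print(G.shape == X.shape)   # the result has the same shape as X
print(np.allclose(G, 2 * X, atol=1e-4))
```

Both checks should come out true: the result has the same $2 \times 3$ shape as $X$, and its entries agree with $2X$ up to floating-point error.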
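Application
1. Let $f(X) = u^{T}Xv$, where $u \in \mathbb{R}^{m}$ and $v \in \mathbb{R}^{n}$ are constant vectors; find $\frac{df}{dX}$.
Expanding $f(X)$ gives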
$$f(X)= \sum_{i=1}^{m}\sum_{j=1}^{n}x_{ij}u_{i}v_{j}$$
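As a sketch of the intermediate step, assuming $u$ has $m$ entries and $v$ has $n$ entries (dimensions the text leaves implicit, chosen here to match $X_{m \times n}$), the double sum comes from writing out the matrix-vector product $Xv$ first:
$$u^{T}Xv = \sum_{i=1}^{m}u_{i}\,(Xv)_{i} = \sum_{i=1}^{m}u_{i}\sum_{j=1}^{n}x_{ij}v_{j}$$
Each entry $x_{ij}$ appears in exactly one term, with coefficient $u_{i}v_{j}$.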
Therefore, for each entry $x_{ij}$,
$$\frac{\partial f}{\partial x_{ij}} = u_{i}v_{j}$$
Arranging these entries in the shape of $X$ gives the $m \times n$ outer product
$$\frac{df}{dX} = u v^{T}$$
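The result can be sanity-checked against finite differences. This is a minimal sketch under stated assumptions: the sizes `m = 3`, `n = 4`, the random seed, and the step `eps` are arbitrary illustrative choices, not values from the original text.

```python
import numpy as np

rng = np.random.default_rng(0)
m, n = 3, 4                      # assumed sizes, for illustration only
u = rng.standard_normal(m)
v = rng.standard_normal(n)
X = rng.standard_normal((m, n))

def f(A):
    """The scalar function f(X) = u^T X v."""
    return u @ A @ v

# Closed-form derivative from the derivation: the outer product u v^T.
analytic = np.outer(u, v)

# Central-difference estimate, one entry of X at a time.
eps = 1e-6
numeric = np.zeros((m, n))
for i in range(m):
    for j in range(n):
        E = np.zeros((m, n))
        E[i, j] = eps
        numeric[i, j] = (f(X + E) - f(X - E)) / (2 * eps)

print(analytic.shape)            # same shape as X
print(np.allclose(numeric, analytic, atol=1e-6))
```

Because $f$ is linear in $X$, the central difference is exact up to rounding error, so the two matrices should agree closely. Note that the derivative does not depend on $X$ itself.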