Matrix Derivatives

Posted on 2021-07-03 15:37 by foghorn

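0 The essence of matrix differentiation

Differentiating a matrix \(A\) with respect to a matrix \(B\) means differentiating every element of \(A\) with respect to every element of \(B\).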

Dependent variable | Independent variable | Number of derivatives
\(A_{1\times 1}\) | \(B_{1\times 1}\) | \(1\)
\(A_{m\times 1}\) | \(B_{1\times 1}\) | \(m\)
\(A_{m\times 1}\) | \(B_{p\times 1}\) | \(m\times p\)
\(A_{m\times n}\) | \(B_{p\times q}\) | \(m\times n\times p\times q\)
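The counting rule in the table can be sketched in a few lines of Python. The helper name `derivative_count` is made up for this illustration; it simply multiplies the number of elements of \(A\) by the number of elements of \(B\).

```python
from math import prod

def derivative_count(shape_a, shape_b):
    """Number of scalar partial derivatives when a matrix of shape
    shape_a is differentiated with respect to a matrix of shape shape_b."""
    return prod(shape_a) * prod(shape_b)

# The four rows of the table, with m=3, n=2, p=4, q=5 chosen arbitrarily.
print(derivative_count((1, 1), (1, 1)))  # scalar by scalar
print(derivative_count((3, 1), (1, 1)))  # m x 1 by scalar
print(derivative_count((3, 1), (4, 1)))  # m x 1 by p x 1
print(derivative_count((3, 2), (4, 5)))  # m x n by p x q
```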

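1 Two concepts

1.1 Scalar functions

\(\left [ f \right ]_{1\times 1}\): the function \(f\) evaluates to a single number, for example \(f(x_{1},x_{2})=2x_{1}+3x_{2}^{2}\).

1.2 Vector functions

\(\left [ f \right ]_{m\times n}\): a function whose value is a vector or matrix, each entry being a scalar function, for example \(f=\left [f_{1}(x)=x_{1},f_{2}(x)=x^{2} \right]_{1\times 2}\).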

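2 Basic rules of matrix differentiation, using denominator layout as the example

2.1 Rule 1

Let \(f\) be a scalar function, \(f=f(x_{1},x_{2},...,x_{p})\), and let \(X\) be a column vector, \(X=\begin{bmatrix} x_{1}\\ x_{2}\\ \vdots \\ x_{p}\\ \end{bmatrix}\). Then define: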
\(\frac{\partial f}{\partial X}=\begin{bmatrix} \frac{\partial f}{\partial x_{1}}\\ \frac{\partial f}{\partial x_{2}}\\ \vdots\\ \frac{\partial f}{\partial x_{p}}\\ \end{bmatrix}\)

If \(f\) is a scalar function and \(X\) is a row vector, define:

\(\frac{\partial f}{\partial X}=\begin{bmatrix} \frac{\partial f}{\partial x_{1}} &\frac{\partial f}{\partial x_{2}} & \cdots & \frac{\partial f}{\partial x_{p}} \end{bmatrix}_{1\times p}\)
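As a rough numerical illustration of Rule 1, the sketch below estimates the partial derivatives of the scalar function from section 1.1 with central differences and then arranges them either as a column or as a row, following the shape of \(X\). The helper `gradient`, the step size `h` and the evaluation point are assumptions made for this example only.

```python
def f(x):
    # Scalar function from section 1.1: f(x1, x2) = 2*x1 + 3*x2**2
    return 2 * x[0] + 3 * x[1] ** 2

def gradient(func, x, h=1e-6):
    """Central-difference estimate of [df/dx_1, ..., df/dx_p] as a flat list."""
    grad = []
    for i in range(len(x)):
        xp = list(x)
        xp[i] += h
        xm = list(x)
        xm[i] -= h
        grad.append((func(xp) - func(xm)) / (2 * h))
    return grad

x = [1.0, 2.0]                   # arbitrary evaluation point
g = gradient(f, x)               # analytically: [2, 6*x2]

column = [[gi] for gi in g]      # p x 1 result when X is a column vector
row = [g]                        # 1 x p result when X is a row vector

print(column)
print(row)
```

The derivative takes the shape of \(X\): the same \(p\) numbers are stored either way, and only the arrangement changes.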

2.2 Rule 2

Let \(f\) be a column vector of functions, \(f=\begin{bmatrix} f_{1}(x)\\ f_{2}(x)\\ \vdots \\ f_{m}(x) \end{bmatrix}_{m\times 1}\), and let \(X\) be a scalar. Then define:

\(\frac{\partial f}{\partial X}=\begin{bmatrix} \frac{\partial f_{1}(x)}{\partial X} & \frac{\partial f_{2}(x)}{\partial X} & \cdots & \frac{\partial f_{m}(x)}{\partial X} \end{bmatrix}=\frac{\partial (f^{T})}{\partial X}\)
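A small sketch of this case: a column of \(m\) functions of one scalar is differentiated entry by entry and the results are laid out as a \(1\times m\) row. The function `f` below (with \(m=3\)) and the helper `d_vector_d_scalar` are hypothetical and chosen only for illustration.

```python
def f(x):
    # Hypothetical vector function of a scalar x with m = 3 components
    return [x, x ** 2, x ** 3]

def d_vector_d_scalar(func, x, h=1e-6):
    """Central-difference derivative of each component, returned as a
    1 x m row (a list containing one list), as in denominator layout."""
    fp = func(x + h)
    fm = func(x - h)
    return [[(a - b) / (2 * h) for a, b in zip(fp, fm)]]

row = d_vector_d_scalar(f, 2.0)  # analytically: [[1, 2*x, 3*x**2]] at x = 2
print(row)
```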

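If \(f\) is an \(m\times 1\) column vector, \(\begin{bmatrix} f \end{bmatrix}_{m\times 1}\), and \(X\) is a \(p\times 1\) column vector, \(\begin{bmatrix} X \end{bmatrix}_{p\times 1}\), define the \(p\times m\) matrix: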

\(\frac{\partial f}{\partial X}=\begin{bmatrix} \frac{\partial f}{\partial x_{1}}\\ \frac{\partial f}{\partial x_{2}}\\ \vdots \\ \frac{\partial f}{\partial x_{p}} \end{bmatrix}=\begin{bmatrix} \frac{\partial f_{1}}{\partial x_{1}} & \frac{\partial f_{2}}{\partial x_{1}} & \cdots & \frac{\partial f_{m}}{\partial x_{1}}\\ \frac{\partial f_{1}}{\partial x_{2}} & \frac{\partial f_{2}}{\partial x_{2}} & \cdots & \frac{\partial f_{m}}{\partial x_{2}}\\ \vdots & \vdots & \vdots & \vdots \\ \frac{\partial f_{1}}{\partial x_{p}} & \frac{\partial f_{2}}{\partial x_{p}} & \cdots & \frac{\partial f_{m}}{\partial x_{p}} \end{bmatrix}\)
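To make the \(p\times m\) arrangement concrete, here is a minimal sketch that builds the matrix numerically: row \(i\) holds the derivatives of all components of \(f\) with respect to \(x_{i}\). The example function (with \(p=2\), \(m=3\)) and the helper name `jacobian_denominator_layout` are assumptions for illustration.

```python
def f(x):
    # Hypothetical vector function with p = 2 inputs and m = 3 outputs
    x1, x2 = x
    return [x1 * x2, x1 + 2 * x2, x2 ** 2]

def jacobian_denominator_layout(func, x, h=1e-6):
    """Return a p x m nested list whose entry [i][j] approximates
    d f_j / d x_i using central differences."""
    rows = []
    for i in range(len(x)):
        xp = list(x)
        xp[i] += h
        xm = list(x)
        xm[i] -= h
        fp, fm = func(xp), func(xm)
        rows.append([(a - b) / (2 * h) for a, b in zip(fp, fm)])
    return rows

J = jacobian_denominator_layout(f, [3.0, 5.0])
# Analytically: row for x1 is [x2, 1, 0], row for x2 is [x1, 2, 2*x2]
for r in J:
    print(r)
```

This is the transpose of the Jacobian as it is usually written in numerator layout, which is the main practical difference between the two conventions.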

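3 Common formulas

For functions of the form \(f(A,x)\), where \(A=\begin{bmatrix} a_{1} &a_{2} &\cdots & a_{p} \end{bmatrix}\) does not depend on \(x\) and \(x=\begin{bmatrix} x_{1}\\ x_{2}\\ \vdots \\ x_{p} \end{bmatrix}\), the following hold whenever the shape of \(A\) makes the product well defined (for the quadratic form \(x^{T}Ax\), \(A\) must be \(p\times p\)):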

  • \(\frac{\partial f}{\partial x}=\frac{\partial A^{T}x}{\partial x}=\frac{\partial x^{T}A}{\partial x}=A\)
  • \(\frac{\partial f}{\partial x}=\frac{\partial x^{T}Ax}{\partial x}=(A+A^{T})x\)
  • \(\frac{\partial f}{\partial x}=\frac{\partial Ax}{\partial x}=A^{T}\)
  • \(\frac{\partial x^{T}x}{\partial x}=2x\)
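The identities above can be spot-checked numerically. The sketch below compares central-difference derivatives in denominator layout against the closed forms, assuming a constant \(3\times 3\) matrix `A` and a test point `x` that were picked arbitrarily. All helper names are made up for this example, and a numerical check at one point is evidence, not a proof.

```python
def matvec(M, v):
    return [sum(m * vj for m, vj in zip(row, v)) for row in M]

def transpose(M):
    return [list(col) for col in zip(*M)]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def jacobian(func, x, h=1e-6):
    """Denominator layout: entry [i][j] approximates d f_j / d x_i."""
    rows = []
    for i in range(len(x)):
        xp = list(x)
        xp[i] += h
        xm = list(x)
        xm[i] -= h
        fp, fm = func(xp), func(xm)
        rows.append([(a - b) / (2 * h) for a, b in zip(fp, fm)])
    return rows

def max_diff(P, Q):
    """Largest absolute entrywise difference between two nested lists."""
    return max(abs(p - q) for rp, rq in zip(P, Q) for p, q in zip(rp, rq))

A = [[1.0, 2.0, 0.0],
     [4.0, 3.0, 1.0],
     [0.0, 5.0, 6.0]]          # assumed constant matrix
At = transpose(A)
x = [1.0, -2.0, 0.5]           # assumed test point

# Scalar-valued functions are wrapped in a one-element list so that
# the same jacobian helper returns a p x 1 column.
d_quad = jacobian(lambda v: [dot(v, matvec(A, v))], x)
quad_expected = [[a + b] for a, b in zip(matvec(A, x), matvec(At, x))]

d_Atx = jacobian(lambda v: matvec(At, v), x)   # should match A
d_Ax = jacobian(lambda v: matvec(A, v), x)     # should match A^T
d_xx = jacobian(lambda v: [dot(v, v)], x)      # should match 2x
xx_expected = [[2 * xi] for xi in x]

# Each printed value should be close to zero.
print(max_diff(d_Atx, A))
print(max_diff(d_quad, quad_expected))
print(max_diff(d_Ax, At))
print(max_diff(d_xx, xx_expected))
```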

If \(U=\begin{bmatrix} u_{1}(x)\\ u_{2}(x)\\ \vdots \\ u_{p}(x) \end{bmatrix}\) and \(V=\begin{bmatrix} v_{1}(x)\\ v_{2}(x)\\ \vdots \\ v_{p}(x) \end{bmatrix}\), then:

  • \(\frac{\partial U^{T}V}{\partial x}=\frac{\partial U}{\partial x}V+\frac{\partial V}{\partial x}U\)
  • \(\frac{\partial (U+V)}{\partial x}=\frac{\partial U}{\partial x}+\frac{\partial V}{\partial x}\)
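The product rule for \(U^{T}V\) can be checked the same way. In the sketch below, `U` and `V` are hypothetical vector functions with \(p=2\), and both sides of the identity are evaluated with central differences at an arbitrary point; \(\frac{\partial U}{\partial x}\) and \(\frac{\partial V}{\partial x}\) are the \(p\times p\) denominator-layout matrices from Rule 2.

```python
def U(x):
    # Hypothetical vector function, p = 2
    return [x[0] ** 2, x[0] * x[1]]

def V(x):
    # Hypothetical vector function, p = 2
    return [x[1], x[0] + x[1]]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def matvec(M, v):
    return [dot(row, v) for row in M]

def jacobian(func, x, h=1e-6):
    """Denominator layout: entry [i][j] approximates d f_j / d x_i."""
    rows = []
    for i in range(len(x)):
        xp = list(x)
        xp[i] += h
        xm = list(x)
        xm[i] -= h
        fp, fm = func(xp), func(xm)
        rows.append([(a - b) / (2 * h) for a, b in zip(fp, fm)])
    return rows

x = [1.0, 2.0]  # arbitrary test point

# Left-hand side: derivative of the scalar U^T V, as a flat list of length p
lhs = [r[0] for r in jacobian(lambda v: [dot(U(v), V(v))], x)]

# Right-hand side: (dU/dx) V + (dV/dx) U
rhs = [a + b for a, b in zip(matvec(jacobian(U, x), V(x)),
                             matvec(jacobian(V, x), U(x)))]

print(lhs)
print(rhs)
```

For this particular choice, \(U^{T}V=2x_{1}^{2}x_{2}+x_{1}x_{2}^{2}\), so both lists should be close to \([12, 6]\) at \(x=(1,2)\).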
