【智应数】Singular Value Decomposition

SVD

Def (eigenvalue/eigenvector). An eigenvalue \(\lambda\) and a (non-zero) eigenvector \(\bm{v}\) of a matrix \(A\) satisfy $$A\bm{v}=\lambda\bm{v}.$$
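A quick numerical check of the definition — a sketch using numpy, which these notes do not otherwise assume:

```python
import numpy as np

A = np.array([[2.0, 1.0],
              [1.0, 2.0]])
eigvals, eigvecs = np.linalg.eig(A)

# Each column v of eigvecs satisfies A v = lambda v.
for lam, v in zip(eigvals, eigvecs.T):
    assert np.allclose(A @ v, lam * v)
```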

Lem 1. Let \(M\in\mathbb{R}^{n\times n}\) be a symmetric matrix. Let \(\lambda_i\) and \(\bm{u}_i\) be the eigenvalues and orthonormal eigenvectors of \(M\). Then

\[M=\sum_i\lambda_i\bm{u}_i\bm{u}_i^T. \]

i.e., let \(U\) be the orthonormal matrix spanned by eigenvectors, \(D\) be the diagonal matrix generated by eigenvalues,

\[M=UDU^T. \]

Pf. It suffices to prove \(U^TMU=D\). By definition, \(\bm{u}_1^TM\bm{u}_1=\lambda_1\) and \(W_1^TM\bm{u}_1=\vec{0}\), where \(W_1\) denotes the matrix whose columns are the remaining eigenvectors, all orthogonal to \(\bm{u}_1\). Induction on the dimension completes the proof.
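Lem 1 can be verified numerically; this numpy sketch checks both the rank-one-sum form and \(M=UDU^T\):

```python
import numpy as np

rng = np.random.default_rng(0)
S = rng.standard_normal((4, 4))
M = S + S.T                          # make M symmetric

lam, U = np.linalg.eigh(M)           # eigh: eigensolver for symmetric matrices
D = np.diag(lam)

assert np.allclose(U.T @ U, np.eye(4))   # U is orthonormal
assert np.allclose(U @ D @ U.T, M)       # M = U D U^T

# Rank-one form: M = sum_i lambda_i u_i u_i^T
M_sum = sum(lam[i] * np.outer(U[:, i], U[:, i]) for i in range(4))
assert np.allclose(M_sum, M)
```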

Lem 2. Suppose \(\bm{v}\) is an eigenvector of \(A^TA\) with eigenvalue \(\lambda\ne 0\); then \(A\bm{v}\) is an eigenvector of \(AA^T\) with the same eigenvalue. Hence \(A^TA\) and \(AA^T\) share the same non-zero eigenvalues, all of which are non-negative.

Pf.

\[A^TA\bm{v}=\lambda\bm{v} \]

\[\Rightarrow AA^TA\bm{v}=A\lambda\bm{v} \]

\[\Rightarrow (AA^T)(A\bm{v})=\lambda (A\bm{v}). \]
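A numerical illustration of Lem 2 (a numpy sketch; the matrix shapes are arbitrary choices):

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.standard_normal((5, 3))

lam1 = np.sort(np.linalg.eigvalsh(A.T @ A))   # 3 eigenvalues (all > 0 generically)
lam2 = np.sort(np.linalg.eigvalsh(A @ A.T))   # 5 eigenvalues
assert np.allclose(lam2[:2], 0.0)             # the larger spectrum is padded with zeros
assert np.allclose(lam2[2:], lam1)            # same non-zero eigenvalues

# A v is an eigenvector of A A^T with the same eigenvalue.
lam, V = np.linalg.eigh(A.T @ A)
v = V[:, -1]                                  # top eigenvector of A^T A
assert np.allclose((A @ A.T) @ (A @ v), lam[-1] * (A @ v))
```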

Thm (SVD). Let \(M\in\mathbb{R}^{m\times n}\) be an arbitrary matrix. Let \(\sigma_i,\bm{u}_i,\bm{v}_i\) be the singular values (the square roots of the eigenvalues of \(M^TM\)), left singular vectors, and right singular vectors. Then

\[M=\sum\limits_{i}\sigma_i\bm{u}_i\bm{v}_i^T. \]

i.e., let \(S,U,V\) be the corresponding matrices ( \(U\) and \(V\) are orthonormal, \(S\) is diagonal); then

\[M=USV^T. \]

Pf. Intuitively, \(M^TM=VS^2V^T\) and \(MM^T=US^2U^T\). Thus \(V\), \(U\), and \(S\) should consist of the eigenvectors of \(M^TM\), the eigenvectors of \(MM^T\), and the square roots of the eigenvalues, respectively.

By the computation in Lem 2 (applied with \(A=M\)), $$\Vert M\bm{v}\Vert=\sqrt{\bm{v}^T M^TM\bm{v}}= \sqrt{\bm{v}^T\lambda \bm{v}}=\sqrt{\lambda} \Vert\bm{v}\Vert.$$

It means that if \(V\) is the orthonormal matrix spanned by the eigenvectors of \(M^TM\), then \(U=MVS^{-1}\) is the orthonormal matrix spanned by the eigenvectors of \(MM^T\) (assuming \(M\) has full rank, so that \(S\) is invertible). This immediately gives $$U=MVS^{-1}\Rightarrow M=USV^T.$$
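The construction in this proof can be carried out directly in numpy — a sketch assuming \(M\) has full column rank, so \(S\) is invertible:

```python
import numpy as np

rng = np.random.default_rng(2)
M = rng.standard_normal((4, 3))          # generically full column rank

lam, V = np.linalg.eigh(M.T @ M)         # eigenvalues in ascending order
order = np.argsort(lam)[::-1]            # re-sort descending (SVD convention)
lam, V = lam[order], V[:, order]
S = np.diag(np.sqrt(lam))

U = M @ V @ np.linalg.inv(S)             # U = M V S^{-1}

assert np.allclose(U.T @ U, np.eye(3))   # columns of U are orthonormal
assert np.allclose(U @ S @ V.T, M)       # M = U S V^T
```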

Prop. Let \(r=\text{rank}(A)\). We have

  • The first \(r\) columns of \(U\), \(\bm{u}_1,...,\bm{u}_r\), form an orthonormal basis of \(\text{Col}(A)\).
  • The first \(r\) columns of \(V\), \(\bm{v}_1,...,\bm{v}_r\), form an orthonormal basis of \(\text{Row}(A)\).
  • The last \(m-r\) columns of \(U\) form an orthonormal basis of \(\text{Null}(A^T)\).
  • The last \(n-r\) columns of \(V\) form an orthonormal basis of \(\text{Null}(A)\).
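The four statements can be checked numerically; this numpy sketch builds an explicitly rank-deficient \(A\) (note `Vt` stores \(V^T\), so its rows are the columns of \(V\)):

```python
import numpy as np

rng = np.random.default_rng(3)
A = rng.standard_normal((5, 2)) @ rng.standard_normal((2, 4))  # rank r = 2

U, s, Vt = np.linalg.svd(A)
r = int(np.sum(s > 1e-10))
assert r == 2

# First r columns of U span Col(A): projecting A's columns onto them is a no-op.
assert np.allclose(U[:, :r] @ U[:, :r].T @ A, A)
# Last n - r columns of V (rows of Vt) lie in Null(A).
assert np.allclose(A @ Vt[r:].T, 0.0)
# Last m - r columns of U lie in Null(A^T).
assert np.allclose(A.T @ U[:, r:], 0.0)
```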

Best Rank-\(k\) Approximations

Let \(\sigma_1\ge \sigma_2\ge...\) be the singular values of \(A\). Define

\[A_k=\sum\limits_{i=1}^k \sigma_i \bm{u}_i\bm{v}_i^T. \]

Lem. $$\Vert M\Vert_F ^2 = \text{trace}(M^TM)=\sum\limits_i\sigma_i(M) ^2.$$
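A quick numpy check of this identity:

```python
import numpy as np

rng = np.random.default_rng(4)
M = rng.standard_normal((4, 6))

fro2 = np.sum(M ** 2)                                   # ||M||_F^2
assert np.isclose(fro2, np.trace(M.T @ M))              # = trace(M^T M)
sv = np.linalg.svd(M, compute_uv=False)                 # singular values only
assert np.isclose(fro2, np.sum(sv ** 2))                # = sum of sigma_i^2
```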

Lem. For any \(M\) with \(\text{rank}(M)\le k\),

\[\forall i,\quad \sigma_{k+i}(A)\le \sigma_i(A-M). \]

Pf. Since \(\text{rank}(M)\le k\), \(\dim(\text{Null}(M))\ge n-k\), so by a dimension count there exists a non-zero \(\bm{w}\in \text{Null}(M)\cap \text{Span}(\bm{v}_1,...,\bm{v}_{k+1})\). Since \(M\bm{w}=\vec{0}\),

\[\sigma_{k+1}(A)\Vert \bm{w}\Vert \le \Vert A\bm{w}\Vert= \Vert (A-M)\bm{w}\Vert\le\sigma_{1}(A-M)\Vert \bm{w}\Vert. \]

This proves the case \(i=1\); the general case follows by the same argument, additionally intersecting with the orthogonal complement of the top \(i-1\) right singular vectors of \(A-M\).

Thm. Let \(\Vert M\Vert_F=\sqrt{\sum\limits_{i,j}m_{i,j}^2}\). For any matrix \(B\) of rank at most \(k\), we have $$\Vert A-A_k\Vert_F\le \Vert A-B\Vert_F.$$

Pf. Applying the previous lemma to \(M=B\), $$\Vert A-A_k\Vert_F^2=\sum\limits_{i=k+1} ^r \sigma_i(A)^2\le \sum\limits_{i=1} ^{r-k}\sigma_i(A-B) ^2\le \Vert A-B\Vert_F^2.$$
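A small numpy experiment illustrating (not proving) the theorem, comparing \(A_k\) against an arbitrary rank-\(k\) matrix \(B\):

```python
import numpy as np

rng = np.random.default_rng(5)
A = rng.standard_normal((6, 5))
k = 2

U, s, Vt = np.linalg.svd(A)
A_k = U[:, :k] @ np.diag(s[:k]) @ Vt[:k]                 # truncated SVD
# The residual is exactly the discarded singular values.
assert np.isclose(np.linalg.norm(A - A_k, 'fro') ** 2, np.sum(s[k:] ** 2))

# An arbitrary rank-k competitor does no better.
B = rng.standard_normal((6, k)) @ rng.standard_normal((k, 5))
assert np.linalg.norm(A - A_k, 'fro') <= np.linalg.norm(A - B, 'fro')
```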

Thm. Let \(\Vert M\Vert_2=\sup\limits_{\Vert\bm{x}\Vert=1}\Vert M\bm{x}\Vert\). For any matrix \(B\) of rank at most \(k\), we have $$\Vert A-A_k\Vert_2\le \Vert A-B\Vert_2.$$

Principal Component Analysis (PCA): \(A\rightarrow AV_k\), where \(V_k\) consists of the first \(k\) columns of \(V\).
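As a sketch, PCA via the SVD (the data is centered first, which standard PCA assumes; shapes and seed are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(6)
A = rng.standard_normal((100, 5))
A = A - A.mean(axis=0)                   # center the columns

k = 2
_, _, Vt = np.linalg.svd(A, full_matrices=False)
V_k = Vt[:k].T                           # d x k: top-k right singular vectors
Z = A @ V_k                              # n x k reduced representation

assert Z.shape == (100, 2)
```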

Power Method

Let \(B=A^TA=\sum\limits_{i=1}^n\sigma_i^2\bm{v}_i\bm{v}_i^T\). Since the \(\bm{v}_i\) are orthonormal, $$B^k=\sum\limits_{i=1} ^n \sigma_i^{2k} \bm{v}_i\bm{v}_i^T.$$

Let \(\bm{x}=\sum\limits_{i=1}^nc_i\bm{v}_i\) be any vector with \(c_1\ne 0\). When \(k\) is large (and \(\sigma_1>\sigma_2\)),

\[B^k\bm{x}=\sum\limits_{i=1} ^n \sigma_i^{2k} c_i\bm{v}_i\approx\sigma_1^{2k}c_1\bm{v}_1. \]

Normalizing \(B^k\bm{x}\) thus yields \(\bm{v}_1\); then \(\sigma_1=\Vert A\bm{v}_1\Vert\) and \(\bm{u}_1=A\bm{v}_1/\sigma_1\). Next, let \(B'=B-\sigma_1^2\bm{v}_1\bm{v}_1^T\) and repeat the process to get \(\sigma_2,...,\sigma_r\).
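The whole recipe, as a minimal numpy sketch (iteration count and seed are arbitrary choices; normalizing every step avoids overflow without changing the direction):

```python
import numpy as np

def top_singular(A, iters=200, seed=0):
    """Power method for the top singular triple (sigma_1, u_1, v_1)."""
    rng = np.random.default_rng(seed)
    B = A.T @ A
    x = rng.standard_normal(A.shape[1])  # random start: c_1 != 0 generically
    for _ in range(iters):
        x = B @ x
        x /= np.linalg.norm(x)           # keep the iterate unit length
    v = x                                # approximates v_1
    sigma = np.linalg.norm(A @ v)        # sigma_1 = ||A v_1||
    u = A @ v / sigma                    # u_1 = A v_1 / sigma_1
    # For sigma_2, deflate with B - sigma**2 * np.outer(v, v) and repeat.
    return sigma, u, v

rng = np.random.default_rng(7)
A = rng.standard_normal((6, 4))
sigma, u, v = top_singular(A)
assert np.isclose(sigma, np.linalg.svd(A, compute_uv=False)[0])
```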

Thm 3.11. Let \(A\) be an \(n\times d\) matrix and \(\bm{x}\) a unit length vector in \(\mathbb{R}^d\) with \(|\bm{x}^T \bm{v}_1|\ge\delta\), where \(\delta>0\). Let \(V\) be the space spanned by the right singular vectors of \(A\) corresponding to singular values greater than \((1-\varepsilon)\sigma_1\). Let \(\bm{w}\) be the unit vector after \(k = \frac{\ln(1/(\varepsilon\delta))}{2\varepsilon}\) iterations of the power method, namely,

\[\bm{w}=\frac{(A^TA)^k\bm{x}}{\Vert(A^TA)^k\bm{x}\Vert}. \]

Then \(\bm{w}\) has a component of at most \(\varepsilon\) perpendicular to \(V\).

posted @ 2024-05-14 20:10  xcyle