Notes on PCA and SVD
PCA (Principal Component Analysis):
Change of basis: the transformation that expresses a vector in the coordinate system of a given basis.
A set of basis vectors (stacked as rows) multiplied by $m$ sample vectors (stacked as columns) yields the transformed samples:
$$\begin{bmatrix}p_1\\p_2\\p_3\\\vdots\\p_n \end{bmatrix} \cdot \begin{bmatrix}a_1&a_2&a_3&\cdots&a_m\end{bmatrix} = \text{transformed sample vectors}$$
Example:
First take a set of basis vectors (each of unit length): e.g. x'($\frac{\sqrt{2}}{2} , \frac{\sqrt{2}}{2}$) , y'($-\frac{\sqrt{2}}{2} , \frac{\sqrt{2}}{2}$) = $\begin{bmatrix} \frac{\sqrt{2}}{2} &\frac{\sqrt{2}}{2}\\-\frac{\sqrt{2}}{2} & \frac{\sqrt{2}}{2} \end{bmatrix} $
Given a vector $$v(3,2) = \begin{bmatrix}3\\2 \end{bmatrix}$$
Change of basis: $$\begin{bmatrix} \frac{\sqrt{2}}{2} &\frac{\sqrt{2}}{2}\\-\frac{\sqrt{2}}{2} & \frac{\sqrt{2}}{2} \end{bmatrix} \cdot \begin{bmatrix}3\\2 \end{bmatrix} = \begin{bmatrix}\frac{5\sqrt{2}}{2}\\-\frac{\sqrt{2}}{2} \end{bmatrix}$$
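A quick numerical check of this change of basis (a minimal sketch in numpy, using the basis matrix and vector from the example above):

```python
import numpy as np

# Basis vectors x' and y' as the rows of P (both unit vectors)
P = np.array([[ np.sqrt(2)/2, np.sqrt(2)/2],
              [-np.sqrt(2)/2, np.sqrt(2)/2]])
v = np.array([3.0, 2.0])

# Coordinates of v in the new basis
v_new = P @ v
print(v_new)  # [5*sqrt(2)/2, -sqrt(2)/2], i.e. approximately [3.5355, -0.7071]
```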
$$\text{Covariance: } \mathrm{Cov}_{xy} = \frac{\sum^n_{i=1}(x_i - \mu_x)(y_i - \mu_y)}{n-1}$$
$$\text{Variance: } \sigma^2 = \frac{\sum^n_{i=1}(\vec{x_i} \cdot \vec{v} - \mu)^2}{n}$$
$$\text{Mean: } \mu = \frac{\sum^n_{i=1} \vec{x_i} \cdot \vec{v}}{n}$$
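As a sanity check, the covariance formula above can be compared against `np.cov` (the arrays `x` and `y` below are made-up illustrative data, not from the notes):

```python
import numpy as np

# Illustrative sample data
x = np.array([2.1, 2.5, 3.6, 4.0])
y = np.array([8.0, 10.0, 12.0, 14.0])

# Sample covariance by the formula above (n-1 in the denominator)
n = len(x)
cov_xy = np.sum((x - x.mean()) * (y - y.mean())) / (n - 1)

# np.cov returns the 2x2 covariance matrix; the off-diagonal entry is Cov_xy
assert np.isclose(cov_xy, np.cov(x, y)[0, 1])
print(cov_xy)
```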
The goal of PCA is to find linearly uncorrelated orthogonal axes, also called principal components (PCs), in the $m$-dimensional space, and to project the data points onto these PCs.
After centering all the sample data, the mean becomes $$ \frac{\sum^n_{i=1} \vec{x_i} \cdot \vec{v}}{n} = 0$$
so the variance is $$\sigma^2 = \frac{\sum^n_{i=1}(\vec{x_i} \cdot \vec{v} - 0)^2}{n} = \sum^n_{i=1} \frac{(\vec{x_i}\cdot \vec{v})^2}{n}$$
Consider $(\vec{x_i}\cdot \vec{v})^2$: with $\vec{x_i}$ a row vector and $\vec{v}$ a unit column vector, $\vec{x_i}\cdot \vec{v}$ is a scalar,
so $(\vec{x_i}\cdot \vec{v})^2 = (\vec{x_i}\vec{v})^T(\vec{x_i}\vec{v}) = v^Tx_i^Tx_iv$
$$\sum^n_{i=1} \frac{(\vec{x_i}\cdot \vec{v})^2}{n} = \sum^n_{i=1} \frac{v^Tx_i^Tx_iv}{n} = v^T\left[\sum^n_{i=1} \frac{x_i^Tx_i}{n}\right]v$$
where $\sum^n_{i=1} \frac{x_i^Tx_i}{n}$ is the covariance matrix.
Let $$\Sigma = \sum^n_{i=1}\frac{x_i^Tx_i}{n}$$
Maximize $\vec{v}^T\Sigma\vec{v}$ subject to the constraint $\vec{v}^T\vec{v} = 1$.
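The identity derived above, that the variance of the projections $\vec{x_i}\cdot\vec{v}$ equals $v^T\Sigma v$, can be verified numerically (a sketch with randomly generated centered data; the unit vector `v` is an arbitrary choice):

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.standard_normal((100, 3))
X = X - X.mean(axis=0)               # center the samples (rows of X)

v = np.array([1.0, 2.0, 2.0]) / 3.0  # a unit vector: v^T v = 1

# Covariance matrix Sigma = (1/n) * sum_i x_i^T x_i = X^T X / n
S = X.T @ X / X.shape[0]

# Variance of the projections x_i . v (their mean is 0 after centering)
proj_var = np.mean((X @ v) ** 2)
assert np.isclose(proj_var, v @ S @ v)
```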
Constrained maximization via a Lagrange multiplier:
Let $L= \vec{v}^T\Sigma\vec{v} + \lambda(1-\vec{v}^T\vec{v})$
Differentiate: $\frac{\partial{L}}{\partial{\vec{v}}} = 2\Sigma\vec{v} - 2\lambda\vec{v} = 0$
$ \Sigma \vec{v} = \lambda \vec{v}$
so $\lambda$ is an eigenvalue of $\Sigma$ and $\vec{v}$ the corresponding eigenvector; since the variance is $\vec{v}^T\Sigma\vec{v} = \lambda$, the eigenvector with the largest eigenvalue is the direction of maximum variance.
```python
import numpy as np

data = np.random.randint(1, high=10, size=(5, 4))
print('\nmatrix\n', data)

mean = np.mean(data, axis=0)
print('\nmean\n', mean)
std = np.std(data, axis=0)
print('\nstandard deviation\n', std)

# Center (and scale) the data: each column now has zero mean
data_center = (data - mean) / std
print('\ncentered matrix\n', data_center)

# Covariance matrix of the features (columns): two equivalent computations
cov1 = np.cov(data_center, rowvar=False)
cov2 = data_center.T.dot(data_center) / (data_center.shape[0] - 1)
print('\ncovariance matrix 1\n', cov1)
print('\ncovariance matrix 2\n', cov2)

eig_val, eig_vec = np.linalg.eig(cov1)
print('\ncumulative weight of the leading eigenvalues\n',
      np.cumsum(sorted(eig_val / eig_val.sum(), reverse=True)))
eig_val_matrix = np.diag(eig_val)
print('\neigenvalue matrix\n', eig_val_matrix)
print('\neigenvector matrix\n', eig_vec)

# Based on the eigenvalues the data can be reduced from 4 columns to 2:
# pick the eigenvectors of the two largest eigenvalues as the projection matrix
idx = np.argsort(eig_val)[::-1][:2]
V = eig_vec[:, idx]
print('\nprojection matrix\n', V)

pca = data_center.dot(V)
print('\nPCA projection\n', pca)
```
SVD (Singular Value Decomposition):
$$A = U_{m\times m}\Sigma_{m\times n} V^T_{n\times n}$$
$$U: \text{orthogonal} \qquad \Sigma: \text{diagonal} \qquad V: \text{orthogonal}$$
$$A^TA = V\Sigma^TU^TU \Sigma V^T = V\Sigma^T\Sigma V^T \quad (U^TU = I)$$
$$A^TAV = V\Sigma^TU^TU \Sigma V^TV = V\Sigma ^T \Sigma$$
Take the corresponding columns of $V$ and of $\Sigma^T\Sigma$:
$$v_{n\times 1}=[v_{11},v_{21},v_{31},\ldots,v_{n1}]^T \quad\text{and}\quad \lambda_{n\times 1} = [\lambda_1,0,0,\ldots,0]^T$$
which reduces to $$(A^TA)v_{n\times 1} = \lambda_1 v_{n\times 1}$$
From this form, $\Sigma^T\Sigma$ is the diagonal matrix of eigenvalues of $A^TA$ (so the singular values are $\sigma_i = \sqrt{\lambda_i}$), and $V$ is the eigenvector matrix of $A^TA$.
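This relationship, that the eigenvalues of $A^TA$ are the squared singular values $\sigma_i^2$, can be checked with numpy (a sketch on a random matrix):

```python
import numpy as np

A = np.random.default_rng(1).standard_normal((5, 3))

# Singular values of A (returned in descending order)
s = np.linalg.svd(A, compute_uv=False)

# Eigenvalues of the symmetric matrix A^T A, sorted descending
lam = np.sort(np.linalg.eigvalsh(A.T @ A))[::-1]

# sigma_i^2 == lambda_i
assert np.allclose(s**2, lam)
```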
$$A = U\Sigma V^T$$
$$AV = U\Sigma V^TV = U\Sigma$$
$$ U = AV\Sigma^{-1}$$ (taking $\Sigma$ as the square invertible block, as in the reduced SVD)
numpy implementation:
```python
import numpy as np

A = np.random.randint(1, high=10, size=(5, 3))

def SVD(A):
    # A^T A is symmetric, so eigh returns real eigenvalues and eigenvectors
    eig_val, eig_vec = np.linalg.eigh(np.dot(A.T, A))
    # Sort eigenvalues (and their eigenvectors) in descending order
    idx = np.argsort(eig_val)[::-1]
    eig_val = eig_val[idx]
    eig_vec = eig_vec[:, idx]
    # Singular values are the square roots of the eigenvalues of A^T A
    Sigma = np.diag(np.sqrt(np.maximum(eig_val, 0)))
    V = eig_vec
    # U = A V Sigma^{-1}
    U = np.dot(np.dot(A, V), np.linalg.inv(Sigma))
    assert np.allclose(U.dot(Sigma).dot(V.T), A)
    return U, Sigma, V

U, Sigma, V = SVD(A)
```
The svd in the numpy library (`S` is returned as a 1-D array of singular values, and the third output is $V^T$, not $V$):
U , S , Vh = np.linalg.svd(A, full_matrices=False, compute_uv=True)
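A quick usage check: with `full_matrices=False`, the reduced factors reconstruct $A$ exactly (random matrix for illustration):

```python
import numpy as np

A = np.random.default_rng(2).standard_normal((5, 3))

# S is a 1-D array of singular values; the third output is V^T
U, S, Vh = np.linalg.svd(A, full_matrices=False)

A_rec = U @ np.diag(S) @ Vh
assert np.allclose(A_rec, A)
```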