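4. PCA and Whitening

Summary:
1) This exercise extends the previous one to natural images.
2) It mainly demonstrates that the covariance matrix of PCA-rotated image data is essentially diagonal, which is a handy sanity check for later work. Whitening also needs a regularization term.
3) The code includes sampleIMAGESRAW.m, an example of how to sample small patches from images.
4) Data after PCA dimension reduction can still be viewed as images, but whitened data cannot be inspected the same way.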
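Questions:
1) For the diagonal of the PCAWhite covariance matrix, the exercise shows the effect of having or not having regularization. What I do not understand is the more precise mathematical explanation of how the size of the regularization coefficient matters, and what changing its magnitude means. This is mainly because the whitened data cannot be viewed directly, and I do not know a technique for inspecting it.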
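On question 1, a short derivation may help. It is a sketch added here, not part of the original exercise notes, and it assumes the script's notation: sigma is the covariance matrix of the m zero-mean samples in X, U holds its eigenvectors, and lambda_i are the diagonal entries of s.

```latex
\begin{aligned}
\Sigma &= \frac{1}{m} X X^{\top} = U \Lambda U^{\top},
  \qquad \Lambda = \mathrm{diag}(\lambda_1, \dots, \lambda_n) \\
X_{\mathrm{PCAwhite}} &= (\Lambda + \epsilon I)^{-1/2} \, U^{\top} X \\
\frac{1}{m} X_{\mathrm{PCAwhite}} X_{\mathrm{PCAwhite}}^{\top}
  &= (\Lambda + \epsilon I)^{-1/2} \, U^{\top} \Sigma \, U \, (\Lambda + \epsilon I)^{-1/2}
   = \mathrm{diag}\!\left( \frac{\lambda_i}{\lambda_i + \epsilon} \right) \\
\frac{\lambda_i}{\lambda_i + \epsilon} &\approx 1 \quad (\lambda_i \gg \epsilon),
  \qquad
\frac{\lambda_i}{\lambda_i + \epsilon} \approx \frac{\lambda_i}{\epsilon} \approx 0 \quad (\lambda_i \ll \epsilon)
\end{aligned}
```

Read this way, epsilon acts as a soft threshold on the eigenvalues. Components whose variance is well above epsilon are scaled to roughly unit variance, while components well below epsilon stay small instead of being amplified by 1/sqrt(lambda_i). This matches the diagonal that starts near 1 and gradually shrinks in Step 4b.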
UFLDL PCA and Whitening
The exercise requires downloading the code pca_exercise.zip.
Exercise code: pca_gen.m
clear;close all;clc;
disp('Currently running script:');
disp([mfilename('fullpath'),'.m']);
%%================================================================
%% Step 0a: Load data
%  Here we provide the code to load natural image data into x.
%  x will be a 144 * 10000 matrix, where the kth column x(:, k) corresponds to
%  the raw image data from the kth 12x12 image patch sampled.
%  You do not need to change the code below.

x = sampleIMAGESRAW();
figure('name','Raw images');
randsel = randi(size(x,2),200,1); % A random selection of samples for visualization
display_network(x(:,randsel));

%%================================================================
%% Step 0b: Zero-mean the data (by row)
%  You can make use of the mean and repmat/bsxfun functions.

% -------------------- YOUR CODE HERE -------------------- 
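% x is a 144 x 10000 matrix: 10000 samples, each a 12x12 image patch.
% mean(x,1) returns a 1 x 10000 row vector holding the mean intensity of each
% patch, which is then subtracted from every pixel of that patch.
avg=mean(x,1);
x=x-repmat(avg,size(x,1),1);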


%%================================================================
%% Step 1a: Implement PCA to obtain xRot
%  Implement PCA to obtain xRot, the matrix in which the data is expressed
%  with respect to the eigenbasis of sigma, which is the matrix U.


% -------------------- YOUR CODE HERE -------------------- 
xRot = zeros(size(x)); % You need to compute this
sigma=x*x'/size(x,2);
[u,s,v]=svd(sigma);
diags=diag(s);
xRot=u'*x;

%%================================================================
%% Step 1b: Check your implementation of PCA
%  The covariance matrix for the data expressed with respect to the basis U
%  should be a diagonal matrix with non-zero entries only along the main
%  diagonal. We will verify this here.
%  Write code to compute the covariance matrix, covar. 
%  When visualised as an image, you should see a straight line across the
%  diagonal (non-zero entries) against a blue background (zero entries).

% -------------------- YOUR CODE HERE -------------------- 
covar = zeros(size(x, 1)); % You need to compute this
covar=xRot*xRot'/size(xRot,2);

% Visualise the covariance matrix. You should see a line across the
% diagonal against a blue background.
figure('name','Visualisation of the covariance matrix of xRot');
imagesc(covar);
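
% Added sketch (not part of the original exercise code): a numeric version of
% the visual check above. offDiag and the printed message are illustrative.
% The off-diagonal entries should be tiny compared with the diagonal.
offDiag = covar - diag(diag(covar));
fprintf('max |off-diagonal| = %g, largest diagonal entry = %g\n', ...
    max(abs(offDiag(:))), max(diag(covar)));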

%%================================================================
%% Step 2: Find k, the number of components to retain
%  Write code to determine k, the number of components to retain in order
%  to retain at least 99% of the variance.

% -------------------- YOUR CODE HERE -------------------- 
k = 0; % Set k accordingly
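% Note the use of sum(diag(s)): diag extracts the diagonal of s as a column
% vector, and sum reduces it to a single number.
% Without diag, sum adds along columns by default and returns a row vector.
total=sum(diag(s));
for k=1:size(s,1)
    % ">=" matches the requirement of retaining at least 99% of the variance
    if sum(diag(s(1:k,1:k)))/total >= 0.99
        break;
    end
end
% A faster vectorised alternative using MATLAB built-ins is given below (commented out).
% diags is the column vector of the diagonal entries of s.
% cumsum(diags) is the running sum of those entries.
% sum(diags) is their total.
% cumsum(diags)/sum(diags) is the cumulative fraction of variance retained.
% (cumsum(diags)/sum(diags))<=0.99 is a logical mask, 1 where that fraction is at most 0.99.
% diags(mask) extracts the entries selected by the mask,
% and length counts how many entries were selected.
% The two results differ by one: the loop gives k = 116, the vectorised form gives 115.
% This is expected: the loop stops at the first k that reaches the threshold, while the
% mask only counts the components before it, so 115 retains slightly less than 99%.
% diags=diag(s);
% k = length(diags((cumsum(diags)/sum(diags))<=0.99));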
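% Added sketch (not in the original post): find(..., 1) returns the first index
% at which the cumulative variance share reaches 0.99, i.e. the smallest number
% of components that retains at least 99% of the variance.
% kFast is an illustrative name, kept separate so that k above is not overwritten.
kFast = find(cumsum(diags)/sum(diags) >= 0.99, 1);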

%%================================================================
%% Step 3: Implement PCA with dimension reduction
%  Now that you have found k, you can reduce the dimension of the data by
%  discarding the remaining dimensions. In this way, you can represent the
%  data in k dimensions instead of the original 144, which will save you
%  computational time when running learning algorithms on the reduced
%  representation.
% 
%  Following the dimension reduction, invert the PCA transformation to produce 
%  the matrix xHat, the dimension-reduced data with respect to the original basis.
%  Visualise the data and compare it to the raw data. You will observe that
%  there is little loss due to throwing away the principal components that
%  correspond to dimensions with low variation.

% -------------------- YOUR CODE HERE -------------------- 
xHat = zeros(size(x));  % You need to compute this
xHat=u*[u(:,1:k),zeros(size(u,1),size(u,2)-k)]'*x;

% Visualise the data, and compare it to the raw data
% You should observe that the raw and processed data are of comparable quality.
% For comparison, you may wish to generate a PCA reduced image which
% retains only 90% of the variance.

figure('name',['PCA processed images ',sprintf('(%d / %d dimensions)', k, size(x, 1)),'']);
display_network(xHat(:,randsel));
% The data was zero-meaned earlier, so the PCA reconstruction should be compared
% with the zero-meaned data, which is what the next figure shows.
figure('name','Raw images');
display_network(x(:,randsel));

%%================================================================
%% Step 4a: Implement PCA with whitening and regularisation
%  Implement PCA with whitening and regularisation to produce the matrix
%  xPCAWhite. 

epsilon = 0.1;
xPCAWhite = zeros(size(x));

% -------------------- YOUR CODE HERE -------------------- 
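% If dimension reduction is also wanted, it is simpler to apply it after PCA whitening.
% The PCA-whitened data is not displayed because, as in the pca_2d exercise,
% it looks too different from the original data.
% This exercise focuses more on showing how much of the original information
% survives PCA dimension reduction.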
xPCAWhite = diag(1./sqrt(diag(s)+epsilon))*u'*x;

%%================================================================
%% Step 4b: Check your implementation of PCA whitening 
%  Check your implementation of PCA whitening with and without regularisation. 
%  PCA whitening without regularisation results a covariance matrix 
%  that is equal to the identity matrix. PCA whitening with regularisation
%  results in a covariance matrix with diagonal entries starting close to 
%  1 and gradually becoming smaller. We will verify these properties here.
%  Write code to compute the covariance matrix, covar. 
%
%  Without regularisation (set epsilon to 0 or close to 0), 
%  when visualised as an image, you should see a red line across the
%  diagonal (one entries) against a blue background (zero entries).
%  With regularisation, you should see a red line that slowly turns
%  blue across the diagonal, corresponding to the one entries slowly
%  becoming smaller.

% -------------------- YOUR CODE HERE -------------------- 
% The exercise asks for a comparison of xPCAWhite with and without regularization,
% so both figures are shown.
covar=xPCAWhite*xPCAWhite'/size(xPCAWhite,2);
% Visualise the covariance matrix. You should see a red line across the
% diagonal against a blue background.
covarwithregu=diag(covar);
figure('name','Visualisation of the covariance matrix of xPCAWhite with regularization');
imagesc(covar);
% Above: xPCAWhite with regularization. Below: xPCAWhite with (almost) no regularization.
epsilon = 1e-10;
xPCAWhite = diag(1./sqrt(diag(s)+epsilon))*u'*x;
covar=xPCAWhite*xPCAWhite'/size(xPCAWhite,2);
covarwithnoregu=diag(covar);
figure('name','Visualisation of the covariance matrix of xPCAWhite with no regularization');
imagesc(covar);
% The figures show that the regularization term prevents numerical blow-up as the
% eigenvalues get smaller: the scale factor 1./sqrt(lambda) grows without bound
% as lambda approaches 0.
% This is also visible in diags (extracted in Step 1a): the largest eigenvalue is
% about 900 times the second-smallest one.
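% Added sketch (not in the original post): with regularization, the i-th diagonal
% entry of the whitened covariance should equal lambda_i/(lambda_i + epsilon).
% predictedDiag is an illustrative name; 0.1 is the epsilon that was in effect
% when covarwithregu was computed.
predictedDiag = diags ./ (diags + 0.1);
fprintf('max |measured - predicted| = %g\n', max(abs(covarwithregu - predictedDiag)));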
%%================================================================
%% Step 5: Implement ZCA whitening
%  Now implement ZCA whitening to produce the matrix xZCAWhite. 
%  Visualise the data and compare it to the raw data. You should observe
%  that whitening results in, among other things, enhanced edges.

xZCAWhite = zeros(size(x));

% -------------------- YOUR CODE HERE -------------------- 
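% UFLDL says that ZCA whitening enhances edges, and that is somewhat visible here.
% However, I could not see any effect from changing the value of epsilon.
% Since no effect was visible for full-dimensional ZCA whitening,
% I also tried dimension-reduced ZCA whitening with different epsilon values,
% but still saw no difference. I will revisit this once my maths is better.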
% Repeat the full-dimensional and the k-dimensional ZCA whitening for three values of epsilon.
for epsilon = [1, 0.1, 0.01]
    xZCAWhite=u*diag(1./sqrt(diag(s)+epsilon))*u'*x;
    % Visualise the data, and compare it to the raw data.
    % You should observe that the whitened images have enhanced edges.
    figure('name',sprintf('ZCA whitened images with epsilon %g', epsilon));
    display_network(xZCAWhite(:,randsel));

    % Dimension-reduced variant: keep the first k PCA-whitened components,
    % zero the rest, then rotate back with u.
    xPCAWhite = diag(1./sqrt(diag(s)+epsilon))*u'*x;
    xDZCAWhite=u*[xPCAWhite(1:k,:);zeros(size(xPCAWhite,1)-k,size(xPCAWhite,2))];
    figure('name',sprintf('ZCA whitened images with epsilon %g (%d / %d dimensions)', epsilon, k, size(x, 1)));
    display_network(xDZCAWhite(:,randsel));
end

figure('name','Raw images');
display_network(x(:,randsel));
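
% Added sketch (not in the original post): one way to see what epsilon does
% without comparing patches by eye is to plot the gain 1/sqrt(lambda_i + epsilon)
% that whitening applies to each principal component. idx and the figure name
% are illustrative. A larger epsilon flattens the curve where the eigenvalues
% are small, so those components are amplified less. For natural images these
% are typically the fine-detail (high-frequency) components.
idx = 1:numel(diags);
figure('name','Whitening gain per component');
semilogy(idx, 1./sqrt(diags+1), idx, 1./sqrt(diags+0.1), idx, 1./sqrt(diags+0.01));
legend('epsilon = 1', 'epsilon = 0.1', 'epsilon = 0.01');
xlabel('component index');
ylabel('gain 1/sqrt(lambda + epsilon)');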
The figures are too numerous to be worth posting here.
posted @ 2015-11-16 10:26 菜鸡一枚