2. Vectorization

  This exercise is mainly about vectorizing sparseAutoencoderCost.m, but since my implementation from the previous exercise was already vectorized, there is not much new work here. (Vectorization did feel hard at the time, and having to vectorize in the very first exercise is genuinely difficult, although that exercise came with plenty of vectorization hints.) All that remains is to change the autoencoder's structural parameters as the document describes and to download the provided helper code.
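
  For reference, below is a minimal sketch of one fully vectorized sparseAutoencoderCost (sigmoid activations, parameter layout matching initializeParameters, squared-error + weight-decay + KL-sparsity cost). It is only an illustrative version of the technique, not necessarily the exact code I submitted; the name sparseAutoencoderCostVec is used here just to avoid clashing with the assignment file.

function [cost, grad] = sparseAutoencoderCostVec(theta, visibleSize, hiddenSize, ...
                                                 lambda, sparsityParam, beta, data)
% Unpack parameters (same layout as produced by initializeParameters)
W1 = reshape(theta(1:hiddenSize*visibleSize), hiddenSize, visibleSize);
W2 = reshape(theta(hiddenSize*visibleSize+1:2*hiddenSize*visibleSize), visibleSize, hiddenSize);
b1 = theta(2*hiddenSize*visibleSize+1:2*hiddenSize*visibleSize+hiddenSize);
b2 = theta(2*hiddenSize*visibleSize+hiddenSize+1:end);

m = size(data, 2);                          % number of training examples
sigmoid = @(z) 1 ./ (1 + exp(-z));

% Forward pass over all examples at once (no per-example loop)
a2 = sigmoid(W1*data + repmat(b1, 1, m));   % hidden activations, hiddenSize x m
a3 = sigmoid(W2*a2   + repmat(b2, 1, m));   % reconstructions,    visibleSize x m

% Average activation of each hidden unit, used by the sparsity penalty
rhoHat = mean(a2, 2);
klTerm = sum(sparsityParam*log(sparsityParam./rhoHat) + ...
             (1-sparsityParam)*log((1-sparsityParam)./(1-rhoHat)));

cost = 0.5/m*sum(sum((a3-data).^2)) ...              % squared reconstruction error
     + lambda/2*(sum(W1(:).^2) + sum(W2(:).^2)) ...  % weight decay
     + beta*klTerm;                                  % sparsity penalty

% Backpropagation, also fully vectorized
sparsityDelta = -sparsityParam./rhoHat + (1-sparsityParam)./(1-rhoHat);
delta3 = (a3 - data) .* a3 .* (1-a3);
delta2 = (W2'*delta3 + beta*repmat(sparsityDelta, 1, m)) .* a2 .* (1-a2);

W1grad = delta2*data'/m + lambda*W1;
W2grad = delta3*a2'/m   + lambda*W2;
b1grad = sum(delta2, 2)/m;
b2grad = sum(delta3, 2)/m;

grad = [W1grad(:); W2grad(:); b1grad(:); b2grad(:)];
end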


  UFLDL Vectorization

% Change the filenames if you've saved the files under different names
% On some platforms, the files might be saved as 
% train-images.idx3-ubyte / train-labels.idx1-ubyte
images = loadMNISTImages('train-images-idx3-ubyte');
labels = loadMNISTLabels('train-labels-idx1-ubyte');
 
% We are using display_network from the autoencoder code
display_network(images(:,1:100)); % Show the first 100 images
disp(labels(1:10));

  Download the MNIST dataset, together with the helper functions for loading it: http://ufldl.stanford.edu/wiki/resources/mnistHelper.zip.
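
  In case it helps to know what loadMNISTImages returns (each image as one 784-element column scaled to [0,1]), here is a rough sketch of what such a loader typically does, assuming the standard IDX file layout (a big-endian int32 header followed by raw unsigned bytes). The real helper in mnistHelper.zip also validates the magic number; treat this only as an illustration.

fp = fopen('train-images-idx3-ubyte', 'rb');
magic  = fread(fp, 1, 'int32', 0, 'ieee-be');   % 2051 for an image file
numImg = fread(fp, 1, 'int32', 0, 'ieee-be');
numRow = fread(fp, 1, 'int32', 0, 'ieee-be');
numCol = fread(fp, 1, 'int32', 0, 'ieee-be');
raw = fread(fp, inf, 'unsigned char');
fclose(fp);
images = reshape(raw, numCol, numRow, numImg);  % pixels are stored row-major
images = permute(images, [2 1 3]);              % -> rows x cols x images
images = reshape(images, numRow*numCol, numImg) / 255;  % one column per image, in [0,1]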

  Beyond that there is not much to do: just rewrite train.m from the previous section, replacing the sampleIMAGES.m sampling step with the MNIST images.

  train.m

clear;clc;close all;
%% CS294A/CS294W Programming Assignment Starter Code

%  Instructions
%  ------------
% 
%  This file contains code that helps you get started on the
%  programming assignment. You will need to complete the code in sampleIMAGES.m,
%  sparseAutoencoderCost.m and computeNumericalGradient.m. 
%  For the purpose of completing the assignment, you do not need to
%  change the code in this file. 
%
%%======================================================================
%% STEP 0: Here we provide the relevant parameters values that will
%  allow your sparse autoencoder to get good filters; you do not need to 
%  change the parameters below.

visibleSize = 28*28;   % number of input units 
hiddenSize = 196;     % number of hidden units 
sparsityParam = 0.1;   % desired average activation of the hidden units.
                     % (This was denoted by the Greek alphabet rho, which looks like a lower-case "p",
		     %  in the lecture notes). 
  
% lambda = 0;             
lambda = 3e-3;     % weight decay parameter  

% beta = 0;
beta = 3;            % weight of sparsity penalty term       

%The code below is from http://deeplearning.stanford.edu/wiki/index.php/Using_the_MNIST_Dataset.
%The two loader functions also have to be downloaded separately (mnistHelper.zip).
% Change the filenames if you've saved the files under different names
% On some platforms, the files might be saved as 
% train-images.idx3-ubyte / train-labels.idx1-ubyte
images = loadMNISTImages('train-images-idx3-ubyte');
labels = loadMNISTLabels('train-labels-idx1-ubyte');
 
% We are using display_network from the autoencoder code
display_network(images(:,1:100)); % Show the first 100 images
disp(labels(1:10));
set(gcf,'NumberTitle','off');
set(gcf,'Name','First 100 MNIST images');

%%======================================================================
%% STEP 1: 
%The patch loading below was modified for this exercise:
%patches = first 10000 images from the MNIST dataset
%
numpatches = 10000;
% images is already visibleSize x numImages, so the first 10000 columns can be
% copied directly; no per-image loop or reshape is needed.
patches = images(:, 1:numpatches);

%%======================================================================
%% STEP 2: Implement sparseAutoencoderCost
%
%  You can implement all of the components (squared error cost, weight decay term,
%  sparsity penalty) in the cost function at once, but it may be easier to do 
%  it step-by-step and run gradient checking (see STEP 3) after each step.  We 
%  suggest implementing the sparseAutoencoderCost function using the following steps:
%
%  (a) Implement forward propagation in your neural network, and implement the 
%      squared error term of the cost function.  Implement backpropagation to 
%      compute the derivatives.   Then (using lambda=beta=0), run Gradient Checking 
%      to verify that the calculations corresponding to the squared error cost 
%      term are correct.
%
%  (b) Add in the weight decay term (in both the cost function and the derivative
%      calculations), then re-run Gradient Checking to verify correctness. 
%
%  (c) Add in the sparsity penalty term, then re-run Gradient Checking to 
%      verify correctness.
%
%  Feel free to change the training settings when debugging your
%  code.  (For example, reducing the training set size or 
%  number of hidden units may make your code run faster; and setting beta 
%  and/or lambda to zero may be helpful for debugging.)  However, in your 
%  final submission of the visualized weights, please use parameters we 
%  gave in Step 0 above.

theta = initializeParameters(hiddenSize, visibleSize);
[costBegin, grad] = sparseAutoencoderCost(theta, visibleSize, hiddenSize, lambda, ...
                                     sparsityParam, beta, patches);

%%======================================================================

%The block below checks computeNumericalGradient and sparseAutoencoderCost.
%Once the check has passed it can be left commented out and never run again.
%{
%% STEP 3: Gradient Checking
%
% Hint: If you are debugging your code, performing gradient checking on smaller models 
% and smaller training sets (e.g., using only 10 training examples and 1-2 hidden 
% units) may speed things up.

% First, lets make sure your numerical gradient computation is correct for a
% simple function.  After you have implemented computeNumericalGradient.m,
% run the following: 

%checkNumericalGradient() validates computeNumericalGradient.m on a simple test
%function; run it while writing computeNumericalGradient, after that it is not needed.
%checkNumericalGradient();

% Now we can use it to check your cost function and derivative calculations
% for the sparse autoencoder. 

%I did not understand the syntax of the line below at first; after asking my
%senior labmate Wang Xin it became completely clear.
%The call takes two arguments separated by a comma: the first is a function,
%the second is a constant (the parameter vector). Looking at the definition of
%computeNumericalGradient(J, theta), the first argument J is itself a function.
%Here an anonymous function is passed, with x declared as the one argument of
%sparseAutoencoderCost that stays variable; whenever the function is called
%later, the value passed in replaces x.
%Figuring this out was not easy (my MATLAB was still shaky back then), but it
%felt great once it finally clicked.
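%
%(Hypothetical mini example, added only to illustrate the binding; someCost,
% fixedA and fixedB do not exist in the assignment.)
%    f = @(x) x.^2 + 1;                      % f(3) returns 10
%    J = @(x) someCost(x, fixedA, fixedB);   % fixedA/fixedB are captured here
%computeNumericalGradient(J, theta) then only needs to evaluate J at perturbed
%parameter vectors, so below J(x) calls sparseAutoencoderCost with x replacing
%theta while all the other arguments stay fixed.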
numgrad = computeNumericalGradient( @(x) sparseAutoencoderCost(x, visibleSize, ...
                                                  hiddenSize, lambda, ...
                                                  sparsityParam, beta, ...
                                                  patches), theta);

%This prints numgrad and grad as two column vectors with the same dimensions as
%theta (here [308308, 1]; the [3289, 1] size applied to the smaller 64/25 model
%of the previous exercise).
% Use this to visually compare the gradients side by side
disp([numgrad grad]); 

% Compare numerically computed gradients with the ones obtained from backpropagation
diff = norm(numgrad-grad)/norm(numgrad+grad);

fprintf('Norm of the difference between numerical and analytical gradient (should be < 1e-9)\n\n');

disp(diff); % Should be small. In our implementation, these values are
            % usually less than 1e-9.

            % When you got this working, Congratulations!!! 

%%======================================================================
%}

%% STEP 4: After verifying that your implementation of
%  sparseAutoencoderCost is correct, You can start training your sparse
%  autoencoder with minFunc (L-BFGS).

%  Randomly initialize the parameters
theta = initializeParameters(hiddenSize, visibleSize);

%  Use minFunc to minimize the function
addpath minFunc/
options.Method = 'lbfgs'; % Here, we use L-BFGS to optimize our cost
                          % function. Generally, for minFunc to work, you
                          % need a function pointer with two outputs: the
                          % function value and the gradient. In our problem,
                          % sparseAutoencoderCost.m satisfies this.
options.maxIter = 400;	  % Maximum number of iterations of L-BFGS to run 
options.display = 'on';


[opttheta, costEnd] = minFunc( @(p) sparseAutoencoderCost(p, ...
                                   visibleSize, hiddenSize, ...
                                   lambda, sparsityParam, ...
                                   beta, patches), ...
                              theta, options);

%%======================================================================
%% STEP 5: Visualization 

W1 = reshape(opttheta(1:hiddenSize*visibleSize), hiddenSize, visibleSize);
figure;
display_network(W1', 12); 
set(gcf,'NumberTitle','off');
set(gcf,'Name','First-layer weights of the trained sparse autoencoder');

print -djpeg weights.jpg   % save the visualization to a file 

  Comparing the printed cost values also shows that the larger the model, the larger the cost: the reconstruction error is now summed over 784 output units instead of 64, and the weight-decay term covers far more weights.

  The result of the experiment is shown below: the learned filters look like pen strokes.

posted @ 2015-11-14 15:50  菜鸡一枚