9. Convolution and Pooling
- Summary:
1) The weights learned in the previous linear-decoder exercise are used as convolution kernels; the images are convolved and pooled, and the pooled features are finally fed into a softmax classifier.
2) One special point of this approach: because the raw patches were zero-meaned and ZCA-whitened during training, the linear decoder's parameters have to be adjusted accordingly so that raw pixels can be fed in directly, something earlier exercises did not cover (see the derivation after this list).
3) MATLAB is a poor fit for deep learning; even this exercise is very time- and memory-consuming, and running anything larger in it is a waste of time.
4) This exercise does not include the CNN backpropagation; that comes up in the next one.
5) For the 3 image channels, the per-channel convolution results are summed and then passed through the sigmoid. This seems reasonable, and it illustrates one way a CNN can handle 3-channel images.
6) Non-overlapping mean pooling is used here.
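The adjustment in point 2 is worth writing out. With first-layer weights $W$ and bias $b$ from the autoencoder, ZCA whitening matrix $T$ (ZCAWhite) and mean patch $\mu$ (meanPatch), the hidden activation on a preprocessed patch equals an activation on the raw patch $x$ with folded-in parameters:

$$\sigma\bigl(W\,T(x-\mu)+b\bigr)=\sigma\bigl((WT)\,x+(b-WT\mu)\bigr)$$

So the equivalent weights are WT = W*ZCAWhite and the equivalent bias is b - WT*meanPatch, which is exactly what cnnConvolve.m precomputes before convolving raw images.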
- Questions: (none recorded)
- Ideas: (none recorded)
Required downloads for this exercise: code cnn_exercise.zip and dataset stlSubset.zip
cnnExercise.m
clear; close all; clc;
disp('Currently running script:');
disp([mfilename('fullpath'), '.m']);
%% CS294A/CS294W Convolutional Neural Networks Exercise

%  Instructions
%  ------------
%
%  This file contains code that helps you get started on the
%  convolutional neural networks exercise. In this exercise, you will only
%  need to modify cnnConvolve.m and cnnPool.m. You will not need to modify
%  this file.

%%======================================================================
%% STEP 0: Initialization
%  Here we initialize some parameters used for the exercise.

imageDim = 64;         % image dimension
imageChannels = 3;     % number of channels (rgb, so 3)
patchDim = 8;          % patch dimension
numPatches = 50000;    % number of patches
visibleSize = patchDim * patchDim * imageChannels;  % number of input units (192)
outputSize = visibleSize;  % number of output units
hiddenSize = 400;          % number of hidden units
epsilon = 0.1;             % epsilon for ZCA whitening
poolDim = 19;              % dimension of pooling region

%%======================================================================
%% STEP 1: Train a sparse autoencoder (with a linear decoder) to learn
%  features from color patches. If you have completed the linear decoder
%  exercise, use the features that you have obtained from that exercise,
%  loading them into optTheta. Recall that we have to keep around the
%  parameters used in whitening (i.e., the ZCA whitening matrix and the
%  meanPatch)

% --------------------------- YOUR CODE HERE --------------------------
% Train the sparse autoencoder and fill the following variables with
% the optimal parameters:

optTheta = zeros(2*hiddenSize*visibleSize + hiddenSize + visibleSize, 1);
ZCAWhite = zeros(visibleSize, visibleSize);
meanPatch = zeros(visibleSize, 1);
% Load the three variables above with the values learned in the previous exercise
load STL10Features.mat

% --------------------------------------------------------------------

% Display and check to see that the features look good
% Extract the first-layer weight matrix
W = reshape(optTheta(1:visibleSize * hiddenSize), hiddenSize, visibleSize);
% Hidden-layer bias
b = optTheta(2*hiddenSize*visibleSize + 1 : 2*hiddenSize*visibleSize + hiddenSize);

figure;
displayColorNetwork((W*ZCAWhite)');
set(gcf, 'NumberTitle', 'off');
set(gcf, 'Name', 'Features learned by the linear-decoder autoencoder');

%%======================================================================
%{
%% STEP 2: Implement and test convolution and pooling
%  In this step, you will implement convolution and pooling, and test them
%  on a small part of the data set to ensure that you have implemented
%  these two functions correctly. In the next step, you will actually
%  convolve and pool the features with the STL10 images.

%% STEP 2a: Implement convolution
%  Implement convolution in the function cnnConvolve in cnnConvolve.m
%  Note that we have to preprocess the images in the exact same way
%  we preprocessed the patches before we can obtain the feature activations.

load stlTrainSubset.mat % loads numTrainImages, trainImages, trainLabels

%% Use only the first 8 images for testing
% trainImages has size [64, 64, 3, 2000]
convImages = trainImages(:, :, :, 1:8);

% NOTE: Implement cnnConvolve in cnnConvolve.m first!
convolvedFeatures = cnnConvolve(patchDim, hiddenSize, convImages, W, b, ZCAWhite, meanPatch);

%% STEP 2b: Checking your convolution
%  To ensure that you have convolved the features correctly, we have
%  provided some code to compare the results of your convolution with
%  activations from the sparse autoencoder

% For 1000 random points
for i = 1:1000
    featureNum = randi([1, hiddenSize]);
    imageNum = randi([1, 8]);
    imageRow = randi([1, imageDim - patchDim + 1]);
    imageCol = randi([1, imageDim - patchDim + 1]);

    patch = convImages(imageRow:imageRow + patchDim - 1, imageCol:imageCol + patchDim - 1, :, imageNum);
    patch = patch(:);
    patch = patch - meanPatch;
    patch = ZCAWhite * patch;

    features = feedForwardAutoencoder(optTheta, hiddenSize, visibleSize, patch);

    if abs(features(featureNum, 1) - convolvedFeatures(featureNum, imageNum, imageRow, imageCol)) > 1e-9
        fprintf('Convolved feature does not match activation from autoencoder\n');
        fprintf('Feature Number    : %d\n', featureNum);
        fprintf('Image Number      : %d\n', imageNum);
        fprintf('Image Row         : %d\n', imageRow);
        fprintf('Image Column      : %d\n', imageCol);
        fprintf('Convolved feature : %0.5f\n', convolvedFeatures(featureNum, imageNum, imageRow, imageCol));
        fprintf('Sparse AE feature : %0.5f\n', features(featureNum, 1));
        error('Convolved feature does not match activation from autoencoder');
    end
end

disp('Congratulations! Your convolution code passed the test.');

%% STEP 2c: Implement pooling
%  Implement pooling in the function cnnPool in cnnPool.m

% NOTE: Implement cnnPool in cnnPool.m first!
pooledFeatures = cnnPool(poolDim, convolvedFeatures);

%% STEP 2d: Checking your pooling
%  To ensure that you have implemented pooling, we will use your pooling
%  function to pool over a test matrix and check the results.

testMatrix = reshape(1:64, 8, 8);
expectedMatrix = [mean(mean(testMatrix(1:4, 1:4))) mean(mean(testMatrix(1:4, 5:8))); ...
                  mean(mean(testMatrix(5:8, 1:4))) mean(mean(testMatrix(5:8, 5:8)))];

testMatrix = reshape(testMatrix, 1, 1, 8, 8);

pooledFeatures = squeeze(cnnPool(4, testMatrix));

if ~isequal(pooledFeatures, expectedMatrix)
    disp('Pooling incorrect');
    disp('Expected');
    disp(expectedMatrix);
    disp('Got');
    disp(pooledFeatures);
else
    disp('Congratulations! Your pooling code passed the test.');
end
%}

%%======================================================================
%% STEP 3: Convolve and pool with the dataset
%  In this step, you will convolve each of the features you learned with
%  the full large images to obtain the convolved features. You will then
%  pool the convolved features to obtain the pooled features for
%  classification.
%
%  Because the convolved features matrix is very large, we will do the
%  convolution and pooling 50 features at a time to avoid running out of
%  memory. Reduce this number if necessary

stepSize = 50;
assert(mod(hiddenSize, stepSize) == 0, 'stepSize should divide hiddenSize');

load stlTrainSubset.mat % loads numTrainImages, trainImages, trainLabels
load stlTestSubset.mat  % loads numTestImages,  testImages,  testLabels

pooledFeaturesTrain = zeros(hiddenSize, numTrainImages, ...
    floor((imageDim - patchDim + 1) / poolDim), ...
    floor((imageDim - patchDim + 1) / poolDim));
pooledFeaturesTest = zeros(hiddenSize, numTestImages, ...
    floor((imageDim - patchDim + 1) / poolDim), ...
    floor((imageDim - patchDim + 1) / poolDim));

tic();
%{
for convPart = 1:(hiddenSize / stepSize)

    featureStart = (convPart - 1) * stepSize + 1;
    featureEnd = convPart * stepSize;

    fprintf('Step %d: features %d to %d\n', convPart, featureStart, featureEnd);
    Wt = W(featureStart:featureEnd, :);
    bt = b(featureStart:featureEnd);

    fprintf('Convolving and pooling train images\n');
    convolvedFeaturesThis = cnnConvolve(patchDim, stepSize, ...
        trainImages, Wt, bt, ZCAWhite, meanPatch);
    pooledFeaturesThis = cnnPool(poolDim, convolvedFeaturesThis);
    pooledFeaturesTrain(featureStart:featureEnd, :, :, :) = pooledFeaturesThis;
    toc();
    clear convolvedFeaturesThis pooledFeaturesThis;

    fprintf('Convolving and pooling test images\n');
    convolvedFeaturesThis = cnnConvolve(patchDim, stepSize, ...
        testImages, Wt, bt, ZCAWhite, meanPatch);
    pooledFeaturesThis = cnnPool(poolDim, convolvedFeaturesThis);
    pooledFeaturesTest(featureStart:featureEnd, :, :, :) = pooledFeaturesThis;
    toc();
    clear convolvedFeaturesThis pooledFeaturesThis;

end

% You might want to save the pooled features since convolution and pooling takes a long time
save('cnnPooledFeatures.mat', 'pooledFeaturesTrain', 'pooledFeaturesTest');
toc();
%}
load cnnPooledFeatures.mat

%%======================================================================
%% STEP 4: Use pooled features for classification
%  Now, you will use your pooled features to train a softmax classifier,
%  using softmaxTrain from the softmax exercise.
%  Training the softmax classifier for 1000 iterations should take less than
%  10 minutes.

% Add the path to your softmax solution, if necessary
% addpath /path/to/solution/

% Setup parameters for softmax
softmaxLambda = 1e-4;
numClasses = 4;
% Reshape the pooledFeatures to form an input vector for softmax
softmaxX = permute(pooledFeaturesTrain, [1 3 4 2]);
softmaxX = reshape(softmaxX, numel(pooledFeaturesTrain) / numTrainImages, ...
    numTrainImages);
softmaxY = trainLabels;

options = struct;
options.maxIter = 200;
softmaxModel = softmaxTrain(numel(pooledFeaturesTrain) / numTrainImages, ...
    numClasses, softmaxLambda, softmaxX, softmaxY, options);

%%======================================================================
%% STEP 5: Test classifier
%  Now you will test your trained classifier against the test images

softmaxX = permute(pooledFeaturesTest, [1 3 4 2]);
softmaxX = reshape(softmaxX, numel(pooledFeaturesTest) / numTestImages, numTestImages);
softmaxY = testLabels;

[pred] = softmaxPredict(softmaxModel, softmaxX);
acc = (pred(:) == softmaxY(:));
acc = sum(acc) / size(acc, 1);
fprintf('Accuracy: %2.3f%%\n', acc * 100);

% You should expect to get an accuracy of around 80% on the test images.
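The permute/reshape in STEP 4 is easy to get wrong, so here is a minimal sketch (with made-up toy sizes, not the exercise's real dimensions) showing that it puts one image's flattened pooled features into each column of the softmax input:

% Toy check of STEP 4's flattening: pooled features are stored as
% (numFeatures x numImages x poolRows x poolCols) and softmax wants
% one column per image.
P = rand(5, 2, 3, 3);              % 5 features, 2 images, 3x3 pooled maps
X = permute(P, [1 3 4 2]);         % move the image index to the last dimension
X = reshape(X, [], size(P, 2));    % 45 x 2: column k = features of image k
q = squeeze(P(:, 1, :, :));        % image 1's pooled features, 5 x 3 x 3
disp(isequal(X(:, 1), q(:)));      % prints 1: the flattening is per-image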
cnnConvolve.m
function convolvedFeatures = cnnConvolve(patchDim, numFeatures, images, W, b, ZCAWhite, meanPatch)
%cnnConvolve Returns the convolution of the features given by W and b with
%the given images
%
% Parameters:
%  patchDim - patch (feature) dimension
%  numFeatures - number of features
%  images - large images to convolve with, matrix in the form
%           images(r, c, channel, image number)
%  W, b - W, b for features from the sparse autoencoder
%  ZCAWhite, meanPatch - ZCAWhitening and meanPatch matrices used for
%                        preprocessing
%
% Returns:
%  convolvedFeatures - matrix of convolved features in the form
%                      convolvedFeatures(featureNum, imageNum, imageRow, imageCol)

% Extract the number of images, the image dimension, and the number of channels
numImages = size(images, 4);
imageDim = size(images, 1);
%imageChannels = size(images, 3);

% numFeatures is the number of feature maps, i.e. the number of hidden
% units: every image is convolved with that many kernels.

% Instructions:
%   Convolve every feature with every large image here to produce the
%   numFeatures x numImages x (imageDim - patchDim + 1) x (imageDim - patchDim + 1)
%   matrix convolvedFeatures, such that
%   convolvedFeatures(featureNum, imageNum, imageRow, imageCol) is the
%   value of the convolved featureNum feature for the imageNum image over
%   the region (imageRow, imageCol) to (imageRow + patchDim - 1, imageCol + patchDim - 1)
%
% Expected running times:
%   Convolving with 100 images should take less than 3 minutes
%   Convolving with 5000 images should take around an hour
%   (So to save time when testing, you should convolve with less images, as
%   described earlier)

% -------------------- YOUR CODE HERE --------------------
% Precompute the matrices that will be used during the convolution. Recall
% that you need to take into account the whitening and mean subtraction
% steps

% The next two lines fold the zero-mean and ZCA whitening preprocessing into
% the network parameters, so that raw pixel data can be fed in directly.
% W has size [hiddenSize, visibleSize]; ZCAWhite has size [visibleSize, visibleSize].
WT = W * ZCAWhite;      % equivalent weights
% The next line is b = b - W*ZCAWhite*meanPatch: the zero-mean processing of
% the input x is moved into the bias b.
b = b - WT * meanPatch; % equivalent bias
patchSize = patchDim * patchDim;
% --------------------------------------------------------

% Preallocate the convolved features
convolvedFeatures = zeros(numFeatures, numImages, imageDim - patchDim + 1, imageDim - patchDim + 1);
for imageNum = 1:numImages          % loop over images first
  for featureNum = 1:numFeatures    % then over kernels (numFeatures = hiddenSize)

    % convolution of image with feature matrix for each channel
    convolvedImage = zeros(imageDim - patchDim + 1, imageDim - patchDim + 1);
    for channel = 1:3               % each channel has its own kernel slice

      % Obtain the feature (patchDim x patchDim) needed during the convolution
      % ---- YOUR CODE HERE ----
      % Extract this feature's kernel for the current channel
      feature = reshape(WT(featureNum, (channel-1)*patchSize+1 : channel*patchSize), patchDim, patchDim);
      % ------------------------

      % Flip the feature matrix because of the definition of convolution, as explained later
      % flipud flips vertically and fliplr horizontally, so combined they
      % rotate by 180 degrees; per MATLAB's lint hint, rot90(x, 2) is faster:
      % feature = flipud(fliplr(squeeze(feature)));
      % The flip is needed because these weights were trained beforehand and
      % conv2 flips its kernel 180 degrees internally; pre-flipping cancels
      % that, so the weights filter at the correct positions. (If the weights
      % had been learned under the flipped convention, no flip would be
      % needed; here we flip only to reuse conv2.)
      feature = rot90(feature, 2);

      % Obtain the image data for the current channel
      im = squeeze(images(:, :, channel, imageNum));

      % Convolve "feature" with "im", adding the result to convolvedImage
      % be sure to do a 'valid' convolution
      % ---- YOUR CODE HERE ----
      % convolvedFeatures has no channel dimension, so the simplest approach
      % is to sum the feature maps computed from the 3 channels. Averaging
      % instead (adding 1/3*conv2(...)) might seem safer against exceeding
      % the value range, but it fails the check in cnnExercise.m
      % ('Convolved feature does not match activation from autoencoder').
      % Multi-channel responses are summed without averaging because the
      % sigmoid applied afterwards compresses the output range anyway;
      % averaging would only slow things down, and to some extent the sum
      % also reduces the dependence on color.
      convolvedImage = convolvedImage + conv2(im, feature, 'valid');
      % ------------------------

    end

    % Subtract the bias unit (correcting for the mean subtraction as well)
    % Then, apply the sigmoid function to get the hidden activation
    % ---- YOUR CODE HERE ----
    % In MATLAB, adding a scalar to a matrix adds it to every element
    convolvedImage = sigmoid(convolvedImage + b(featureNum));
    % ------------------------

    % The convolved feature is the sum of the convolved values for all channels
    convolvedFeatures(featureNum, imageNum, :, :) = convolvedImage;
  end
end

end

function sigm = sigmoid(x)
  sigm = 1 ./ (1 + exp(-x));
end
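Since the pre-flip with rot90 is the subtlest line in cnnConvolve.m, here is a tiny sketch (with hypothetical data) of what it buys: conv2 flips its kernel 180 degrees by definition, so pre-rotating the kernel makes conv2 compute the plain cross-correlation, the same thing filter2 computes directly:

% conv2 flips its kernel; rot90(k, 2) cancels the flip, so the result
% matches cross-correlation with the unflipped kernel (filter2).
im = magic(6);                        % hypothetical 6x6 "image"
k  = rand(3);                         % hypothetical 3x3 learned kernel
a  = conv2(im, rot90(k, 2), 'valid'); % pre-flipped convolution, as in cnnConvolve.m
c  = filter2(k, im, 'valid');         % cross-correlation with k directly
disp(max(abs(a(:) - c(:))));          % ~0, up to floating-point error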
cnnPool.m
function pooledFeatures = cnnPool(poolDim, convolvedFeatures)
%cnnPool Pools the given convolved features
%
% Parameters:
%  poolDim - dimension of pooling region
%  convolvedFeatures - convolved features to pool (as given by cnnConvolve)
%                      convolvedFeatures(featureNum, imageNum, imageRow, imageCol)
%
% Returns:
%  pooledFeatures - matrix of pooled features in the form
%                   pooledFeatures(featureNum, imageNum, poolRow, poolCol)

numImages = size(convolvedFeatures, 2);
numFeatures = size(convolvedFeatures, 1);
convolvedDim = size(convolvedFeatures, 3);

% floor rounds toward negative infinity: together with the pooling stride it
% fixes the pooled output size (partial regions at the border are dropped).
pooledFeatures = zeros(numFeatures, numImages, floor(convolvedDim / poolDim), floor(convolvedDim / poolDim));
% Define the pooled dimension once for convenience
pooledDim = floor(convolvedDim / poolDim);

% -------------------- YOUR CODE HERE --------------------
% Instructions:
%   Now pool the convolved features in regions of poolDim x poolDim,
%   to obtain the
%   numFeatures x numImages x (convolvedDim/poolDim) x (convolvedDim/poolDim)
%   matrix pooledFeatures, such that
%   pooledFeatures(featureNum, imageNum, poolRow, poolCol) is the
%   value of the featureNum feature for the imageNum image pooled over the
%   corresponding (poolRow, poolCol) pooling region
%   (see http://ufldl/wiki/index.php/Pooling )
%
%   Use mean pooling here.

% This mirrors the loop structure of cnnConvolve.m
for imageNum = 1:numImages          % loop over images first
  for featureNum = 1:numFeatures    % then over features (numFeatures = hiddenSize)
    % then iterate over the pooling regions
    for rowNum = 1:pooledDim
      for colNum = 1:pooledDim
        pooledFeatures(featureNum, imageNum, rowNum, colNum) = ...
            mean(mean(convolvedFeatures(featureNum, imageNum, ...
            (rowNum-1)*poolDim+1 : rowNum*poolDim, (colNum-1)*poolDim+1 : colNum*poolDim)));
      end
    end
  end
end

end
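The four nested loops are clear but slow in MATLAB. Non-overlapping mean pooling of one 2-D feature map can also be written as a box filter followed by subsampling; below is a sketch of that variant (meanPoolSketch is a hypothetical helper, not part of the exercise code), equivalent to the loop version up to floating-point rounding:

function pooled = meanPoolSketch(fmap, poolDim)
% meanPoolSketch  Non-overlapping mean pooling of a single 2-D feature map.
% (Hypothetical helper, not part of the exercise code.)
avg = conv2(fmap, ones(poolDim) / poolDim^2, 'valid'); % mean of every window
pooled = avg(1:poolDim:end, 1:poolDim:end);            % keep block-aligned windows only
end

To use it, keep the two outer loops over featureNum and imageNum and call it on squeeze(convolvedFeatures(featureNum, imageNum, :, :)).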