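5. Softmax Regression

Summary:

1) This is an introduction to softmax. Only now do I realize that UFLDL dives straight into the material without much detailed explanation.

2) That said, the details of each aspect are presented thoroughly.

3) I more or less learned how to use the bsxfun function.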

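Questions:

1) Because I do not understand matrix operations, the basic principles of softmax, or L-BFGS, my understanding is still incomplete.

2) In softmaxCost.m, I do not understand the combination of full and sparse in groundTruth = full(sparse(labels, 1:numCases, 1));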

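Regarding question 2, a minimal sketch of what full(sparse(...)) does may help. sparse(i, j, v) builds a sparse matrix whose entry (i(k), j(k)) equals v, and full converts that sparse storage into an ordinary dense matrix. The toy labels below are made up for illustration and are not the MNIST labels used in the exercise.

```matlab
% Toy example (hypothetical values, not the data from the exercise).
labels   = [2; 1; 3; 2];        % class label of each of the 4 examples
numCases = numel(labels);

% sparse(i, j, v) places v at row i(k), column j(k) for every k, so
% column k receives a single 1 in the row given by labels(k).
S = sparse(labels, 1:numCases, 1);

% full() converts the sparse storage into an ordinary dense matrix.
groundTruth = full(S);
disp(groundTruth);
%      0     1     0     0
%      1     0     0     1
%      0     0     1     0
```

One caveat: the number of rows is inferred from the largest label present. If a batch might not contain the highest class, passing the size explicitly, as in sparse(labels, 1:numCases, 1, numClasses, numCases), should keep the matrix at numClasses rows.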

　　UFLDL Softmax Regression

　　The exercise requires downloading the code: softmax_exercise.zip

　　softmaxCost.m

function [cost, grad] = softmaxCost(theta, numClasses, inputSize, lambda, data, labels)

% numClasses - the number of classes 
% inputSize - the size N of the input vector
% lambda - weight decay parameter
% data - the N x M input matrix, where each column data(:, i) corresponds to
%        a single test set
% labels - an M x 1 matrix containing the labels corresponding for the input data
%

% Unroll the parameters from theta
% theta arrives as a randomly initialized column vector; reshape it into a numClasses x inputSize matrix
theta = reshape(theta, numClasses, inputSize);

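numCases = size(data, 2); % number of examples
% In the statement below, labels has size [60000,1] and numCases is the number of examples.
% To be honest, I still do not fully understand this statement.
% The resulting matrix groundTruth has size [10,60000]: in each column, the row given by that example's label is 1 and all other entries are 0.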
groundTruth = full(sparse(labels, 1:numCases, 1));
cost = 0;

thetagrad = zeros(numClasses, inputSize);

%% ---------- YOUR CODE HERE --------------------------------------
%  Instructions: Compute the cost and gradient for softmax regression.
%                You need to compute thetagrad and cost.
%                The groundTruth matrix might come in handy.
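% Honestly, this code is adapted from solutions found online; I could not have written it myself. I am still not comfortable with the matrix operations below.
% The subtraction below works because the softmax parameters are redundant: subtracting each column's maximum leaves the probabilities unchanged and keeps the values from becoming too large after exp.
% theta*data has size [10,60000]
M=bsxfun(@minus,theta*data,max(theta*data,[],1));
M=exp(M);
% sum(M) is, for each example (column), the sum of the exponentials over all classes, i.e. the normalizer that makes the probabilities sum to 1
p=bsxfun(@rdivide,M,sum(M));
% The two sums in the formula below correspond to summing over the classes within each example first, then summing over the examples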
cost=-1/numCases*sum(sum(groundTruth.*log(p)))+lambda/2*sum(sum(theta.*theta));
thetagrad=-1/numCases*(groundTruth-p)*data'+lambda*theta;


% ------------------------------------------------------------------
% Unroll the gradient matrices into a vector for minFunc
grad = [thetagrad(:)];
end
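
The max-subtraction step in softmaxCost.m can be checked in isolation. The sketch below uses made-up scores in place of theta*data (an assumption for illustration only) and suggests that subtracting each column's maximum leaves the softmax probabilities unchanged while avoiding overflow in exp.

```matlab
% Hypothetical scores standing in for theta*data (3 classes, 2 examples).
scores = [1 1000; 2 1001; 3 1002];

% Naive softmax: exp(1000) overflows to Inf, so column 2 becomes NaN.
naive = exp(scores);
naive = bsxfun(@rdivide, naive, sum(naive, 1));

% Stable softmax: shift every column so that its maximum is 0 first.
shifted = bsxfun(@minus, scores, max(scores, [], 1));
stable  = exp(shifted);
stable  = bsxfun(@rdivide, stable, sum(stable, 1));

disp(naive);    % second column is NaN
disp(stable);   % both columns are finite and identical
```

Both columns of stable come out the same because the two columns of scores differ only by a constant, which is exactly the redundancy the shift relies on.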

　　softmaxPredict.m

function [pred] = softmaxPredict(softmaxModel, data)

% softmaxModel - model trained using softmaxTrain
% data - the N x M input matrix, where each column data(:, i) corresponds to
%        a single test set
%
% Your code should produce the prediction matrix 
% pred, where pred(i) is argmax_c P(y(c) | x(i)).
 
% Unroll the parameters from theta
theta = softmaxModel.optTheta;  % this provides a numClasses x inputSize matrix
pred = zeros(1, size(data, 2));

%% ---------- YOUR CODE HERE --------------------------------------
%  Instructions: Compute pred using theta assuming that the labels start 
%                from 1.

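% In the softmax exercise, theta has size [10,784] and data has size [784,10000]
% theta*data has size [10,10000]
% The goal here is the predicted label, i.e. the row index of the maximum of each column
% This uses the two-output form of MATLAB's max, [C,I] = max(...), where I holds the indices of the maxima
% Softmax exponentiates with base e, exp is monotonically increasing, and every class in a column shares the same normalizer, so the row where theta*data is largest is already the predicted label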
[~,pred]=max(theta*data);

% ---------------------------------------------------------------------

end
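
The shortcut used in softmaxPredict.m, taking the argmax of theta*data directly instead of computing probabilities, can be illustrated with a small sketch. The scores below are made-up values standing in for theta*data, not output from a trained model.

```matlab
% Hypothetical scores standing in for theta*data (3 classes, 4 examples).
scores = [ 0.2 -1.0  3.0  0.0;
           1.5  0.3  2.9  0.1;
          -0.7  2.2  0.5  4.0];

% Softmax probabilities, normalized column by column.
p = exp(bsxfun(@minus, scores, max(scores, [], 1)));
p = bsxfun(@rdivide, p, sum(p, 1));

% The second output of max is the row index of each column's maximum.
[~, predFromScores] = max(scores, [], 1);
[~, predFromProbs]  = max(p, [], 1);

disp(predFromScores);                           % 2 3 1 3
disp(isequal(predFromScores, predFromProbs));   % 1
```

Passing the dimension explicitly, as in max(scores, [], 1), is a small design choice: it keeps the column-wise behavior even when there is only a single class row, where max(scores) alone would reduce along the row instead.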

posted @ 2015-11-16 20:27 菜鸡一枚
