Logistic Regression
Reference page:
http://ufldl.stanford.edu/tutorial/supervised/LogisticRegression/
In linear regression we learn a function whose prediction for an input x is a continuous real value, but sometimes we want a discrete output: in a binary classification problem we want to assign a class label to each input. In logistic regression we instead learn a function that predicts the probability that a sample x belongs to class 1:

P(y = 1 \mid x) = h_\theta(x) = \sigma(\theta^\top x) = \frac{1}{1 + \exp(-\theta^\top x)}

The function \sigma(z) = \frac{1}{1 + e^{-z}} is called the sigmoid or logistic function, which is where logistic regression gets its name.
When h_\theta(x) > 0.5, the sample is assigned to class 1; otherwise it is assigned to class 0.
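The exercise code below relies on a sigmoid helper from the UFLDL common directory; its implementation is not shown in this post, but a minimal sketch, assuming it simply applies the logistic function elementwise, would be:

function a = sigmoid(z)
  % Elementwise logistic function; works on scalars, vectors, and matrices.
  a = 1 ./ (1 + exp(-z));
end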
Objective function:

J(\theta) = -\sum_{i} \left( y^{(i)} \log h_\theta(x^{(i)}) + (1 - y^{(i)}) \log\left(1 - h_\theta(x^{(i)})\right) \right)

Note: because y^{(i)} \in \{0, 1\}, exactly one of the coefficients y^{(i)} and (1 - y^{(i)}) equals 1 for each sample, so only one term inside the sum is active.
Gradient:

\nabla_\theta J(\theta) = \sum_{i} x^{(i)} \left( h_\theta(x^{(i)}) - y^{(i)} \right)
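This gradient follows from the identity \sigma'(z) = \sigma(z)(1 - \sigma(z)). For a single example, writing h = h_\theta(x):

\frac{\partial J}{\partial \theta_j}
= -\left( \frac{y}{h} - \frac{1 - y}{1 - h} \right) \frac{\partial h}{\partial \theta_j}
= -\frac{y - h}{h(1 - h)} \cdot h(1 - h)\, x_j
= (h - y)\, x_j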
Matrix form, with X an n×m matrix holding one example per column, y a 1×m row vector of labels, and \hat{y} = \sigma(\theta^\top X) the 1×m row vector of predictions:

Objective function:

J(\theta) = -\frac{1}{m} \sum \left( y \circ \log \hat{y} + (1 - y) \circ \log(1 - \hat{y}) \right)

where \circ denotes elementwise multiplication and the sum runs over all m entries.

Gradient:

\nabla_\theta J(\theta) = -\frac{1}{m}\, X \left( y - \hat{y} \right)^\top
Note:
- m is the number of samples; the objective and gradient must be divided by m. Without this averaging, the magnitude of the objective would grow with the training set, and you would learn a different function for different sample counts. A finite-difference check of the gradient (including the 1/m factor) is sketched below.
- Note the elementwise products (MATLAB .*) in the objective; they are not matrix multiplications.
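A quick way to confirm the analytic gradient is a numerical finite-difference check. A minimal sketch, assuming the logistic_regression function listed below is on the MATLAB path; the problem sizes here are made up for illustration:

% Numerical gradient check on a small random problem.
n = 5; m = 20;
X = [ones(1,m); rand(n-1,m)];   % random features plus an intercept row
y = double(rand(1,m) > 0.5);    % random 0/1 labels
theta = rand(n,1)*0.001;
[f, g] = logistic_regression(theta, X, y);
epsilon = 1e-6;
g_num = zeros(n,1);
for j = 1:n
  e = zeros(n,1); e(j) = epsilon;
  g_num(j) = ( logistic_regression(theta+e, X, y) ...
             - logistic_regression(theta-e, X, y) ) / (2*epsilon);
end
fprintf('max |analytic - numerical| = %g\n', max(abs(g - g_num)));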
Experiment:
Only the samples labeled 0 or 1 from the MNIST dataset are used. Each image is 28×28 pixels.
Results:
Classification accuracy: 100%
Code:
ex1b_logreg.m:
addpath ../common
addpath ../common/minFunc_2012/minFunc
addpath ../common/minFunc_2012/minFunc/compiled

% Load the MNIST data for this exercise.
% train.X and test.X will contain the training and testing images.
%   Each matrix has size [n,m] where:
%      m is the number of examples.
%      n is the number of pixels in each image.
% train.y and test.y will contain the corresponding labels (0 or 1).
binary_digits = true;
[train,test] = ex1_load_mnist(binary_digits);

% Add row of 1s to the dataset to act as an intercept term.
train.X = [ones(1,size(train.X,2)); train.X];
test.X = [ones(1,size(test.X,2)); test.X];

% Training set dimensions
m=size(train.X,2);
n=size(train.X,1);

% Train logistic regression classifier using minFunc
options = struct('MaxIter', 100);

% First, we initialize theta to some small random values.
theta = rand(n,1)*0.001;

% Call minFunc with the logistic_regression.m file as the objective function.
%
% TODO: Implement batch logistic regression in the logistic_regression.m file!
%
tic;
theta=minFunc(@logistic_regression, theta, options, train.X, train.y);
fprintf('Optimization took %f seconds.\n', toc);

% Now, call minFunc again with logistic_regression_vec.m as objective.
%
% TODO: Implement batch logistic regression in logistic_regression_vec.m using
% MATLAB's vectorization features to speed up your code. Compare the running
% time for your logistic_regression.m and logistic_regression_vec.m implementations.
%
% Uncomment the lines below to run your vectorized code.
%theta = rand(n,1)*0.001;
%tic;
%theta=minFunc(@logistic_regression_vec, theta, options, train.X, train.y);
%fprintf('Optimization took %f seconds.\n', toc);

% Print out training accuracy.
tic;
accuracy = binary_classifier_accuracy(theta,train.X,train.y);
fprintf('Training accuracy: %2.1f%%\n', 100*accuracy);

% Print out accuracy on the test set.
accuracy = binary_classifier_accuracy(theta,test.X,test.y);
fprintf('Test accuracy: %2.1f%%\n', 100*accuracy);
logistic_regression.m:
function [f,g] = logistic_regression(theta, X, y)
  %
  % Arguments:
  %   theta - A column vector containing the parameter values to optimize.
  %   X - The examples stored in a matrix.
  %       X(i,j) is the i'th coordinate of the j'th example.
  %   y - The label for each example. y(j) is the j'th example's label.
  %
  m=size(X,2);

  % initialize objective value and gradient.
  f = 0;
  g = zeros(size(theta));

  %
  % TODO: Compute the objective function by looping over the dataset and summing
  %       up the objective values for each example. Store the result in 'f'.
  %
  % TODO: Compute the gradient of the objective by looping over the dataset and
  %       summing up the gradients (df/dtheta) for each example. Store the result in 'g'.
  %
%%% YOUR CODE HERE %%%
  % theta : n * 1
  % X     : n * m
  % y     : 1 * m
  y_hat = sigmoid( theta' * X );                              % 1*m, predicted P(y=1|x)
  error = y .* log( y_hat ) + ( 1 - y ) .* log( 1 - y_hat );  % 1*m, per-example log-likelihood
  f = -1/m * sum( error(:) );                                 % average negative log-likelihood
  g = -1/m * X * ( y - y_hat )';                              % n*1 gradient
end
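Note that the script's first TODO asks for a loop-based implementation in logistic_regression.m, while the code above is already vectorized (it really answers the logistic_regression_vec.m TODO). A sketch of what the loop version might look like, under the same argument conventions (the function name here is hypothetical; in the exercise it would live in logistic_regression.m):

function [f,g] = logistic_regression_loop(theta, X, y)
  % Loop-based sketch of the same objective and gradient, one example at a time.
  m = size(X,2);
  f = 0;
  g = zeros(size(theta));
  for j = 1:m
    h = sigmoid(theta' * X(:,j));                   % P(y=1 | x^(j))
    f = f - ( y(j)*log(h) + (1-y(j))*log(1-h) );    % accumulate negative log-likelihood
    g = g + X(:,j) * (h - y(j));                    % accumulate gradient
  end
  f = f / m;                                        % average over samples
  g = g / m;
end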