UFLDL Tutorial Study Notes (2)
Course link: http://ufldl.stanford.edu/tutorial/supervised/LogisticRegression/
This section is mainly about the concept of the gradient. In the exercise, the analytic gradient of the earlier linear regression model is compared against a gradient computed from the definition, and the error between the two is measured.
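Here "computed from the definition" means a numerical central-difference approximation. For a randomly chosen coordinate $j$, the checker shown later estimates

$$\frac{\partial J(\theta)}{\partial \theta_j} \approx \frac{J(\theta + \delta e_j) - J(\theta - \delta e_j)}{2\delta},$$

where $e_j$ is the $j$-th standard basis vector and $\delta$ is a small step (the grad_check code below uses $\delta = 10^{-3}$).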
Linear regression produces a continuous value, but sometimes we want a prediction that is 0 or 1, and that is where logistic regression comes in. Since we now want a probability, the earlier hypothesis function is clearly no longer suitable; a new one is needed, namely the sigmoid (logistic) function:

$$h_\theta(x) = \sigma(\theta^\top x) = \frac{1}{1 + \exp(-\theta^\top x)}$$
Our goal is to optimize theta: when an example x has label 1, the predicted probability of 1 should be as large as possible, and when the label is 0, as small as possible.
Naturally, the objective function has to change as well (for background on this function, see the NTU Machine Learning Foundations notes: http://beader.me/mlnotebook/section3/logistic-regression.html). It is the negative log-likelihood:

$$J(\theta) = -\sum_i \Big( y^{(i)} \log h_\theta(x^{(i)}) + \big(1 - y^{(i)}\big) \log\big(1 - h_\theta(x^{(i)})\big) \Big)$$
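Differentiating this cost gives the gradient needed for the exercise; it has the same form as the linear regression gradient, with $h_\theta(x)$ in place of $\theta^\top x$:

$$\frac{\partial J(\theta)}{\partial \theta_j} = \sum_i x_j^{(i)} \big( h_\theta(x^{(i)}) - y^{(i)} \big)$$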
The exercise is to train a classifier that recognizes handwritten 0s and 1s; as before, the main thing to keep straight is the dimension of each variable. Running it, the training accuracy and the test accuracy both come out to 100%.
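This post does not show logistic_regression.m itself; here is a minimal sketch of what it might look like, assuming the same [f, g] = fun(theta, X, y) interface that minFunc and grad_check expect, with examples stored one per column as in the starter code:

function [f, g] = logistic_regression(theta, X, y)
  % X: n-by-m matrix, one example per column; y: 1-by-m row of 0/1 labels.
  h = 1 ./ (1 + exp(-theta' * X)); % sigmoid of theta'*x for every example, 1-by-m
  f = -sum(y .* log(h) + (1 - y) .* log(1 - h)); % negative log-likelihood
  g = X * (h - y)'; % gradient with respect to theta, an n-by-1 vector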
Reference: http://blog.csdn.net/lingerlanlan/article/details/38390955
I have added a few comments to the code:
The first block of code is adapted from ex1a_linreg.m; its main job is to produce the training and test data along with their labels.
%
%This exercise uses data from the UCI repository:
% Bache, K. & Lichman, M. (2013). UCI Machine Learning Repository
% http://archive.ics.uci.edu/ml
% Irvine, CA: University of California, School of Information and Computer Science.
%
%Data created by:
% Harrison, D. and Rubinfeld, D.L.
% ''Hedonic prices and the demand for clean air''
% J. Environ. Economics & Management, vol.5, 81-102, 1978.
%
addpath ../common
addpath ../common/minFunc_2012/minFunc
addpath ../common/minFunc_2012/minFunc/compiled

% Load housing data from file.
data = load('housing.data');
data = data'; % put examples in columns

% Include a row of 1s as an additional intercept feature.
data = [ ones(1,size(data,2)); data ];

% Shuffle examples.
data = data(:, randperm(size(data,2))); % randperm gives a random permutation of the column indices

% Split into train and test sets, taking the corresponding labels as well.
% The last row of 'data' is the median home price.
train.X = data(1:end-1,1:400);
train.y = data(end,1:400);

test.X = data(1:end-1,401:end);
test.y = data(end,401:end);

m = size(train.X,2);
n = size(train.X,1);

% Initialize the coefficient vector theta to random values.
theta = rand(n,1); % rand(n,1) returns an n-by-1 vector of values in (0,1)

% Run the minFunc optimizer with linear_regression.m as the objective.
%
% TODO: Implement the linear regression objective and gradient computations
% in linear_regression.m
%
% tic;
% options = struct('MaxIter', 200);
% theta = minFunc(@linear_regression, theta, options, train.X, train.y);
% fprintf('Optimization took %f seconds.\n', toc);

% Instead of optimizing, check the analytic gradient of linear_regression
% against the numerical estimate at 200 randomly chosen coordinates.
grad_check(@linear_regression, theta, 200, train.X, train.y)
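The grad_check call at the end needs linear_regression.m from the previous exercise. A minimal sketch, assuming the usual half-sum-of-squares objective and the same [f, g] interface (not necessarily the author's exact file):

function [f, g] = linear_regression(theta, X, y)
  % X: n-by-m matrix of examples, one per column; y: 1-by-m target values.
  err = theta' * X - y; % 1-by-m vector of residuals
  f = 0.5 * sum(err .^ 2); % half sum-of-squared-errors objective
  g = X * err'; % gradient with respect to theta, n-by-1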
The second block of code is the grad_check.m function.
function average_error = grad_check(fun, theta0, num_checks, varargin)

  delta = 1e-3;
  sum_error = 0;

  fprintf(' Iter       i             err');
  fprintf('           g_est               g               f\n')

  for i = 1:num_checks
    T = theta0;
    j = randsample(numel(T), 1); % randomly pick one index from 1..numel(T)

    T0 = T; T0(j) = T0(j) - delta;
    T1 = T; T1(j) = T1(j) + delta;

    [f, g] = fun(T, varargin{:}); % f: objective value at T, g: analytic gradient
    f0 = fun(T0, varargin{:});
    f1 = fun(T1, varargin{:});

    g_est = (f1 - f0) / (2*delta); % central-difference estimate of g(j)
    error = abs(g(j) - g_est);

    fprintf('% 5d % 6d % 15g % 15f % 15f % 15f\n', ...
            i, j, error, g(j), g_est, f);

    sum_error = sum_error + error;
  end

  average_error = sum_error / num_checks; % average absolute error over the sampled coordinates
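Once logistic_regression.m is written, the same checker can verify its gradient as well, e.g. (with the 0/1 digit data from the logistic regression exercise in place of the housing data):

grad_check(@logistic_regression, theta, 200, train.X, train.y)

The central difference is used instead of a one-sided one because its truncation error shrinks like delta^2 rather than delta, so with delta = 1e-3 the err column should come out several orders of magnitude smaller than g itself.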