Conclusions from Face Recognition Experiments
1. On MSE
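Background: after reading the paper "SRDA: An Efficient Algorithm for Large Scale Discriminant Analysis", I kept feeling that when SRDA uses ridge regression, its Equation 22 looks very similar to the MSE (minimum square error) algorithm, so I ran a dedicated comparison of SRDA against MSE.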
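First, the differences:
SRDA optimizes its projection vectors one column at a time with LSQR, whereas MSE simply solves a single regularized linear system in closed form: (X'*X+lambda*I)\X'*Y. (LSQR solves Ax = b, or min ||b - Ax||_2, if damp = 0.)
SRDA appends a constant 1 to every sample's feature vector (a bias term); MSE does not.
SRDA's label matrix has c-1 columns, where c is the number of classes, whereas MSE's label matrix has c columns.
At recognition time, SRDA measures the Euclidean distance between the predicted labels of the test samples and the predicted labels of the training samples, whereas MSE usually measures the Euclidean distance between the predicted labels of the test samples and the true labels.
MSE's labels are usually one-hot vectors such as [1,0,0,0,0,0], whereas SRDA's labels are obtained from the QR decomposition of a random matrix!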
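To make the last difference concrete, here is a small sketch of the two kinds of label matrix. The one-hot part is standard. The QR part is only a reading of the description above (random class codes, a leading all-ones column, economy-size QR, then dropping the first column); it is an assumption, not code copied from SRDA, and the toy sizes `c` and `nPerClass` are made up.

```matlab
% Toy labels: c classes with nPerClass samples each (illustrative values).
c = 4; nPerClass = 5;
gnd = kron((1:c)', ones(nPerClass, 1));   % class index of every sample
n = numel(gnd);

% MSE-style targets: one-hot rows, c columns.
I_c = eye(c);
Y_onehot = I_c(gnd, :);

% SRDA-style targets (assumed construction): give each class a random code,
% force the first column to all ones, orthogonalize with an economy-size QR,
% then drop the column that corresponds to the all-ones vector.
R0 = rand(c, c);          % one random code per class
Z = R0(gnd, :);           % every sample inherits the code of its class
Z(:, 1) = 1;              % first column becomes the all-ones vector
[Q, ~] = qr(Z, 0);
Y_srda = Q(:, 2:end);     % n x (c-1) response matrix
```

Under this construction `Y_srda` has c-1 orthonormal columns, each column sums to zero, and all samples of the same class share one row, so it still encodes class membership, just in a rotated basis.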
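Now for the recognition results (the figures are error rates in %, so lower is better):
With the same lambda penalty, SRDA performs best on the Isolet dataset: SRDA_NCC_ridge: 10.88, SRDA_NN_ridge: 10.88, SRDA_Lasso (NN: 10.49, NCC: 10.84),
whereas the original MSE gives 14.95; switching MSE to compare predicted labels on both sides gives 11.23; and replacing the labels with those generated by the SRDA method, again measuring the distance between predicted labels, gives 11.03.
Personally, I feel there is no real need to define the labels the way SRDA does, since the effect is not significant. Still, it may be worth a try!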
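The two MSE decision rules compared above (14.95 vs. 11.23) differ only in what a test prediction is matched against. Below is a minimal sketch on synthetic data, assuming one-hot targets and the closed-form ridge solution. The toy data and every variable name other than `Y_Class`, `Y`, and `lambda` are invented for illustration, and the sketch does not attempt to reproduce the Isolet numbers.

```matlab
% Toy data: three well-separated classes (illustrative assumption, not Isolet).
c = 3; d = 10; nTrain = 20; nTest = 10; lambda = 1;
centers  = 5 * eye(c, d);                 % one center per class
labTrain = repmat((1:c)', nTrain, 1);
labTest  = repmat((1:c)', nTest, 1);
Xtrain = centers(labTrain, :) + 0.1 * randn(c * nTrain, d);
Xtest  = centers(labTest, :)  + 0.1 * randn(c * nTest, d);

% One-hot targets and closed-form ridge (MSE) projection.
Y_Class = eye(c);                         % row k is the code of class k
Y = Y_Class(labTrain, :);
W = (Xtrain' * Xtrain + lambda * eye(d)) \ (Xtrain' * Y);
P_train = Xtrain * W;
P_test  = Xtest * W;

% Rule 1: match each test prediction to the nearest true label code.
D1 = bsxfun(@plus, sum(P_test.^2, 2), sum(Y_Class.^2, 2)') - 2 * P_test * Y_Class';
[~, pred1] = min(D1, [], 2);

% Rule 2: match each test prediction to the nearest training prediction (1-NN).
D2 = bsxfun(@plus, sum(P_test.^2, 2), sum(P_train.^2, 2)') - 2 * P_test * P_train';
[~, idx] = min(D2, [], 2);
pred2 = labTrain(idx);

err1 = mean(pred1 ~= labTest) * 100;      % error rate (%) under rule 1
err2 = mean(pred2 ~= labTest) * 100;      % error rate (%) under rule 2
```

Rule 2 compares predictions against predictions, so any systematic shrinkage that the ridge penalty introduces affects both sides equally. That is one plausible reason it did better than rule 1 in the experiment.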
It is also worth noting that, for the problem AX=B, LSQR and the closed-form MSE solution (X'*X+lambda*I)\X'*Y give essentially the same projection matrix:
norm(MSE_projection-eigvector,'fro')
ans = 6.6695e-04
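This agreement is expected, because ridge regression is an ordinary least-squares problem on an augmented system, which is exactly the kind of problem LSQR handles. The sketch below uses MATLAB's built-in `lsqr` (not the `lsqr2` called in the test script) on random data; the sizes, tolerance, and iteration limit are arbitrary choices for illustration.

```matlab
% Random data only; n, d, c and lambda are arbitrary illustration values.
n = 200; d = 30; c = 5; lambda = 1;
X = randn(n, d);
Y = randn(n, c);

% Closed-form ridge (MSE) solution.
W_ridge = (X' * X + lambda * eye(d)) \ (X' * Y);

% min ||y - X*w||^2 + lambda*||w||^2 equals plain least squares on
% the augmented system [X; sqrt(lambda)*I] * w = [y; 0].
A = [X; sqrt(lambda) * eye(d)];
B = [Y; zeros(d, c)];

% Solve one column at a time, since lsqr expects a vector right-hand side.
W_lsqr = zeros(d, c);
for k = 1:c
    [W_lsqr(:, k), flag] = lsqr(A, B(:, k), 1e-10, 500);
end

gap = norm(W_ridge - W_lsqr, 'fro');      % Frobenius-norm difference
```

With a tight tolerance the two solutions should agree far more closely than 1e-4. The test script passes 20 as the last argument to `lsqr2`; if that is an iteration cap, early stopping may account for the 6.6695e-04 gap reported above.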
Here is the main script we used in our tests.
% The main function of Cai Deng
% http://www.cad.zju.edu.cn/home/dengcai/Data/ReproduceExp.html#SRDA
clc; clear all;
addpath('SLEP package');
addpath('data set');

%% ----------- This main script is written specifically for the Isolet1 etc. datasets used in the paper ----------%
fea_train = []; gnd_train = [];
load Isolet1
fea_train = [fea_train;fea]; gnd_train = [gnd_train;gnd]; clear fea gnd
load Isolet2
fea_train = [fea_train;fea]; gnd_train = [gnd_train;gnd]; clear fea gnd

fea_test = []; gnd_test = [];
load Isolet4
fea_test = [fea_test;fea]; gnd_test = [gnd_test;gnd]; clear fea gnd
load Isolet5
fea_test = [fea_test;fea]; gnd_test = [gnd_test;gnd]; clear fea gnd

sele_num = 20;  % number of training samples selected per class

% % ***************** PCA dimensionality reduction *********************
% options = [];
% options.PCARatio = 0.98;
% [~,~,~,new_data] = PCA(fea,options);
% fea = new_data;
% % ***********************************************

nnClass = length(unique(gnd_train));  % The number of classes
num_Class = [];
for i = 1:nnClass
    num_Class = [num_Class length(find(gnd_train==i))];  % The number of samples of each class
end

Train_Ma = []; Train_Lab = [];
Test_Ma = fea_test; Test_Lab = gnd_test;
for j = 1:nnClass
    idx = find(gnd_train==j);
    % randIdx = 1:num_Class(j);       % select the first sele_num training samples
    randIdx = randperm(num_Class(j)); % randomly select sele_num training samples
    Train_Ma  = [Train_Ma; fea_train(idx(randIdx(1:sele_num)),:)];  % sele_num random samples per class for training
    Train_Lab = [Train_Lab; gnd_train(idx(randIdx(1:sele_num)))];
end

% Store samples as columns and normalize each sample to unit L2 norm
Train_Ma = Train_Ma';
Train_Ma = Train_Ma./repmat(sqrt(sum(Train_Ma.^2)),[size(Train_Ma,1) 1]);
Test_Ma = Test_Ma';
Test_Ma = Test_Ma./repmat(sqrt(sum(Test_Ma.^2)),[size(Test_Ma,1) 1]);

%% ---------- Training SRDA with L2-regularization --------------%
tic;
options = [];
options.ReguType = 'Ridge';
options.ReguAlpha = 1;  % the L2 penalty term (ridge regression)
model = SRDAtrain(Train_Ma',Train_Lab, options);
TimeTrain = toc;

tic;
% ------------- Use nearest center classifier ------ %
accuracy = SRDApredict(Test_Ma', Test_Lab, model);
TimeTest = toc;
disp(['SRDA,',num2str(sele_num),' Training, Errorrate: ',num2str((1-accuracy)*100),' TrainTime: ',num2str(TimeTrain),' TestTime: ',num2str(TimeTest)]);

% -------- Use NN to classify the samples -------------%
feaTrain = SRDAtest(Train_Ma', model);
feaTest = SRDAtest(Test_Ma', model);
D = EuDist2(feaTest,feaTrain,0);
[dump,idx] = min(D,[],2);
predictlabel = Train_Lab(idx);
errorrate = (length(find(predictlabel-Test_Lab))/length(Test_Lab))*100;
disp(['SRDA,',num2str(sele_num),' Train, NN Errorrate: ',num2str(errorrate)]);

%% Training SRDA with L1-regularization (use LARs), (Sparse LDA)
options = [];
options.ReguType = 'Lasso';
options.LASSOway = 'LARs';
options.ReguAlpha = 0.01;
options.LassoCardi = 50:10:200;
model_Lasso = SRDAtrain(Train_Ma',Train_Lab, options);

% ------------- Use nearest center classifier ------ %
accuracy_Lasso = SRDApredict(Test_Ma', Test_Lab, model_Lasso);
for i = 1:length(model_Lasso.LassoCardi)
    % disp(['Sparse SRDA,',num2str(sele_num),' Train, Cardi=',num2str(model_Lasso.LassoCardi(i)),' NC Errorrate: ',num2str((1-accuracy_Lasso(i))*100)]);
end
Lasso = min((1-accuracy_Lasso)*100);

% ----- Use nearest neighbor classifier ----- %
feaTrain = SRDAtest(Train_Ma', model_Lasso);
feaTest = SRDAtest(Test_Ma', model_Lasso);
for i = 1:length(model_Lasso.LassoCardi)
    D = EuDist2(feaTest{i},feaTrain{i},0);
    [dump,idx] = min(D,[],2);
    predictlabel = Train_Lab(idx);
    errorrate(i) = (length(find(predictlabel-Test_Lab))/length(Test_Lab))*100;
    % disp(['Sparse SRDA,',num2str(sele_num),' Train, Cardi=',num2str(model_Lasso.LassoCardi(i)),' NN Errorrate: ',num2str(errorrate(i))]);
end
Lasso_NN = min(errorrate)
Lasso

%% This feels very similar to MSE, so try the MSE recognition rate as well
addpath('G:\机器学习\代码和数据集\看论文编写代码\MSE');
lambda = 1;
[MSE_projection,Y_Class,Y] = MSE(Train_Ma',Train_Lab,lambda);
Predict_train = Train_Ma'*MSE_projection;
Predict_test = Test_Ma'*MSE_projection;

% Original MSE rule: nearest true label code
count = 0;
for i = 1:size(Test_Ma,2)
    temp = Predict_test(i,:);
    for j = 1:size(Y_Class,1)
        res(j) = norm(temp-Y_Class(j,:));
    end
    [rr,dd] = min(res);
    if dd == Test_Lab(i)
        count = count + 1;
    end
end
error_MSE = (1 - count/size(Test_Ma,2))*100

% MSE with distances between predicted labels (NN on training predictions)
D = EuDist2(Predict_test,Predict_train,0);
[dump,idx] = min(D,[],2);
predictlabel_test = Train_Lab(idx);
errorrate_MSE = (length(find(predictlabel_test-Test_Lab))/length(Test_Lab))*100

% [MSE_projection2,Y2] = MSE_2(Train_Ma',Train_Lab,lambda);
% Train_Ma = [ones(1,size(Train_Ma,2));Train_Ma];
% Test_Ma = [ones(1,size(Test_Ma,2));Test_Ma];
% Predict_train2 = Train_Ma'*MSE_projection2;
% Predict_test2 = Test_Ma'*MSE_projection2;
% D = EuDist2(Predict_test2,Predict_train2,0);
% [dump,idx] = min(D,[],2);
% predictlabel_test2 = Train_Lab(idx);
% errorrate_MSE2 = (length(find(predictlabel_test2-Test_Lab))/length(Test_Lab))*100

%% --- Call LSQR to solve the same Ax=b problem
options.ReguAlpha = 1;
[eigvector, istop] = lsqr2(Train_Ma',Y, options.ReguAlpha, 20);
Predict_train_LSQR = Train_Ma'*eigvector;
Predict_test_LSQR = Test_Ma'*eigvector;
D = EuDist2(Predict_test_LSQR,Predict_train_LSQR,0);
[dump,idx] = min(D,[],2);
predictlabel_test = Train_Lab(idx);
errorrate_MSE_LSQR = (length(find(predictlabel_test-Test_Lab))/length(Test_Lab))*100