scores : array of float, shape=(len(list(cv)),)
Array of scores of the estimator for each run of the cross validation.
About scores: http://scikit-learn.org/stable/modules/cross_validation.html#cross-validation
First method:
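As a small illustration of how the per-fold scores are usually summarized into a single "mean (+/- std)" line, here is a sketch using only the standard library. The score values are made up for illustration and are not results from this post.

```python
from statistics import mean, pstdev

# Hypothetical per-fold accuracies from a 5-fold cross validation (made-up values).
scores = [0.90, 1.00, 0.90, 1.00, 0.95]

# pstdev is the population standard deviation, which is what
# numpy's scores.std() computes by default.
summary = "Accuracy: %0.2f (+/- %0.2f)" % (mean(scores), pstdev(scores))
print(summary)
```

A small standard deviation means the estimator performs about the same on every fold.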
# -*- coding: utf-8 -*-
"""
Created on Tue Aug 09 22:12:13 2016
@author: Administrator
"""
from sklearn import datasets
# sklearn.cross_validation was removed in later releases;
# cross_val_score now lives in sklearn.model_selection.
from sklearn.model_selection import cross_val_score
from sklearn.linear_model import LogisticRegression
from sklearn.naive_bayes import GaussianNB
from sklearn.ensemble import RandomForestClassifier
from sklearn.ensemble import VotingClassifier

# Use only two of the four iris features (columns 1 and 2).
iris = datasets.load_iris()
X, y = iris.data[:, 1:3], iris.target

clf1 = LogisticRegression(random_state=1)
clf2 = RandomForestClassifier(random_state=1)
clf3 = GaussianNB()

# Hard-voting ensemble; the weights follow the order of the estimators list.
eclf = VotingClassifier(estimators=[('lr', clf1), ('rf', clf2), ('gnb', clf3)],
                        voting='hard', weights=[2, 1, 2])

# Cross-validate each single classifier and the ensemble with 5 folds.
for clf, label in zip([clf1, clf2, clf3, eclf],
                      ['Logistic Regression', 'Random Forest', 'naive Bayes', 'Ensemble']):
    print(clf)
    print(label)
    scores = cross_val_score(clf, X, y, cv=5, scoring='accuracy')
    print("Accuracy: %0.2f (+/- %0.2f) [%s]" % (scores.mean(), scores.std(), label))
Second method:
# -*- coding: utf-8 -*-
"""
Created on Tue Aug 09 22:06:31 2016
@author: Administrator
"""
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.naive_bayes import GaussianNB
from sklearn.ensemble import RandomForestClassifier, VotingClassifier

clf1 = LogisticRegression(random_state=1)
clf2 = RandomForestClassifier(random_state=1)
clf3 = GaussianNB()

# Tiny toy data set: three samples of class 1 and three of class 2.
X = np.array([[-1, -1], [-2, -1], [-3, -2], [1, 1], [2, 1], [3, 2]])
y = np.array([1, 1, 1, 2, 2, 2])

# Hard voting: majority of the predicted class labels.
eclf1 = VotingClassifier(estimators=[('lr', clf1), ('rf', clf2), ('gnb', clf3)],
                         voting='hard')
eclf1 = eclf1.fit(X, y)
print(eclf1.predict(X))

# Soft voting: argmax of the summed predicted probabilities.
eclf2 = VotingClassifier(estimators=[('lr', clf1), ('rf', clf2), ('gnb', clf3)],
                         voting='soft')
eclf2 = eclf2.fit(X, y)
print(eclf2.predict(X))

# Soft voting with per-classifier weights.
eclf3 = VotingClassifier(estimators=[('lr', clf1), ('rf', clf2), ('gnb', clf3)],
                         voting='soft', weights=[2, 1, 1])
eclf3 = eclf3.fit(X, y)
print(eclf3.predict(X))
Parameters:
estimators : list of (string, estimator) tuples
Invoking the fit method on the VotingClassifier will fit clones of those original estimators that will be stored in the class attribute self.estimators_.
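voting : str, {‘hard’, ‘soft’} (default=‘hard’)
If ‘hard’, uses predicted class labels for majority rule voting. Else if ‘soft’, predicts the class label based on the argmax of the sums of the predicted probabilities (that is, the class whose summed probability is largest), which is recommended for an ensemble of well-calibrated classifiers.
# Voting rule. The default, hard, takes the majority vote. In soft mode, the class probabilities predicted by each method are summed (with weights, if given) and the class with the largest total is chosen.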
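To make the difference between the two voting rules concrete, here is a minimal pure-Python sketch. It is not scikit-learn's implementation; the probabilities are made up and the helper names (argmax, hard_vote, soft_vote) are hypothetical. It only illustrates that hard and soft voting can disagree on the same sample.

```python
from collections import Counter

# Hypothetical predicted probabilities for ONE sample from three classifiers
# (made-up numbers); each row is [P(class 0), P(class 1)].
probas = [
    [0.55, 0.45],  # classifier A leans slightly towards class 0
    [0.60, 0.40],  # classifier B leans slightly towards class 0
    [0.10, 0.90],  # classifier C is very confident in class 1
]

def argmax(values):
    """Return the index of the largest value."""
    return max(range(len(values)), key=lambda i: values[i])

def hard_vote(probas):
    """Each classifier votes for its most likely class; the majority wins."""
    votes = [argmax(p) for p in probas]
    return Counter(votes).most_common(1)[0][0]

def soft_vote(probas):
    """Sum the probabilities per class and pick the class with the largest sum."""
    n_classes = len(probas[0])
    sums = [sum(p[c] for p in probas) for c in range(n_classes)]
    return argmax(sums)

hard_label = hard_vote(probas)
soft_label = soft_vote(probas)
print(hard_label, soft_label)
```

Two classifiers vote for class 0, so hard voting picks class 0. Soft voting picks class 1, because the one confident classifier outweighs the two lukewarm ones. This is also why soft voting is only recommended when the probabilities are well calibrated.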
weights : array-like, shape = [n_classifiers], optional (default=`None`)
Sequence of weights (float or int) to weight the occurrences of predicted class labels (hard voting) or class probabilities before averaging (soft voting). Uses uniform weights if None.
# The weight assigned to each method in advance; by default all methods have the same weight.
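The effect of weights can be sketched in the same way. The following pure-Python example is an illustration under assumed inputs, not the library's code; the function names and the numbers are hypothetical. With hard voting each vote counts as much as its weight; with soft voting the probabilities are combined as a weighted average.

```python
def weighted_hard_vote(labels, weights, n_classes):
    """Add each classifier's weight to the class it predicted; largest total wins."""
    counts = [0.0] * n_classes
    for label, weight in zip(labels, weights):
        counts[label] += weight
    return max(range(n_classes), key=lambda c: counts[c])

def weighted_soft_vote(probas, weights):
    """Weighted average of the class probabilities; largest average wins."""
    total = sum(weights)
    n_classes = len(probas[0])
    averages = [
        sum(w * p[c] for p, w in zip(probas, weights)) / total
        for c in range(n_classes)
    ]
    return max(range(n_classes), key=lambda c: averages[c])

# Hypothetical predictions of three classifiers for one sample (made-up values).
labels = [0, 0, 1]
probas = [[0.9, 0.1], [0.4, 0.6], [0.4, 0.6]]

# Uniform weights behave like the unweighted rules.
hard_uniform = weighted_hard_vote(labels, [1, 1, 1], n_classes=2)
soft_uniform = weighted_soft_vote(probas, [1, 1, 1])

# Non-uniform weights can flip the result.
hard_weighted = weighted_hard_vote(labels, [1, 1, 3], n_classes=2)
soft_weighted = weighted_soft_vote(probas, [1, 3, 3])

print(hard_uniform, hard_weighted, soft_uniform, soft_weighted)
```

In both cases the uniform weights select class 0, and giving more weight to the classifiers that favor class 1 changes the decision to class 1.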