Computing Precision, Recall, and F1

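Precision measures how exact the positive predictions are, and Recall measures how complete they are. The two metrics have to be considered together to evaluate a model's output.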

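TP: predicted 1 (Positive), actually 1 (True: the prediction is correct)
TN: predicted 0 (Negative), actually 0 (True: the prediction is correct)
FP: predicted 1 (Positive), actually 0 (False: the prediction is wrong)
FN: predicted 0 (Negative), actually 1 (False: the prediction is wrong)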

The total number of samples is TP+TN+FP+FN.
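As a minimal sketch of the four definitions above, the counts can be tallied directly from a list of true labels and a list of predictions. The function name `confusion_counts` and the sample labels are made up for illustration; labels are assumed to be strictly 0 or 1.

```python
def confusion_counts(y_true, y_pred):
    """Count TP, TN, FP, FN for binary labels (1 = positive, 0 = negative)."""
    tp = tn = fp = fn = 0
    for actual, predicted in zip(y_true, y_pred):
        if predicted == 1 and actual == 1:
            tp += 1          # predicted positive, and it really is positive
        elif predicted == 0 and actual == 0:
            tn += 1          # predicted negative, and it really is negative
        elif predicted == 1 and actual == 0:
            fp += 1          # predicted positive, but it is actually negative
        else:
            fn += 1          # predicted negative, but it is actually positive
    return tp, tn, fp, fn


# Hypothetical labels, invented for this example.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 1, 0, 1, 0]

tp, tn, fp, fn = confusion_counts(y_true, y_pred)
print(tp, tn, fp, fn)                       # the four counts
print(tp + tn + fp + fn == len(y_true))     # they always add up to the sample count
```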

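Definitions of Accuracy/Precision/Recall

Accuracy = (correctly predicted samples)/(all samples) = (TP+TN)/(TP+TN+FP+FN)

Precision = (samples predicted as 1 that are actually 1)/(all samples predicted as 1) = TP/(TP+FP)

Recall = (samples predicted as 1 that are actually 1)/(all samples that are actually 1) = TP/(TP+FN)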
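The three formulas translate almost one to one into code. This is only a sketch: the counts are hypothetical, and returning `None` when a denominator is zero is a choice made here, not something the formulas prescribe.

```python
def accuracy(tp, tn, fp, fn):
    """(TP+TN)/(TP+TN+FP+FN): share of all samples predicted correctly."""
    return (tp + tn) / (tp + tn + fp + fn)


def precision(tp, fp):
    """TP/(TP+FP); None when nothing was predicted as 1 (assumption of this sketch)."""
    return tp / (tp + fp) if (tp + fp) > 0 else None


def recall(tp, fn):
    """TP/(TP+FN); None when there are no actual positives (assumption of this sketch)."""
    return tp / (tp + fn) if (tp + fn) > 0 else None


# Hypothetical counts for 100 samples.
tp, tn, fp, fn = 40, 45, 10, 5

acc = accuracy(tp, tn, fp, fn)
prec = precision(tp, fp)
rec = recall(tp, fn)
print(f"Accuracy={acc:.3f} Precision={prec:.3f} Recall={rec:.3f}")
```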

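How to understand Precision/Recall

Suppose a cancer training set has 100 samples, and only 1 of them is actually cancer. If the model always predicts y=0, then TP=0, TN=99, FP=0, FN=1, so Accuracy=99/100, which is very high. But Recall=0/1=0, which is very low, and Precision=0/0 is undefined because nothing is ever predicted as 1.
So Accuracy alone gives an incomplete picture of a model; the model should be evaluated with both Precision and Recall.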
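The scenario can be checked numerically. With 100 samples, one actual positive, and a model that always predicts 0, the 99/100 figure is what the Accuracy formula (TP+TN)/(TP+TN+FP+FN) yields, while the Precision formula TP/(TP+FP) has a zero denominator. The sketch below builds those labels directly; the variable names are illustrative.

```python
# 100 samples with exactly one actual positive; the model always predicts 0.
y_true = [1] + [0] * 99
y_pred = [0] * 100

pairs = list(zip(y_true, y_pred))
tp = sum(1 for a, p in pairs if a == 1 and p == 1)
tn = sum(1 for a, p in pairs if a == 0 and p == 0)
fp = sum(1 for a, p in pairs if a == 0 and p == 1)
fn = sum(1 for a, p in pairs if a == 1 and p == 0)

accuracy = (tp + tn) / len(y_true)
recall = tp / (tp + fn)
# Precision is TP/(TP+FP); with no positive predictions the denominator is 0.
precision_is_defined = (tp + fp) > 0

print("TP, TN, FP, FN =", tp, tn, fp, fn)
print("Accuracy =", accuracy)
print("Recall =", recall)
print("Precision defined?", precision_is_defined)
```

A high score from a model that never detects a single positive case is exactly the situation Recall is meant to expose.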

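How to understand F1

Suppose we obtained the following Precision/Recall for three models:

Algorithm	Precision	Recall
Algorithm1	0.5	0.4
Algorithm2	0.7	0.1
Algorithm3	0.02	1.0

Because Precision and Recall are two separate values, they do not give a single ordering of the models: one model can be ahead on Precision and behind on Recall. Is there a single value that combines the two? Yes, it is F1.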

F1 = 2*(Precision*Recall)/(Precision+Recall)

Algorithm	F1
Algorithm1	0.444
Algorithm2	0.175
Algorithm3	0.039

With a single value per model, comparison becomes straightforward. Going by F1, Algorithm1 is the best of the three.
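The F1 column can be reproduced with a few lines of code. The precision values are the ones listed for the three algorithms; the recall values are taken to be 0.4, 0.1 and 1.0, which is an assumption of this sketch, chosen because they yield the F1 values shown when plugged into the formula. The function name `f1_score` is illustrative.

```python
def f1_score(precision, recall):
    """F1 = 2*(Precision*Recall)/(Precision+Recall); 0.0 if both are zero."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (precision, recall) per model; the recall values are assumed, see the note above.
models = {
    "Algorithm1": (0.5, 0.4),
    "Algorithm2": (0.7, 0.1),
    "Algorithm3": (0.02, 1.0),
}

scores = {name: f1_score(p, r) for name, (p, r) in models.items()}
best = max(scores, key=scores.get)

for name, score in scores.items():
    print(f"{name}\t{score:.3f}")
print("Best by F1:", best)
```

F1 is the harmonic mean of the two values, so it stays low whenever either one is low. A plain arithmetic mean would behave differently on these inputs: (0.02+1.0)/2 = 0.51 would rank Algorithm3 first even though almost none of its positive predictions are correct.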

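How the classification threshold affects Precision/Recall

In binary classification, we predict 1 if h(x)>=0.5 and predict 0 if h(x)<0.5. Here 0.5 is the classification threshold.

Raising the threshold means the model predicts 1 only when it is more confident, which usually raises Precision but lowers Recall. (High Precision, Low Recall)
Lowering the threshold means the model misses fewer actual positives, so Recall goes up, usually at the cost of Precision. (Low Precision, High Recall)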
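The trade-off can be seen on a small example. The scores and labels below are invented for illustration, with `scores` standing in for h(x); on this toy data Precision rises and Recall falls as the threshold goes up, though on real data Precision is not guaranteed to rise at every step.

```python
def precision_recall_at(threshold, scores, labels):
    """Predict 1 when score >= threshold, then return (precision, recall)."""
    preds = [1 if s >= threshold else 0 for s in scores]
    tp = sum(1 for y, p in zip(labels, preds) if y == 1 and p == 1)
    fp = sum(1 for y, p in zip(labels, preds) if y == 0 and p == 1)
    fn = sum(1 for y, p in zip(labels, preds) if y == 1 and p == 0)
    # Treating an empty denominator as 0.0 is a simplification of this sketch.
    precision = tp / (tp + fp) if (tp + fp) > 0 else 0.0
    recall = tp / (tp + fn) if (tp + fn) > 0 else 0.0
    return precision, recall


# Hypothetical model outputs h(x) and the corresponding true labels.
scores = [0.95, 0.85, 0.75, 0.65, 0.55, 0.45, 0.35, 0.25, 0.15, 0.05]
labels = [1, 1, 1, 0, 1, 0, 1, 0, 0, 0]

results = {t: precision_recall_at(t, scores, labels) for t in (0.3, 0.5, 0.7)}
for t, (p, r) in results.items():
    print(f"threshold={t}: precision={p:.3f}, recall={r:.3f}")
```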

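from sklearn.metrics import classification_report

y = [0, 1, 2, 2, 2]   # true labels
y_ = [0, 0, 2, 2, 1]  # predicted labels
# sklearn.metrics.classification_report(y_true, y_pred, labels=None, target_names=None, sample_weight=None, digits=2)
# (newer scikit-learn releases accept additional keyword arguments)
# y_true, y_pred: 1d array-like
# labels: shape=[n_labels], optional list of label indices to include in the report
# target_names: optional display names matching the labels
# sample_weight: shape=[n_samples], optional sample weights
# digits: int, number of digits used to format the floating point values in the output
# returns: a text report with the precision, recall and F1 of each class
target_names = ['class 0', 'class 1', 'class 2']
# digits=3 matches the three decimal places in the output shown
print(classification_report(y, y_, target_names=target_names, digits=3))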

              precision    recall  f1-score   support

    class 0      0.500     1.000     0.667         1
    class 1      0.000     0.000     0.000         1
    class 2      1.000     0.667     0.800         3

avg / total      0.700     0.600     0.613         5
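The numbers in this report can be re-derived by hand, which shows how the binary definitions extend to several classes: each class in turn is treated as the positive class and everything else as negative. The sketch below is plain Python rather than scikit-learn's implementation; the helper name `per_class_metrics` is made up, treating a zero denominator as 0.0 is an assumption of the sketch, and the last row is computed as a support-weighted average, which matches the avg / total values shown.

```python
y_true = [0, 1, 2, 2, 2]
y_pred = [0, 0, 2, 2, 1]


def per_class_metrics(y_true, y_pred, label):
    """Return (precision, recall, f1, support), treating `label` as the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for a, p in pairs if a == label and p == label)
    fp = sum(1 for a, p in pairs if a != label and p == label)
    fn = sum(1 for a, p in pairs if a == label and p != label)
    # Zero denominators are mapped to 0.0 (assumption of this sketch).
    precision = tp / (tp + fp) if (tp + fp) > 0 else 0.0
    recall = tp / (tp + fn) if (tp + fn) > 0 else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total > 0 else 0.0
    support = tp + fn  # number of samples whose true label is `label`
    return precision, recall, f1, support


rows = {label: per_class_metrics(y_true, y_pred, label) for label in sorted(set(y_true))}

# Average each metric over the classes, weighted by support.
n = sum(row[3] for row in rows.values())
weighted = tuple(sum(row[i] * row[3] for row in rows.values()) / n for i in range(3))

for label, (p, r, f, s) in rows.items():
    print(f"class {label}  {p:.3f}  {r:.3f}  {f:.3f}  {s}")
print(f"weighted  {weighted[0]:.3f}  {weighted[1]:.3f}  {weighted[2]:.3f}  {n}")
```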

posted @ 2017-12-29 13:24 Qniguoym
