Paper: Focal Loss for Dense Object Detection
Paper link: https://arxiv.org/abs/1708.02002
Background analysis: https://blog.csdn.net/c9Yv2cf9I06K2A9E/article/details/78920998
Loss gradient derivation: https://github.com/zimenglan-sysu-512/paper-note/blob/master/focal_loss.pdf
Motivation
Class imbalance is a primary obstacle preventing one-stage detectors from surpassing two-stage methods. The authors argue that the imbalance causes two problems: (1) training is inefficient, as most locations are easy negatives that contribute no useful learning signal; (2) en masse, the easy negatives can overwhelm training and lead to degenerate models.
Solution
Unlike OHEM, which ignores most easy negatives, the authors reshape the standard cross entropy loss to down-weight the loss assigned to well-classified examples rather than discarding them outright. As Figure 4 of the paper shows, changing γ affects negative examples more than positive ones, mainly by focusing more attention on hard negatives.
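The down-weighting described above can be sketched in a few lines. This is a minimal scalar illustration rather than the authors' implementation: it uses the α-balanced form from the paper, FL(p_t) = −α_t (1 − p_t)^γ log(p_t). The function names and the example probabilities are assumptions made for illustration; γ = 2 and α = 0.25 are the defaults the paper reports as working best.

```python
import math

def cross_entropy(p, y):
    """Binary cross entropy for one example.

    p is the predicted foreground probability, y is the label (1 or 0).
    """
    p_t = p if y == 1 else 1.0 - p  # probability assigned to the true class
    return -math.log(p_t)

def focal_loss(p, y, gamma=2.0, alpha=0.25):
    """Alpha-balanced focal loss for one example (illustrative sketch).

    The factor (1 - p_t) ** gamma shrinks the loss of well-classified
    examples; with gamma = 0 it reduces to alpha-weighted cross entropy.
    """
    p_t = p if y == 1 else 1.0 - p
    alpha_t = alpha if y == 1 else 1.0 - alpha
    return -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)

# Hypothetical examples: an easy negative (p_t = 0.99) and a hard one (p_t = 0.1).
easy_negative = focal_loss(0.01, 0)
hard_negative = focal_loss(0.9, 0)
print(easy_negative, hard_negative)
```

With these settings the easy negative keeps only a tiny fraction of its cross entropy loss, while the hard negative keeps most of it, so the summed loss is no longer dominated by the many easy negatives.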
The authors design a simple one-stage object detector called RetinaNet. With a ResNet-101-FPN backbone, it achieves a COCO test-dev AP of 39.1 while running at 5 fps. The box regression subnet and the classification subnet share a common structure but use separate parameters.
For the final conv layer of the classification subnet, the bias is initialized to b = −log((1 − π)/π), where π specifies that at the start of training every anchor should be labeled as foreground with a confidence of ∼π, since sigmoid(b) = 1/(1 + exp(−b)) = π.
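The identity sigmoid(b) = π can be checked numerically. The snippet below is a small sketch, not code from the paper: the helper names prior_bias and sigmoid are hypothetical, and π = 0.01 is used only as an example value for the prior.

```python
import math

def prior_bias(pi):
    """Bias b = -log((1 - pi) / pi) for the last classification conv layer."""
    return -math.log((1.0 - pi) / pi)

def sigmoid(x):
    """Logistic function 1 / (1 + exp(-x))."""
    return 1.0 / (1.0 + math.exp(-x))

pi = 0.01          # assumed example prior; any value in (0, 1) works
b = prior_bias(pi)  # about -4.595 for pi = 0.01
print(b, sigmoid(b))
```

Because every anchor then starts with a foreground score near π, the large number of background anchors produces only a small loss in the first iterations, which is the stabilizing effect the initialization is meant to provide.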