FasterRcnn

# FasterRCNN
* Original implementation
    * https://github.com/rbgirshick/py-faster-rcnn
* Paper
    * http://arxiv.org/abs/1506.01497
* A good write-up
    * https://zhuanlan.zhihu.com/p/76990686?from_voters_page=true
* ROIAlign
    * https://zhuanlan.zhihu.com/p/73662410
image = ...
feature = BackBone(image)

# RPN stage
rpn_feature = RPNConv(feature)  # one 3x3 conv with 512 output channels extracts features
objectness_feature = OBJECTNessConv(rpn_feature, num_anchor * num_level * 2)   # foreground/background classification, trained as a 2-class softmax loss
object_regression = REGRESSIONConv(rpn_feature, num_anchor * num_level * 4)    # box regression values for each anchor

# num_anchor = 3 ->  aspect ratios 0.5, 1, 2
# num_level  = 3 ->  anchor scales, so num_anchor * num_level = 9
# objectness_feature  N x 2 x (num_anchor * num_level) x H x W
#                     N x 2 x (9 x H) x W
# groundtruth         N x 1 x (num_anchor * num_level) x H x W
#                     N x 1 x (9 x H) x W
# anchor target solves the problem of assigning groundtruth labels to the anchors
groundtruth = AnchorTarget(anchors)
objectness_loss = CrossEntropyLoss(objectness_feature, groundtruth)
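
The objectness loss on the line above can be sketched in plain Python as a two-class softmax cross-entropy that skips anchors labeled -1. This is only a minimal sketch: the function name `objectness_loss` and the flat per-anchor layout (one `(bg, fg)` score pair per anchor) are assumptions made for illustration, not the layout used by the original code.

```python
import math

def objectness_loss(logits, labels):
    """Two-class softmax cross-entropy over anchors, skipping label -1.

    logits: list of (bg_score, fg_score) pairs, one per anchor (assumed layout).
    labels: list of ints, 1 = foreground, 0 = background, -1 = ignored.
    """
    total, count = 0.0, 0
    for (bg, fg), label in zip(logits, labels):
        if label < 0:
            continue                      # ignored anchors contribute nothing
        m = max(bg, fg)                   # subtract the max for numerical stability
        log_z = m + math.log(math.exp(bg - m) + math.exp(fg - m))
        total += log_z - (fg if label == 1 else bg)
        count += 1
    # Averaging is the same as weighting each counted anchor by 1 / num_examples.
    return total / max(count, 1)
```

With equal scores for both classes every counted anchor costs `log 2`, and an input where every anchor is ignored yields `0.0`.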


# region proposal stage
objectness_prob = softmax(objectness_feature)

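# objectness_prob   probability that an anchor contains an object
# object_regression predicted box regression for each anchor
# targets           M ground-truth boxes with their classes

# What counts as a box detected by the RPN? One that passes basic checks such as objectness_prob > threshold
# The region proposal stage filters all boxes detected by the RPN and hands the survivors to the next stage
# -> output: proposal boxes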
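
The 9 anchors per location used throughout the pseudo-code above can be sketched as 3 aspect ratios times 3 scales, shifted to every feature-map point. This is a simplified sketch, not the rounding-exact `generate_anchors` from py-faster-rcnn: the function names, `base_size=16`, `scales=(8, 16, 32)` and `stride=16` are assumptions chosen for illustration, and boxes are centered on the point rather than on a pixel cell.

```python
import math

def base_anchors(base_size=16, ratios=(0.5, 1.0, 2.0), scales=(8, 16, 32)):
    """Return len(ratios) * len(scales) boxes (x1, y1, x2, y2) centered at (0, 0)."""
    anchors = []
    for ratio in ratios:                  # ratio = height / width
        for scale in scales:
            side = base_size * scale      # side length of the anchor at ratio 1
            w = side / math.sqrt(ratio)   # keep the area equal to side * side
            h = side * math.sqrt(ratio)
            anchors.append((-w / 2, -h / 2, w / 2, h / 2))
    return anchors

def all_anchors(feat_h, feat_w, stride=16):
    """Shift the base anchors to every feature-map point (image coordinates)."""
    base = base_anchors()
    boxes = []
    for y in range(feat_h):
        for x in range(feat_w):
            cx, cy = x * stride, y * stride   # feature-map point mapped back to the image
            for x1, y1, x2, y2 in base:
                boxes.append((cx + x1, cy + y1, cx + x2, cy + y2))
    return boxes
```

For a feature map of height 2 and width 3 this gives 2 x 3 x 9 = 54 boxes, many of which extend past the image border and are dropped later.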

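## RPN stage (anchor target):
1. Combine every feature-map location (W x H) with the 9 base anchors to form W x H x 9 boxes, giving all_anchors (in image coordinates)
    - mesh_grid: each feature-map point (multiplied by stride) is a box center and the 9 anchors supply width and height, forming H x W x 9 boxes (in image coordinates)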
    - labels[...] = -1
2. Remove the boxes in all_anchors that cross the image boundary, giving inside_anchors
3. For each anchor, take its highest overlap with any target; if that overlap is below the threshold (RPN_NEGATIVE_OVERLAP = 0.3), label the anchor as a negative example, 0
    - Inside_Anchors(N, N <= HxWx9),  targets(M)  ->  iou = NxM
    - max_iou = Nx1
    - The HxWx9 groundtruth labels default to -1 (ignored)
    - select_anchor_index = iou.max(dim=1) (Nx1)  <  RPN_NEGATIVE_OVERLAP
    - labels[select_anchor_index] = 0
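4. For each target, select the anchor with the highest overlap as a positive example, 1. Note that this overrides the result of step 3
    - iou = MxN    Inside_Anchors(N, N <= HxWx9),  targets(M)
    - Iterate over the M targets and pick, among the N anchors, the one with the largest IoU as positive: select_anchor = iou.argmax(dim=1)   select_anchor = Mx1
    - labels[select_anchor] = 1
    - This covers a gt box that would otherwise match no anchor (that is, no anchor reaches the positive threshold)
    - Every target must be matched by at least one anchor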
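5. For each anchor, take its highest overlap with any target; if that overlap is greater than or equal to the threshold (RPN_POSITIVE_OVERLAP = 0.7), label the anchor as a positive example, 1
6. Keep at most 50% positives (RPN_FG_FRACTION * RPN_BATCHSIZE = 0.5 * 256 = 128); if there are more, randomly set the surplus to ignored, -1
7. Fill the rest of the batch with negatives (RPN_BATCHSIZE - num_fg); if there are more, randomly set the surplus to ignored, -1
8. Count all labels >= 0, that is, the total number of negatives and positives, as num_examples
9. Give every positive and negative sample a weight of 1 / num_examples
10. Compute the loss of the RPN stage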
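
Steps 2 to 9 above can be sketched in plain Python as follows. This is an illustrative sketch under stated assumptions, not the original `anchor_target_layer`: the names `iou` and `label_anchors` are hypothetical, boxes are `(x1, y1, x2, y2)` tuples, at least one target is assumed, and ties in step 4 are resolved by taking a single best anchor.

```python
import random

def iou(a, b):
    """IoU of two boxes given as (x1, y1, x2, y2)."""
    iw = min(a[2], b[2]) - max(a[0], b[0])
    ih = min(a[3], b[3]) - max(a[1], b[1])
    if iw <= 0 or ih <= 0:
        return 0.0
    inter = iw * ih
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)

def label_anchors(anchors, targets, img_w, img_h, neg_thresh=0.3,
                  pos_thresh=0.7, batch_size=256, fg_fraction=0.5, rng=random):
    """Return (labels, weights): 1 = positive, 0 = negative, -1 = ignored."""
    labels = [-1] * len(anchors)
    # Step 2: only anchors fully inside the image take part.
    inside = [i for i, (x1, y1, x2, y2) in enumerate(anchors)
              if x1 >= 0 and y1 >= 0 and x2 <= img_w and y2 <= img_h]
    overlaps = {i: [iou(anchors[i], t) for t in targets] for i in inside}
    # Step 3: best overlap below the negative threshold -> negative.
    for i in inside:
        if max(overlaps[i]) < neg_thresh:
            labels[i] = 0
    # Step 4: the best anchor of every target is positive, overriding step 3.
    if inside:
        for j in range(len(targets)):
            labels[max(inside, key=lambda i: overlaps[i][j])] = 1
    # Step 5: best overlap at or above the positive threshold -> positive.
    for i in inside:
        if max(overlaps[i]) >= pos_thresh:
            labels[i] = 1
    # Steps 6-7: subsample, setting the surplus to -1.
    max_fg = int(fg_fraction * batch_size)
    fg = [i for i, label in enumerate(labels) if label == 1]
    for i in rng.sample(fg, max(len(fg) - max_fg, 0)):
        labels[i] = -1
    max_bg = batch_size - min(len(fg), max_fg)
    bg = [i for i, label in enumerate(labels) if label == 0]
    for i in rng.sample(bg, max(len(bg) - max_bg, 0)):
        labels[i] = -1
    # Steps 8-9: uniform weight over the remaining examples.
    num_examples = sum(1 for label in labels if label >= 0)
    weights = [1.0 / num_examples if label >= 0 else 0.0 for label in labels]
    return labels, weights
```

An anchor whose best IoU falls between 0.3 and 0.7 and that is not the best match of any target keeps the label -1, so it contributes to neither the classification nor the regression loss.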
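## RegionProposal stage:
1. Clip the boxes output by the RPN stage to the image bounds, and filter out boxes whose width or height is below the threshold (RPN_MIN_SIZE * scale, RPN_MIN_SIZE = 16)
2. Sort by confidence and keep the top N boxes, called PreNMS (RPN_PRE_NMS_TOP_N = 12000); the result is proposals
3. Run NMS on proposals (RPN_NMS_THRESH = 0.7) and keep the top N boxes (RPN_POST_NMS_TOP_N = 2000); the result is proposals
4. Feed proposals to the RCNN network for inference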
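
Steps 1 to 3 above can be sketched in plain Python as follows. It is a minimal sketch, not the original `proposal_layer`: the names `nms` and `select_proposals` are hypothetical, boxes are assumed to be already decoded `(x1, y1, x2, y2)` tuples, and `min_size` is assumed to be RPN_MIN_SIZE already multiplied by the image scale.

```python
def iou(a, b):
    """IoU of two boxes given as (x1, y1, x2, y2)."""
    iw = min(a[2], b[2]) - max(a[0], b[0])
    ih = min(a[3], b[3]) - max(a[1], b[1])
    if iw <= 0 or ih <= 0:
        return 0.0
    inter = iw * ih
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)

def nms(boxes, scores, thresh):
    """Greedy NMS; returns the kept indices, highest score first."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        # Keep a box only if it does not overlap a kept box by more than thresh.
        if all(iou(boxes[i], boxes[k]) <= thresh for k in keep):
            keep.append(i)
    return keep

def select_proposals(boxes, scores, img_w, img_h, min_size=16,
                     pre_nms_top_n=12000, post_nms_top_n=2000, nms_thresh=0.7):
    """Return (proposal_boxes, proposal_scores) after clipping, top-N and NMS."""
    # Step 1: clip to the image, then drop boxes that are too small.
    clipped = [(min(max(x1, 0), img_w), min(max(y1, 0), img_h),
                min(max(x2, 0), img_w), min(max(y2, 0), img_h))
               for x1, y1, x2, y2 in boxes]
    valid = [i for i, (x1, y1, x2, y2) in enumerate(clipped)
             if x2 - x1 >= min_size and y2 - y1 >= min_size]
    # Step 2: sort by confidence and keep the top pre_nms_top_n boxes.
    valid.sort(key=lambda i: scores[i], reverse=True)
    valid = valid[:pre_nms_top_n]
    cand_boxes = [clipped[i] for i in valid]
    cand_scores = [scores[i] for i in valid]
    # Step 3: NMS, then keep the top post_nms_top_n survivors.
    keep = nms(cand_boxes, cand_scores, nms_thresh)[:post_nms_top_n]
    return [cand_boxes[i] for i in keep], [cand_scores[i] for i in keep]
```

The greedy loop is quadratic in the number of candidates, which is why the pre-NMS top-N cut comes first.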
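## ProposalTarget stage:
1. Label the proposals
2. Compute the IoU between every proposal and the targets, and take the maximum as max_overlaps
    - proposals(N)   targets(M)  iou = NxM    max_overlaps = iou.max(dim=1) = Nx1
3. Proposals whose max_overlaps is greater than or equal to the threshold (FG_THRESH = 0.5) are positive examples. If there are more than fg_rois_per_this_image (FG_FRACTION[0.25] * BATCH_SIZE[128] = 32), randomly keep only that many
4. Negative samples are those whose max_overlaps is below BG_THRESH_HI[0.5] and greater than or equal to BG_THRESH_LO[0.1]
5. bg_rois_per_this_image = BATCH_SIZE[128] - fg_rois_per_this_image; if there are more negatives than that, randomly keep bg_rois_per_this_image of them
6. Train the RCNN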
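
Steps 2 to 5 above can be sketched in plain Python as follows. This is a sketch under stated assumptions rather than the original `proposal_target_layer`: the name `sample_rois` is hypothetical, at least one target is assumed, and fg_rois_per_this_image is taken to be the number of positives actually kept.

```python
import random

def iou(a, b):
    """IoU of two boxes given as (x1, y1, x2, y2)."""
    iw = min(a[2], b[2]) - max(a[0], b[0])
    ih = min(a[3], b[3]) - max(a[1], b[1])
    if iw <= 0 or ih <= 0:
        return 0.0
    inter = iw * ih
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)

def sample_rois(proposals, targets, batch_size=128, fg_fraction=0.25,
                fg_thresh=0.5, bg_thresh_hi=0.5, bg_thresh_lo=0.1, rng=random):
    """Return (fg_indices, bg_indices) into proposals."""
    # Step 2: best overlap of every proposal with any target.
    max_overlaps = [max(iou(p, t) for t in targets) for p in proposals]
    # Step 3: positives, randomly subsampled if there are too many.
    fg = [i for i, o in enumerate(max_overlaps) if o >= fg_thresh]
    fg_rois_per_image = int(round(fg_fraction * batch_size))
    if len(fg) > fg_rois_per_image:
        fg = rng.sample(fg, fg_rois_per_image)
    # Step 4: negatives lie in [bg_thresh_lo, bg_thresh_hi).
    bg = [i for i, o in enumerate(max_overlaps)
          if bg_thresh_lo <= o < bg_thresh_hi]
    # Step 5: negatives fill whatever the positives left of the batch.
    bg_rois_per_image = batch_size - len(fg)
    if len(bg) > bg_rois_per_image:
        bg = rng.sample(bg, bg_rois_per_image)
    return fg, bg
```

Proposals with a best IoU below BG_THRESH_LO fall into neither group and are simply not used for training the RCNN head.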
posted on 2022-12-02 11:00 哦哟这个怎么搞