ruijiege

# FasterRCNN
* Original implementation
    * https://github.com/rbgirshick/py-faster-rcnn
* Paper
    * http://arxiv.org/abs/1506.01497
* A good write-up
    * https://zhuanlan.zhihu.com/p/76990686?from_voters_page=true
* ROIAlign
    * https://zhuanlan.zhihu.com/p/73662410
image = ...
feature = BackBone(image)

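# RPN stage
rpn_feature = RPNConv(feature)  # a 3x3 conv with 512 output channels extracts the features
objectness_feature = OBJECTNessConv(rpn_feature, num_anchor * num_level * 2)   # foreground/background classification, trained as a 2-class softmax loss
object_regression = REGRESSIONConv(rpn_feature, num_anchor * num_level * 4)    # box regression values for the objects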

# num_anchor = 3 ->  aspect ratios 0.5, 1, 2
# num_level = 3  ->  scales (8, 16, 32 in py-faster-rcnn)
# num_anchor * num_level = 9
# objectness_feature  N x 2 x (num_anchor * num_level) x H x W
#                     N x 2 x (9 x H) x W
# groundtruth         N x 1 x (num_anchor * num_level) x H x W
#                     N x 1 x (9 x H) x W
# AnchorTarget solves the problem of assigning groundtruth labels to the anchors
groundtruth = AnchorTarget(anchors)
objectness_loss = CrossEntropyLoss(objectness_feature, groundtruth)


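# region proposal stage
objectness_prob = softmax(objectness_feature)

# objectness_prob   probability that an anchor contains an object
# object_regression predicted box regression for each anchor
# targets           M boxes with their classes

# What counts as a box detected by the RPN? One that passes basic checks such as objectness_prob > threshold
# The region proposal stage filters all boxes detected by the RPN and hands them to the next stage
# -> the output is the proposal boxes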
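
The 9 anchors per location (num_anchor * num_level) and the H x W x 9 anchor grid can be sketched in NumPy as below. This is a simplified illustration, not the py-faster-rcnn code: the function names `base_anchors`, `grid_anchors` and `keep_inside` are made up for this note, the default ratios, scales and stride are assumptions taken from the py-faster-rcnn defaults, and the rounding that the original implementation applies is omitted.

```python
import numpy as np

def base_anchors(base_size=16, ratios=(0.5, 1.0, 2.0), scales=(8, 16, 32)):
    """Return (len(ratios) * len(scales), 4) anchors as (x1, y1, x2, y2), centred on (0, 0)."""
    anchors = []
    for ratio in ratios:                  # ratio = height / width
        for scale in scales:
            area = (base_size * scale) ** 2
            w = np.sqrt(area / ratio)     # keep the area fixed, change the shape
            h = w * ratio
            anchors.append([-w / 2, -h / 2, w / 2, h / 2])
    return np.array(anchors, dtype=np.float32)

def grid_anchors(feat_h, feat_w, stride=16, base=None):
    """Place the base anchors on every feature-map point -> (feat_h * feat_w * A, 4)."""
    base = base_anchors() if base is None else base
    shift_x = np.arange(feat_w) * stride          # centres in image coordinates
    shift_y = np.arange(feat_h) * stride
    sx, sy = np.meshgrid(shift_x, shift_y)        # each has shape (feat_h, feat_w)
    shifts = np.stack([sx, sy, sx, sy], axis=-1).reshape(-1, 1, 4)
    return (shifts + base[None]).reshape(-1, 4)

def keep_inside(anchors, img_h, img_w):
    """Indices of anchors that lie completely inside the image."""
    inside = (
        (anchors[:, 0] >= 0) & (anchors[:, 1] >= 0)
        & (anchors[:, 2] <= img_w) & (anchors[:, 3] <= img_h)
    )
    return np.flatnonzero(inside)
```

Usage: `all_anchors = grid_anchors(H, W)` followed by `inside_anchors = all_anchors[keep_inside(all_anchors, img_h, img_w)]`.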

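## RPN stage (anchor target):
1. Combine every sample point (w x h) with the 9 anchors to form w x h x 9 boxes, giving all_anchors (per image)
    - mesh_grid: each feature-map point (multiplied by stride) is a centre and the 9 anchors supply the widths and heights, forming H x W x 9 boxes (per image)
    - labels[...] = -1
2. Remove the out-of-bounds boxes from all_anchors to get inside_anchors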
3. For each anchor, take its highest overlap with any target; if that overlap is below the threshold (RPN_NEGATIVE_OVERLAP = 0.3), mark the anchor as a negative (0)
    - Inside_Anchors(N, N <= HxWx9),  targets(M)  ->  iou = NxM
    - max_iou = Nx1
    - The HxWx9 groundtruth defaults to -1 (ignored)
    - select_anchor_index = iou.max(dim=1) (Nx1)  <  RPN_NEGATIVE_OVERLAP
    - labels[select_anchor_index] = 0
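4. For each target, select the anchor with the highest overlap as a positive (1). Note that this overrides the result of step 3
    - iou = MxN    Inside_Anchors(N, N <= HxWx9),  targets(M)
    - Per target (M), pick the anchor with the maximum value among the N as the positive: select_anchor = iou.argmax(dim=1)   select_anchor = Mx1
    - labels[select_anchor] = 1
    - This covers a gt that would otherwise match no anchor (i.e. none exceeds the positive threshold)
    - Every target must have at least one matching anchor
5. For each anchor, take its highest overlap with any target; if that overlap is greater than or equal to the threshold (RPN_POSITIVE_OVERLAP = 0.7), mark the anchor as a positive (1)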
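6. Keep at most 50% positives (RPN_FG_FRACTION * RPN_BATCHSIZE = 0.5 * 256 = 128); if there are more, randomly set the excess to ignored (-1)
7. Fill the rest of the batch with negatives (RPN_BATCHSIZE - num_fg); if there are more, randomly set the excess to ignored (-1)
8. Count all labels >= 0, i.e. the total number of negatives and positives, as num_examples
9. Give every positive and negative sample point a weight of 1 / num_examples
10. Compute the loss of the RPN stage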
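
Steps 3 to 9 above can be sketched in NumPy as follows. This is a minimal illustration under assumptions, not the py-faster-rcnn code: `box_iou` and `assign_anchor_labels` are hypothetical helper names, boxes are (x1, y1, x2, y2) with non-zero area, at least one target is present, and step 4 uses a plain argmax per target as in the list (the original implementation also marks anchors that tie for the maximum).

```python
import numpy as np

def box_iou(boxes, targets):
    """Pairwise IoU between boxes (N, 4) and targets (M, 4) -> (N, M)."""
    x1 = np.maximum(boxes[:, None, 0], targets[None, :, 0])
    y1 = np.maximum(boxes[:, None, 1], targets[None, :, 1])
    x2 = np.minimum(boxes[:, None, 2], targets[None, :, 2])
    y2 = np.minimum(boxes[:, None, 3], targets[None, :, 3])
    inter = np.clip(x2 - x1, 0, None) * np.clip(y2 - y1, 0, None)
    area_b = (boxes[:, 2] - boxes[:, 0]) * (boxes[:, 3] - boxes[:, 1])
    area_t = (targets[:, 2] - targets[:, 0]) * (targets[:, 3] - targets[:, 1])
    return inter / (area_b[:, None] + area_t[None, :] - inter)

def assign_anchor_labels(inside_anchors, targets, neg_thr=0.3, pos_thr=0.7,
                         batch_size=256, fg_fraction=0.5, rng=None):
    """Return (labels, weights); labels are 1 (positive), 0 (negative), -1 (ignored)."""
    rng = np.random.default_rng() if rng is None else rng
    labels = np.full(len(inside_anchors), -1, dtype=np.int64)  # default: ignored

    iou = box_iou(inside_anchors, targets)        # (N, M)
    max_iou = iou.max(axis=1)                     # best overlap of each anchor
    labels[max_iou < neg_thr] = 0                 # step 3: negatives
    labels[iou.argmax(axis=0)] = 1                # step 4: best anchor of each target
    labels[max_iou >= pos_thr] = 1                # step 5: high-overlap positives

    # Step 6: cap the positives, ignoring a random surplus.
    num_fg = int(fg_fraction * batch_size)
    fg = np.flatnonzero(labels == 1)
    if len(fg) > num_fg:
        labels[rng.choice(fg, len(fg) - num_fg, replace=False)] = -1

    # Step 7: negatives fill whatever is left of the batch.
    num_bg = batch_size - int((labels == 1).sum())
    bg = np.flatnonzero(labels == 0)
    if len(bg) > num_bg:
        labels[rng.choice(bg, len(bg) - num_bg, replace=False)] = -1

    # Steps 8-9: uniform weight over all sampled anchors, zero for ignored ones.
    num_examples = int((labels >= 0).sum())
    weights = np.where(labels >= 0, 1.0 / num_examples, 0.0)
    return labels, weights
```

With these weights, a weighted sum of the per-anchor losses equals the mean loss over the sampled anchors, and ignored anchors contribute nothing.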
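## RegionProposal stage:
1. Clip the boxes output by the RPN stage to the image bounds, and filter out boxes whose width or height is below the threshold (RPN_MIN_SIZE * scale, RPN_MIN_SIZE = 16)
2. Sort by confidence and keep the top N boxes, called PreNMS (RPN_PRE_NMS_TOP_N = 12000); the result is proposals
3. Apply nms to proposals (RPN_NMS_THRESH = 0.7) and keep the top N boxes (RPN_POST_NMS_TOP_N = 2000); the result is proposals
4. Feed proposals to the RCNN network for inference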
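
The clip, size filter, PreNMS top-N, nms and PostNMS top-N steps above might look like this in NumPy. It is only a sketch under assumptions: `nms` and `select_proposals` are illustrative names, boxes are already decoded to (x1, y1, x2, y2), and `min_size` is expected to be RPN_MIN_SIZE already multiplied by the image scale.

```python
import numpy as np

def nms(boxes, scores, iou_thr):
    """Greedy NMS; returns the indices of the kept boxes, highest score first."""
    order = np.argsort(-scores)
    areas = (boxes[:, 2] - boxes[:, 0]) * (boxes[:, 3] - boxes[:, 1])
    keep = []
    while order.size > 0:
        i, rest = order[0], order[1:]
        keep.append(i)
        # Overlap of the current best box with every remaining box.
        xx1 = np.maximum(boxes[i, 0], boxes[rest, 0])
        yy1 = np.maximum(boxes[i, 1], boxes[rest, 1])
        xx2 = np.minimum(boxes[i, 2], boxes[rest, 2])
        yy2 = np.minimum(boxes[i, 3], boxes[rest, 3])
        inter = np.clip(xx2 - xx1, 0, None) * np.clip(yy2 - yy1, 0, None)
        iou = inter / (areas[i] + areas[rest] - inter)
        order = rest[iou <= iou_thr]   # drop boxes that overlap too much
    return np.array(keep, dtype=np.int64)

def select_proposals(boxes, scores, img_h, img_w, min_size=16,
                     pre_nms_top_n=12000, nms_thr=0.7, post_nms_top_n=2000):
    """Filter RPN boxes (K, 4) with scores (K,) down to the final proposals."""
    boxes = boxes.astype(np.float64)   # astype copies, so the input is untouched
    # Step 1: clip to the image and drop boxes that are too small.
    boxes[:, [0, 2]] = np.clip(boxes[:, [0, 2]], 0, img_w)
    boxes[:, [1, 3]] = np.clip(boxes[:, [1, 3]], 0, img_h)
    w = boxes[:, 2] - boxes[:, 0]
    h = boxes[:, 3] - boxes[:, 1]
    big_enough = (w >= min_size) & (h >= min_size)
    boxes, scores = boxes[big_enough], scores[big_enough]
    # Step 2: sort by confidence and keep the PreNMS top N.
    order = np.argsort(-scores)[:pre_nms_top_n]
    boxes, scores = boxes[order], scores[order]
    # Step 3: nms, then keep the PostNMS top N.
    keep = nms(boxes, scores, nms_thr)[:post_nms_top_n]
    return boxes[keep], scores[keep]
```

Sorting before nms lets both top-N cuts be plain slices, since `nms` returns indices in descending score order.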
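## ProposalTarget stage:
1. Label the proposals
2. Compute the iou between every proposal and the targets, and take the maximum as max_overlaps
    - proposals(N)   targets(M)  iou = NxM    max_overlaps = iou.max(dim=1) = Nx1
3. Proposals with max_overlaps greater than or equal to the threshold (FG_THRESH = 0.5) are positives. If there are more than fg_rois_per_this_image (FG_FRACTION[0.25] * BATCH_SIZE[128]), randomly pick that many to keep
4. Negatives are the samples with max_overlaps below BG_THRESH_HI[0.5] and greater than or equal to BG_THRESH_LO[0.1]
5. bg_rois_per_this_image = BATCH_SIZE[128] - fg_rois_per_this_image; if there are more negatives than that, randomly pick bg_rois_per_this_image samples to keep
6. Train the RCNN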
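
The sampling in steps 2 to 5 above can be sketched as below. This is an illustration under assumptions rather than the original code: `sample_rois` is a made-up name, the input is the NxM iou matrix from step 2, and the positive quota is capped by the number of positives actually available.

```python
import numpy as np

def sample_rois(iou, batch_size=128, fg_fraction=0.25, fg_thr=0.5,
                bg_thr_hi=0.5, bg_thr_lo=0.1, rng=None):
    """Return (fg_idx, bg_idx): indices of the proposals kept as positives / negatives."""
    rng = np.random.default_rng() if rng is None else rng
    max_overlaps = iou.max(axis=1)                      # step 2: (N,)

    # Step 3: positives, randomly subsampled down to the quota.
    fg = np.flatnonzero(max_overlaps >= fg_thr)
    fg_rois_per_this_image = min(int(round(fg_fraction * batch_size)), len(fg))
    if len(fg) > fg_rois_per_this_image:
        fg = rng.choice(fg, fg_rois_per_this_image, replace=False)

    # Steps 4-5: negatives from the [bg_thr_lo, bg_thr_hi) band fill the rest.
    bg = np.flatnonzero((max_overlaps < bg_thr_hi) & (max_overlaps >= bg_thr_lo))
    bg_rois_per_this_image = batch_size - fg_rois_per_this_image
    if len(bg) > bg_rois_per_this_image:
        bg = rng.choice(bg, bg_rois_per_this_image, replace=False)
    return fg, bg
```

Proposals with max_overlaps below BG_THRESH_LO end up in neither set, so they are never used to train the RCNN.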

 

posted on 2022-12-02 11:00  哦哟这个怎么搞