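# FasterRCNN

* Original implementation
    * https://github.com/rbgirshick/py-faster-rcnn
* Paper
    * http://arxiv.org/abs/1506.01497
* A good write-up
    * https://zhuanlan.zhihu.com/p/76990686?from_voters_page=true
* ROIAlign
    * https://zhuanlan.zhihu.com/p/73662410

Overall flow, as pseudocode:

    image = ...
    feature = BackBone(image)

    # RPN stage
    rpn_feature = RPNConv(feature)  # one 3x3 convolution with 512 output channels extracts features
    objectness_feature = OBJECTNessConv(rpn_feature, num_anchor * num_level * 2)  # foreground/background classification, trained as a multi-class softmax loss
    object_regression = REGRESSIONConv(rpn_feature, num_anchor * num_level * 4)  # box regression values

    # num_anchor = 3 -> aspect ratios 0.5, 1, 2
    # num_anchor * num_level = 9
    # objectness_feature  N x 2 x (num_anchor * num_level) x H x W
    #                     N x 2 x (9 x H) x W
    # groundtruth         N x 1 x (num_anchor * num_level) x H x W
    #                     N x 1 x (9 x H) x W

    # AnchorTarget solves the problem of assigning ground-truth labels to the anchors
    groundtruth = AnchorTarget(anchors)
    objectness_loss = CrossEntropyLoss(objectness_feature, groundtruth)

    # region proposal stage
    objectness_prob = softmax(objectness_feature)
    # objectness_prob    probability that an anchor contains an object
    # object_regression  predicted box regression for the object
    # targets            M boxes with their classes

    # What counts as a box "detected" by the RPN? One that passes basic checks such as objectness_prob > threshold.
    # The region proposal stage filters all RPN detections and hands the survivors to the next stage.
    # -> output: proposal boxes

## RPN stage (anchor target):

1. Combine every feature-map location (W x H) with the 9 anchors to form W x H x 9 boxes, giving all_anchors (per image)
    - Using a mesh grid, each featuremap point (multiplied by the stride) is a centre and the 9 anchors supply the width and height, forming H x W x 9 boxes (per image)
    - `labels[...] = -1`
2. Remove the boxes in all_anchors that cross the image boundary, giving inside_anchors
3. For each anchor, take its highest overlap with any target; if that overlap is below the threshold (RPN_NEGATIVE_OVERLAP = 0.3), the anchor is a negative example, 0
    - Inside_Anchors(N, N <= HxWx9), targets(M) -> iou = NxM
    - max_iou = Nx1
    - The HxWx9 groundtruth defaults to -1 (ignored)
    - `select_anchor_index = iou.max(dim=1) (Nx1) < RPN_NEGATIVE_OVERLAP`
    - `labels[select_anchor_index] = 0`
4. For each target, the anchor with the highest overlap is a positive example, 1. Note that this overrides the result of step 3
    - iou = MxN, with Inside_Anchors(N, N <= HxWx9) and targets(M)
    - Working per target (M), pick the anchor with the maximum value among the N: `select_anchor = iou.argmax(dim=1)`, select_anchor = Mx1
    - `labels[select_anchor] = 1`
    - This handles a gt that matches no anchor (that is, no anchor exceeds the positive threshold)
    - Every target must have at least one matching anchor
5. For each anchor, take its highest overlap with any target; if that overlap is greater than or equal to the threshold (RPN_POSITIVE_OVERLAP = 0.7), the anchor is a positive example, 1
6. Keep 50% positives (RPN_FG_FRACTION * RPN_BATCHSIZE = 0.5 * 256); if there are more positives than that, randomly set the excess to ignore, -1
7. Keep 50% negatives (RPN_BATCHSIZE - num_fg); if there are more negatives than that, randomly set the excess to ignore
8. Count all anchors with label >= 0, that is, negatives plus positives, as num_examples
9. Give every positive and negative sample a weight of 1 / num_examples
10. Compute the RPN-stage loss

## RegionProposal stage:

1. Clip the boxes output by the RPN stage to the image bounds, and filter out boxes whose width or height is below the threshold (RPN_MIN_SIZE * scale, RPN_MIN_SIZE = 16)
2. Sort by confidence and keep the top N boxes, called PreNMS (RPN_PRE_NMS_TOP_N = 12000); the result is proposals
3. Run NMS on proposals (RPN_NMS_THRESH = 0.7) and keep the top N boxes (RPN_POST_NMS_TOP_N = 2000); the result is proposals
4. Feed proposals to the RCNN network for inference

## ProposalTarget stage:

1. Label the proposals
2. Compute the IoU between each proposal and the targets, and take the maximum as max_overlaps
    - proposals(N), targets(M), iou = NxM, `max_overlaps = iou.max(dim=1)` = Nx1
3. Proposals with max_overlaps greater than or equal to the threshold (FG_THRESH = 0.5) are positives. If there are more than fg_rois_per_this_image (FG_FRACTION[0.25] * BATCH_SIZE[128]), randomly keep that many
4. Negatives are the proposals with max_overlaps below BG_THRESH_HI[0.5] and greater than or equal to BG_THRESH_LO[0.1]
5. bg_rois_per_this_image = BATCH_SIZE[128] - fg_rois_per_this_image; if there are more negatives than that, randomly keep bg_rois_per_this_image of them
6. Train RCNN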
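
The labelling rules of the RPN stage (steps 3 to 7) and the sampling rules of the ProposalTarget stage (steps 2 to 5) can be sketched in a few lines. This is a minimal illustration rather than the py-faster-rcnn code: it uses plain Python lists instead of tensors, and the function names `iou`, `excess`, `label_anchors` and `sample_rois` are hypothetical. It also assumes boxes are `(x1, y1, x2, y2)` tuples with area `(x2 - x1) * (y2 - y1)`, that there is at least one anchor and one target, and that ties in step 4 are resolved by taking the first best anchor.

```python
import random


def iou(a, b):
    """IoU of two boxes (x1, y1, x2, y2); assumes area = (x2 - x1) * (y2 - y1)."""
    iw = min(a[2], b[2]) - max(a[0], b[0])
    ih = min(a[3], b[3]) - max(a[1], b[1])
    if iw <= 0 or ih <= 0:
        return 0.0
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union


def excess(indices, limit, rng):
    """Randomly pick the indices that must be dropped so at most `limit` remain."""
    if len(indices) <= limit:
        return []
    return rng.sample(indices, len(indices) - limit)


def label_anchors(anchors, targets, neg_thr=0.3, pos_thr=0.7,
                  batch=256, fg_fraction=0.5, rng=None):
    """Return one label per anchor: 1 positive, 0 negative, -1 ignored."""
    rng = rng or random.Random(0)
    labels = [-1] * len(anchors)                                  # default: ignored
    overlaps = [[iou(a, t) for t in targets] for a in anchors]    # N x M
    max_per_anchor = [max(row) for row in overlaps]               # N
    # Step 3: anchors whose best overlap is below neg_thr are negatives.
    for i, m in enumerate(max_per_anchor):
        if m < neg_thr:
            labels[i] = 0
    # Step 4: the best anchor of every target is positive (overrides step 3).
    for j in range(len(targets)):
        best = max(range(len(anchors)), key=lambda i: overlaps[i][j])
        labels[best] = 1
    # Step 5: anchors whose best overlap reaches pos_thr are positives.
    for i, m in enumerate(max_per_anchor):
        if m >= pos_thr:
            labels[i] = 1
    # Steps 6 and 7: subsample positives, then negatives, by ignoring the excess.
    fg = [i for i, lab in enumerate(labels) if lab == 1]
    for i in excess(fg, int(fg_fraction * batch), rng):
        labels[i] = -1
    num_fg = labels.count(1)
    bg = [i for i, lab in enumerate(labels) if lab == 0]
    for i in excess(bg, batch - num_fg, rng):
        labels[i] = -1
    return labels


def sample_rois(proposals, targets, fg_thr=0.5, bg_hi=0.5, bg_lo=0.1,
                batch=128, fg_fraction=0.25, rng=None):
    """Return (fg_indices, bg_indices) chosen from the proposals."""
    rng = rng or random.Random(0)
    max_overlaps = [max(iou(p, t) for t in targets) for p in proposals]
    fg = [i for i, m in enumerate(max_overlaps) if m >= fg_thr]
    bg = [i for i, m in enumerate(max_overlaps) if bg_lo <= m < bg_hi]
    fg_cap = int(fg_fraction * batch)
    if len(fg) > fg_cap:
        fg = rng.sample(fg, fg_cap)
    bg_cap = batch - len(fg)          # negatives fill the rest of the batch
    if len(bg) > bg_cap:
        bg = rng.sample(bg, bg_cap)
    return fg, bg


boxes = [(0, 0, 10, 10), (0, 0, 10, 8), (0, 0, 10, 5),
         (20, 20, 30, 30), (100, 100, 110, 110)]
targets = [(0, 0, 10, 10), (22, 22, 32, 32)]
print(label_anchors(boxes, targets))   # [1, 1, -1, 1, 0]
print(sample_rois(boxes, targets))     # ([0, 1, 2], [3])
```

In the toy data, the fourth box overlaps its target by only about 0.47, yet it is still labelled positive because it is the best anchor for the second target; this is the override described in step 4. The third box (IoU 0.5) falls between the two RPN thresholds and stays ignored, while the same box counts as a foreground RoI in the ProposalTarget stage, where the threshold is 0.5.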