YOLOv1 notes

1. Label processing (encode)

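    1. Create the labels with labelimg: boxes: tensor([[ 48., 240., 195., 371.],[ 8., 12., 352., 498.]]). Each row is (x1, y1, x2, y2): the top-left corner followed by the bottom-right corner of a bounding box, in pixels (not a width and height).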

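    2. Divide the top-left and bottom-right coordinates by the image width and height so that every value is relative to the original image and lies in the range 0 to 1.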

    3. Convert the corners to center coordinates x, y and width/height w, h.

    4. Convert the center coordinates to offsets relative to the top-left corner of the grid cell that contains the center.

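Resulting target.shape: torch.Size([1, 7, 7, 30]), i.e. (batch, S, S, 5*B + C) with S = 7 grid cells per side, B = 2 boxes per cell and C = 20 classes.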
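The four encoding steps above can be sketched as follows. This is a minimal, dependency-free sketch rather than the code these notes were taken from: it uses nested lists instead of tensors, and the function name `encode`, the image size 353 x 500, the class indices 11 and 14, and the per-cell layout (B boxes of [x, y, w, h, confidence] followed by C one-hot class entries) are all assumptions made for illustration.

```python
S, B, C = 7, 2, 20   # grid size, boxes per cell, number of classes
N = 5 * B + C        # 30 values per grid cell


def encode(boxes, labels, img_w, img_h):
    """Encode pixel-space (x1, y1, x2, y2) boxes into an S x S x N target."""
    target = [[[0.0] * N for _ in range(S)] for _ in range(S)]
    cell_size = 1.0 / S
    for (x1, y1, x2, y2), label in zip(boxes, labels):
        # Step 2: normalize the corners by the image size.
        x1, x2 = x1 / img_w, x2 / img_w
        y1, y2 = y1 / img_h, y2 / img_h
        # Step 3: corners -> center and size.
        cx, cy = (x1 + x2) / 2, (y1 + y2) / 2
        w, h = x2 - x1, y2 - y1
        # Step 4: locate the grid cell, then express the center as an
        # offset from that cell's top-left corner, in units of one cell.
        col = min(int(cx / cell_size), S - 1)
        row = min(int(cy / cell_size), S - 1)
        dx = cx / cell_size - col
        dy = cy / cell_size - row
        # Assumed layout: every box slot of the cell gets the same target.
        for b in range(B):
            target[row][col][5 * b: 5 * b + 5] = [dx, dy, w, h, 1.0]
        target[row][col][5 * B + label] = 1.0
    return target


# Boxes from step 1; image size and class indices are placeholders.
boxes = [[48.0, 240.0, 195.0, 371.0], [8.0, 12.0, 352.0, 498.0]]
labels = [11, 14]
target = encode(boxes, labels, img_w=353, img_h=500)
print(len(target), len(target[0]), len(target[0][0]))
```

With these assumed inputs the two box centers fall into different grid cells, so both objects keep their own target entry. If two centers landed in the same cell, this sketch would simply let the later box overwrite the earlier one.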

2. Loss function

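# Imports this fragment relies on.
import torch
import torch.nn.functional as F

# Localization loss for the boxes responsible for an object: center offsets (x, y).
loss_xy = F.mse_loss(bbox_pred_response[:, :2], bbox_target_response[:, :2], reduction='sum')
# Width/height loss on square roots, so the same absolute error costs more on small boxes.
loss_wh = F.mse_loss(torch.sqrt(bbox_pred_response[:, 2:4]), torch.sqrt(bbox_target_response[:, 2:4]), reduction='sum')

# Confidence loss for the responsible boxes; judging by its name, target_iou holds the IoU with the ground-truth box.
loss_obj = F.mse_loss(bbox_pred_response[:, 4], target_iou[:, 4], reduction='sum')

# Class probability loss for the cells which contain objects.
loss_class = F.mse_loss(class_pred, class_target, reduction='sum')

# loss_noobj (confidence loss for cells without an object) is used below; its computation is not shown in these notes.
# Total loss, averaged over the batch.
loss = self.lambda_coord * (loss_xy + loss_wh) + loss_obj + self.lambda_noobj * loss_noobj + loss_class
loss = loss / float(batch_size)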
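
Two details of the loss are easier to see with numbers: why the width/height term takes square roots, and what an IoU confidence target looks like. The sketch below uses plain Python and made-up box values; the helper names `sq_err` and `iou` are hypothetical and are not taken from the code above.

```python
import math


def sq_err(a, b):
    """Squared error between two scalars."""
    return (a - b) ** 2


# The same absolute width error (0.05) on a small box and on a large box.
small_pred, small_true = 0.15, 0.10
large_pred, large_true = 0.85, 0.80

# Without the square root both errors cost the same.
plain_small = sq_err(small_pred, small_true)
plain_large = sq_err(large_pred, large_true)

# With the square root the small box is penalized more.
sqrt_small = sq_err(math.sqrt(small_pred), math.sqrt(small_true))
sqrt_large = sq_err(math.sqrt(large_pred), math.sqrt(large_true))


def iou(box_a, box_b):
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


print(plain_small, plain_large)
print(sqrt_small, sqrt_large)
print(iou((0.0, 0.0, 2.0, 2.0), (1.0, 1.0, 3.0, 3.0)))
```

An IoU of 1 means the predicted box matches the ground truth exactly, so using it as the confidence target asks the network to report how well its box fits, not only whether an object is present.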

3. Post-processing

    1. Get the prediction: pred_tensor = self.yolo(img) # torch.Size([1, 7, 7, 30])

    2. Decode: boxes_normalized_all, class_labels_all, confidences_all, class_scores_all = self.decode(pred_tensor), then filter the results by confidence (prob_thresh)

    3. Non-maximum suppression: ids = self.nms(boxes_normalized_masked, confidences_masked) # non-maximum suppression (nms_thresh)

    4. Scale the boxes back to the original image size and draw them.
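
The post-processing steps above can be sketched roughly as below. This is a simplified illustration in plain Python, not the `self.decode` / `self.nms` implementation referenced above: the return format, the default thresholds, the score definition (box confidence times best class score), the helper `to_pixels`, and the fake prediction are all assumptions. The NMS here also ignores class labels, so in practice it would be run once per class.

```python
S, B, C = 7, 2, 20


def iou(a, b):
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def decode(pred, prob_thresh=0.1):
    """Turn an S x S x (5*B + C) prediction into (box, label, score) tuples."""
    cell_size = 1.0 / S
    detections = []
    for row in range(S):
        for col in range(S):
            cell = pred[row][col]
            class_scores = cell[5 * B:]
            class_score = max(class_scores)
            class_label = class_scores.index(class_score)
            for b in range(B):
                dx, dy, w, h, conf = cell[5 * b: 5 * b + 5]
                score = conf * class_score
                if score <= prob_thresh:   # confidence filtering
                    continue
                # Cell-relative offset -> normalized image coordinates.
                cx, cy = (col + dx) * cell_size, (row + dy) * cell_size
                box = (cx - w / 2, cy - h / 2, cx + w / 2, cy + h / 2)
                detections.append((box, class_label, score))
    return detections


def nms(boxes, scores, nms_thresh=0.5):
    """Greedy NMS: keep the best box, drop boxes overlapping it too much."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    while order:
        best = order.pop(0)
        keep.append(best)
        order = [i for i in order if iou(boxes[best], boxes[i]) <= nms_thresh]
    return keep


def to_pixels(box, img_w, img_h):
    """Scale a normalized box back to the original image size."""
    return (box[0] * img_w, box[1] * img_h, box[2] * img_w, box[3] * img_h)


# Fake prediction: one cell proposes two nearly identical boxes of class 14.
pred = [[[0.0] * (5 * B + C) for _ in range(S)] for _ in range(S)]
pred[3][3][0:5] = [0.5, 0.5, 0.40, 0.40, 0.9]
pred[3][3][5:10] = [0.5, 0.5, 0.42, 0.38, 0.6]
pred[3][3][5 * B + 14] = 1.0

detections = decode(pred)
boxes = [d[0] for d in detections]
scores = [d[2] for d in detections]
keep = nms(boxes, scores)
pixel_boxes = [to_pixels(boxes[i], 353, 500) for i in keep]
print(len(detections), keep, pixel_boxes)
```

In this made-up case the two boxes overlap almost completely, so NMS keeps only the higher-scoring one before it is scaled back to pixels.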

posted @ 2022-06-07 10:40 cheng4632
