YOLOv1 notes

1, Label processing (encode)

  1, Create the labels with labelimg: boxes: tensor([[ 48., 240., 195., 371.],[ 8., 12., 352., 498.]])  Each row is (x1, y1, x2, y2): the top-left and bottom-right corners of one box, in pixels

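  2, Divide the top-left and bottom-right coordinates by the original image width and height, so every value lies in the (0,1) range

  3, Convert the corners to center coordinates x,y and width/height w,h

  4, Convert the center coordinates to offsets relative to the top-left corner of the grid cell that contains the center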

Resulting target.shape: torch.Size([1, 7, 7, 30]), i.e. (batch, S, S, 5*B + C) with S = 7, B = 2, C = 20
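
The four encoding steps above can be sketched in plain Python (nested lists instead of tensors, no batch dimension). This is only an illustration under stated assumptions, not the code behind these notes: the image size 353 x 500 and the class indices are made up, width and height are kept relative to the whole image, and the same ground-truth box is written into both box slots of its cell.

```python
S, B, C = 7, 2, 20  # grid size, boxes per cell, classes: 5*B + C = 30


def encode(boxes, labels, img_w, img_h):
    """Encode pixel (x1, y1, x2, y2) boxes into an S x S x (5*B + C) target."""
    depth = 5 * B + C
    target = [[[0.0] * depth for _ in range(S)] for _ in range(S)]
    for (x1, y1, x2, y2), label in zip(boxes, labels):
        # Step 2: normalize the corners by the image size.
        x1, x2 = x1 / img_w, x2 / img_w
        y1, y2 = y1 / img_h, y2 / img_h
        # Step 3: corners -> center and width/height.
        cx, cy = (x1 + x2) / 2, (y1 + y2) / 2
        w, h = x2 - x1, y2 - y1
        # Step 4: locate the grid cell, then take the offset from its
        # top-left corner, measured in cell units.
        col = min(int(cx * S), S - 1)
        row = min(int(cy * S), S - 1)
        dx, dy = cx * S - col, cy * S - row
        cell = target[row][col]
        for b in range(B):
            # Assumption: both box slots receive the same ground truth.
            cell[5 * b: 5 * b + 5] = [dx, dy, w, h, 1.0]
        cell[5 * B + label] = 1.0  # one-hot class vector
    return target


boxes = [(48.0, 240.0, 195.0, 371.0), (8.0, 12.0, 352.0, 498.0)]
labels = [11, 14]  # hypothetical class indices
target = encode(boxes, labels, img_w=353, img_h=500)  # assumed image size
print(len(target), len(target[0]), len(target[0][0]))  # 7 7 30
```

With the assumed image size, the first box lands in row 4, column 2 of the grid and the second in row 3, column 3.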

2, Loss function

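# Localization loss for the responsible boxes: center offsets (x, y).
loss_xy = F.mse_loss(bbox_pred_response[:, :2], bbox_target_response[:, :2], reduction='sum')
# Width/height loss on square roots, so the same error costs more on a small box.
loss_wh = F.mse_loss(torch.sqrt(bbox_pred_response[:, 2:4]), torch.sqrt(bbox_target_response[:, 2:4]), reduction='sum')

# Confidence loss for the responsible boxes; the target is the IoU with the ground truth.
loss_obj = F.mse_loss(bbox_pred_response[:, 4], target_iou[:, 4], reduction='sum')

# Class probability loss for the cells which contain objects.
loss_class = F.mse_loss(class_pred, class_target, reduction='sum')

# Total loss. loss_noobj (the confidence loss for cells without objects) is computed elsewhere and is not shown in these notes.
loss = self.lambda_coord * (loss_xy + loss_wh) + loss_obj + self.lambda_noobj * loss_noobj + loss_class
loss = loss / float(batch_size)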
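
To make the individual terms concrete, here is a rough plain-Python sketch of the same loss for a single object cell, plus the no-object confidence term that the snippet above uses but does not show. The helper names (sse, cell_loss, noobj_loss) are hypothetical, the weights 5 and 0.5 are the values from the YOLOv1 paper, and predicted widths/heights are assumed to be non-negative so the square root is defined.

```python
import math

LAMBDA_COORD, LAMBDA_NOOBJ = 5.0, 0.5  # weights from the YOLOv1 paper


def sse(a, b):
    """Sum of squared errors, like F.mse_loss(..., reduction='sum')."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def cell_loss(pred_box, target_box, iou, pred_cls, target_cls):
    """Loss of one object cell; boxes are (x, y, w, h, conf)."""
    loss_xy = sse(pred_box[:2], target_box[:2])
    # Assumes w, h >= 0 (for example after a sigmoid).
    loss_wh = sse([math.sqrt(v) for v in pred_box[2:4]],
                  [math.sqrt(v) for v in target_box[2:4]])
    loss_obj = (pred_box[4] - iou) ** 2  # confidence target is the IoU
    loss_class = sse(pred_cls, target_cls)
    return LAMBDA_COORD * (loss_xy + loss_wh) + loss_obj + loss_class


def noobj_loss(pred_confs):
    """Confidence loss for boxes in cells without objects (target 0)."""
    return LAMBDA_NOOBJ * sum(c ** 2 for c in pred_confs)


# Why sqrt: the same 0.05 width error is penalized more on a small box.
small_err = (math.sqrt(0.15) - math.sqrt(0.10)) ** 2
large_err = (math.sqrt(0.85) - math.sqrt(0.80)) ** 2
print(small_err > large_err)  # True
```

The total for an image would then be the sum of cell_loss over object cells plus noobj_loss over the remaining boxes, divided by the batch size as in the snippet above.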

3, Post-processing

  1, Get the prediction: pred_tensor = self.yolo(img) # torch.Size([1, 7, 7, 30])

  2, Decode: boxes_normalized_all, class_labels_all, confidences_all, class_scores_all = self.decode(pred_tensor)   then filter by confidence (prob_thresh)

  3, Non-maximum suppression: ids = self.nms(boxes_normalized_masked, confidences_masked) # NMS with IoU threshold nms_thresh

  4, Draw the boxes, scaled back to the original image size
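
The decode and confidence-filtering step can be sketched as follows, again in plain Python on a nested-list S x S x 30 prediction. This is a guess at what a decode like self.decode does, not its actual code: it assumes the score of a box is its confidence times the best class probability of the cell, and that width and height are relative to the whole image.

```python
S, B, C = 7, 2, 20


def decode(pred, prob_thresh=0.1):
    """Return [(x1, y1, x2, y2, class_id, score)] with normalized corners."""
    detections = []
    for row in range(S):
        for col in range(S):
            cell = pred[row][col]
            class_probs = cell[5 * B:]
            class_id = max(range(C), key=lambda c: class_probs[c])
            for b in range(B):
                dx, dy, w, h, conf = cell[5 * b: 5 * b + 5]
                # Assumption: score = box confidence * class probability.
                score = conf * class_probs[class_id]
                if score <= prob_thresh:
                    continue  # confidence filtering
                # Undo the cell-relative offset to get the image-relative center.
                cx, cy = (col + dx) / S, (row + dy) / S
                detections.append((cx - w / 2, cy - h / 2,
                                   cx + w / 2, cy + h / 2, class_id, score))
    return detections


# A made-up prediction with a single confident box in row 4, column 2.
pred = [[[0.0] * (5 * B + C) for _ in range(S)] for _ in range(S)]
pred[4][2][0:5] = [0.5, 0.5, 0.4, 0.2, 0.9]
pred[4][2][5 * B + 11] = 0.8
dets = decode(pred)
print(len(dets))  # 1
```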

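For the last two post-processing steps, a minimal greedy NMS and the scaling back to pixel coordinates might look like this. The function names are hypothetical, the NMS here is class-agnostic (an implementation such as self.nms may well run it per class), and the 353 x 500 image size is an assumption for the example.

```python
def iou(a, b):
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    union = ((a[2] - a[0]) * (a[3] - a[1])
             + (b[2] - b[0]) * (b[3] - b[1]) - inter)
    return inter / union if union > 0 else 0.0


def nms(boxes, scores, nms_thresh=0.5):
    """Greedy NMS: keep the best box, drop boxes overlapping it too much."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    while order:
        best = order.pop(0)
        keep.append(best)
        order = [i for i in order
                 if iou(boxes[best], boxes[i]) <= nms_thresh]
    return keep


def to_pixels(box, img_w, img_h):
    """Scale a normalized box back to the original image size."""
    x1, y1, x2, y2 = box
    return (x1 * img_w, y1 * img_h, x2 * img_w, y2 * img_h)


boxes = [(0.10, 0.10, 0.50, 0.50),
         (0.12, 0.12, 0.52, 0.52),  # near-duplicate of the first box
         (0.60, 0.60, 0.90, 0.90)]
scores = [0.9, 0.8, 0.7]
keep = nms(boxes, scores)
print(keep)  # [0, 2]
pixel_boxes = [to_pixels(boxes[i], 353, 500) for i in keep]  # assumed size
```

The second box overlaps the first with an IoU of about 0.82, so it is suppressed, while the third box does not overlap and survives.
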
posted @ 2022-06-07 10:40  cheng4632