YOLOv1 notes
1. Label processing (encode)
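1. Create the labels with labelImg: boxes: tensor([[ 48., 240., 195., 371.],[ 8., 12., 352., 498.]]). Each row holds the top-left corner (x1, y1) and the bottom-right corner (x2, y2) of one bounding box, in pixels.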
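2. Divide the top-left and bottom-right coordinates by the original image width and height so that every value lies in the range (0, 1).
3. Convert the corners to the center coordinates x, y and the width and height w, h.
4. Convert the center coordinates to offsets relative to the top-left corner of the grid cell that contains the center.
The resulting target.shape: torch.Size([1, 7, 7, 30])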
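The four encoding steps above can be sketched in plain Python as follows. This is a minimal illustration, not the code these notes were taken from: the function names `encode_box` and `encode`, the 400x500 image size, the class indices, and the per-cell layout `[x, y, w, h, conf] * B + C class scores` (which gives 2 * 5 + 20 = 30 channels) are all assumptions made for the example.

```python
def encode_box(box, img_w, img_h, S=7):
    """Encode one (x1, y1, x2, y2) pixel box; hypothetical helper."""
    x1, y1, x2, y2 = box
    # Step 2: normalize the corners by the image size.
    x1, x2 = x1 / img_w, x2 / img_w
    y1, y2 = y1 / img_h, y2 / img_h
    # Step 3: corners -> center and size.
    cx, cy = (x1 + x2) / 2, (y1 + y2) / 2
    w, h = x2 - x1, y2 - y1
    # Step 4: find the grid cell holding the center, then take the
    # offset of the center from that cell's top-left corner.
    col = min(int(cx * S), S - 1)
    row = min(int(cy * S), S - 1)
    return row, col, cx * S - col, cy * S - row, w, h


def encode(boxes, labels, img_w, img_h, S=7, B=2, C=20):
    """Build an S x S x (5B + C) target as nested lists (assumed layout)."""
    target = [[[0.0] * (5 * B + C) for _ in range(S)] for _ in range(S)]
    for box, label in zip(boxes, labels):
        row, col, x_off, y_off, w, h = encode_box(box, img_w, img_h, S)
        cell = target[row][col]
        for b in range(B):
            # Every box slot of the cell gets the same ground truth.
            cell[5 * b:5 * b + 5] = [x_off, y_off, w, h, 1.0]
        cell[5 * B + label] = 1.0  # one-hot class entry
    return target


# Boxes from the notes; image size and class ids are made up.
boxes = [(48., 240., 195., 371.), (8., 12., 352., 498.)]
target = encode(boxes, labels=[0, 1], img_w=400, img_h=500)
print(len(target), len(target[0]), len(target[0][0]))
```

The target in the notes carries an extra leading batch dimension of 1, which this sketch leaves out.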
2. Loss function
# Localization loss: box centers (x, y) and square-rooted sizes (w, h).
loss_xy = F.mse_loss(bbox_pred_response[:, :2], bbox_target_response[:, :2], reduction='sum')
loss_wh = F.mse_loss(torch.sqrt(bbox_pred_response[:, 2:4]), torch.sqrt(bbox_target_response[:, 2:4]), reduction='sum')
# Confidence loss for boxes with objects; the target is the IoU with the ground truth.
loss_obj = F.mse_loss(bbox_pred_response[:, 4], target_iou[:, 4], reduction='sum')
# Class probability loss for the cells which contain objects.
loss_class = F.mse_loss(class_pred, class_target, reduction='sum')
# Total loss. loss_noobj (confidence loss for cells without objects) is computed outside this excerpt.
loss = self.lambda_coord * (loss_xy + loss_wh) + loss_obj + self.lambda_noobj * loss_noobj + loss_class
loss = loss / float(batch_size)
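For reference, the total loss can be written out as a formula. This is a sketch that follows the form given in the YOLOv1 paper, where N stands for `batch_size`, hats mark predictions, and the five lines correspond to `loss_xy`, `loss_wh`, `loss_obj`, `loss_noobj`, and `loss_class`. The no-object term does not appear in the code excerpt, so its form here is taken from the paper and is an assumption about this implementation.

```latex
\begin{aligned}
\mathcal{L} = \frac{1}{N}\Bigg[\;
& \lambda_{\text{coord}} \sum_{i=1}^{S^2}\sum_{j=1}^{B} \mathbb{1}_{ij}^{\text{obj}}
  \Big[(x_i-\hat{x}_i)^2 + (y_i-\hat{y}_i)^2\Big] \\
& + \lambda_{\text{coord}} \sum_{i=1}^{S^2}\sum_{j=1}^{B} \mathbb{1}_{ij}^{\text{obj}}
  \Big[\big(\sqrt{w_i}-\sqrt{\hat{w}_i}\big)^2 + \big(\sqrt{h_i}-\sqrt{\hat{h}_i}\big)^2\Big] \\
& + \sum_{i=1}^{S^2}\sum_{j=1}^{B} \mathbb{1}_{ij}^{\text{obj}} \big(C_i-\hat{C}_i\big)^2 \\
& + \lambda_{\text{noobj}} \sum_{i=1}^{S^2}\sum_{j=1}^{B} \mathbb{1}_{ij}^{\text{noobj}} \big(C_i-\hat{C}_i\big)^2 \\
& + \sum_{i=1}^{S^2} \mathbb{1}_{i}^{\text{obj}} \sum_{c \in \text{classes}} \big(p_i(c)-\hat{p}_i(c)\big)^2
\;\Bigg]
\end{aligned}
```

The paper sets the weights to 5 for the coordinate terms and 0.5 for the no-object term; the values of `self.lambda_coord` and `self.lambda_noobj` used in this code are not shown in the notes.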
3. Post-processing
1. Get the prediction: pred_tensor = self.yolo(img) # torch.Size([1, 7, 7, 30])
2. Decode: boxes_normalized_all, class_labels_all, confidences_all, class_scores_all = self.decode(pred_tensor), then filter by confidence (prob_thresh)
3. Non-maximum suppression: ids = self.nms(boxes_normalized_masked, confidences_masked) # non-maximum suppression (nms_thresh)
4. Draw the boxes, scaled back to the original image size
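The decode, NMS, and rescaling steps above can be sketched in plain Python. This is not the `self.decode` or `self.nms` from the notes, whose bodies are not shown; the helper names `decode_cell`, `iou`, `nms`, and `to_pixels`, the threshold default, and the sample boxes are hypothetical and chosen only for illustration.

```python
def decode_cell(row, col, x_off, y_off, w, h, S=7):
    """Invert the encoding: cell offsets -> normalized (x1, y1, x2, y2)."""
    cx = (col + x_off) / S
    cy = (row + y_off) / S
    return (cx - w / 2, cy - h / 2, cx + w / 2, cy + h / 2)


def iou(a, b):
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def nms(boxes, scores, nms_thresh=0.5):
    """Greedy NMS: keep the best box, drop boxes overlapping it too much."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    while order:
        best = order.pop(0)
        keep.append(best)
        order = [i for i in order if iou(boxes[best], boxes[i]) <= nms_thresh]
    return keep


def to_pixels(box, img_w, img_h):
    """Scale a normalized box back to the original image size."""
    x1, y1, x2, y2 = box
    return (x1 * img_w, y1 * img_h, x2 * img_w, y2 * img_h)


# Made-up detections that already passed the confidence filter.
boxes = [(0.10, 0.10, 0.50, 0.50), (0.12, 0.12, 0.52, 0.52), (0.60, 0.60, 0.90, 0.90)]
scores = [0.9, 0.8, 0.7]
ids = nms(boxes, scores, nms_thresh=0.5)
print([to_pixels(boxes[i], 400, 500) for i in ids])
```

In practice NMS is usually run separately for each class, so that overlapping boxes of different classes do not suppress each other; the sketch treats all boxes as one class for brevity.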