Deep Learning Notes (15): Object Detection Regression Losses GIoU, DIoU, and CIoU

Papers: Generalized Intersection over Union: A Metric and A Loss for Bounding Box Regression
           Distance-IoU Loss: Faster and Better Learning for Bounding Box Regression

Code: https://giou.stanford.edu/
          https://github.com/Zzh-tju/CIoU

 

IoU

 

Intersection over Union (IoU) is an important evaluation metric in object detection. The first figure above marks a gt box and a predicted box; IoU is the ratio of the Intersection Area $I$ to the Union Area $U$ of these two boxes A and B:

\begin{equation}
\label{IoU}
IoU = \frac{|A \cap B|}{|A \cup B|} = \frac{|I|}{|U|}
\end{equation}

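However, existing detectors typically optimize this metric through distance losses (for example, the smooth_L1 loss in SSD). As the GIoU paper puts it, "The optimal objective for a metric is the metric itself", so IoU can be used directly as the regression loss. Unfortunately, IoU cannot be optimized for non-overlapping bboxes: it is 0 for every such pair, so it provides no gradient.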

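Using IoU as the loss ($\mathcal{L}_{IoU} = 1 - IoU$) has two advantages and one drawback:
1. IoU effectively compares the similarity of two arbitrary shapes
2. IoU is scale invariant
3. If two shapes A and B do not overlap, IoU is always 0, so IoU cannot tell whether A and B are very close or very far apart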
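The scale invariance and the drawback in item 3 can be checked numerically. The following is a minimal sketch, assuming axis-aligned boxes in (xmin, ymin, xmax, ymax) form; the helper name `iou_xyxy` and the sample boxes are illustrative and not taken from the papers.

```python
def iou_xyxy(a, b):
    """IoU of two axis-aligned boxes given as (xmin, ymin, xmax, ymax)."""
    # Clamp at 0 so that disjoint boxes get an empty intersection
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union


gt = (0.0, 0.0, 10.0, 10.0)
near = (11.0, 0.0, 21.0, 10.0)    # 1 unit away from gt
far = (100.0, 0.0, 110.0, 10.0)   # 90 units away from gt

# Both predictions miss gt, and IoU cannot rank them
print(iou_xyxy(gt, near), iou_xyxy(gt, far))

# Scale invariance: multiplying all coordinates by 3 leaves IoU unchanged
print(iou_xyxy((0, 0, 10, 10), (5, 5, 15, 15)),
      iou_xyxy((0, 0, 30, 30), (15, 15, 45, 45)))
```

Both non-overlapping predictions get the same IoU of 0, so a loss of $1 - IoU$ is flat in that region.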

GIoU

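 GIoU, as an upgraded version of IoU, inherits both advantages of IoU and fixes its inability to measure the distance between non-overlapping boxes. On top of the IoU computation, it finds the smallest convex shape $C$ that encloses both A and B. The formula is: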

\begin{equation}
\label{GIoU}
GIoU = \frac{|A \cap B|}{|A \cup B|} - \frac{|C \setminus (A \cup B)|}{|C|} = IoU - \frac{|C \setminus (A \cup B)|}{|C|}
\end{equation}

The figure below shows two detection results, bad and better; the farther the prediction is from the gt box, the larger $C$ becomes.

The loss can then be written as $\mathcal{L}_{GIoU} = 1 - GIoU$. Since $GIoU \in (-1, 1]$, the range of $\mathcal{L}_{GIoU}$ is $[0, 2)$.
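To see how the penalty term separates non-overlapping cases, here is a minimal sketch. It assumes axis-aligned boxes in (xmin, ymin, xmax, ymax) form, for which the smallest enclosing shape $C$ is taken to be the smallest enclosing box; `giou_xyxy` and the sample boxes are illustrative names and values.

```python
def giou_xyxy(a, b):
    """GIoU of two axis-aligned boxes given as (xmin, ymin, xmax, ymax)."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    # Smallest enclosing box C of the two boxes
    c_area = (max(a[2], b[2]) - min(a[0], b[0])) * (max(a[3], b[3]) - min(a[1], b[1]))
    return inter / union - (c_area - union) / c_area


gt = (0.0, 0.0, 10.0, 10.0)
near = (11.0, 0.0, 21.0, 10.0)
far = (100.0, 0.0, 110.0, 10.0)

for name, box in [("same", gt), ("near", near), ("far", far)]:
    g = giou_xyxy(gt, box)
    print(name, "GIoU =", round(g, 4), "loss =", round(1.0 - g, 4))
```

The loss is 0 for a perfect match and grows toward 2 as the prediction moves away, which is the $[0, 2)$ range stated above.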

 In summary, this generalization keeps the major properties of IoU while rectifying its weakness. 

DIoU & CIoU

 The DIoU paper points out that GIoU loss still suffers from slow convergence and inaccurate regression.

In this paper, we propose a Distance-IoU (DIoU) loss by incorporating the normalized distance between the predicted box and the target box, which converges much faster in training than IoU and GIoU losses. Furthermore, this paper summarizes three geometric factors in bounding box regression, i.e., overlap area, central point distance and aspect ratio, based on which a Complete IoU (CIoU) loss is proposed, thereby leading to faster convergence and better performance. Moreover, DIoU can be easily adopted into non-maximum suppression (NMS) to act as the criterion, further boosting performance improvement.

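 When analyzing GIoU loss, the authors found that GIoU first tries to enlarge the predicted box until it overlaps the target bbox, and only then does the IoU term maximize the overlap area, as shown in the left figure below: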

 In addition, when one box contains the other, GIoU loss degenerates to IoU loss. Aligning the boxes then becomes harder and convergence is slow.
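The degeneration follows directly from the GIoU formula. As a short worked case, assume $B \subseteq A$ (one box lies entirely inside the other). Then the smallest enclosing shape is $C = A$ and $A \cup B = A$, so the penalty term vanishes:

\begin{equation}
GIoU = \frac{|A \cap B|}{|A \cup B|} - \frac{|C \setminus (A \cup B)|}{|C|} = \frac{|B|}{|A|} - \frac{|A \setminus A|}{|A|} = \frac{|B|}{|A|} = IoU
\end{equation}

The value depends only on the area ratio, not on where $B$ sits inside $A$, so in this case GIoU adds no signal beyond IoU for moving the box toward the target center.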

In Distance-IoU (DIoU) loss, we simply add a penalty term on IoU loss to directly minimize the normalized distance between central points of two bounding boxes, leading to much faster convergence than GIoU loss.

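The authors argue that a good bbox regression loss should consider three important geometric factors: overlap area, central point distance, and aspect ratio. Combining these, they further propose the Complete IoU (CIoU) loss. DIoU can also be plugged into NMS in place of IoU, making detection more robust under occlusion.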

DIoU

Referring to the figure above, the DIoU loss is:

\begin{equation}
\label{DIoU}
\begin{split}
& \mathcal{R}_{DIoU} = \frac{\rho^2(\mathbf{b}, \mathbf{b}^{gt})}{c^2} \\
& \mathcal{L}_{DIoU} = 1 - IoU + \frac{\rho^2(\mathbf{b}, \mathbf{b}^{gt})}{c^2} \\
& \mathcal{L}_{DIoU} = 1 - IoU + \frac{d^2}{c^2}
\end{split}
\end{equation}

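Here $\mathbf{b}$ and $\mathbf{b}^{gt}$ are the center points of the predicted box and the ground-truth box, $d = \rho(\mathbf{b}, \mathbf{b}^{gt})$ is the Euclidean distance between the two centers, and $c$ is the diagonal length of the smallest enclosing box (the shape $C$ introduced for GIoU).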

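Advantages:

  • Like GIoU loss, DIoU loss still gives the bounding box a direction to move in when it does not overlap the target box.
  • DIoU loss directly minimizes the distance between the two boxes, so it converges much faster than GIoU loss.
  • When one box contains the other, or when the two boxes are aligned horizontally or vertically, DIoU loss makes regression very fast, whereas GIoU loss almost degenerates to IoU loss.
  • DIoU can also replace plain IoU as the criterion in NMS, making the NMS results more reasonable and effective.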

Like $\mathcal{L}_{GIoU}$, the range of $\mathcal{L}_{DIoU}$ is also $[0, 2)$.
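The containment case in the list above can be illustrated with a small sketch. It assumes axis-aligned (xmin, ymin, xmax, ymax) boxes; the function name `iou_giou_diou` and the sample boxes are illustrative only.

```python
def iou_giou_diou(a, b):
    """Return (IoU, GIoU, DIoU) for two (xmin, ymin, xmax, ymax) boxes."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    iou = inter / union
    # Width and height of the smallest enclosing box
    cw = max(a[2], b[2]) - min(a[0], b[0])
    ch = max(a[3], b[3]) - min(a[1], b[1])
    giou = iou - (cw * ch - union) / (cw * ch)
    # Squared center distance d^2 and squared enclosing diagonal c^2
    d2 = ((a[0] + a[2]) / 2 - (b[0] + b[2]) / 2) ** 2 \
        + ((a[1] + a[3]) / 2 - (b[1] + b[3]) / 2) ** 2
    c2 = cw ** 2 + ch ** 2
    diou = iou - d2 / c2
    return iou, giou, diou


gt = (0.0, 0.0, 10.0, 10.0)
centered = (4.0, 4.0, 6.0, 6.0)  # 2x2 box at the center of gt
corner = (0.0, 0.0, 2.0, 2.0)    # same 2x2 box in a corner of gt

print(iou_giou_diou(gt, centered))
print(iou_giou_diou(gt, corner))
```

Both predictions have identical IoU and GIoU, while DIoU is lower for the corner box, so only the DIoU term pushes the prediction toward the center of the target.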

CIoU

$\mathcal{L}_{CIoU}$ additionally considers aspect ratios on top of $\mathcal{L}_{DIoU}$:

\begin{equation}
\label{CIoU}
\begin{split}
& \mathcal{R}_{CIoU} = \frac{\rho^2(\mathbf{b}, \mathbf{b}^{gt})}{c^2} + \alpha v \\
& v = \frac{4}{\pi^2}\left(\arctan \frac{w^{gt}}{h^{gt}} - \arctan \frac{w}{h}\right)^2 \\
& \alpha = \frac{v}{(1 - IoU) + v} \\
& \mathcal{L}_{CIoU} = 1 - IoU + \frac{d^2}{c^2} + \alpha v
\end{split}
\end{equation}

 Well, this... looks awfully complicated.

Here $v$ measures the consistency of the aspect ratio, and $\alpha$ is a positive trade-off parameter that is treated as a constant and excluded from differentiation.
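Evaluated term by term, the formula is less intimidating than it looks. The sketch below computes $\mathcal{L}_{CIoU}$ literally from the definitions of $v$ and $\alpha$, assuming (xmin, ymin, xmax, ymax) boxes with positive width and height. The name `ciou_loss_xyxy` is illustrative, and setting $\alpha = 0$ when $v = 0$ is an assumption added here to avoid a 0/0 for identical boxes. Because it evaluates the formula directly, its values need not match implementations that replace $v$ with a gradient surrogate.

```python
import math


def ciou_loss_xyxy(pred, gt):
    """CIoU loss 1 - IoU + d^2/c^2 + alpha*v for (xmin, ymin, xmax, ymax) boxes."""
    w, h = pred[2] - pred[0], pred[3] - pred[1]
    wg, hg = gt[2] - gt[0], gt[3] - gt[1]
    iw = max(0.0, min(pred[2], gt[2]) - max(pred[0], gt[0]))
    ih = max(0.0, min(pred[3], gt[3]) - max(pred[1], gt[1]))
    inter = iw * ih
    iou = inter / (w * h + wg * hg - inter)
    # Squared diagonal of the smallest enclosing box
    c2 = (max(pred[2], gt[2]) - min(pred[0], gt[0])) ** 2 \
        + (max(pred[3], gt[3]) - min(pred[1], gt[1])) ** 2
    # Squared distance between the two centers
    d2 = ((pred[0] + pred[2]) / 2 - (gt[0] + gt[2]) / 2) ** 2 \
        + ((pred[1] + pred[3]) / 2 - (gt[1] + gt[3]) / 2) ** 2
    # Aspect-ratio consistency term
    v = (4.0 / math.pi ** 2) * (math.atan(wg / hg) - math.atan(w / h)) ** 2
    # Assumption: alpha = 0 when v = 0, which also covers identical boxes
    alpha = v / ((1.0 - iou) + v) if v > 0.0 else 0.0
    return 1.0 - iou + d2 / c2 + alpha * v


gt = (0.0, 0.0, 10.0, 10.0)
same_ratio = (2.0, 2.0, 8.0, 8.0)   # concentric, same aspect ratio as gt
flat = (0.0, 3.0, 10.0, 7.0)        # concentric, different aspect ratio

print(ciou_loss_xyxy(same_ratio, gt))  # only the 1 - IoU term is non-zero
print(ciou_loss_xyxy(flat, gt))        # 1 - IoU plus a small alpha*v penalty
```

For the concentric boxes the distance term is zero, so any difference between CIoU loss and $1 - IoU$ comes from the aspect-ratio penalty alone.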

DIoU-NMS

Not tried yet, stay tuned...
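As an untested illustration of the idea described earlier (using DIoU instead of IoU as the suppression criterion in NMS), here is a minimal NumPy sketch. It is not the authors' implementation: the function names, the (xmin, ymin, xmax, ymax) layout, and the threshold of 0.5 are all assumptions made for this example.

```python
import numpy as np


def diou_one_to_many(box, boxes):
    """DIoU between one xyxy box (shape (4,)) and an (N, 4) array of xyxy boxes."""
    iw = np.clip(np.minimum(box[2], boxes[:, 2]) - np.maximum(box[0], boxes[:, 0]), 0.0, None)
    ih = np.clip(np.minimum(box[3], boxes[:, 3]) - np.maximum(box[1], boxes[:, 1]), 0.0, None)
    inter = iw * ih
    area = (box[2] - box[0]) * (box[3] - box[1])
    areas = (boxes[:, 2] - boxes[:, 0]) * (boxes[:, 3] - boxes[:, 1])
    iou = inter / (area + areas - inter)
    # Squared diagonal of the smallest enclosing box
    cw = np.maximum(box[2], boxes[:, 2]) - np.minimum(box[0], boxes[:, 0])
    ch = np.maximum(box[3], boxes[:, 3]) - np.minimum(box[1], boxes[:, 1])
    c2 = cw ** 2 + ch ** 2
    # Squared distance between box centers
    dx = (box[0] + box[2]) / 2 - (boxes[:, 0] + boxes[:, 2]) / 2
    dy = (box[1] + box[3]) / 2 - (boxes[:, 1] + boxes[:, 3]) / 2
    return iou - (dx ** 2 + dy ** 2) / c2


def diou_nms(boxes, scores, threshold=0.5):
    """Greedy NMS that suppresses a box when its DIoU with a kept box is >= threshold."""
    boxes = np.asarray(boxes, dtype=np.float64)
    scores = np.asarray(scores, dtype=np.float64)
    order = np.argsort(-scores)  # indices sorted by descending score
    keep = []
    while order.size > 0:
        best = order[0]
        keep.append(int(best))
        rest = order[1:]
        if rest.size == 0:
            break
        diou = diou_one_to_many(boxes[best], boxes[rest])
        order = rest[diou < threshold]  # keep only boxes that are not suppressed
    return keep


boxes = [[0, 0, 10, 10], [1, 1, 11, 11], [50, 50, 60, 60]]
scores = [0.9, 0.8, 0.7]
print(diou_nms(boxes, scores))  # [0, 2]
```

Compared with plain IoU-NMS, two boxes with the same overlap are less likely to suppress each other when their centers are far apart, which is the behavior the paper argues helps under occlusion.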

Example

import numpy as np
import matplotlib.pyplot as plt
import math

epsilon = 1e-5

def IoU(box1, box2, wh=False):
    if wh:
        xmin1, ymin1 = box1[0] - box1[2] / 2.0, box1[1] - box1[3] / 2.0
        xmax1, ymax1 = box1[0] + box1[2] / 2.0, box1[1] + box1[3] / 2.0
        xmin2, ymin2 = box2[0] - box2[2] / 2.0, box2[1] - box2[3] / 2.0
        xmax2, ymax2 = box2[0] + box2[2] / 2.0, box2[1] + box2[3] / 2.0
    else:
        xmin1, ymin1, xmax1, ymax1 = box1
        xmin2, ymin2, xmax2, ymax2 = box2

    # Width and height of the intersection (negative if the boxes do not overlap)
    W = min(xmax1, xmax2) - max(xmin1, xmin2)
    H = min(ymax1, ymax2) - max(ymin1, ymin2)

    # Areas of the two boxes
    SA = (xmax1 - xmin1) * (ymax1 - ymin1)
    SB = (xmax2 - xmin2) * (ymax2 - ymin2)

    cross = max(0, W) * max(0, H)  # intersection area
    iou = float(cross) / (SA + SB - cross)

    return iou

def GIoU(box1, box2, wh=False):
    if wh:
        xmin1, ymin1 = box1[0] - box1[2] / 2.0, box1[1] - box1[3] / 2.0
        xmax1, ymax1 = box1[0] + box1[2] / 2.0, box1[1] + box1[3] / 2.0
        xmin2, ymin2 = box2[0] - box2[2] / 2.0, box2[1] - box2[3] / 2.0
        xmax2, ymax2 = box2[0] + box2[2] / 2.0, box2[1] + box2[3] / 2.0
    else:
        xmin1, ymin1, xmax1, ymax1 = box1
        xmin2, ymin2, xmax2, ymax2 = box2

    iou = IoU(box1, box2, wh)
    SC = (max(xmax1, xmax2) - min(xmin1, xmin2)) * (max(ymax1, ymax2) - min(ymin1, ymin2))

    # Width and height of the intersection (negative if the boxes do not overlap)
    W = min(xmax1, xmax2) - max(xmin1, xmin2)
    H = min(ymax1, ymax2) - max(ymin1, ymin2)

    # Areas of the two boxes
    SA = (xmax1 - xmin1) * (ymax1 - ymin1)
    SB = (xmax2 - xmin2) * (ymax2 - ymin2)

    cross = max(0, W) * max(0, H)  # intersection area

    add_area = SA + SB - cross  # union area of the two boxes

    end_area = (SC - add_area) / SC  # fraction of the enclosing box C not covered by either box
    giou = iou - end_area
    return giou

def DIoU(box1, box2, wh=False):
    if wh:
        inter_diag = (box1[0] - box2[0])**2 + (box1[1] - box2[1])**2
        xmin1, ymin1 = box1[0] - box1[2] / 2.0, box1[1] - box1[3] / 2.0
        xmax1, ymax1 = box1[0] + box1[2] / 2.0, box1[1] + box1[3] / 2.0
        xmin2, ymin2 = box2[0] - box2[2] / 2.0, box2[1] - box2[3] / 2.0
        xmax2, ymax2 = box2[0] + box2[2] / 2.0, box2[1] + box2[3] / 2.0
    else:
        xmin1, ymin1, xmax1, ymax1 = box1
        xmin2, ymin2, xmax2, ymax2 = box2
        center_x1 = (xmax1 + xmin1) / 2
        center_y1 = (ymax1 + ymin1) / 2
        center_x2 = (xmax2 + xmin2) / 2
        center_y2 = (ymax2 + ymin2) / 2
        # Squared distance between the two centers
        inter_diag = (center_x1 - center_x2) ** 2 + (center_y1 - center_y2) ** 2

    iou = IoU(box1, box2, wh)
    enclose1 = max(max(xmax1, xmax2)-min(xmin1, xmin2), 0.0)
    enclose2 = max(max(ymax1, ymax2)-min(ymin1, ymin2), 0.0)
    outer_diag = (enclose1 ** 2) + (enclose2 ** 2)
    diou = iou - 1.0 * inter_diag / outer_diag
    return diou

def CIoU(box1, box2, wh=False, normaled=False):
    if wh:
        w1, h1 = box1[2], box1[3]
        w2, h2 = box2[2], box2[3]
        inter_diag = (box1[0] - box2[0])**2 + (box1[1] - box2[1])**2
        xmin1, ymin1 = box1[0] - box1[2] / 2.0, box1[1] - box1[3] / 2.0
        xmax1, ymax1 = box1[0] + box1[2] / 2.0, box1[1] + box1[3] / 2.0
        xmin2, ymin2 = box2[0] - box2[2] / 2.0, box2[1] - box2[3] / 2.0
        xmax2, ymax2 = box2[0] + box2[2] / 2.0, box2[1] + box2[3] / 2.0
    else:
        xmin1, ymin1, xmax1, ymax1 = box1
        xmin2, ymin2, xmax2, ymax2 = box2
        w1, h1 = xmax1-xmin1, ymax1-ymin1
        w2, h2 = xmax2-xmin2, ymax2-ymin2
        center_x1 = (xmax1 + xmin1) / 2
        center_y1 = (ymax1 + ymin1) / 2
        center_x2 = (xmax2 + xmin2) / 2
        center_y2 = (ymax2 + ymin2) / 2
        # Squared distance between the two centers
        inter_diag = (center_x1 - center_x2) ** 2 + (center_y1 - center_y2) ** 2

    iou = IoU(box1, box2, wh)
    enclose1 = max(max(xmax1, xmax2)-min(xmin1, xmin2), 0.0)
    enclose2 = max(max(ymax1, ymax2)-min(ymin1, ymin2), 0.0)
    outer_diag = (enclose1 ** 2) + (enclose2 ** 2)
    u = (inter_diag) / outer_diag

    arctan = math.atan(w2 / h2) - math.atan(w1 / h1)
    v = (4 / (math.pi ** 2)) * (math.atan(w2 / h2) - math.atan(w1 / h1))**2
    S = 1 - iou
    alpha = v / (S + v)
    w_temp = 2 * w1
    distance = w1 ** 2 + h1 ** 2
    ar = (8 / (math.pi ** 2)) * arctan * ((w1 - w_temp) * h1)
    if not normaled:
        cious = iou - (u + alpha * ar / distance)
    else:
        cious = iou - (u + alpha * ar)
    cious = np.clip(cious, a_min=-1.0, a_max=1.0)

    return cious


def bbox_giou_np(boxes1, boxes2):
    # xywh -> xyxy
    boxes1 = np.concatenate([boxes1[..., :2] - boxes1[..., 2:] * 0.5,
                             boxes1[..., :2] + boxes1[..., 2:] * 0.5], axis=-1)
    boxes2 = np.concatenate([boxes2[..., :2] - boxes2[..., 2:] * 0.5,
                             boxes2[..., :2] + boxes2[..., 2:] * 0.5], axis=-1)

    boxes1 = np.concatenate([np.minimum(boxes1[..., :2], boxes1[..., 2:]),
                             np.maximum(boxes1[..., :2], boxes1[..., 2:])], axis=-1)
    boxes2 = np.concatenate([np.minimum(boxes2[..., :2], boxes2[..., 2:]),
                             np.maximum(boxes2[..., :2], boxes2[..., 2:])], axis=-1)

    boxes1_area = (boxes1[..., 2] - boxes1[..., 0]) * (boxes1[..., 3] - boxes1[..., 1])
    boxes2_area = (boxes2[..., 2] - boxes2[..., 0]) * (boxes2[..., 3] - boxes2[..., 1])

    left_up = np.maximum(boxes1[..., :2], boxes2[..., :2])
    right_down = np.minimum(boxes1[..., 2:], boxes2[..., 2:])

    inter_section = np.maximum(right_down - left_up, 0.0)
    inter_area = inter_section[..., 0] * inter_section[..., 1]
    union_area = boxes1_area + boxes2_area - inter_area
    # IoU between the two bounding boxes
    iou = inter_area / union_area
    # Top-left and bottom-right corners of the smallest enclosing box C
    enclose_left_up = np.minimum(boxes1[..., :2], boxes2[..., :2])
    enclose_right_down = np.maximum(boxes1[..., 2:], boxes2[..., 2:])
    enclose = np.maximum(enclose_right_down - enclose_left_up, 0.0)
    # Area of the smallest enclosing box C
    enclose_area = enclose[..., 0] * enclose[..., 1]
    # GIoU according to the GIoU formula
    giou = iou - 1.0 * (enclose_area - union_area) / enclose_area

    return giou

# https://github.com/YunYang1994/TensorFlow2.0-Examples/blob/4d4a403d00e6e887ecb7229719b1407d2e132811/4-Object_Detection/YOLOV3/core/yolov3.py#L121
def bbox_giou_tf(boxes1, boxes2):
    # pred_xywh, label_xywh -> pred_xyxy, label_xyxy
    boxes1 = tf.concat([boxes1[..., :2] - boxes1[..., 2:] * 0.5,
                        boxes1[..., :2] + boxes1[..., 2:] * 0.5], axis=-1)
    boxes2 = tf.concat([boxes2[..., :2] - boxes2[..., 2:] * 0.5,
                        boxes2[..., :2] + boxes2[..., 2:] * 0.5], axis=-1)

    boxes1 = tf.concat([tf.minimum(boxes1[..., :2], boxes1[..., 2:]),
                        tf.maximum(boxes1[..., :2], boxes1[..., 2:])], axis=-1)
    boxes2 = tf.concat([tf.minimum(boxes2[..., :2], boxes2[..., 2:]),
                        tf.maximum(boxes2[..., :2], boxes2[..., 2:])], axis=-1)

    boxes1_area = (boxes1[..., 2] - boxes1[..., 0]) * (boxes1[..., 3] - boxes1[..., 1])
    boxes2_area = (boxes2[..., 2] - boxes2[..., 0]) * (boxes2[..., 3] - boxes2[..., 1])

    left_up = tf.maximum(boxes1[..., :2], boxes2[..., :2])
    right_down = tf.minimum(boxes1[..., 2:], boxes2[..., 2:])

    inter_section = tf.maximum(right_down - left_up, 0.0)
    inter_area = inter_section[..., 0] * inter_section[..., 1]
    union_area = boxes1_area + boxes2_area - inter_area
    # IoU between the two bounding boxes
    iou = inter_area / union_area
    # Top-left and bottom-right corners of the smallest enclosing box C
    enclose_left_up = tf.minimum(boxes1[..., :2], boxes2[..., :2])
    enclose_right_down = tf.maximum(boxes1[..., 2:], boxes2[..., 2:])
    enclose = tf.maximum(enclose_right_down - enclose_left_up, 0.0)
    # Area of the smallest enclosing box C
    enclose_area = enclose[..., 0] * enclose[..., 1]
    # GIoU according to the GIoU formula
    giou = iou - 1.0 * (enclose_area - union_area) / enclose_area

    return giou

def bbox_giou_torch(boxes1, boxes2):
    # boxes1, boxes2 = torch.tensor(boxes1, dtype=torch.float32), torch.tensor(boxes2, dtype=torch.float32)
    boxes1, boxes2 = torch.from_numpy(boxes1).float(), torch.from_numpy(boxes2).float()
    # pred_xywh, label_xywh -> pred_xyxy, label_xyxy
    boxes1 = torch.cat([boxes1[..., :2] - boxes1[..., 2:] * 0.5,
                        boxes1[..., :2] + boxes1[..., 2:] * 0.5], dim=-1)
    boxes2 = torch.cat([boxes2[..., :2] - boxes2[..., 2:] * 0.5,
                        boxes2[..., :2] + boxes2[..., 2:] * 0.5], dim=-1)

    boxes1 = torch.cat([torch.min(boxes1[..., :2], boxes1[..., 2:]),
                        torch.max(boxes1[..., :2], boxes1[..., 2:])], dim=-1)
    boxes2 = torch.cat([torch.min(boxes2[..., :2], boxes2[..., 2:]),
                        torch.max(boxes2[..., :2], boxes2[..., 2:])], dim=-1)

    boxes1_area = (boxes1[..., 2] - boxes1[..., 0]) * (boxes1[..., 3] - boxes1[..., 1])
    boxes2_area = (boxes2[..., 2] - boxes2[..., 0]) * (boxes2[..., 3] - boxes2[..., 1])

    left_up = torch.max(boxes1[..., :2], boxes2[..., :2])
    right_down = torch.min(boxes1[..., 2:], boxes2[..., 2:])

    inter_section = torch.max(right_down - left_up, torch.tensor(0.0))
    inter_area = inter_section[..., 0] * inter_section[..., 1]
    union_area = boxes1_area + boxes2_area - inter_area
    # IoU between the two bounding boxes
    iou = inter_area / union_area
    # Top-left and bottom-right corners of the smallest enclosing box C
    enclose_left_up = torch.min(boxes1[..., :2], boxes2[..., :2])
    enclose_right_down = torch.max(boxes1[..., 2:], boxes2[..., 2:])
    enclose = torch.max(enclose_right_down - enclose_left_up, torch.tensor(0.0))
    # Area of the smallest enclosing box C
    enclose_area = enclose[..., 0] * enclose[..., 1]
    # GIoU according to the GIoU formula
    giou = iou - 1.0 * (enclose_area - union_area) / enclose_area

    return giou


# https://github.com/Zzh-tju/DIoU-SSD-pytorch/blob/65b68b53f73173397937d4950ff916a41545c960/utils/box/box_utils.py#L5
def bbox_diou_torch(bboxes1, bboxes2):
    bboxes1, bboxes2 = torch.from_numpy(bboxes1).float(), torch.from_numpy(bboxes2).float()
    rows = bboxes1.shape[0]
    cols = bboxes2.shape[0]
    dious = torch.zeros((rows, cols))
    if rows * cols == 0:
        return dious
    exchange = False
    if bboxes1.shape[0] > bboxes2.shape[0]:
        bboxes1, bboxes2 = bboxes2, bboxes1
        dious = torch.zeros((cols, rows))
        exchange = True

    w1 = bboxes1[:, 2] - bboxes1[:, 0]
    h1 = bboxes1[:, 3] - bboxes1[:, 1]
    w2 = bboxes2[:, 2] - bboxes2[:, 0]
    h2 = bboxes2[:, 3] - bboxes2[:, 1]

    area1 = w1 * h1
    area2 = w2 * h2
    center_x1 = (bboxes1[:, 2] + bboxes1[:, 0]) / 2
    center_y1 = (bboxes1[:, 3] + bboxes1[:, 1]) / 2
    center_x2 = (bboxes2[:, 2] + bboxes2[:, 0]) / 2
    center_y2 = (bboxes2[:, 3] + bboxes2[:, 1]) / 2

    inter_max_xy = torch.min(bboxes1[:, 2:], bboxes2[:, 2:])
    inter_min_xy = torch.max(bboxes1[:, :2], bboxes2[:, :2])
    out_max_xy = torch.max(bboxes1[:, 2:], bboxes2[:, 2:])
    out_min_xy = torch.min(bboxes1[:, :2], bboxes2[:, :2])

    inter = torch.clamp((inter_max_xy - inter_min_xy), min=0)
    inter_area = inter[:, 0] * inter[:, 1]  # intersection area
    inter_diag = (center_x2 - center_x1) ** 2 + (center_y2 - center_y1) ** 2
    outer = torch.clamp((out_max_xy - out_min_xy), min=0)
    outer_diag = (outer[:, 0] ** 2) + (outer[:, 1] ** 2)
    union = area1 + area2 - inter_area  # union area
    dious = inter_area / union - (inter_diag) / outer_diag
    dious = torch.clamp(dious, min=-1.0, max=1.0)
    if exchange:
        dious = dious.T
    return dious

def bbox_diou_np(boxes1, boxes2, normaled=False):
    inter_diag = np.sum(np.square(boxes1[..., :2] - boxes2[..., :2]), axis=1)
    # pred_xywh, label_xywh -> pred_xyxy, label_xyxy
    boxes1 = np.concatenate([boxes1[..., :2] - boxes1[..., 2:] * 0.5,
                             boxes1[..., :2] + boxes1[..., 2:] * 0.5], axis=-1)
    boxes2 = np.concatenate([boxes2[..., :2] - boxes2[..., 2:] * 0.5,
                             boxes2[..., :2] + boxes2[..., 2:] * 0.5], axis=-1)

    boxes1 = np.concatenate([np.minimum(boxes1[..., :2], boxes1[..., 2:]),
                             np.maximum(boxes1[..., :2], boxes1[..., 2:])], axis=-1)
    boxes2 = np.concatenate([np.minimum(boxes2[..., :2], boxes2[..., 2:]),
                             np.maximum(boxes2[..., :2], boxes2[..., 2:])], axis=-1)

    boxes1_area = (boxes1[..., 2] - boxes1[..., 0]) * (boxes1[..., 3] - boxes1[..., 1])
    boxes2_area = (boxes2[..., 2] - boxes2[..., 0]) * (boxes2[..., 3] - boxes2[..., 1])

    left_up = np.maximum(boxes1[..., :2], boxes2[..., :2])
    right_down = np.minimum(boxes1[..., 2:], boxes2[..., 2:])

    inter_section = np.maximum(right_down - left_up, 0.0)
    inter_area = inter_section[..., 0] * inter_section[..., 1]
    union_area = boxes1_area + boxes2_area - inter_area
    # IoU between the two bounding boxes
    iou = inter_area / union_area
    # Top-left and bottom-right corners of the smallest enclosing box C
    enclose_left_up = np.minimum(boxes1[..., :2], boxes2[..., :2])
    enclose_right_down = np.maximum(boxes1[..., 2:], boxes2[..., 2:])
    enclose = np.maximum(enclose_right_down - enclose_left_up, 0.0)
    outer_diag = (enclose[:, 0] ** 2) + (enclose[:, 1] ** 2)
    # DIoU according to the DIoU formula
    diou = iou - 1.0 * inter_diag / outer_diag
    diou = np.clip(diou, a_min=-1.0, a_max=1.0)

    return diou

def bbox_diou_tf(boxes1, boxes2):
    inter_diag = tf.reduce_sum(tf.square(boxes1[..., :2] - boxes2[..., :2]), axis=1)
    # pred_xywh, label_xywh -> pred_xyxy, label_xyxy
    boxes1 = tf.concat([boxes1[..., :2] - boxes1[..., 2:] * 0.5,
                        boxes1[..., :2] + boxes1[..., 2:] * 0.5], axis=-1)
    boxes2 = tf.concat([boxes2[..., :2] - boxes2[..., 2:] * 0.5,
                        boxes2[..., :2] + boxes2[..., 2:] * 0.5], axis=-1)

    boxes1 = tf.concat([tf.minimum(boxes1[..., :2], boxes1[..., 2:]),
                        tf.maximum(boxes1[..., :2], boxes1[..., 2:])], axis=-1)
    boxes2 = tf.concat([tf.minimum(boxes2[..., :2], boxes2[..., 2:]),
                        tf.maximum(boxes2[..., :2], boxes2[..., 2:])], axis=-1)

    boxes1_area = (boxes1[..., 2] - boxes1[..., 0]) * (boxes1[..., 3] - boxes1[..., 1])
    boxes2_area = (boxes2[..., 2] - boxes2[..., 0]) * (boxes2[..., 3] - boxes2[..., 1])

    left_up = tf.maximum(boxes1[..., :2], boxes2[..., :2])
    right_down = tf.minimum(boxes1[..., 2:], boxes2[..., 2:])

    inter_section = tf.maximum(right_down - left_up, 0.0)
    inter_area = inter_section[..., 0] * inter_section[..., 1]
    union_area = boxes1_area + boxes2_area - inter_area
    # IoU between the two bounding boxes
    iou = inter_area / union_area
    # Top-left and bottom-right corners of the smallest enclosing box C
    enclose_left_up = tf.minimum(boxes1[..., :2], boxes2[..., :2])
    enclose_right_down = tf.maximum(boxes1[..., 2:], boxes2[..., 2:])
    enclose = tf.maximum(enclose_right_down - enclose_left_up, 0.0)
    outer_diag = (enclose[:, 0] ** 2) + (enclose[:, 1] ** 2)
    # DIoU according to the DIoU formula
    diou = iou - 1.0 * inter_diag / outer_diag
    diou = tf.clip_by_value(diou, clip_value_min=-1.0, clip_value_max=1.0)

    return diou


# https://github.com/Zzh-tju/DIoU-SSD-pytorch/blob/65b68b53f73173397937d4950ff916a41545c960/utils/box/box_utils.py#L47
def bbox_ciou_torch(bboxes1, bboxes2, normaled=False):
    bboxes1, bboxes2 = torch.from_numpy(bboxes1).float(), torch.from_numpy(bboxes2).float()
    rows = bboxes1.shape[0]
    cols = bboxes2.shape[0]
    cious = torch.zeros((rows, cols))
    if rows * cols == 0:
        return cious
    exchange = False
    if bboxes1.shape[0] > bboxes2.shape[0]:
        bboxes1, bboxes2 = bboxes2, bboxes1
        cious = torch.zeros((cols, rows))
        exchange = True

    w1 = bboxes1[:, 2] - bboxes1[:, 0]
    h1 = bboxes1[:, 3] - bboxes1[:, 1]
    w2 = bboxes2[:, 2] - bboxes2[:, 0]
    h2 = bboxes2[:, 3] - bboxes2[:, 1]

    area1 = w1 * h1
    area2 = w2 * h2

    center_x1 = (bboxes1[:, 2] + bboxes1[:, 0]) / 2
    center_y1 = (bboxes1[:, 3] + bboxes1[:, 1]) / 2
    center_x2 = (bboxes2[:, 2] + bboxes2[:, 0]) / 2
    center_y2 = (bboxes2[:, 3] + bboxes2[:, 1]) / 2

    inter_max_xy = torch.min(bboxes1[:, 2:], bboxes2[:, 2:])
    inter_min_xy = torch.max(bboxes1[:, :2], bboxes2[:, :2])
    out_max_xy = torch.max(bboxes1[:, 2:], bboxes2[:, 2:])
    out_min_xy = torch.min(bboxes1[:, :2], bboxes2[:, :2])

    inter = torch.clamp((inter_max_xy - inter_min_xy), min=0)
    inter_area = inter[:, 0] * inter[:, 1]
    inter_diag = (center_x2 - center_x1) ** 2 + (center_y2 - center_y1) ** 2
    outer = torch.clamp((out_max_xy - out_min_xy), min=0)
    outer_diag = (outer[:, 0] ** 2) + (outer[:, 1] ** 2)
    union = area1 + area2 - inter_area
    u = (inter_diag) / outer_diag
    iou = inter_area / union
    with torch.no_grad():
        arctan = torch.atan(w2 / h2) - torch.atan(w1 / h1)
        v = (4 / (math.pi ** 2)) * torch.pow((torch.atan(w2 / h2) - torch.atan(w1 / h1)), 2)
        S = 1 - iou
        alpha = v / (S + v)
        w_temp = 2 * w1
        distance = w1 ** 2 + h1 ** 2
    ar = (8 / (math.pi ** 2)) * arctan * ((w1 - w_temp) * h1)
    if not normaled:
        cious = iou - (u + alpha * ar / distance)
    else:
        cious = iou - (u + alpha * ar)
    cious = torch.clamp(cious, min=-1.0, max=1.0)
    if exchange:
        cious = cious.T
    return cious

def bbox_ciou_np(boxes1, boxes2, normaled=False):
    w1, h1 = boxes1[..., 2], boxes1[..., 3]
    w2, h2 = boxes2[..., 2], boxes2[..., 3]
    inter_diag = np.sum(np.square(boxes1[..., :2] - boxes2[..., :2]), axis=-1)
    # pred_xywh, label_xywh -> pred_xyxy, label_xyxy
    boxes1 = np.concatenate([boxes1[..., :2] - boxes1[..., 2:] * 0.5,
                             boxes1[..., :2] + boxes1[..., 2:] * 0.5], axis=-1)
    boxes2 = np.concatenate([boxes2[..., :2] - boxes2[..., 2:] * 0.5,
                             boxes2[..., :2] + boxes2[..., 2:] * 0.5], axis=-1)

    boxes1 = np.concatenate([np.minimum(boxes1[..., :2], boxes1[..., 2:]),
                             np.maximum(boxes1[..., :2], boxes1[..., 2:])], axis=-1)
    boxes2 = np.concatenate([np.minimum(boxes2[..., :2], boxes2[..., 2:]),
                             np.maximum(boxes2[..., :2], boxes2[..., 2:])], axis=-1)

    boxes1_area = (boxes1[..., 2] - boxes1[..., 0]) * (boxes1[..., 3] - boxes1[..., 1])
    boxes2_area = (boxes2[..., 2] - boxes2[..., 0]) * (boxes2[..., 3] - boxes2[..., 1])

    left_up = np.maximum(boxes1[..., :2], boxes2[..., :2])
    right_down = np.minimum(boxes1[..., 2:], boxes2[..., 2:])

    inter_section = np.maximum(right_down - left_up, 0.0)
    inter_area = inter_section[..., 0] * inter_section[..., 1]
    union_area = boxes1_area + boxes2_area - inter_area
    # IoU between the two bounding boxes
    iou = inter_area / union_area
    # Top-left and bottom-right corners of the smallest enclosing box C
    enclose_left_up = np.minimum(boxes1[..., :2], boxes2[..., :2])
    enclose_right_down = np.maximum(boxes1[..., 2:], boxes2[..., 2:])
    enclose = np.maximum(enclose_right_down - enclose_left_up, 0.0)
    outer_diag = (enclose[:, 0] ** 2) + (enclose[:, 1] ** 2)
    u = (inter_diag) / outer_diag
    # CIoU according to the CIoU formula
    arctan = np.arctan(w2 / h2) - np.arctan(w1 / h1)
    v = (4 / (math.pi ** 2)) * np.square(np.arctan(w2 / h2) - np.arctan(w1 / h1))
    S = 1 - iou
    alpha = v / (S + v)
    w_temp = 2 * w1
    distance = w1 ** 2 + h1 ** 2
    ar = (8 / (math.pi ** 2)) * arctan * ((w1 - w_temp) * h1)
    if not normaled:
        cious = iou - (u + alpha * ar / distance)
    else:
        cious = iou - (u + alpha * ar)
    cious = np.clip(cious, a_min=-1.0, a_max=1.0)

    return cious

def bbox_ciou_tf(boxes1, boxes2, normaled=False):
    w1, h1 = boxes1[..., 2], boxes1[..., 3]
    w2, h2 = boxes2[..., 2], boxes2[..., 3]
    inter_diag = tf.reduce_sum(tf.square(boxes1[..., :2] - boxes2[..., :2]), axis=-1)
    # pred_xywh, label_xywh -> pred_xyxy, label_xyxy
    boxes1 = tf.concat([boxes1[..., :2] - boxes1[..., 2:] * 0.5,
                        boxes1[..., :2] + boxes1[..., 2:] * 0.5], axis=-1)
    boxes2 = tf.concat([boxes2[..., :2] - boxes2[..., 2:] * 0.5,
                        boxes2[..., :2] + boxes2[..., 2:] * 0.5], axis=-1)

    boxes1 = tf.concat([tf.minimum(boxes1[..., :2], boxes1[..., 2:]),
                        tf.maximum(boxes1[..., :2], boxes1[..., 2:])], axis=-1)
    boxes2 = tf.concat([tf.minimum(boxes2[..., :2], boxes2[..., 2:]),
                        tf.maximum(boxes2[..., :2], boxes2[..., 2:])], axis=-1)

    boxes1_area = (boxes1[..., 2] - boxes1[..., 0]) * (boxes1[..., 3] - boxes1[..., 1])
    boxes2_area = (boxes2[..., 2] - boxes2[..., 0]) * (boxes2[..., 3] - boxes2[..., 1])

    left_up = tf.maximum(boxes1[..., :2], boxes2[..., :2])
    right_down = tf.minimum(boxes1[..., 2:], boxes2[..., 2:])

    inter_section = tf.maximum(right_down - left_up, 0.0)
    inter_area = inter_section[..., 0] * inter_section[..., 1]
    union_area = boxes1_area + boxes2_area - inter_area
    # IoU between the two bounding boxes
    iou = inter_area / union_area
    # Top-left and bottom-right corners of the smallest enclosing box C
    enclose_left_up = tf.minimum(boxes1[..., :2], boxes2[..., :2])
    enclose_right_down = tf.maximum(boxes1[..., 2:], boxes2[..., 2:])
    enclose = tf.maximum(enclose_right_down - enclose_left_up, 0.0)
    outer_diag = (enclose[:, 0] ** 2) + (enclose[:, 1] ** 2)
    u = (inter_diag) / outer_diag
    # CIoU according to the CIoU formula
    # arctan = tf.atan(w2 / h2) - tf.atan(w1 / h1)
    # v = (4 / (math.pi ** 2)) * np.square(tf.atan(w2 / h2) - tf.atan(w1 / h1))
    arctan = tf.atan(w2 / (h2 + epsilon)) - tf.atan(w1 / (h1 + epsilon))
    v = (4 / (math.pi ** 2)) * tf.square(tf.atan(w2 / (h2 + epsilon)) - tf.atan(w1 / (h1 + epsilon)))
    S = 1 - iou
    alpha = tf.stop_gradient(v / (S + v))
    w_temp = tf.stop_gradient(2 * w1)
    distance = tf.stop_gradient(w1 ** 2 + h1 ** 2 + epsilon)
    ar = (8 / (math.pi ** 2)) * arctan * ((w1 - w_temp) * h1)
    if not normaled:
        cious = iou - (u + alpha * ar / distance)
    else:
        cious = iou - (u + alpha * ar)
    cious = tf.clip_by_value(cious, clip_value_min=-1.0, clip_value_max=1.0)

    return cious


img_width = 480.0
img_height = 320.0
gt_bboxes_xyxy = np.array([[50, 40, 200, 200], [270, 70, 400, 180]])  # xyxy
pre_bboxes_xyxy = np.array([[100, 100, 250, 300], [400, 180, 460, 300]])  # xyxy

gt_bboxes_xyxy_nomal = np.zeros(shape=gt_bboxes_xyxy.shape, dtype=np.float64)
pre_bboxes_xyxy_nomal = np.zeros(shape=pre_bboxes_xyxy.shape, dtype=np.float64)
gt_bboxes_xyxy_nomal[..., 0::2] = gt_bboxes_xyxy[..., 0::2] / img_width
gt_bboxes_xyxy_nomal[..., 1::2] = gt_bboxes_xyxy[..., 1::2] / img_height
pre_bboxes_xyxy_nomal[..., 0::2] = pre_bboxes_xyxy[..., 0::2] / img_width
pre_bboxes_xyxy_nomal[..., 1::2] = pre_bboxes_xyxy[..., 1::2] / img_height

gt_bboxes_xywh = np.array([[125, 120, 150, 160], [335, 125, 130, 110]])  # xywh
pre_bboxes_xywh = np.array([[175, 200, 150, 200], [430, 240, 60, 120]])  # xywh

gt_bboxes_xywh_nomal = np.zeros(shape=gt_bboxes_xywh.shape, dtype=np.float64)
pre_bboxes_xywh_nomal = np.zeros(shape=pre_bboxes_xywh.shape, dtype=np.float64)
gt_bboxes_xywh_nomal[..., 0::2] = gt_bboxes_xywh[..., 0::2] / img_width
gt_bboxes_xywh_nomal[..., 1::2] = gt_bboxes_xywh[..., 1::2] / img_height
pre_bboxes_xywh_nomal[..., 0::2] = pre_bboxes_xywh[..., 0::2] / img_width
pre_bboxes_xywh_nomal[..., 1::2] = pre_bboxes_xywh[..., 1::2] / img_height

# ================================================================ #
fig = plt.figure()
ax = fig.add_subplot(111)
currentAxis = plt.gca()
for idx, (gt, pt) in enumerate(zip(gt_bboxes_xywh, pre_bboxes_xywh)):
    iou = IoU(gt, pt, True)
    giou = GIoU(gt, pt, True)
    diou = DIoU(gt, pt, True)
    ciou = CIoU(gt, pt, True)
    currentAxis.text(gt[0] - gt[2] / 2, 20, 'iou={:.4f}, giou={:.4f}'.format(iou, giou),
                     bbox={'facecolor': 'yellow', 'alpha': 0.5})
    currentAxis.text(gt[0] - gt[2] / 2, gt[1] + gt[3] / 2 + 20, 'diou={:.4f}, ciou={:.4f}'.format(diou, ciou),
                     bbox={'facecolor': 'yellow', 'alpha': 0.5})
    currentAxis.add_patch(plt.Rectangle((gt[0]-gt[2]/2,gt[1]-gt[3]/2),gt[2],gt[3],
                                        fill=False, edgecolor='green', linewidth=2))
    currentAxis.text(gt[0]-gt[2]/2,gt[1]-gt[3]/2, 'g{}'.format(idx), bbox={'facecolor': 'green', 'alpha': 0.5})
    currentAxis.add_patch(plt.Rectangle((pt[0]-pt[2]/2, pt[1]-pt[3]/2), pt[2], pt[3],
                                        fill=False, edgecolor='red', linewidth=2))
    currentAxis.text(pt[0]-pt[2]/2, pt[1]-pt[3]/2, 'p{}'.format(idx), bbox={'facecolor': 'red', 'alpha': 0.5})


plt.xticks(np.arange(0, img_width+1, 40))
plt.yticks(np.arange(0, img_height+1, 40))
currentAxis.invert_yaxis()
plt.show()

# ================================================================ #

import tensorflow as tf  # the code below uses the TensorFlow 1.x graph API (tf.placeholder, tf.Session)
import torch

label_bbox = tf.placeholder(dtype=tf.float32, name='label_bbox')
predic_bbox = tf.placeholder(dtype=tf.float32, name='predic_bbox')
label_bbox_normal = tf.placeholder(dtype=tf.float32, name='label_bbox_normal')
predic_bbox_normal = tf.placeholder(dtype=tf.float32, name='predic_bbox_normal')

# ================================================================ #
#                               GIoU                               #
# ================================================================ #
gious = np.expand_dims(bbox_giou_np(gt_bboxes_xywh, pre_bboxes_xywh), axis=-1)
print('numpy publish giou:                ', gious)
# ================================================================ #
gious = tf.expand_dims(bbox_giou_tf(predic_bbox, label_bbox), axis=-1)

with tf.Session() as sess:
    result = sess.run(gious, feed_dict={label_bbox: gt_bboxes_xywh,
                                       predic_bbox: pre_bboxes_xywh}
                      )
    print('tensorflow publish giou:           ', result)
# ================================================================ #
gious = bbox_giou_torch(gt_bboxes_xywh, pre_bboxes_xywh).unsqueeze(-1)
print('pytorch publish goiu:              ', gious.numpy())

# ================================================================ #
#                               DIoU                               #
# ================================================================ #
dious = np.expand_dims(bbox_diou_np(gt_bboxes_xywh, pre_bboxes_xywh), axis=-1)
print('numpy publish diou :               ', dious)
# ================================================================
dious = bbox_diou_torch(gt_bboxes_xyxy, pre_bboxes_xyxy).unsqueeze(-1)
print('pytorch publish diou:              ', dious.numpy())
# ================================================================
label_bbox = tf.placeholder(dtype=tf.float32, name='label_bbox')
predic_bbox = tf.placeholder(dtype=tf.float32, name='predic_bbox')
dious = tf.expand_dims(bbox_diou_tf(label_bbox, predic_bbox), axis=-1)
with tf.Session() as sess:
    result = sess.run(dious, feed_dict={label_bbox: gt_bboxes_xywh,
                                       predic_bbox: pre_bboxes_xywh})
    print('tensorflow publish diou:           ', result)

# ================================================================ #
#                               CIoU                               #
# ================================================================ #
cious = bbox_ciou_torch(gt_bboxes_xyxy, pre_bboxes_xyxy, False).unsqueeze(-1)
print('pytorch publish ciou unnormaled:   ', cious.numpy())

cious = bbox_ciou_torch(gt_bboxes_xyxy_nomal, pre_bboxes_xyxy_nomal, True).unsqueeze(-1)
print('pytorch publish ciou normaled:     ', cious.numpy())
# ================================================================ #
cious = np.expand_dims(bbox_ciou_np(gt_bboxes_xywh, pre_bboxes_xywh, False), axis=-1)
print('numpy publish ciou unnormaled:     ', cious)

cious = np.expand_dims(bbox_ciou_np(gt_bboxes_xywh_nomal, pre_bboxes_xywh_nomal, True), axis=-1)
print('numpy publish ciou normaled:       ', cious)
# ================================================================ #
cious = tf.expand_dims(bbox_ciou_tf(label_bbox, predic_bbox, False), axis=-1)
cious_normal = tf.expand_dims(bbox_ciou_tf(label_bbox_normal, predic_bbox_normal, True), axis=-1)
with tf.Session() as sess:
    cious_tf, cious_tf_normal = sess.run([cious, cious_normal],
                                                  feed_dict={label_bbox_normal: gt_bboxes_xywh_nomal,
                                                             predic_bbox_normal: pre_bboxes_xywh_nomal,
                                                             label_bbox: gt_bboxes_xywh,
                                                             predic_bbox: pre_bboxes_xywh})
    print('tensorflow publish ciou unnormaled:', cious_tf)
    print('tensorflow publish ciou normaled:  ', cious_tf_normal)
# ================================================================ #

numpy publish giou:                 [[ 0.07342657]
 [-0.50800915]]
tensorflow publish giou:            [[ 0.07342657]
 [-0.50800914]]
pytorch publish goiu:               [[ 0.07342657]
 [-0.50800914]]
numpy publish diou :                [[ 0.14455897]
 [-0.25      ]]
pytorch publish diou:               [[ 0.14455898]
 [-0.25      ]]
tensorflow publish diou:            [[ 0.14455898]
 [-0.25      ]]
pytorch publish ciou unnormaled:    [[ 0.14428109]
 [-0.2600825 ]]
pytorch publish ciou normaled:      [[ 0.1392411 ]
 [-0.25120372]]
numpy publish ciou unnormaled:      [[ 0.14428107]
 [-0.26008251]]
numpy publish ciou normaled:        [[ 0.13924112]
 [-0.25120372]]
tensorflow publish ciou unnormaled: [[ 0.14428109]
 [-0.2600825 ]]
tensorflow publish ciou normaled:   [[ 0.13924108]
 [-0.25120363]]

 Results from a colleague's experiments:

method   GIoU     DIoU     CIoU
mAP      81.37%   81.46%   82.36%




posted @ 2019-12-25 17:07  xuanyuyt