Non-Maximum Suppression (NMS)

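Why NMS is needed

Non-Maximum Suppression (NMS), as the name suggests, suppresses elements that are not local maxima; it can be understood as a local maximum search. In object detection, its purpose is to remove redundant detection boxes and keep only the best one.

Take object detection as an example. Inference produces many detection boxes, and many of them cover the same object, yet only one box per object is needed in the end. NMS selects the box with the highest score and computes its IoU with each of the remaining boxes. Any box whose IoU exceeds the chosen threshold is suppressed, which here means setting its score to 0. After this round, NMS again picks the highest-scoring box among those that remain and suppresses the boxes whose IoU with it exceeds the threshold, and so on, until the surviving boxes barely overlap. This leaves essentially one detection box per object.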

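How to compute NMS

Prerequisites: a list of bounding boxes, the corresponding list of confidence scores, and an IoU threshold that decides which heavily overlapping boxes get removed.

IoU: intersection-over-union, i.e. the area of the intersection of two bounding boxes divided by the area of their union.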

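The NMS procedure is as follows:

1. Collect all bboxes and sort them by confidence score.

2. From the sorted list, take the bbox with the highest confidence among the remaining candidates and keep it.

3. Compute the IoU between that bbox and each of the other remaining bboxes.

4. Remove every bbox whose IoU is greater than the threshold, i.e. remove the boxes that overlap it heavily.

5. Repeat steps 2 to 4 on the remaining candidate bboxes until no more boxes can be removed.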
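
The five steps above can be sketched in plain Python. This is a minimal illustration rather than a reference implementation: the function names `iou` and `nms`, the default threshold of 0.5, and the sample boxes are assumptions made for the example. Suppressed boxes are dropped from the candidate list instead of having their score set to 0, which gives the same result.

```python
def iou(a, b):
    """IoU of two boxes given as (x1, y1, x2, y2)."""
    # Intersection rectangle: max of the top-left corners, min of the bottom-right corners
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(ix2 - ix1, 0) * max(iy2 - iy1, 0)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def nms(boxes, scores, iou_threshold=0.5):
    """Greedy NMS; returns the indices of the kept boxes, highest score first."""
    # Step 1: sort box indices by confidence, highest first
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    while order:  # Step 5: loop until no candidates remain
        # Step 2: keep the remaining box with the highest confidence
        best = order.pop(0)
        keep.append(best)
        # Steps 3-4: drop every remaining box whose IoU with it exceeds the threshold
        order = [i for i in order if iou(boxes[best], boxes[i]) <= iou_threshold]
    return keep


# Hypothetical sample data: two heavily overlapping boxes and one separate box
boxes = [(10, 10, 50, 50), (12, 12, 52, 52), (100, 100, 140, 140)]
scores = [0.9, 0.8, 0.7]
print(nms(boxes, scores, iou_threshold=0.5))  # [0, 2]
```

The second box overlaps the first with an IoU of about 0.82, so it is suppressed; the third box does not overlap either and is kept.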

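Intersection over Union (IoU)

IoU stands for Intersection over Union: the area of the intersection of two boxes divided by the area of their union. It measures how much two boxes overlap, ranging from 0 (no overlap) to 1 (identical boxes).

It is commonly used to measure the overlap between a ground-truth bounding box Bgt (the dataset annotation) and a predicted bounding box Bp (the model output).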

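Approach

First, take the element-wise maximum of the two boxes' top-left corners and the element-wise minimum of their bottom-right corners; these are the corners of the intersection rectangle.
Then compute the intersection area, treating a negative width or height as 0 (no overlap).
Finally, divide the intersection area by the union area (the sum of the two box areas minus the intersection area).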
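
Written out as formulas, the three steps look as follows. The superscripts A and B for the two boxes are notation introduced here for illustration; boxes are assumed to be in (x1, y1, x2, y2) format, as in the code in this post.

```latex
\begin{aligned}
x_1 &= \max(x_1^A, x_1^B), & y_1 &= \max(y_1^A, y_1^B) \\
x_2 &= \min(x_2^A, x_2^B), & y_2 &= \min(y_2^A, y_2^B) \\
\text{inter} &= \max(x_2 - x_1,\, 0) \cdot \max(y_2 - y_1,\, 0) \\
\text{IoU} &= \frac{\text{inter}}{\text{area}_A + \text{area}_B - \text{inter}}
\end{aligned}
```

For example, with A = (10, 10, 50, 50) and B = (20, 20, 60, 60), the intersection rectangle is (20, 20, 50, 50), so inter = 30 × 30 = 900. Each box has area 40 × 40 = 1600, so IoU = 900 / (1600 + 1600 − 900) = 900 / 2300 ≈ 0.3913.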

NumPy implementation

import numpy as np

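def iou(boxes1, boxes2):
    """
    Pairwise IoU of two sets of boxes in (x1, y1, x2, y2) format.
    Arguments:
        boxes1 (Array[N, 4])
        boxes2 (Array[M, 4])
    Returns:
        iou (Array[N, M]): IoU of every box in boxes1 with every box in boxes2
    """
    # Area of each box in the first set, shape (N,)
    area1 = (boxes1[:, 2] - boxes1[:, 0]) * (boxes1[:, 3] - boxes1[:, 1])

    # Area of each box in the second set, shape (M,)
    area2 = (boxes2[:, 2] - boxes2[:, 0]) * (boxes2[:, 3] - boxes2[:, 1])

    # Corners of the intersection rectangles; broadcasting (N, 1) against (M,) gives (N, M)
    x1 = np.maximum(boxes1[:, 0][:, np.newaxis], boxes2[:, 0])
    y1 = np.maximum(boxes1[:, 1][:, np.newaxis], boxes2[:, 1])
    x2 = np.minimum(boxes1[:, 2][:, np.newaxis], boxes2[:, 2])
    y2 = np.minimum(boxes1[:, 3][:, np.newaxis], boxes2[:, 3])

    # Intersection area; negative widths/heights (no overlap) are clipped to 0
    inter_area = np.maximum(x2 - x1, 0) * np.maximum(y2 - y1, 0)

    # Union area = area1 + area2 - intersection
    union_area = area1[:, np.newaxis] + area2 - inter_area

    # IoU = intersection / union
    return inter_area / union_area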


boxes1 = np.array([[10, 10, 50, 50],
                   [20, 20, 60, 60],
                   [30, 30, 70, 70]])
boxes2 = np.array([[20, 20, 60, 60],
                   [30, 30, 70, 70],
                   [40, 40, 80, 80]])

iou_matrix = iou(boxes1, boxes2)
print(iou_matrix)

[[0.39130435 0.14285714 0.03225806]
 [1.         0.39130435 0.14285714]
 [0.39130435 1.         0.39130435]]
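
One edge case is worth noting: the `iou` function above divides by the union area directly, so a pair of zero-area (degenerate) boxes gives 0/0 and NumPy returns `nan` with a runtime warning. A minimal sketch of one way to guard against this is to add a small `eps` to the denominator. The function name `iou_safe` and the value `1e-7` are assumptions made for this example.

```python
import numpy as np


def iou_safe(boxes1, boxes2, eps=1e-7):
    """Pairwise IoU for (x1, y1, x2, y2) boxes; eps keeps the denominator non-zero."""
    boxes1 = np.asarray(boxes1, dtype=np.float64)
    boxes2 = np.asarray(boxes2, dtype=np.float64)
    area1 = (boxes1[:, 2] - boxes1[:, 0]) * (boxes1[:, 3] - boxes1[:, 1])  # (N,)
    area2 = (boxes2[:, 2] - boxes2[:, 0]) * (boxes2[:, 3] - boxes2[:, 1])  # (M,)
    # Top-left and bottom-right corners of every pairwise intersection, shape (N, M, 2)
    lt = np.maximum(boxes1[:, None, :2], boxes2[None, :, :2])
    rb = np.minimum(boxes1[:, None, 2:], boxes2[None, :, 2:])
    wh = np.clip(rb - lt, 0, None)  # negative extents mean no overlap
    inter = wh[..., 0] * wh[..., 1]  # (N, M)
    union = area1[:, None] + area2[None, :] - inter
    return inter / (union + eps)


# A zero-area box compared with itself: the result is 0 instead of nan
degenerate = np.array([[5, 5, 5, 5]])
print(iou_safe(degenerate, degenerate))
```

For ordinary boxes the `eps` term changes the result only far below the printed precision, so the matrix shown above is unaffected.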

PyTorch implementation

import numpy as np
import torch
from torch import Tensor

def box_area(boxes: Tensor) -> Tensor:
    """
    Computes the area of a set of bounding boxes, which are specified by its
    (x1, y1, x2, y2) coordinates.
    Arguments:
        boxes (Tensor[N, 4]): boxes for which the area will be computed. They
            are expected to be in (x1, y1, x2, y2) format
    Returns:
        area (Tensor[N]): area for each box
    """
    return (boxes[:, 2] - boxes[:, 0]) * (boxes[:, 3] - boxes[:, 1])

def box_iou(boxes1: Tensor, boxes2: Tensor) -> Tensor:
    """
    Return intersection-over-union (Jaccard index) of boxes.
    Both sets of boxes are expected to be in (x1, y1, x2, y2) format.
    Arguments:
        boxes1 (Tensor[N, 4])
        boxes2 (Tensor[M, 4])
    Returns:
        iou (Tensor[N, M]): the NxM matrix containing the pairwise IoU values for every element in boxes1 and boxes2
    """
    area1 = box_area(boxes1)  # area of each box, shape (N,)
    area2 = box_area(boxes2)  # shape (M,)

    # Top-left corners of the intersections: each of the N boxes is compared with all M boxes
    lt = torch.max(boxes1[:, None, :2], boxes2[:, :2])  # [N,M,2]
    # Bottom-right corners of the intersections
    rb = torch.min(boxes1[:, None, 2:], boxes2[:, 2:])  # [N,M,2]

    wh = (rb - lt).clamp(min=0)  # [N,M,2]; negative widths/heights (no overlap) are clamped to 0
    inter = wh[:, :, 0] * wh[:, :, 1]  # [N,M]

    iou = inter / (area1[:, None] + area2 - inter)
    return iou  # NxM: IoU of every box in boxes1 with every box in boxes2


boxes1 = np.array([[10, 10, 50, 50],
                   [20, 20, 60, 60],
                   [30, 30, 70, 70]])

boxes2 = np.array([[20, 20, 60, 60],
                   [30, 30, 70, 70],
                   [40, 40, 80, 80]])

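# Convert to float tensors so the division yields fractional IoU values
box1 = torch.as_tensor(boxes1, dtype=torch.float32)
box2 = torch.as_tensor(boxes2, dtype=torch.float32)
iou_matrix = box_iou(box1, box2)
print(iou_matrix)
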
tensor([[0.3913, 0.1429, 0.0323],
        [1.0000, 0.3913, 0.1429],
        [0.3913, 1.0000, 0.3913]])

PyTorch implementation (compact version with eps)

def box_iou_torch(box1, box2, eps=1e-7):
    """
    Calculate intersection-over-union (IoU) of boxes.
    Both sets of boxes are expected to be in (x1, y1, x2, y2) format.
    Based on https://github.com/pytorch/vision/blob/master/torchvision/ops/boxes.py

    Args:
        box1 (torch.Tensor): A tensor of shape (N, 4) representing N bounding boxes.
        box2 (torch.Tensor): A tensor of shape (M, 4) representing M bounding boxes.
        eps (float, optional): A small value to avoid division by zero. Defaults to 1e-7.

    Returns:
        (torch.Tensor): An NxM tensor containing the pairwise IoU values for every element in box1 and box2.
    """

    # inter(N,M) = (rb(N,M,2) - lt(N,M,2)).clamp(0).prod(2)
    (a1, a2), (b1, b2) = box1.unsqueeze(1).chunk(2, 2), box2.unsqueeze(0).chunk(2, 2)
    inter = (torch.min(a2, b2) - torch.max(a1, b1)).clamp_(0).prod(2)

    # IoU = inter / (area1 + area2 - inter)
    return inter / ((a2 - a1).prod(2) + (b2 - b1).prod(2) - inter + eps)

boxes1 = np.array([[10, 10, 50, 50],
                   [20, 20, 60, 60],
                   [30, 30, 70, 70]])

boxes2 = np.array([[20, 20, 60, 60],
                   [30, 30, 70, 70],
                   [40, 40, 80, 80]])

box1 = torch.as_tensor(boxes1, dtype=torch.float32)
box2 = torch.as_tensor(boxes2, dtype=torch.float32)
res = box_iou_torch(box1, box2)
print(res)

tensor([[0.3913, 0.1429, 0.0323],
        [1.0000, 0.3913, 0.1429],
        [0.3913, 1.0000, 0.3913]])