YOLOv3-YOLOv7: Parsing the COCO Dataset

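Until now I have always trained YOLO on VOC-format data; this post collects my notes on COCO-format data.

After downloading the data from the COCO website, the directory layout looks like this: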

.
├── annotations
│   ├── captions_train2017.json
│   ├── captions_val2017.json
│   ├── instances_train2017.json
│   ├── instances_val2017.json
│   ├── person_keypoints_train2017.json
│   └── person_keypoints_val2017.json
└── images
    ├── test2017 (40670 `.jpg`)
    ├── train2017 (118287 `.jpg`)
    └── val2017 (5000 `.jpg`)

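An article introducing this dataset: https://zhuanlan.zhihu.com/p/29393415

The following two files are the annotation files for detection and segmentation. YOLO converts this data to txt, so that each image has one corresponding txt file.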

|	├── instances_train2017.json
|	├── instances_val2017.json
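
For intuition about what such a conversion involves, here is a minimal sketch, not the script that produced the official label files. It handles boxes only (no segmentation polygons), and it rests on two assumptions of mine: class indices are made contiguous by sorting the COCO category ids, and `iscrowd` annotations are skipped. The function name `coco_to_yolo_lines` and the toy data are hypothetical.

```python
from collections import defaultdict


def coco_to_yolo_lines(coco):
    """Group COCO box annotations into YOLO-style label lines per image.

    `coco` is a dict shaped like instances_*.json.
    Returns {file_name: ["cls cx cy w h", ...]}.
    """
    # Assumption: contiguous class indices come from sorting the category ids.
    cats = sorted(coco["categories"], key=lambda c: c["id"])
    cat_index = {c["id"]: i for i, c in enumerate(cats)}
    images = {im["id"]: im for im in coco["images"]}

    lines = defaultdict(list)
    for ann in coco["annotations"]:
        if ann.get("iscrowd", 0):
            continue  # assumption: crowd regions are skipped
        im = images[ann["image_id"]]
        w, h = im["width"], im["height"]
        x, y, bw, bh = ann["bbox"]  # COCO: top-left x, y, width, height in pixels
        cx, cy = (x + bw / 2) / w, (y + bh / 2) / h  # normalized box center
        cls = cat_index[ann["category_id"]]
        lines[im["file_name"]].append(
            f"{cls} {cx:.6f} {cy:.6f} {bw / w:.6f} {bh / h:.6f}"
        )
    return dict(lines)


# Toy data, made up for illustration.
toy = {
    "images": [{"id": 1, "file_name": "000000000001.jpg", "width": 640, "height": 480}],
    "categories": [{"id": 1, "name": "person"}, {"id": 3, "name": "car"}],
    "annotations": [
        {"image_id": 1, "category_id": 3, "bbox": [100, 50, 200, 100], "iscrowd": 0},
    ],
}
print(coco_to_yolo_lines(toy))
```

Each returned list would be written to a txt file named after the image, one line per object.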

The converted files are available in coco2017labels-segments.zip.

Unzip it and put the contents into the corresponding folders. The final data layout is:

.
├── annotations
│   ├── captions_train2017.json
│   ├── captions_val2017.json
│   ├── instances_train2017.json
│   ├── instances_val2017.json
│   ├── person_keypoints_train2017.json
│   └── person_keypoints_val2017.json
├── images
│   ├── test2017 (40670 `.jpg`)
│   ├── train2017 (118287 `.jpg`)
│   └── val2017 (5000 `.jpg`)
├── labels
│   ├── train2017 (117266 `.txt`)
│   └── val2017 (4952 `.txt`)
├── train2017.txt
├── test-dev2017.txt
└── val2017.txt

train2017.txt, test-dev2017.txt, and val2017.txt each contain one image path per line, for example:

./images/train2017/000000109622.jpg
./images/train2017/000000160694.jpg
./images/train2017/000000308590.jpg
...

With the txt labels in place, instances_train2017.json and instances_val2017.json are no longer needed.
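
Given the layout above, the label file for an image can be found by swapping the `images` directory for `labels` and the extension for `.txt`. The sketch below illustrates that mapping. The helper name `img2label_path` is hypothetical, it assumes forward-slash paths like those in the list files, and it is not a copy of the loader code in any YOLO repository.

```python
import os


def img2label_path(img_path):
    """Map ./images/<split>/<name>.jpg to ./labels/<split>/<name>.txt."""
    # Split on the last "/images/" component so earlier directories are untouched.
    head, sep, tail = img_path.rpartition("/images/")
    if not sep:
        raise ValueError(f"no /images/ directory in path: {img_path}")
    stem, _ext = os.path.splitext(tail)
    return f"{head}/labels/{stem}.txt"


print(img2label_path("./images/train2017/000000109622.jpg"))
# ./labels/train2017/000000109622.txt
```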

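In each label file (xx.txt), every line starts with a class index (a number) followed by a series of coordinates used for instance segmentation. The detection box can be derived from these segmentation coordinates, as in the following code (from YOLOv7):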

def segments2boxes(segments):
	# Convert segment labels to box labels, i.e. (xy1, xy2, ...) to (xywh)
	boxes = []
	for s in segments:
		x, y = s.T  # segment xy
		boxes.append([x.min(), y.min(), x.max(), y.max()])  # xyxy; once stacked, a shape such as (8, 4) means 8 boxes in this image, with 4 values for the top-left and bottom-right corners
	return xyxy2xywh(np.array(boxes))  # xywh

Finally, the coordinates are converted to center point plus w, h format, as follows:

def xyxy2xywh(x):
	# Convert nx4 boxes from [x1, y1, x2, y2] to [x, y, w, h] where xy1=top-left, xy2=bottom-right
	y = x.clone() if isinstance(x, torch.Tensor) else np.copy(x)
	y[:, 0] = (x[:, 0] + x[:, 2]) / 2  # x center
	y[:, 1] = (x[:, 1] + x[:, 3]) / 2  # y center
	y[:, 2] = x[:, 2] - x[:, 0]  # width
	y[:, 3] = x[:, 3] - x[:, 1]  # height
	return y
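
To make the two steps concrete, here is a small worked example on a single label line. It is a NumPy-only sketch. The helper names `parse_segment_line` and `segment_to_cxcywh` and the label line itself are made up for illustration; they are not YOLOv7 functions.

```python
import numpy as np


def parse_segment_line(line):
    """Split one label line into (class index, (n, 2) array of polygon points)."""
    fields = line.split()
    points = np.array(fields[1:], dtype=np.float32).reshape(-1, 2)
    return int(fields[0]), points


def segment_to_cxcywh(points):
    """Tight box around a polygon, returned as [cx, cy, w, h]."""
    x, y = points.T
    x1, y1, x2, y2 = x.min(), y.min(), x.max(), y.max()
    return np.array([(x1 + x2) / 2, (y1 + y2) / 2, x2 - x1, y2 - y1], dtype=np.float32)


# Made-up label line: class 45 followed by four normalized polygon points.
cls, pts = parse_segment_line("45 0.1 0.2 0.5 0.2 0.5 0.6 0.1 0.6")
box = segment_to_cxcywh(pts)
print(cls, box)  # class 45, box close to [0.3, 0.4, 0.4, 0.4]
```

The polygon spans x from 0.1 to 0.5 and y from 0.2 to 0.6, so the box center is (0.3, 0.4) and both width and height are 0.4.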

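Note that the coordinates in these YOLO-format txt labels are normalized decimals, i.e. all in the range [0, 1]. The original COCO JSON annotations, by contrast, store each box as [x, y, width, height] in absolute pixels, where (x, y) is the top-left corner.

The code below converts x, y, w, h boxes (top-left corner, width and height) to cx, cy, w, h (center, width and height) and normalizes them. It can be added to utils/general.py as xywh2cxcywh(x, shape):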

def xywh2cxcywh(x, shape):
	# Convert nx4 boxes from [x1, y1, w, h] to [cx, cy, w, h] where xy1=top-left, cxcy=centre; shape is the image (w, h)
	y = x.clone() if isinstance(x, torch.Tensor) else np.copy(x)
	y[:, 0] = (x[:, 0] + x[:, 2] / 2)  # x center
	y[:, 1] = (x[:, 1] + x[:, 3] / 2)  # y center
	y[:, 2] = x[:, 2]   # width
	y[:, 3] = x[:, 3]   # height
	# normalize by image width and height
	dw = 1. / shape[0] 
	dh = 1. / shape[1]
	y[:, 0] = y[:, 0] * dw
	y[:, 1] = y[:, 1] * dh
	y[:, 2] = y[:, 2] * dw
	y[:, 3] = y[:, 3] * dh
	return y
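
A quick numeric check of this conversion, written as a NumPy-only restatement so it runs on its own (the name `xywh2cxcywh_np` and the sample box are hypothetical). One point worth hedging: because the function above copies its input, an integer array stays integer and the normalized values would be truncated on assignment, so this sketch casts to float first.

```python
import numpy as np


def xywh2cxcywh_np(x, shape):
    """[x1, y1, w, h] in pixels -> normalized [cx, cy, w, h]; shape is (img_w, img_h)."""
    x = np.asarray(x, dtype=np.float64)  # cast first so integer input is not truncated
    img_w, img_h = shape
    y = np.empty_like(x)
    y[:, 0] = (x[:, 0] + x[:, 2] / 2) / img_w  # x center
    y[:, 1] = (x[:, 1] + x[:, 3] / 2) / img_h  # y center
    y[:, 2] = x[:, 2] / img_w  # width
    y[:, 3] = x[:, 3] / img_h  # height
    return y


boxes = np.array([[100, 50, 200, 100]])  # integer pixel box, made up for illustration
out = xywh2cxcywh_np(boxes, shape=(640, 480))
print(out)  # cx = 200/640, cy = 100/480, w = 200/640, h = 100/480
```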

Reading YOLO-format data

def cache_labels(self, path=Path('./labels.cache'), prefix=''):  # overridden
    # Cache dataset labels, check images and read shapes
    x = {}  # dict
    nm, nf, ne, nc = 0, 0, 0, 0  # number missing (images with no label file), found (label files found), empty (label file exists but contains nothing), corrupted (samples that failed while being read)
    pbar = tqdm(zip(self.img_files, self.label_files), desc='Scanning images', total=len(self.img_files))  # progress bar, e.g. Scanning images:   0%|                          | 0/118287 [00:00<?, ?it/s]
    for i, (im_file, lb_file) in enumerate(pbar):  # loop over every sample, i.e. each (image jpg, label txt) pair
        try:
            # verify images
            im = Image.open(im_file)  # check that the image can be opened
            im.verify()  # PIL verify: check file integrity
            shape = exif_size(im)  # image size
            segments = []  # instance segments
            assert (shape[0] > 9) & (shape[1] > 9), f'image size {shape} <10 pixels'
            assert im.format.lower() in img_formats, f'invalid image format {im.format}'

            # verify labels
            if os.path.isfile(lb_file):
                nf += 1  # label found
                with open(lb_file, 'r') as f:
                    l = [x.split() for x in f.read().strip().splitlines()]  # read every line (one annotation each) of the label txt into a list
                    if any([len(x) > 8 for x in l]):  # is segment: a line with more than 8 fields is a segmentation label
                        classes = np.array([x[0] for x in l], dtype=np.float32)  # the first column is the class, stored as a numeric string such as '45'; this builds the class array for the file, e.g. [45.0, 45.0, 50.0, 45.0, 49.0, 49.0, 49.0, 49.0]
                        segments = [np.array(x[1:], dtype=np.float32).reshape(-1, 2) for x in l]  # after the first column, every two numbers form one point; reshape each instance's polygon to (n, 2)
                        l = np.concatenate((classes.reshape(-1, 1), segments2boxes(segments)), 1)  # (cls, xywh), e.g. shape (8, 5); xy is the box center, and xywh are all normalized
                    l = np.array(l, dtype=np.float32)
                if len(l):
                    assert l.shape[1] == 5, 'labels require 5 columns each'  # i.e. cls, xywh
                    assert (l >= 0).all(), 'negative labels'  # every value must be >= 0
                    assert (l[:, 1:] <= 1).all(), 'non-normalized or out of bounds coordinate labels'  # box coordinates must not fall outside the image
                    assert np.unique(l, axis=0).shape[0] == l.shape[0], 'duplicate labels'  # the file must not contain duplicate boxes
                else:
                    ne += 1  # label empty
                    l = np.zeros((0, 5), dtype=np.float32)
            else:
                nm += 1  # label missing
                l = np.zeros((0, 5), dtype=np.float32)
            x[im_file] = [l, shape, segments]  # x is a dict: key = image path, value = [labels (e.g. shape (8, 5)), image (width, height), segment coordinates]
        except Exception as e:
            nc += 1
            print(f'{prefix}WARNING: Ignoring corrupted image and/or label {im_file}: {e}')

        pbar.desc = f"{prefix}Scanning '{path.parent / path.stem}' images and labels... " \
                    f"{nf} found, {nm} missing, {ne} empty, {nc} corrupted"  # update the progress bar
    pbar.close()

    if nf == 0:
        print(f'{prefix}WARNING: No labels found in {path}. See {help_url}')

    x['hash'] = get_hash(self.label_files + self.img_files)
    x['results'] = nf, nm, ne, nc, i + 1  # summary counts
    x['version'] = 0.1  # cache version
    torch.save(x, path)  # save for next time
    logging.info(f'{prefix}New cache created: {path}')
    return x
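
The label checks in the middle of this function can be tried in isolation. The sketch below applies the same four rules (5 columns, non-negative values, coordinates at most 1, no duplicate rows) to the text of one box-format label file. It is a simplified stand-in rather than the YOLOv7 code: the name `validate_labels` is hypothetical, it raises `ValueError` instead of using `assert`, and it does not handle segment lines.

```python
import numpy as np


def validate_labels(text):
    """Parse label-file text into an (n, 5) float32 array and a status string."""
    rows = [line.split() for line in text.strip().splitlines()]
    if not rows:
        return np.zeros((0, 5), dtype=np.float32), "empty"
    labels = np.array(rows, dtype=np.float32)  # ragged rows raise ValueError here
    if labels.ndim != 2 or labels.shape[1] != 5:
        raise ValueError("labels require 5 columns each")
    if not (labels >= 0).all():
        raise ValueError("negative labels")
    if not (labels[:, 1:] <= 1).all():
        raise ValueError("non-normalized or out of bounds coordinate labels")
    if np.unique(labels, axis=0).shape[0] != labels.shape[0]:
        raise ValueError("duplicate labels")
    return labels, "found"


# Made-up label text with two boxes.
labels, status = validate_labels("45 0.5 0.5 0.2 0.2\n49 0.1 0.1 0.1 0.1")
print(status, labels.shape)
```

A blank file yields an empty (0, 5) array with status "empty", which mirrors how the function above counts empty label files instead of rejecting them.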

To convert your own data to YOLO format, see this article: https://blog.csdn.net/weixin_37707770/article/details/110441476

posted @ 2022-08-30 11:19 Zenith_Hugh
