Introduction to the COCO Data Format

cocodataset

Overview

https://cocodataset.org/

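COCO is a large-scale dataset for object detection, semantic segmentation, and image captioning. It contains more than 330K images (220K of them annotated), 1.5 million object instances, 80 object categories (person, car, elephant, etc.), and 91 stuff categories (grass, wall, sky, etc.). Each image comes with five captions, and 250,000 people are annotated with keypoints.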

MS COCO 2017 covers four main tasks: object detection, image segmentation, image captioning, and human keypoint detection.

2017 Train images [118K/18GB]: http://images.cocodataset.org/zips/train2017.zip
2017 Val images [5K/1GB]: http://images.cocodataset.org/zips/val2017.zip
2017 Test images [41K/6GB]: http://images.cocodataset.org/zips/test2017.zip
2017 Annotations: http://images.cocodataset.org/annotations/annotations_trainval2017.zip

Object segmentation			  : per-object segmentation masks
Recognition in context		  : objects recognized within their surrounding scene
Superpixel stuff segmentation : superpixel-level segmentation of stuff regions
330K images (>200K labeled)	  : more than 330K images, over 200K of them labeled
1.5 million object instances
80 object categories
91 stuff categories
5 captions per image
250,000 people with keypoints

Directory Layout

COCO data directory

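├── coco2017: dataset root directory
	├── train2017: all training images (118,287 images)
	├── val2017: all validation images (5,000 images)
	└── annotations: annotation files
		├── instances_train2017.json		: training annotations for object detection and segmentation
		├── instances_val2017.json			: validation annotations for object detection and segmentation
		├── captions_train2017.json			: training annotations for image captioning
		├── captions_val2017.json			: validation annotations for image captioning
		├── person_keypoints_train2017.json	: training annotations for human keypoint detection
		└── person_keypoints_val2017.json	: validation annotations for human keypoint detection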

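The COCO format is organized around objects. Each annotation file is a single JSON object whose main fields are "info", "licenses", "categories", "images", and "annotations". "info" is a single object; each of the other fields is an array, so "images" holds every image object and "annotations" holds every annotation object. Each image is a JSON object, each annotation is a JSON object, and every object carries a unique id. Note that image ids and annotation ids are numbered independently of each other.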
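The structure described above can be sketched with the standard `json` module. This is a minimal illustration on a made-up, two-image document (the `raw` string and the `index_by_id` helper are hypothetical, not part of COCO or pycocotools); with a real file you would read `instances_val2017.json` from disk instead.

```python
import json

# Hypothetical toy annotation document with the same top-level fields.
raw = """
{
  "info": {"description": "toy example"},
  "licenses": [],
  "categories": [{"id": 1, "name": "person", "supercategory": "person"}],
  "images": [{"id": 1, "file_name": "000.jpg", "width": 640, "height": 480},
             {"id": 2, "file_name": "001.jpg", "width": 640, "height": 480}],
  "annotations": [{"id": 1, "image_id": 2, "category_id": 1, "bbox": [5, 5, 20, 40]}]
}
"""

def index_by_id(items):
    """Map each object's id to the object, rejecting duplicate ids."""
    index = {}
    for item in items:
        if item["id"] in index:
            raise ValueError(f"duplicate id: {item['id']}")
        index[item["id"]] = item
    return index

data = json.loads(raw)
images_by_id = index_by_id(data["images"])
annotations_by_id = index_by_id(data["annotations"])

# Image id 1 and annotation id 1 are unrelated objects: the annotation
# points at its image through "image_id", not through its own id.
annotation = annotations_by_id[1]
owner = images_by_id[annotation["image_id"]]
print(owner["file_name"])
```

Keeping one index per array avoids mixing up the two id spaces.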


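""" Note """
In the COCO format, bbox is stored as [x, y, w, h], where (x, y) is the top-left corner.
To convert it to corner form [x1, y1, x2, y2], treating (x2, y2) as the last pixel inside the box:
bbox = [x, y, x + w - 1, y + h - 1]
If (x2, y2) should instead be the exclusive corner in continuous coordinates, drop the -1 terms: [x, y, x + w, y + h].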
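The bbox conversion in the note above can be written as a pair of small helpers. This is a sketch, and the function names and the `inclusive` flag are my own; whether the bottom-right corner should include the -1 depends on the convention of the code that consumes the boxes.

```python
def xywh_to_xyxy(box, inclusive=False):
    """Convert a COCO [x, y, w, h] box to [x1, y1, x2, y2].

    inclusive=True treats (x2, y2) as the last pixel inside the box
    (x + w - 1, y + h - 1); inclusive=False uses x + w, y + h.
    """
    x, y, w, h = box
    offset = 1 if inclusive else 0
    return [x, y, x + w - offset, y + h - offset]

def xyxy_to_xywh(box, inclusive=False):
    """Inverse of xywh_to_xyxy under the same convention."""
    x1, y1, x2, y2 = box
    offset = 1 if inclusive else 0
    return [x1, y1, x2 - x1 + offset, y2 - y1 + offset]

coco_box = [10, 20, 30, 40]
print(xywh_to_xyxy(coco_box))                  # [10, 20, 40, 60]
print(xywh_to_xyxy(coco_box, inclusive=True))  # [10, 20, 39, 59]
```

Whichever convention is chosen, the same one has to be used in both directions for a round trip to return the original box.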

Categories

The object categories are grouped into 12 supercategories:
['appliance', 'food', 'indoor', 'accessory', 'electronic', 'furniture', 'vehicle', 'sports', 'animal', 'kitchen', 'person', 'outdoor']
 [	{"supercategory": "person", "id": 1, "name": "person"},
    {"supercategory": "vehicle", "id": 2, "name": "bicycle"},
    {"supercategory": "vehicle", "id": 3, "name": "car"},
    {"supercategory": "vehicle", "id": 4, "name": "motorcycle"},
    {"supercategory": "vehicle", "id": 5, "name": "airplane"},
    {"supercategory": "vehicle", "id": 6, "name": "bus"},
    {"supercategory": "vehicle", "id": 7, "name": "train"},
    {"supercategory": "vehicle", "id": 8, "name": "truck"},
    {"supercategory": "vehicle", "id": 9, "name": "boat"},
    {"supercategory": "outdoor", "id": 10, "name": "traffic light"},
    {"supercategory": "outdoor", "id": 11, "name": "fire hydrant"},
    {"supercategory": "outdoor", "id": 13, "name": "stop sign"},
    {"supercategory": "outdoor", "id": 14, "name": "parking meter"},
    {"supercategory": "outdoor", "id": 15, "name": "bench"},
    {"supercategory": "animal", "id": 16, "name": "bird"},
    {"supercategory": "animal", "id": 17, "name": "cat"},
    {"supercategory": "animal", "id": 18, "name": "dog"},
    {"supercategory": "animal", "id": 19, "name": "horse"},
    {"supercategory": "animal", "id": 20, "name": "sheep"},
    {"supercategory": "animal", "id": 21, "name": "cow"},
    {"supercategory": "animal", "id": 22, "name": "elephant"},
    {"supercategory": "animal", "id": 23, "name": "bear"},
    {"supercategory": "animal", "id": 24, "name": "zebra"},
    {"supercategory": "animal", "id": 25, "name": "giraffe"},
    {"supercategory": "accessory", "id": 27, "name": "backpack"},
    {"supercategory": "accessory", "id": 28, "name": "umbrella"},
    {"supercategory": "accessory", "id": 31, "name": "handbag"},
    {"supercategory": "accessory", "id": 32, "name": "tie"},
    {"supercategory": "accessory", "id": 33, "name": "suitcase"},
    {"supercategory": "sports", "id": 34, "name": "frisbee"},
    {"supercategory": "sports", "id": 35, "name": "skis"},
    {"supercategory": "sports", "id": 36, "name": "snowboard"},
    {"supercategory": "sports", "id": 37, "name": "sports ball"},
    {"supercategory": "sports", "id": 38, "name": "kite"},
    {"supercategory": "sports", "id": 39, "name": "baseball bat"},
    {"supercategory": "sports", "id": 40, "name": "baseball glove"},
    {"supercategory": "sports", "id": 41, "name": "skateboard"},
    {"supercategory": "sports", "id": 42, "name": "surfboard"},
    {"supercategory": "sports", "id": 43, "name": "tennis racket"},
    {"supercategory": "kitchen", "id": 44, "name": "bottle"},
    {"supercategory": "kitchen", "id": 46, "name": "wine glass"},
    {"supercategory": "kitchen", "id": 47, "name": "cup"},
    {"supercategory": "kitchen", "id": 48, "name": "fork"},
    {"supercategory": "kitchen", "id": 49, "name": "knife"},
    {"supercategory": "kitchen", "id": 50, "name": "spoon"},
    {"supercategory": "kitchen", "id": 51, "name": "bowl"},
    {"supercategory": "food", "id": 52, "name": "banana"},
    {"supercategory": "food", "id": 53, "name": "apple"},
    {"supercategory": "food", "id": 54, "name": "sandwich"},
    {"supercategory": "food", "id": 55, "name": "orange"},
    {"supercategory": "food", "id": 56, "name": "broccoli"},
    {"supercategory": "food", "id": 57, "name": "carrot"},
    {"supercategory": "food", "id": 58, "name": "hot dog"},
    {"supercategory": "food", "id": 59, "name": "pizza"},
    {"supercategory": "food", "id": 60, "name": "donut"},
    {"supercategory": "food", "id": 61, "name": "cake"},
    {"supercategory": "furniture", "id": 62, "name": "chair"},
    {"supercategory": "furniture", "id": 63, "name": "couch"},
    {"supercategory": "furniture", "id": 64, "name": "potted plant"},
    {"supercategory": "furniture", "id": 65, "name": "bed"},
    {"supercategory": "furniture", "id": 67, "name": "dining table"},
    {"supercategory": "furniture", "id": 70, "name": "toilet"},
    {"supercategory": "electronic", "id": 72, "name": "tv"},
    {"supercategory": "electronic", "id": 73, "name": "laptop"},
    {"supercategory": "electronic", "id": 74, "name": "mouse"},
    {"supercategory": "electronic", "id": 75, "name": "remote"},
    {"supercategory": "electronic", "id": 76, "name": "keyboard"},
    {"supercategory": "electronic", "id": 77, "name": "cell phone"},
    {"supercategory": "appliance", "id": 78, "name": "microwave"},
    {"supercategory": "appliance", "id": 79, "name": "oven"},
    {"supercategory": "appliance", "id": 80, "name": "toaster"},
    {"supercategory": "appliance", "id": 81, "name": "sink"},
    {"supercategory": "appliance", "id": 82, "name": "refrigerator"},
    {"supercategory": "indoor", "id": 84, "name": "book"},
    {"supercategory": "indoor", "id": 85, "name": "clock"},
    {"supercategory": "indoor", "id": 86, "name": "vase"},
    {"supercategory": "indoor", "id": 87, "name": "scissors"},
    {"supercategory": "indoor", "id": 88, "name": "teddy bear"},
    {"supercategory": "indoor", "id": 89, "name": "hair drier"},
    {"supercategory": "indoor", "id": 90, "name": "toothbrush"}]
Person: person

Vehicle: bicycle, car, motorcycle, airplane, bus, train, truck, boat

Outdoor: traffic light, fire hydrant, stop sign, parking meter, bench

Animal: bird, cat, dog, horse, sheep, cow, elephant, bear, zebra, giraffe

Accessory: backpack, umbrella, handbag, tie, suitcase

Sports: frisbee, skis, snowboard, sports ball, kite, baseball bat, baseball glove, skateboard, surfboard, tennis racket

Kitchen: bottle, wine glass, cup, fork, knife, spoon, bowl

Food: banana, apple, sandwich, orange, broccoli, carrot, hot dog, pizza, donut, cake

Furniture: chair, couch, potted plant, bed, dining table, toilet

Electronic: tv, laptop, mouse, remote, keyboard, cell phone
Appliance: microwave, oven, toaster, sink, refrigerator
Indoor: book, clock, vase, scissors, teddy bear, hair drier, toothbrush
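One practical consequence of the category list above: the ids run from 1 to 90 with gaps (12, 26, 29 and 30 are unused, for example), so there are 80 categories but the largest id is 90. Training code that expects labels 0..79 therefore needs a remapping. The sketch below shows one common way to build it; the `build_label_maps` helper and the four-entry excerpt are illustrative assumptions, not part of the COCO tooling.

```python
# Hypothetical excerpt of the "categories" array; a real file has 80 entries.
categories = [
    {"supercategory": "person", "id": 1, "name": "person"},
    {"supercategory": "outdoor", "id": 11, "name": "fire hydrant"},
    {"supercategory": "outdoor", "id": 13, "name": "stop sign"},
    {"supercategory": "indoor", "id": 90, "name": "toothbrush"},
]

def build_label_maps(categories):
    """Map sparse COCO category ids to contiguous 0-based indices."""
    sorted_ids = sorted(c["id"] for c in categories)
    id_to_index = {cat_id: i for i, cat_id in enumerate(sorted_ids)}
    index_to_name = {id_to_index[c["id"]]: c["name"] for c in categories}
    return id_to_index, index_to_name

id_to_index, index_to_name = build_label_maps(categories)
print(id_to_index)       # {1: 0, 11: 1, 13: 2, 90: 3}
print(index_to_name[2])  # stop sign
```

Sorting the ids first keeps the mapping stable regardless of the order in which categories appear in the file.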

Basic Data Structure

{
"info": info, "images": [image], "annotations": [annotation], "licenses": [license],
}

info{
"year": int, "version": str, "description": str, "contributor": str, "url": str, "date_created": datetime,
}

image{
"id": int, "width": int, "height": int, "file_name": str, "license": int, "flickr_url": str, "coco_url": str, "date_captured": datetime,
}

license{
"id": int, "name": str, "url": str,
}

info records basic information about the dataset.

"info":{
	"description":"COCO 2017 Dataset",  # dataset description
	"url":"http://*****.org",			# dataset URL
	"version":"1.0",					# version
	"year":2017,            			# year
	"contributor":"COCO Consortium",	# contributor
	"date_created":"2017/09/01",		# creation date
}

licenses lists the licenses that apply to the images in the dataset.

"licenses":[
            {
             'url': 'http://creativecommons.org/licenses/by-nc-sa/2.0/',
             'id': 1,
             'name': 'Attribution-NonCommercial-ShareAlike License'
            }
            ...
            ]

images lists the images in the dataset; its length equals the number of images.

"images": [
            {
             "license": 4,                           # can be ignored
             "file_name": "000.jpg",                 # image file name
             "coco_url": "http://****",              # image URL
             "id": 1,                                # unique image id
             "width": 48,
             "height": 112,
             "date_captured": "2022-02-02 17:02:02", # can be ignored
             "flickr_url": "http://****"             # can be ignored
            }
            ...
            ]

Annotations by Task

Object Detection

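annotation{
	"id"			: int,					# annotation id; each object instance has one annotation
	"image_id"		: int, 					# id of the image that contains this object
	"category_id"	: int, 					# category id of the object; each object has one category
	"segmentation"	: RLE or [polygon], 	# outline of the object; polygon coordinates are floats
	"area"			: float, 				# area of the object
	"bbox"			: [x,y,width,height], 	# (x, y) is the top-left corner
	"iscrowd"		: 0 or 1,				# 0: single object, segmentation is a polygon; 1: crowd of objects, segmentation is RLE; defaults to 0
}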

categories[{
	"id"			: int,			# category id, referenced by category_id in the annotations above
	"name"			: str, 			# category name, e.g. person, dog, cat
	"supercategory"	: str,			# parent category, e.g. bicycle, truck and car all belong to vehicle
}]
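To make the links between these fields concrete, the sketch below groups annotations by image and resolves `category_id` to a category name. The toy `dataset` dictionary and the `objects_per_image` helper are assumptions made for illustration; libraries such as pycocotools provide their own indexing, which is not reproduced here.

```python
from collections import defaultdict

# Hypothetical two-image dataset using the fields described above.
dataset = {
    "images": [
        {"id": 1, "file_name": "000.jpg", "width": 640, "height": 480},
        {"id": 2, "file_name": "001.jpg", "width": 640, "height": 480},
    ],
    "annotations": [
        {"id": 10, "image_id": 1, "category_id": 18, "bbox": [10, 20, 30, 40], "iscrowd": 0},
        {"id": 11, "image_id": 1, "category_id": 1, "bbox": [50, 60, 20, 80], "iscrowd": 0},
        {"id": 12, "image_id": 2, "category_id": 1, "bbox": [0, 0, 100, 200], "iscrowd": 1},
    ],
    "categories": [
        {"id": 1, "name": "person", "supercategory": "person"},
        {"id": 18, "name": "dog", "supercategory": "animal"},
    ],
}

def objects_per_image(dataset, skip_crowd=False):
    """Return {file_name: [(category_name, bbox), ...]}."""
    names = {c["id"]: c["name"] for c in dataset["categories"]}
    files = {img["id"]: img["file_name"] for img in dataset["images"]}
    grouped = defaultdict(list)
    for ann in dataset["annotations"]:
        if skip_crowd and ann["iscrowd"] == 1:
            continue  # crowd regions are often excluded from training
        grouped[files[ann["image_id"]]].append((names[ann["category_id"]], ann["bbox"]))
    return dict(grouped)

result = objects_per_image(dataset)
print(result["000.jpg"])  # [('dog', [10, 20, 30, 40]), ('person', [50, 60, 20, 80])]
```

Passing `skip_crowd=True` drops the `iscrowd` annotation, so `001.jpg` no longer appears in the result.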

Keypoint Detection

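As in the detection task, an image contains a number of objects, and each object has one keypoint annotation. A keypoint annotation contains all the data of the object annotation (including id, bbox, etc.) plus two additional fields.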

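annotation{
	"keypoints"		: [x1,y1,v1,...],   # array of length 3k, where k is the number of keypoints defined for the category; v=0: not labeled, v=1: labeled but not visible, v=2: labeled and visible
	"num_keypoints"	: int, 				# number of labeled keypoints, i.e. those with v=1 or v=2
	"[cloned]"		: ...,				# fields copied from the Object Detection annotation
}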

categories[{
	"keypoints"	: [str], 				# array of k keypoint names
	"skeleton"	: [edge], 				# keypoint connectivity as a list of edges (pairs of keypoints), used for visualization
	"[cloned]"	: ...,                  # fields copied from the Object Detection categories
}]
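The flat `keypoints` array is easier to work with once it is split into (x, y, v) triplets. The sketch below does that and recomputes `num_keypoints` as the number of triplets with v > 0; the helper names and the three-keypoint sample are hypothetical (a real person annotation has more keypoints).

```python
def parse_keypoints(flat):
    """Split [x1, y1, v1, x2, y2, v2, ...] into a list of (x, y, v) tuples."""
    if len(flat) % 3 != 0:
        raise ValueError("keypoints length must be a multiple of 3")
    return [tuple(flat[i:i + 3]) for i in range(0, len(flat), 3)]

def count_labeled(flat):
    """Number of keypoints with v=1 or v=2, which should equal num_keypoints."""
    return sum(1 for _, _, v in parse_keypoints(flat) if v > 0)

# Hypothetical annotation with k = 3 keypoints: one unlabeled, two labeled.
annotation = {
    "keypoints": [0, 0, 0, 120, 85, 1, 130, 90, 2],
    "num_keypoints": 2,
}

triplets = parse_keypoints(annotation["keypoints"])
print(triplets)                                # [(0, 0, 0), (120, 85, 1), (130, 90, 2)]
print(count_labeled(annotation["keypoints"]))  # 2
```

Comparing `count_labeled` against the stored `num_keypoints` is a cheap consistency check when writing custom annotations.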

Stuff Segmentation

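The stuff annotation format is identical to and fully compatible with the Object Detection format above (except that iscrowd is unnecessary and defaults to 0). The main field for this task is "segmentation".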

[{
	"image_id"		: int, 
	"category_id"	: int, 
	"segmentation"	: RLE,
}]

Panoptic Segmentation

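Unlike the three formats above, each annotation here describes a whole image rather than a single object. Each per-image annotation has two parts: 1) a PNG that stores the class-agnostic image segmentation; 2) a JSON structure that stores the semantic information for each image segment.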

annotation{
	"image_id": int, 
	"file_name": str, 
	"segments_info": [segment_info],
}

segment_info{
	"id": int, 
	"category_id": int, 
	"area": int, 
	"bbox": [x,y,width,height], 
	"iscrowd": 0 or 1,
}

categories[{
	"id": int, 
	"name": str, 
	"supercategory": str, 
	"isthing": 0 or 1, 
	"color": [R,G,B],
}]
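The PNG and the JSON are tied together through the segment id: in the official panoptic format, a pixel's color encodes the id as R + 256*G + 256^2*B, and that id matches one entry in `segments_info`. The sketch below shows the decoding on a single pixel; the annotation values and helper names are made up for illustration, and a real pipeline would decode the whole PNG as an array.

```python
def rgb_to_id(color):
    """Decode a panoptic PNG pixel [R, G, B] into a segment id."""
    r, g, b = color
    return r + 256 * g + 256 * 256 * b

def id_to_rgb(segment_id):
    """Encode a segment id back into [R, G, B]."""
    return [segment_id % 256, (segment_id // 256) % 256, (segment_id // 65536) % 256]

# Hypothetical per-image annotation with a single segment.
annotation = {
    "image_id": 1,
    "file_name": "000.png",
    "segments_info": [
        {"id": 66051, "category_id": 1, "area": 500, "bbox": [0, 0, 10, 50], "iscrowd": 0},
    ],
}

segments = {s["id"]: s for s in annotation["segments_info"]}
pixel = [3, 2, 1]                     # one pixel read from the PNG
segment = segments[rgb_to_id(pixel)]  # 3 + 2*256 + 1*65536 = 66051
print(segment["category_id"])         # 1
```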

Image Captioning

Caption annotations store image captions. Each caption describes the specified image, and each image has at least 5 captions.

annotation{
	"id": int, 
	"image_id": int, 
	"caption": str,
}
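Since each caption is its own annotation, collecting all captions for an image means grouping by `image_id`. A minimal sketch, assuming a hand-written list of caption annotations (the captions and the `captions_by_image` helper are invented for the example):

```python
from collections import defaultdict

# Hypothetical caption annotations in the format shown above.
annotations = [
    {"id": 1, "image_id": 7, "caption": "A dog runs on the beach."},
    {"id": 2, "image_id": 9, "caption": "Two people ride bicycles."},
    {"id": 3, "image_id": 7, "caption": "A brown dog playing near the sea."},
]

def captions_by_image(annotations):
    """Group caption strings by the image they describe."""
    grouped = defaultdict(list)
    for ann in annotations:
        grouped[ann["image_id"]].append(ann["caption"])
    return dict(grouped)

captions = captions_by_image(annotations)
print(len(captions[7]))  # 2
print(captions[9])       # ['Two people ride bicycles.']
```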

DensePose

annotation{
"id": int, "image_id": int, "category_id": int, "is_crowd": 0 or 1, "area": int, "bbox": [x,y,width,height], "dp_I": [float], "dp_U": [float], "dp_V": [float], "dp_x": [float], "dp_y": [float], "dp_masks": [RLE],
}

References

https://cocodataset.org/#format-data

https://www.yii666.com/blog/376421.html

Posted 2023-06-28 22:29 by 贝壳里的星海