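Introduction to the COCO Data Format
cocodataset
Introduction
COCO is a large-scale dataset for object detection, semantic segmentation, and image captioning. It contains more than 330K images (over 200K of them labeled), 1.5 million object instances, 80 object categories (person, car, elephant, etc.), 91 stuff categories (grass, wall, sky, etc.), five captions per image, and 250,000 people annotated with keypoints.
The MS COCO 2017 release mainly covers four tasks: object detection, image segmentation, image captioning, and human keypoint detection.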
2017 Train images [118K/18GB]: http://images.cocodataset.org/zips/train2017.zip
2017 Val images [5K/1GB]: http://images.cocodataset.org/zips/val2017.zip
2017 Test images [41K/6GB]: http://images.cocodataset.org/zips/test2017.zip
2017 Annotations: http://images.cocodataset.org/annotations/annotations_trainval2017.zip
Object segmentation
Recognition in context
Superpixel stuff segmentation
330K images (>200K labeled)
1.5 million object instances
80 object categories
91 stuff categories
5 captions per image
250,000 people with keypoints
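File layout
COCO data paths
├── coco2017: dataset root directory
    ├── train2017: all training images (118,287 images)
    ├── val2017: all validation images (5,000 images)
    └── annotations: annotation files
        ├── instances_train2017.json: training annotations for object detection and segmentation
        ├── instances_val2017.json: validation annotations for object detection and segmentation
        ├── captions_train2017.json: training annotations for image captioning
        ├── captions_val2017.json: validation annotations for image captioning
        ├── person_keypoints_train2017.json: training annotations for human keypoint detection
        └── person_keypoints_val2017.json: validation annotations for human keypoint detection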
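The COCO format follows an object-oriented design: an entire annotation file is a single JSON object whose main fields are "info", "licenses", "categories", "images", and "annotations". "info" is a single object; each of the other fields is an array, so "images" holds all image objects and "annotations" holds all annotation objects. Every image is a JSON object and every annotation is a JSON object, and each is identified by a unique id. Note that image ids and annotation ids are numbered independently.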
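Because images and annotations live in separate arrays and are linked only by id, a common first step is to build lookup tables. The following is a minimal sketch, not an official loader: `build_index` and the tiny `sample` dictionary are hypothetical names and data made up for illustration. With a real file, the dictionary would come from `json.load` on one of the annotation files.

```python
from collections import defaultdict


def build_index(coco):
    """Index a COCO-style dict: images by id, annotations by image_id."""
    images_by_id = {img["id"]: img for img in coco["images"]}
    anns_by_image = defaultdict(list)
    for ann in coco["annotations"]:
        # image_id refers to an image id; the annotation's own id is separate.
        anns_by_image[ann["image_id"]].append(ann)
    return images_by_id, dict(anns_by_image)


# Hypothetical in-memory sample with the same top-level fields as a COCO file.
sample = {
    "images": [
        {"id": 1, "file_name": "a.jpg"},
        {"id": 2, "file_name": "b.jpg"},
    ],
    "annotations": [
        {"id": 1, "image_id": 2, "category_id": 18},
        {"id": 2, "image_id": 2, "category_id": 1},
        {"id": 3, "image_id": 1, "category_id": 3},
    ],
}

images_by_id, anns_by_image = build_index(sample)
for image_id, anns in sorted(anns_by_image.items()):
    print(images_by_id[image_id]["file_name"], [a["category_id"] for a in anns])
```

Annotation id 1 belongs to image id 2 in this sample, which illustrates that the two id spaces are unrelated.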
Note
In the COCO format, a bbox is stored as [x, y, w, h], where (x, y) is the top-left corner.
To convert it to [x1, y1, x2, y2] with an inclusive bottom-right pixel, use:
bbox = [x, y, x + w - 1, y + h - 1]
If the bottom-right corner should be exclusive instead, as many libraries expect, drop the "- 1" terms.
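The conversion can be sketched as a pair of helper functions. The names `xywh_to_xyxy` and `xyxy_to_xywh` and the `inclusive` flag are assumptions introduced here for illustration; the default follows the "- 1" convention used in this note.

```python
def xywh_to_xyxy(bbox, inclusive=True):
    """Convert a COCO [x, y, w, h] box to [x1, y1, x2, y2].

    inclusive=True treats (x2, y2) as the last pixel inside the box;
    inclusive=False returns the exclusive corner (x + w, y + h).
    """
    x, y, w, h = bbox
    offset = 1 if inclusive else 0
    return [x, y, x + w - offset, y + h - offset]


def xyxy_to_xywh(bbox, inclusive=True):
    """Inverse of xywh_to_xyxy under the same convention."""
    x1, y1, x2, y2 = bbox
    offset = 1 if inclusive else 0
    return [x1, y1, x2 - x1 + offset, y2 - y1 + offset]


print(xywh_to_xyxy([10, 20, 30, 40]))                   # [10, 20, 39, 59]
print(xywh_to_xyxy([10, 20, 30, 40], inclusive=False))  # [10, 20, 40, 60]
```

Whichever convention is chosen, it should be applied consistently in both directions so that a round trip returns the original box.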
Categories
The 80 object categories are grouped into 12 supercategories:
['appliance', 'food', 'indoor', 'accessory', 'electronic', 'furniture', 'vehicle', 'sports', 'animal', 'kitchen', 'person', 'outdoor']
The full category list as it appears in the annotation files:
[ {"supercategory": "person", "id": 1, "name": "person"},
{"supercategory": "vehicle", "id": 2, "name": "bicycle"},
{"supercategory": "vehicle", "id": 3, "name": "car"},
{"supercategory": "vehicle", "id": 4, "name": "motorcycle"},
{"supercategory": "vehicle", "id": 5, "name": "airplane"},
{"supercategory": "vehicle", "id": 6, "name": "bus"},
{"supercategory": "vehicle", "id": 7, "name": "train"},
{"supercategory": "vehicle", "id": 8, "name": "truck"},
{"supercategory": "vehicle", "id": 9, "name": "boat"},
{"supercategory": "outdoor", "id": 10, "name": "traffic light"},
{"supercategory": "outdoor", "id": 11, "name": "fire hydrant"},
{"supercategory": "outdoor", "id": 13, "name": "stop sign"},
{"supercategory": "outdoor", "id": 14, "name": "parking meter"},
{"supercategory": "outdoor", "id": 15, "name": "bench"},
{"supercategory": "animal", "id": 16, "name": "bird"},
{"supercategory": "animal", "id": 17, "name": "cat"},
{"supercategory": "animal", "id": 18, "name": "dog"},
{"supercategory": "animal", "id": 19, "name": "horse"},
{"supercategory": "animal", "id": 20, "name": "sheep"},
{"supercategory": "animal", "id": 21, "name": "cow"},
{"supercategory": "animal", "id": 22, "name": "elephant"},
{"supercategory": "animal", "id": 23, "name": "bear"},
{"supercategory": "animal", "id": 24, "name": "zebra"},
{"supercategory": "animal", "id": 25, "name": "giraffe"},
{"supercategory": "accessory", "id": 27, "name": "backpack"},
{"supercategory": "accessory", "id": 28, "name": "umbrella"},
{"supercategory": "accessory", "id": 31, "name": "handbag"},
{"supercategory": "accessory", "id": 32, "name": "tie"},
{"supercategory": "accessory", "id": 33, "name": "suitcase"},
{"supercategory": "sports", "id": 34, "name": "frisbee"},
{"supercategory": "sports", "id": 35, "name": "skis"},
{"supercategory": "sports", "id": 36, "name": "snowboard"},
{"supercategory": "sports", "id": 37, "name": "sports ball"},
{"supercategory": "sports", "id": 38, "name": "kite"},
{"supercategory": "sports", "id": 39, "name": "baseball bat"},
{"supercategory": "sports", "id": 40, "name": "baseball glove"},
{"supercategory": "sports", "id": 41, "name": "skateboard"},
{"supercategory": "sports", "id": 42, "name": "surfboard"},
{"supercategory": "sports", "id": 43, "name": "tennis racket"},
{"supercategory": "kitchen", "id": 44, "name": "bottle"},
{"supercategory": "kitchen", "id": 46, "name": "wine glass"},
{"supercategory": "kitchen", "id": 47, "name": "cup"},
{"supercategory": "kitchen", "id": 48, "name": "fork"},
{"supercategory": "kitchen", "id": 49, "name": "knife"},
{"supercategory": "kitchen", "id": 50, "name": "spoon"},
{"supercategory": "kitchen", "id": 51, "name": "bowl"},
{"supercategory": "food", "id": 52, "name": "banana"},
{"supercategory": "food", "id": 53, "name": "apple"},
{"supercategory": "food", "id": 54, "name": "sandwich"},
{"supercategory": "food", "id": 55, "name": "orange"},
{"supercategory": "food", "id": 56, "name": "broccoli"},
{"supercategory": "food", "id": 57, "name": "carrot"},
{"supercategory": "food", "id": 58, "name": "hot dog"},
{"supercategory": "food", "id": 59, "name": "pizza"},
{"supercategory": "food", "id": 60, "name": "donut"},
{"supercategory": "food", "id": 61, "name": "cake"},
{"supercategory": "furniture", "id": 62, "name": "chair"},
{"supercategory": "furniture", "id": 63, "name": "couch"},
{"supercategory": "furniture", "id": 64, "name": "potted plant"},
{"supercategory": "furniture", "id": 65, "name": "bed"},
{"supercategory": "furniture", "id": 67, "name": "dining table"},
{"supercategory": "furniture", "id": 70, "name": "toilet"},
{"supercategory": "electronic", "id": 72, "name": "tv"},
{"supercategory": "electronic", "id": 73, "name": "laptop"},
{"supercategory": "electronic", "id": 74, "name": "mouse"},
{"supercategory": "electronic", "id": 75, "name": "remote"},
{"supercategory": "electronic", "id": 76, "name": "keyboard"},
{"supercategory": "electronic", "id": 77, "name": "cell phone"},
{"supercategory": "appliance", "id": 78, "name": "microwave"},
{"supercategory": "appliance", "id": 79, "name": "oven"},
{"supercategory": "appliance", "id": 80, "name": "toaster"},
{"supercategory": "appliance", "id": 81, "name": "sink"},
{"supercategory": "appliance", "id": 82, "name": "refrigerator"},
{"supercategory": "indoor", "id": 84, "name": "book"},
{"supercategory": "indoor", "id": 85, "name": "clock"},
{"supercategory": "indoor", "id": 86, "name": "vase"},
{"supercategory": "indoor", "id": 87, "name": "scissors"},
{"supercategory": "indoor", "id": 88, "name": "teddy bear"},
{"supercategory": "indoor", "id": 89, "name": "hair drier"},
{"supercategory": "indoor", "id": 90, "name": "toothbrush"}]
The same categories grouped by supercategory:
Person: person
Vehicles: bicycle, car, motorcycle, airplane, bus, train, truck, boat
Outdoor objects: traffic light, fire hydrant, stop sign, parking meter, bench
Animals: bird, cat, dog, horse, sheep, cow, elephant, bear, zebra, giraffe
Accessories: backpack, umbrella, handbag, tie, suitcase
Sports equipment: frisbee, skis, snowboard, sports ball, kite, baseball bat, baseball glove, skateboard, surfboard, tennis racket
Kitchenware: bottle, wine glass, cup, fork, knife, spoon, bowl
Food: banana, apple, sandwich, orange, broccoli, carrot, hot dog, pizza, donut, cake
Furniture: chair, couch, potted plant, bed, dining table, toilet
Electronics: tv, laptop, mouse, remote, keyboard, cell phone
Appliances: microwave, oven, toaster, sink, refrigerator
Indoor items: book, clock, vase, scissors, teddy bear, hair drier, toothbrush
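The category ids listed above are not contiguous: they run from 1 to 90 but skip some values (for example 12 and 26), so only 80 ids are in use. Training code usually wants labels in the range 0 to 79, so a remapping step is common. The sketch below shows one way to do it. `make_label_maps` is a hypothetical helper, and the three-entry `categories` list is a shortened sample rather than the full list.

```python
def make_label_maps(categories):
    """Map sparse COCO category ids to contiguous 0-based indices."""
    ordered = sorted(categories, key=lambda c: c["id"])
    id_to_index = {c["id"]: i for i, c in enumerate(ordered)}
    index_to_name = [c["name"] for c in ordered]
    return id_to_index, index_to_name


# Shortened sample in the same shape as the "categories" array.
categories = [
    {"supercategory": "outdoor", "id": 13, "name": "stop sign"},
    {"supercategory": "person", "id": 1, "name": "person"},
    {"supercategory": "outdoor", "id": 11, "name": "fire hydrant"},
]

id_to_index, index_to_name = make_label_maps(categories)
print(id_to_index)    # {1: 0, 11: 1, 13: 2}
print(index_to_name)  # ['person', 'fire hydrant', 'stop sign']
```

When writing predictions back in COCO format, the inverse mapping is needed so that category_id again holds the original sparse id.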
Basic data format
{
"info": info, "images": [image], "annotations": [annotation], "licenses": [license],
}
info{
"year": int, "version": str, "description": str, "contributor": str, "url": str, "date_created": datetime,
}
image{
"id": int, "width": int, "height": int, "file_name": str, "license": int, "flickr_url": str, "coco_url": str, "date_captured": datetime,
}
license{
"id": int, "name": str, "url": str,
}
info records basic information about the dataset
"info":{
"description":"COCO 2017 Dataset", # dataset description
"url":"http://*****.org", # dataset URL
"version":"1.0", # version
"year":2017, # year
"contributor":"COCO Consortium", # contributor
"date_created":"2017/09/01", # creation date
}
licenses lists the licenses that the dataset is released under
"licenses":[
{
'url': 'http://creativecommons.org/licenses/by-nc-sa/2.0/',
'id': 1,
'name': 'Attribution-NonCommercial-ShareAlike License'
}
...
]
images lists the images in the dataset; its length equals the number of images
"images": [
{
"license":4, # can be ignored
"file_name":"000.jpg", # image file name
"coco_url":"http://****", # image URL
"id": 1, # image id, unique per image
"width": 48,
"height": 112,
"date_captured":"2022-02-02 17:02:02", # can be ignored
"flickr_url":"http://****" # can be ignored
}
...
]
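Annotations for different tasks
Object Detection
annotation{
"id" : int, # annotation id; each object instance has one annotation
"image_id" : int, # id of the image that contains the object
"category_id" : int, # id of the object's category; each object has exactly one category
"segmentation" : RLE or [polygon], # outline of the object; polygon coordinates are floats
"area" : float, # area of the object region
"bbox" : [x,y,width,height], # (x, y) is the top-left corner
"iscrowd" : 0 or 1, # 0: a single object, segmentation is a polygon list; 1: a crowd of objects, segmentation is RLE; defaults to 0
}
categories[{
"id" : int, # category id, referenced by category_id in annotations
"name" : str, # category name, e.g. person, dog, cat
"supercategory" : str, # parent category, e.g. bicycle, car, and truck all belong to vehicle
}]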
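When iscrowd is 0, each polygon in "segmentation" is a flat list of the form [x1, y1, x2, y2, ...]. The sketch below shows how such a list can be turned into points, and how an approximate area and a tight bbox can be derived from it with the shoelace formula. The function names are hypothetical. The result is only an approximation of the stored "area", which may differ slightly depending on how the dataset computed it.

```python
def polygon_points(flat):
    """Split a flat [x1, y1, x2, y2, ...] list into (x, y) pairs."""
    if len(flat) % 2 != 0 or len(flat) < 6:
        raise ValueError("a polygon needs at least 3 (x, y) pairs")
    return list(zip(flat[0::2], flat[1::2]))


def polygon_area(flat):
    """Shoelace formula; returns the unsigned area of a simple polygon."""
    pts = polygon_points(flat)
    total = 0.0
    # Pair each vertex with the next one, wrapping around to the first.
    for (x0, y0), (x1, y1) in zip(pts, pts[1:] + pts[:1]):
        total += x0 * y1 - x1 * y0
    return abs(total) / 2.0


def polygon_bbox(flat):
    """Tight [x, y, width, height] box around the polygon vertices."""
    pts = polygon_points(flat)
    xs = [p[0] for p in pts]
    ys = [p[1] for p in pts]
    return [min(xs), min(ys), max(xs) - min(xs), max(ys) - min(ys)]


# Hypothetical 20 x 10 rectangle written as a COCO-style polygon.
square = [10.0, 10.0, 30.0, 10.0, 30.0, 20.0, 10.0, 20.0]
print(polygon_area(square))  # 200.0
print(polygon_bbox(square))  # [10.0, 10.0, 20.0, 10.0]
```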
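Keypoint Detection
As in the detection task, an image contains several objects, and each object has one keypoint annotation. A keypoint annotation contains all the data of the object annotation (including id, bbox, etc.) plus two additional fields.
annotation{
"keypoints" : [x1,y1,v1,...], # array of length 3k, where k is the number of keypoints defined for the category; v is a visibility flag (0: not labeled, 1: labeled but not visible, 2: labeled and visible)
"num_keypoints" : int, # number of labeled keypoints, i.e. those with v=1 or v=2
"[cloned]" : ..., # fields copied from the Object Detection annotation
}
categories[{
"keypoints" : [str], # list of k keypoint names
"skeleton" : [edge], # keypoint connectivity, given as a list of keypoint edge pairs; used for visualization
"[cloned]" : ..., # fields copied from the Object Detection categories
}]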
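The flat "keypoints" array is easier to work with once it is split into (x, y, v) triples. The sketch below does that and recomputes num_keypoints. The function names, the three-name `names` list, and the sample values are made up for illustration and cover only a subset of a real category's keypoints.

```python
def parse_keypoints(flat):
    """Split [x1, y1, v1, x2, y2, v2, ...] into (x, y, v) triples."""
    if len(flat) % 3 != 0:
        raise ValueError("keypoints length must be a multiple of 3")
    return [tuple(flat[i:i + 3]) for i in range(0, len(flat), 3)]


def count_labeled(flat):
    """Number of keypoints with v > 0, which is what num_keypoints stores."""
    return sum(1 for _, _, v in parse_keypoints(flat) if v > 0)


# Hypothetical subset of keypoint names and one annotation's values.
names = ["nose", "left_eye", "right_eye"]
keypoints = [120, 80, 2, 0, 0, 0, 131, 76, 1]

labeled = {
    name: (x, y)
    for name, (x, y, v) in zip(names, parse_keypoints(keypoints))
    if v > 0  # v == 0 means the keypoint was not labeled
}
print(count_labeled(keypoints))  # 2
print(labeled)                   # {'nose': (120, 80), 'right_eye': (131, 76)}
```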
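Stuff Segmentation
The stuff annotation format is identical to, and fully compatible with, the Object Detection format above (except that iscrowd is unnecessary and defaults to 0). The main field for the segmentation task is "segmentation".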
[{
"image_id" : int,
"category_id" : int,
"segmentation" : RLE,
}]
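To make the RLE field less opaque, here is a minimal sketch of decoding the uncompressed RLE form, assuming a dictionary with "size" as [height, width] and "counts" as a list of run lengths that alternate between 0 and 1, starting with 0, in column-major order. `decode_uncompressed_rle` is a hypothetical helper written for this note. Real annotation files often store a compressed string form instead, for which pycocotools is the usual tool.

```python
def decode_uncompressed_rle(rle):
    """Decode {"size": [h, w], "counts": [runs...]} into a list of mask rows."""
    height, width = rle["size"]
    flat = []
    value = 0  # runs alternate 0, 1, 0, 1, ... and always start with 0
    for run in rle["counts"]:
        flat.extend([value] * run)
        value = 1 - value
    if len(flat) != height * width:
        raise ValueError("run lengths do not add up to height * width")
    # Pixels are stored column by column, so pixel (row, col) sits at
    # index col * height + row in the flat list.
    return [
        [flat[col * height + row] for col in range(width)]
        for row in range(height)
    ]


# Hypothetical 2 x 3 mask: one 0, then three 1s, then two 0s.
rle = {"size": [2, 3], "counts": [1, 3, 2]}
mask = decode_uncompressed_rle(rle)
for row in mask:
    print(row)
# [0, 1, 0]
# [1, 1, 0]
```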
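Panoptic Segmentation
Unlike the three formats above, each annotation here describes a whole image rather than a single object. Each per-image annotation has two parts: (1) a PNG that stores the class-agnostic image segmentation, and (2) a JSON structure that stores the semantic information for each image segment.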
annotation{
"image_id": int,
"file_name": str,
"segments_info": [segment_info],
}
segment_info{
"id": int,
"category_id": int,
"area": int,
"bbox": [x,y,width,height],
"iscrowd": 0 or 1,
}
categories[{
"id": int,
"name": str,
"supercategory": str,
"isthing": 0 or 1,
"color": [R,G,B],
}]
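The link between the PNG and segments_info is the segment id. In the COCO panoptic format each pixel's colour encodes that id as R + 256 * G + 256^2 * B. The sketch below shows the encoding on plain tuples without any image library. The helper names and sample values are hypothetical. Treating ids that are missing from segments_info (such as 0) as unlabeled is an assumption of this sketch.

```python
def rgb_to_segment_id(rgb):
    """Decode a panoptic PNG pixel colour into a segment id."""
    r, g, b = rgb
    return r + 256 * g + 256 * 256 * b


def segment_id_to_rgb(segment_id):
    """Inverse of rgb_to_segment_id."""
    return (segment_id % 256, (segment_id // 256) % 256, (segment_id // 65536) % 256)


def categories_per_pixel(pixels, segments_info):
    """Replace each (R, G, B) pixel by its segment's category_id, or None."""
    category_by_segment = {seg["id"]: seg["category_id"] for seg in segments_info}
    return [
        [category_by_segment.get(rgb_to_segment_id(px)) for px in row]
        for row in pixels
    ]


# Hypothetical one-row image with one labeled and one unlabeled pixel.
pixels = [[(1, 2, 3), (0, 0, 0)]]
segments_info = [{"id": 197121, "category_id": 17}]
print(rgb_to_segment_id((1, 2, 3)))                 # 197121
print(categories_per_pixel(pixels, segments_info))  # [[17, None]]
```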
Image Captioning
Caption annotations store image captions. Each caption describes the specified image, and each image has at least 5 captions.
annotation{
"id": int,
"image_id": int,
"caption": str,
}
DensePose
annotation{
"id": int, "image_id": int, "category_id": int, "is_crowd": 0 or 1, "area": int, "bbox": [x,y,width,height], "dp_I": [float], "dp_U": [float], "dp_V": [float], "dp_x": [float], "dp_y": [float], "dp_masks": [RLE],
}