Deep Learning Datasets: An Overview and Format Conversion
Pascal VOC & COCO
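- Annotations for the COCO detection dataset are stored in .json files. For example, the annotations for the 2017 val split are stored in instances_val2017.json, whose content looks like this: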
{
  "info": {
    "description": "This is stable 1.0 version of the 2017 MS COCO dataset.",
    "url": "http://mscoco.org",
    "version": "1.0",
    "year": 2017,
    "contributor": "Microsoft COCO group",
    "date_created": "2017-11-11 02:11:36.777541"
  },
  "images": [
    {"license": 2, "file_name": "000000289343.jpg", "coco_url": "http://images.cocodataset.org/val2017/000000289343.jpg", "height": 640, "width": 529, "date_captured": "2013-11-15 00:35:14", "flickr_url": "http://farm5.staticflickr.com/4029/4669549715_7db3735de0_z.jpg", "id": 289343},
    ...
    {"license": 1, "file_name": "000000329219.jpg", "coco_url": "http://images.cocodataset.org/val2017/000000329219.jpg", "height": 427, "width": 640, "date_captured": "2013-11-14 19:21:56", "flickr_url": "http://farm9.staticflickr.com/8104/8505307842_465524a6a6_z.jpg", "id": 329219},
    ...
  ],
  "annotations": [
    {"segmentation": [[510.66,423.01,511.72,420.03,510.45,416.0,510...,423.01]], "area": 702.1057499999998, "iscrowd": 0, "image_id": 289343, "bbox": [473.07,395.93,38.65,28.67], "category_id": 18, "id": 1768},
    ...
    {"segmentation": [[304.09,266.18,308.95,263.56,313.06,262.81,...,266.55]], "area": 4290.290900000001, "iscrowd": 0, "image_id": 329219, "bbox": [297.73,252.34,60.21,108.45], "category_id": 18, "id": 8032}
  ],
  "licenses": [
    {"url": "http://creativecommons.org/licenses/by-nc-sa/2.0/", "id": 1, "name": "Attribution-NonCommercial-ShareAlike License"},
    ...
    {"url": "http://www.usa.gov/copyright.shtml", "id": 8, "name": "United States Government Work"}
  ],
  "categories": [
    {"supercategory": "person", "id": 1, "name": "person"},
    ...
    {"supercategory": "indoor", "id": 90, "name": "toothbrush"}
  ]
}
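To make the relationship between the "images", "annotations", and "categories" lists concrete, here is a minimal sketch that groups annotations by image and converts each COCO box from [x, y, width, height] to corner form. The small `coco` dictionary is a hand-made stand-in that reuses values from the sample above; the helper names `xywh_to_corners` and `index_annotations` are illustrative and not part of any COCO tooling, and the name "dog" for category 18 is taken from the category list of the 2014 file.

```python
from collections import defaultdict

# Hand-made stand-in for a COCO instances file (same keys as the sample above).
coco = {
    "images": [
        {"id": 289343, "file_name": "000000289343.jpg", "height": 640, "width": 529},
    ],
    "annotations": [
        {"id": 1768, "image_id": 289343, "category_id": 18,
         "bbox": [473.07, 395.93, 38.65, 28.67], "iscrowd": 0},
    ],
    "categories": [{"id": 18, "name": "dog", "supercategory": "animal"}],
}


def xywh_to_corners(bbox):
    """Convert a COCO [x, y, width, height] box to [xmin, ymin, xmax, ymax]."""
    x, y, w, h = bbox
    return [x, y, x + w, y + h]


def index_annotations(coco_dict):
    """Return {file_name: [(category_name, corner_box), ...]}."""
    names = {c["id"]: c["name"] for c in coco_dict["categories"]}
    by_image = defaultdict(list)
    for ann in coco_dict["annotations"]:
        # "image_id" in an annotation refers to "id" in the images list.
        by_image[ann["image_id"]].append(ann)
    result = {}
    for img in coco_dict["images"]:
        result[img["file_name"]] = [
            (names[a["category_id"]], xywh_to_corners(a["bbox"]))
            for a in by_image[img["id"]]
        ]
    return result


boxes_by_file = index_annotations(coco)
```

For a real annotation file, the dictionary would come from `json.load` on the opened .json file instead of being written inline.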
Contents of a COCO annotation file, using the training set file instances_train2014.json as an example:
{
  "info": {"description": "This is stable 1.0 version of the 2014 MS COCO dataset.", "url": "http://mscoco.org", "version": "1.0", "year": 2014, "contributor": "Microsoft COCO group", "date_created": "2015-01-27 09:11:52.357475"},
  "images": [
    {"license": 5, "file_name": "COCO_train2014_000000057870.jpg", "coco_url": "http://mscoco.org/images/57870", "height": 480, "width": 640, "date_captured": "2013-11-14 16:28:13", "flickr_url": "http://farm4.staticflickr.com/3153/2970773875_164f0c0b83_z.jpg", "id": 57870},  # image_id
    {"license": 5, "file_name": "COCO_train2014_000000384029.jpg", "coco_url": "http://mscoco.org/images/384029", "height": 429, "width": 640, "date_captured": "2013-11-14 16:29:45", "flickr_url": "http://farm3.staticflickr.com/2422/3577229611_3a3235458a_z.jpg", "id": 384029},
    {"license": 1, "file_name": "COCO_train2014_000000222016.jpg", "coco_url": "http://mscoco.org/images/222016", "height": 640, "width": 480, "date_captured": "2013-11-14 16:37:59", "flickr_url": "http://farm2.staticflickr.com/1431/1118526611_09172475e5_z.jpg", "id": 222016},
    {"license": 4, "file_name": "COCO_train2014_000000475546.jpg", "coco_url": "http://mscoco.org/images/475546", "height": 375, "width": 500, "date_captured": "2013-11-25 21:20:23", "flickr_url": "http://farm1.staticflickr.com/167/423175046_6cd9d0205a_z.jpg", "id": 475546}
  ],
  "licenses": [
    {"url": "http://creativecommons.org/licenses/by-nc-sa/2.0/", "id": 1, "name": "Attribution-NonCommercial-ShareAlike License"},
    {"url": "http://creativecommons.org/licenses/by-nc/2.0/", "id": 2, "name": "Attribution-NonCommercial License"},
    {"url": "http://creativecommons.org/licenses/by-nc-nd/2.0/", "id": 3, "name": "Attribution-NonCommercial-NoDerivs License"},
    {"url": "http://creativecommons.org/licenses/by/2.0/", "id": 4, "name": "Attribution License"},
    {"url": "http://creativecommons.org/licenses/by-sa/2.0/", "id": 5, "name": "Attribution-ShareAlike License"},
    {"url": "http://creativecommons.org/licenses/by-nd/2.0/", "id": 6, "name": "Attribution-NoDerivs License"},
    {"url": "http://flickr.com/commons/usage/", "id": 7, "name": "No known copyright restrictions"},
    {"url": "http://www.usa.gov/copyright.shtml", "id": 8, "name": "United States Government Work"}
  ],
  "annotations": [
    {"segmentation": [[312.29, 562.89, 402.25, 232.61, 560.32, 300.72, 571.89]], "area": 54652.9556, "iscrowd": 0, "image_id": 480023, "bbox": [116.95, 305.86, 285.3, 266.03], "category_id": 58, "id": 86},  # this id is the annotation id; an image can have more than one annotation, so every annotation is numbered
    {"segmentation": [[252.46, 208.17, 267.96, 210.11, 208.45]], "area": 421.47274999999996, "iscrowd": 0, "image_id": 50518, "bbox": [245.54, 208.17, 40.14, 19.1], "category_id": 58, "id": 89},
    {"segmentation": [[349.66, 143.56, 344.19, 131.38, 352.94, 139.19, 355.13, 139.97, 354.5, 144.34]], "area": 292.12984999999935, "iscrowd": 0, "image_id": 497261, "bbox": [343.72, 112.63, 17.66, 31.71], "category_id": 1, "id": 2232195},
    {"segmentation": {"counts": [69901, 4, 21, 2, 470, 12, 468, 13, 467, 12, 468, 12, 468, 12, 469, 10, 471, 8, 474, 4, 73630], "size": [480, 640]}, "area": 2846, "iscrowd": 1, "image_id": 554752, "bbox": [145, 275, 341, 53], "category_id": 1, "id": 900100554752},
    {"segmentation": {"counts": [70375, 8, 415, 12, 411, 391, 34, 391, 34, 391, 35, 149], "size": [425, 640]}, "area": 7298, "iscrowd": 1, "image_id": 350724, "bbox": [165, 216, 474, 152], "category_id": 62, "id": 906200350724},
    {"segmentation": {"counts": [99015, 6, 352, 8, 349, 8, 75781], "size": [359, 640]}, "area": 6478, "iscrowd": 1, "image_id": 554743, "bbox": [275, 207, 153, 148], "category_id": 1, "id": 900100554743},
    {"segmentation": {"counts": [97214, 1, 425, 4, 6531], "size": [427, 640]}, "area": 3489, "iscrowd": 1, "image_id": 95999, "bbox": [227, 260, 397, 82], "category_id": 1, "id": 900100095999}
  ],
  "categories": [  # 80 classes in total
    {"supercategory": "person", "id": 1, "name": "person"},
    {"supercategory": "vehicle", "id": 2, "name": "bicycle"}, {"supercategory": "vehicle", "id": 3, "name": "car"}, {"supercategory": "vehicle", "id": 4, "name": "motorcycle"}, {"supercategory": "vehicle", "id": 5, "name": "airplane"}, {"supercategory": "vehicle", "id": 6, "name": "bus"}, {"supercategory": "vehicle", "id": 7, "name": "train"}, {"supercategory": "vehicle", "id": 8, "name": "truck"}, {"supercategory": "vehicle", "id": 9, "name": "boat"},
    {"supercategory": "outdoor", "id": 10, "name": "traffic light"}, {"supercategory": "outdoor", "id": 11, "name": "fire hydrant"}, {"supercategory": "outdoor", "id": 13, "name": "stop sign"}, {"supercategory": "outdoor", "id": 14, "name": "parking meter"}, {"supercategory": "outdoor", "id": 15, "name": "bench"},
    {"supercategory": "animal", "id": 16, "name": "bird"}, {"supercategory": "animal", "id": 17, "name": "cat"}, {"supercategory": "animal", "id": 18, "name": "dog"}, {"supercategory": "animal", "id": 19, "name": "horse"}, {"supercategory": "animal", "id": 20, "name": "sheep"}, {"supercategory": "animal", "id": 21, "name": "cow"}, {"supercategory": "animal", "id": 22, "name": "elephant"}, {"supercategory": "animal", "id": 23, "name": "bear"}, {"supercategory": "animal", "id": 24, "name": "zebra"}, {"supercategory": "animal", "id": 25, "name": "giraffe"},
    {"supercategory": "accessory", "id": 27, "name": "backpack"}, {"supercategory": "accessory", "id": 28, "name": "umbrella"}, {"supercategory": "accessory", "id": 31, "name": "handbag"}, {"supercategory": "accessory", "id": 32, "name": "tie"}, {"supercategory": "accessory", "id": 33, "name": "suitcase"},
    {"supercategory": "sports", "id": 34, "name": "frisbee"}, {"supercategory": "sports", "id": 35, "name": "skis"}, {"supercategory": "sports", "id": 36, "name": "snowboard"}, {"supercategory": "sports", "id": 37, "name": "sports ball"}, {"supercategory": "sports", "id": 38, "name": "kite"}, {"supercategory": "sports", "id": 39, "name": "baseball bat"}, {"supercategory": "sports", "id": 40, "name": "baseball glove"}, {"supercategory": "sports", "id": 41, "name": "skateboard"}, {"supercategory": "sports", "id": 42, "name": "surfboard"}, {"supercategory": "sports", "id": 43, "name": "tennis racket"},
    {"supercategory": "kitchen", "id": 44, "name": "bottle"}, {"supercategory": "kitchen", "id": 46, "name": "wine glass"}, {"supercategory": "kitchen", "id": 47, "name": "cup"}, {"supercategory": "kitchen", "id": 48, "name": "fork"}, {"supercategory": "kitchen", "id": 49, "name": "knife"}, {"supercategory": "kitchen", "id": 50, "name": "spoon"}, {"supercategory": "kitchen", "id": 51, "name": "bowl"},
    {"supercategory": "food", "id": 52, "name": "banana"}, {"supercategory": "food", "id": 53, "name": "apple"}, {"supercategory": "food", "id": 54, "name": "sandwich"}, {"supercategory": "food", "id": 55, "name": "orange"}, {"supercategory": "food", "id": 56, "name": "broccoli"}, {"supercategory": "food", "id": 57, "name": "carrot"}, {"supercategory": "food", "id": 58, "name": "hot dog"}, {"supercategory": "food", "id": 59, "name": "pizza"}, {"supercategory": "food", "id": 60, "name": "donut"}, {"supercategory": "food", "id": 61, "name": "cake"},
    {"supercategory": "furniture", "id": 62, "name": "chair"}, {"supercategory": "furniture", "id": 63, "name": "couch"}, {"supercategory": "furniture", "id": 64, "name": "potted plant"}, {"supercategory": "furniture", "id": 65, "name": "bed"}, {"supercategory": "furniture", "id": 67, "name": "dining table"}, {"supercategory": "furniture", "id": 70, "name": "toilet"},
    {"supercategory": "electronic", "id": 72, "name": "tv"}, {"supercategory": "electronic", "id": 73, "name": "laptop"}, {"supercategory": "electronic", "id": 74, "name": "mouse"}, {"supercategory": "electronic", "id": 75, "name": "remote"}, {"supercategory": "electronic", "id": 76, "name": "keyboard"}, {"supercategory": "electronic", "id": 77, "name": "cell phone"},
    {"supercategory": "appliance", "id": 78, "name": "microwave"}, {"supercategory": "appliance", "id": 79, "name": "oven"}, {"supercategory": "appliance", "id": 80, "name": "toaster"}, {"supercategory": "appliance", "id": 81, "name": "sink"}, {"supercategory": "appliance", "id": 82, "name": "refrigerator"},
    {"supercategory": "indoor", "id": 84, "name": "book"}, {"supercategory": "indoor", "id": 85, "name": "clock"}, {"supercategory": "indoor", "id": 86, "name": "vase"}, {"supercategory": "indoor", "id": 87, "name": "scissors"}, {"supercategory": "indoor", "id": 88, "name": "teddy bear"}, {"supercategory": "indoor", "id": 89, "name": "hair drier"}, {"supercategory": "indoor", "id": 90, "name": "toothbrush"}
  ]
}
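Two details of this file tend to matter when converting it to another format. First, there are 80 categories but the ids run up to 90 with gaps, so a training pipeline usually needs a mapping to contiguous indices. Second, annotations with "iscrowd": 1 store the mask as run-length "counts" rather than a polygon. The sketch below handles both under stated assumptions: the runs are taken to alternate starting with background and to be laid out column by column, which is my understanding of COCO's uncompressed RLE. The function names are illustrative only.

```python
def contiguous_label_map(categories):
    """Map sparse category ids (1..90 with gaps) to indices 0..N-1."""
    ids = sorted(c["id"] for c in categories)
    return {cat_id: index for index, cat_id in enumerate(ids)}


def decode_uncompressed_rle(counts, size):
    """Expand run-length counts into a row-major 2D list of 0/1 values.

    Assumes runs alternate 0, 1, 0, ... starting with 0, and that pixels
    are ordered column by column; size is [height, width].
    """
    height, width = size
    if sum(counts) != height * width:
        raise ValueError("run lengths do not cover the whole mask")
    flat = []
    value = 0
    for run in counts:
        flat.extend([value] * run)
        value = 1 - value
    # Pixel (row r, column c) sits at index c * height + r in the flat list.
    return [[flat[c * height + r] for c in range(width)] for r in range(height)]


# Toy example: a 3 x 4 mask with four foreground pixels.
toy_mask = decode_uncompressed_rle([2, 3, 4, 1, 2], [3, 4])
toy_area = sum(sum(row) for row in toy_mask)

toy_map = contiguous_label_map([{"id": 1}, {"id": 2}, {"id": 13}, {"id": 90}])
```

The "counts" lists in the sample above look abbreviated (their sums fall short of height times width), so they would trip the length check here; a complete file should not.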
PASCAL-VOC2012
- Official dataset page: http://host.robots.ox.ac.uk/pascal/VOC/voc2012/index.html
- Dataset downloads: benchmark_RELEASE: download link; voc2012: download link
- Detection annotation for each image:
<annotation>
  <folder>VOC2012</folder>
  <filename>2007_000027.jpg</filename>
  <source>
    <database>The VOC2007 Database</database>
    <annotation>PASCAL VOC2007</annotation>
    <image>flickr</image>
  </source>
  <size>
    <width>486</width>
    <height>500</height>
    <depth>3</depth>
  </size>
  <segmented>0</segmented>
  <object>
    <name>person</name>
    <pose>Unspecified</pose>
    <truncated>0</truncated>
    <difficult>0</difficult>
    <bndbox>
      <xmin>174</xmin>
      <ymin>101</ymin>
      <xmax>349</xmax>
      <ymax>351</ymax>
    </bndbox>
    <part>
      <name>head</name>
      <bndbox>
        <xmin>169</xmin>
        <ymin>104</ymin>
        <xmax>209</xmax>
        <ymax>146</ymax>
      </bndbox>
    </part>
    <part>
      <name>hand</name>
      <bndbox>
        <xmin>278</xmin>
        <ymin>210</ymin>
        <xmax>297</xmax>
        <ymax>233</ymax>
      </bndbox>
    </part>
    <part>
      <name>foot</name>
      <bndbox>
        <xmin>273</xmin>
        <ymin>333</ymin>
        <xmax>297</xmax>
        <ymax>354</ymax>
      </bndbox>
    </part>
    <part>
      <name>foot</name>
      <bndbox>
        <xmin>319</xmin>
        <ymin>307</ymin>
        <xmax>340</xmax>
        <ymax>326</ymax>
      </bndbox>
    </part>
  </object>
</annotation>
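As one possible route from this XML to COCO-style boxes, the sketch below parses a trimmed copy of the annotation above with Python's standard `xml.etree.ElementTree` and emits [x, y, width, height] boxes. The function name `voc_to_coco_boxes` and the output layout are illustrative choices, not part of either dataset's tooling. Width and height are computed as plain differences; some converters add 1 because VOC coordinates are inclusive, which is a convention to pick and apply consistently.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the annotation shown above.
VOC_XML = """<annotation>
  <filename>2007_000027.jpg</filename>
  <size><width>486</width><height>500</height><depth>3</depth></size>
  <object>
    <name>person</name>
    <difficult>0</difficult>
    <bndbox><xmin>174</xmin><ymin>101</ymin><xmax>349</xmax><ymax>351</ymax></bndbox>
    <part>
      <name>head</name>
      <bndbox><xmin>169</xmin><ymin>104</ymin><xmax>209</xmax><ymax>146</ymax></bndbox>
    </part>
  </object>
</annotation>"""


def voc_to_coco_boxes(xml_text):
    """Return (image_info, boxes) with boxes in COCO [x, y, w, h] form."""
    root = ET.fromstring(xml_text)
    size = root.find("size")
    image = {
        "file_name": root.findtext("filename"),
        "width": int(size.findtext("width")),
        "height": int(size.findtext("height")),
    }
    boxes = []
    # findall("object") matches direct children only, and obj.find("bndbox")
    # likewise, so the nested <part> boxes are not picked up.
    for obj in root.findall("object"):
        bb = obj.find("bndbox")
        xmin, ymin, xmax, ymax = (
            int(float(bb.findtext(tag))) for tag in ("xmin", "ymin", "xmax", "ymax")
        )
        boxes.append({
            "name": obj.findtext("name"),
            "bbox": [xmin, ymin, xmax - xmin, ymax - ymin],
            "difficult": int(obj.findtext("difficult", default="0")),
        })
    return image, boxes


image_info, object_boxes = voc_to_coco_boxes(VOC_XML)
```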
The VOC2012 dataset has 20 object classes (21 counting background), as follows:
- Person: person
- Animal: bird, cat, cow, dog, horse, sheep
- Vehicle: aeroplane, bicycle, boat, bus, car, motorbike, train
- Indoor: bottle, chair, dining table, potted plant, sofa, tv/monitor
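- Segmentation information: not every image in VOC2012 is used for segmentation. Each image used in the segmentation competition comes with the original image plus two kinds of PNG label image, one for class segmentation and one for object segmentation. In class segmentation, every object in the ground-truth image is filled with the color assigned to its class, one color for each of the 20 classes; for example, aeroplane is dark red (128, 0, 0) and bicycle is green (0, 128, 0). Object segmentation only needs to give each distinct object in an image its own color, and that color carries no class meaning.
- Note the difference between the two kinds of segmentation label image:
* SegmentationClass: labels the class of every pixel;
* SegmentationObject: labels which object every pixel belongs to.
- Because not every VOC2012 image is used for segmentation, txt files are needed to mark which images are. A program can use the list in train.txt to select images. The train and val splits together contain 2913 images.
- The PNGs in SegmentationClass are the class segmentation labels. For example, with two classes of object, person and aeroplane, each class corresponds to a specific color. Note that when decoded as RGB the images in this folder are three-channel color images, unlike the single-channel grayscale images seen earlier: the pixel values are then not class indices 0-20 but the corresponding RGB components. (The files are palette-indexed PNGs, so a reader that keeps the palette indices returns the class indices directly.)
- The PNGs in SegmentationObject only separate the different objects in the image and do not label the class each object belongs to;
- In the final step, the segmentation results obtained from fc8, saved in .mat format, are converted to the final segmentation images in .png format. It is not obvious which class each color represents; this can be worked out from the RGB color values in the create_labels.py program.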
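For working out which color stands for which class, one option is to regenerate the palette. The sketch below uses the bit-shuffling scheme that, as far as I know, the VOC devkit uses to build its colormap, and pairs it with the class names in VOC index order. It does not reproduce create_labels.py, whose contents are not shown here; `voc_colormap` and `rgb_label_to_indices` are illustrative names, and the RGB label image is assumed to be a nested list of (r, g, b) tuples.

```python
VOC_CLASSES = [
    "background", "aeroplane", "bicycle", "bird", "boat", "bottle", "bus",
    "car", "cat", "chair", "cow", "diningtable", "dog", "horse", "motorbike",
    "person", "pottedplant", "sheep", "sofa", "train", "tvmonitor",
]


def voc_colormap(n=256):
    """Build the palette: bits of the class index are spread over R, G, B."""
    cmap = []
    for i in range(n):
        r = g = b = 0
        c = i
        for j in range(8):
            r |= ((c >> 0) & 1) << (7 - j)
            g |= ((c >> 1) & 1) << (7 - j)
            b |= ((c >> 2) & 1) << (7 - j)
            c >>= 3
        cmap.append((r, g, b))
    return cmap


CMAP = voc_colormap()
COLOR_TO_INDEX = {CMAP[i]: i for i in range(len(VOC_CLASSES))}
COLOR_TO_INDEX[CMAP[255]] = 255  # boundary / "void" pixels, ignored in scoring


def rgb_label_to_indices(rgb_rows):
    """Convert a nested list of (r, g, b) tuples to class indices."""
    return [[COLOR_TO_INDEX[tuple(px)] for px in row] for row in rgb_rows]


# Toy 1 x 3 label image: background, aeroplane, person.
toy_indices = rgb_label_to_indices([[(0, 0, 0), (128, 0, 0), (192, 128, 128)]])
toy_names = [VOC_CLASSES[i] for i in toy_indices[0]]
```

A color that is not in the palette raises a `KeyError`, which is a deliberate choice here so that resized or JPEG-compressed label images are caught rather than silently mislabeled.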
Image Annotation Tools
Labelme
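Labelme is suited to building datasets for image segmentation tasks. It comes from this project: https://github.com/wkentaro/labelme. The software covers the basic segmentation annotation workflow and, on save, stores information about each object in a JSON file such as https://github.com/wkentaro/labelme/blob/master/static/apc2016_obj3.json. It also provides a function for converting the JSON file into a label image.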
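To connect Labelme output to the COCO layout described earlier, here is a rough sketch that turns polygon shapes into COCO-style annotation entries. It assumes the Labelme JSON has a "shapes" list whose items carry "label" and "points" keys, which is how I understand the format; check this against the linked sample file. The `name_to_id` mapping, the function names, and the toy input are all hypothetical, and the area is the polygon area from the shoelace formula rather than a pixel count.

```python
def polygon_to_coco(points):
    """points: [[x, y], ...] -> COCO-style segmentation, bbox, and area."""
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    # Shoelace formula for the area of a simple polygon.
    twice_area = 0.0
    n = len(points)
    for i in range(n):
        x1, y1 = points[i]
        x2, y2 = points[(i + 1) % n]
        twice_area += x1 * y2 - x2 * y1
    return {
        "segmentation": [[coord for p in points for coord in p]],
        "bbox": [min(xs), min(ys), max(xs) - min(xs), max(ys) - min(ys)],
        "area": abs(twice_area) / 2.0,
    }


def labelme_to_coco_annotations(labelme_dict, name_to_id, image_id):
    """Build one COCO-style annotation per Labelme shape."""
    annotations = []
    for ann_id, shape in enumerate(labelme_dict["shapes"], start=1):
        ann = polygon_to_coco(shape["points"])
        ann.update({
            "id": ann_id,
            "image_id": image_id,
            "category_id": name_to_id[shape["label"]],
            "iscrowd": 0,
        })
        annotations.append(ann)
    return annotations


# Hypothetical toy input: one square labelled "dog".
toy_labelme = {"shapes": [{"label": "dog",
                           "points": [[10, 20], [50, 20], [50, 60], [10, 60]]}]}
toy_annotations = labelme_to_coco_annotations(toy_labelme, {"dog": 18}, image_id=1)
```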
labelImg
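labelImg is suited to building datasets for image detection tasks. It comes from this project: https://github.com/tzutalin/labelImg. Its label storage feature and its "Next Image" and "Prev Image" controls make it convenient to use. The XML files it saves use the same format as the ImageNet dataset.
yolo_mark
yolo_mark is suited to building datasets for image detection tasks. It comes from this project: https://github.com/AlexeyAB/Yolo_mark. It is an open-source image annotation tool from AlexeyAB, intended to make it easier for others to train YOLO v2 models on their own tasks. It runs on both Linux and Windows and depends on the OpenCV library.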
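When moving between VOC-style XML boxes and YOLO training labels, the main difference is the box encoding. The sketch below assumes the usual Darknet convention of one text line per object, "class x_center y_center width height", with all four values divided by the image size; treat that as my understanding of the format rather than a specification. The function names and the class index 0 are illustrative.

```python
def voc_box_to_yolo(box, image_width, image_height):
    """[xmin, ymin, xmax, ymax] in pixels -> normalized (xc, yc, w, h)."""
    xmin, ymin, xmax, ymax = box
    x_center = (xmin + xmax) / 2.0 / image_width
    y_center = (ymin + ymax) / 2.0 / image_height
    width = (xmax - xmin) / float(image_width)
    height = (ymax - ymin) / float(image_height)
    return x_center, y_center, width, height


def yolo_to_voc_box(yolo_box, image_width, image_height):
    """Inverse of voc_box_to_yolo, returning pixel corner coordinates."""
    x_center, y_center, width, height = yolo_box
    xmin = (x_center - width / 2.0) * image_width
    ymin = (y_center - height / 2.0) * image_height
    xmax = (x_center + width / 2.0) * image_width
    ymax = (y_center + height / 2.0) * image_height
    return [xmin, ymin, xmax, ymax]


def yolo_line(class_index, box, image_width, image_height):
    """Format one label line for a single object."""
    values = voc_box_to_yolo(box, image_width, image_height)
    return "%d %.6f %.6f %.6f %.6f" % ((class_index,) + values)


# The person box from the VOC sample above, in a 486 x 500 image.
sample_yolo = voc_box_to_yolo([174, 101, 349, 351], 486, 500)
sample_line = yolo_line(0, [174, 101, 349, 351], 486, 500)
```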
A Brief Introduction to KITTI and Cityscapes
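KITTI was jointly created by the Karlsruhe Institute of Technology in Germany and the Toyota Technological Institute at Chicago and, at the time of writing, is the largest international benchmark dataset for evaluating computer vision algorithms in autonomous driving scenarios. It is used to evaluate object detection (motor vehicles, non-motorized vehicles, pedestrians, and so on), object tracking, road segmentation, and other computer vision techniques in an in-vehicle setting.
KITTI contains real image data collected in urban, rural, and highway scenes, with up to 15 cars and 30 pedestrians per image and varying degrees of occlusion. In KITTI, object detection covers three individual tasks (car, pedestrian, and cyclist detection); object tracking covers two (car tracking and pedestrian tracking); and road segmentation covers four: the three scenarios urban unmarked, urban marked, and urban multiple marked, plus urban road, the average over those three.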
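The Cityscapes dataset, led by Mercedes-Benz, provides image segmentation data for autonomous driving environments and is used to evaluate how well vision algorithms understand the semantics of urban scenes. Cityscapes contains street scenes from 50 cities across different scenes, backgrounds, and seasons, with 5000 finely annotated images, 20000 coarsely annotated images, and 30 annotated classes. Algorithm performance is scored with the PASCAL VOC standard intersection-over-union (IoU) measure. Cityscapes has two evaluation settings, fine and coarse: the former provides the 5000 finely annotated images, and the latter provides those 5000 plus the 20000 coarsely annotated images.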
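The IoU score mentioned above is, per class, TP / (TP + FP + FN) counted over pixels. The sketch below computes it from flat lists of ground-truth and predicted labels. The function names, the flat-list input, and the ignore label 255 are assumptions made for illustration; the official evaluation scripts accumulate these counts over the whole test set rather than per image.

```python
def per_class_iou(gt, pred, num_classes, ignore_label=255):
    """Per-class IoU = TP / (TP + FP + FN); None if a class never occurs."""
    tp = [0] * num_classes
    fp = [0] * num_classes
    fn = [0] * num_classes
    for g, p in zip(gt, pred):
        if g == ignore_label:
            continue  # unlabeled pixels do not count for or against
        if g == p:
            tp[g] += 1
        else:
            fn[g] += 1
            fp[p] += 1
    ious = []
    for c in range(num_classes):
        denom = tp[c] + fp[c] + fn[c]
        ious.append(tp[c] / float(denom) if denom else None)
    return ious


def mean_iou(ious):
    """Average over the classes that actually occur."""
    present = [v for v in ious if v is not None]
    return sum(present) / len(present)


# Toy example: five pixels, two classes, last pixel ignored.
toy_gt = [0, 0, 1, 1, 255]
toy_pred = [0, 1, 1, 1, 0]
toy_ious = per_class_iou(toy_gt, toy_pred, num_classes=2)
toy_miou = mean_iou(toy_ious)
```

In the toy example class 0 has one true positive and one false negative, and class 1 has two true positives and one false positive, so the per-class scores are 1/2 and 2/3.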