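Using the COCO dataset
Reference: https://www.cnblogs.com/q735613050/p/8969452.html
For downloading the dataset, see https://blog.csdn.net/u013249853/article/details/84924808
COCO, released by Microsoft, is a large-scale image dataset designed for object detection, segmentation, person keypoint detection, semantic segmentation, and caption generation.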
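COCO resources:
- MS COCO dataset home page: http://mscoco.org/
- GitHub repository: https://github.com/Xinering/cocoapi
- More details about the API: http://mscoco.org/dataset/#download
The COCO API is available for Matlab, Python, and Lua. It handles loading, parsing, and visualizing the full annotation data. The website also provides papers and tutorials related to the dataset.
Before using the API and the demo provided with COCO, first download the COCO images and annotations (class labels, instance counts, pixel-level segmentations, and so on):
- download the images into the coco/images/ folder
- download the annotations into the coco/annotations/ folder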
Installation:
pip install pycocotools
This failed with:
from Cython.Build import cythonize
ModuleNotFoundError: No module named 'Cython'
----------------------------------------
ERROR: Command errored out with exit status 1: python setup.py egg_info Check the logs for full command output.
Fix: install Cython first:
pip install Cython
Collecting Cython
  Downloading Cython-0.29.19-cp37-cp37m-macosx_10_9_x86_64.whl (1.9 MB)
After that, the installation succeeded.
However, importing the package after installation still failed:
ModuleNotFoundError: No module named 'pycocotools._mask'
At first I suspected the Python version and switched from 3.7 to 3.5, but that turned out not to be the cause.
The actual fix is to first run the following in the ../cocoapi-master/PythonAPI directory:
$ make
python setup.py build_ext --inplace
running build_ext
cythoning pycocotools/_mask.pyx to pycocotools/_mask.c
/anaconda3/envs/deeplearning/lib/python3.7/site-packages/Cython/Compiler/Main.py:369: FutureWarning: Cython directive 'language_level' not set, using 2 for now (Py2). This will change in a later release! File: /Users/user/pytorch/ssd.pytorch-master/cocoapi-master/PythonAPI/pycocotools/_mask.pyx
  tree = Parsing.p_module(s, pxd, full_module_name)
building 'pycocotools._mask' extension
creating build
creating build/common
creating build/temp.macosx-10.9-x86_64-3.7
creating build/temp.macosx-10.9-x86_64-3.7/pycocotools
gcc -Wno-unused-result -Wsign-compare -Wunreachable-code -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -I/anaconda3/envs/deeplearning/include -arch x86_64 -I/anaconda3/envs/deeplearning/include -arch x86_64 -I/anaconda3/envs/deeplearning/lib/python3.7/site-packages/numpy/core/include -I../common -I/anaconda3/envs/deeplearning/include/python3.7m -c ../common/maskApi.c -o build/temp.macosx-10.9-x86_64-3.7/../common/maskApi.o -Wno-cpp -Wno-unused-function -std=c99
gcc -Wno-unused-result -Wsign-compare -Wunreachable-code -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -I/anaconda3/envs/deeplearning/include -arch x86_64 -I/anaconda3/envs/deeplearning/include -arch x86_64 -I/anaconda3/envs/deeplearning/lib/python3.7/site-packages/numpy/core/include -I../common -I/anaconda3/envs/deeplearning/include/python3.7m -c pycocotools/_mask.c -o build/temp.macosx-10.9-x86_64-3.7/pycocotools/_mask.o -Wno-cpp -Wno-unused-function -std=c99
creating build/lib.macosx-10.9-x86_64-3.7
creating build/lib.macosx-10.9-x86_64-3.7/pycocotools
gcc -bundle -undefined dynamic_lookup -L/anaconda3/envs/deeplearning/lib -arch x86_64 -L/anaconda3/envs/deeplearning/lib -arch x86_64 -arch x86_64 build/temp.macosx-10.9-x86_64-3.7/../common/maskApi.o build/temp.macosx-10.9-x86_64-3.7/pycocotools/_mask.o -o build/lib.macosx-10.9-x86_64-3.7/pycocotools/_mask.cpython-37m-darwin.so
copying build/lib.macosx-10.9-x86_64-3.7/pycocotools/_mask.cpython-37m-darwin.so -> pycocotools
rm -rf build
This command calls setup.py and builds the pycocotools._mask extension module.
After that, the error did not appear again.
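To confirm that the build produced the compiled extension before opening the notebook, a quick availability check can help. The following is a minimal sketch using only the standard library; the helper names is_importable and check_pycocotools are made up for this illustration and are not part of pycocotools.

```python
import importlib.util


def is_importable(module_name):
    """Return True if Python can locate `module_name` on the current path.

    For a dotted name such as 'pycocotools._mask', find_spec imports the
    parent package first and raises if the parent is missing.
    """
    try:
        return importlib.util.find_spec(module_name) is not None
    except (ImportError, ValueError):
        # The parent package is missing or the module is in a broken state.
        return False


def check_pycocotools():
    """Report which of the modules involved in the errors above can be found."""
    names = ["Cython", "pycocotools", "pycocotools._mask"]
    return {name: is_importable(name) for name in names}


if __name__ == "__main__":
    for name, ok in check_pycocotools().items():
        print("{:<20} {}".format(name, "found" if ok else "MISSING"))
```

If pycocotools is found but pycocotools._mask is missing, the extension has not been built for the package Python is picking up, which matches the error described above.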
pycocoDemo.ipynb
First, load the modules:
%matplotlib inline
from pycocotools.coco import COCO
import numpy as np
import skimage.io as io
import matplotlib.pyplot as plt
import pylab
pylab.rcParams['figure.figsize'] = (8.0, 10.0)
1. Displaying instance annotations
1) Load the corresponding annotation JSON file and initialize the COCO API:
dataDir='/Users/user/pytorch/ssd.pytorch-master/download_data/coco'
dataType='val2014'
annFile='{}/annotations/annotations/instances_{}.json'.format(dataDir,dataType)
# initialize COCO api for instance annotations
coco=COCO(annFile)
Adjust the directory to match where you stored the files.
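Because the path depends on the local layout, it can be worth checking that the annotation file exists before handing it to COCO(). Below is a small sketch with pathlib. The helper names annotation_path and require_file are hypothetical, and the default annotations/annotations nesting simply mirrors the path used in this post; your layout may differ.

```python
from pathlib import Path


def annotation_path(data_dir, data_type, kind="instances",
                    subdirs=("annotations", "annotations")):
    """Build <data_dir>/<subdirs...>/<kind>_<data_type>.json.

    `subdirs` defaults to the nested layout used in this post (an assumption
    about the local directory structure, not a COCO requirement).
    """
    path = Path(data_dir)
    for part in subdirs:
        path = path / part
    return path / "{}_{}.json".format(kind, data_type)


def require_file(path):
    """Return `path` as a Path, or raise FileNotFoundError with a clear message."""
    path = Path(path)
    if not path.is_file():
        raise FileNotFoundError("annotation file not found: {}".format(path))
    return path


if __name__ == "__main__":
    print(annotation_path("/data/coco", "val2014"))
    print(annotation_path("/data/coco", "val2014", kind="captions"))
```

Calling require_file(annotation_path(dataDir, dataType)) before COCO(...) turns a wrong directory into an explicit error message that names the missing path.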
Output:
loading annotations into memory...
Done (t=4.33s)
creating index...
index created!
The contents of this JSON file are similar to those of the training-set file instances_train2014.json:
{"info": {"description": "This is stable 1.0 version of the 2014 MS COCO dataset.", "url": "http://mscoco.org", "version": "1.0", "year": 2014, "contributor": "Microsoft COCO group", "date_created": "2015-01-27 09:11:52.357475"},
 "images": [
  {"license": 5, "file_name": "COCO_train2014_000000057870.jpg", "coco_url": "http://mscoco.org/images/57870", "height": 480, "width": 640, "date_captured": "2013-11-14 16:28:13", "flickr_url": "http://farm4.staticflickr.com/3153/2970773875_164f0c0b83_z.jpg", "id": 57870},  # image_id
  {"license": 5, "file_name": "COCO_train2014_000000384029.jpg", "coco_url": "http://mscoco.org/images/384029", "height": 429, "width": 640, "date_captured": "2013-11-14 16:29:45", "flickr_url": "http://farm3.staticflickr.com/2422/3577229611_3a3235458a_z.jpg", "id": 384029},
  {"license": 1, "file_name": "COCO_train2014_000000222016.jpg", "coco_url": "http://mscoco.org/images/222016", "height": 640, "width": 480, "date_captured": "2013-11-14 16:37:59", "flickr_url": "http://farm2.staticflickr.com/1431/1118526611_09172475e5_z.jpg", "id": 222016},
  {"license": 4, "file_name": "COCO_train2014_000000475546.jpg", "coco_url": "http://mscoco.org/images/475546", "height": 375, "width": 500, "date_captured": "2013-11-25 21:20:23", "flickr_url": "http://farm1.staticflickr.com/167/423175046_6cd9d0205a_z.jpg", "id": 475546}],
 "licenses": [
  {"url": "http://creativecommons.org/licenses/by-nc-sa/2.0/", "id": 1, "name": "Attribution-NonCommercial-ShareAlike License"},
  {"url": "http://creativecommons.org/licenses/by-nc/2.0/", "id": 2, "name": "Attribution-NonCommercial License"},
  {"url": "http://creativecommons.org/licenses/by-nc-nd/2.0/", "id": 3, "name": "Attribution-NonCommercial-NoDerivs License"},
  {"url": "http://creativecommons.org/licenses/by/2.0/", "id": 4, "name": "Attribution License"},
  {"url": "http://creativecommons.org/licenses/by-sa/2.0/", "id": 5, "name": "Attribution-ShareAlike License"},
  {"url": "http://creativecommons.org/licenses/by-nd/2.0/", "id": 6, "name": "Attribution-NoDerivs License"},
  {"url": "http://flickr.com/commons/usage/", "id": 7, "name": "No known copyright restrictions"},
  {"url": "http://www.usa.gov/copyright.shtml", "id": 8, "name": "United States Government Work"}],
 "annotations": [
  {"segmentation": [[312.29, 562.89, 402.25, 232.61, 560.32, 300.72, 571.89]], "area": 54652.9556, "iscrowd": 0, "image_id": 480023, "bbox": [116.95, 305.86, 285.3, 266.03], "category_id": 58, "id": 86},  # this id is the annotation id; an image can have more than one annotation, so each annotation is numbered
  {"segmentation": [[252.46, 208.17, 267.96, 210.11, 208.45]], "area": 421.47274999999996, "iscrowd": 0, "image_id": 50518, "bbox": [245.54, 208.17, 40.14, 19.1], "category_id": 58, "id": 89},
  {"segmentation": [[349.66, 143.56, 344.19, 131.38, 352.94, 139.19, 355.13, 139.97, 354.5, 144.34]], "area": 292.12984999999935, "iscrowd": 0, "image_id": 497261, "bbox": [343.72, 112.63, 17.66, 31.71], "category_id": 1, "id": 2232195},
  {"segmentation": {"counts": [69901, 4, 21, 2, 470, 12, 468, 13, 467, 12, 468, 12, 468, 12, 469, 10, 471, 8, 474, 4, 73630], "size": [480, 640]}, "area": 2846, "iscrowd": 1, "image_id": 554752, "bbox": [145, 275, 341, 53], "category_id": 1, "id": 900100554752},
  {"segmentation": {"counts": [70375, 8, 415, 12, 411, 391, 34, 391, 34, 391, 35, 149], "size": [425, 640]}, "area": 7298, "iscrowd": 1, "image_id": 350724, "bbox": [165, 216, 474, 152], "category_id": 62, "id": 906200350724},
  {"segmentation": {"counts": [99015, 6, 352, 8, 349, 8, 75781], "size": [359, 640]}, "area": 6478, "iscrowd": 1, "image_id": 554743, "bbox": [275, 207, 153, 148], "category_id": 1, "id": 900100554743},
  {"segmentation": {"counts": [97214, 1, 425, 4, 6531], "size": [427, 640]}, "area": 3489, "iscrowd": 1, "image_id": 95999, "bbox": [227, 260, 397, 82], "category_id": 1, "id": 900100095999}],
 "categories": [
  {"supercategory": "person", "id": 1, "name": "person"},  # 80 categories in total
  {"supercategory": "vehicle", "id": 2, "name": "bicycle"},
  {"supercategory": "vehicle", "id": 3, "name": "car"},
  {"supercategory": "vehicle", "id": 4, "name": "motorcycle"},
  {"supercategory": "vehicle", "id": 5, "name": "airplane"},
  {"supercategory": "vehicle", "id": 6, "name": "bus"},
  {"supercategory": "vehicle", "id": 7, "name": "train"},
  {"supercategory": "vehicle", "id": 8, "name": "truck"},
  {"supercategory": "vehicle", "id": 9, "name": "boat"},
  {"supercategory": "outdoor", "id": 10, "name": "traffic light"},
  {"supercategory": "outdoor", "id": 11, "name": "fire hydrant"},
  {"supercategory": "outdoor", "id": 13, "name": "stop sign"},
  {"supercategory": "outdoor", "id": 14, "name": "parking meter"},
  {"supercategory": "outdoor", "id": 15, "name": "bench"},
  {"supercategory": "animal", "id": 16, "name": "bird"},
  {"supercategory": "animal", "id": 17, "name": "cat"},
  {"supercategory": "animal", "id": 18, "name": "dog"},
  {"supercategory": "animal", "id": 19, "name": "horse"},
  {"supercategory": "animal", "id": 20, "name": "sheep"},
  {"supercategory": "animal", "id": 21, "name": "cow"},
  {"supercategory": "animal", "id": 22, "name": "elephant"},
  {"supercategory": "animal", "id": 23, "name": "bear"},
  {"supercategory": "animal", "id": 24, "name": "zebra"},
  {"supercategory": "animal", "id": 25, "name": "giraffe"},
  {"supercategory": "accessory", "id": 27, "name": "backpack"},
  {"supercategory": "accessory", "id": 28, "name": "umbrella"},
  {"supercategory": "accessory", "id": 31, "name": "handbag"},
  {"supercategory": "accessory", "id": 32, "name": "tie"},
  {"supercategory": "accessory", "id": 33, "name": "suitcase"},
  {"supercategory": "sports", "id": 34, "name": "frisbee"},
  {"supercategory": "sports", "id": 35, "name": "skis"},
  {"supercategory": "sports", "id": 36, "name": "snowboard"},
  {"supercategory": "sports", "id": 37, "name": "sports ball"},
  {"supercategory": "sports", "id": 38, "name": "kite"},
  {"supercategory": "sports", "id": 39, "name": "baseball bat"},
  {"supercategory": "sports", "id": 40, "name": "baseball glove"},
  {"supercategory": "sports", "id": 41, "name": "skateboard"},
  {"supercategory": "sports", "id": 42, "name": "surfboard"},
  {"supercategory": "sports", "id": 43, "name": "tennis racket"},
  {"supercategory": "kitchen", "id": 44, "name": "bottle"},
  {"supercategory": "kitchen", "id": 46, "name": "wine glass"},
  {"supercategory": "kitchen", "id": 47, "name": "cup"},
  {"supercategory": "kitchen", "id": 48, "name": "fork"},
  {"supercategory": "kitchen", "id": 49, "name": "knife"},
  {"supercategory": "kitchen", "id": 50, "name": "spoon"},
  {"supercategory": "kitchen", "id": 51, "name": "bowl"},
  {"supercategory": "food", "id": 52, "name": "banana"},
  {"supercategory": "food", "id": 53, "name": "apple"},
  {"supercategory": "food", "id": 54, "name": "sandwich"},
  {"supercategory": "food", "id": 55, "name": "orange"},
  {"supercategory": "food", "id": 56, "name": "broccoli"},
  {"supercategory": "food", "id": 57, "name": "carrot"},
  {"supercategory": "food", "id": 58, "name": "hot dog"},
  {"supercategory": "food", "id": 59, "name": "pizza"},
  {"supercategory": "food", "id": 60, "name": "donut"},
  {"supercategory": "food", "id": 61, "name": "cake"},
  {"supercategory": "furniture", "id": 62, "name": "chair"},
  {"supercategory": "furniture", "id": 63, "name": "couch"},
  {"supercategory": "furniture", "id": 64, "name": "potted plant"},
  {"supercategory": "furniture", "id": 65, "name": "bed"},
  {"supercategory": "furniture", "id": 67, "name": "dining table"},
  {"supercategory": "furniture", "id": 70, "name": "toilet"},
  {"supercategory": "electronic", "id": 72, "name": "tv"},
  {"supercategory": "electronic", "id": 73, "name": "laptop"},
  {"supercategory": "electronic", "id": 74, "name": "mouse"},
  {"supercategory": "electronic", "id": 75, "name": "remote"},
  {"supercategory": "electronic", "id": 76, "name": "keyboard"},
  {"supercategory": "electronic", "id": 77, "name": "cell phone"},
  {"supercategory": "appliance", "id": 78, "name": "microwave"},
  {"supercategory": "appliance", "id": 79, "name": "oven"},
  {"supercategory": "appliance", "id": 80, "name": "toaster"},
  {"supercategory": "appliance", "id": 81, "name": "sink"},
  {"supercategory": "appliance", "id": 82, "name": "refrigerator"},
  {"supercategory": "indoor", "id": 84, "name": "book"},
  {"supercategory": "indoor", "id": 85, "name": "clock"},
  {"supercategory": "indoor", "id": 86, "name": "vase"},
  {"supercategory": "indoor", "id": 87, "name": "scissors"},
  {"supercategory": "indoor", "id": 88, "name": "teddy bear"},
  {"supercategory": "indoor", "id": 89, "name": "hair drier"},
  {"supercategory": "indoor", "id": 90, "name": "toothbrush"}]}
Inspecting the file that was actually loaded:
print(coco.dataset['info'])
print(coco.dataset['images'][0])
print(coco.dataset['licenses'][0])
print(coco.dataset['annotations'][0])
print(coco.dataset['categories'][0])
Output:
{'description': 'COCO 2014 Dataset', 'url': 'http://cocodataset.org', 'version': '1.0', 'year': 2014, 'contributor': 'COCO Consortium', 'date_created': '2017/09/01'}
{'license': 3, 'file_name': 'COCO_val2014_000000391895.jpg', 'coco_url': 'http://images.cocodataset.org/val2014/COCO_val2014_000000391895.jpg', 'height': 360, 'width': 640, 'date_captured': '2013-11-14 11:18:45', 'flickr_url': 'http://farm9.staticflickr.com/8186/8119368305_4e622c8349_z.jpg', 'id': 391895}
{'url': 'http://creativecommons.org/licenses/by-nc-sa/2.0/', 'id': 1, 'name': 'Attribution-NonCommercial-ShareAlike License'}
{'segmentation': [[239.97, 260.24, 222.04, 270.49, 199.84, 253.41, 213.5, 227.79, 259.62, 200.46, 274.13, 202.17, 277.55, 210.71, 249.37, 253.41, 237.41, 264.51, 242.54, 261.95, 228.87, 271.34]], 'area': 2765.1486500000005, 'iscrowd': 0, 'image_id': 558840, 'bbox': [199.84, 200.46, 77.71, 70.88], 'category_id': 58, 'id': 156}
{'supercategory': 'person', 'id': 1, 'name': 'person'}
In summary, the JSON file has the form:
{
    "info": info,
    "licenses": [license],
    "images": [image],
    "annotations": [annotation],
    "categories": [category]
}
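The same top-level structure can be explored without pycocotools, using only the json module. The sketch below parses a heavily trimmed sample (ids taken from the outputs in this post, most fields omitted) and groups annotations by image, which is roughly what the imgToAnns index in the API does. The names SAMPLE, summarize, and index_by_image are illustrative only.

```python
import json
from collections import defaultdict

# A trimmed stand-in for an instances_*.json file; real files carry many more fields.
SAMPLE = """
{"info": {"year": 2014},
 "licenses": [{"id": 1, "name": "Attribution-NonCommercial-ShareAlike License"}],
 "images": [{"id": 324158, "file_name": "COCO_val2014_000000324158.jpg",
             "height": 334, "width": 500}],
 "annotations": [
   {"id": 10673, "image_id": 324158, "category_id": 18, "iscrowd": 0},
   {"id": 638724, "image_id": 324158, "category_id": 41, "iscrowd": 0},
   {"id": 2162813, "image_id": 324158, "category_id": 1, "iscrowd": 0}],
 "categories": [{"id": 1, "name": "person"}, {"id": 18, "name": "dog"},
                {"id": 41, "name": "skateboard"}]}
"""


def summarize(dataset):
    """Count the entries in every list-valued top-level section."""
    return {key: len(value) for key, value in dataset.items()
            if isinstance(value, list)}


def index_by_image(dataset):
    """Group annotations by image_id, similar in spirit to COCO.imgToAnns."""
    img_to_anns = defaultdict(list)
    for ann in dataset.get("annotations", []):
        img_to_anns[ann["image_id"]].append(ann)
    return img_to_anns


if __name__ == "__main__":
    dataset = json.loads(SAMPLE)
    print(summarize(dataset))
    print([ann["id"] for ann in index_by_image(dataset)[324158]])
```

For a real annotation file, replace json.loads(SAMPLE) with json.load on the opened file; note that the full instances files are large, so loading takes a few seconds, as the timings above show.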
2) Display all instance categories and the supercategories they belong to
# display COCO categories and supercategories
print(coco.getCatIds())  # get the category ids
cats = coco.loadCats(coco.getCatIds())  # look up category info by id: name, supercategory, and id
print(cats[0])  # print one category
nms = [cat['name'] for cat in cats]  # all category names
print('COCO categories: \n{}\n'.format(' '.join(nms)))
nms = set([cat['supercategory'] for cat in cats])  # deduplicate to get all supercategory names
print('COCO supercategories: \n{}'.format(' '.join(nms)))
Output:
[1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 27, 28, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 67, 70, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 84, 85, 86, 87, 88, 89, 90]
{'supercategory': 'person', 'id': 1, 'name': 'person'}
COCO categories:
person bicycle car motorcycle airplane bus train truck boat traffic light fire hydrant stop sign parking meter bench bird cat dog horse sheep cow elephant bear zebra giraffe backpack umbrella handbag tie suitcase frisbee skis snowboard sports ball kite baseball bat baseball glove skateboard surfboard tennis racket bottle wine glass cup fork knife spoon bowl banana apple sandwich orange broccoli carrot hot dog pizza donut cake chair couch potted plant bed dining table toilet tv laptop mouse remote keyboard cell phone microwave oven toaster sink refrigerator book clock vase scissors teddy bear hair drier toothbrush

COCO supercategories:
sports electronic indoor animal kitchen outdoor furniture appliance food vehicle person accessory
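The id list above has gaps (12, 26, 29, and so on are missing), so the 80 category ids run from 1 to 90. Training code commonly wants contiguous class indices instead. The following sketch builds such a mapping from the ids printed above; build_label_maps is a hypothetical helper, not something pycocotools provides.

```python
# Category ids exactly as returned by coco.getCatIds() in the output above.
CAT_IDS = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 13, 14, 15, 16, 17, 18, 19, 20,
           21, 22, 23, 24, 25, 27, 28, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40,
           41, 42, 43, 44, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58,
           59, 60, 61, 62, 63, 64, 65, 67, 70, 72, 73, 74, 75, 76, 77, 78, 79,
           80, 81, 82, 84, 85, 86, 87, 88, 89, 90]


def build_label_maps(cat_ids):
    """Map sparse COCO category ids to contiguous 0-based indices and back."""
    id_to_index = {cat_id: i for i, cat_id in enumerate(sorted(cat_ids))}
    index_to_id = {i: cat_id for cat_id, i in id_to_index.items()}
    return id_to_index, index_to_id


if __name__ == "__main__":
    id_to_index, index_to_id = build_label_maps(CAT_IDS)
    print(len(id_to_index))                      # number of categories
    print(id_to_index[18], index_to_id[id_to_index[18]])  # 'dog' round trip
```

Keeping both directions around matters at evaluation time, because results written back for COCO evaluation need the original category ids, not the contiguous indices.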
3) Specify the category ids to look for, find the images that contain those categories, and load the image info
# get all images containing given categories, select one at random
catIds = coco.getCatIds(catNms=['person','dog','skateboard'])  # ids of just these three categories
print(catIds)
imgIds = coco.getImgIds(catIds=catIds)  # ids of images that contain all of the given categories
print(imgIds)
imgIds = coco.getImgIds(imgIds=[324158])  # pick one image, id 324158; any id from the list above works
print(imgIds)
print(len(imgIds))
# randint(0, 1) draws from [0, 1) and can only return 0, so this takes the single id in imgIds (324158) and loads its image info
img = coco.loadImgs(imgIds[np.random.randint(0,len(imgIds))])[0]
print(img)
Output:
[1, 18, 41]
[438915, 209028, 500100, 372874, 282768, 360595, 366484, 449560, 28842, 241837, 324158, 231240, 493020, 547421, 549220, 255209, 353644, 279278, 45175]
[324158]
1
{'license': 1, 'file_name': 'COCO_val2014_000000324158.jpg', 'coco_url': 'http://images.cocodataset.org/val2014/COCO_val2014_000000324158.jpg', 'height': 334, 'width': 500, 'date_captured': '2013-11-19 23:54:06', 'flickr_url': 'http://farm1.staticflickr.com/169/417836491_5bf8762150_z.jpg', 'id': 324158}
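Passing several category ids to getImgIds narrows the result: in the source reproduced at the end of this post, the per-category image sets are combined with &=, so only images containing every listed category are returned. The sketch below shows that intersection, and the union alternative, on a toy index. The cat_to_imgs contents other than 324158 are invented for illustration, and both function names are hypothetical.

```python
def images_with_all(cat_to_imgs, cat_ids):
    """Image ids that contain every category in cat_ids (set intersection)."""
    ids = None
    for cat_id in cat_ids:
        imgs = set(cat_to_imgs.get(cat_id, []))
        ids = imgs if ids is None else ids & imgs
    return sorted(ids or [])


def images_with_any(cat_to_imgs, cat_ids):
    """Image ids that contain at least one category in cat_ids (set union)."""
    ids = set()
    for cat_id in cat_ids:
        ids |= set(cat_to_imgs.get(cat_id, []))
    return sorted(ids)


if __name__ == "__main__":
    # Toy category -> image index; 1=person, 18=dog, 41=skateboard.
    cat_to_imgs = {1: [100, 200, 324158], 18: [200, 324158], 41: [324158, 300]}
    print(images_with_all(cat_to_imgs, [1, 18, 41]))
    print(images_with_any(cat_to_imgs, [1, 18, 41]))
```

If images with any of the categories are wanted from the real API, one option is to call getImgIds once per category and merge the results.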
4) The image can be loaded from local storage or from its URL
# load and display image
# I = io.imread('%s/images/%s/%s'%(dataDir,dataType,img['file_name']))
# use url to load image
print(img['coco_url'])  # load the image from its coco_url
I = io.imread(img['coco_url'])
plt.axis('off')
plt.imshow(I)
plt.show()
Output:
http://images.cocodataset.org/val2014/COCO_val2014_000000324158.jpg
5) Get the annotation ids, then load the detailed annotations by id
# load and display instance annotations
plt.imshow(I); plt.axis('off')
# Using this image's id and the category ids to annotate in it (catIds, i.e. ['person','dog','skateboard']),
# get the ids of the matching annotations
annIds = coco.getAnnIds(imgIds=img['id'], catIds=catIds, iscrowd=None)
print(annIds)
# load the annotation details for those ids
anns = coco.loadAnns(annIds)
print(anns)
coco.showAnns(anns)
Output:
[10673, 638724, 2162813]
[{'segmentation': [[216.7, 211.89, 216.16, 217.81, 215.89, 220.77, 215.89, 223.73, 217.77, 225.35, 219.12, 224.54, 219.12, 220.5, 219.66, 217.27, 219.93, 212.7, 220.46, 207.85, 219.66, 203.01, 218.85, 198.43, 217.77, 195.74, 216.7, 194.93, 215.62, 190.62, 215.62, 186.59, 214.27, 183.89, 211.85, 184.16, 211.85, 187.66, 210.24, 187.66, 209.16, 184.97, 207.81, 183.36, 205.12, 186.59, 205.12, 189.28, 201.08, 192.78, 199.74, 195.2, 196.78, 200.04, 196.51, 203.01, 198.12, 205.43, 197.32, 209.2, 196.78, 213.23, 197.05, 218.89, 199.74, 221.85, 201.62, 225.35, 201.62, 233.69, 201.08, 236.11, 202.97, 236.38, 204.85, 236.11, 204.58, 232.34, 203.78, 228.85, 205.39, 233.15, 207.81, 235.57, 208.62, 234.23, 206.74, 231.27, 205.12, 228.04, 206.74, 222.39, 208.35, 219.96, 210.77, 217.54, 211.85, 221.85, 214.54, 223.73, 212.93, 217.54, 212.93, 215.66, 215.89, 212.96, 216.16, 212.16]], 'area': 759.3375500000002, 'iscrowd': 0, 'image_id': 324158, 'bbox': [196.51, 183.36, 23.95, 53.02], 'category_id': 18, 'id': 10673},
 {'segmentation': [[223.48, 251.26, 230.81, 246.74, 234.48, 247.6, 241.8, 247.6, 247.41, 243.72, 248.7, 244.15, 252.15, 249.54, 252.15, 254.71, 249.78, 255.79, 247.19, 260.32, 243.1, 263.33, 235.77, 263.33, 224.56, 262.47, 223.91, 259.24, 224.13, 254.5]], 'area': 409.74355, 'iscrowd': 0, 'image_id': 324158, 'bbox': [223.48, 243.72, 28.67, 19.61], 'category_id': 41, 'id': 638724},
 {'segmentation': [[228.43, 247.9, 229.63, 206.62, 224.24, 191.07, 220.65, 179.7, 207.49, 169.53, 202.71, 163.55, 205.7, 133.04, 218.86, 121.68, 213.47, 104.33, 225.44, 96.55, 236.8, 106.12, 236.8, 116.29, 254.15, 127.06, 263.72, 150.39, 274.49, 166.54, 271.5, 177.31, 266.12, 181.5, 257.14, 159.96, 254.75, 177.91, 261.93, 192.27, 262.53, 216.79, 261.33, 234.14, 268.51, 249.1, 247.57, 246.11, 245.78, 249.69, 229.03, 248.5]], 'area': 5999.544500000001, 'iscrowd': 0, 'image_id': 324158, 'bbox': [202.71, 96.55, 71.78, 153.14], 'category_id': 1, 'id': 2162813}]
segmentation holds either polygon vertices or an RLE. Which one depends on whether the instance is a single object (iscrowd=0, polygon format) or a group of objects (iscrowd=1, RLE format). The annotation format is:
annotation{
    "id": int,
    "image_id": int,
    "category_id": int,
    "segmentation": RLE or [polygon],
    "area": float,
    "bbox": [x,y,width,height],
    "iscrowd": 0 or 1,
}
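For iscrowd=1 annotations, the uncompressed RLE stores run lengths in counts and the mask shape in size as [height, width]. The runs alternate between 0 and 1, starting with a run of zeros, and pixels are traversed column by column. The sketch below decodes such an RLE in plain Python on a toy mask; the counts arrays quoted earlier in this post appear to be abbreviated (they do not sum to height*width), so they are not used here. In practice pycocotools.mask does this job; the function names below are illustrative only.

```python
def decode_uncompressed_rle(rle):
    """Decode {'counts': [...], 'size': [h, w]} into a list of h rows of w ints.

    Runs alternate 0, 1, 0, ... starting with zeros, in column-major order.
    """
    height, width = rle["size"]
    flat = []
    value = 0
    for run in rle["counts"]:
        flat.extend([value] * run)
        value = 1 - value
    if len(flat) != height * width:
        raise ValueError("counts sum to {}, expected {}".format(
            len(flat), height * width))
    # Column-major: the pixel at (row, col) sits at index col * height + row.
    return [[flat[col * height + row] for col in range(width)]
            for row in range(height)]


def rle_area(rle):
    """Number of foreground pixels: the sum of the odd-indexed runs."""
    return sum(rle["counts"][1::2])


if __name__ == "__main__":
    toy = {"counts": [1, 2, 3], "size": [2, 3]}  # 2 rows, 3 columns
    for row in decode_uncompressed_rle(toy):
        print(row)
    print(rle_area(toy))
```

The length check is what would flag a shortened counts array like the ones in the JSON excerpt above.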
Every object, whether iscrowd=0 or iscrowd=1, has a bounding box bbox, given as the top-left corner (x,y) plus the box width and height (width,height).
The area field is the area of the encoded mask.
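Since bbox is [x, y, width, height] rather than two corners, a small conversion is often needed. The sketch below converts a box to corner form, derives a box from a flat polygon, and computes a polygon area with the shoelace formula. The polygon is the skateboard annotation (id 638724) printed above; the helper names are made up for this example, and the shoelace formula is a generic geometric approximation, not necessarily how the area field was produced.

```python
# Polygon of annotation 638724 (skateboard) from the output above: x0, y0, x1, y1, ...
SKATEBOARD = [223.48, 251.26, 230.81, 246.74, 234.48, 247.6, 241.8, 247.6,
              247.41, 243.72, 248.7, 244.15, 252.15, 249.54, 252.15, 254.71,
              249.78, 255.79, 247.19, 260.32, 243.1, 263.33, 235.77, 263.33,
              224.56, 262.47, 223.91, 259.24, 224.13, 254.5]


def bbox_to_corners(bbox):
    """Convert [x, y, width, height] to [x_min, y_min, x_max, y_max]."""
    x, y, w, h = bbox
    return [x, y, x + w, y + h]


def polygon_to_bbox(polygon, ndigits=2):
    """Tightest [x, y, width, height] around a flat [x0, y0, x1, y1, ...] polygon."""
    xs, ys = polygon[0::2], polygon[1::2]
    x0, y0 = min(xs), min(ys)
    return [x0, y0, round(max(xs) - x0, ndigits), round(max(ys) - y0, ndigits)]


def polygon_area(polygon):
    """Area of a simple polygon via the shoelace formula."""
    xs, ys = polygon[0::2], polygon[1::2]
    n = len(xs)
    twice = sum(xs[i] * ys[(i + 1) % n] - xs[(i + 1) % n] * ys[i]
                for i in range(n))
    return abs(twice) / 2.0


if __name__ == "__main__":
    print(polygon_to_bbox(SKATEBOARD))
    print(bbox_to_corners([223.48, 243.72, 28.67, 19.61]))
    print(polygon_area(SKATEBOARD))
```

For this annotation, the box derived from the polygon extents agrees with the bbox stored in the annotation once rounded to two decimals.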
2. Displaying person keypoint annotations
1) Again, start by loading the corresponding JSON file
# initialize COCO api for person keypoints annotations
annFile = '{}/annotations/annotations/person_keypoints_{}.json'.format(dataDir,dataType)
coco_kps=COCO(annFile)
Output:
loading annotations into memory...
Done (t=2.27s)
creating index...
index created!
2) Again, get the matching annotation ids, then draw the annotations
# load and display keypoints annotations
plt.imshow(I); plt.axis('off')
ax = plt.gca()
annIds = coco_kps.getAnnIds(imgIds=img['id'], catIds=catIds, iscrowd=None)
print(annIds)
anns = coco_kps.loadAnns(annIds)
print(anns)
coco_kps.showAnns(anns)
Output:
[2162813]
[{'segmentation': [[228.43, 247.9, 229.63, 206.62, 224.24, 191.07, 220.65, 179.7, 207.49, 169.53, 202.71, 163.55, 205.7, 133.04, 218.86, 121.68, 213.47, 104.33, 225.44, 96.55, 236.8, 106.12, 236.8, 116.29, 254.15, 127.06, 263.72, 150.39, 274.49, 166.54, 271.5, 177.31, 266.12, 181.5, 257.14, 159.96, 254.75, 177.91, 261.93, 192.27, 262.53, 216.79, 261.33, 234.14, 268.51, 249.1, 247.57, 246.11, 245.78, 249.69, 229.03, 248.5]], 'num_keypoints': 12, 'area': 5999.5445, 'iscrowd': 0, 'keypoints': [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 212, 135, 2, 241, 125, 2, 209, 162, 2, 257, 146, 2, 218, 172, 2, 267, 167, 2, 225, 177, 2, 247, 176, 2, 235, 203, 2, 254, 204, 2, 236, 240, 2, 254, 238, 2], 'image_id': 324158, 'bbox': [202.71, 96.55, 71.78, 153.14], 'category_id': 1, 'id': 2162813}]
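The keypoint annotation format is:
annotation{
    "keypoints": [x1,y1,v1,...],
    "num_keypoints": int,
    "id": int,
    "image_id": int,
    "category_id": int,
    "segmentation": RLE or [polygon],
    "area": float,
    "bbox": [x,y,width,height],
    "iscrowd": 0 or 1,
}
The added keypoints field is an array of length 3*k, where k is the total number of keypoints defined for the category. Each keypoint is a triple: the first two elements are the x and y coordinates, and the third is a visibility flag v. v=0 means the keypoint is not labeled (in which case x=y=v=0), v=1 means it is labeled but not visible (occluded), and v=2 means it is labeled and visible. In the example above, keypoints has length 3*17, but the first 5 keypoints are unlabeled (x=y=v=0), so only the remaining 12 carry information.
num_keypoints is the number of labeled keypoints (v>0) on the object. On small objects it may not be possible to label any keypoints.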
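The flat keypoints array is easier to work with once it is split into (x, y, v) triples. The following sketch does that for the keypoints of annotation 2162813 shown above and recounts the labeled points, which should agree with its num_keypoints of 12. The function names are illustrative only.

```python
# 'keypoints' of annotation 2162813 from the output above (17 triples).
KEYPOINTS = [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
             212, 135, 2, 241, 125, 2, 209, 162, 2, 257, 146, 2,
             218, 172, 2, 267, 167, 2, 225, 177, 2, 247, 176, 2,
             235, 203, 2, 254, 204, 2, 236, 240, 2, 254, 238, 2]


def split_keypoints(flat):
    """Split [x1, y1, v1, x2, y2, v2, ...] into a list of (x, y, v) tuples."""
    if len(flat) % 3 != 0:
        raise ValueError("keypoints length must be a multiple of 3")
    return [tuple(flat[i:i + 3]) for i in range(0, len(flat), 3)]


def count_labeled(flat):
    """Number of keypoints with v > 0, i.e. what num_keypoints reports."""
    return sum(1 for _, _, v in split_keypoints(flat) if v > 0)


def visible_points(flat):
    """(x, y) of keypoints that are labeled and visible (v == 2)."""
    return [(x, y) for x, y, v in split_keypoints(flat) if v == 2]


if __name__ == "__main__":
    triples = split_keypoints(KEYPOINTS)
    print(len(triples), count_labeled(KEYPOINTS))
    print(visible_points(KEYPOINTS)[:3])
```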
3. Displaying caption annotations
1) Load the JSON file
# initialize COCO api for caption annotations
annFile = '{}/annotations/annotations/captions_{}.json'.format(dataDir,dataType)
coco_caps=COCO(annFile)
Output:
loading annotations into memory...
Done (t=0.34s)
creating index...
index created!
2) Load the annotations
# load and display caption annotations
annIds = coco_caps.getAnnIds(imgIds=img['id'])
print(annIds)
anns = coco_caps.loadAnns(annIds)
print(anns)
coco_caps.showAnns(anns)
plt.imshow(I); plt.axis('off'); plt.show()
Output:
[310079, 311105, 311588, 312677, 312860]
[{'image_id': 324158, 'id': 310079, 'caption': 'A man is skate boarding down a path and a dog is running by his side.'},
 {'image_id': 324158, 'id': 311105, 'caption': 'A man on a skateboard with a dog outside. '},
 {'image_id': 324158, 'id': 311588, 'caption': 'A person riding a skate board with a dog following beside.'},
 {'image_id': 324158, 'id': 312677, 'caption': 'This man is riding a skateboard behind a dog.'},
 {'image_id': 324158, 'id': 312860, 'caption': 'A man walking his dog on a quiet country road.'}]
# the captions themselves follow
A man is skate boarding down a path and a dog is running by his side.
A man on a skateboard with a dog outside. 
A person riding a skate board with a dog following beside.
This man is riding a skateboard behind a dog.
A man walking his dog on a quiet country road.
The caption annotation format is:
annotation{
    "id": int,
    "image_id": int,
    "caption": str
}
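Caption annotations carry no category or geometry, so the typical operation is simply collecting all captions for an image. The sketch below groups the five captions printed above by image_id and strips stray whitespace (the second caption ends with a space). captions_by_image is a hypothetical helper, not part of the API.

```python
from collections import defaultdict

# Caption annotations for image 324158, copied from the output above.
CAPTION_ANNS = [
    {'image_id': 324158, 'id': 310079, 'caption': 'A man is skate boarding down a path and a dog is running by his side.'},
    {'image_id': 324158, 'id': 311105, 'caption': 'A man on a skateboard with a dog outside. '},
    {'image_id': 324158, 'id': 311588, 'caption': 'A person riding a skate board with a dog following beside.'},
    {'image_id': 324158, 'id': 312677, 'caption': 'This man is riding a skateboard behind a dog.'},
    {'image_id': 324158, 'id': 312860, 'caption': 'A man walking his dog on a quiet country road.'},
]


def captions_by_image(anns):
    """Group caption strings by image_id, trimming surrounding whitespace."""
    grouped = defaultdict(list)
    for ann in anns:
        grouped[ann['image_id']].append(ann['caption'].strip())
    return dict(grouped)


if __name__ == "__main__":
    for image_id, captions in captions_by_image(CAPTION_ANNS).items():
        print(image_id)
        for caption in captions:
            print("  " + caption)
```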
Source code of the functions called above:
http://cocodataset.org/#download
https://github.com/Xinering/cocoapi/blob/master/PythonAPI/pycocotools/coco.py
__author__ = 'tylin' __version__ = '2.0' # Interface for accessing the Microsoft COCO dataset. # Microsoft COCO is a large image dataset designed for object detection, # segmentation, and caption generation. pycocotools is a Python API that # assists in loading, parsing and visualizing the annotations in COCO. # Please visit http://mscoco.org/ for more information on COCO, including # for the data, paper, and tutorials. The exact format of the annotations # is also described on the COCO website. For example usage of the pycocotools # please see pycocotools_demo.ipynb. In addition to this API, please download both # the COCO images and annotations in order to run the demo. # An alternative to using the API is to load the annotations directly # into Python dictionary # Using the API provides additional utility functions. Note that this API # supports both *instance* and *caption* annotations. In the case of # captions not all functions are defined (e.g. categories are undefined). # The following API functions are defined: # COCO - COCO api class that loads COCO annotation file and prepare data structures. # decodeMask - Decode binary mask M encoded via run-length encoding. # encodeMask - Encode binary mask M using run-length encoding. # getAnnIds - Get ann ids that satisfy given filter conditions. # getCatIds - Get cat ids that satisfy given filter conditions. # getImgIds - Get img ids that satisfy given filter conditions. # loadAnns - Load anns with the specified ids. # loadCats - Load cats with the specified ids. # loadImgs - Load imgs with the specified ids. # annToMask - Convert segmentation in an annotation to binary mask. # showAnns - Display the specified annotations. # loadRes - Load algorithm results and create API for accessing them. # download - Download COCO images from mscoco.org server. # Throughout the API "ann"=annotation, "cat"=category, and "img"=image. # Help on each functions can be accessed by: "help COCO>function". 
# See also COCO>decodeMask, # COCO>encodeMask, COCO>getAnnIds, COCO>getCatIds, # COCO>getImgIds, COCO>loadAnns, COCO>loadCats, # COCO>loadImgs, COCO>annToMask, COCO>showAnns # Microsoft COCO Toolbox. version 2.0 # Data, paper, and tutorials available at: http://mscoco.org/ # Code written by Piotr Dollar and Tsung-Yi Lin, 2014. # Licensed under the Simplified BSD License [see bsd.txt] import json import time import matplotlib.pyplot as plt from matplotlib.collections import PatchCollection from matplotlib.patches import Polygon import numpy as np import copy import itertools from . import mask as maskUtils import os from collections import defaultdict import sys PYTHON_VERSION = sys.version_info[0] if PYTHON_VERSION == 2: from urllib import urlretrieve elif PYTHON_VERSION == 3: from urllib.request import urlretrieve def _isArrayLike(obj): return hasattr(obj, '__iter__') and hasattr(obj, '__len__') class COCO: def __init__(self, annotation_file=None): """ Constructor of Microsoft COCO helper class for reading and visualizing annotations. :param annotation_file (str): location of annotation file :param image_folder (str): location to the folder that hosts images. 
:return: """ # load dataset self.dataset,self.anns,self.cats,self.imgs = dict(),dict(),dict(),dict() self.imgToAnns, self.catToImgs = defaultdict(list), defaultdict(list) if not annotation_file == None: print('loading annotations into memory...') tic = time.time() dataset = json.load(open(annotation_file, 'r')) assert type(dataset)==dict, 'annotation file format {} not supported'.format(type(dataset)) print('Done (t={:0.2f}s)'.format(time.time()- tic)) self.dataset = dataset self.createIndex() def createIndex(self): # create index print('creating index...') anns, cats, imgs = {}, {}, {} imgToAnns,catToImgs = defaultdict(list),defaultdict(list) if 'annotations' in self.dataset: for ann in self.dataset['annotations']: imgToAnns[ann['image_id']].append(ann) anns[ann['id']] = ann if 'images' in self.dataset: for img in self.dataset['images']: imgs[img['id']] = img if 'categories' in self.dataset: for cat in self.dataset['categories']: cats[cat['id']] = cat if 'annotations' in self.dataset and 'categories' in self.dataset: for ann in self.dataset['annotations']: catToImgs[ann['category_id']].append(ann['image_id']) print('index created!') # create class members self.anns = anns self.imgToAnns = imgToAnns self.catToImgs = catToImgs self.imgs = imgs self.cats = cats def info(self): """ Print information about the annotation file. :return: """ for key, value in self.dataset['info'].items(): print('{}: {}'.format(key, value)) def getAnnIds(self, imgIds=[], catIds=[], areaRng=[], iscrowd=None): """ Get ann ids that satisfy given filter conditions. default skips that filter :param imgIds (int array) : get anns for given imgs catIds (int array) : get anns for given cats areaRng (float array) : get anns for given area range (e.g. 
[0 inf]) iscrowd (boolean) : get anns for given crowd label (False or True) :return: ids (int array) : integer array of ann ids """ imgIds = imgIds if _isArrayLike(imgIds) else [imgIds] catIds = catIds if _isArrayLike(catIds) else [catIds] if len(imgIds) == len(catIds) == len(areaRng) == 0: anns = self.dataset['annotations'] else: if not len(imgIds) == 0: lists = [self.imgToAnns[imgId] for imgId in imgIds if imgId in self.imgToAnns] anns = list(itertools.chain.from_iterable(lists)) else: anns = self.dataset['annotations'] anns = anns if len(catIds) == 0 else [ann for ann in anns if ann['category_id'] in catIds] anns = anns if len(areaRng) == 0 else [ann for ann in anns if ann['area'] > areaRng[0] and ann['area'] < areaRng[1]] if not iscrowd == None: ids = [ann['id'] for ann in anns if ann['iscrowd'] == iscrowd] else: ids = [ann['id'] for ann in anns] return ids def getCatIds(self, catNms=[], supNms=[], catIds=[]): """ filtering parameters. default skips that filter. :param catNms (str array) : get cats for given cat names :param supNms (str array) : get cats for given supercategory names :param catIds (int array) : get cats for given cat ids :return: ids (int array) : integer array of cat ids """ catNms = catNms if _isArrayLike(catNms) else [catNms] supNms = supNms if _isArrayLike(supNms) else [supNms] catIds = catIds if _isArrayLike(catIds) else [catIds] if len(catNms) == len(supNms) == len(catIds) == 0: cats = self.dataset['categories'] else: cats = self.dataset['categories'] cats = cats if len(catNms) == 0 else [cat for cat in cats if cat['name'] in catNms] cats = cats if len(supNms) == 0 else [cat for cat in cats if cat['supercategory'] in supNms] cats = cats if len(catIds) == 0 else [cat for cat in cats if cat['id'] in catIds] ids = [cat['id'] for cat in cats] return ids def getImgIds(self, imgIds=[], catIds=[]): ''' Get img ids that satisfy given filter conditions. 
:param imgIds (int array) : get imgs for given ids :param catIds (int array) : get imgs with all given cats :return: ids (int array) : integer array of img ids ''' imgIds = imgIds if _isArrayLike(imgIds) else [imgIds] catIds = catIds if _isArrayLike(catIds) else [catIds] if len(imgIds) == len(catIds) == 0: ids = self.imgs.keys() else: ids = set(imgIds) for i, catId in enumerate(catIds): if i == 0 and len(ids) == 0: ids = set(self.catToImgs[catId]) else: ids &= set(self.catToImgs[catId]) return list(ids) def loadAnns(self, ids=[]): """ Load anns with the specified ids. :param ids (int array) : integer ids specifying anns :return: anns (object array) : loaded ann objects """ if _isArrayLike(ids): return [self.anns[id] for id in ids] elif type(ids) == int: return [self.anns[ids]] def loadCats(self, ids=[]): """ Load cats with the specified ids. :param ids (int array) : integer ids specifying cats :return: cats (object array) : loaded cat objects """ if _isArrayLike(ids): return [self.cats[id] for id in ids] elif type(ids) == int: return [self.cats[ids]] def loadImgs(self, ids=[]): """ Load anns with the specified ids. :param ids (int array) : integer ids specifying img :return: imgs (object array) : loaded img objects """ if _isArrayLike(ids): return [self.imgs[id] for id in ids] elif type(ids) == int: return [self.imgs[ids]] def showAnns(self, anns, draw_bbox=False): """ Display the specified annotations. 
:param anns (array of object): annotations to display :return: None """ if len(anns) == 0: return 0 if 'segmentation' in anns[0] or 'keypoints' in anns[0]: datasetType = 'instances' elif 'caption' in anns[0]: datasetType = 'captions' else: raise Exception('datasetType not supported') if datasetType == 'instances': ax = plt.gca() ax.set_autoscale_on(False) polygons = [] color = [] for ann in anns: c = (np.random.random((1, 3))*0.6+0.4).tolist()[0] if 'segmentation' in ann: if type(ann['segmentation']) == list: # polygon for seg in ann['segmentation']: poly = np.array(seg).reshape((int(len(seg)/2), 2)) polygons.append(Polygon(poly)) color.append(c) else: # mask t = self.imgs[ann['image_id']] if type(ann['segmentation']['counts']) == list: rle = maskUtils.frPyObjects([ann['segmentation']], t['height'], t['width']) else: rle = [ann['segmentation']] m = maskUtils.decode(rle) img = np.ones( (m.shape[0], m.shape[1], 3) ) if ann['iscrowd'] == 1: color_mask = np.array([2.0,166.0,101.0])/255 if ann['iscrowd'] == 0: color_mask = np.random.random((1, 3)).tolist()[0] for i in range(3): img[:,:,i] = color_mask[i] ax.imshow(np.dstack( (img, m*0.5) )) if 'keypoints' in ann and type(ann['keypoints']) == list: # turn skeleton into zero-based index sks = np.array(self.loadCats(ann['category_id'])[0]['skeleton'])-1 kp = np.array(ann['keypoints']) x = kp[0::3] y = kp[1::3] v = kp[2::3] for sk in sks: if np.all(v[sk]>0): plt.plot(x[sk],y[sk], linewidth=3, color=c) plt.plot(x[v>0], y[v>0],'o',markersize=8, markerfacecolor=c, markeredgecolor='k',markeredgewidth=2) plt.plot(x[v>1], y[v>1],'o',markersize=8, markerfacecolor=c, markeredgecolor=c, markeredgewidth=2) if draw_bbox: [bbox_x, bbox_y, bbox_w, bbox_h] = ann['bbox'] poly = [[bbox_x, bbox_y], [bbox_x, bbox_y+bbox_h], [bbox_x+bbox_w, bbox_y+bbox_h], [bbox_x+bbox_w, bbox_y]] np_poly = np.array(poly).reshape((4,2)) polygons.append(Polygon(np_poly)) color.append(c) p = PatchCollection(polygons, facecolor=color, linewidths=0, alpha=0.4) 
            ax.add_collection(p)
            p = PatchCollection(polygons, facecolor='none', edgecolors=color, linewidths=2)
            ax.add_collection(p)
        elif datasetType == 'captions':
            for ann in anns:
                print(ann['caption'])

    def loadRes(self, resFile):
        """
        Load result file and return a result api object.
        :param resFile (str) : file name of result file
        :return: res (obj) : result api object
        """
        res = COCO()
        res.dataset['images'] = [img for img in self.dataset['images']]

        print('Loading and preparing results...')
        tic = time.time()
        if type(resFile) == str or (PYTHON_VERSION == 2 and type(resFile) == unicode):
            anns = json.load(open(resFile))
        elif type(resFile) == np.ndarray:
            anns = self.loadNumpyAnnotations(resFile)
        else:
            anns = resFile
        assert type(anns) == list, 'results in not an array of objects'
        annsImgIds = [ann['image_id'] for ann in anns]
        assert set(annsImgIds) == (set(annsImgIds) & set(self.getImgIds())), \
            'Results do not correspond to current coco set'
        if 'caption' in anns[0]:
            imgIds = set([img['id'] for img in res.dataset['images']]) & set([ann['image_id'] for ann in anns])
            res.dataset['images'] = [img for img in res.dataset['images'] if img['id'] in imgIds]
            for id, ann in enumerate(anns):
                ann['id'] = id+1
        elif 'bbox' in anns[0] and not anns[0]['bbox'] == []:
            res.dataset['categories'] = copy.deepcopy(self.dataset['categories'])
            for id, ann in enumerate(anns):
                bb = ann['bbox']
                x1, x2, y1, y2 = [bb[0], bb[0]+bb[2], bb[1], bb[1]+bb[3]]
                if not 'segmentation' in ann:
                    ann['segmentation'] = [[x1, y1, x1, y2, x2, y2, x2, y1]]
                ann['area'] = bb[2]*bb[3]
                ann['id'] = id+1
                ann['iscrowd'] = 0
        elif 'segmentation' in anns[0]:
            res.dataset['categories'] = copy.deepcopy(self.dataset['categories'])
            for id, ann in enumerate(anns):
                # now only support compressed RLE format as segmentation results
                ann['area'] = maskUtils.area(ann['segmentation'])
                if not 'bbox' in ann:
                    ann['bbox'] = maskUtils.toBbox(ann['segmentation'])
                ann['id'] = id+1
                ann['iscrowd'] = 0
        elif 'keypoints' in anns[0]:
            res.dataset['categories'] = copy.deepcopy(self.dataset['categories'])
            for id, ann in enumerate(anns):
                s = ann['keypoints']
                x = s[0::3]
                y = s[1::3]
                x0,x1,y0,y1 = np.min(x), np.max(x), np.min(y), np.max(y)
                ann['area'] = (x1-x0)*(y1-y0)
                ann['id'] = id + 1
                ann['bbox'] = [x0,y0,x1-x0,y1-y0]
        print('DONE (t={:0.2f}s)'.format(time.time()- tic))

        res.dataset['annotations'] = anns
        res.createIndex()
        return res

    def download(self, tarDir = None, imgIds = [] ):
        '''
        Download COCO images from mscoco.org server.
        :param tarDir (str): COCO results directory name
               imgIds (list): images to be downloaded
        :return:
        '''
        if tarDir is None:
            print('Please specify target directory')
            return -1
        if len(imgIds) == 0:
            imgs = self.imgs.values()
        else:
            imgs = self.loadImgs(imgIds)
        N = len(imgs)
        if not os.path.exists(tarDir):
            os.makedirs(tarDir)
        for i, img in enumerate(imgs):
            tic = time.time()
            fname = os.path.join(tarDir, img['file_name'])
            if not os.path.exists(fname):
                urlretrieve(img['coco_url'], fname)
            print('downloaded {}/{} images (t={:0.1f}s)'.format(i, N, time.time()- tic))

    def loadNumpyAnnotations(self, data):
        """
        Convert result data from a numpy array [Nx7] where each row contains {imageID,x1,y1,w,h,score,class}
        :param data (numpy.ndarray)
        :return: annotations (python nested list)
        """
        print('Converting ndarray to lists...')
        assert(type(data) == np.ndarray)
        print(data.shape)
        assert(data.shape[1] == 7)
        N = data.shape[0]
        ann = []
        for i in range(N):
            if i % 1000000 == 0:
                print('{}/{}'.format(i,N))
            ann += [{
                'image_id' : int(data[i, 0]),
                'bbox' : [ data[i, 1], data[i, 2], data[i, 3], data[i, 4] ],
                'score' : data[i, 5],
                'category_id': int(data[i, 6]),
                }]
        return ann

    def annToRLE(self, ann):
        """
        Convert annotation which can be polygons, uncompressed RLE to RLE.
        :return: binary mask (numpy 2D array)
        """
        t = self.imgs[ann['image_id']]
        h, w = t['height'], t['width']
        segm = ann['segmentation']
        if type(segm) == list:
            # polygon -- a single object might consist of multiple parts
            # we merge all parts into one mask rle code
            rles = maskUtils.frPyObjects(segm, h, w)
            rle = maskUtils.merge(rles)
        elif type(segm['counts']) == list:
            # uncompressed RLE
            rle = maskUtils.frPyObjects(segm, h, w)
        else:
            # rle
            rle = ann['segmentation']
        return rle

    def annToMask(self, ann):
        """
        Convert annotation which can be polygons, uncompressed RLE, or RLE to binary mask.
        :return: binary mask (numpy 2D array)
        """
        rle = self.annToRLE(ann)
        m = maskUtils.decode(rle)
        return m
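To make the 'bbox' branch of loadRes above easier to follow, here is a minimal standalone sketch of what it does to each detection result: it derives a rectangular polygon from the `[x, y, w, h]` box when no segmentation is given, then fills in `area`, `id` and `iscrowd`. The function name `complete_bbox_results` and the sample detection are hypothetical and not part of pycocotools. Unlike loadRes, which edits the annotation dicts in place, this sketch works on a deep copy.

```python
import copy


def complete_bbox_results(results):
    """Hypothetical helper mirroring the 'bbox' branch of COCO.loadRes.

    `results` is a list of dicts, each with a COCO-style 'bbox' = [x, y, w, h].
    """
    completed = copy.deepcopy(results)  # assumption: keep the caller's list untouched
    for idx, ann in enumerate(completed):
        x, y, w, h = ann['bbox']
        x1, x2, y1, y2 = x, x + w, y, y + h
        if 'segmentation' not in ann:
            # Rectangle as a flat polygon, same corner order as in loadRes
            ann['segmentation'] = [[x1, y1, x1, y2, x2, y2, x2, y1]]
        ann['area'] = w * h
        ann['id'] = idx + 1   # result ids start at 1
        ann['iscrowd'] = 0
    return completed


# Illustrative detection; the ids and score are made-up values.
detections = [
    {'image_id': 42, 'category_id': 1, 'bbox': [10.0, 20.0, 30.0, 40.0], 'score': 0.9},
]
completed = complete_bbox_results(detections)
print(completed[0]['segmentation'])
print(completed[0]['area'], completed[0]['id'], completed[0]['iscrowd'])
```

The polygon lists the corners as (x1, y1), (x1, y2), (x2, y2), (x2, y1), which is why a box with only `bbox` set can still be drawn by showAnns as a filled patch.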