MS coco中image_caption的数据格式详解
coco中image_caption的数据格式,对应的文件captions_train2014.json和captions_val2014.json
1.使用json加载文件
对应的解析代码如下:
import json if __name__=='__main__': base_path = r'/data/antonio/images_data/images/annotations/captions_train2014.json' image_caption={} with open(base_path,'r') as f: dataset=json.load(f) image_caption['annotations'] = [] for data in dataset['annotations']: image_caption['annotations'].append({}) for key in data: image_caption['annotations'][0][key]=data[key] break image_caption['images'] = [] for data in dataset['images']: image_caption['images'].append({}) for key in data: image_caption['images'][0][key]=data[key] break image_caption['info'] = {} for key in dataset['info']: #dict image_caption['info'][key]=dataset['info'][key] image_caption['licenses'] = [] for data in dataset['licenses']: #2014 have eight list image_caption['licenses'].append({}) for key in data: image_caption['licenses'][0][key]=data[key] break print(image_caption)
用json加载之后内容如下:
只显示列表中元素的第一个元素,annotations是list,存储的是字典,字典有三个键-值对,对应如下:
{ 'annotations': [{ 'image_id': 318556,#唯一的图片ID,此ID同时是图像文件名的序列号,对应的文件名:COCO_train2014_000000318556.jpg 'id': 48, # 唯一的对象ID 'caption': 'A very clean and well decorated empty bathroom' } ... ... ], 'images': [{ 'license': 5, 'date_captured': '2013-11-14 16:28:13', 'flickr_url': 'http://farm4.staticflickr.com/3153/2970773875_164f0c0b83_z.jpg', 'coco_url': 'http://images.cocodataset.org/train2014/COCO_train2014_000000057870.jpg', 'id': 57870 #此id对应的是'annotations'中的image_id 'width': 640, 'file_name': 'COCO_train2014_000000057870.jpg', 'height': 480 } ... ... ], 'licenses': [{ 'id': 1, 'url': 'http://creativecommons.org/licenses/by-nc-sa/2.0/', 'name': 'Attribution-NonCommercial-ShareAlike License' } ... ... ], 'info': { 'description': 'COCO 2014 Dataset', 'year': 2014, 'date_created': '2017/09/01', 'contributor': 'COCO Consortium', 'url': 'http://cocodataset.org', 'version': '1.0' } }
2. 如果用微软提供的pycocotools.coco加载json文件
对应的代码:
from pycocotools.coco import COCO import torch.utils.data as data import json class DataLoader(data.Dataset): def __init__(self, json, transform=None): self.coco = COCO(json) self.ids = list(self.coco.anns.keys()) self.transform = transform if __name__=='__main__': base_path = r'/data/antonio/images_data/images/annotations/captions_train2014.json' dataloader = DataLoader(base_path)
对应的dataset存放的json文件中的数据,其他部分是COCO处理得到
其中imgToAnns是image_id对应的5个caption
其中anns是id为key的字典,对应的value仍然是字典,字典中存储的是image_id,id,caption