Complementary Pairs List

*Note: The training and validation set files were updated with minor changes on 04/26/17 to make them consistent with the test set. If you downloaded the files before this date, please download them again. Thanks!

The captions for the training and validation sets of the abstract scenes can be downloaded from the "Download" section on this page.

Overview

The annotations we release are the result of post-processing the raw crowdsourced data, and contain the following fields:

data_type: source of the images (mscoco or abstract_v002).
data_subtype: subtype of the data (e.g. train2014/val2014/test2015 for mscoco, train2015/val2015 for abstract_v002).
question_type: type of the question determined by the first few words of the question. For details, please see README.
answer_type: type of the answer. Currently, "yes/no", "number", and "other".
multiple_choice_answer: most frequent ground-truth answer.
answer_confidence: subject's confidence in answering the question. For details, please see Antol et al., ICCV 2015.
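The fields above can be accessed directly once the annotation JSON is parsed. The sketch below uses an inline example rather than a downloaded file; the field names come from the list above, but the surrounding layout and all values are illustrative assumptions, not taken from the actual release.

```python
import json

# Illustrative annotation snippet; field names follow the list above,
# but the values and overall layout are made-up assumptions.
annotation_json = """
{
    "data_type": "mscoco",
    "data_subtype": "train2014",
    "annotations": [
        {
            "question_id": 1,
            "question_type": "what is",
            "answer_type": "other",
            "multiple_choice_answer": "tennis",
            "answers": [
                {"answer": "tennis", "answer_confidence": "yes", "answer_id": 1}
            ]
        }
    ]
}
"""

data = json.loads(annotation_json)

# Walk the per-question annotations and pull out the key fields.
for ann in data["annotations"]:
    print(ann["question_id"], ann["answer_type"], ann["multiple_choice_answer"])
```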

Complementary Pairs List Format

The complementary pairs lists are stored using the JSON file format.

The complementary pairs list has the following data structure:

[
  [question_id_1, question_id_2],
  ...
]

The (question, image, answer) example with question_id_1 and the (question, image, answer) example with question_id_2 are complementary to each other, i.e., they share the same question but have two different images with two different answers. For more details, please see Goyal et al., CVPR 2017.
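Since each pair is symmetric, a common use is to build a lookup from a question to its complementary partner. A minimal sketch, assuming the file format described above (the question IDs below are illustrative, not from the release):

```python
import json

# Illustrative complementary pairs list in the format described above.
pairs_json = "[[1000, 2000], [1001, 2001]]"
pairs = json.loads(pairs_json)

# Build a symmetric lookup: each question_id maps to its complementary partner.
partner = {}
for qid_1, qid_2 in pairs:
    partner[qid_1] = qid_2
    partner[qid_2] = qid_1

print(partner[1000])  # -> 2000
```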

data_type: source of the images (abstract_v002).
data_subtype: subtype of the data (train2015/val2015/test2015).
The file_name in images list contains the name of the image file for the corresponding abstract scene. These image files can be downloaded from the links provided in the "Download" section on this page.
The file_name in compositions list contains the name of the scene composition file for the corresponding abstract scene (see the bullet below).

A folder of the type "scene_composition_abstract_v002_[datasubset]", where [datasubset] is one of "train2015", "val2015", or "test2015". This folder contains the scene composition files for the corresponding [datasubset].
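The folder name above follows a simple naming pattern, which can be constructed programmatically. A minimal sketch (the helper function name is my own, not part of any released API):

```python
# Build the scene composition folder name for a given data subset, following
# the "scene_composition_abstract_v002_[datasubset]" pattern described above.
def composition_folder(datasubset):
    valid = ("train2015", "val2015", "test2015")
    if datasubset not in valid:
        raise ValueError(f"datasubset must be one of {valid}")
    return "scene_composition_abstract_v002_" + datasubset

print(composition_folder("train2015"))  # -> scene_composition_abstract_v002_train2015
```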

For more information on how to render the scenes from annotation files and to obtain API support for abstract scenes, please visit the GitHub repository.

The JSON files containing the captions for the training and validation sets of the abstract scenes can be downloaded from the link provided in the "Download" section on this page. These files have the following data structure: