In 2015, Facebook AI Research (FAIR) released the bAbI dataset. The dataset consists of 20 synthetic question-answering tasks that require reasoning about agents, locations, objects, and intentions. Each instance (story) consists of a sequence of clauses and questions. A full description of the dataset, its motivation, and results for several systems can be found in the paper "Towards AI-Complete Question Answering: A Set of Prerequisite Toy Tasks".

Example Instance

 1 Mary moved to the bathroom.
 2 John went to the hallway.
 3 Where is Mary?    bathroom    1
 4 Daniel went back to the hallway.
 5 Sandra moved to the garden.
 6 Where is Daniel?  hallway 4
 7 John moved to the office.
 8 Sandra journeyed to the bathroom.
 9 Where is Daniel?  hallway 4
10 Mary moved to the hallway.
11 Daniel travelled to the office.
12 Where is Daniel?     office  11
13 John went back to the garden.
14 John moved to the bedroom.
15 Where is Sandra?     bathroom    8
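
To make the format concrete, here is a minimal parsing sketch. It assumes what the example above shows: every line starts with a numeric ID, clause lines contain only a statement, and question lines carry the question, the answer, and the ID of the supporting clause separated by whitespace. The function names `parse_story` and `answer_questions` are hypothetical helpers, not part of any bAbI tooling.

```python
def parse_story(lines):
    """Split raw bAbI lines into (line_id, kind, payload) tuples.

    kind is "clause" or "question". For a question, payload is
    (question_text, answer, supporting_line_id).
    """
    items = []
    for raw in lines:
        raw = raw.strip()
        if not raw:
            continue
        line_id, text = raw.split(maxsplit=1)
        if "?" in text:
            question, rest = text.split("?", 1)
            answer, support = rest.split()
            payload = (question.strip() + "?", answer, int(support))
            items.append((int(line_id), "question", payload))
        else:
            items.append((int(line_id), "clause", text))
    return items


def answer_questions(items):
    """Answer each question by tracking the last known location of each agent.

    Assumes every clause has the form "AGENT <verb phrase> the LOCATION.",
    so the first word is the agent and the last word is the location.
    Returns a list of (predicted, gold) pairs.
    """
    location = {}
    results = []
    for _, kind, payload in items:
        if kind == "clause":
            words = payload.rstrip(".").split()
            location[words[0]] = words[-1]
        else:
            question, gold, _ = payload
            agent = question.rstrip("?").split()[-1]
            results.append((location[agent], gold))
    return results


story = [
    "1 Mary moved to the bathroom.",
    "2 John went to the hallway.",
    "3 Where is Mary?\tbathroom\t1",
    "4 Daniel went back to the hallway.",
    "5 Sandra moved to the garden.",
    "6 Where is Daniel?\thallway\t4",
]
print(answer_questions(parse_story(story)))
```

Tracking only the most recent location per agent is enough here because every clause in this task is a movement statement.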

Overview

Number of instances: 400

Train instances: 200

Test instances: 200

Vocabulary

Size: 19 words

Agents

john
mary
sandra
daniel

Locations

bathroom
bedroom
office
hallway
kitchen
garden

Clause Templates

AGENT went back to the LOCATION

AGENT journeyed to the LOCATION

AGENT travelled to the LOCATION

AGENT went to the LOCATION

AGENT moved to the LOCATION

Number of Possible Clauses

\[n = \#\{\text{agents}\} \times \#\{\text{clause templates}\} \times \#\{\text{locations}\} = 4 \times 5 \times 6 = 120.\]
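
The count can be checked by enumerating every combination. The sketch below builds all clauses from the agents, locations, and templates listed above, and also counts the distinct words. Treating "where" and "is" as the only question-specific words is an assumption read off the example instance. The variable names are illustrative.

```python
from itertools import product

AGENTS = ["john", "mary", "sandra", "daniel"]
LOCATIONS = ["bathroom", "bedroom", "office", "hallway", "kitchen", "garden"]
TEMPLATES = [
    "{agent} went back to the {location}",
    "{agent} journeyed to the {location}",
    "{agent} travelled to the {location}",
    "{agent} went to the {location}",
    "{agent} moved to the {location}",
]

# Every (agent, template, location) triple yields one distinct clause.
clauses = [
    template.format(agent=agent, location=location)
    for agent, template, location in product(AGENTS, TEMPLATES, LOCATIONS)
]

# Words used in clauses, plus the two words that only occur in questions
# (assumption: questions always have the form "Where is AGENT?").
vocabulary = {word for clause in clauses for word in clause.split()} | {"where", "is"}

print(len(clauses), len(set(clauses)), len(vocabulary))
```

Under these assumptions the enumeration yields 120 distinct clauses and a 19-word vocabulary, matching the figures above.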

Instance Format

CLAUSE

CLAUSE

QUESTION

CLAUSE

CLAUSE

QUESTION
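
As an illustration of this layout, the following sketch generates a synthetic instance with the same shape: blocks of two clauses followed by one question. The sampling scheme (uniform random choices, and asking only about agents who have already moved) is an assumption for illustration, not the procedure used to build the original dataset. `generate_instance` is a hypothetical helper.

```python
import random

AGENTS = ["john", "mary", "sandra", "daniel"]
LOCATIONS = ["bathroom", "bedroom", "office", "hallway", "kitchen", "garden"]
VERBS = ["went back to", "journeyed to", "travelled to", "went to", "moved to"]


def generate_instance(num_blocks=5, clauses_per_block=2, seed=0):
    """Return bAbI-style lines: CLAUSE, CLAUSE, QUESTION, repeated."""
    rng = random.Random(seed)
    location = {}   # agent -> most recent location
    last_line = {}  # agent -> ID of the clause that set that location
    lines = []
    line_id = 0
    for _ in range(num_blocks):
        for _ in range(clauses_per_block):
            line_id += 1
            agent = rng.choice(AGENTS)
            place = rng.choice(LOCATIONS)
            verb = rng.choice(VERBS)
            location[agent] = place
            last_line[agent] = line_id
            lines.append(f"{line_id} {agent.capitalize()} {verb} the {place}.")
        line_id += 1
        # Only ask about agents whose location is known.
        asked = rng.choice(sorted(location))
        lines.append(
            f"{line_id} Where is {asked.capitalize()}?"
            f"\t{location[asked]}\t{last_line[asked]}"
        )
    return lines


for line in generate_instance(seed=0):
    print(line)
```

With the defaults, each generated instance has 15 lines and 5 questions, the same shape as the example instance.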

Deconstructing bAbI Task 1