In 2015, Facebook AI Research (FAIR) released the bAbI dataset. It consists of 20 synthetic question-answering tasks that require reasoning about agents, locations, objects, and intentions. Each instance (story) is a sequence of clauses and questions. A full description of the dataset, its motivation, and results for several systems can be found in the paper "Towards AI-Complete Question Answering: A Set of Prerequisite Toy Tasks".

Example Instance

 1 Mary moved to the bathroom.
 2 John went to the hallway.
 3 Where is Mary?      bathroom    1
 4 Daniel went back to the hallway.
 5 Sandra moved to the garden.
 6 Where is Daniel?    hallway     4
 7 John moved to the office.
 8 Sandra journeyed to the bathroom.
 9 Where is Daniel?    hallway     4
10 Mary moved to the hallway.
11 Daniel travelled to the office.
12 Where is Daniel?    office      11
13 John went back to the garden.
14 John moved to the bedroom.
15 Where is Sandra?    bathroom    8
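
In the example above, every line starts with a numeric ID, and each question line also carries the answer and the ID of the clause that supports it. A minimal sketch of a parser for this layout follows. The names `Line` and `parse_line` are hypothetical, and the code assumes the fields are separated by whitespace (spaces or tabs).

```python
from typing import NamedTuple, Optional


class Line(NamedTuple):
    line_id: int
    text: str
    answer: Optional[str]   # None for clauses
    support: Optional[int]  # ID of the supporting clause, None for clauses


def parse_line(raw: str) -> Line:
    """Parse one line of an instance into its fields."""
    # The leading integer is the line ID; the rest is the clause or question.
    line_id, rest = raw.strip().split(maxsplit=1)
    if "?" in rest:
        # Question: text up to '?', then the answer and the supporting clause ID.
        question, tail = rest.split("?", 1)
        answer, support = tail.split()
        return Line(int(line_id), question.strip() + "?", answer, int(support))
    return Line(int(line_id), rest.strip(), None, None)


print(parse_line(" 3 Where is Mary?    bathroom    1"))
print(parse_line(" 1 Mary moved to the bathroom."))
```

Clauses come back with `answer` and `support` set to `None`, so a caller can tell clauses from questions without inspecting the text again.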

Overview

Number of instances: 400
Train instances: 200
Test instances: 200

Vocabulary

Size: 19 words

Agents

  • john
  • mary
  • sandra
  • daniel

Locations

  • bathroom
  • bedroom
  • office
  • hallway
  • kitchen
  • garden

Clause Templates

  • AGENT went back to the LOCATION
  • AGENT journeyed to the LOCATION
  • AGENT travelled to the LOCATION
  • AGENT went to the LOCATION
  • AGENT moved to the LOCATION
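
The vocabulary size of 19 can be cross-checked against the agents, locations, and clause templates listed above. The sketch below assumes the only question template is "Where is AGENT", which is inferred from the example instance rather than stated in this section, and it ignores punctuation.

```python
AGENTS = ["john", "mary", "sandra", "daniel"]
LOCATIONS = ["bathroom", "bedroom", "office", "hallway", "kitchen", "garden"]
CLAUSE_TEMPLATES = [
    "AGENT went back to the LOCATION",
    "AGENT journeyed to the LOCATION",
    "AGENT travelled to the LOCATION",
    "AGENT went to the LOCATION",
    "AGENT moved to the LOCATION",
]
# Assumption: inferred from the example instance, not listed in the source.
QUESTION_TEMPLATE = "Where is AGENT"


def vocabulary() -> set:
    """Collect every distinct lowercase word that can appear in an instance."""
    words = set(AGENTS) | set(LOCATIONS)
    for template in CLAUSE_TEMPLATES + [QUESTION_TEMPLATE]:
        for token in template.split():
            # Placeholders are filled by agents and locations, already counted.
            if token not in ("AGENT", "LOCATION"):
                words.add(token.lower())
    return words


print(len(vocabulary()))  # 19
```

Under these assumptions the count breaks down as 4 agents, 6 locations, 7 words from the clause templates, and 2 from the question template.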

Number of Possible Clauses
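
The count is not given in this section. Assuming every clause template can be filled with any agent and any location, it would be 4 agents × 5 templates × 6 locations = 120. The sketch below enumerates the clauses under that assumption; capitalizing the agent and appending a period follow the example instance.

```python
from itertools import product

AGENTS = ["john", "mary", "sandra", "daniel"]
LOCATIONS = ["bathroom", "bedroom", "office", "hallway", "kitchen", "garden"]
CLAUSE_TEMPLATES = [
    "AGENT went back to the LOCATION",
    "AGENT journeyed to the LOCATION",
    "AGENT travelled to the LOCATION",
    "AGENT went to the LOCATION",
    "AGENT moved to the LOCATION",
]

# Assumption: agents, templates, and locations combine freely.
clauses = [
    template.replace("AGENT", agent.capitalize()).replace("LOCATION", location) + "."
    for agent, template, location in product(AGENTS, CLAUSE_TEMPLATES, LOCATIONS)
]

print(len(clauses))  # 120
print(clauses[0])    # John went back to the bathroom.
```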

Instance Format

  • CLAUSE
  • CLAUSE
  • QUESTION
  • CLAUSE
  • CLAUSE
  • QUESTION
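
Because an instance interleaves clauses and questions, it can be processed in a single pass: update the state on each clause and read it on each question. The sketch below is a simple illustrative baseline, not a system from the paper. It assumes every clause starts with the agent and ends with the location, and every question has the form "Where is AGENT?". The function name `answer_questions` is hypothetical.

```python
from typing import Dict, List, Optional


def answer_questions(lines: List[str]) -> List[Optional[str]]:
    """Answer each 'Where is AGENT?' question with the agent's latest location."""
    location: Dict[str, str] = {}
    answers: List[Optional[str]] = []
    for line in lines:
        words = line.rstrip(".?").lower().split()
        if words[:2] == ["where", "is"]:
            # Question: look up the last location recorded for this agent.
            answers.append(location.get(words[2]))
        else:
            # Clause: the first word is the agent, the last word is the location.
            location[words[0]] = words[-1]
    return answers


story = [
    "Mary moved to the bathroom.",
    "John went to the hallway.",
    "Where is Mary?",
    "Daniel went back to the hallway.",
    "Sandra moved to the garden.",
    "Where is Daniel?",
]
print(answer_questions(story))  # ['bathroom', 'hallway']
```

A question about an agent that has not yet appeared in a clause yields `None` rather than a guess.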

Instance Composition