NLP-progress

Repository to track the progress in Natural Language Processing (NLP), including the datasets and the current state-of-the-art for the most common NLP tasks.

Common sense

Common sense reasoning tasks are intended to require the model to go beyond pattern
recognition. Instead, the model should use “common sense” or world knowledge
to make inferences.

Event2Mind

Event2Mind is a crowdsourced corpus of 25,000 event phrases covering a diverse range of everyday events and situations.
Given an event described in a short free-form text, a model should reason about the likely intents and reactions of the
event’s participants. Models are evaluated based on average cross-entropy (lower is better).
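
As a rough illustration of the metric, the sketch below computes an average cross-entropy from the probabilities a model assigned to the reference outputs. The function name `average_cross_entropy` and the toy probabilities are hypothetical and not taken from the Event2Mind evaluation code; the benchmark's own scripts may aggregate differently.

```python
import math


def average_cross_entropy(gold_probs):
    """Mean negative log-probability over the reference items.

    gold_probs: probabilities (each in (0, 1]) that a model assigned to
    the reference outputs. Hypothetical input format, for illustration.
    """
    if not gold_probs:
        raise ValueError("gold_probs must not be empty")
    return sum(-math.log(p) for p in gold_probs) / len(gold_probs)


# Toy numbers, not real results: a model that puts more probability
# mass on the references gets a lower (better) score.
confident_model = [0.7, 0.8, 0.6]
uncertain_model = [0.3, 0.2, 0.4]

confident_score = average_cross_entropy(confident_model)
uncertain_score = average_cross_entropy(uncertain_model)
```

Because the score is a negative log-probability, a model that assigns probability 1 to every reference would score 0, which is the lower bound.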

Winograd Schema Challenge

The Winograd Schema Challenge is a dataset for common sense reasoning. It employs Winograd Schema questions that
require the resolution of anaphora: the system must identify the antecedent of an ambiguous pronoun in a statement. Models
are evaluated based on accuracy.

Example:

The trophy doesn’t fit in the suitcase because it is too big. What is too big?
Answer 0: the trophy. Answer 1: the suitcase.
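
The accuracy evaluation can be sketched as follows. The `WinogradItem` class, the `accuracy` helper, and the `always_first` baseline are hypothetical names introduced for illustration and are not part of any official evaluation script. The second item is the well-known twin of the example above, in which "big" is swapped for "small" and the answer flips.

```python
from dataclasses import dataclass
from typing import Callable, Sequence, Tuple


@dataclass(frozen=True)
class WinogradItem:
    sentence: str
    question: str
    candidates: Tuple[str, str]
    answer: int  # index of the correct antecedent in `candidates`


def accuracy(items: Sequence[WinogradItem],
             predict: Callable[[WinogradItem], int]) -> float:
    """Fraction of items for which `predict` returns the gold index."""
    correct = sum(1 for item in items if predict(item) == item.answer)
    return correct / len(items)


items = [
    WinogradItem(
        sentence="The trophy doesn't fit in the suitcase because it is too big.",
        question="What is too big?",
        candidates=("the trophy", "the suitcase"),
        answer=0,
    ),
    WinogradItem(
        sentence="The trophy doesn't fit in the suitcase because it is too small.",
        question="What is too small?",
        candidates=("the trophy", "the suitcase"),
        answer=1,
    ),
]


def always_first(item: WinogradItem) -> int:
    """Trivial baseline that ignores the sentence entirely."""
    return 0


baseline_accuracy = accuracy(items, always_first)
```

On a paired set like this, a predictor that ignores the sentence gets exactly half of the items right, which is why such pairs are useful for checking that a model actually resolves the pronoun from context.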

Visual Common Sense

Visual Commonsense Reasoning (VCR) is a task and large-scale dataset for cognition-level visual understanding.
With one glance at an image, humans can effortlessly imagine the world beyond the pixels (e.g. that [person1] ordered
pancakes). While this is easy for humans, it is very difficult for current vision systems, because it requires
higher-order cognition and commonsense reasoning about the world. In addition to answering challenging visual
questions expressed in natural language, a model must provide a rationale explaining why its answer is true.