NLP-progress

Repository to track the progress in Natural Language Processing (NLP), including the datasets and the current state-of-the-art for the most common NLP tasks.

Relationship Extraction

Relationship extraction is the task of extracting semantic relationships from a text. Extracted relationships usually
occur between two or more entities of a certain type (e.g. Person, Organisation, Location) and fall into a number of
semantic categories (e.g. married to, employed by, lives in).

New York Times Corpus

The standard corpus for distantly supervised relationship extraction is the New York Times (NYT) corpus, introduced in
Riedel et al., 2010.

It contains text from the New York Times Annotated Corpus with named
entities extracted from the text using the Stanford NER system and automatically linked to entities in the Freebase
knowledge base. Pairs of named entities are labelled with relationship types by aligning them against facts in the
Freebase knowledge base. (The process of using a separate database to provide labels is known as ‘distant supervision’.)

Example:

Elevation Partners, the $1.9 billion private equity group that was founded by Roger McNamee

(founded_by, Elevation_Partners, Roger_McNamee)
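The alignment step described above can be sketched in a few lines. This is only an illustration, not the pipeline used to build the corpus: the toy knowledge base `KB_FACTS`, the helper `distant_labels`, and the `"NA"` label for pairs with no matching fact are all assumptions made for the example.

```python
from typing import Dict, List, Tuple

# Hypothetical toy knowledge base mapping (head, tail) entity pairs to a relation.
KB_FACTS: Dict[Tuple[str, str], str] = {
    ("Elevation_Partners", "Roger_McNamee"): "founded_by",
}


def distant_labels(
    entities: List[str],
    kb: Dict[Tuple[str, str], str] = KB_FACTS,
    no_relation: str = "NA",  # assumed label for pairs with no matching fact
) -> List[Tuple[str, str, str]]:
    """Label every ordered pair of distinct linked entities found in one sentence."""
    labels = []
    for head in entities:
        for tail in entities:
            if head == tail:
                continue
            # The label comes from the knowledge base, not from the sentence itself.
            labels.append((kb.get((head, tail), no_relation), head, tail))
    return labels


# Entities already extracted and linked for the example sentence.
labels = distant_labels(["Elevation_Partners", "Roger_McNamee"])
```

Because the label is looked up rather than read from the sentence, any sentence mentioning both entities receives the same label, which is the source of the noise that distantly supervised models have to handle.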

Different papers have reported various metrics since the release of the dataset, making it difficult to compare systems
directly. The main metrics used are either precision at N results or precision-recall curves. The range of recall
reported has increased over the years as systems have improved, with earlier systems having very low precision at 30% recall.
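Both metrics can be computed from a list of predictions sorted by decreasing model confidence. The sketch below assumes each prediction has already been marked correct or incorrect against the held-out facts; the function names and the toy ranking are hypothetical, and individual papers may differ in how they rank and count predictions.

```python
from typing import List, Tuple


def precision_at_n(ranked_correct: List[bool], n: int) -> float:
    """Fraction of the top-n ranked predictions that are correct."""
    top = ranked_correct[:n]
    return sum(top) / len(top) if top else 0.0


def precision_recall_points(
    ranked_correct: List[bool], total_positives: int
) -> List[Tuple[float, float]]:
    """(recall, precision) after each prediction, walking down the ranking."""
    points = []
    hits = 0
    for i, correct in enumerate(ranked_correct, start=1):
        hits += correct
        points.append((hits / total_positives, hits / i))
    return points


# Toy ranking: True marks a correct prediction, most confident first.
ranked = [True, True, False, True, False]
p_at_3 = precision_at_n(ranked, 3)
curve = precision_recall_points(ranked, total_positives=4)
```

Plotting the points in `curve` gives the precision-recall plot; reading off the precision at a fixed recall (for example 30%) gives a single comparable number.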

(+) Obtained from results in the paper “Neural Relation Extraction with Selective Attention over Instances”

SemEval-2010 Task 8

SemEval-2010 introduced ‘Task 8 - Multi-Way Classification of Semantic
Relations Between Pairs of Nominals’. The task is, given a sentence and two tagged nominals, to predict the relation
between those nominals and the direction of the relation. The dataset contains nine general semantic relations
together with a tenth ‘OTHER’ relation.

Example:

There were apples, pears and oranges in the bowl.

(content-container, pears, bowl)

The main evaluation metric used is macro-averaged F1, averaged across the nine proper relationships (i.e. excluding the
OTHER relation), taking directionality of the relation into account.
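A minimal sketch of this metric follows. It is intended to mirror the description above rather than reproduce the official scorer, and it assumes labels are written as a relation name followed by a direction, such as `Cause-Effect(e1,e2)`, with `Other` for the tenth class; the helper names are hypothetical.

```python
from typing import List


def base_relation(label: str) -> str:
    """Strip the direction suffix, e.g. 'Cause-Effect(e1,e2)' -> 'Cause-Effect'."""
    return label.split("(")[0]


def macro_f1(gold: List[str], pred: List[str], relations: List[str]) -> float:
    """Macro-averaged F1 over the given relations, which should exclude 'Other'.

    A prediction counts as correct only if both relation and direction match.
    """
    scores = []
    for rel in relations:
        tp = sum(1 for g, p in zip(gold, pred) if g == p and base_relation(g) == rel)
        n_pred = sum(1 for p in pred if base_relation(p) == rel)
        n_gold = sum(1 for g in gold if base_relation(g) == rel)
        precision = tp / n_pred if n_pred else 0.0
        recall = tp / n_gold if n_gold else 0.0
        total = precision + recall
        scores.append(2 * precision * recall / total if total else 0.0)
    return sum(scores) / len(scores) if scores else 0.0


gold = ["Content-Container(e1,e2)", "Cause-Effect(e1,e2)", "Other", "Cause-Effect(e2,e1)"]
pred = ["Content-Container(e1,e2)", "Cause-Effect(e2,e1)", "Other", "Cause-Effect(e2,e1)"]
score = macro_f1(gold, pred, relations=["Content-Container", "Cause-Effect"])
```

In this toy example the second prediction has the right relation but the wrong direction, so it counts as an error for Cause-Effect, and the correctly predicted `Other` instance contributes nothing to the average.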

Several papers have used additional data (e.g. pre-trained word embeddings, WordNet) to improve performance. The figures
reported here are the highest achieved by the model using any external resources.

TACRED

TACRED is a large-scale relation extraction dataset with 106,264 examples built over newswire and web text from the corpus used in the yearly TAC Knowledge Base Population (TAC KBP) challenges. Examples in TACRED cover 41 relation types as used in the TAC KBP challenges (e.g., per:schools_attended and org:members) or are labeled as no_relation if no defined relation holds. These examples are created by combining available human annotations from the TAC KBP challenges and crowdsourcing.

Example:

Billy Mays, the bearded, boisterous pitchman who, as the undisputed king of TV yell and sell, became an unlikely pop culture icon, died at his home in Tampa, Fla, on Sunday.

(per:city_of_death, Billy Mays, Tampa)

The main evaluation metric used is micro-averaged F1 over instances with proper relationships (i.e. excluding the
no_relation type).
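As a rough sketch of this metric (not the official scoring script), counts are pooled over all instances and no_relation is ignored on both the gold and the predicted side. The function name and the toy labels below are assumptions made for illustration.

```python
from typing import List


def micro_f1(gold: List[str], pred: List[str], no_relation: str = "no_relation") -> float:
    """Micro-averaged F1 that ignores the no_relation class."""
    # Correct: a proper relation was predicted and it matches the gold label.
    correct = sum(1 for g, p in zip(gold, pred) if g == p and g != no_relation)
    n_pred = sum(1 for p in pred if p != no_relation)  # predicted proper relations
    n_gold = sum(1 for g in gold if g != no_relation)  # gold proper relations
    precision = correct / n_pred if n_pred else 0.0
    recall = correct / n_gold if n_gold else 0.0
    total = precision + recall
    return 2 * precision * recall / total if total else 0.0


gold = ["per:city_of_death", "no_relation", "org:members", "per:schools_attended"]
pred = ["per:city_of_death", "org:members", "no_relation", "per:schools_attended"]
score = micro_f1(gold, pred)
```

Predicting a relation where the gold label is no_relation lowers precision, and predicting no_relation where a relation holds lowers recall, so a model cannot score well by always predicting no_relation.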

FewRel

The Few-Shot Relation Classification Dataset (FewRel) uses a different setting from the previous datasets. It consists of 70K sentences expressing 100 relations, annotated by crowdworkers on a Wikipedia corpus. The few-shot learning task follows the N-way K-shot meta-learning setting. It is both the largest supervised relation classification dataset and the largest few-shot learning dataset to date.
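In the N-way K-shot setting, a model is evaluated on episodes: N relations are sampled, K labelled sentences per relation form the support set, and held-out sentences from the same relations form the query set. The sketch below shows one way such an episode could be sampled; `sample_episode`, the `n_query` parameter, and the toy data are hypothetical and do not reflect the dataset's own tooling.

```python
import random
from typing import Dict, List, Optional, Tuple

Example = Tuple[str, str]  # (sentence, relation)


def sample_episode(
    data: Dict[str, List[str]],
    n_way: int,
    k_shot: int,
    n_query: int = 1,
    rng: Optional[random.Random] = None,
) -> Tuple[List[Example], List[Example]]:
    """Sample one N-way K-shot episode from a relation -> sentences mapping."""
    rng = rng or random.Random()
    relations = rng.sample(sorted(data), n_way)  # pick N relations
    support: List[Example] = []
    query: List[Example] = []
    for rel in relations:
        # Draw K support and n_query query sentences without overlap.
        picked = rng.sample(data[rel], k_shot + n_query)
        support += [(s, rel) for s in picked[:k_shot]]
        query += [(s, rel) for s in picked[k_shot:]]
    return support, query


# Toy data: 4 made-up relations with 5 placeholder sentences each.
toy = {f"rel_{i}": [f"sentence {i}-{j}" for j in range(5)] for i in range(4)}
support, query = sample_episode(toy, n_way=3, k_shot=2, n_query=1, rng=random.Random(0))
```

The model sees the support set, then classifies each query sentence into one of the N sampled relations; accuracy is averaged over many such episodes.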