NLP-progress

Repository to track the progress in Natural Language Processing (NLP), including the datasets and the current state-of-the-art for the most common NLP tasks.

Semantic textual similarity

Semantic textual similarity deals with determining how similar two pieces of texts are.
This can take the form of assigning a score from 1 to 5. Related tasks are paraphrase or duplicate identification.

SentEval

SentEval is an evaluation toolkit for evaluating sentence
representations. It includes 17 downstream tasks, including common semantic textual similarity
tasks. The semantic textual similarity (STS) benchmark tasks from 2012-2016 (STS12, STS13, STS14, STS15, STS16, STSB) measure the relatedness
of two sentences based on the cosine similarity of the two representations. The evaluation criterion is Pearson correlation.

The SICK relatedness (SICK-R) task trains a linear model to output a score from 1 to 5 indicating the relatedness of two sentences. For
the same dataset (SICK-E) can be treated as a three-class classification problem using the entailment labels (classes are ‘entailment’, ‘contradiction’, and ‘neutral’).
The evaluation metric for SICK-R is Pearson correlation and classification accuracy for SICK-E.

The Microsoft Research Paraphrase Corpus (MRPC) corpus is a paraphrase identification dataset, where systems
aim to identify if two sentences are paraphrases of each other. The evaluation metric is classification accuracy and F1.