Dataset

Description

Download

In recent weeks the committee has been notified of new errors in some aspects of the corpus (e.g. drug labeling errors, charoffset errors, etc.). The corpus has therefore been thoroughly re-evaluated, and the reported errors have been fixed. We apologize for any inconvenience.

In addition, one change has been introduced in the new corpus: the label "interaction" has been replaced by a new label, "pair", which identifies all possible DDI candidate pairs appearing in a single sentence. Thanks to all the participants who sent us their observations about inconsistencies or errors; we greatly appreciate this valuable information, which helps us improve the DrugDDI corpus.
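As a rough illustration of what "all possible DDI candidate pairs appearing in a single sentence" amounts to, the following sketch enumerates every pair of drug mentions in one sentence. It assumes that pairs are unordered and that each pair is listed once; the identifiers e0, e1 and e2 and the function name candidate_pairs are hypothetical and are not taken from the corpus.

```python
from itertools import combinations


def candidate_pairs(entity_ids):
    """Return every unordered pair of drug mentions found in one sentence.

    Assumption: a pair (a, b) is the same candidate as (b, a), so a sentence
    with n drug mentions yields n * (n - 1) / 2 candidate pairs.
    """
    return list(combinations(entity_ids, 2))


# Hypothetical identifiers for three drug mentions in a single sentence.
sentence_entities = ["e0", "e1", "e2"]

for first, second in candidate_pairs(sentence_entities):
    print(first, second)
```

Under these assumptions, a sentence with fewer than two drug mentions produces no candidate pairs at all.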

The errors and inconsistencies detected in this format have been corrected. We apologize for any inconvenience.

Structure

The corpus is provided in XML format and follows the structure described in this document.

Participants are allowed to submit a maximum of 5 runs.
Each run can include different sources of information and use different techniques.
A submission file must be a txt file that includes all pairs of drugs (at the sentence level).

Test file example

June 3, 2011 - NEW!! The test dataset is available for registered participants here.