News

Two papers have been accepted by ACL 2020

Two papers appear in JMIR and JMIR Medical Informatics

Our paper, "Extraction of information related to adverse drug events from electronic health record notes: design of an end-to-end model based on deep learning", has been selected as a Best Paper in the IMIA Yearbook of Medical Informatics.

Home

Adverse drug events (ADEs) are common and occur in approximately 2-5% of hospitalized adult patients. Each ADE is estimated to increase healthcare costs by more than $3,200. Severe ADEs rank among the five or six leading causes of death in the United States. Prevention, early detection and mitigation of ADEs could save both lives and dollars. Applying natural language processing (NLP) techniques to electronic health records (EHRs) offers an effective approach to real-time pharmacovigilance and drug safety surveillance.

We have annotated 1092 EHR notes with medications and their relations to the corresponding attributes, indications and adverse events. This corpus is a valuable resource for developing NLP systems that automatically detect these clinically important entities. We are therefore happy to announce a public NLP challenge, MADE1.0, which aims to promote innovation in the related research tasks and to bring researchers and professionals together to exchange research ideas and share expertise. The ultimate goal is to advance ADE detection techniques in order to improve patient safety and health care quality.

Annotated Data

The entire dataset contains 1092 de-identified EHR notes from 21 cancer patients. Each EHR note was annotated with medication information (medication name, dosage, route, frequency, duration), ADEs, indications, other signs and symptoms, and relations among those entities. We split the data into a training set consisting of ~900 notes and a test set consisting of ~180 notes. Both will be released in BioC format.
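A minimal sketch of reading entity and relation annotations from a BioC XML file with Python's standard library is shown below. It assumes the usual BioC layout (collection, document, passage, annotation, relation). The infon key "type", the label names ("Drug", "ADE", "adverse") and the sample note are hypothetical and used only for illustration; the released files may use different keys and labels.

```python
import xml.etree.ElementTree as ET

# Hypothetical BioC snippet. The infon key "type" and the labels are
# assumptions for illustration, not taken from the MADE release.
SAMPLE = """<collection>
  <source>example</source>
  <date></date>
  <key></key>
  <document>
    <id>note_1</id>
    <passage>
      <offset>0</offset>
      <text>Patient developed rash after starting amoxicillin 500 mg.</text>
      <annotation id="T1">
        <infon key="type">ADE</infon>
        <location offset="18" length="4"/>
        <text>rash</text>
      </annotation>
      <annotation id="T2">
        <infon key="type">Drug</infon>
        <location offset="38" length="11"/>
        <text>amoxicillin</text>
      </annotation>
      <relation id="R1">
        <infon key="type">adverse</infon>
        <node refid="T1" role="arg1"/>
        <node refid="T2" role="arg2"/>
      </relation>
    </passage>
  </document>
</collection>"""


def read_bioc(xml_text):
    """Return one dict per document with its entities and relations."""
    root = ET.fromstring(xml_text)
    notes = []
    for doc in root.iter("document"):
        entities = []
        for ann in doc.iter("annotation"):
            loc = ann.find("location")
            start = int(loc.get("offset"))
            entities.append({
                "id": ann.get("id"),
                "type": ann.findtext("infon[@key='type']"),
                "start": start,
                "end": start + int(loc.get("length")),  # exclusive end offset
                "text": ann.findtext("text"),
            })
        relations = []
        for rel in doc.iter("relation"):
            relations.append({
                "id": rel.get("id"),
                "type": rel.findtext("infon[@key='type']"),
                "args": [node.get("refid") for node in rel.findall("node")],
            })
        notes.append({
            "id": doc.findtext("id"),
            "entities": entities,
            "relations": relations,
        })
    return notes


notes = read_bioc(SAMPLE)
for entity in notes[0]["entities"]:
    print(entity["type"], entity["start"], entity["end"], entity["text"])
```

For a file on disk, ET.parse(path).getroot() can replace ET.fromstring(xml_text).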

Task Definitions

The MADE1.0 challenge consists of three tasks, defined as follows.

Named entity recognition (NER): develop systems to automatically detect mentions of medication names and their attributes (dosage, frequency, route, duration), as well as mentions of ADEs, indications, and other signs & symptoms.

Relation identification (RI): given the ground-truth entity annotations, build a system to identify relations between medication name entities and their attribute entities, as well as relations between medication name entities and ADEs, indications, and other signs & symptoms.

Integrated task (IT): design and develop an integrated system that performs the above two tasks together.
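One common way to frame the RI task is to enumerate candidate pairs from the given entities and then classify each pair. The sketch below pairs every medication entity in a note with every other entity in that note. It is an illustration only, not the organizers' baseline: the entity dictionaries, the "Drug" label and the other type names are hypothetical.

```python
from itertools import product

MEDICATION = "Drug"  # assumed label for medication name entities


def candidate_pairs(entities):
    """Pair each medication entity with every non-medication entity.

    `entities` is a list of dicts with at least "id" and "type" keys,
    all taken from the same note.
    """
    meds = [e for e in entities if e["type"] == MEDICATION]
    others = [e for e in entities if e["type"] != MEDICATION]
    return [(m["id"], o["id"]) for m, o in product(meds, others)]


note_entities = [
    {"id": "T1", "type": "Drug"},
    {"id": "T2", "type": "Dose"},
    {"id": "T3", "type": "ADE"},
]
print(candidate_pairs(note_entities))
```

A relation classifier would then label each pair with a relation type or with "no relation". In practice the candidate set is often restricted further, for example to entities within a few sentences of each other.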

Evaluations

For each task, we evaluate two configurations: a Standard evaluation using only MADE resources (up to 2 runs) and an Extended evaluation using any customized resources available (up to 2 runs). The best score in each setting will be used for team ranking. MADE resources refer only to the released training data plus the word embeddings trained on Wikipedia, de-identified Pittsburgh EHR notes and PubMed articles, which can be downloaded here: http://146.189.156.68/word_embed/. Please cite the NAACL 2016 paper below for the word embeddings.

Below are the publications describing two baseline systems. In this competition, we partitioned the training and test data differently than those publications did, so the performance of the two baseline systems should be treated as an approximation only.
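Top systems in this challenge are compared on an exact matching score. The sketch below shows how exact-match precision, recall and F1 are commonly computed over entity spans. It is not the official evaluation script: the tuple layout (note id, start, end, type) and the pooling of all notes into one set are assumptions.

```python
def exact_match_prf(gold, predicted):
    """Precision, recall and F1 where a prediction counts only if it is
    identical to a gold item (same note, same offsets, same type)."""
    gold_set, pred_set = set(gold), set(predicted)
    true_positives = len(gold_set & pred_set)
    precision = true_positives / len(pred_set) if pred_set else 0.0
    recall = true_positives / len(gold_set) if gold_set else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical spans: (note id, start offset, end offset, entity type).
gold = [("note_1", 18, 22, "ADE"), ("note_1", 38, 49, "Drug")]
predicted = [("note_1", 18, 22, "ADE"), ("note_1", 38, 45, "Drug")]

# The second prediction has the wrong end offset, so it is not an exact match.
print(exact_match_prf(gold, predicted))  # (0.5, 0.5, 0.5)
```

The same function can be applied to relations by using tuples such as (note id, first entity id, second entity id, relation type).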

Submission and Timeline

1) The test data for Task 1 (Entity--NER) and Task 3 (Entity+Relation--IT) will be released to teams on March 5, 2018 at 10:00am Eastern Time, and each team is requested to submit results within 24 hours.

2) The test data with ground-truth entity labels for Task 2 (Relation only--RI) will be released on March 6, 2018 at 11:00am, and each team is requested to submit Task 2 results within 24 hours.

3) We will release the evaluation script on Feb 5, 2018 (previously announced as Feb 1, 2018) so that groups can validate their output format and test the script in advance. Submissions that do not conform to the provided BioC format will be rejected without consideration or notification.

4) Right after the evaluation, we require the top-performing systems (based on the exact matching score using MADE resources only) to send us their software or set up an online server by March 8, 2018 so that we can validate the results on our site. System performance that cannot be validated will be rejected.

5) A special scientific panel session for this challenge will be held at AMIA Summit 2018 on March 14, 2018 (1:30pm – 3:00pm). Six teams will be invited to give panel presentations, but all teams are welcome to attend the session at the AMIA Summit (visa applications need to be prepared in advance).

6) We will work on a journal special issue for this challenge after the AMIA Summit, and another workshop is also under consideration.

The test data will be released on March 5, 2018 at 10:00am Eastern Time as scheduled. The notification will be sent out through the Google group email (umassmade2018@googlegroups.com). Please notify us if you don't receive it.