Article Structure

Abstract

An important search task in the biomedical domain is to find medical records of patients who are qualified for a clinical trial.

Introduction

With the increasing use of electronic health records, it becomes urgent to leverage this rich information resource about patients’ health conditions to transform research in health and medicine.

Related Work

The Medical Records track of the Text REtrieval Conference (TREC) provides a common platform to study the medical records retrieval problem and evaluate the proposed methods (Voorhees and Tong, 2011; Voorhees and Hersh, 2012).

Concept-based Representation for Medical Records Retrieval

3.1 Problem Formulation

Weighting Strategies for Concept-based Representation

4.1 Motivation

Experiments

5.1 Experiment Setup

Conclusions and Future Work

Medical record retrieval is an important domain-specific IR problem.

Topics

confidence score

Appears in 8 sentences as: confidence score (7) confidence scores (1)

In A Study of Concept-based Weighting Regularization for Medical Records Search

In particular, MetaMap (Aronson, 2001) can take a text string as the input, segment it into phrases, and then map each phrase to multiple UMLS CUIs with confidence scores.

Page 3, “Concept-based Representation for Medical Records Retrieval”

The confidence score is an indicator of the quality of the phrase-to-concept mapping by MetaMap.

Page 3, “Concept-based Representation for Medical Records Retrieval”

confidence score as well as more detailed information about this concept.

Page 3, “Concept-based Representation for Medical Records Retrieval”

Although MetaMap is able to rank all the candidate concepts with the confidence score and pick the most likely one, the accuracy is not very high.

Page 4, “Weighting Strategies for Concept-based Representation”
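The ranking step described in the excerpt above can be pictured with a small sketch. It does not call MetaMap; the `ConceptCandidate` type, the placeholder CUIs, the raw scores, and the sum-to-one normalization are all assumptions made for illustration, and the paper's own normalization may differ.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConceptCandidate:
    """One candidate concept for a phrase (hypothetical type, not a MetaMap API)."""
    cui: str      # UMLS concept identifier; the values below are placeholders
    score: float  # raw confidence score reported by the mapping tool


def normalize(candidates):
    """Scale raw scores so they sum to 1 within a phrase (an assumed scheme)."""
    total = sum(c.score for c in candidates)
    if total == 0:
        return {c.cui: 0.0 for c in candidates}
    return {c.cui: c.score / total for c in candidates}


def most_likely(candidates):
    """Pick the candidate with the highest raw confidence score."""
    return max(candidates, key=lambda c: c.score)


# Made-up candidates for a single phrase.
candidates = [
    ConceptCandidate("C0000001", 600.0),
    ConceptCandidate("C0000002", 300.0),
    ConceptCandidate("C0000003", 100.0),
]
weights = normalize(candidates)
best = most_likely(candidates)
```

Keeping the normalized weights for every candidate, rather than only the top one, preserves information that a weighting strategy can use when the top-ranked mapping is wrong.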

i(e) is the normalized confidence score of the mapping for concept e generated by MetaMap.

Page 4, “Weighting Strategies for Concept-based Representation”

Since each concept mapping is associated with a confidence score, we can incorporate them into the regularization function as follows:

Page 5, “Weighting Strategies for Concept-based Representation”

where i(e) is the normalized confidence score of concept e generated by MetaMap, and α is a parameter between 0 and 1 to control the effect of the regularization.

Page 5, “Weighting Strategies for Concept-based Representation”

As shown in Equation (3), the Balanced method regularizes the weights through two components: (1) normalized confidence score of each aspect,

vector space

Appears in 3 sentences as: vector space (3)

In A Study of Concept-based Weighting Regularization for Medical Records Search

For example, Qi and Laquerre used MetaMap to generate the concept-based representation and then applied a vector space retrieval model for ranking, and their results were among the top-ranked runs in the TREC 2012 Medical Records track (Qi and Laquerre, 2012).

Page 2, “Related Work”

However, existing studies on concept-based representation still used weighting strategies developed for term-based representation such as vector space models (Qi and Laquerre, 2012) and divergence from randomness (DFR) (Limsopatham et al., 2013a) and did not take the inaccurate concept mapping results into consideration.

Page 2, “Related Work”

After converting both queries and documents to concept-based representations using MetaMap, previous work applied existing retrieval functions such as vector space models (Singhal et al., 1996) to rank the documents.
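As a rough illustration of ranking documents over concept-based representations with a vector space model, the sketch below treats each query and document as a bag of concept identifiers and scores documents by TF-IDF cosine similarity. This is a generic textbook formulation, not the specific retrieval function of Singhal et al. (1996) or of the paper; the smoothed IDF, the function names, and the toy concept identifiers are assumptions.

```python
import math
from collections import Counter


def tfidf_vector(concepts, doc_freq, num_docs):
    """Build a sparse TF-IDF vector over concept identifiers (smoothed IDF)."""
    counts = Counter(concepts)
    return {
        c: tf * (math.log((num_docs + 1) / (doc_freq.get(c, 0) + 1)) + 1.0)
        for c, tf in counts.items()
    }


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * v.get(c, 0.0) for c, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def rank(query_concepts, docs):
    """Return (doc_id, score) pairs sorted by descending cosine similarity."""
    num_docs = len(docs)
    doc_freq = Counter(c for concepts in docs.values() for c in set(concepts))
    q_vec = tfidf_vector(query_concepts, doc_freq, num_docs)
    scored = [
        (doc_id, cosine(q_vec, tfidf_vector(concepts, doc_freq, num_docs)))
        for doc_id, concepts in docs.items()
    ]
    # Break ties by document id so the ordering is deterministic.
    return sorted(scored, key=lambda pair: (-pair[1], pair[0]))


# Toy collection: concept identifiers are placeholders, not real CUIs.
docs = {
    "d1": ["C1", "C2", "C2"],
    "d2": ["C3"],
    "d3": ["C1", "C3"],
}
ranked = rank(["C2"], docs)
```

Note that this scoring treats every mapped concept as equally reliable, which is the limitation the excerpts above attribute to term-based weighting applied to concept representations.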