Bioinformatics Core Supports Investigators Who Seek to Use Big Data

Bioinformatics methods are increasingly applied in clinical research studies to approach research questions from novel perspectives. For example, natural language processing (NLP) enables use of information previously embedded in narrative clinical notes. Leveraging experience from the past decade in developing and applying bioinformatics tools for clinical studies, the VERITY Bioinformatics Resource Core supports investigators who seek to apply these tools to their research in rheumatic and musculoskeletal disease.

Members of the VERITY community have access to integrated electronic medical record data from a large rheumatoid arthritis (RA) cohort, including information extracted with NLP. This resource provides an opportunity for hands-on experience working with integrated data while testing study hypotheses. Later this year, the online informatics platform will go live, allowing our Research Community to query the integrated dataset using structured data, such as ICD codes and electronic prescriptions, as well as information extracted by NLP.

The Bioinformatics Core can also provide access to a customizable chart review tool, CHANL, which uses NLP to help investigators identify outcomes of interest. The tool can also improve the efficiency of chart reviews in which thousands of notes require analysis. In addition, the Bioinformatics Resource Core assists with phenotyping via machine learning approaches and with the creation of cohort studies using electronic medical record data.

Currently, the Bioinformatics Core is working with Sara Tedeschi, MD, a rheumatologist at Brigham and Women’s Hospital and an Instructor of Medicine at Harvard Medical School, to adapt existing methods to phenotype pseudogout, a common clinical condition for which large-scale epidemiologic studies are lacking. The Core is leveraging NLP and testing semi-supervised machine learning approaches to better identify this condition in the electronic medical record (EMR) so that clinical studies can be performed.

The VERITY Bioinformatics Core is led by Katherine Liao, MD, MPH, an Assistant Professor of Medicine and Bioinformatics at Harvard Medical School and a rheumatologist and clinical investigator at Brigham and Women’s Hospital. The Associate Director is Tianxi Cai, ScD, a Professor of Biostatistics at Harvard T.H. Chan School of Public Health and a Professor of Biomedical Informatics at Harvard Medical School. The Core’s faculty also includes Tianrun Cai, MD, of the Brigham’s Department of Rheumatology, who is the creator of CHANL.