SLAS2018 Innovation Award Finalist: Pharos – A Torch to Use in Your Journey In the Dark Genome

It is well known that a relatively small set of protein targets receives the bulk of research attention, and thus funding. However, there are potential (druggable) opportunities among the remaining understudied and unstudied proteins. To address this, the NIH initiated the "Illuminating the Druggable Genome" program to characterize the dark regions of the druggable genome. As part of this program, a Knowledge Management Center (KMC) was created to aggregate and integrate heterogeneous data sources and data types, creating a centralized location for information about all protein targets identified as part of the druggable genome. Since then, the KMC has expanded to consider the entire human proteome. In this presentation, we describe Pharos, the user interface for the KMC knowledgebase. We provide an overview of the data sources and types made available via Pharos and then describe the architecture of the system and its integration with KMC and external resources. In particular, we highlight the rich search facilities that enable a user to drill down to relevant subsets of data but also support the notion of "serendipitous search". Given the heterogeneous set of data types available for individual targets, it is useful to quantify how much data, and what types of data, are available for a target. We describe the development of knowledge profiles and a Knowledge Availability Score (KAS), both derived from the Harmonizome, a resource that has characterized data availability across different data sources and types in a uniform manner. We then highlight how the KAS is concordant with knowledge trends characterized by traditional metrics such as publications and grants. Finally, we discuss the use of the KAS in the Pharos interface and present an example of prioritizing understudied targets by computing the similarity of their knowledge availability profiles with those of well-studied targets.
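The profile-similarity idea mentioned at the end of the abstract can be illustrated with a minimal sketch. It assumes each target's knowledge availability profile is a fixed-length vector of per-data-type scores in [0, 1] and uses cosine similarity as the comparison; the abstract does not state which similarity measure or profile encoding Pharos uses, so both are assumptions, and the target names and values below are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length numeric vectors.

    Returns 0.0 when either vector is all zeros (no data available).
    """
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_by_profile_similarity(reference, candidates):
    """Rank candidate targets by similarity of their profile to a reference.

    reference:  profile vector of a well-studied target
    candidates: dict mapping target name -> profile vector
    Returns a list of (name, similarity) pairs, most similar first.
    """
    scored = [
        (name, cosine_similarity(reference, profile))
        for name, profile in candidates.items()
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Hypothetical profiles: one value per data type (assumed encoding).
well_studied = [0.9, 0.8, 0.7, 0.9]
understudied = {
    "TARGET_A": [0.2, 0.2, 0.1, 0.2],  # sparse but similarly shaped
    "TARGET_B": [0.0, 0.3, 0.0, 0.0],  # data in a single category only
    "TARGET_C": [0.0, 0.0, 0.0, 0.0],  # no data at all
}

ranking = rank_by_profile_similarity(well_studied, understudied)
for name, score in ranking:
    print(f"{name}: {score:.3f}")
```

Cosine similarity compares the shape of a profile rather than its magnitude, so an understudied target with little data spread across the same data types as a well-studied target still ranks highly. Whether that property is desirable depends on the prioritization goal.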

Rajarshi Guha

Research Scientist, NIH

Rajarshi Guha is Group Leader (Research Informatics) in the Division of Pre-Clinical Innovation at NCATS. With over 10 years of experience in handling, analysing and visualizing chemical information, he brings a diverse range of skills and experience to his current role at NCATS. He is involved in small molecule development projects in a variety of therapeutic areas, including rare cancers and infectious diseases. He is also involved in software and algorithm development in the areas of cheminformatics methods and large-scale infrastructure projects, including Pharos (http://pharos.nih.gov/) and BARD (http://bard.nih.gov/). His research interests focus on methodology development to analyze and visualize chemical biology data sets, with a specific focus on techniques to link chemical structure information to molecular, bibliographic, genomic and clinical covariates. He has held multiple leadership roles in the American Chemical Society’s Division of Chemical Information and is currently a co-Editor in Chief of the Journal of Cheminformatics.