Toward Precision Medicine: Building a Knowledge Network for Biomedical Research and a New Taxonomy of Disease (2011)

Each report is produced by a committee of experts selected
by the Academy to address a particular statement of task and
is subject to a rigorous, independent peer review; while the
reports represent views of the committee, they also are
endorsed by the Academy.

A new data network that integrates emerging research on the molecular makeup of diseases with clinical data on individual patients could drive the development of a more accurate classification of disease and ultimately enhance diagnosis and treatment. Recent advances in biomedical research have caused an explosion of data, offering the potential to develop a "new taxonomy" that defines disease based on underlying molecular and environmental causes, rather than on physical signs and symptoms. This report outlines how research and clinical data can be captured in a "knowledge network" that will be broadly accessible to researchers and clinicians. Beyond improving health care, the new data network could improve biomedical research by enabling scientists to access patient information through electronic health records, while still protecting patient rights.

Key Messages

Dramatic advances in research have generated a wealth of new data that could improve health outcomes. However, currently there is a disconnect between scientific advances in research and the incorporation of this information in the clinic. In addition, researchers don't have access to the wealth of clinical data on patients that is collected at the point of care.

To harness the power of emerging disease data, systems are needed to collect the information and make it widely accessible. The committee suggested a framework for creating an information system called a Knowledge Network of Disease that integrates the rapidly expanding range of information on the causes of disease and allows researchers, health-care providers, and the public to share and update this information. Such a system is centered on an "Information Commons," a data repository that links layers of molecular data, medical histories (including information on social and physical environments), and health outcomes to individual patients. Data would be continuously contributed to the Information Commons by the research community and from the medical records of participating patients.

The Knowledge Network would affect all aspects of biomedicine and health care. By analyzing connections between information sets (for example, between the genome and environmental exposures), basic scientists would be able to formulate and test hypotheses about disease mechanisms, and clinicians could develop new treatments based on unique features of a disease and tailored to each patient. The availability of more diverse information about each disease would allow insurers and health care providers to define disease subtypes more precisely.

The initiative to develop a New Taxonomy, along with its underlying Information Commons and Knowledge Network, is a needed modernization of current approaches to integrating different types of data, not an "add-on" to existing research programs. Enormous efforts are already underway to achieve many of the goals of this report, but the committee found that two things are often missing: a system-wide emphasis on shifting the acquisition of molecular data to point-of-care settings, and the coordination required to ensure that research data reach the Information Commons and Knowledge Network.

The report outlines a number of concrete steps toward creating a Knowledge Network of Disease and deriving a new taxonomy from it.

Conduct pilot studies that begin to populate the Information Commons with data.
Pilot studies, including those conducted in health care settings, would help scientists determine how to integrate molecular data with medical histories and health outcomes in the ordinary course of clinical care. These studies would address the institutional, cultural, and regulatory barriers to widespread sharing of individuals' molecular profiles and health histories while still protecting patients' rights. Much of the initial work necessary to develop the Information Commons should take the form of observational studies, which would collect molecular and other patient data during the normal course of treatment.

Integrate data to construct a Knowledge Network of Disease.
As data from pilot studies begin to populate the Information Commons, substantial effort should go into integrating these data with the results of basic biomedical research in order to create a dynamic, interactive Knowledge Network. This network, and the Information Commons itself, should leverage state-of-the-art information technology to provide multiple views of the data appropriate to the varying needs of different users such as basic researchers, clinicians, or outcomes researchers. The incorporation of electronic medical records into the health care system and the advent of inexpensive ways of collecting health information could also create opportunities to integrate data for the Information Commons more efficiently.

Initiate a process within appropriate federal agencies to assess the privacy issues associated with the research required to create the Information Commons.
Because privacy issues associated with genetic information have been studied extensively, this process need not start from scratch. However, investigators who wish to participate in the pilot studies discussed above, and the Institutional Review Boards that must approve their human-subjects protocols, will need specific guidance on the range of informed-consent processes appropriate for these projects.

Ensure data sharing.
Widespread data sharing is essential to the success of each stage of creating a new disease taxonomy. Most fundamentally, information on how gene sequence translates to symptoms must be broadly accessible so that a wide diversity of researchers can mine it. Data-sharing standards that respect individual privacy concerns while enhancing the deposition of data into the Information Commons should be created. These standards should provide incentives that favor data sharing over the establishment of proprietary databases for commercial purposes.

Develop an efficient validation process to incorporate information from the disease Knowledge Network into a new taxonomy of disease.
Insights into disease classification that emerge from the Information Commons and the derived Knowledge Network will require validation, both of their reproducibility and of their utility for making clinical decisions such as selecting appropriate treatment, before adoption into clinical use. Both the pace at which such validated information emerges and its complexity will undoubtedly increase, which will require novel decision-support systems for use by all stakeholders.

Incentivize partnerships.
A new taxonomy incorporating molecular data could become self-sustaining by accelerating delivery of better health through more accurate diagnosis and more effective and cost-efficient treatments. However, to cover initial costs associated with collecting and integrating data for the Information Commons, incentives should be developed that encourage public-private partnerships involving government, drug developers, regulators, advocacy groups, and payers.