Building Biomedical Ontologies

By Eric K. Neumann

April 12, 2007 | Ontologies have an essential role in organizing the vast amount of new biomedical information and making it available across applications. Ontologies are formalisms based on both domain knowledge and logical principles, and creating them requires a balance of computer engineering and scientific domain expertise.

Why are ontologies useful? They allow researchers to share a common understanding of information concisely, enable the reuse of data and information in various combinations, and foster communities of researchers by supporting intelligent connections. They are formal descriptions of concepts and relations that map scientific concepts for use in information systems, in order to declare data structures, enumerate terminologies (controlled vocabularies), and provide machine-usable descriptions of processes (e.g., Web services). An example of the second category is the Gene Ontology, which informaticists have found most useful for organizing and mining newly elucidated genomic information.

Ontologies are generally top-down descriptions, but they are often focused on individual domains. This allows them to be practical, usable, and, importantly, achievable by a community willing to invest in their creation. Examples of biomedical ontologies include: the Gene Ontology and its umbrella organization, OBO (Open Biomedical Ontologies); the Systematized Nomenclature of Medicine for diseases, findings, and procedures (SNOMED); Medical Subject Headings (MeSH, National Library of Medicine); and the Biological Pathway Exchange ontology (BioPAX).

However, researchers are realizing that although the concepts may be separable by domain, the way these ontologies are formally described should be consistent across all of them. For example, the ‘part-of’ relation can mean that something is physically contained within something else (the nucleus is part of the cell) or that something is a step within a process (mitosis is part of the cell cycle). Such cases abound, and the different senses may need distinct definitions. To ensure that all these ontologies are constructed consistently enough to be used in combination, the National Center for Biomedical Ontology (NCBO) was funded to coordinate such efforts and establish ontology guidelines.
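The distinction between the two senses of ‘part-of’ can be made concrete with a small sketch. The relation names below (PART_OF_STRUCTURE, PART_OF_PROCESS) and the helper parts_of are hypothetical, chosen only for illustration; they are not taken from any NCBO or OBO specification. The point is simply that once each sense has its own definition, a query for structural parts no longer returns process steps by accident.

```python
from enum import Enum


class Relation(Enum):
    # Hypothetical relation names, for illustration only.
    PART_OF_STRUCTURE = "part_of_structure"  # physical containment
    PART_OF_PROCESS = "part_of_process"      # a step within a process


# (subject, relation, object) assertions, using the two examples from the text.
ASSERTIONS = [
    ("nucleus", Relation.PART_OF_STRUCTURE, "cell"),
    ("mitosis", Relation.PART_OF_PROCESS, "cell cycle"),
]


def parts_of(whole, relation, assertions=ASSERTIONS):
    """Return the subjects linked to `whole` by exactly the given relation."""
    return [s for s, r, o in assertions if r is relation and o == whole]


structural_parts = parts_of("cell", Relation.PART_OF_STRUCTURE)
process_steps = parts_of("cell cycle", Relation.PART_OF_PROCESS)
```

With a single undifferentiated ‘part-of’, both assertions would sit in the same bucket, and software combining an anatomy ontology with a process ontology could not tell containment from sequence.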

What Is the NCBO?

The NCBO is one of the seven National Centers for Biomedical Computing (NCBC) funded by NIH as part of its Digital Roadmap Initiative. Its vision is that all biomedical knowledge and data should be disseminated on the Internet using principled ontologies, so that the knowledge and data are semantically interoperable and useful for furthering biomedical science and clinical care. Its mission is “to create software and support services for the application of principled ontologies in biomedical science and clinical care, ranging from tools for application developers to software for end-users.” To this end, NCBO defines knowledge and data as being semantically interoperable “when they enable predictable, meaningful, computation across knowledge sources developed independently to meet diverse needs.” Accordingly, NCBO will recommend formats and methodologies for ontology development, maintenance, and use in order to foster the creation of these principled ontologies.

The NCBO brings together researchers, computer engineers, and clinicians to help meet its goals, and to coordinate efforts such as OBO in a supportive environment. The tools and formats will take advantage of the W3C’s OWL ontology language, which is also used by the Semantic Web initiative. The NCBO is also working with researchers at the National Center for Ontological Research (NCOR) on improving the quality and utility of ontologies.

Useful Tools

One of NCBO’s current resources is the BioPortal, a Web application for accessing its OBO library. The content includes the ontologies of model organisms, biology, chemistry, anatomy, radiology, and medicine. BioPortal lets users browse individual ontologies with different browsing styles, and provides tools for developers to integrate its functionality into their own applications by exposing URIs for all the ontology content and Web services, a practice that aligns with the Semantic Web’s goals.

The NCBO is encouraging collaborative projects that it will fund and coordinate, thereby extending into various research communities (such as BIRN). In return, NCBO will be able to develop and use ontologies that can be combined or linked together and still remain logically consistent, a win-win for all communities. NCBO is also expanding its toolset for researchers. For instance, Phenote is a tool for annotating and storing biomedical phenotypes. In addition, an OBO Converter has been created to convert other formats (e.g., DAG-Edit) into the more standard W3C OWL format.
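To give a feel for what such a format conversion involves, here is a minimal sketch that turns one OBO-style flat-file term stanza into OWL class statements written in Turtle syntax. It is not the NCBO OBO Converter and does not reproduce its output. The term IDs (EX:0000001, EX:0000002), the base address http://example.org/onto/, and the function names are all assumptions made for illustration, and the output presumes that the owl: and rdfs: prefixes are declared elsewhere in the file.

```python
# One OBO-style [Term] stanza. The IDs are hypothetical placeholders.
STANZA = """
[Term]
id: EX:0000001
name: nucleus
is_a: EX:0000002 ! cell part
"""


def parse_term_stanza(text):
    """Parse one [Term] stanza into a dict mapping each tag to a list of values."""
    term = {}
    for line in text.strip().splitlines():
        line = line.strip()
        if not line or line.startswith("["):
            continue  # skip blank lines and the [Term] header
        tag, _, value = line.partition(":")
        value = value.split(" ! ")[0].strip()  # drop a trailing "! comment"
        term.setdefault(tag.strip(), []).append(value)
    return term


def to_turtle(term, base="http://example.org/onto/"):
    """Render a parsed term as an OWL class in Turtle (prefixes assumed declared)."""
    def iri(obo_id):
        return "<" + base + obo_id.replace(":", "_") + ">"

    lines = [iri(term["id"][0]) + " a owl:Class"]
    for name in term.get("name", []):
        lines.append('    rdfs:label "' + name + '"')
    for parent in term.get("is_a", []):
        lines.append("    rdfs:subClassOf " + iri(parent))
    return " ;\n".join(lines) + " ."


turtle = to_turtle(parse_term_stanza(STANZA))
print(turtle)
```

The sketch handles only the id, name, and is_a tags. A real converter would also have to deal with relationship tags, synonyms, definitions, and obsolete terms, which is where most of the effort lies.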

Ontologies take time and commitment to be constructed, but their value increases dramatically when they are well structured and logically consistent, especially across multiple domains. The success of the NCBO will have a direct effect on the success of biomedical research in all areas.

Eric K. Neumann is senior director of product strategy at Teranode. He can be reached at eneumann@teranode.com.