This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

Abstract

Background: A hierarchical taxonomy of organisms is a prerequisite for semantic integration of biodiversity data. Ideally, there would be a single, expansive, authoritative taxonomy that includes extinct and extant taxa, information on synonyms and common names, and monophyletic supraspecific taxa that reflect our current understanding of phylogenetic relationships.

Description: As a step towards development of such a resource, and to enable large-scale integration of phenotypic data across the vertebrates, we created the Vertebrate Taxonomy Ontology (VTO), a semantically defined taxonomic resource derived from the integration of existing taxonomic compilations, and freely distributed under a Creative Commons Zero (CC0) public domain waiver. The VTO includes both extant and extinct vertebrates and currently contains 106,927 taxonomic terms, 23 taxonomic ranks, 104,506 synonyms, and 162,132 taxonomic cross-references. Key challenges in constructing the VTO included (1) extracting and merging names, synonyms, and identifiers from heterogeneous sources; (2) replacing subgroups with more authoritative local taxonomies; and (3) automating this process as much as possible to accommodate updates in source taxonomies.

Conclusions: The VTO is the primary source of taxonomic information used by the Phenoscape Knowledgebase (http://phenoscape.org/), which integrates genetic and evolutionary phenotype data across both model and nonmodel vertebrates. The VTO is useful for crudely inferring phenotypic changes on the vertebrate tree of life, which enables queries for candidate genes for different episodes in vertebrate evolution.

Additional Information

Competing Interests

Todd Vision is an Academic Editor for PeerJ.

Author Contributions

Peter E Midford conceived and designed the experiments, analyzed the data, contributed reagents/materials/analysis tools, and wrote the paper.

T Alex Dececchi conceived and designed the experiments, analyzed the data, and wrote the paper.

James P Balhoff conceived and designed the experiments, analyzed the data, contributed reagents/materials/analysis tools, and wrote the paper.

Wasila M Dahdul conceived and designed the experiments, analyzed the data, and wrote the paper.

Nizar Ibrahim conceived and designed the experiments, analyzed the data, and wrote the paper.

Grant Disclosures

Funding

This material is based upon work supported by the National Science Foundation (NSF, grants DBI-0641025, DBI-1062404, and DBI-1062542) and the National Evolutionary Synthesis Center (NSF EF-0423641 and NSF EF-0905606). Any opinions, findings, and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the NSF. We also acknowledge support from the National Institutes of Health (HG002659). The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.
