Reference Genomes Data

The HMP plans to sequence, or collect from associated efforts, a
total of 3000 reference genomes isolated from human body sites. The majority of
these will be sequenced only to a high-quality draft stage. Metadata about
current, completed and targeted reference genome projects can be found in the
HMP Project Catalog. The information gained from the reference genomes will
aid in taxonomic assignment of 16S rRNA sequence and in functional annotation
of metagenomic sequence from microbiome samples.

As reference genomes are released with annotation, they will become available
for download here. This page does not reflect every project found in the HMP
Project Catalog, but only those that have completed sequencing and annotation.
Isolates are organized by body site, then by genus and species name. Users can
sort within body site by GenBank project ID. Hover over download icons to see
file format type and file size. The DACC provides the following four file
formats: assembly nucleotide FASTA (ASM), protein multi-FASTA (PEP), coding
sequence nucleotide multi-FASTA (CDS), and GenBank format (GBK). Sequence and
annotation data are also available at NCBI.
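
The multi-FASTA downloads (PEP and CDS) can be read with a few lines of code. The following is a minimal sketch in Python, assuming the files follow the usual FASTA convention of a ">" header line followed by one or more sequence lines. The function name read_multifasta and the example records are hypothetical and are not provided by the DACC.

```python
from typing import Dict, Iterable, List, Optional


def read_multifasta(lines: Iterable[str]) -> Dict[str, str]:
    """Parse FASTA-formatted lines into a {record_id: sequence} mapping.

    The record ID is taken to be the first whitespace-delimited token
    after '>' (an assumption; real headers may need other handling).
    """
    records: Dict[str, str] = {}
    current_id: Optional[str] = None
    chunks: List[str] = []
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            # A new header closes the previous record, if any.
            if current_id is not None:
                records[current_id] = "".join(chunks)
            tokens = line[1:].split()
            current_id = tokens[0] if tokens else ""
            chunks = []
        elif current_id is not None:
            # Sequence lines are concatenated until the next header.
            chunks.append(line)
    if current_id is not None:
        records[current_id] = "".join(chunks)
    return records


# Hypothetical records standing in for a downloaded PEP file.
example_lines = [
    ">prot_1 hypothetical protein",
    "MKT",
    "LLV",
    ">prot_2 another protein",
    "MAA",
]
proteins = read_multifasta(example_lines)
```

To read a downloaded file instead of the in-memory example, pass an open file handle, for instance read_multifasta(open(path)), where path is wherever the file was saved locally.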

This page is updated monthly. Contact the DACC if you need
access to a previous version of the dataset.

HMP BLAST Server -
BLAST against all available HMP reference genomes or body-site-specific subsets.