Abstract

Viruses are the most abundant biological entities on Earth, but challenges in detecting, isolating, and classifying unknown viruses have prevented exhaustive surveys of the global virome. Here we analysed over 5 Tb of metagenomic sequence data from 3,042 geographically diverse samples to assess the global distribution, phylogenetic diversity, and host specificity of viruses. We discovered over 125,000 partial DNA viral genomes, including the largest phage yet identified, and increased the number of known viral genes by 16-fold. Half of the predicted partial viral genomes were clustered into genetically distinct groups, most of which included genes unrelated to those in known viruses. Using CRISPR spacers and transfer RNA matches to link viral groups to microbial host(s), we doubled the number of microbial phyla known to be infected by viruses, and identified viruses that can infect organisms from different phyla. Analysis of viral distribution across diverse ecosystems revealed strong habitat-type specificity for the vast majority of viruses, but also identified some cosmopolitan groups. Our results highlight an extensive global viral diversity and provide detailed insight into viral habitat distribution and host–virus interactions.

Acknowledgements

We thank A. Visel and H. Maughan for critical reading and feedback, A. Pati for help in earlier versions, and the IMG and GOLD teams for their support. This work was conducted by the US Department of Energy Joint Genome Institute, a DOE Office of Science User Facility, under contract number DE-AC02-05CH11231 and used resources of the National Energy Research Scientific Computing Center, supported by the Office of Science of the US Department of Energy.

Editorial Summary

A map of the viral world

Viruses influence virtually all of the biogeochemical processes occurring on our planet, but they remain enigmatic because they have proved difficult to detect, isolate and classify in large-scale studies. In recent years, however, a vast amount of metagenomic data has been collected, and Nikos Kyrpides and colleagues have now developed a computational approach to extract more detail from those data and create the first global map of viral biogeography. They explore the viral content of more than 3,000 metagenomic samples collected globally, identify more than 125,000 partial DNA viral genomes, including the largest known phage, and increase the number of known viral genes 16-fold.