10 years of Nature Protocols PERSPECTIVE

Over the last decade, technical advances in nucleic acid sequencing and mass spectrometry have enabled faster
and more informative metagenomic, metatranscriptomic, metaproteomic and metabolomic measurements. Here we
review key improvements in multi-omic environmental and human microbiome analyses, and discuss developments
required to address current measurement shortcomings.

Earth and Biological Sciences Directorate, Pacific Northwest National Laboratory, Richland, Washington, USA. Correspondence should be addressed to J.K.J. (janet.jansson@pnnl.gov) or E.S.B. (erin.baker@pnnl.gov).

Received 8 June; accepted 19 July; published online 29 September 2016; doi:10.1038/nprot.2016.148

[...] the planet’s biosphere. Although microorganisms are known to be responsible for key functions on Earth, such as carbon and nutrient cycling, and determining the health and disease state of the planet’s plant and animal inhabitants, more than 99% of the trillions of microbes thought to exist have yet to be discovered1. In addition, high microbial diversity has made it difficult to study specific functions carried out by complex microbial communities in microbiomes (defined as the totality of microorganisms and their collective genetic material present in a specific environment, such as all microorganisms inhabiting the soil or human gut)2,3. Fortunately, technological advances over the last few decades have greatly facilitated studies of complex microbiomes and their functions. Here we discuss advances related to nucleic acid sequencing and mass spectrometry (MS) analyses that have enabled the exploration and understanding of microbiomes across a range of environments as well as in our own bodies3–6.

Nucleic acid sequencing
Next-generation sequencing. At the forefront of advances in microbiome research lie the impressive increases in the speed and throughput of nucleic acid sequencing technologies. In particular, there has been a revolution in next-generation (NextGen) sequencing platforms, which have surpassed the traditional Sanger sequencing method that dominated the field for nearly three decades (from 1977 to 2005)7. Sequencing a single bacterial genome using the Sanger dideoxynucleotide-based chain-termination approach was previously a major endeavor that took years to complete8,9. The first bacterial genome to be completely sequenced using the Sanger approach was Haemophilus influenzae9 in 1995 (with Escherichia coli10 completed in 1997). Currently, [...] bacterial genomes in a matter of hours11.

NextGen sequencing methods have used several different high-throughput platforms. The first was the Roche GS20 454 sequencer, which was based on the polymerase cleavage of pyrophosphate, also known as pyrosequencing12,13. Although 454 sequencing was a key technological advance, and 454 sequencers including the GS20 and GS FLX series machines and reagents were used for over a decade (approximately 2005 to 2016, http://www.genomeweb.com/sequencing/roche-shutting-down-454-sequencing-business), it had several drawbacks, including the high cost of sequencing reagents, high homopolymer error rates (i.e., errors in reading through runs of identical bases), and surface area loading limitations owing to bead-based DNA molecule deposition that restricted the throughput and number of reads obtained. The second NextGen sequencer was the Solexa (now Illumina) Genome Analyzer (GA), which was introduced in 2006 and incorporated oligonucleotide array flow cells, reversible chain terminators and bridge PCR reactions14. This technology is now routinely used to sequence DNA and RNA extracted from human and environmental microbiomes and can generate >1.8 terabases (Tb) of data in a single run. However, the ultimate goal was to sequence >18,000 human genomes (~3 gigabase-pair (Gbp) haploid genome) per year at $1,000 per human genome (http://www.illumina.com/systems/hiseq-x-sequencing-system/system.html). Illumina currently has several technical platforms, including GA, MiSeq and HiSeq machines, with varying sequence read lengths (100-300 bp paired-end reads) and throughputs to address this challenge. For example, the maximum read length with overlapping paired reads on a MiSeq platform is ~500-550 bp, but that platform has lower throughput than the HiSeq platform, which generates billions of reads per run (Fig. 1). A relatively new approach developed by Illumina, called TruSeq synthetic long reads or Moleculo, results in longer read lengths (>8 kbp)15, and has facilitated the assembly of highly complex soil microbiomes16 and other biological samples17,18. Initial results from these technological advances are enhancing microbiome assembly into longer contigs16–18.
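The read-length and throughput figures quoted above can be related by simple arithmetic. The following is a minimal sketch, not a description of any vendor's software: the function names are hypothetical, the 50-100 base overlap between paired reads is an assumption chosen to illustrate how two 300 bp reads could yield a ~500-550 bp merged read, and the 30-fold sequencing depth is likewise an assumed value that does not come from the text.

```python
def merged_length(read_len: int, overlap: int) -> int:
    """Length of the fragment recovered by merging two equal-length
    paired-end reads that overlap by `overlap` bases."""
    if not 0 < overlap <= read_len:
        raise ValueError("overlap must be between 1 and read_len")
    # Each read contributes read_len bases; the overlap is counted once.
    return 2 * read_len - overlap


def genome_equivalents(bases_per_run: float, genome_size: float,
                       depth: float = 1.0) -> float:
    """Number of genomes of `genome_size` bases that one run could cover
    at the requested average depth (ignores quality filtering and bias)."""
    if genome_size <= 0 or depth <= 0:
        raise ValueError("genome_size and depth must be positive")
    return bases_per_run / (genome_size * depth)


if __name__ == "__main__":
    # 2 x 300 bp reads with an assumed 100 or 50 base overlap.
    print(merged_length(300, 100), merged_length(300, 50))
    # 1.8 terabases per run against a ~3 Gbp haploid human genome,
    # at 1-fold depth and at an assumed 30-fold depth.
    print(genome_equivalents(1.8e12, 3e9))
    print(genome_equivalents(1.8e12, 3e9, depth=30))
```

Under these assumptions, a longer overlap gives a more reliable merge but a shorter merged read, which is one way to see why the practical maximum sits below the nominal 600 bp of a 2 x 300 bp run.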