Advanced Sequencing Technology Awards 2011

The National Human Genome Research Institute (NHGRI) has continued its coordinated effort to support the development of technologies to dramatically reduce the cost of DNA sequencing, a move aimed at broadening the applications of genomic information in medical research and health care. The 2011 awards were announced on August 22, 2011 (see: NHGRI funds development of revolutionary DNA sequencing technologies).

Award Abstracts

Collaborators: David Deamer, UCSC, Jens Gundlach, University of Washington, Meni Wanunu, Northeastern University

The long-term objective of this project is rapid, highly accurate, and inexpensive sequencing of long (up to 150 kb) single DNA strands with a nanopore-based DNA sequencing device. Meeting this long-term objective requires precise control of DNA movement past a nanopore sequence detector, and improvement of nanopore sequence detector resolution between DNA bases. The specific aims of this proposal build upon progress made to date in these two areas by laboratories at the University of California, Santa Cruz (Akeson), the University of Washington (Gundlach), and Northeastern University (Wanunu). Individual laboratory expertise and knowledge will be integrated to accomplish four specific aims. Specific aim one extends promising results at UCSC with DNA polymerase Phi29, a "molecular step motor" able to precisely control DNA movement through a nanopore sequencer. This work will employ existing personnel and 12 years of success at UCSC with alpha hemolysin protein nanopores to extend understanding of Phi29 DNA polymerase function in a nanopore. Specific aim two evaluates Phi29 DNA polymerase function with two nanopores selected for their potentially superior base resolution compared with the alpha hemolysin nanopore. The first is MspA, a protein nanopore, which will be evaluated with Phi29 DNA polymerase by the University of Washington and UCSC teams. This collaboration takes advantage of expertise with Phi29 DNA polymerase at UCSC and expertise with MspA at the University of Washington. The second nanopore, a solid-state ultrathin silicon nitride pore with fluorescence detection, will be evaluated with Phi29 DNA polymerase by the Northeastern (makers of the solid-state nanopore) and UCSC teams. Specific aim three improves DNA base resolution through increased differences in current signals from individual bases. This will be achieved by increased salt concentrations in the nanopore combined with use of salt-tolerant DNA polymerases. DNA polymerases from salt-tolerant organisms will be isolated by extremophile experts currently at UCSC. Specific aim four will use all information gathered to generate proof of concept through sequencing of long (up to 48 kb) DNA strands using a nanopore sequencing device. Realization of this technology will provide the basis for a more complete understanding of individual genetic traits and predispositions in human and other populations.

To explore the possibility of a new way to massively sequence DNA, we propose to decorate the inside of a DNA polymerase with one or more fluorescent reporter probes that are predicted to actually touch the template bases, according to several X-ray crystal structures. The crystallographic structures show some of the amino acid residues in very different positions and at very different angles along the reaction coordinate. Similarly, each template base appears to be flipped out by 90 degrees before it is swung back into base-pairing position, reportedly stacking on a certain amino acid residue. If we can observe single molecules of positionally fixed, processive, specially fluorescent DNA polymerases as they synthesize DNA using normal dNTPs, we believe there will be some reproducible flickering, and that this flickering pattern might be empirically diagnostic of the identity of each template base in turn. Changes in signal level (quenching or enhancement), wavelength (a shift of the peak of fluorescence) and/or polarity are anticipated. Alternatively or additionally, if the spatial position and angle of the base-touching fluor is different for each template base, we expect useful alteration of the fluorescence of one or more FRET acceptor fluors positioned 10-15 angstroms away. If some form of our idea works, it could be relatively inexpensive, since no machined chips, no unusual dNTPs, and no PCR emulsions need be involved (although they would be allowed). A great deal of computational analysis of video (taken through a prism and/or a polarizing filter) should be involved. In one scenario, the subject DNA could be gapped genomic DNA, combed out, so that the analyzed DNA polymerase molecules at the gaps, lined up in a row, would indicate the relative genomic map position of the sequences that they are flashing.

We plan to build on our recently published work on DNA translocation through graphene nanopores (Merchant et al., Nano Lett. 10, 2915) and other preliminary results we describe in this application, to develop a DNA sensing technology based on measuring the current fluctuations of a graphene nanoribbon (GNR) as a single-stranded DNA molecule translocates through a pore in that ribbon. This geometry is anticipated to exhibit large electrical current changes for each nucleotide base due to the unique electrostatic potential associated with each nucleotide. These potentials modulate the charge density in the narrow ribbon, altering the corresponding GNR current levels.

In contrast to approaches that measure tunneling current through the DNA molecule, where experimentally reported conductance differences are on the order of 6 pS (Chang et al., Nano Lett. 10, 1070), the proposed GNR is continuous, with a large in-plane conductance. Base-to-base conductance differences are predicted to be on the order of 1-10 mS (Nelson et al., Nano Lett. 10, 3237). Graphene defects and scattering effects are likely to lower the practical device conductance, but scaling these predictions based on reported GNR studies suggests that base-to-base conductance differences of 1 µS could be achieved. The extra noise incurred by measuring at significantly higher bandwidth can be tolerated because the desired signals are so large. We anticipate that single-base resolution will be achievable at currently reported DNA translocation speeds. This eliminates the need for custom high-speed ultralow-noise electronics, as many off-the-shelf photodiode amplifiers for fiber optics are designed for these current and bandwidth ranges. It also removes the need to slow down or constrain the DNA molecule as it translocates, since the measurement speed is high enough to prevent Brownian fluctuations of the molecule from blurring the GNR signal.
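To give a feel for why the larger conductance differences matter for the readout electronics, the sketch below converts the two conductance figures quoted above (6 pS for tunneling, 1 µS for the GNR) into current differences using Ohm's law. The 100 mV bias is an assumption made only for illustration; the abstract does not state an operating voltage.

```python
def current_change(delta_conductance_s, bias_v):
    """Ohm's law: current change (A) caused by a conductance change (S) at a fixed bias (V)."""
    return delta_conductance_s * bias_v


BIAS_V = 0.1  # ASSUMPTION: 100 mV bias, chosen for illustration only

# Conductance differences quoted in the abstract
TUNNELING_DELTA_G = 6e-12  # 6 pS, reported for tunneling through DNA
GNR_DELTA_G = 1e-6         # 1 µS, the scaled estimate for a practical GNR

tunneling_pa = current_change(TUNNELING_DELTA_G, BIAS_V) * 1e12  # in picoamps
gnr_na = current_change(GNR_DELTA_G, BIAS_V) * 1e9               # in nanoamps
signal_ratio = GNR_DELTA_G / TUNNELING_DELTA_G                   # independent of bias

print(f"tunneling: {tunneling_pa:.2f} pA, GNR: {gnr_na:.1f} nA, ratio: {signal_ratio:.0f}x")
```

Under this assumed bias the GNR signal is in the nanoamp range rather than the sub-picoamp range, which is consistent with the abstract's argument that off-the-shelf amplifiers could be used.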

DNA expansion is a highly desired process that may benefit and enable several next generation DNA sequencing approaches. However, a practical, inexpensive DNA expansion process is currently not available. DNA expansion is a physical process in which the sequence information of an oligonucleotide is encoded into the high signal-to-noise reporters that are spaced along a polymeric backbone of a construct called an Xpandomer™. The Xpandomer reporters can be designed for a variety of detection methods but are ideally suited to nanopore detection. In this proposal, Stratos Genomics will demonstrate the feasibility of this approach.

In Stratos Genomics' DNA expansion process, four-nucleotide long probes (tetramer Xprobes™) are sequentially ligated along a target DNA in a novel ligation process. A selectively cleavable bond connects the 2nd and 3rd nucleotides of the Xprobe™. The two halves of each Xprobe™ are also connected via a long tether molecule. These tethers are user-defined molecules that comprise reporters for encoding the four nucleotides of the Xprobe™. Breaking of the cleavable bonds enables expansion of the polymer backbone. The resulting construct is called an Xpandomer™.

This proposal will build upon our work by transitioning from hexamer to tetramer based Xprobes™. To that end, the specificity and processivity of tetramer probe and tetramer Xprobe ligation will be investigated and quantified. This study will include testing with a complete tetramer probe library. Our two-year goal is to demonstrate serial ligation of loop-modified tetramers (tetramer Xprobes™) that are highly specific to the DNA target.
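One practical consequence of moving from hexamer to tetramer probes is the size of a complete probe library, which follows directly from the four-letter DNA alphabet. The sketch below enumerates such a library and tiles a short target with consecutive tetramers. It is an illustration only: the function names are hypothetical, and for simplicity probes are written as the bases they encode rather than as the complementary strand that would actually hybridize.

```python
from itertools import product

BASES = "ACGT"


def probe_library(k):
    """Return every k-mer over the four DNA bases (4**k sequences)."""
    return ["".join(p) for p in product(BASES, repeat=k)]


def split_into_probes(target, k=4):
    """Tile a target sequence with consecutive, non-overlapping k-mers.

    Trailing bases that do not fill a whole probe are dropped.
    """
    return [target[i:i + k] for i in range(0, len(target) - k + 1, k)]


tetramers = probe_library(4)
hexamers = probe_library(6)
print(len(tetramers), len(hexamers))    # library sizes for k=4 and k=6
print(split_into_probes("ACGTTGCAAC"))  # two full tetramers; the last two bases are dropped
```

A complete tetramer library has 256 members against 4,096 for hexamers, which makes exhaustive testing of every probe far more tractable.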

Nanopore sequencing is a technique in which DNA is driven electrophoretically through an orifice so small that each base must pass through one at a time. Translocation of thousands of bases of single stranded DNA has been demonstrated. If such long sequence runs could be read rapidly and accurately with no need for chemical reagents or the preparation of elaborate libraries, costs might be reduced to the point where personal genomes would become available for clinical use. Readouts based on the blockading of ion current have been able to resolve individual nucleotides and a single base trapped at a double-single strand junction in a hairpin but have not been able to read along a DNA molecule continuously. Very recently, we have shown that it is possible to identify individual bases and read along a DNA molecule using a technique we call Recognition Tunneling. Recognition molecules, covalently bound to electrodes, are used to transiently trap each base in turn through noncovalent bonds, giving distinct electronic signatures of all four bases and 5-methyl C. The trapping time with no external force applied to the DNA is long (seconds). However, unbinding is readily accelerated to very short times by the application of small forces, so Recognition Tunneling also provides a straightforward approach to translocation control. Here, we propose to combine Recognition Tunneling with nanopore translocation using metal or graphene nanopores, and metal or carbon nanotube reading electrodes, the probes and pores both being functionalized with recognition molecules. We will study translocation-control in functionalized, conducting nanopores, using both the bias across the pore and the surface potential of the conducting pore as control signals. Using a scanning-tunneling microscope (STM) platform, we will make measurements of Recognition Tunneling signals as DNA emerges from the nanopore. 
Multiscale (quantum to fluid-mechanical) simulations at Oak Ridge National Laboratory will help us to understand and optimize the translocation and readout processes. This understanding will be shared with collaborators who are developing nanopores with fixed (as opposed to STM) reading schemes with the ultimate goal of producing sequencing chips that are cheap and contain many thousands of devices.

Significant demand exists for the development of novel technologies capable of low-cost, high-quality DNA sequencing. Established sequencing techniques based on capillary array electrophoresis and cyclic array sequencing offer such analytical capability, and next-generation commercial sequencers deliver at a cost approaching $10,000/genome. One drawback is that these technologies identify in one segment (read) only about 10 to 1000 sequential base pairs out of the total 3 Gb in the human genome. The complex repetitive nature of DNA makes it costly and time consuming to completely and accurately reassemble a full genome. Recently, transmission electron microscopy (TEM) techniques have been proposed that label specific DNA bases with heavy atoms (e.g., osmium) and thus have the promise of significantly extending the length of individual reads. However, the accurate determination of the complete DNA sequence is complicated by the need for labeling and correlating the labeled and unlabeled bases. In addition, the relatively high electron energy used in high-resolution TEMs causes radiation damage that leads to read errors and limits the usable electron dose. Electron Optica proposes to develop a novel electron microscope capable of imaging a DNA base sequence of unlimited length at a cost of $1,000/genome with the high accuracy needed for full-scale sequencing. In this technique, which we call monochromatic aberration-corrected dual-beam low energy electron microscopy, two beams illuminate the sample with electrons having energies from 0 to a few hundred eV, and the reflected electrons are utilized to form a magnified image. The microscope includes a monochromator and aberration corrector, and has the potential of delivering images of unlabeled DNA with nucleotide-specific contrast. This simplifies sample preparation and eases the computational complexity needed to assemble the sequence from individual reads. 
In addition, at low landing energies there is no radiation damage, so high electron doses needed for high throughput and low cost can be used. The proposed research will focus on the feasibility of the key aspects required for this approach, i.e., achieving high spatial resolution, high throughput and DNA base-specific contrast. A detailed analysis of the column optics including the aberration corrector and monochromator will be performed using state-of-the-art simulation software. Analysis of the electronic structure of DNA bases will be carried out theoretically and experimentally and the achievable contrast will be evaluated. The proposed research will develop a new approach to low cost, high quality genome sequencing needed to enable the use of genomic information in individual health care.

Massively parallel technologies have reduced the per-base cost of DNA sequencing by several orders of magnitude. However, limited read lengths and a lack of methods to establish contiguity over even modest distances have prevented these technologies from achieving the high-quality, low-cost de novo assembly of mammalian genomes. Even as revolutionary sequencing technologies further mature, it may continue to be the case that the best technologies in terms of cost-per-base yield reads that are of an insufficient length or quality for the effective de novo assembly of large genomes. To address this critical need, we are exploiting high density, random, in vitro transposition as a novel means of physically shattering genomic DNA in creative ways that facilitate the recovery of contiguity information at different scales. Our project is divided into four aims, the first three of which are respectively directed at the development of massively parallel methods for determining short-range, mid-range, and long-range contiguity. These are: 1) a method for shattering genomic DNA with symmetric tags that post hoc inform the ordering of adjacent fragmentation events in a way that is entirely independent of the primary sequence content; 2) a method for massively parallel, in vitro barcoding of fosmid or BAC-sized subsequences of a genome, thereby facilitating hierarchical assembly; 3) an in situ method for converting stretched DNA molecules into adaptor-flanked libraries, such that reads generated by massively parallel sequencing will remain linearly ordered in terms of the XY coordinates at which they originate. In the fourth aim, we will integrate these methods to demonstrate: 1) the highly cost-effective de novo assembly of the mouse genome with a quality that exceeds that of the original assembly; 2) the highly cost-effective haplotype resolved resequencing of a human genome.
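The second method above, barcoding fosmid- or BAC-sized subsequences so that reads can be assembled hierarchically, amounts computationally to partitioning reads by barcode before assembling each partition on its own. The sketch below shows only that grouping step. The `(barcode, sequence)` tuple layout and the function name are assumptions made for illustration and are not taken from the project.

```python
from collections import defaultdict


def group_reads_by_barcode(reads):
    """Partition reads by barcode.

    reads: iterable of (barcode, sequence) tuples (hypothetical layout).
    Returns a dict mapping each barcode to its sequences, in input order.
    """
    groups = defaultdict(list)
    for barcode, sequence in reads:
        groups[barcode].append(sequence)
    return dict(groups)


# Toy input: two barcoded pools, reads interleaved as a sequencer might emit them
example_reads = [
    ("BC01", "ACGTAC"),
    ("BC02", "TTGACA"),
    ("BC01", "GTACCA"),
]
pools = group_reads_by_barcode(example_reads)
print(pools)
```

Each pool would then be assembled independently, so the assembler only has to resolve repeats within one fosmid- or BAC-sized region at a time.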

While the cost of DNA sequencing has dropped significantly over the last few years due to the evolution of next-generation sequencing instruments, there still exists the need to produce new technologies that can significantly reduce sequencing cost and time and improve the level of automation to realize the ability of transitioning DNA sequencing into currently inaccessible areas, such as the clinic for in vitro diagnostics. In fact, reaching the goals mandated by the $1,000 Genome Project will provide the ability to use DNA sequencing as a de facto standard for looking at any sequence variation over the entire genome. The long term goal of this project is to generate a novel DNA sequencing platform that can substantially reduce the cost, labor and time associated with acquiring DNA sequencing information using a fully automated platform. The strategy uses nano-scale sensors that read the identity of mononucleotide bases from their characteristic flight-time through a 2-dimensional (2D) nanochannel (<10 nm in width and depth; >5 µm in length) fabricated in a thermoplastic, such as Plexiglas, via low-cost nanoimprint lithography and other replication-based techniques. The mononucleotide bases are generated from an intact DNA fragment (~50,000 bp) using a processive exonuclease, which is covalently anchored to a support contained within a bioreactor that feeds the mononucleotides into the 2D nanochannel. The identity of the mononucleotide is deduced from a molecular-dependent flight-time through the 2D nanochannel. The major focus of this R21 application is to develop a transduction modality that can measure the flight-time of mononucleotides through a 2D polymer nanochannel without requiring a reporter molecule covalently attached to the mononucleotide. The transducer to be investigated consists of 2 pairs of nanoelectrodes poised at each end of the 2D nanochannel with the signal resulting from perturbations in the conductivity induced by the mononucleotide. 
The sensing platform is produced from nanowires built using templating methods from anodized aluminum oxide materials and then electrochemically thinned to the desired diameter (~10 nm). The wires are strategically placed on a nanofluidic chip using chemical patterns made via nanoimprint lithography with the required gap (<10 nm) generated via mechanical or chemical steps. The nanosensor chips are produced on a plastic module that can be integrated via novel interconnect technologies to other DNA processing modules to provide complete automation of the DNA sample processing pipeline. The envisioned DNA sequencing platform will produce ~1 x 10^6 nucleotide base reads per second when configured in an arrayed format, process an entire sample in a fully automated fashion with the cost of the modular fluidic system

The summary challenge for a next-generation sequencing technology is reading out the letters of the genetic code corresponding to the three billion base pairs in a human genome, with no loss of information and at very low read-out cost per genome. The sheer complexity of the task has led to one common approach in third-generation sequencing technologies: discretization, or piecemeal sequencing. These next-generation methods include high-throughput massively parallel approaches, sequencing by synthesis, sequencing by degradation, sequencing by hybridization, and nanopore-based techniques, among others. In the case of nanopore-based sequencing technologies, both biological and solid-state, another common strategy has been to significantly slow down the translocation of DNA through a nanopore to aid the probing of the DNA chain. Strategies such as these, which have been employed to get a handle on the sequencing challenge, in turn increase the cost of sequencing the genome. We propose a novel nanopore device concept that is able to sequence the genome at very high translocation speeds. The novel technology does not require slowing down DNA translocation through the nanopore but is inherently able to transduce base-sequence information at speeds above 100,000 bases per second. Further, the platform technology enables low-cost device manufacturing, potentially offering a high-speed, very low-cost solution to sequencing. We propose the novel device concept, methods of fabrication, and exploratory studies towards realizing high-speed DNA information readout. We further propose device physical modeling and numerical simulations from first principles to aid rational design and to validate the proposed solution for the $1,000 genome.
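The two figures quoted in this abstract, three billion base pairs and a readout speed of 100,000 bases per second, can be combined into a rough estimate of how long a single pore would need to read one genome. The sketch below does that arithmetic. It assumes a single pore reading continuously with no overhead, and the `coverage` parameter is an added assumption for illustration, since the abstract does not state a coverage target.

```python
def hours_to_read(genome_bases, bases_per_second, coverage=1):
    """Hours for one continuously running pore to read a genome `coverage` times over."""
    seconds = genome_bases * coverage / bases_per_second
    return seconds / 3600


GENOME_BASES = 3_000_000_000  # three billion base pairs, as stated in the abstract
READ_RATE = 100_000           # bases per second, the lower bound quoted in the abstract

single_pass_hours = hours_to_read(GENOME_BASES, READ_RATE)
thirty_fold_hours = hours_to_read(GENOME_BASES, READ_RATE, coverage=30)  # ASSUMED coverage

print(f"1x: {single_pass_hours:.2f} h, 30x: {thirty_fold_hours:.0f} h")
```

At the quoted rate a single pass takes a little over eight hours on one pore, so any redundancy in coverage would have to come from running many pores in parallel or from longer run times.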