The amplification of DNA fragments, cloned between user-defined 5′ and 3′ end sequences, is a prerequisite step in many current applications, including massively parallel sequencing (MPS). Here we describe an improved method, called homopolymer tail-mediated ligation PCR (HTML-PCR), that requires very little starting template and minimal hands-on effort, is cost-effective, and is suited for use in high-throughput and robotic methodologies. HTML-PCR starts with the addition of homopolymer tails of controlled lengths to the 3′ termini of a double-stranded genomic template. The homopolymer tails enable the annealing-assisted ligation of a hybrid oligonucleotide to the template's recessed 5′ ends. The hybrid oligonucleotide has a user-defined sequence at its 5′ end. This primer, together with a second primer composed of a longer region complementary to the homopolymer tail and fused to a second 5′ user-defined sequence, is used in a PCR to generate the final product. The user-defined sequences can be varied to enable compatibility with a wide variety of downstream applications. We demonstrate our new method by constructing MPS libraries starting from nanogram and sub-nanogram quantities of Vibrio cholerae and Streptococcus pneumoniae genomic DNA.

Cloning DNA fragments as molecular libraries has become a core method used in many research, forensic, and clinical settings. Common approaches for molecular library construction involve the ligation of double-stranded adapters of defined sequence to template DNA ends followed by PCR amplification (1-3). Due in part to the poor efficiency of the adapter ligation reaction, these techniques require large quantities of starting template DNA. Moreover, they are prone to the formation of adapter-dimers, an inhibitory side reaction that necessitates the purification of the DNA products of interest by gel electrophoresis and extraction. This requirement is a particular hindrance when such protocols are adapted for high-throughput robotic 96-well and 384-well plate-based methods, because it limits the number of different libraries that can be created at one time.

More recently, in vitro transposition has been used to facilitate library construction (4). This technology, referred to as Nextera, is marketed as a kit to create Illumina sequencing libraries. Sample DNA is first subjected to a transposition reaction in which the transposase inserts transposon end/Illumina sequence chimeric double-stranded DNA molecules into the sample. Two such transposition events, separated in the sample by roughly 50–500 nucleotides, then serve as the template in a PCR that creates the final library. Compared with adapter ligation protocols, Nextera has several major advantages. Its workflow is much faster; it is far less labor-intensive; and it is better suited to high-throughput methods. Nextera also requires significantly less starting template. The major disadvantages of this method are: (i) Nextera is expensive; (ii) kits are no longer marketed to construct libraries either for other sequencing platforms or for non-sequencing applications, and, since the technology is proprietary, homemade kits are not possible; and (iii) the ratio of transposase complexes to sample DNA is critical for obtaining the correct number of transposition events separated by the appropriate distances. For this reason, two different kits are available, one that uses 50 ng of template and another that uses 1 ng. Thus, the user must have accurate knowledge of sample concentration; however, for very dilute samples such knowledge may be inaccurate or lacking. A final disadvantage of the Nextera approach is the requirement that the template DNA be at least 300 nucleotides in length (preferably longer); hence, Nextera is not recommended for use in applications such as ChIP-seq (Nextera DNA Sample Preparation Guide; October 2011).

Here we developed a new method, HTML-PCR, which is unencumbered by the deficiencies associated with many other procedures. Compared with Nextera, it is more cost-effective, uses generally available reagents, and can be used to generate libraries for applications in addition to Illumina sequencing. Furthermore, the same HTML-PCR protocol functions with template DNA concentrations that can vary by up to five orders of magnitude and with template molecules of no minimum length. Compared with adapter ligation protocols, HTML-PCR is more efficient and streamlined, and is less labor-intensive. Adapters are not used, which avoids the problem of adapter-dimers and the need for gel purification in those applications where specific size ranges are not required. Thus, HTML-PCR is more compatible with high-throughput and robotic methods. In addition, the method uses an extremely efficient ligation reaction that is facilitated by annealing of an oligonucleotide to a homopolymer tail. As a result, HTML-PCR can be used to clone minuscule amounts of DNA that are below the amount of starting material needed for adapter ligation protocols.

Materials and methods

Figure 1 illustrates the application of HTML-PCR for capturing and amplifying double-stranded DNA. Sample DNA is first fragmented into a size range that is appropriate for the downstream application. The ends of the DNA are blunted and 5′ ends are phosphorylated to allow for later ligation. A homopolymer tail (e.g., poly(dC)) of controlled length is added to the 3′ termini using terminal deoxynucleotidyl transferase (TdT) and a mixture of deoxynucleotide triphosphate (e.g., dCTP) and chain-terminating dideoxynucleotide triphosphate (e.g., ddCTP). For oligo(dC) tailing, an average tail length of 20 nucleotides was achieved by adjusting the ratio of dCTP to ddCTP to 19:1 (Supplementary Figure S1) (5).
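
The relationship between the dCTP:ddCTP ratio and the average tail length can be illustrated with a simple chain-termination model. The sketch below is not part of the published protocol. It assumes that TdT incorporates dCTP and ddCTP with equal efficiency, that each addition is independent, and that the reported tail length counts the terminating dideoxynucleotide. Under those assumptions the tail length is geometrically distributed, and a 19:1 ratio gives a mean of 20 nucleotides. The function names are hypothetical and chosen only for illustration.

```python
import random


def expected_tail_length(dntp_to_ddntp_ratio: float) -> float:
    """Mean tail length (terminating ddNMP included) under the model.

    Assumption: each added nucleotide is a chain terminator with
    probability p = 1 / (ratio + 1), so the mean of the resulting
    geometric distribution is 1 / p = ratio + 1.
    """
    return dntp_to_ddntp_ratio + 1.0


def simulate_tail_lengths(dntp_to_ddntp_ratio: float,
                          n_molecules: int,
                          seed: int = 0) -> list:
    """Simulate tail lengths for n_molecules independent 3' ends."""
    rng = random.Random(seed)
    p_terminate = 1.0 / (dntp_to_ddntp_ratio + 1.0)
    lengths = []
    for _ in range(n_molecules):
        length = 1  # count the nucleotide currently being added
        # Keep extending until a chain terminator is incorporated.
        while rng.random() >= p_terminate:
            length += 1
        lengths.append(length)
    return lengths


if __name__ == "__main__":
    ratio = 19.0  # dCTP:ddCTP ratio of 19:1, as in the text
    tails = simulate_tail_lengths(ratio, n_molecules=20000)
    print("model mean:", expected_tail_length(ratio))
    print("simulated mean:", sum(tails) / len(tails))
```

The same model implies a broad distribution of individual tail lengths around the mean, so the ratio sets the average rather than a fixed length. Any difference in how efficiently the enzyme uses the two nucleotides would shift the effective ratio, which is one reason the tail length is best confirmed empirically.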