Abstract:

Provided are methods for sequencing a nucleic acid that include fixing a
template to a surface through a template localizing moiety and sequencing
the nucleic acid with a sequencing enzyme, e.g., a polymerase or
exonuclease. The sequencing enzyme can optionally be exchanged with a
second sequencing enzyme, which continues the sequencing of the nucleic
acid. The template localizing moiety can optionally anneal with the
nucleic acid and/or associate with the sequencing enzyme. Also provided
are compositions comprising a nucleic acid fixed to a surface via a
template localizing moiety, and a first sequencing enzyme, which can
sequence the nucleic acid and optionally exchange with a second
sequencing enzyme present in the composition. Compositions in which a
template localizing moiety is immobilized on a surface are provided.
Compositions for sequencing reactions are provided. Also provided are
sequencing systems comprising reaction regions in which or near which
template localizing moieties are immobilized.

Claims:

1. A method of sequencing a nucleic acid, the method comprising: fixing a
template nucleic acid to a solid surface through a template localizing
moiety; sequencing a portion of at least one strand of the template
nucleic acid with a first sequencing enzyme; exchanging the first
sequencing enzyme with a second sequencing enzyme; and, continuing
sequencing of the strand with the second sequencing enzyme.

2. The method of claim 1, wherein the moiety topologically encircles the
template.

3-4. (canceled)

5. The method of claim 1, wherein the first sequencing enzyme is a first
polymerase, the second sequencing enzyme is a second polymerase, and the
template nucleic acid is a circular template nucleic acid.

6. The method of claim 5, further comprising sequencing the template
nucleic acid multiple times with a plurality of polymerases to generate a
single nucleic acid strand comprising multiple copies of a polynucleotide
complementary to the template nucleic acid.

7. A composition, comprising: a template nucleic acid tethered to a solid
surface through a template localizing moiety; a first sequencing enzyme,
wherein the first sequencing enzyme is capable of sequencing the template
nucleic acid; wherein the moiety permits the first sequencing enzyme to
be exchanged with a second sequencing enzyme present in the composition,
wherein the second sequencing enzyme is capable of continuing the
sequencing of the template nucleic acid.

8. The composition of claim 7, wherein the moiety comprises a polymer
comprising a polypeptide, a polynucleotide, one or more synthetic
structural units, or a combination thereof.

9-10. (canceled)

11. The composition of claim 8, wherein the polynucleotide comprises a
nucleotide sequence complementary to a portion of the template nucleic
acid.

12. The composition of claim 8, wherein the first sequencing enzyme is a
polymerase capable of strand displacement of the polynucleotide from the
template.

16. The composition of claim 7, wherein the first sequencing enzyme is a
first polymerase, the second sequencing enzyme is a second polymerase,
and the template nucleic acid is a circular template nucleic acid.

17-18. (canceled)

19. The composition of claim 7, wherein the sequencing enzyme is
non-covalently attached to the moiety.

20. (canceled)

21. The composition of claim 7, wherein the composition comprises one or
more fluorescently labeled nucleotides or nucleotide analogs that can
photodamage the sequencing enzyme.

22. A composition comprising a template nucleic acid and a template
localizing moiety that is not a sequencing enzyme immobilized on a planar
surface, in a well, or in a single molecule reaction region, wherein the
template localizing moiety encircles the template nucleic acid.

23. (canceled)

24. The composition of claim 22, wherein the moiety comprises a
polypeptide, a polynucleotide, one or more synthetic structural units, or
a combination thereof.

25-28. (canceled)

29. The composition of claim 22, wherein the moiety that topologically
encircles the template nucleic acid comprises a polynucleotide portion
that is complementary to at least a portion of the template nucleic acid.

30. (canceled)

31. The composition of claim 24, wherein at least some of the synthetic
structural units are polyethylene glycol units.

32. The composition of claim 22, wherein the single molecule reaction
region comprises a zero-mode waveguide.

33-35. (canceled)

36. The composition of claim 22, wherein the template nucleic acid is a
closed loop.

37. The composition of claim 22, wherein the composition comprises a
sequencing enzyme.

38. (canceled)

39. The composition of claim 37, wherein the sequencing enzyme is
covalently or non-covalently attached to the moiety.

40-45. (canceled)

46. The composition of claim 37, wherein the composition is a sequencing
reaction and further comprises a synthesis initiating moiety that
complexes with or is integral to the template nucleic acid.

47-63. (canceled)

64. A method of sequencing a template nucleic acid, the method
comprising: fixing a circular template nucleic acid to a solid surface
through a template localizing moiety; annealing an oligonucleotide primer
to the template nucleic acid; initiating template-directed nascent strand
synthesis by a polymerase that is not immobilized to the solid
surface; synthesizing a nascent strand complementary to the template
nucleic acid with the polymerase; and detecting incorporations of
nucleotides into the nascent strand, wherein a temporal sequence of the
incorporations is indicative of the sequence of the nucleic acid.

65-68. (canceled)

69. The method of claim 64, further comprising sequencing the template
nucleic acid multiple times to generate a single nascent strand
comprising multiple copies of a polynucleotide complementary to the
template nucleic acid.

70. The method of claim 64, wherein the polymerase is a plurality of
polymerase enzymes, and further wherein only a single one of the plurality
is engaged in the template-directed nascent strand synthesis on the
template nucleic acid at a given time.

Description:

CROSS-REFERENCE TO RELATED APPLICATIONS

[0001]This application claims the benefit of U.S. Provisional Application
No. 61/192,634, filed Sep. 19, 2008, the disclosure of which is
incorporated herein by reference in its entirety for all purposes.

BACKGROUND OF THE INVENTION

[0002]Nucleic acid sequence data is valuable in myriad applications in
biological research and molecular medicine, including determining the
hereditary factors in disease, in developing new methods to detect
disease and guide therapy (van de Vijver et al. (2002) "A gene-expression
signature as a predictor of survival in breast cancer," New England
Journal of Medicine 347: 1999-2009), and in providing a rational basis
for personalized medicine. Obtaining and verifying sequence data for use
in such analyses has made it necessary for sequencing technologies to
undergo advancements to expand throughput, lower reagent and labor costs,
and improve accuracy (See, e.g., Chan, et al. (2005) "Advances in
Sequencing Technology" (Review) Mutation Research 573: 13-40, and Levene
et al. (2003) "Zero Mode Waveguides for Single Molecule Analysis at High
Concentrations," Science 299: 682-686), the disclosures of which are
incorporated herein in their entireties for all purposes.

[0003]Single molecule real-time sequencing (SMRT) is a highly parallel
sequencing-by-synthesis technology that permits the simultaneous
surveillance of, e.g., thousands of sequencing reactions in arrays of
multiplexed detection volumes, e.g., zero-mode waveguides (ZMWs). (See
e.g., Levene et al. (2003) Zero-mode waveguides for single-molecule
analysis at high concentrations, Science 299:682-686; Eid, et al. (2009)
Real-Time DNA Sequencing from Single Polymerase Molecules, Science
323:133-138; Published U.S. Patent Application No. 2003/0044781; and U.S.
Pat. No. 6,917,726, the disclosures of which are incorporated herein in
their entireties for all purposes). Each detection volume in an array
creates an illuminated visualization chamber that is small enough to
observe the template-dependent synthesis of a single single-stranded DNA
molecule by a single DNA polymerase.

[0004]When a particular base in the template strand is encountered by the
polymerase during the polymerization reaction, e.g., in a ZMW, the enzyme
complexes with an available fluorescently labeled nucleotide or
nucleotide analog and incorporates that nucleotide or nucleotide analog
into the nascent growing nucleic acid strand. During this time, the
fluorophore emits fluorescent light whose color corresponds to the
nucleotide's or analog's base identity. The polymerase cleaves the bond
linking the fluorophore to the nucleotide or analog during the nucleotide
incorporation cycle, permitting the dye to diffuse out of the detection
volume. The signal returns to baseline, and the process repeats.
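
The readout described above can be illustrated with a minimal Python sketch. It assumes each detected fluorescence pulse has already been reduced to a start time and a color channel; the `Pulse` type, the channel names, and the channel-to-base assignment are hypothetical placeholders rather than part of the disclosure. Ordering the pulses in time gives the nascent strand read, and the template sequence is inferred as its reverse complement.

```python
from typing import Iterable, NamedTuple


class Pulse(NamedTuple):
    """One detected fluorescence pulse (hypothetical representation)."""
    start_time: float  # seconds from the start of observation
    channel: str       # color channel in which the pulse was detected


# Hypothetical color-to-base assignment; an actual dye set may differ.
CHANNEL_TO_BASE = {"ch1": "A", "ch2": "C", "ch3": "G", "ch4": "T"}
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def nascent_strand_read(pulses: Iterable[Pulse]) -> str:
    """Order pulses in time and translate each color into its base."""
    ordered = sorted(pulses, key=lambda p: p.start_time)
    return "".join(CHANNEL_TO_BASE[p.channel] for p in ordered)


def inferred_template(read: str) -> str:
    """Return the reverse complement of the nascent strand read."""
    return "".join(COMPLEMENT[base] for base in reversed(read))


if __name__ == "__main__":
    pulses = [Pulse(2.0, "ch3"), Pulse(0.5, "ch1"), Pulse(1.2, "ch2")]
    read = nascent_strand_read(pulses)
    print(read, inferred_template(read))
```

The sketch ignores missed pulses, overlapping pulses, and background noise, all of which a practical base caller would need to handle.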

[0005]A single molecule sequencing reaction is typically localized to a
detection volume by immobilizing a DNA polymerase enzyme within or
proximal to the site at which the reaction takes place. Ideally, the
immobilized polymerase retains its activity and can be used repeatedly
and continuously in multiple sequencing reactions. However, it has been
observed that in some cases, the processivity, accuracy, and/or activity
of the polymerase enzyme can decrease. In particular, in at least some
cases, damage to the DNA polymerase, e.g., by exposure to optical energy
during fluorescent or chemiluminescent detection, can have a detrimental
effect on the enzyme's activity.

[0006]Current strategies for single molecule sequencing-by-synthesis
employ a polymerase that has been tethered within or proximal to a
reaction region within a detection volume, e.g., in a ZMW. What is needed
in the art are new methods and compositions that can maintain the
processivity, accuracy, and polymerase activity in, e.g., a
single-molecule sequencing reaction, while still localizing the
polymerization reaction to a defined observation volume. The invention
described herein fulfills these and other needs, as will be apparent upon
review of the following.

SUMMARY OF THE INVENTION

[0007]In certain aspects, the present invention provides methods and
related compositions useful for immobilizing a template nucleic acid (or
"nucleic acid template") at a reaction region. The compositions include a
template localizing moiety that is covalently attached to a surface,
e.g., a single molecule reaction region. The moiety can associate with a
template nucleic acid, e.g., a DNA, RNA, or analogs or derivatives
thereof, present in the composition and fix the template to the surface,
e.g., localizing the nucleic acid to the surface. A sequencing enzyme,
e.g., a polymerase, reverse transcriptase, exonuclease, etc., can
optionally associate with the template localizing moiety and perform
template-directed sequencing of the template nucleic acid. In preferred
embodiments, the sequencing enzyme can exchange with other sequencing
enzymes present in the composition without disrupting or terminating
sequencing of the template, thus permitting, e.g., a photodamaged
sequencing enzyme to exchange with a non-photodamaged sequencing enzyme.
Immobilizing a nucleic acid template via a template localizing moiety can
advantageously allow longer uninterrupted sequence reads in, e.g.,
synthesis- or degradation-based single-molecule sequencing reactions. In
certain aspects, the present invention provides methods and related
compositions useful for performing template-directed synthesis of a
nucleic acid. In certain aspects, the invention provides methods and
related compositions for performing exonuclease sequencing of a nucleic
acid.

[0008]Thus, in a first aspect, the invention provides methods of
performing template-directed synthesis of a nucleic acid that include
fixing a template nucleic acid to a solid surface through a template
localizing moiety, e.g., that topologically encircles the template. The
template localizing moiety can be a polymer, including but not limited to
a polypeptide (e.g., other than a polymerase to be used in the
template-directed synthesis reaction), polynucleotide, synthetic polymer,
and combinations thereof. The methods include synthesizing a nascent
strand from at least a portion of the template nucleic acid with a first
polymerase, exchanging the first polymerase with a second polymerase, and
continuing synthesis of the nascent strand with the second polymerase.
Optionally, exchanging the first polymerase can include exchanging a
photodamaged polymerase with a polymerase that is not photodamaged, and
synthesis can optionally be continued with the second, non-photodamaged
polymerase. Such embodiments can further comprise a template nucleic acid
that is circular. In certain preferred embodiments the template nucleic
acid is subjected to the template-directed synthesis reaction multiple
times with one or more polymerases to generate a single nucleic acid
strand comprising multiple copies of a polynucleotide complementary to
the template nucleic acid.

[0009]In a further aspect, the invention provides methods of performing
exonuclease sequencing of a nucleic acid that include fixing a template
nucleic acid to a solid surface through a template localizing moiety,
e.g., a polypeptide other than a polymerase or other polymer that
topologically encircles the template. The methods include degrading a
first strand of the template nucleic acid with a first exonuclease and
detecting the nucleotides so released, exchanging the first exonuclease
with a second exonuclease, and continuing degradative sequencing of the
first strand with the second exonuclease. Optionally, exchanging the
first exonuclease can include exchanging a photodamaged exonuclease with
an exonuclease that is not photodamaged, and degradation can optionally
be continued with the second, non-photodamaged exonuclease.

[0010]In a related aspect, the invention provides compositions that can be
used in the methods described above. The compositions include a template
nucleic acid tethered to a solid surface through a template localizing
moiety, e.g., a moiety that topologically encircles the template, and a
first sequencing enzyme capable of sequencing the template nucleic acid.
The template localizing moiety can comprise a polymer (natural or
synthetic), e.g., a polypeptide, polynucleotide, synthetic polymer, and
analogs, derivatives, mimetics, and combinations thereof. In certain
specific embodiments, the template localizing moiety comprises a protein,
e.g., a hexameric helicase, a PCNA, a T4 phage gp45 protein, or a β
subunit of a eubacterial DNA polymerase. In other specific embodiments,
the template localizing moiety comprises a polynucleotide comprising a
nucleotide sequence complementary to a portion of the template nucleic
acid, and the first sequencing enzyme is a polymerase capable of strand
displacement of the polynucleotide from the template nucleic acid. In
certain embodiments, the first sequencing enzyme is a first polymerase,
e.g., capable of synthesizing a nascent strand based on the nucleotide
sequence of the template nucleic acid, and the template localizing moiety
permits the first polymerase to be exchanged with a second polymerase
present in the composition without terminating template-directed
synthesis, e.g., the second polymerase is capable of continuing the
sequencing of the template nucleic acid. The polymerase can optionally
be, e.g., a DNA or RNA polymerase, e.g., a Klenow fragment, Φ29, AMV,
B103, GA-1, HIV-1, PZA, Φ15, BS32, M-MLV, M2Y, Nf, G1, Cp-1, PRD1,
PZE, SF5, Cp-5, Cp-7, PR4, PR5, PR722, L17, T4, an Archaeal, a Eukaryal,
or a Eubacterial polymerase, or mutations or modified versions thereof.
Optionally, the template nucleic acid may be single-stranded or circular,
and in some preferred embodiments is both single-stranded and circular.
Optionally, the polymerase present in the compositions can be
non-covalently attached to the template localizing moiety.

[0011]The compositions can optionally include ATP, CTP, GTP, TTP, UTP or
ITP, which can modulate the rate of polymerization in a
concentration-dependent manner, e.g., when the template localizing moiety
and the polymerase participate in a template-dependent polymerization
reaction. The compositions can optionally include one or more
fluorescently labeled nucleotides or nucleotide analogs that can
photodamage the polymerase. In some embodiments, the template localizing
moiety is not susceptible to photo-induced damage caused by the one or
more fluorescently labeled nucleotides or nucleotide analogs.

[0012]Compositions that include a template localizing moiety immobilized
on a planar surface, in a well, or in a single molecule reaction region,
e.g., a zero-mode waveguide, are also provided by the invention. The
immobilized moiety can optionally comprise, e.g., a polymer (e.g.,
natural or synthetic) including but not limited to a polynucleotide
and/or a polypeptide, e.g., a protein other than a polymerase, such as a
processive nuclease, a single-strand binding protein (SSBP), a helicase,
a DNA repair enzyme, a DNA processivity factor, or a protein that
non-specifically binds a double-stranded nucleic acid. The template
localizing moiety can optionally topologically encircle a template DNA
strand when a DNA strand is present in the composition. The template
localizing moiety that topologically encircles the template can
optionally comprise a PCNA, a T4 phage gp45 protein, a β subunit of
a eubacterial polymerase, one or more synthetic structural units, and/or
a polynucleotide, where the polynucleotide optionally comprises a portion
that is complementary to at least a portion of the template nucleic acid.
In certain preferred embodiments, the template localizing moiety that
topologically encircles the template comprises at least one
polynucleotide portion and at least one portion comprising synthetic
structural units, e.g., at least some of which are polyethylene glycol
units. The compositions can optionally include a template DNA, e.g., a
single-stranded DNA and/or a closed loop of DNA, which the template
localizing moiety can associate with and/or retain, and fix to the planar
surface, in a well, or in a single molecule reaction region, e.g.,
comprising a zero-mode waveguide.

[0013]Compositions in which a template localizing moiety is immobilized to
a planar surface, well, or single-molecule reaction region can optionally
include a sequencing enzyme, e.g., an exonuclease (e.g., T7 exonuclease,
lambda exonuclease, mung bean exonuclease, ExoI, Exo III, Exo IV, ExoVII,
exonuclease of Klenow fragment, exonuclease of PolI, Taq exonuclease, T4
exonuclease, etc.) or DNA polymerase (e.g., a Klenow fragment, Φ29,
B103, GA-1, PZA, Φ15, BS32, M2Y, Nf, G1, Cp-1, PRD1, PZE, SF5, Cp-5,
Cp-7, PR4, PR5, PR722, or L17 polymerase). Optionally, the sequencing
enzyme can be non-covalently attached to the moiety, or it can be
covalently attached to the moiety, e.g., via a DNA polymerase's
C-terminal end. The template localizing moiety can optionally improve the
accuracy and/or processivity of the sequencing enzyme, when the moiety
and the sequencing enzyme participate in a nucleic acid sequencing
reaction, e.g., a sequencing-by-synthesis reaction or degradation-based
sequencing reaction. These compositions can optionally include ATP, CTP,
GTP, TTP, UTP or ITP, and/or one or more fluorescently labeled
nucleotides or nucleotide analogs, as described above.

[0014]In certain embodiments, the invention provides sequencing reactions
that include a nucleic acid template, a synthesis initiating moiety that
complexes with or is integral to the template, a DNA polymerase, and a
template localizing moiety immobilized on a substrate, e.g., a planar
surface, well, or single molecule reaction region, e.g., a zero-mode
waveguide. The DNA polymerase of the sequencing reaction can optionally
associate with the immobilized template localizing moiety. The polymerase
and the template localizing moiety can optionally be non-covalently
attached. Optionally, the DNA polymerase can be covalently attached to
the moiety, e.g., via the polymerase's C-terminal end.

[0015]In certain embodiments, the invention provides sequencing reactions
that include a nucleic acid template, a synthesis initiating moiety that
complexes with or is integral to the template, a DNA polymerase, and a
template localizing moiety immobilized on a substrate, which can comprise
a planar surface, a well, and/or a single molecule region, e.g., a
zero-mode waveguide. In certain embodiments, the sequencing reactions
provided herein further comprise a luciferase-based detection system for
monitoring pyrophosphate release. The DNA polymerase or components of the
luciferase-based detection system (e.g., luciferase, sulfurylase, etc.)
can optionally associate (covalently or non-covalently) with the
immobilized template localizing moiety.

[0016]The sequencing reactions provided by the invention can optionally
include one or more fluorescently labeled nucleotides or nucleotide
analogs. A polymerase present in the sequencing reaction can optionally
synthesize a complementary nascent strand from at least a portion of the
template in a template-dependent manner, optionally incorporating one or
more fluorescently labeled nucleotides or nucleotide analogs into the
resulting nascent strand. In certain embodiments, the sequencing reaction
comprises a pool of nucleic acid templates, and optionally, the template
localizing moiety (or plurality thereof) comprises a polynucleotide
complementary to only one or a subset of the nucleic acid templates in
the pool. The polymerase can be non-covalently or covalently attached to
the template localizing moiety, e.g., at a C-terminal portion of the
polymerase.

[0017]In a related aspect, the invention provides sequencing systems that
include a reaction region, e.g., a planar surface, one or more wells, or
one or more single molecule reaction regions, and a template localizing
moiety immobilized within or proximal to the reaction region. Optionally,
the single-molecule reaction region included in the systems can be a
zero-mode waveguide. Optionally, the systems can include a sequencing
enzyme (e.g., a polymerase or nuclease) in the reaction region. The
template localizing moiety in the systems can optionally be configured to
interact with a sequencing enzyme, when a sequencing enzyme is present in
the reaction region. The sequencing enzyme and the template localizing
moiety can optionally be covalently attached or non-covalently attached,
as described above.

[0018]The systems of the invention also include a detector configured to
detect a sequencing product formed in the reaction region. A sequencing
product of the invention includes but is not limited to a newly
synthesized nucleic acid strand ("nascent strand"), released
pyrophosphate, and nucleotides released by exonuclease degradation. The
detector can optionally be configured to detect fluorescent light from
one or more fluorophores that are, e.g., linked to a nucleotide or
nucleotide analog. The system can optionally comprise an epifluorescent
detector.

[0019]In a further aspect, the invention provides a method of sequencing a
template nucleic acid that includes fixing a circular template to a solid
surface through a template localizing moiety, annealing an
oligonucleotide primer to the template nucleic acid, initiating
template-directed nascent strand synthesis by a polymerase that is not
immobilized to the solid surface, and detecting incorporations of
nucleotides into the nascent strand. A temporal sequence of the
incorporations is indicative of the sequence of the nucleic acid.
Optionally, the incorporations are detected by monitoring signals from
detectable labels linked to the nucleotides as they are being
incorporated into the nascent strand, e.g., where the type of detectable
label corresponds to the base composition of a nucleotide. Preferably,
the detectable labels are removed during incorporation resulting in a
nascent strand that does not comprise the detectable labels. Optionally,
the incorporations are detected using a luciferase-mediated detection
system. In certain preferred embodiments, the template localizing moiety
topologically encircles the template nucleic acid. In some embodiments,
the template nucleic acid is a single-stranded nucleic acid molecule. The
sequencing methods can further comprise sequencing the template nucleic
acid multiple times to generate a single nascent strand comprising
multiple copies of a polynucleotide complementary to the template nucleic
acid. Further, in some embodiments the polymerase is a plurality of
polymerase enzymes, wherein only a single polymerase enzyme is engaged in
template-directed nascent strand synthesis on a single template at a
given time.
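
The benefit of sequencing a circular template multiple times can be illustrated with a minimal Python sketch. It assumes the template length is known, that the concatemeric read contains only substitution errors (no insertions or deletions), and that a simple per-position majority vote is an acceptable consensus rule. The function names are hypothetical, and a practical analysis would typically align the copies before combining them.

```python
from collections import Counter
from typing import List


def split_into_passes(read: str, template_length: int) -> List[str]:
    """Cut a concatemeric read into complete template-length copies.

    A trailing partial copy, if any, is discarded.
    """
    if template_length <= 0:
        raise ValueError("template_length must be positive")
    last_start = len(read) - template_length
    return [read[i:i + template_length]
            for i in range(0, last_start + 1, template_length)]


def consensus(passes: List[str]) -> str:
    """Take the most frequent base at each position across all copies."""
    if not passes:
        raise ValueError("at least one complete pass is required")
    return "".join(Counter(column).most_common(1)[0][0]
                   for column in zip(*passes))


if __name__ == "__main__":
    # Three complete copies (the second has one substitution) plus a
    # partial fourth copy.
    concatemer = "ACGTA" + "ACCTA" + "ACGTA" + "AC"
    print(consensus(split_into_passes(concatemer, 5)))
```

Because every copy in the read is complementary to the template, the consensus obtained this way is the complement of the template sequence rather than the template sequence itself.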

[0020]Those of skill in the art will appreciate that the methods provided
by the invention for sequencing of a nucleic acid, e.g., a DNA, can be
used alone or in combination with any of the compositions described
herein. DNA sequencing systems that include any of the compositions
described herein are also a feature of the invention. Such systems can
optionally include detectors, array readers, excitation light sources,
and the like.

[0021]The present invention also provides kits that incorporate the
compositions of the invention. Such kits can include, e.g., a template
localizing moiety packaged in a fashion to permit its covalent binding to
a surface of interest. Alternatively, the surface bound template
localizing moieties can be provided as components of the kits, or the
surface can be provided with binding partners suitable to bind the
template localizing moieties, which are optionally packaged separately.
Instructions for making or using surface bound template localizing
moieties are an optional feature of the invention.

[0022]Such kits can also optionally include additional useful reagents
such as one or more nucleotide analogs, e.g., for sequencing, nucleic
acid amplification, or the like. For example, the kits can include a DNA
polymerase packaged in such a manner as to enable its use with the
template localizing moiety, a set of different nucleotide analogs of the
invention, e.g., those that are analogous to A, T, G, and C, e.g., where
one or more of the analogs comprise a detectable moiety, to permit
identification in the presence of the analogs. The kits of the invention
can optionally include natural nucleotides, a control template, and other
reagents, such as buffer solutions and/or salt solutions, including,
e.g., divalent metal ions, i.e., Mg++, Mn++ and/or Fe++,
standard solutions, e.g., dye standards for detector calibration, etc.
Such kits can optionally include various sequencing enzymes (e.g., one or
more polymerases or nucleases), and components required for detection of
a sequencing product, e.g., luciferase-based detection system. Such kits
also typically include instructions for use of the compounds and other
reagents in accordance with the desired application methods, e.g.,
nucleic acid sequencing, nucleic acid labeling, amplification and the
like.

BRIEF DESCRIPTION OF THE DRAWINGS

[0023]FIG. 1 provides a schematic depiction of a surface-immobilized
template localizing moiety fixing a template nucleic acid to the surface
by topologically encircling the template.

[0024]FIG. 2 provides a schematic depiction of a surface immobilized
template localizing moiety that has fixed a closed nucleic acid loop
within a single molecule reaction region.

[0025]FIG. 3 depicts a template-directed synthesis reaction in which a
first polymerase exchanges with a second polymerase without terminating
the reaction.

[0026]FIG. 4 provides a schematic depiction of an alternate embodiment of
the compositions in which a polymerase is covalently bound to a
surface-immobilized template localizing moiety.

[0027]FIG. 5 provides a schematic depiction of a polynucleotide-containing
template localizing moiety that is complementary to a region of a
single-stranded, circular template nucleic acid and that forms a single
loop over the template upon dissociation.

[0028]FIG. 6 provides a schematic depiction of a polynucleotide-containing
template localizing moiety that is complementary to a region of a
single-stranded, circular template nucleic acid and that forms multiple
loops around the template upon dissociation.

[0029]FIG. 7 provides a schematic depiction of a polynucleotide-containing
template localizing moiety that is complementary to a region of a
template nucleic acid that comprises regions of internal complementarity.

DETAILED DESCRIPTION

Overview

[0030]Analysis of small reaction volumes, e.g., single-analyte molecule
reactions, is becoming increasingly important in high throughput
applications, e.g., in nucleic acid sequencing. However, decreases in the
activity of individual sequencing enzyme molecules over time can have a
detrimental effect on the real time analysis of the activity of such
sequencing enzymes, e.g., in a single-molecule sequencing reaction. The
present invention is generally directed to compositions, methods, systems
and kits that can be beneficially used to localize a sequencing enzyme to
a reaction region, e.g., a ZMW, without necessarily immobilizing the
sequencing enzyme itself, within or proximal to the reaction region. For
example, a template localizing moiety, e.g., that is capable of
interacting with a sequencing enzyme, can be immobilized on a solid
surface, e.g., on a surface, a well, or a single-molecule reaction
region, and can be used to fix a nucleic acid template to the surface
(see FIG. 1). For example, in certain preferred embodiments, the methods,
compositions, and systems described herein are used with single-molecule
sequencing technologies, in particular those described in U.S. Pat. No.
7,056,661; Eid, et al. (2009) Science 323:133-138; and Korlach, et al.
(2008) Nucleosides, Nucleotides and Nucleic Acids 27:1072-1083, all of
which are incorporated herein by reference in their entireties for all
purposes.

[0031]As used herein, a "template localizing moiety" is a moiety
comprising, e.g., a natural or synthetic polymer, such as a protein other
than a polymerase, or any of the discrete materials described herein,
that can associate with and/or retain a template nucleic acid (e.g.,
comprising DNA, RNA, or analogs or derivatives thereof) and fix it to,
e.g., the surface on which the moiety itself has been immobilized. In
some embodiments, a template localizing moiety can form a complex with a
sequencing enzyme in a manner that permits the activity of the sequencing
enzyme on the template. In some embodiments, a template localizing moiety
can improve the processivity of a sequencing enzyme, and such moieties
can include, e.g., a wide variety of DNA replication factors and/or DNA
repair factors, as discussed hereinbelow.

[0032]Although certain descriptions of the invention herein are primarily
focused on template-dependent sequencing-by-synthesis methods that
monitor incorporation of labeled nucleotide analogs into a nascent
strand, it will be clear to one of ordinary skill upon review of the
instant disclosure that the template localizing moieties can be used to
immobilize template nucleic acids in myriad analytical reactions,
including but not limited to exonuclease sequencing, pyrosequencing,
nanopore-based sequencing, ligase-mediated sequencing, binding assays,
and amplification-based methods. Such methods are known in the art and are
further described, e.g., in WO/1994/023066; U.S. Pat. Nos. 5,516,633,
5,622,824, 5,750,341, 5,795,782, 5,969,119, 6,210,891, 6,258,568,
6,306,597, and 7,485,425; U.S. Ser. No. 61/186,661, filed Jun. 12, 2009;
and U.S. Patent Publication Nos. 2007115205 and 20090131642, the
disclosures of which are incorporated herein by reference in their
entireties for all purposes.

[0033]As shown in FIG. 1, template localizing moiety 110 is immobilized
within single molecule reaction region 100. Moiety 110 can fix template
nucleic acid 120 to single molecule reaction region 100 to produce
composition 130. In some embodiments of the compositions provided by the
invention, the moiety topologically encircles the template, e.g.,
surrounds and encloses the template. For example, template localizing
moiety 110 topologically encircles template 120 such that template 120
passes through moiety 110 much as a thread passes through the eye of a
needle.

[0034]The template nucleic acid of the compositions, e.g., a DNA or an
RNA, can be linear (see FIG. 1) or, in preferred embodiments, it can be
circular, e.g., form a "closed loop" wherein each nucleotide is
covalently joined to the nucleotides preceding and following it (see FIG.
2). As shown in FIG. 2, template localizing moiety 210 topologically
encircles circular template nucleic acid 220, fixing it within single
molecule reaction region 200. Closed nucleic acid loops that are fixed
within or proximal to a reaction region, e.g., a ZMW, through a
surface-immobilized template localizing moiety will not diffuse out of
the reaction region as readily as linear templates. This orientation of a
template nucleic acid is particularly useful for redundant sequencing
applications in which a single template is subjected to a sequencing
reaction multiple times to generate multiple replicate nucleotide
sequences that correspond (e.g., are identical or complementary) to the
template nucleic acid. For example, a rolling-circle
sequencing-by-synthesis reaction can be performed in which a polymerase
capable of strand displacement repeatedly processes a circular template
to synthesize a long, concatemeric nascent strand. The synthesis of the
nascent strand is monitored to generate a long nucleotide sequence "read"
for the nascent strand that contains multiple copies of a sequence
complementary to the template strand, and this read is subjected to
statistical analysis to determine the sequence of the template strand.
Such rolling-circle synthesis can be used in other sequencing
technologies, as well, such as pyrosequencing methods.
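
The statistical analysis of a concatemeric read described above can be illustrated with a minimal sketch. The sketch assumes, for illustration only, that the length of the template is known, that the read begins at a copy boundary, and that errors are substitutions rather than insertions or deletions. The function names and the simple majority vote are hypothetical and are not the analysis method of the invention.

```python
from collections import Counter

COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def consensus_from_concatemer(read, unit_length):
    """Majority-vote consensus over successive copies within one read.

    Assumes (hypothetically) that each copy is exactly unit_length bases
    long; a trailing partial copy is ignored.
    """
    copies = [read[i:i + unit_length]
              for i in range(0, len(read) - unit_length + 1, unit_length)]
    if not copies:
        raise ValueError("read is shorter than one copy of the template")
    # Vote column by column across the aligned copies.
    return "".join(Counter(column).most_common(1)[0][0]
                   for column in zip(*copies))


# Three copies of a 6-base unit; the second copy carries one substitution.
read = "ACGTAC" + "ACGTTC" + "ACGTAC"
nascent_consensus = consensus_from_concatemer(read, 6)
# The nascent strand is complementary to the template strand.
template_sequence = reverse_complement(nascent_consensus)
```

Because the nascent strand is complementary to the template, the sketch takes the reverse complement of the consensus to report the template sequence.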

[0035]Typically, single molecule sequencing-by-synthesis reactions take
place in the presence of one or more fluorescently labeled nucleotides
and/or nucleotide analogues. In general, the incorporation or release of
the fluorescent label can be used to indicate the presence and
composition of a growing nucleic acid strand, e.g., providing evidence of
template-directed synthesis and/or the sequence of the nascent strand
being synthesized, and by complementarity, the sequence of the template
nucleic acid. As shown in FIG. 3, template localizing moiety 310, which
has been immobilized within single molecule reaction region 300, has
associated with and topologically encircled nucleic acid template 301,
fixing it within the reaction region. Polymerase 330 can diffuse into the
reaction region to initiate template-directed synthesis of a nascent
strand that is complementary to at least a portion of a strand of
template 301 to produce nascent strand 340. As used herein, a "nascent
strand" is a nucleic acid molecule that is synthesized by a polymerase
enzyme during the processing of a strand of a template nucleic acid.
Although it is sometimes termed a "copy" of the template strand, the
nascent strand actually comprises a sequence complementary to that of the
strand of the template nucleic acid. Likewise, template-directed
synthesis of a template nucleic acid is sometimes referred to as
"replication" of the template nucleic acid, although the nascent strand
synthesized is complementary rather than identical to the template
nucleic acid. As such, one of ordinary skill will recognize that
reference to "replication" of a template nucleic acid includes synthesis
of a nascent strand complementary to the template strand.

[0036]Over time, a polymerase's activity and fidelity can decrease. For
example, prolonged exposure of a polymerase, e.g., polymerase 330, to the
optical energy of the fluorescently labeled nucleotides or nucleotide
analogues that are incorporated into a nascent and growing nucleic acid,
e.g., nascent strand 340, can reduce the enzyme's processivity, accuracy,
and polymerase activity over time (see composition 350, which includes
inactive polymerase 335). Other environmental factors that can lead to
polymerase inactivation include, e.g., oxidation, degradation, and the
like. Inactive polymerase 335 dissociates from the template 301 and can
exchange with active polymerase 345 without terminating the sequencing
read, e.g., the polymerase-mediated processing of template 301 can
reinitiate upon association of a second polymerase, e.g., active
polymerase 345, with the immobilized template 301. Typically, nascent
strand 340 remains in single molecule reaction region 300 during such a
polymerase exchange so that active polymerase 345 can continue
incorporating nucleotides into nascent strand 340, e.g., using 301 as a
template. In certain embodiments, nascent strand 340 can be removed from
template 301 prior to reinitiation of template-directed synthesis by
active polymerase 345, e.g., by heat-denaturation, chemical treatment,
high salt concentration, etc. Since nascent strand 340 is held in
reaction region 300 only by association with template nucleic acid 301,
disruption of that association facilitates removal of nascent strand 340
from reaction region 300.

[0037]Optionally, a template localizing moiety can also form a complex
with a sequencing enzyme, e.g., to bring the sequencing enzyme to a
portion of the template that is at a reaction site and/or within an
observation (or detection) volume. For example, in certain embodiments of
the compositions (see FIG. 4), a polymerase, e.g., polymerase 400, can be
covalently attached to the surface-immobilized template localizing
moiety, e.g., moiety 410, e.g., via the polymerase's C-terminal end,
e.g., polymerase C-terminal end 420. Alternatively, an exonuclease can be
brought into proximity to a terminal portion of a template nucleic acid.
However, in preferred embodiments of the compositions, a sequencing
enzyme associates with the moiety in a non-covalent manner. Optionally, a
sequencing enzyme can bind the template localizing moiety via a reversibly
cleavable linker, e.g., a linker that can reform with a new sequencing
enzyme. This permits the sequencing enzyme to exchange with other
sequencing enzymes present, e.g., in a sequencing reaction mix, without
terminating the sequencing reaction. In yet further embodiments, a
sequencing enzyme can be covalently or non-covalently attached to a
linker bound to the surface, and in certain preferred embodiments such a
linker is a cleavable linker that allows release of a sequencing enzyme,
e.g., to facilitate exchange with another sequencing enzyme in the
reaction mixture. In certain embodiments in which a multisubunit
sequencing enzyme is used, all or only one or a subset of subunits can be
attached to the template localizing moiety and/or the surface. For
example, HIV reverse transcriptase is a heterodimer and only one of the
subunits need be attached to the template localizing moiety and/or
surface in order to maintain the enzyme at the reaction site. A
reversible attachment, e.g., a photocleavable linker, can be used to
facilitate sequencing enzyme exchange during the course of the reaction.

[0038]The compositions of the invention rely on a surface-immobilized
template localizing moiety, rather than a surface-immobilized sequencing
enzyme, to localize a sequencing reaction, e.g., template-directed
synthesis or exonuclease degradation reaction, to a defined reaction
region. Sequencing reactions that include the provided compositions,
e.g., compositions in which a first, e.g., less active or inactive,
sequencing enzyme can be exchanged with a second, e.g., active,
sequencing enzyme, are not terminated when a sequencing enzyme's activity,
processivity, and fidelity decrease, e.g., as a result of the exposure
to optical energy of fluorescently labeled nucleotides and/or nucleotide
analogs. As a result, the methods and systems of the invention, in which
the compositions described above can be used, can beneficially increase
sequence throughput and improve the accuracy of sequence data. Moreover,
the invention can advantageously lower fabrication and reagent costs (see
FIG. 1). For example, an array of single molecule reaction volumes in
which individual sequencing enzymes have been immobilized is no longer
useful after the sequencing enzymes have become inactive. However, an
array of single molecule reaction regions in which individual template
localizing moieties have been immobilized, e.g., FIG. 1, array 140, can
be used repeatedly and continuously.

[0039]Further, in embodiments in which the sequencing enzyme is not
tethered to the surface or the template localizing moiety, the sequencing
enzyme activity may be enhanced by virtue of the lack of a physical
linkage to the sequencing enzyme. For example, a polymerase enzyme that
is free in solution is not hindered by being directly tethered to a
surface or template localizing moiety, which may interfere with
conformational changes required for template-directed synthesis, e.g.,
due to torsional stress, electrostatic interference, or steric hindrance
caused by the linking moiety, potentially causing a decrease in activity,
processivity, or accuracy of the enzyme. Further, a polymerase that is
free in solution can be a more "natural" polymerase than a polymerase
comprising structural alterations required for binding to the surface. In
addition, a potential source of experimental variation is eliminated
since there can be no variation due to differences in sequencing enzyme
immobilization chemistry between different reaction sites on the same or
different surfaces.

Further Details Regarding Template Localizing Moieties

[0040]The compositions of the invention rely on a surface-immobilized
template localizing moiety, rather than a surface-immobilized polymerase,
to localize a sequencing reaction, e.g., template-directed synthesis or
exonuclease sequencing reaction, to a defined reaction region. This
configuration can beneficially increase read lengths and improve the
accuracy of the sequencing data produced by, e.g., a single molecule
sequencing reaction, as it permits the exchange of a first, e.g.,
inactive, e.g., photodamaged, sequencing enzyme with a second, e.g.,
active, e.g., non-photodamaged, sequencing enzyme present in, e.g., a
sequencing reaction mix, without terminating nucleic acid sequencing
(e.g., a template-directed synthesis reaction can proceed anew when an
active polymerase replaces a polymerase whose activity has decreased as a
result of prolonged exposure to the optical energy of fluorescently
labeled nucleotides and/or nucleotide analogs in the sequencing reaction
mix). Advantageously, the compositions of the invention can decrease
reagent use and lower the fabrication costs of, e.g., ZMW arrays used in
high-throughput single-molecule sequencing systems.

[0041]In some aspects, a template localizing moiety can comprise, e.g., a
polymer, and/or any discrete material that can be coupled/associated, at
least temporarily, to or with a nucleic acid, e.g., a DNA or an RNA. Such
a polymer can comprise natural structural units (e.g., nucleotides, amino
acids, sugars, etc.), or synthetic structural units (e.g., styrene,
ethylene, propylene, etc.), or modifications and/or combinations thereof.
For example, one or more polynucleotides, polypeptides, polysaccharides,
polystyrene, polyethylene (e.g., polyethylene glycol, Spacer 18, etc.),
polypropylene, polymer beads, silica beads, ceramic beads, glass beads,
magnetic beads, metallic beads, and organic resin beads can be used to
localize a template nucleic acid to a defined reaction region. Such
template localizing moieties can have
essentially any shape, e.g., spherical, helical, spheroid, rod shaped,
cone shaped, disk shaped, cubic, polyhedral or a combination thereof. In
preferred embodiments, the template localizing moiety topologically
encircles the template nucleic acid. Optionally, the shape of a template
localizing moiety can also be used to orient the moiety in the relevant
well, e.g., to ensure that the immobilized nucleic acid is accessible to
a sequencing enzyme and can be used as a template in, e.g., a sequencing
reaction. Template localizing moieties can optionally be coupled to any
of a variety of reagents that facilitate surface attachment of the
nucleic acid, e.g., a DNA or an RNA.

[0042]In certain preferred embodiments, a template localizing moiety can
function not only to localize the template to a reaction region, but also
to effectively trap the sequencing enzyme in the observation or detection
volume of the reaction region. Take, for example, a template localizing
moiety large enough to allow passage of a template, but too small to
allow passage of a polymerase. Upon encountering the template localizing
moiety, a polymerase translocating on the template would be spatially
constrained at the template localizing moiety due to the inability to
"follow" the template through the template localizing moiety. Therefore,
continued translocation along the template would require the template be
pulled through the template localizing moiety by the polymerase enzyme.
Such template localizing moieties can comprise various types of polymers,
including but not limited to polynucleotides, polypeptides,
polysaccharides, and other synthetic polymers. Specific examples using
such template localizing moieties comprising polynucleotides and
combinations of natural and synthetic polymers are provided below.

[0043]Template localizing moieties of the invention can essentially be any
discrete material that can be immobilized, e.g., on a planar surface, in
a well, or in a single molecule reaction region, e.g., a ZMW. Desirably,
the material(s) that comprises a template localizing moiety permit the
moiety to associate with a template in such a manner that maintains or
increases a sequencing enzyme's processivity, e.g., in degrading the
template or performing template-directed nascent strand synthesis.
Examples of such materials can include polymer beads or particles (e.g.,
polystyrene, polypropylene, latex, nylon and many others), silica or
silicon beads, ceramic beads, glass beads, magnetic beads, metallic beads
and organic compound beads. An enormous variety of particles that can be
used to fix a template to or near a defined reaction region are
commercially available, e.g., those typically used for chromatography
(see, e.g., catalogs from Sigma-Aldrich (Saint Louis, Mo.) and Supelco
Analytical (Bellefonte, Pa.; sold, e.g., through Sigma-Aldrich)), as well
as those commonly used for affinity purification (e.g., the various
magnetic Dynabeads®, which commonly include coupled reagents), supplied,
e.g., by Invitrogen. For a discussion of matrix materials see also, e.g.,
Hagel et al. (2007) Handbook of Process Chromatography, Second Edition
Development, Manufacturing, Validation and Economics, Academic Press; 2nd
edition ISBN-10: 0123740231; Miller (2004) Chromatography: Concepts and
Contrasts Wiley-Interscience; 2nd edition ISBN-10: 0471472077; Satinder
Ahuja (2002) Chromatography and Separation Science (SST) (Separation
Science and Technology), Academic Press, ISBN-10: 0120449811; Weiss (1995)
Ion Chromatography VCH Publishers Inc.; Baker (1995) Capillary
Electrophoresis John Wiley and Sons; Marcel Dekker and Scott (1995)
Techniques and Practices of Chromatography Marcel Dekker, Inc.

[0044]In preferred embodiments of the compositions described herein, a
template localizing moiety comprises a polypeptide, preferably a protein
other than a polymerase used to synthesize a polynucleotide complementary
to the template nucleic acid, that can be attached to, e.g., a planar
surface, a well, or a single-molecule reaction region, e.g., a ZMW, in an
orientation that preserves its nucleic acid-binding activity and,
optionally, its sequencing enzyme binding activity, wherein the protein
is configured to form a complex with a sequencing enzyme. Proteins that
can optimally be used as template localizing moieties in the methods,
compositions, systems, and kits of the invention include a wide variety
of DNA replication factors, DNA repair factors, and/or transcription
factors, e.g., a processive nuclease, a single-strand binding protein
(SSBP), a helicase, a DNA repair enzyme, a polymerase mutant, fragment,
or subunit thereof that lacks nascent strand synthesis activity but is
able to translocate along a template nucleic acid, a DNA processivity
factor, e.g., a helicase, or a protein that non-specifically binds a
double-stranded nucleic acid--essentially any protein or protein mutant
that can associate with a template nucleic acid and not interfere with an
ongoing sequencing reaction. For example, a template localizing moiety
can comprise human oxoguanine DNA glycosylase 1 (hOgg1), which is a DNA
glycosylase/apurinic (AP) lyase (see, e.g., Klungland, et al. (2007) DNA
Repair (Amst) 6(4): 481-8, which is incorporated herein by reference in
its entirety for all purposes), or homologs thereof, including yeast Ogg
proteins (e.g., yOgg1 or yOgg2), E. coli Mut proteins (e.g., MutM (FPG
protein)), and others known in the art.
Further, multiple such proteins may be bound at a single reaction site to
immobilize a single template molecule.

[0045]As described above, a template localizing moiety of the compositions
preferably fixes a template nucleic acid to, e.g., a single molecule
reaction region by topologically encircling the template (see, e.g., FIG.
2 and corresponding description). For example, DNA polymerase sliding
clamp proteins can be beneficially included in the compositions of the
invention. Sliding clamps are a family of multimeric ring-shaped DNA
polymerase processivity factors that play essential roles in DNA
metabolism (reviewed in, e.g., Barsky, et al. (2005) "DNA sliding clamps:
just the right twist to load onto DNA." Curr Biol 15: R989-92 and
Indiani, et al. (2006) "The replication clamp-loading machine at work in
the three domains of life." Nat Rev Mol Cell Biol 7: 751-761). Sliding
clamp proteins have been identified in Bacteria, e.g., the β clamp
of E. coli DNA polymerase III; Archaea, e.g., archaeal PCNA; and Eukarya,
e.g., eukaryal PCNA; as well as in viruses and phages, e.g., T4 gp45.

[0049]Hexameric helicases are another class of template localizing
moieties that can be beneficially included in the methods, compositions,
kits, and systems of the invention to, e.g., fix a template nucleic acid
to a surface. Helicases can also form a processive complex with a DNA
polymerase during processing of the template in, e.g., a sequencing
reaction. Hexameric helicases, e.g., E. coli DnaB and Rho, T4 gp41, and
T7 gp4, are a class of NTP-dependent motor proteins that play a role in DNA
metabolism. Hexameric helicases have a characteristic ring-shaped
structure, and these enzymes typically move along the phosphodiester
backbone of the nucleic acid to which they are bound, using the energy
produced by nucleic acid-stimulated NTP hydrolysis to translocate along
the nucleic acid while catalyzing the unidirectional, processive
separation of two strands of a complementary nucleic acid duplex. Recent
structural studies have indicated that a single strand of a DNA duplex
passes through the hexamer channel (Enemark, et al. (2006) "Mechanism of
DNA translocation in a replicative hexameric helicase," Nature 442:
270-275).

[0050]A hexameric helicase can optimally be used with a non-processive,
non-strand-displacing polymerase, e.g., a Klenow fragment, in, e.g., a
sequencing reaction. In certain embodiments that include a hexameric
helicase, the concentration of NTP present in, e.g., a sequencing
reaction mix, can modulate the rate at which the helicase catalyzes the
unwinding of a double-stranded DNA template. This, in turn, can modulate
the sequencing rate of, e.g., a non-strand displacing polymerase in a
template-directed synthesis reaction.

[0052]In preferred embodiments of the compositions described herein, a
template localizing moiety comprises a polynucleotide, i.e., a
polynucleotide other than the template, that can be attached to, e.g., a
planar surface, a well, or a single-molecule reaction region, e.g., a
ZMW, in an orientation that allows it to constrain a template to which it
is initially annealed even after it has been displaced from the template,
e.g., by a translocating polymerase enzyme on the template.
Polynucleotides that can optimally be used as template localizing
moieties in the methods comprise a central region that is complementary
to at least one region of the template to be immobilized and two end
regions that associate with a surface of a reaction region such that when
bound to the surface the template localizing moiety loops over and
optionally completely around the template, thereby localizing it to the
reaction site. The template can move through the loop(s) formed by the
template localizing moiety, but cannot diffuse away from the reaction
region unless either an end of the template localizing moiety is
dissociated from the reaction region or an end of the template passes
through the loop. As such, although linear templates can be used with
such polynucleotide template localizing moieties, in certain embodiments
a circular template is preferred since a circular template can be
repeatedly processed at a reaction region without "slipping out" of the
template localizing moiety. Further, if a polymerase dissociates from the
template nucleic acid, a second polymerase can bind the template and
continue template-directed synthesis using the same template nucleic acid
at the same reaction region. Since the polymerase is not covalently
tethered, it can readily dissociate and exchange with another polymerase
in the reaction mixture. As such, a damaged polymerase can be replaced by
an undamaged polymerase, thereby allowing stalled synthesis to continue
on the same template nucleic acid. Data generated by template-directed
synthesis using a single template nucleic acid by multiple polymerases
can thereby be generated and collected sequentially, and subjected to
statistical analysis to determine a sequence of the template nucleic
acid.

[0053]A strand of double-stranded DNA usually circles the axis of the
double helix once every 10.4 base pairs. As such, in certain aspects, a
template localizing moiety comprises a polynucleotide portion that is
complementary to at least about ten adjacent nucleotides to
ensure that the complementary region wraps around the template strand at
least one time. In certain embodiments, the complementary region is
longer to create multiple loops around the template strand. Further, in
certain preferred embodiments, one or more loops formed by a template
localizing moiety around a template nucleic acid block passage of a
polymerase enzyme translocating on the template, effectively localizing
the polymerase to the template at the template localizing moiety. This
can serve to position the polymerase at a desired location within a
reaction region, e.g., in the observation volume. This aspect is
especially useful for large template nucleic acids that extend outside
the observation volume.
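
The relationship between the length of the complementary region and the number of loops formed can be checked with simple arithmetic. The following is a minimal sketch that assumes the 10.4 base pairs per turn stated above applies uniformly along the annealed region; the function names are hypothetical and are introduced only for illustration.

```python
import math

BP_PER_TURN = 10.4  # helical repeat stated in the text above


def full_turns(complementary_length):
    """Number of complete helical turns spanned by the annealed region."""
    return math.floor(complementary_length / BP_PER_TURN)


def min_length_for_loops(loops):
    """Shortest complementary region (in nucleotides) spanning `loops` turns."""
    return math.ceil(loops * BP_PER_TURN)


one_loop = min_length_for_loops(1)     # 11 nucleotides
three_loops = min_length_for_loops(3)  # 32 nucleotides
```

Under this strict reading, ten nucleotides span slightly less than one full turn, which is consistent with the text's use of "about ten" as an approximate lower bound rather than an exact threshold.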

[0054]A further advantage provided by a template localizing moiety
comprising a portion complementary to a template nucleic acid is the
ability to selectively immobilize a subset of template nucleic acids
having one or more particular polynucleotide sequences of interest (e.g.,
exonic or intronic regions, regulatory regions, and the like). For
example, a whole genomic sample can be fragmented and mixed with a pool
of template localizing moieties having polynucleotide regions
complementary to a set of genetic loci known to predict susceptibility to
a given disease. Only genomic fragments having one or more of those
genetic loci of interest will be targeted and immobilized by the template
localizing moieties, and subsequently subjected to sequence analysis.
This strategy significantly reduces the amount of data generated, and
therefore the amount of statistical analysis required for determining the
relevant genotypes for an individual, and by association, their
susceptibility to the given disease.
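
The selection step described above can be sketched in silico. The example below assumes, for illustration only, that a fragment is captured when it contains an exact match to the reverse complement of a moiety's complementary region; mismatches, hybridization thermodynamics, and the opposite strand of the fragment are ignored. The function and variable names are hypothetical.

```python
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def captured_fragments(fragments, probe_regions):
    """Return the fragments that can anneal to at least one probe region.

    A fragment anneals (in this simplified model) if it contains the exact
    reverse complement of a probe's complementary region.
    """
    targets = [reverse_complement(probe) for probe in probe_regions]
    return [frag for frag in fragments
            if any(target in frag for target in targets)]


probes = ["GGATCCAAGT"]                       # hypothetical locus-specific region
fragments = ["TTTACTTGGATCCGGG",              # contains the target site
             "TTTTTTTTTTTTTTTT"]              # does not
selected = captured_fragments(fragments, probes)
```

Only the first fragment is retained, so only that fragment would proceed to sequence analysis in this toy model.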

[0055]FIG. 5 provides an exemplary embodiment of a
polynucleotide-containing template localizing moiety 510 that comprises a
polynucleotide region complementary to a region of a single-stranded,
circular template nucleic acid 520 long enough to loop over the template
nucleic acid 520 one time. The ends of the template localizing moiety 510
are derivatized with biotin 560 to promote binding of the ends of the
template localizing moiety 510 to the streptavidin tetramer 550. The
template localizing moiety 510 is annealed to the template nucleic acid
520, and is subsequently immobilized on a substrate 540 via interaction
with a streptavidin tetramer 550 bound to a biotin-derivatized surface of
the substrate 540. The template nucleic acid 520 is also annealed to
primer 570, and subsequently exposed to a polymerase 530. Binding of
polymerase 530 to the complex results in extension of the primer 570 as
the polymerase translocates along the template nucleic acid 520,
producing a nascent polynucleotide strand 580. Upon displacement of the
complementary region of the template localizing moiety 510, a single loop
is formed that passes over the template nucleic acid 520, thereby
localizing it to the reaction region on the substrate 540. Arrow 590
shows the direction of movement of the template strand 520 toward the
polymerase 530 during translocation when the polymerase 530 is blocked by
the template localizing moiety 510. Although FIG. 5 illustrates an
embodiment in which a single subunit of the streptavidin tetramer 550 is
linked to the surface and two are linked to the template localizing
moiety 510, further embodiments include utilization of the fourth
subunit, e.g., to link to the surface, the sequencing enzyme, or other
components of a reaction mixture, including but not limited to elongation
factors, components of a detection system (e.g., luciferase/sulfurylase),
etc.

[0056]FIG. 6 provides an exemplary embodiment of a
polynucleotide-containing template localizing moiety 610 that comprises a
polynucleotide region complementary to a region of a single-stranded,
circular template nucleic acid 620 long enough to loop over the template
nucleic acid 620 three times. The ends of the template localizing moiety
610 are derivatized with biotin 660 to promote binding of the ends of the
template localizing moiety 610 to the streptavidin tetramer 650. The
template localizing moiety 610 is annealed to the template nucleic acid
620, and is subsequently immobilized on a substrate 640 via interaction
with a streptavidin tetramer 650 bound to a biotin-derivatized surface of
the substrate 640. The template nucleic acid 620 is also annealed to
primer 670, and subsequently exposed to a polymerase 630. Binding of
polymerase 630 to the complex results in extension of the primer 670 as
the polymerase translocates along the template nucleic acid 620,
producing a nascent polynucleotide strand 680. Upon displacement of the
complementary region of the template localizing moiety 610, three loops
are formed that pass over the template nucleic acid 620, thereby
localizing it to the reaction region on the substrate 640. Arrow 690
shows the direction of movement of the template strand 620 toward the
polymerase 630 during translocation when the polymerase 630 is blocked by
the template localizing moiety 610.

[0057]FIG. 7 provides an exemplary embodiment of a
polynucleotide-containing template localizing moiety 710 that comprises a
polynucleotide region complementary to a region of a single-stranded,
circular template nucleic acid 720 long enough to loop over the template
nucleic acid 720 three times. However, unlike the embodiment depicted in
FIG. 6, the template nucleic acid 720 comprises regions of internal
complementarity (shown as double-stranded region 725), such that it can
form a partially double-stranded template nucleic acid. The ends of the
template localizing moiety 710 are derivatized with biotin 760 to promote
binding of the ends of the template localizing moiety 710 to the
streptavidin tetramer 750. Primer 770 and template localizing moiety 710
are annealed to template nucleic acid 720, e.g., following
heat-denaturation. In some preferred embodiments, template localizing
moiety 710 is annealed to one strand within the duplex region of the
template nucleic acid 720. The resulting annealed complex is subsequently
immobilized on a substrate 740 via interaction with the streptavidin
tetramer 750 bound to a biotin-derivatized surface of the substrate 740.
The template nucleic acid 720 is subsequently exposed to a polymerase
730, which extends primer 770 as the polymerase translocates along the
template nucleic acid 720, separating any duplex regions in its path and
producing a nascent polynucleotide strand 780. Upon displacement of the
complementary region of the template localizing moiety 710, three loops
are formed that pass over the template nucleic acid 720, thereby
localizing it to the reaction region on the substrate 740. Arrow 790
shows the direction of movement of the template strand 720 toward the
polymerase 730 during translocation when the polymerase 730 is blocked by
the template localizing moiety 710 looped around the template nucleic
acid.

[0058]In some embodiments, the template nucleic acid 720 comprises a tag
sequence 795 in the single-stranded region that can be used to identify
certain characteristics of the template nucleic acid 720, e.g., source
information. For example, a genomic DNA sample can be fragmented to
produce a set of double-stranded DNA fragments, and each fragment can be
linked to two single-stranded hairpins, one at each end. A tag sequence
incorporated into at least one of the hairpin structures contains a
nucleotide sequence that identifies the source (e.g., individual,
species, subspecies, experimental/clinical group, etc.) from which the
genomic DNA was isolated. Such tag sequences allow pooling of samples
from various sources where the sample from each source is differentially
tagged. During sequence analysis, the identification of a particular tag
sequence in the sequencing read is used to deconvolute the pooled
sequencing data and identify the particular source of the sample. Such
tag sequences (also termed "registration sequences") and partially
double-stranded template nucleic acids are further described in U.S.
patent application Ser. No. 12/413,258, filed Mar. 27, 2009, which is
incorporated herein by reference in its entirety for all purposes.
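
Deconvolution of pooled sequencing data by tag sequence can be sketched as follows. The example assumes, for illustration only, that each tag is supplied in the orientation in which it appears in the sequencing read, that tags are matched exactly, and that a read matching no tag or more than one source is left unassigned. The tag sequences and names are hypothetical.

```python
def demultiplex(reads, tags):
    """Group reads by source using exact tag matches.

    `tags` maps a tag sequence (in read orientation) to a source label.
    Reads with no match, or with matches to more than one source, are
    placed in the "unassigned" group.
    """
    by_source = {label: [] for label in tags.values()}
    by_source["unassigned"] = []
    for read in reads:
        matches = {label for tag, label in tags.items() if tag in read}
        if len(matches) == 1:
            by_source[matches.pop()].append(read)
        else:
            by_source["unassigned"].append(read)
    return by_source


tags = {"AACC": "sample_A", "GGTT": "sample_B"}  # hypothetical tags
reads = ["TTAACCTT", "CCGGTTCC", "ACACACAC", "AACCGGTT"]
pooled = demultiplex(reads, tags)
```

In practice, tags would be long enough, and sufficiently distinct from one another, that sequencing errors and chance matches within the template sequence do not cause misassignment; the four-base tags here are kept short only for readability.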

[0059]Although described above primarily in terms of biotin-streptavidin
linkages, a polynucleotide template localizing moiety can be derivatized
at each end with other entities that preferentially associate with a
molecule immobilized at a reaction region. For example, each end of a
template localizing moiety can be derivatized with a chemically active
linkage including but not limited to "Click Chemistry" (Kolb, et al.
(2001) Angew. Chem. Int. Ed. 40:2004-2021) and CLIP- and SNAP-tag
strategies (New England BioLabs, Inc.). Further, a variety of surface
attachment strategies can be used, including disulfide bond formation,
amine linkages through an activated carbonyl, reactive groups on a number
of siloxane functionalizing reagents (described elsewhere herein), and
the like.

[0060]In certain preferred embodiments, a template localizing moiety that
comprises a polynucleotide portion that is complementary to a template
nucleic acid also comprises one or more polynucleotide portions that are
not complementary to the template nucleic acid and/or one or more
portions that do not comprise polynucleotides. In certain embodiments,
one or more ends of the complementary portion may be linked to
non-complementary portions, e.g., poly-T, poly-A, and the like. In other
embodiments, a complementary polynucleotide portion may be flanked by
portions comprising synthetic structural units, e.g., polyethylene
glycol, Spacer 18 (Integrated DNA Technologies), and the like. Spacer 18
is an 18-atom hexa-ethyleneglycol spacer (shown below) and, in certain
embodiments, between two and five units of Spacer 18 are linked to each
end of the polynucleotide portion of a template localizing moiety.

##STR00001##

In yet further embodiments, a template localizing moiety comprises both
one or more non-complementary polynucleotide portions and one or more
synthetic polymer portions. Benefits from such hybrid structures are
myriad and include less costly synthesis of the synthetic structural
units and reduced potential for interference with a translocating
polymerase. Further, the shape and/or stiffness of the portion of the
template localizing moiety that binds, directly or indirectly, to the
reaction region can be modified based upon the natural and/or synthetic
structural unit composition. The biochemical characteristics of such
structural units, as well as the chemical synthesis methods to link them,
are well understood by those of ordinary skill in the art.

[0061]The compositions of the invention include a template localizing
moiety that has been immobilized, e.g., on a planar surface, in a well,
or in a single-molecular reaction region, e.g., a zero-mode waveguide
(ZMW). In embodiments where the moiety comprises a protein, the protein
is preferably immobilized in an orientation that preserves the protein's
ability to bind/associate with a nucleic acid and, in some
embodiments, form a complex with a sequencing enzyme. The immobilized
template localizing moiety can fix a template nucleic acid to the surface
and can thereby advantageously localize, e.g., a DNA sequencing reaction,
e.g., a template-directed synthesis reaction, to a defined reaction site.
As described elsewhere herein, such compositions can beneficially
increase the lengths and accuracy of sequencing reads and lower
fabrication costs and reagent use when used in, e.g., high-throughput
single-molecule sequencing systems.

[0062]In some embodiments, the template localizing moiety can interact
directly with a surface, as described below. Alternatively or in
addition, a wide variety of linking chemistries are available for linking
template localizing moieties, e.g., those described herein, to a wide
variety of molecular, solid or semi-solid support elements. These
chemistries can be performed in situ (i.e., in the reaction region in
which the protein is to be immobilized) or prior to introduction of the
template localizing moiety into the well or reaction region. It is
impractical and unnecessary to describe all of the possible known linking
chemistries for linking proteins to a solid support. It is expected that
one of skill can easily select appropriate chemistries, depending on the
intended application.

[0063]In one preferred embodiment, the surfaces to which a template
localizing moiety is coupled comprise silicate elements (e.g., an array
of ZMWs fabricated from glass or silicate compounds). A variety of
silicon-based molecules appropriate for functionalizing surfaces are
commercially available. See, for example, Silicon Compounds Registry and
Review, United Chemical Technologies, Bristol, Pa. Additionally, the art
in this area is very well developed and those of skill will be able to
choose an appropriate molecule for a given purpose. Appropriate molecules
can be purchased commercially, synthesized de novo, or formed
by modifying an available molecule to produce one having the desired
structure and/or characteristics.

[0064]The substrate linker attaches to the solid substrate through any of
a variety of chemical bonds. For example, the linker is optionally
attached to the solid substrate using carbon-carbon bonds, for example
via substrates having (poly)trifluorochloroethylene surfaces, or siloxane
bonds (using, for example, glass or silicon oxide as the solid
substrate). Siloxane bonds with the surface of the substrate are formed
in one embodiment via reactions of derivatization reagents bearing
trichlorosilyl or trialkoxysilyl groups. The particular linking group is
selected based upon, e.g., its hydrophilic/hydrophobic properties where
presentation of an attached polymer in solution is desirable. Groups
which are suitable for attachment to a linking group include amine,
hydroxyl, thiol (e.g., in the case of gold surfaces), carboxylic acid,
ester, amide, isocyanate and isothiocyanate. Preferred derivatizing
groups include aminoalkyltrialkoxysilanes, hydroxyalkyltrialkoxysilanes,
polyethyleneglycols, polyethylene imine, polyacrylamide, polyvinylalcohol
and combinations thereof.

[0075]See, for example, Leyden et al., Symposium on Silylated Surfaces,
Gordon & Breach 1980; Arkles, Chemtech 7, 766 (1977); and Plueddemann,
Silane Coupling Reagents, Plenum, N.Y., 1982. These examples are
illustrative and do not limit the types of reactive group
interconversions which are useful in conjunction with the present
invention. Additional starting materials and reaction schemes will be
apparent to those of skill in the art.

[0076]Template localizing moieties bearing a surface-exposed charge can
then be coupled to a derivatized surface, e.g., planar surface, well, or
single-molecule reaction region, e.g., ZMW. For example, the charged
group can be a carboxylate, quaternary amine or protonated amine that is
a component of, e.g., an amino acid that has a charged or potentially
charged side chain. The amino acids can be either those having a
structure which occurs naturally or they can be of unnatural structure
(i.e., synthetic). Useful naturally occurring amino acids include:
arginine, lysine, aspartic acid and glutamic acid. Surfaces utilizing a
combination of these amino acids can be of use in the present invention.
Further, peptides comprising one or more residues having a charged or
potentially charged side chain are useful coating components and they can
be synthesized utilizing arginine, lysine, aspartic acid, glutamic acid
and combinations thereof. Useful unnatural amino acids are commercially
available or can be synthesized utilizing art-recognized methodologies,
such as available systems of orthogonal elements. In those embodiments in
which an amino acid moiety having an acidic or basic side chain is used,
these moieties can be attached to a surface bearing a reactive group
through standard peptide synthesis methodologies or easily accessible
variations thereof. See, for example, Jones, Amino Acid and Peptide
Synthesis, Oxford University Press, Oxford, 1992.

[0077]Linking groups can also be placed on surfaces to which a template
localizing moiety is to be immobilized. Linking groups of use in the
present invention can have a range of structures, substituents and
substitution patterns. They can, for example, be derivatized with
nitrogen, oxygen and/or sulfur containing groups which are pendent from,
or integral to, the linker group backbone. Examples include polyethers,
polyacids (polyacrylic acid, polylactic acid), polyols (e.g., glycerol),
polyamines (e.g., spermine, spermidine) and molecules having more than
one nitrogen, oxygen and/or sulfur moiety (e.g., 1,3-diamino-2-propanol,
taurine).

[0078]In some aspects, the coupling chemistries for coupling a template
localizing moiety to a surface of interest can be light-controllable,
i.e., utilize photo-reactive chemistries. The use of photo-reactive
chemistries and masking strategies to activate coupling of molecules,
e.g., template localizing moieties, to substrates, as well as other
photo-reactive chemistries is generally known (e.g., for semi-conductor
chip fabrication and for coupling bio-polymers to solid phase materials).
Among a wide variety of protecting groups which are useful are
nitroveratryl (NVOC), α-methylnitroveratryl (MeNVOC), allyloxycarbonyl
(ALLOC), fluorenylmethoxycarbonyl (FMOC),
α-methylnitro-piperonyloxycarbonyl (MeNPOC), --NH-FMOC groups, t-butyl
esters, t-butyl ethers, and the like. Various exemplary protecting groups
(including both photo-cleavable and non-photo-cleavable groups) are
described in, for example, Atherton et al., (1989) Solid Phase Peptide
Synthesis, IRL Press, and Greene, et al. (1991) Protective Groups in
Organic Synthesis, 2nd Ed., John Wiley & Sons, New York, N.Y. The use of
these and other photo-cleavable linking groups for nucleic acid and
peptide synthesis on solid supports is a well-established methodology.

[0079]Devices, methods and systems that incorporate functionalized regions
into the walls of a ZMW, e.g., by incorporating an annular gold ring into
the walls of the ZMW, are described, e.g., in Foquet et al. SUBSTRATES
AND METHODS FOR SELECTIVE IMMOBILIZATION OF ACTIVE MOLECULES (U.S. Ser.
No. 60/905,786, filed Mar. 7, 2007 and U.S. Patent Publication No.
20080220537), incorporated herein by reference in their entireties for
all purposes.

[0080]Template localizing moieties can include appropriate functionalities
for linking to the relevant array surface. For example, thiol chemistries
can be used to link, e.g., a template localizing moiety to, e.g., a
planar surface, a well, or a single molecule reaction region. Template
localizing moieties can include linking groups, e.g., one or more biotin
tags, SNAP tags, CLIP tags, or a combination thereof, all of which are
known in the art and commercially available. For example, a template
localizing moiety can comprise a fusion protein between a sliding clamp
protein and a biotin tag that facilitates immobilization of the sliding
clamp protein by binding to streptavidin on the surface. Template
localizing moieties that comprise recombinantly expressed proteins can
also include unnatural amino acids with any of a variety of linking
chemistries, e.g., when expressed in a host cell that includes orthogonal
elements that permit site-specific expression of the unnatural amino
acid. Systems of orthogonal elements that can be used to incorporate
unnatural amino acids, including amino acids with reactive groups, are
described in Wang, et al. (2006) "Expanding the genetic code." Annu Rev
Biophys Biomolec Struct 35: 225-249; Wang and Schultz (2005) "Expanding
the Genetic Code," Angewandte Chemie Int. Ed. 44(1):34-66; Xie, et al.
(2005) "An expanding genetic code." Methods 36: 227-38; and Xie, et al.
(2006) "A chemical toolkit for proteins: an expanded genetic code." Nat
Rev Mol Cell Biol 7: 775-82.

[0081]The site-specific incorporation of an amino acid that comprises a
reactive/linking group can be used to specifically orient, e.g., a
template localizing moiety that comprises a protein, relative to a well
or single molecule reaction region. Most preferably, such a protein is
immobilized in, e.g., a ZMW, in an orientation that permits the protein
to retain its activity, e.g., its ability to bind/associate with a
template nucleic acid and, e.g., form a complex with a polymerase. For
example, the well or reaction region can include a specific
functionalized region (e.g., a gold band, as discussed above) that can be
coupled to a specific portion of the template localizing moiety.
Additional useful strategies for coupling proteins to surfaces are
detailed in, e.g., WO 2007/076057 PROTEIN ENGINEERING STRATEGIES TO
OPTIMIZE ACTIVITY OF SURFACE ATTACHED PROTEINS by Hanzel et al.

Sequencing Enzymes

[0082]The invention provides compositions that include a localizing moiety
on, e.g., a planar surface, a well, or a single molecule reaction region.
Such compositions can be useful in fixing a template nucleic acid to the
surface, e.g., by topologically encircling the template, and localizing
the template to, e.g., a defined reaction region, e.g., a single-molecule
reaction volume. A template localizing moiety can comprise a polymer,
e.g., a protein other than a polymerase, and in particular other than a
polymerase used as a sequencing enzyme, e.g., to perform
template-directed sequencing-by-synthesis. In certain embodiments of the
invention, a sequencing enzyme can be engineered to covalently bind to a
template localizing moiety, e.g., via a polymerase's C-terminal end (see
FIG. 4 and corresponding description). Optionally, a sequencing enzyme
can be temporarily tethered to a template localizing moiety via, e.g., a
reversibly cleavable linker, e.g., a linker that can reform with a new
sequencing enzyme. In certain preferred embodiments, the template
localizing moiety is configured to non-covalently associate with a
sequencing enzyme, or to associate exclusively with the template and not
with the sequencing enzyme. In certain embodiments, a sequencing enzyme
included in the compositions can process a portion of at least one strand
of the fixed template and exchange with a second sequencing enzyme, e.g.,
without terminating the sequencing reaction. The exchange of sequencing
enzymes during nucleic acid sequencing reactions can be particularly
beneficial in, e.g., single-molecule template-directed synthesis
reactions, e.g., performed in a ZMW, where a polymerase's processivity,
accuracy, and polymerase activity can decrease over time. In one example,
a DNA polymerase that has sustained photodamage can exchange with a
non-photodamaged DNA polymerase without disrupting the sequencing read
(see FIG. 3 and corresponding description), thus maintaining the accuracy
with which the correct nucleotide is incorporated into a newly
synthesized nucleic acid and/or increasing sequence throughput.

[0083]The exchange of polymerases is also beneficial where different types
of polymerases are present in a reaction mixture, e.g., as in the
JumpStart RED HT RT-PCR kit (Sigma-Aldrich®). In certain embodiments,
more than one polymerase may be present in a template-directed sequencing
reaction in which one or more lesions may be present on the template
nucleic acid. For example, "bypass polymerases" have been discovered in
both prokaryotes and eukaryotes, most of which belong to the Y-family of
polymerases and/or are considered to be repair polymerases. In contrast
to replicative polymerases, they operate at low speed, low fidelity, and
low processivity. However, because their active sites adopt a more open
configuration than replicative polymerases they are less stringent and
can accommodate altered bases in their active sites. For more information
on bypass polymerases, see, e.g., Cordonnier, et al. (1999) Mol Cell Biol
19(3):2206-11; Friedberg, et al. (2005) Nat Rev Mol Cell Biol
6(12):943-53; Holmquist, et al. (2002) Mutat Res 510(1-2):1-7; Lehmann,
A. R. (2002) Mutat Res 509(1-2):23-34; Lehmann, A. R. (2006) Exp Cell Res
312(14):2673-6; Masutani, et al. (1999) Nature 399(6737):700-4; and
Ohmori, et al. (2001) Mol Cell 8(1):7-8, the disclosures of which are
incorporated herein by reference in their entireties for all purposes.
Certain of these polymerases can bypass lesions in a nucleic acid
template and carry out "translesion synthesis" or TLS. As such, DNA
replication in the presence of such lesions was found to require multiple
polymerases and the "polymerase switch model" was developed (see, e.g.,
Friedberg, et al. (2005) Nat Rev Mol Cell Biol 6(12):943-53; Kannouche,
et al. (2004) Cell Cycle 3(8):1011-3; Kannouche, et al. (2004) Mol Cell
14(4):491-500; and Lehmann, et al. (2007) DNA Repair (Amst) 6(7):891-9,
all of which are incorporated herein by reference in their entireties for
all purposes). In brief, the polymerase switch model is a model for lesion
bypass during replication that involves replacement of a replicative
polymerase with a bypass polymerase at a lesion, synthesis of the nascent
strand by the bypass polymerase until past the lesion, and subsequent
replacement of the bypass polymerase with the more processive, higher
fidelity replicative polymerase for continued replication past the
lesion. For example, during the course of a reaction in which a
replicative polymerase encounters and is blocked by a lesion in a
template nucleic acid, the replicative polymerase is replaced by a bypass
polymerase at the site of the lesion, and the bypass polymerase
synthesizes a segment of the nascent strand that is capable of
base-pairing with the damaged base, and may further include one or more
bases prior to and/or past the site of the lesion in a process called
"translesion synthesis." The limited processivity of the bypass
polymerase causes it to dissociate and be replaced by the replicative
polymerase following translesion synthesis. The replicative polymerase
continues to synthesize the nascent strand until another blocking lesion
is encountered in the template, at which point it is once again replaced
by a bypass polymerase for translesion synthesis. (See, e.g., Friedberg,
et al. (2005) Nat Rev Mol Cell Biol 6(12):943-53; and Kannouche, et al.
(2004) Mol Cell 14(4):491-500, incorporated herein by reference above.)
The process continues until the template has been fully processed or the
reaction is terminated, e.g., by the investigator. One particular
advantage of the polymerase switch method of template-dependent
sequencing is that it is tolerant of most types of lesions in the
template nucleic acid. As such the damaged template can be sequenced
through a lesion, thereby allowing reinitiation of synthesis downstream
of the lesion and increasing read lengths on lesion-containing templates.
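
By way of illustration only, the alternation of enzymes under the
polymerase switch model can be sketched with a toy calculation. The
function name polymerase_switch and the use of "X" to mark a lesion are
hypothetical. The sketch assumes that the replicative polymerase is blocked
by every lesion and that the bypass polymerase synthesizes only opposite
the lesion itself, whereas, as noted above, a bypass polymerase may also
incorporate one or more bases prior to and/or past the lesion.

```python
def polymerase_switch(template, lesion="X"):
    """Return (enzyme, start, end) segments, with end exclusive.

    Toy model: lesion positions are handled by the bypass polymerase and
    all other positions by the replicative polymerase.
    """
    segments = []
    for pos, base in enumerate(template):
        enzyme = "bypass" if base == lesion else "replicative"
        if segments and segments[-1][0] == enzyme:
            # Same enzyme continues: extend the current segment.
            segments[-1] = (enzyme, segments[-1][1], pos + 1)
        else:
            # Enzyme exchange: start a new segment.
            segments.append((enzyme, pos, pos + 1))
    return segments

segments = polymerase_switch("ACGTXACGXXTT")
exchanges = len(segments) - 1  # number of enzyme exchanges on this template
```

In this example the template carries one isolated lesion and one pair of
adjacent lesions, so the read is completed in five segments with four
enzyme exchanges.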

[0084]Various different bypass polymerases known to those of ordinary
skill in the art can be used with the methods and compositions provided
herein, including prokaryotic polymerases (e.g., DNA polymerase IV,
polymerase V, Dpo4, Dbh, and UmuC) and eukaryotic polymerases (e.g., DNA
polymerase η, DNA polymerase ι, DNA polymerase κ, and Rev1). In
eukaryotes, multiple bypass polymerases participate in translesion
synthesis, and a processivity factor, proliferating cell nuclear antigen
("PCNA"), is also required and can be included in a sequencing reaction.

[0085]Monitoring reactions in which a template comprises damage or other
lesions generates data that can be statistically analyzed to determine
the number and locations of lesions in the template, and can potentially
identify the type of lesion. Since the portion of the nascent strand
corresponding to the site of the lesion in the template is synthesized by
a bypass polymerase, the sequence reads generated therefrom are expected
to be less reliable than those generated from regions of the nascent
strand synthesized by the replicative polymerase. As such, redundancy in
the sequencing reaction may be a preferred means of generating
complete and accurate sequence reads. Redundancy can be achieved in
various ways, including carrying out multiple sequencing reactions using
the same original template, e.g., in an array format, e.g., a ZMW array.
In some embodiments in which a lesion is unlikely to occur in all the
copies of a given template, the sequence data generated in the multiple
reactions can be combined and subjected to statistical analysis to
determine a consensus sequence for the template. In this way, the
sequence data generated by processing the template with a lower fidelity
bypass polymerase can be supplemented and/or corrected with sequence data
generated by processing the same template with a higher fidelity
replicative polymerase. Alternatively or additionally, a template can be
subjected to repeated sequencing reactions to generate redundant sequence
information that can be analyzed to more thoroughly characterize the
lesion(s) present in the template, e.g., by using a single-stranded
circular template nucleic acid immobilized at a reaction site by various
methods described elsewhere herein. Methods for template damage detection
and bypass are further described in U.S. Ser. No. 61/186,661, filed Jun.
12, 2009, and incorporated herein by reference in its entirety for all
purposes.
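
By way of illustration only, combining redundant reads into a consensus
sequence can be sketched as a per-position majority vote. The function name
consensus is hypothetical, and the sketch assumes that the reads have
already been aligned and are of equal length; it is a simple stand-in for
the statistical analysis described above, not a description of any
particular analysis method.

```python
from collections import Counter

def consensus(reads):
    """Majority-vote consensus over pre-aligned, equal-length reads."""
    if len({len(r) for r in reads}) != 1:
        raise ValueError("reads must be non-empty and of equal length")
    # zip(*reads) yields one column of base calls per template position.
    return "".join(
        Counter(column).most_common(1)[0][0] for column in zip(*reads)
    )

# Third read carries an erroneous call at position 2 (e.g., opposite a lesion).
reads = ["ACGTAC", "ACGTAC", "ACCTAC"]
call = consensus(reads)
```

Here the single discordant call is outvoted by the two concordant reads.
Positions at which no base has a clear majority would call for additional
reads or a more careful statistical treatment.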

[0087]Structure/function analysis has revealed that most DNA polymerases
comprise a separate exonuclease domain. Many DNA polymerase enzymes have
been modified in any of a variety of ways, e.g., to reduce or eliminate
exonuclease activities (many native DNA polymerases have a proof-reading
exonuclease function that interferes with, e.g., sequencing
applications), to simplify production by making protease digested enzyme
fragments such as the Klenow fragment recombinant, etc. DNA polymerases
have also been modified to confer improvements in specificity,
processivity, and improved retention time of labeled nucleotides in
polymerase-DNA-nucleotide complexes (e.g., WO 2007/076057 POLYMERASES FOR
NUCLEOTIDE ANALOG INCORPORATION by Hanzel et al., and PCT/US2007/022459
POLYMERASE ENZYMES AND REAGENTS FOR ENHANCED NUCLEIC ACID SEQUENCING by
Rank et al.), to improve surface-immobilized enzyme activities (e.g., WO
2007/075987 ACTIVE SURFACE COUPLED POLYMERASES by Hanzel et al., and WO
2007/076057 PROTEIN ENGINEERING STRATEGIES TO OPTIMIZE ACTIVITY OF
SURFACE ATTACHED PROTEINS by Hanzel et al.), to increase closed complex
stability and/or reduce branching rate (e.g., 61/072,645 GENERATION OF
POLYMERASES WITH IMPROVED CLOSED COMPLEX STABILITY AND DECREASED
BRANCHING RATE by Clark, et al.), and to reduce susceptibility to
photodamage (e.g., 61/072,643 ENZYMES RESISTANT TO PHOTODAMAGE by
Bjornson, et al.). Any of these available polymerases can be included with
the surface-immobilized template localizing moiety in the compositions,
methods or systems of the invention to, e.g., improve the accuracy of
sequencing data and/or increase the read lengths of sequencing reactions.

[0088]Many such polymerases are available, e.g., for use in sequencing,
labeling and amplification technologies. For example, Human DNA
Polymerase Beta is available from R&D Systems. DNA polymerase I is
available from Epicentre, GE Health Care, Invitrogen, New England
Biolabs, Promega, Roche Applied Science, Sigma Aldrich and many others.
The Klenow fragment of DNA Polymerase I is available in both recombinant
and protease digested versions, from, e.g., Ambion, Chimerx, eEnzyme LLC,
GE Health Care, Invitrogen, New England Biolabs, Promega, Roche Applied
Science, Sigma Aldrich and many others. Φ29 DNA polymerase is
available from e.g., Epicentre. Poly A polymerase, reverse transcriptase,
Sequenase, SP6 DNA polymerase, T4 DNA polymerase, T7 DNA polymerase, and
a variety of thermostable DNA polymerases (Taq, hot start, titanium Taq,
etc.) are available from a variety of these and other sources. Recent
commercial DNA polymerases include Phusion® High-Fidelity DNA
Polymerase, available from New England Biolabs; GoTaq® Flexi DNA
Polymerase, available from Promega; RepliPHI® Φ29 DNA Polymerase,
available from Epicentre Biotechnologies; PfuUltra® Hotstart DNA
Polymerase, available from Stratagene; KOD HiFi DNA Polymerase, available
from Novagen; and many others. Biocompare(dot)com provides comparisons of
many different commercially available polymerases.

[0089]DNA polymerases that are preferably included in the methods,
compositions, and/or systems of the invention, e.g., to increase the read
lengths of sequencing reactions, include Taq polymerases, exonuclease
deficient Taq polymerases, E. coli DNA Polymerase I, Klenow fragment,
reverse transcriptases, Φ29 related polymerases including wild type
Φ29 polymerase and derivatives of such polymerases such as
exonuclease deficient forms, T7 DNA polymerase, T5 DNA polymerase, an
RB69 polymerase, etc. Further, in certain preferred embodiments,
polymerases that are preferably included in the methods, compositions,
and/or systems of the invention are capable of strand displacement. A
variety of strand displacing polymerase enzymes are readily available,
including, for example, Φ29 polymerase and Φ29-type polymerases
(See, e.g., U.S. Pat. Nos. 5,001,050, 5,576,204, the full disclosures of
which are incorporated herein by reference in their entirety for all
purposes), Bst polymerase (available from New England Biolabs), as well
as those polymerases described in commonly owned International Patent
Application Nos. WO 2007/075987, WO 2007/075873, WO 2007/076057 the full
disclosures of which are incorporated herein by reference in their
entirety for all purposes.

[0090]In one aspect, the polymerase that is included with an immobilized
template localizing moiety in the methods, compositions and/or systems of
the invention is a Φ29-type DNA polymerase. For example, the modified
recombinant DNA polymerase can be homologous to a wild-type or
exonuclease deficient Φ29 DNA polymerase, e.g., as described in U.S.
Pat. Nos. 5,001,050, 5,198,543, or 5,576,204. Alternatively, the DNA polymerase
of the methods, systems, and/or compositions can be homologous to other
Φ29-type DNA polymerases, such as B103, GA-1, PZA, Φ15, BS32,
M2Y, Nf, G1, Cp-1, PRD1, PZE, SF5, Cp-5, Cp-7, PR4, PR5, PR722, L17,
Φ21, or the like. For nomenclature, see also, Meijer et al. (2001)
"Φ29 Family of Phages" Microbiology and Molecular Biology Reviews,
65(2): 261-287.

[0091]In addition to wild-type polymerases, chimeric polymerases made from
a mosaic of different sources can be included in the compositions and
methods described herein. For example, Φ29 polymerases made taking
sequences from more than one parental polymerase into account can be used
as a starting point for mutation to produce the polymerases of the
invention. This can be done, e.g., using consideration of similarity regions
between the polymerases to define consensus sequences that are used in
the chimera, or using gene shuffling technologies in which multiple
Φ29-related polymerases are randomly or semi-randomly shuffled via
available gene shuffling techniques (e.g., via "family gene shuffling";
see Crameri et al. (1998) "DNA shuffling of a family of genes from
diverse species accelerates directed evolution" Nature 391:288-291;
Clackson et al. (1991) "Making antibody fragments using phage display
libraries" Nature 352:624-628; Gibbs et al. (2001) "Degenerate
oligonucleotide gene shuffling (DOGS): a method for enhancing the
frequency of recombination with family shuffling" Gene 271:13-20; and
Hiraga and Arnold (2003) "General method for sequence-independent
site-directed chimeragenesis" J. Mol. Biol. 330:287-296). In these
methods, the recombination points can be predetermined such that the gene
fragments assemble in the correct order. However, the combinations, e.g.,
chimeras, can be formed at random. Using the methods described above, a
chimeric polymerase, e.g., comprising segments of a B103 polymerase, a
GA-1 polymerase, a PZA polymerase, a Φ15 polymerase, a BS32
polymerase, a M2Y polymerase, an Nf polymerase, a G1 polymerase, a Cp-1
polymerase, a PRD1 polymerase, a PZE polymerase, an SF5 polymerase, a
Cp-5 polymerase, a Cp-7 polymerase, a PR4 polymerase, a PR5 polymerase, a
PR722 polymerase, an L17 polymerase, and/or a Φ21 polymerase, can be
generated for use with template localizing moieties in compositions and
methods provided by the invention.

[0092]As described above, template localization moieties are also useful
in exonuclease sequencing applications. Briefly, exonuclease sequencing
determines the sequence of a nucleic acid by degrading the nucleic acid
unilaterally from a first end with an exonuclease to sequentially release
individual nucleotides. Each of the sequentially released nucleotides is
identified, e.g., by mass spectrometry, and the sequence of the nucleic
acid is determined from the sequence of released nucleotides. Various
exonucleases known in the art are useful for exonuclease sequencing,
including but not limited to T7 exonuclease, ExoIII, ExoVII, mung bean
nuclease, lambda exonuclease, and the exonuclease activity of various
polymerases (e.g., Klenow, pol I, Taq polymerase, and T4 polymerase).
Sequencing by exonuclease degradation is described further, e.g., in U.S.
Pat. Nos. 5,622,824 and 5,516,633; and in international application no.
PCT/US1994/003416. A template nucleic acid immobilized by a template
localizing moiety can be subjected to degradation by an exonuclease and
the resulting free nucleotides can be detected by methods known in the
art, including mass spectrometry, optical detection of fluorescent or
luminescent labels on the released nucleotides, passage through a
nanopore, etc.

[0093]In further embodiments, a combination of an exonuclease and a
polymerase can be used to determine the sequence of a template nucleic
acid, e.g., by subjecting a single-stranded circular template nucleic
acid to rolling circle amplification by the polymerase, degrading the
resulting nascent strand with an exonuclease, and detecting the release
of nucleotides. This method provides an added benefit by allowing
repeated sequencing of the circular template since the exonuclease acts
only on the nascent strand.
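
By way of illustration only, the repeated copies of template information
carried by a rolling circle product can be sketched as follows. The
function names are hypothetical, the template is written 5' to 3' from an
arbitrary linearization point, and synthesis is assumed to begin at that
point and to complete a whole number of passes around the circle.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence written 5' to 3'."""
    return seq.translate(COMPLEMENT)[::-1]

def rolling_circle_product(circular_template, passes):
    """Nascent strand (5' to 3') after `passes` full trips around the circle."""
    return reverse_complement(circular_template) * passes

def split_passes(product, template_length):
    """Cut the nascent strand into one copy per pass around the template."""
    return [
        product[i:i + template_length]
        for i in range(0, len(product), template_length)
    ]

product = rolling_circle_product("AACG", 3)
copies = split_passes(product, 4)
```

Each per-pass copy is an independent observation of the same template, so
the nucleotides released from successive copies can be compared with one
another to obtain redundant sequence information.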

Further Details Regarding Nucleic Acid Amplification and Sequencing

[0094]The compositions of the invention, e.g., surface-immobilized
template localizing moieties, can be used in combination with a sequencing
enzyme to sequence a template nucleic acid. In certain embodiments, the
sequencing enzyme can associate with the template localizing moiety in a
non-covalent manner or bind the moiety via a reversibly cleavable linker,
e.g., a linker that can reform with a new sequencing enzyme. Thus, the
template can advantageously be sequenced in a manner that permits the
exchange of a first, e.g., inactive, sequencing enzyme, with a second,
e.g., active, sequencing enzyme, without disrupting the sequencing
reaction. For example, during template-dependent synthesis of a nascent
nucleic acid, an inactive polymerase can be replaced by an active
polymerase, allowing stalled nascent strand synthesis to reinitiate. In
other embodiments of the sequencing reactions provided by the invention,
a sequencing enzyme can be covalently bound to the immobilized template
localizing moiety, e.g., at the C-terminal end of a polymerase (see,
e.g., FIG. 4).

[0095]The template nucleic acid can be a linear or circular molecule, and in
certain applications, is desirably a circular template (e.g., for rolling
circle replication or for sequencing of circular templates), as shown in
FIGS. 2 and 3. Optionally, the composition can be present in an automated
nucleic acid synthesis and/or sequencing system. A template nucleic acid
can be double-stranded or single-stranded, and can comprise DNA, RNA,
analogs and/or derivatives thereof, and combinations of the same. A
template nucleic acid can comprise chemical modifications (e.g., labels,
nucleotide analogs or derivatives, etc.).

[0096]For template-directed sequencing-by-synthesis reactions, a
replication initiating moiety in the reaction mixture can be a standard
complementary oligonucleotide primer, or, alternatively, a component of
the template, e.g., the template can be a self-priming single-stranded
DNA, a nicked double-stranded DNA, or the like. Such an oligonucleotide
primer can comprise native or modified nucleotides, or derivatives,
analogs, and/or combinations thereof. Similarly, a terminal protein can
serve as an initiating moiety. At least one nucleotide analogue can be
incorporated into the DNA. Additional details of and methods for
sequencing by incorporation methods are known in the art, e.g., in U.S.
Pat. Nos. 6,787,308, 6,255,083, 5,547,839, and 6,210,896; U.S.S.N.
2004/0152119, 2003/0096253, 2004/0224319, 2004/0048300, 2003/0190647, and
2003/0215862; and international application nos. WO/1996/027025,
WO/1999/005315, and WO/1991/006678, all of which are incorporated herein
by reference in their entireties for all purposes.

[0097]The compositions of the invention can localize the incorporation of
labeled nucleotides/analogs to a defined reaction region. This can be
particularly beneficial in a variety of different nucleic acid
analyses, including real-time monitoring of DNA polymerization and
degradation. For example, a fluorescent or chemiluminescent label can be
incorporated, or more preferably, can be released during incorporation of
the analogue into a nascent nucleic acid strand. For example, analogue
incorporation can be monitored in real-time by monitoring label release
during incorporation of the analogue by a polymerase that can exchange
with a second polymerase in the reaction mixture, e.g., without
terminating the sequence read. The portion of a nucleotide analogue that
is incorporated, e.g., into the copied nucleic acid, can be the same as a
natural nucleotide, or can include features of the analogue that differ
from a natural nucleotide. Alternatively or additionally, other methods
for detection of nucleotide incorporation may be employed, e.g.,
luciferase-mediated detection of released pyrophosphate.

[0098]In general, label incorporation or release can be used to indicate
the presence and composition of a growing nucleic acid strand, e.g.,
providing evidence of template-directed synthesis/amplification and/or
sequence of the template. Signaling from the incorporation can be the
result of detecting labeling groups that are liberated from the
incorporated analogue, e.g., in a solid phase assay, or can arise upon
the incorporation reaction. For example, in the case of FRET labels where
a bound label is quenched and a free label is not, release of a label
group from the incorporated analogue can give rise to a fluorescent
signal. Alternatively, polymerases present in a sequencing reaction
mixture, e.g., that can be exchanged during the sequencing reaction, may
be labeled with one member of a FRET pair proximal to the active site,
and incorporation of an analogue bearing the other member will allow
energy transfer upon incorporation. The use of enzyme bound FRET
components in nucleic acid sequencing applications is described, e.g., in
U.S. Patent Application Publication No. 2003/0044781, incorporated herein
by reference.

[0099]In one example reaction of interest, a surface-bound template
localizing moiety can be used to isolate a nucleic acid polymerization
reaction within an extremely small observation volume that effectively
results in observation of individual template-directed synthesis
reactions. As a result, the incorporation event provides observation of
an incorporating nucleotide analogue that is readily distinguishable from
non-incorporated nucleotide analogues. That is, when a polymerase
incorporates complementary, fluorescently labeled nucleotides into the
nucleic acid strand that is being synthesized, the enzyme holds each
nucleotide within the detection volume for tens of milliseconds, e.g.,
orders of magnitude longer than the amount of time it takes an
unincorporated nucleotide to diffuse in and out of the detection volume.
As described above, the polymerase can be exchanged with a second
polymerase in the reaction mixture without terminating the sequence of
incorporation events.
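
By way of non-limiting illustration, the dwell-time distinction described
above can be sketched in software. The following Python sketch is not part
of the disclosed system; the Pulse record, the channel names, and the 10 ms
threshold are hypothetical values chosen only to show how a pulse held for
tens of milliseconds might be separated from a brief diffusion transit.

```python
from dataclasses import dataclass


@dataclass
class Pulse:
    """One fluorescence pulse seen in the observation volume (hypothetical record)."""
    start_ms: float
    end_ms: float
    channel: str  # hypothetical label-channel name

    @property
    def dwell_ms(self) -> float:
        return self.end_ms - self.start_ms


def incorporation_pulses(pulses, min_dwell_ms=10.0):
    """Keep pulses whose dwell time meets the (assumed) threshold."""
    return [p for p in pulses if p.dwell_ms >= min_dwell_ms]


pulses = [
    Pulse(0.0, 0.02, "ch1"),    # brief transit of a freely diffusing analogue
    Pulse(5.0, 35.0, "ch2"),    # held for about 30 ms
    Pulse(40.0, 40.05, "ch3"),  # brief transit
    Pulse(60.0, 85.0, "ch1"),   # held for about 25 ms
]

kept = incorporation_pulses(pulses)
print([p.channel for p in kept])  # ['ch2', 'ch1']
```

A fixed threshold is the simplest possible choice; in practice the cutoff
would depend on the enzyme kinetics and the detector frame rate.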

[0100]In a preferred aspect, such small observation volumes are provided
by immobilizing the template localizing moiety within an optical
confinement, such as a Zero Mode Waveguide (ZMW). For a description of
ZMWs and their application in single molecule analyses, and particularly
nucleic acid sequencing, see, e.g., U.S. Patent Application Publication
No. 2003/0044781, and U.S. Pat. No. 6,917,726, each of which is
incorporated herein by reference in its entirety for all purposes. See
also Levene et al. (2003) "Zero-mode waveguides for single-molecule
analysis at high concentrations" Science 299:682-686 and U.S. Pat. Nos.
7,056,676, 7,056,661, 7,052,847, and 7,033,764, the full disclosures of
which are incorporated herein by reference in their entirety for all
purposes. Although various embodiments of the invention are described
primarily in terms of zero-mode waveguide substrates, other types of
substrates comprising appropriately configured reaction regions are known
in the art and useful with the methods, compositions, and systems
described herein, including but not limited to waveguide substrates, TIRF
substrates, and the like. See, e.g., U.S. Patent Publication No.
20080128627; and U.S. Ser. No. 61/192,326, filed Sep. 16, 2009, both of
which are incorporated herein by reference in their entireties for all
purposes.

[0101]A surface-immobilized template localizing moiety can fix the
template strand within, e.g., a ZMW, in the presence of one or more
nucleotides and/or one or more nucleotide analogues, e.g., fluorescently
labeled nucleotides or nucleotide analogs. For example, in certain
embodiments, labeled analogues are present representing analogous
compounds to each of the four natural nucleotides, A, T, G and C, e.g.,
in separate polymerase reactions, as in classical Sanger sequencing, or
multiplexed together, e.g., in a single reaction, as in multiplexed
sequencing approaches. When a particular base in the template strand is
encountered by a polymerase during the polymerization reaction, it
complexes with an available analogue that is complementary to such
nucleotide, and incorporates that analogue into the nascent and growing
nucleic acid strand. In one aspect, incorporation can result in a label
being released, e.g., in polyphosphate analogues, cleaving between the
α and β phosphorus atoms in the analogue, and consequently
releasing the labeling group (or a portion thereof). The incorporation
event is detected, either by virtue of a longer presence of the analogue
and, thus, the label, in the complex, or by virtue of release of the
label group into the surrounding medium. Where different labeling groups
are used for each of the types of analogues, e.g., A, T, G or C,
identification of a label of an incorporated analogue allows
identification of that analogue and consequently, determination of the
complementary nucleotide in the template strand being processed at that
time. Sequential reaction and monitoring permits a real-time monitoring
of the polymerization reaction and determination of the sequence of the
template nucleic acid.
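
As a non-limiting illustration of the identification step described above,
the following Python sketch maps a series of detected labels to the
incorporated analogues and then to the complementary template bases. The
label names (dye_A, etc.) and the function names are hypothetical, and the
sketch assumes a DNA template and one distinct label per analogue type.

```python
# Hypothetical mapping from detected label to the analogue it identifies.
LABEL_TO_BASE = {"dye_A": "A", "dye_T": "T", "dye_G": "G", "dye_C": "C"}
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def call_nascent_strand(labels):
    """Translate the ordered labels into the nascent strand, 5'->3'."""
    return "".join(LABEL_TO_BASE[label] for label in labels)


def template_from_nascent(nascent):
    """Template strand (5'->3') is the reverse complement of the nascent strand."""
    return "".join(COMPLEMENT[base] for base in reversed(nascent))


labels = ["dye_A", "dye_C", "dye_G", "dye_G", "dye_T"]
nascent = call_nascent_strand(labels)
template = template_from_nascent(nascent)
print(nascent)   # ACGGT
print(template)  # ACCGT
```

The reversal reflects that the polymerase reads the template 3'->5' while
extending the nascent strand 5'->3'.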

[0102]As noted above, in particularly preferred aspects, the template
localizing moiety, e.g., that is configured to interact with a
polymerase, is provided immobilized within an optical confinement that
permits observation of an individual template-dependent synthesis
reaction in, e.g., a Zero-Mode Waveguide. An immobilized template
localizing moiety that fixes a template to a surface can beneficially provide
longer and more accurate sequence reads in that, e.g., a polymerase that
has sustained photodamage as a result of exposure to the optical energy
of the fluorescently labeled nucleotides or nucleotide analogues present
in the reaction mix can exchange with, e.g., a non-photodamaged
polymerase, during a template-dependent polymerization reaction.

[0103]In addition to their use in sequencing, the surface-immobilized
template localizing moieties of the invention are also useful in a
variety of other analyses, e.g., real time monitoring of amplification,
e.g., real-time-PCR methods, and the like. For example, real-time nucleic
acid amplification reactions that include one or very few nucleic acid
template molecules can be performed more efficiently if the template and
polymerase are co-localized, e.g., by a surface-immobilized template
localizing moiety, e.g., that has been configured to interact with a
polymerase. Further details regarding sequencing and nucleic acid
amplification can be found in, e.g., Berger and Kimmel, Guide to Molecular
Cloning Techniques. Methods in Enzymology volume 152 Academic Press,
Inc., San Diego, Calif. (Berger); Sambrook et al., Molecular Cloning--A
Laboratory Manual (3rd Ed.), Vol. 1-3, Cold Spring Harbor Laboratory,
Cold Spring Harbor, N.Y., 2001 ("Sambrook"); Current Protocols in
Molecular Biology, F. M. Ausubel et al., eds., Current Protocols, a joint
venture between Greene Publishing Associates, Inc. and John Wiley & Sons,
Inc ("Ausubel"); Kaufman et al. (2003) Handbook of Molecular and Cellular
Methods in Biology and Medicine Second Edition Ceske (ed) CRC Press
(Kaufman); and The Nucleic Acid Protocols Handbook Ralph Rapley (ed)
(2000) Cold Spring Harbor, Humana Press Inc (Rapley).

Further Details Regarding Integration of Methods/Compositions into High
Throughput Sequencing Systems

[0104]The methods and compositions provided by the invention can
advantageously be integrated with systems that can, e.g., automate and/or
multiplex the sequencing reactions comprising a surface-immobilized
template localizing moiety. Systems of the invention can include one or
more modules, e.g., that automate a method herein, e.g., for
high-throughput sequencing applications. Such systems can include
fluid-handling elements and controllers that move reaction components
into contact with one another, signal detectors, system
software/instructions, e.g., to convert a sequence of fluorescent signals
into nucleotide sequence information, and the like.

[0105]Systems provided by the invention include a reaction region in which
a template localizing moiety has been immobilized, e.g., with a covalent
bond. The template localizing moiety in the reaction region can
optionally be configured to interact with a sequencing enzyme, e.g., any
one of the sequencing enzymes described herein. The one or more
single-molecule reaction regions of the system can optionally include a
sequencing enzyme, which, in certain embodiments of the systems, can be
covalently linked to the surface-immobilized template localizing moiety,
e.g., via a polymerase's C-terminal end (see FIG. 4) or linked, e.g., via
a reversibly cleavable linker, e.g., a linker that can reform with a new
sequencing enzyme.

[0106]In preferred embodiments, the sequencing enzyme can form a
non-covalent complex with the template localizing moiety in the reaction
region such that the sequencing enzyme can exchange with a second
sequencing enzyme present, e.g., in a reaction mixture, without
interrupting the sequencing reaction. This can beneficially provide
longer and more accurate sequence reads in that, e.g., a sequencing
enzyme that has sustained photodamage as a result of exposure to the
optical energy of the fluorescently labeled nucleotides or nucleotide
analogues present in the reaction mix can exchange with, e.g., a
non-photodamaged sequencing enzyme, during a sequencing reaction.
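
The effect of enzyme exchange on read length can be illustrated with a toy
simulation. The following Python sketch is a simplified model under stated
assumptions and is not a description of any measured behavior: each enzyme
is assumed to sustain photodamage with a fixed probability per incorporated
base, a damaged enzyme is assumed to be replaced immediately by an
undamaged one, and the template length, damage probability, and number of
permitted exchanges are hypothetical. The model addresses read length only,
not accuracy.

```python
import math
import random


def enzyme_lifetime(rng, p_damage):
    """Bases incorporated by one enzyme before photodamage (geometric draw)."""
    u = 1.0 - rng.random()  # lies in (0, 1], so log(u) is defined
    return int(math.log(u) / math.log(1.0 - p_damage))


def read_length(rng, template_len, p_damage, max_exchanges):
    """Read length when up to max_exchanges fresh enzymes can take over."""
    total = 0
    for _ in range(max_exchanges + 1):
        total += enzyme_lifetime(rng, p_damage)
        if total >= template_len:
            return template_len  # end of template reached
    return total


def mean_read_length(template_len, p_damage, max_exchanges, trials=2000, seed=0):
    """Average simulated read length over a number of independent reads."""
    rng = random.Random(seed)
    reads = [read_length(rng, template_len, p_damage, max_exchanges)
             for _ in range(trials)]
    return sum(reads) / trials


no_exchange = mean_read_length(10_000, 0.001, max_exchanges=0)
with_exchange = mean_read_length(10_000, 0.001, max_exchanges=3)
print(f"mean read, no exchange: {no_exchange:.0f} bases")
print(f"mean read, 3 exchanges: {with_exchange:.0f} bases")
```

Under these assumptions the read without exchange ends at the first
photodamage event, whereas each permitted exchange adds another enzyme
lifetime to the same read.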

[0107]The reaction region can optionally comprise a planar surface, well,
or one or more single-molecule reaction regions. In preferred embodiments,
the reaction region can optionally comprise one or more Zero Mode
Waveguides (ZMWs). (See, e.g., Levene et al. (2003) "Zero-mode waveguides
for single-molecule analysis at high concentrations" Science 299:682-686
and U.S. Pat. Nos. 7,056,676, 7,056,661, 7,052,847, and 7,033,764, the
full disclosures of which are incorporated herein by reference in their
entirety for all purposes.)

[0108]Systems of the invention can optionally include modules that provide
for detection or tracking of products, e.g., fluorescent light from one
or more fluorophores linked to a nucleotide or nucleotide analog
that is being incorporated into a growing nucleic acid. Detectors can
include spectrophotometers, epifluorescent detectors, CCD arrays, CMOS
arrays, microscopes, cameras, or the like. Optical labeling is
particularly useful because of the sensitivity and ease of detection of
these labels, as well as their relative handling safety, and the ease of
integration with available detection systems (e.g., using microscopes,
cameras, photomultipliers, CCD arrays, CMOS arrays and/or combinations
thereof). High-throughput analysis systems using optical labels include
DNA sequencers, array readout systems, cell analysis and sorting systems,
and the like. For a brief overview of fluorescent products and
technologies see, e.g., Sullivan (ed) (2007) Fluorescent Proteins, Volume
85, Second Edition (Methods in Cell Biology)
ISBN-10: 0123725585; Hof et al. (eds) (2005) Fluorescence Spectroscopy
in Biology: Advanced Methods and their Applications to Membranes,
Proteins, DNA, and Cells (Springer Series on Fluorescence) ISBN-10:
354022338X; Haugland (2005) Handbook of Fluorescent Probes and Research
Products, 10th Edition (Invitrogen, Inc./Molecular Probes); BioProbes
Handbook, (2002) from Molecular Probes, Inc.; and Valeur (2001) Molecular
Fluorescence: Principles and Applications Wiley ISBN-10: 352729919X.
System software, e.g., instructions running on a computer, can be used to
track and inventory reactants or products, and/or for controlling
robotics/fluid handlers to achieve transfer between system
stations/modules. The overall system can optionally be integrated into a
single apparatus, or can consist of multiple apparatus with overall
system software/instructions providing an operable linkage between
modules.

Kits

[0109]The present invention also provides kits that incorporate the
compositions of the invention. Such kits can include, e.g., a template
localizing moiety packaged in a fashion to permit its covalent binding to
a surface of interest. Alternatively, the surface bound template
localizing moieties can be provided as components of the kits, or the
surface can be provided with binding partners suitable to bind the
template localizing moieties, which are optionally packaged separately.
Instructions for making or using surface bound template localizing
moieties are an optional feature of the invention.

[0110]The template localizing moieties provided in such kits can also
comprise a polynucleotide complementary to a polynucleotide sequence of
interest in a template nucleic acid to facilitate selective
immobilization of a subset of template nucleic acids having one or more
particular polynucleotide sequences of interest (e.g., exonic or intronic
regions, regulatory regions, and the like). For example, a kit can
comprise a pool of template localizing moieties having polynucleotide
regions complementary to a set of genetic loci known to predict
susceptibility to a given disease, identify an unknown microorganism,
determine paternity, or inform other forensic, medical, and agricultural
analyses. Only genomic fragments having one or more of those genetic loci
of interest will be targeted and immobilized by the template localizing
moieties, and subsequently subjected to sequence analysis, thereby
allowing selective analysis of a subset of a complex genomic sample and a
reduction in the complexity of the data set so generated.
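
As a non-limiting illustration of such selective immobilization, the
following Python sketch filters a pool of fragments, keeping only those
that contain a region complementary to at least one capture
polynucleotide. The sequences and function names are hypothetical, and the
sketch assumes single-stranded DNA fragments and exact, full-length
complementarity, which is a simplification of real hybridization.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence written 5'->3'."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def captured_fragments(fragments, capture_probes):
    """Keep fragments containing a site complementary to any capture probe."""
    targets = [reverse_complement(probe) for probe in capture_probes]
    return [frag for frag in fragments
            if any(target in frag for target in targets)]


capture_probes = ["GATTACA"]  # hypothetical capture sequence on the moiety
fragments = [
    "CCTGTAATCGG",   # contains TGTAATC, complementary to the probe
    "AAAAGGGGCCCC",  # no matching site
    "TTGATTACATT",   # contains the probe sequence itself, not its complement
]

print(captured_fragments(fragments, capture_probes))  # ['CCTGTAATCGG']
```

Only the first fragment would proceed to sequence analysis in this sketch,
which mirrors the reduction in data set complexity described above.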

[0111]Such kits can also optionally include additional useful reagents
such as one or more nucleotide analogs, e.g., for sequencing, nucleic
acid amplification, or the like. For example, the kits can include a
sequencing enzyme packaged in such a manner as to enable its use with the
template localizing moiety, a set of different nucleotide analogs of the
invention, e.g., those that are analogous to A, T, G, and C, e.g., where
one or more of the analogs comprise a detectable moiety, to permit
identification in the presence of the analogs. The kits of the invention
can optionally include natural nucleotides, a control template, and other
reagents, such as buffer solutions and/or salt solutions, including,
e.g., divalent metal ions such as Mg++, Mn++ and/or Fe++,
standard solutions, e.g., dye standards for detector calibration, etc.
Such kits also typically include instructions for use of the compounds
and other reagents in accordance with the desired application methods,
e.g., nucleic acid sequencing, nucleic acid labeling, amplification,
enzymatic detection systems, and the like.

[0112]While the foregoing invention has been described in some detail for
purposes of clarity and understanding, it will be clear to one skilled in
the art from a reading of this disclosure that various changes in form
and detail can be made without departing from the true scope of the
invention. For example, all the techniques and apparatus described above
can be used in various combinations. All publications, patents, patent
applications, and/or other documents cited in this application are
incorporated by reference in their entirety for all purposes to the same
extent as if each individual publication, patent, patent application,
and/or other document were individually indicated to be incorporated by
reference for all purposes.