Education

Research summary

My research focuses on the development of rapid and cost-effective sequencing methods based on synthesis of nucleoside derivatives. This would be achieved by multiplex PCR using primers consisting of nucleoside derivatives which form base pairs with natural nucleobases and/or synthetic nucleobases in different hydrogen bonding patterns.

Synthetic nucleobases presenting non-Watson-Crick arrangements of hydrogen bond donor and acceptor groups can form additional nucleotide pairs that stabilize duplex DNA independent of the standard A:T and G:C pairs. The pair between 2-amino-3-nitropyridin-6-one 2'-deoxyriboside (presenting a {donor-donor-acceptor} hydrogen bonding pattern on the Watson-Crick face of the small component, trivially designated Z) and imidazo[1,2-a]-1,3,5-triazin-4(8H)one 2'-deoxyriboside (presenting an {acceptor-acceptor-donor} hydrogen bonding pattern on the large component, trivially designated P) is one of these extra pairs for which a substantial amount of molecular biology has been developed. Here, we report the results of UV absorbance melting measurements and determine the energetics of binding of DNA strands containing Z and P to give short duplexes containing Z:P pairs as well as various mismatches comprising Z and P. All measurements were done at 1 M NaCl in buffer (10 mM Na cacodylate, 0.5 mM EDTA, pH 7.0). Thermodynamic parameters (ΔH°, ΔS°, and ΔG°37) for oligonucleotide hybridization were extracted. Consistent with the Watson-Crick model that considers both geometric and hydrogen bonding complementarity, the Z:P pair was found to contribute more to duplex stability than any mismatches involving either nonstandard nucleotide. Further, the Z:P pair is more stable than a C:G pair. The Z:G pair was found to be the most stable mismatch, forming either a deprotonated mismatched pair or a wobble base pair analogous to the stable T:G mismatch. The C:P pair is less stable, perhaps analogous to the wobble pair observed for C:O6-methyl-G, in which the pyrimidine is displaced into the minor groove. The Z:A and T:P mismatches are much less stable. Parameters for predicting the thermodynamics of oligonucleotides containing Z and P bases are provided. This represents the first case where this has been done for a synthetic genetic system.
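
As an illustration of how the three reported parameters relate, the sketch below computes ΔG°37 from ΔH° and ΔS° and estimates a two-state melting temperature for a non-self-complementary duplex. The numerical values are hypothetical placeholders, not measured Z:P data, and the function names are chosen here for illustration only.

```python
import math

R_CAL = 1.987    # gas constant in cal/(mol*K)
T37_K = 310.15   # 37 degrees C expressed in kelvin


def delta_g37(dh_kcal, ds_cal):
    """Return dG(37 C) in kcal/mol from dH (kcal/mol) and dS (cal/(mol*K))."""
    return dh_kcal - T37_K * ds_cal / 1000.0


def melting_temp_c(dh_kcal, ds_cal, total_strand_conc_m):
    """Two-state Tm in degrees C for a non-self-complementary duplex.

    Assumes both strands are present at equal concentration and that
    total_strand_conc_m is the total strand concentration in mol/L.
    """
    denom = ds_cal + R_CAL * math.log(total_strand_conc_m / 4.0)
    return dh_kcal * 1000.0 / denom - 273.15


# Hypothetical parameters for a short duplex (NOT values from the study).
dh, ds = -60.0, -165.0
dg = delta_g37(dh, ds)
tm = melting_temp_c(dh, ds, 1e-5)
print(f"dG37 = {dg:.2f} kcal/mol, Tm = {tm:.1f} C")
```

Because ΔS° is conventionally reported in cal/(mol·K) while ΔH° and ΔG° are in kcal/mol, the factor of 1000 in both functions is the step most easily lost when applying such parameters.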

One frontier in synthetic biology seeks to move artificially
expanded genetic information systems (AEGIS) into natural living cells and to
arrange the metabolism of those cells to allow them to replicate plasmids built
from these unnatural genetic systems. In addition to requiring polymerases that
replicate AEGIS oligonucleotides, such cells require metabolic pathways that
biosynthesize the triphosphates of AEGIS nucleosides, the substrates for those
polymerases. Such pathways generally require nucleoside and nucleotide kinases
to phosphorylate AEGIS nucleosides and nucleotides on the path to these
triphosphates. Thus, constructing such pathways focuses on engineering natural
nucleoside and nucleotide kinases, which often do not accept the unnatural
AEGIS biosynthetic intermediates. This, in turn, requires assays that allow the
enzyme engineer to follow the kinase reaction, assays that are easily confused by
ATPase and other spurious activities that might arise through "site-directed
damage" of the natural kinases being engineered. This article introduces three assays that can detect the formation of both natural
and unnatural deoxyribonucleoside triphosphates, assessing their value as polymerase substrates at the same time as monitoring
the progress of kinase engineering. Here, we focus on two complementary AEGIS nucleoside diphosphates,
6-amino-5-nitro-3-(1'-β-D-2'-deoxyribofuranosyl)-2(1H)-pyridone and 2-amino-8-(1'-β-D-2'-deoxyribofuranosyl)-imidazo[1,2-a]-1,3,5-triazin-4(8H)-one.
These assays provide new ways to detect the formation of unnatural deoxyribonucleoside triphosphates in vitro
and to confirm their incorporation into DNA. Thus, these assays can be used with other unnatural nucleotides.

In its "grand challenge" format in chemistry, "synthesis" as an activity sets out a goal that is
substantially beyond current theoretical and technological capabilities. In pursuit of this
goal, scientists are forced across uncharted territory, where they must answer unscripted
questions and solve unscripted problems, creating new theories and new technologies in
ways that would not be created by hypothesis-directed research. Thus, synthesis drives discovery
and paradigm changes in ways that analysis cannot. Described here are the products
that have arisen so far through the pursuit of one grand challenge in synthetic biology:
Recreate the genetics, catalysis, evolution, and adaptation that we value in life, but using
genetic and catalytic biopolymers different from those that have been delivered to us by
natural history on Earth. The outcomes in technology include new diagnostic tools that have
helped personalize the care of hundreds of thousands of patients worldwide. In science, the
effort has generated a fundamentally different view of DNA, RNA, and how they work.

Laboratory in vitro evolution (LIVE) might deliver
DNA aptamers that bind proteins expressed on the surface of
cells. In this work, we used cell engineering to place glypican 3
(GPC3), a possible marker for liver cancer theranostics, on the
surface of a liver cell line. Libraries were then built from a six-letter
genetic alphabet containing the standard nucleobases and
two added nucleobases (2-amino-8H-imidazo[1,2-a]-
[1,3,5]triazin-4-one and 6-amino-5-nitropyridin-2-one),
Watson-Crick complements from an artificially expanded
genetic information system (AEGIS). With counterselection
against non-engineered cells, eight AEGIS-containing aptamers
were recovered. Five bound selectively to GPC3-overexpressing
cells. This selection–counterselection scheme had
acceptable statistics, notwithstanding the possibility that cells
engineered to overexpress GPC3 might also express different
off-target proteins. This is the first example of such a combination.

This paper combines two advances to detect MERS-CoV, the causative agent of Middle East Respiratory Syndrome, that have emerged over the past few years from the new field of "synthetic biology". Both are based on an older concept, where molecular beacons are used for the downstream detection of viral RNA in biological mixtures followed by reverse transcription PCR amplification. The first advance exploits the artificially expanded genetic information systems (AEGIS). AEGIS adds nucleotides to the four found in standard DNA and RNA (xNA); AEGIS nucleotides pair orthogonally to the A:T and G:C pairs. Placing AEGIS components in the stems of molecular beacons is shown to lower noise by preventing unwanted stem invasion by adventitious natural xNA. This should improve the signal-to-noise ratio of molecular beacons operating in complex biological mixtures. The second advance introduces a nicking enzyme that allows a single target molecule to activate more than one beacon, allowing "signal amplification". Combining these technologies in primers with components of a self-avoiding molecular recognition system (SAMRS), we detect 50 copies of MERS-CoV RNA in a multiplexed respiratory virus panel by generating a fluorescence signal visible to the human eye and/or a camera.

Noroviruses are the major cause of global viral gastroenteritis, with short incubation times and small inoculums required for infection. This creates a need for a rapid molecular test for norovirus for early diagnosis, in the hope of preventing the spread of the disease. Non-chemists generally use off-the-shelf reagents and natural DNA to create such tests, suffering from background noise that comes from adventitious DNA and RNA (collectively xNA) that is abundant in real biological samples, especially feces, a common location for norovirus. Here, we create an assay that combines artificially expanded genetic information systems (AEGIS, which adds nucleotides to the four in standard xNA, pairing orthogonally to A:T and G:C) with loop-mediated isothermal amplification (LAMP) to amplify norovirus RNA at constant temperatures, without the power or instrument requirements of PCR cycling. This assay was then validated using feces contaminated with murine norovirus (MNV). Treating stool samples with ammonia extracts the MNV RNA, which is then amplified in an AEGIS-RT-LAMP where AEGIS segments are incorporated both into an internal LAMP primer and into a molecular beacon stem, the second lowering background signaling noise. This is coupled with RNase H nicking during sample amplification, allowing detection of as few as 10 copies of noroviral RNA in a stool sample, generating a fluorescent signal visible to the human eye, all in a closed reaction vessel.

In addition to completing the Watson-Crick nucleobase matching "concept" (big pairs with small,
hydrogen bond donors pair with hydrogen bond acceptors),
artificially expanded genetic information systems
(AEGIS) also challenge DNA polymerases with a
complete set of mismatches, including wobble mismatches.
Here, we explore wobble mismatches with AEGIS with
DNA polymerase 1 from Escherichia coli. Remarkably, we
find that the polymerase tolerates an AEGIS:standard
wobble that has the same geometry as the G:T wobble that
polymerases have evolved to exclude but excludes a wobble
geometry that polymerases have never encountered in
natural history. These results suggest certain limits to "structural analogy" and "evolutionary guidance" as tools
to help synthetic biologists expand DNA alphabets.

Reported here is a laboratory in vitro evolution (LIVE)
experiment based on an artificially expanded genetic
information system (AEGIS). This experiment delivers
the first example of an AEGIS aptamer that binds
to an isolated protein target, the first whose structural
contact with its target has been outlined and
the first to inhibit biologically important activities of
its target, the protective antigen from Bacillus anthracis.
We show how rational design based on secondary
structure predictions can also direct the use
of AEGIS to improve the stability and binding of the
aptamer to its target. The final aptamer has a dissociation
constant of ~35 nM. These results illustrate
the value of AEGIS-LIVE for those seeking to
obtain receptors and ligands without the complexities
of medicinal chemistry, and also challenge the
biophysical community to develop new tools to analyze
the spectroscopic signatures of new DNA folds
that will emerge in synthetic genetic systems replacing
standard DNA and RNA as platforms for LIVE.
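
To give a sense of what a dissociation constant of about 35 nM means in practice, the sketch below evaluates a simple 1:1 binding isotherm. It assumes the aptamer is in large excess over the target, so that free and total aptamer concentrations are approximately equal; the function name and the chosen concentrations are illustrative and are not taken from the study.

```python
def fraction_bound(aptamer_nm, kd_nm):
    """Equilibrium fraction of target bound for simple 1:1 binding.

    Assumes free aptamer concentration ~ total aptamer concentration
    (aptamer in large excess over target). Both arguments are in nM.
    """
    if aptamer_nm < 0 or kd_nm <= 0:
        raise ValueError("concentration must be >= 0 and Kd must be > 0")
    return aptamer_nm / (aptamer_nm + kd_nm)


KD_NM = 35.0  # approximate value quoted in the text above

# Illustrative aptamer concentrations spanning two orders of magnitude.
for conc in (3.5, 35.0, 350.0):
    print(f"{conc:6.1f} nM aptamer -> {fraction_bound(conc, KD_NM):.2f} of target bound")
```

Under this model, half of the target is bound when the aptamer concentration equals the dissociation constant.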

Axiomatically, the density of information
stored in DNA, with just four nucleotides (GACT), is
higher than in a binary code, but less than it might be if
synthetic biologists succeed in adding independently
replicating nucleotides to genetic systems. Such addition
could also add additional functional groups, not found in
natural DNA but useful for molecular performance. Here,
we consider two new nucleotides (Z and P, 6-amino-5-nitro-3-(1'-β-D-2'-deoxyribofuranosyl)-2(1H)-pyridone
and 2-amino-8-(1'-β-D-2'-deoxyribofuranosyl)-imidazo[1,2-a]-1,3,5-triazin-4(8H)-one).
These are designed to
pair via strict Watson-Crick geometry. These were added
to a laboratory in vitro evolution (LIVE) experiment; the
GACTZP library was challenged to deliver molecules that
bind selectively to liver cancer cells, but not to
untransformed liver cells. Unlike in classical in vitro
selection systems, low levels of mutation allow this system
to evolve to create binding molecules not necessarily
present in the original library. Over a dozen binding
species were recovered. The best had Z and/or P in their
sequences. Several had multiple, nearby, and adjacent Zs
and Ps. Only the weaker binders contained no Z or P at all.
This suggests that this system explored much of the
sequence space available to this genetic system and that
GACTZP libraries are richer reservoirs of functionality
than standard libraries.
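
The information-density comparison made at the start of this summary can be put in numbers. The sketch below computes the maximum information per position (log2 of the alphabet size) for binary, four-letter (GACT), and six-letter (GACTZP) alphabets, and compares the sizes of the corresponding sequence spaces. The 20-nucleotide length is an arbitrary illustration, not the library length used in the experiment.

```python
import math


def bits_per_position(alphabet_size):
    """Maximum information per position, in bits, for an alphabet of this size."""
    return math.log2(alphabet_size)


def sequence_space(alphabet_size, length):
    """Number of distinct sequences of the given length."""
    return alphabet_size ** length


binary_bits = bits_per_position(2)
gact_bits = bits_per_position(4)
gactzp_bits = bits_per_position(6)

LENGTH = 20  # arbitrary illustrative length of a randomized region
ratio = sequence_space(6, LENGTH) / sequence_space(4, LENGTH)

print(f"binary: {binary_bits:.3f} bits/position")
print(f"GACT:   {gact_bits:.3f} bits/position")
print(f"GACTZP: {gactzp_bits:.3f} bits/position")
print(f"GACTZP/GACT sequence-space ratio at length {LENGTH}: {ratio:.0f}")
```

The ratio grows as 1.5 raised to the sequence length, which is one way to read the claim that six-letter libraries offer a larger space of possible binders than four-letter libraries of the same length.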

Expanding the synthetic biology of artificially expanded genetic information systems (AEGIS) requires tools to make and analyze RNA molecules having added nucleotide "letters". We report here the development of T7 RNA polymerase and reverse transcriptase to catalyze transcription and reverse transcription of xNA (DNA or RNA) having two complementary AEGIS nucleobases, 6-amino-5-nitropyridin-2-one (trivially, Z) and 2-aminoimidazo[1,2-a]-1,3,5-triazin-4(8H)-one (trivially, P). We also report MALDI mass spectrometry and HPLC-based analyses for oligomeric GACUZP six-letter RNA and the use of ribonuclease (RNase) A and T1 RNase as enzymatic tools for the sequence-specific degradation of GACUZP RNA. We then applied these tools to analyze the GACUZP and GACTZP products of polymerases and reverse transcriptases (respectively) made from DNA and RNA templates. In addition to advancing this six-letter AEGIS toward the biosynthesis of proteins containing additional amino acids, these experiments provided new insights into the biophysics of DNA.