Expanding the Genetic Code

"… some warm little pond, with all sorts of ammonia and phosphoric salts, light, heat, electricity etc…", Charles Darwin, on the origins of life in tidal pools. Credit:Smithsonian

A team of investigators at The Scripps Research Institute and its Skaggs Institute for Chemical Biology in La Jolla, California has modified a form of the bacterium Escherichia coli to use a 22-amino acid genetic code.

"We have demonstrated the simultaneous incorporation of two unnatural amino acids into the same polypeptide," says Professor Peter G. Schultz, Ph.D., who holds the Scripps Family Chair in Chemistry at Scripps Research. "Now that we know the genetic code is amenable to expansion to 22 amino acids, the next question is, how far can we take it?"

In an upcoming issue of the journal Proceedings of the National Academy of Sciences, the team describes how they engineered this modified form of E. coli to make myoglobin proteins with 22 amino acids — incorporating the unnatural amino acids O-methyl-L-tyrosine and L-homoglutamine in addition to the naturally occurring 20.

Scientists have for years created proteins with such unnatural amino acids in the laboratory, but until Schultz and his colleagues began their work in this field several years ago, nobody had ever found a way to get organisms to add unnatural amino acids into their genetic code. Earlier studies by Schultz’s group described the incorporation of a number of single unnatural amino acids with a variety of uses in chemistry and biology into E. coli and into the yeast Saccharomyces cerevisiae.


This latest result is a boon because it demonstrates that multiple unnatural amino acids can be added to the genetic code of a single modified organism. This proof-of-principle opens the door for making proteins within the context of living cells with three, four, or more additional amino acids at once.

The article, "A twenty-two amino acid bacterium with a functional quadruplet codon," is authored by J. Christopher Anderson, Ning Wu, Stephen W. Santoro, Vishva Lakshman, David S. King, and Peter G. Schultz and will be posted online during the week of May 10-16, 2004 by the journal Proceedings of the National Academy of Sciences. The article will appear in print later this year.

This work was supported by the Department of Energy and the Skaggs Institute for Research. Individual scientists involved in this study were sponsored through a National Science Foundation Predoctoral Fellowship, a Canadian Institutes of Health Research fellowship, and a Career Award in the Biomedical Sciences from the Burroughs Wellcome Fund.

Why Expand the Genetic Code?

Life as we know it is composed, at the molecular level, of the same basic building blocks. For instance, all life forms on earth use the same four nucleotides to make DNA. And with few exceptions, all known forms of life use the same common 20 amino acids — and only those 20 — to make proteins.

The question is why did life stop with 20 and why these particular 20?

RNA delivers DNA’s genetic message to the cytoplasm of a cell where proteins are made. Credit: Darryl Leja/Access Excellence

While the answer to that question may be elusive, the 20-amino acid barrier is far from absolute. In some rare instances, in fact, certain organisms have evolved the ability to use the unusual amino acids selenocysteine and pyrrolysine — slightly modified versions of the amino acids cysteine and lysine.

These rare exceptions aside, scientists have often looked for ways to incorporate unnatural amino acids into proteins in the test tube and in the context of living cells because such novel proteins are of great utility for basic biomedical research. They provide a powerful tool for studying and controlling the biological processes that form the basis for some of the most intriguing problems in modern biophysics and cell biology, like signal transduction, protein trafficking in the cell, protein folding, and protein-protein interactions.

For example, there are novel amino acids that contain fluorescent groups that can be used to site-specifically label proteins with small fluorescent tags and observe them in vivo. This is particularly useful now that the human genome has been sequenced and scientists are turning their attention to what its genes are doing inside cells.

Other unnatural amino acids contain photoaffinity labels and other "crosslinkers" that could be used for trapping protein-protein interactions by forcing interacting proteins to be covalently attached to one another. Purifying these linked proteins would allow scientists to see which proteins interact with one another in living cells, including those with weak interactions that are difficult to detect by current methods.

Unnatural amino acids are also important in medicine, and many proteins used therapeutically need to be modified with chemical groups such as polymers, crosslinking agents, and cytotoxic molecules. Last year, Schultz and his Scripps Research colleagues also showed that glycosylated amino acids could be incorporated site-specifically to make glycosylated proteins — an important step in the preparation of some medicines.

Novel hydrophobic amino acids, heavy metal-binding amino acids, and amino acids that contain spin labels could be useful for probing the structures of proteins into which they are inserted. And unusual amino acids that contain chemical moieties like "keto" groups, which are like LEGO blocks, could be used to attach other chemicals such as sugar molecules, which would be relevant to the production of therapeutic proteins.

Combining Amber Suppression with Frame Shift Suppression

Schultz and his colleagues succeeded in making the 22-amino acid E. coli by exploiting the redundancy of the genetic code. When a protein is expressed, an enzyme reads the DNA bases of a gene (A, G, C, and T), and transcribes them into RNA (A, G, C, and U). This so-called "messenger RNA" (mRNA) is then translated by a protein-RNA complex, called the ribosome, into a protein. The ribosome requires the help of transfer RNA molecules (tRNA) that have been "loaded" with an amino acid, and that requires the help of a "loading" enzyme.

Each tRNA recognizes one specific three-base combination, or "codon," on the mRNA and gets loaded with only the one amino acid that is specific for that codon.

During protein synthesis, the tRNA specific for the next codon on the mRNA comes in loaded with the right amino acid, and the ribosome grabs the amino acid and attaches it to the growing protein chain.

The redundancy of the genetic code comes from the fact that there are more codons than there are amino acids used. In fact, there are 4x4x4 = 64 different possible ways to make a codon, that is, any three-letter combination of the four letters in the mRNA (UAG, ACG, UCC, etc.). With only 20 amino acids used by organisms, not all of the codons are theoretically necessary.

But nature uses them anyway. Several of the 64 codons are redundant, coding for the same amino acid, and three of them are nonsense codons — they don’t code for any amino acid at all.
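The codon arithmetic above can be checked with a short Python sketch. It assumes the standard genetic code written in a common compact form, one letter per codon, with "*" marking a nonsense codon; the variable names are illustrative only.

```python
from itertools import product

BASES = "UCAG"
# Standard genetic code, one letter per codon, in the order produced by
# product("UCAG", repeat=3); "*" marks a nonsense (stop) codon.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

CODON_TABLE = {
    "".join(triplet): aa
    for triplet, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

stop_codons = sorted(c for c, aa in CODON_TABLE.items() if aa == "*")
sense_amino_acids = set(CODON_TABLE.values()) - {"*"}

print(len(CODON_TABLE))        # 64 possible codons
print(len(sense_amino_acids))  # 20 amino acids encoded
print(stop_codons)             # ['UAA', 'UAG', 'UGA']
```

The 61 remaining codons are shared among 20 amino acids, which is the redundancy the text describes.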

These nonsense codons are useful because normally when a ribosome that is synthesizing a protein reaches a nonsense codon, the ribosome dissociates from the mRNA and synthesis stops. Hence, nonsense codons are also referred to as "stop" codons. One of these, the amber stop codon UAG, played an important role in Schultz’s research.

Schultz and his colleagues knew that if they could provide their cells with a tRNA molecule that recognizes UAG and also provide them with a synthetase "loading" enzyme that loaded the tRNA with an unnatural amino acid, the scientists would have a way to site-specifically insert the unusual amino acid into any protein they wanted.

They needed to find a functionally "orthogonal" pair: a tRNA/synthetase pair whose members react with each other but not with endogenous E. coli pairs. So they devised a methodology to evolve the specificity of the orthogonal synthetase to selectively accept unnatural amino acids.

Starting with a tRNA/synthetase pair from the organism Methanococcus jannaschii, they created a library of E. coli cells, each encoding a mutant M. jannaschii synthetase, and they changed its specificity so that it could be used to recognize the unnatural amino acid O-methyl-L-tyrosine.

To do this, they devised a positive selection whereby only the cells that load the orthogonal tRNA with any amino acid would survive. Then they designed a negative selection whereby any cell that recognizes UAG using a tRNA loaded with anything other than O-methyl-L-tyrosine dies.
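The logic of the two selections can be pictured as a pair of filters applied to the library. The Python sketch below is a toy model, not the laboratory protocol: the mutant names, the amino acids each mutant loads, and the function names are all hypothetical.

```python
UNNATURAL = "O-methyl-L-tyrosine"

# Hypothetical library: each mutant synthetase is described by the set of
# amino acids it loads onto the orthogonal tRNA.
library = {
    "mutant_A": {"Tyr"},             # still loads a natural amino acid
    "mutant_B": {"Tyr", UNNATURAL},  # promiscuous
    "mutant_C": {UNNATURAL},         # the desired specificity
    "mutant_D": set(),               # inactive, loads nothing
}

def positive_selection(lib):
    """Keep only mutants that load the tRNA with any amino acid at all."""
    return {name: loaded for name, loaded in lib.items() if loaded}

def negative_selection(lib, desired):
    """Discard mutants that load anything other than the desired amino acid."""
    return {name: loaded for name, loaded in lib.items()
            if not (loaded - {desired})}

survivors = positive_selection(library)                # drops mutant_D
survivors = negative_selection(survivors, UNNATURAL)   # drops mutant_A, mutant_B
print(sorted(survivors))  # ['mutant_C']
```

Neither filter alone is enough: the positive step removes inactive mutants, and the negative step removes active mutants with the wrong specificity.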

In so doing, they found orthogonal synthetase mutants that load the orthogonal tRNA with only the desired unnatural amino acid. When a ribosome reading an mRNA within the E. coli cells encounters UAG, it inserts the unnatural amino acid O-methyl-L-tyrosine.

Furthermore, any codon in an mRNA that is switched to UAG will encode the new amino acid in that place, giving Schultz and his colleagues a way to site-specifically incorporate novel amino acids into proteins expressed by the E. coli.
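A minimal Python sketch can illustrate the effect of amber suppression on translation. It assumes the standard genetic code; the message, the function name, and the label "OMeTyr" (standing in for O-methyl-L-tyrosine) are made up for illustration.

```python
from itertools import product

# Standard genetic code in compact form; "*" marks a stop codon.
CODON_TABLE = dict(zip(
    ("".join(t) for t in product("UCAG", repeat=3)),
    "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG",
))

def translate(mrna, amber_suppressor=None):
    """Translate in-frame triplets and return a list of residue labels.

    If amber_suppressor is given, UAG is read as that residue instead of
    terminating synthesis; UAA and UGA always terminate.
    """
    residues = []
    for i in range(0, len(mrna) - 2, 3):
        codon = mrna[i:i + 3]
        if codon == "UAG" and amber_suppressor is not None:
            residues.append(amber_suppressor)
            continue
        aa = CODON_TABLE[codon]
        if aa == "*":
            break
        residues.append(aa)
    return residues

# Hypothetical message: AUG GGC UAG AAA UAA
mrna = "AUGGGCUAGAAAUAA"
print(translate(mrna))                             # ['M', 'G']
print(translate(mrna, amber_suppressor="OMeTyr"))  # ['M', 'G', 'OMeTyr', 'K']
```

Without the suppressor tRNA, synthesis stops at UAG; with it, the chain continues through the new residue until the next stop codon.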

Similarly, Schultz and his colleagues made an engineered tRNA/synthetase orthogonal pair from the archaeon Pyrococcus horikoshii that recognizes the four-base codon AGGA.

The tRNA has a four-base anticodon loop, and when a ribosome reading an mRNA within the E. coli cells encounters AGGA, it inserts the unnatural amino acid L-homoglutamine at that site.

By placing both of these systems within the same E. coli cell, Schultz and his colleagues have demonstrated, as a proof of principle, that it is technically possible to have mutually orthogonal systems operating at once in the same cell. This opens up the possibility of doing multiple site substitution with additional unnatural amino acids in the future.
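The combined behavior can be sketched in Python as a reader that tries the four-base codon before falling back to ordinary triplets. This is a simplified model under stated assumptions: every in-frame AGGA is read as a quadruplet when the frameshift suppressor is present, competition with the ordinary AGG codon is ignored, and the labels "OMeTyr" and "hGln" and the example message are hypothetical.

```python
from itertools import product

# Standard genetic code in compact form; "*" marks a stop codon.
CODON_TABLE = dict(zip(
    ("".join(t) for t in product("UCAG", repeat=3)),
    "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG",
))

def translate_expanded(mrna, amber=None, quadruplet=None):
    """Translate mRNA, optionally reading UAG and the four-base codon AGGA.

    amber:      residue label inserted at UAG, or None to treat UAG as stop.
    quadruplet: residue label inserted at in-frame AGGA, or None to read
                AGG as an ordinary three-base codon.
    """
    residues = []
    i = 0
    while i + 3 <= len(mrna):
        if quadruplet is not None and mrna[i:i + 4] == "AGGA":
            residues.append(quadruplet)
            i += 4  # the reading frame advances by four bases
            continue
        codon = mrna[i:i + 3]
        if codon == "UAG" and amber is not None:
            residues.append(amber)
        else:
            aa = CODON_TABLE[codon]
            if aa == "*":
                break
            residues.append(aa)
        i += 3
    return residues

# Hypothetical message: AUG AGGA GGC UAG AAA UAA
mrna = "AUGAGGAGGCUAGAAAUAA"
print(translate_expanded(mrna, amber="OMeTyr", quadruplet="hGln"))
# ['M', 'hGln', 'G', 'OMeTyr', 'K']
print(translate_expanded(mrna))
# ['M', 'R', 'R', 'L', 'E', 'I']
```

In the second call no suppressor reads AGGA, so the ribosome reads AGG as arginine and every later codon falls in a different frame, which is why a working four-base tRNA is needed for the intended protein.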