Introduction

Ancestral sequence reconstruction is the inference of hypothesized protein or DNA sequences belonging to a common ancestor of extant proteins or genes. It enables scientists to synthesize biomolecules from extinct organisms: sequence information (nucleic acid or protein) from extant species is used to infer the sequences of a common ancestor, which can then be synthesized and tested in the lab. The method was originally discussed by Pauling and Zuckerkandl in 1963[1], almost 30 years before the theory was experimentally tested.

Sequence Reconstruction Example

Pipeline for Generating Ancestral Genes

Sequences of the gene of interest from extant species descended from the desired common ancestor are aligned together with outgroup genes.

The ancestral gene is inferred based on evolutionary models (typically maximum parsimony or maximum likelihood).

Methods of Inferring Ancient Sequences

Consensus Sequence - the most frequently occurring residue at each aligned position among the extant sequences is assumed to be the ancestral state.
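
The consensus approach can be illustrated with a minimal Python sketch. The function name and the toy alignment are hypothetical and chosen only for illustration; the sketch assumes the sequences are already aligned to equal length and breaks ties in favor of the residue seen first in the column.

```python
from collections import Counter


def consensus(aligned):
    """Return the most frequent residue in each column of an alignment."""
    if len({len(seq) for seq in aligned}) != 1:
        raise ValueError("sequences must be aligned to the same length")
    result = []
    for column in zip(*aligned):
        counts = Counter(column)
        # most_common(1) returns the top residue; ties go to the one seen first
        result.append(counts.most_common(1)[0][0])
    return "".join(result)


# Hypothetical toy alignment of four extant sequences
toy_alignment = ["MKTAY", "MKSAY", "MRTAY", "MKTAF"]
print(consensus(toy_alignment))  # prints MKTAY
```

Because every sequence gets an equal vote regardless of how the species are related, this approach ignores the phylogeny entirely, which is the main difference from the tree-based methods.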

Maximum Parsimony - ancestral states are chosen to minimize the total number of changes required to account for the terminal sequences.
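
One common way to carry out this minimization on a fixed, rooted binary tree is the Fitch algorithm. The sketch below is a simplified illustration, not a full implementation: the tree encoding (nested 2-tuples with leaf names as strings), the taxon names, and the sequences are all hypothetical, and when several root states are equally parsimonious it arbitrarily picks the alphabetically first one.

```python
def fitch_column(node, column):
    """Return (candidate state set, change count) for one alignment column."""
    if isinstance(node, str):  # leaf: the observed residue is the only candidate
        return {column[node]}, 0
    left, right = node
    left_set, left_changes = fitch_column(left, column)
    right_set, right_changes = fitch_column(right, column)
    changes = left_changes + right_changes
    shared = left_set & right_set
    if shared:
        # children agree on at least one state: no extra change needed here
        return shared, changes
    # children disagree: keep all candidates and count one change
    return left_set | right_set, changes + 1


def parsimony_root(tree, sequences):
    """Infer a root sequence and the minimum number of changes on the tree."""
    length = len(next(iter(sequences.values())))
    root, total = [], 0
    for i in range(length):
        column = {name: seq[i] for name, seq in sequences.items()}
        states, changes = fitch_column(tree, column)
        root.append(min(states))  # arbitrary tie-break (assumption)
        total += changes
    return "".join(root), total


# Hypothetical four-taxon tree and aligned sequences
tree = (("A", "B"), ("C", "D"))
sequences = {"A": "MKT", "B": "MKS", "C": "MRT", "D": "MKT"}
print(parsimony_root(tree, sequences))  # prints ('MKT', 2)
```

Only the root is reconstructed here; assigning states to the other internal nodes would need an additional top-down pass.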

Maximum Likelihood - ancestral states are evaluated at each internal node in the tree based on the likelihood of each mutation. This process uses a statistical model of molecular evolution that accounts for differences in the rates of different types of mutation. The reconstruction also reports, for each residue, the probability that it is correctly predicted.
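
The per-residue probabilities can be illustrated with a small sketch for a single DNA site. It assumes the Jukes-Cantor (JC69) substitution model, which treats all substitutions as equally likely, whereas real studies typically use richer models and protein alphabets. It also assumes a fixed tree with known branch lengths. The tree encoding, taxon names, and branch lengths are hypothetical.

```python
import math

BASES = "ACGT"


def jc69(t):
    """JC69 transition probabilities for branch length t (substitutions per site)."""
    e = math.exp(-4.0 * t / 3.0)
    same, diff = 0.25 + 0.75 * e, 0.25 - 0.25 * e
    return {a: {b: (same if a == b else diff) for b in BASES} for a in BASES}


def conditional(node, column):
    """Pruning step: P(observed leaves below node | state at node)."""
    if isinstance(node, str):  # leaf: likelihood 1 for the observed base only
        return {b: 1.0 if b == column[node] else 0.0 for b in BASES}
    result = {b: 1.0 for b in BASES}
    for child, length in node:  # internal node: tuple of (child, branch length)
        p = jc69(length)
        below = conditional(child, column)
        for b in BASES:
            result[b] *= sum(p[b][c] * below[c] for c in BASES)
    return result


def root_posterior(tree, column):
    """Posterior probability of each base at the root (uniform JC69 prior)."""
    like = conditional(tree, column)
    total = sum(0.25 * like[b] for b in BASES)
    return {b: 0.25 * like[b] / total for b in BASES}


# Hypothetical three-taxon tree with made-up branch lengths
tree = (((("taxon1", 0.1), ("taxon2", 0.1)), 0.05), ("taxon3", 0.3))
column = {"taxon1": "A", "taxon2": "A", "taxon3": "G"}
posterior = root_posterior(tree, column)
best = max(posterior, key=posterior.get)
print(best, round(posterior[best], 3))
```

The highest-posterior base is reported as the ancestral state, and its posterior is the confidence attached to that residue. Repeating the calculation for every column gives a full sequence with site-by-site probabilities.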

Gene Synthesis

Modern advances in oligonucleotide synthesis allow for the cheap construction of synthetic ancestral genes. As of January 2012, oligonucleotides can be purchased for $0.35 per base. Using overlapping PCR assembly, most synthetic genes can be constructed for several hundred dollars.
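
As a rough check on the "several hundred dollars" figure, the sketch below estimates the oligonucleotide cost of assembling a gene from overlapping oligos at the quoted price. The oligo length (60 nt), overlap (20 nt), and 1,000 bp gene length are illustrative assumptions, not values from any particular protocol. Polymerase, cloning, and sequencing costs are ignored, and prices are kept in whole cents to avoid floating-point rounding.

```python
import math


def assembly_oligo_cost(gene_length, oligo_length=60, overlap=20, cents_per_base=35):
    """Return (number of oligos, total bases ordered, cost in cents)."""
    if not 0 < overlap < oligo_length:
        raise ValueError("overlap must be positive and shorter than the oligo")
    if gene_length <= oligo_length:
        n_oligos = 1
    else:
        # each oligo after the first contributes (oligo_length - overlap) new bases
        step = oligo_length - overlap
        n_oligos = math.ceil((gene_length - overlap) / step)
    # every junction between neighboring oligos is synthesized twice
    total_bases = gene_length + (n_oligos - 1) * overlap
    return n_oligos, total_bases, total_bases * cents_per_base


n_oligos, total_bases, cents = assembly_oligo_cost(1000)
print(n_oligos, total_bases, f"${cents / 100:.2f}")  # prints 25 1480 $518.00
```

Under these assumptions a 1,000 bp gene needs 25 oligos totaling 1,480 bases, or about $518 in oligonucleotides.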

Testing Ancestral Variants

After synthesis is complete, ancestral genes can be tested for their function. For example, the ancestral genes can be cloned into bacterial expression vectors and transformed into E. coli. The proteins can then be purified and tested for their activity, such as fluorescence spectra, binding affinity, or thermoactivity.

Examples of Ancestral Sequence Reconstructions

As the technology for ancestral sequence reconstruction has advanced, the technique has become more popular. It offers a unique way to peer into the past and revisit the biomolecules of our forefathers. Because genetic material degrades readily, direct study is limited to extant organisms or fortuitously preserved ones, such as specimens preserved in ice. The ability to resurrect extinct sequences and test them frees us from these temporal limitations.

Evolution of Coral Pigments

One vivid example of ancestral sequence reconstruction comes from the Matz group (currently at the University of Texas at Austin). Fluorescent proteins from related coral species emit at wavelengths corresponding to cyan, green, and red. The details of the evolution of fluorescent color in the GFP superfamily were not fully understood. That is, what fluorescence spectra did the common ancestors of the modern corals have? Sequences for the common ancestor nodes were synthesized and tested for their activity. The ancestral sequences revealed an interesting evolutionary history. The common ancestor of the entire superfamily had a green emission peak. The more recent common ancestor of the green and red proteins had two emission peaks: a strong green peak and a smaller red peak. Descendants of this ancestor resolved into either a green or a red peak, losing the emission bimodality and specializing.

Inferring the Paleoenvironment of Ancient Earth

Ancestral sequence reconstruction has been used to infer the environmental conditions on the early Earth. One study sought to explore the evolutionary history of the translation elongation factor EF-Tu. This elongation factor functions optimally at the temperature at which the organism lives; for example, thermophilic organisms have an EF-Tu optimized for high temperatures. By resurrecting the EF-Tu protein of the common ancestors of bacteria, its temperature profile might indicate what the environmental conditions were. Interestingly, the EF-Tu of the common ancestor of all mesophilic bacteria (~1 Bya) has the optimal temperature of a thermophile. This is consistent with the hypothesis of a hot early Earth.

Precambrian β-lactamases

The last common ancestor of extant β-lactamases has been reconstructed and tested using ancestral sequence reconstruction[2]. Sequences for β-lactamase from 75 bacterial strains, whose last common ancestor lived 2-3 Gya, were compared. To exclude the effects of recent antibiotic-driven evolution, only non-clinical varieties were chosen. Using Bayesian statistics, Risso et al. calculated the most probable ancestral amino acid for each position in the sequence.

Modern microbes engineered to produce the Precambrian enzyme became resistant to various β-lactam-based antibiotics. The extremely high promiscuity of the Precambrian enzyme compared to extant sequences indicates that over the past few billion years, β-lactamases have evolved greater substrate specificity. The Precambrian enzyme had a denaturation temperature of ~90°C, which is 35°C above the highest value among extant sequences.

Precambrian Thioredoxin

Perez-Jimenez et al. resurrected various Precambrian thioredoxins, belonging to the "last bacterial common ancestor (LBCA), the last archaeal common ancestor (LACA) and the archaeal-eukaryotic common ancestor (AECA)"[3].

Further Applications

Ancestral sequence reconstruction has several benefits. The most direct benefit is a better understanding of archaic organisms and the world they inhabited. Numerous ancient protein reconstructions have yielded enzymes that are much more thermally stable than their extant descendants, even those found in thermophiles, strengthening the evidence for extremely warm global conditions prior to 1 Gya[4]. Archaic proteins also allow for greater study of evolvability, as many reconstructed proteins feature high promiscuity[2]. The patterns in the evolution of the highly promiscuous Precambrian β-lactamase, for instance, could help further the fight against the development of antibiotic resistance. High promiscuity is also key to directed protein evolution. In addition, the reconstruction of archaic enzymes serves as a sort of paleo-bioprospecting[3]. The enzymes discovered through this form of prospecting, and those developed through directed evolution of promiscuous enzymes, could serve functional roles in human-engineered systems.