RNA world

A comparison of RNA (left) with DNA (right), showing the helices and nucleobases each employs.

The RNA world refers to the self-replicating ribonucleic acid (RNA) molecules that were precursors to all current life on Earth.[1][2][3] It is generally accepted that current life on Earth descends from an RNA world,[4] although RNA-based life may not have been the first life to exist.[5][6]

The RNA world hypothesis is supported by many independent lines of evidence, such as the observations that RNA is central to the translation process and that small RNAs can catalyze all of the chemical group and information transfers required for life.[6][9] The structure of the ribosome has been called the "smoking gun," as it showed that the ribosome is a ribozyme, with a central core of RNA and no amino acid side chains within 18 angstroms of the active site where peptide bond formation is catalyzed.[5] Many of the most critical components of cells (those that evolve the slowest) are composed mostly or entirely of RNA. Also, many critical cofactors (ATP, Acetyl-CoA, NADH, etc.) are either nucleotides or substances clearly related to them. This would mean that the RNA and nucleotide cofactors in modern cells are an evolutionary remnant of an RNA-based enzymatic system that preceded the protein-based one seen in all extant life.

Evidence suggests chemical conditions (including the presence of boron, molybdenum and oxygen) for initially producing RNA molecules may have been better on the planet Mars than those on the planet Earth.[2][3] If so, life-suitable molecules, originating on Mars, may have later migrated to Earth via panspermia or similar process.[2][3]

One of the challenges in studying abiogenesis is that the system of reproduction and metabolism utilized by all extant life involves three distinct types of interdependent macromolecules (DNA, RNA, and protein). This suggests that life could not have arisen in its current form, and mechanisms have therefore been sought whereby the current system might have arisen from a simpler precursor system. The concept of RNA as a primordial molecule[8] can be found in papers by Francis Crick[10] and Leslie Orgel,[11] as well as in Carl Woese's 1967 book The Genetic Code.[12] In 1962 the molecular biologist Alexander Rich, of the Massachusetts Institute of Technology, had posited much the same idea in an article he contributed to a volume issued in honor of Nobel-laureate physiologist Albert Szent-Györgyi.[13] Hans Kuhn in 1972 laid out a possible process by which the modern genetic system might have arisen from a nucleotide-based precursor, and this led Harold White in 1976 to observe that many of the cofactors essential for enzymatic function are either nucleotides or could have been derived from nucleotides. He proposed that these nucleotide cofactors represent "fossils of nucleic acid enzymes".[14] The phrase "RNA World" was first used by Nobel laureate Walter Gilbert in 1986, in a commentary on how recent observations of the catalytic properties of various forms of RNA fit with this hypothesis.[15]

The properties of RNA make the RNA world hypothesis conceptually plausible, though its general acceptance as an explanation for the origin of life requires further evidence.[13] RNA is known to form efficient catalysts and its similarity to DNA makes its ability to store information clear. Opinions differ, however, as to whether RNA constituted the first autonomous self-replicating system or was a derivative of a still-earlier system.[8] One version of the hypothesis is that a different type of nucleic acid, termed pre-RNA, was the first one to emerge as a self-reproducing molecule, to be replaced by RNA only later. On the other hand, the recent finding that activated pyrimidine ribonucleotides can be synthesized under plausible prebiotic conditions[16] means that it is premature to dismiss the RNA-first scenarios.[8] Suggestions for 'simple' pre-RNA nucleic acids have included peptide nucleic acid (PNA), threose nucleic acid (TNA) and glycol nucleic acid (GNA).[17][18] Despite their structural simplicity and possession of properties comparable with RNA, the chemically plausible generation of "simpler" nucleic acids under prebiotic conditions has yet to be demonstrated.[19]

RNA enzymes, or ribozymes, are found in today's DNA-based life and could be examples of living fossils. Ribozymes play vital roles, such as in the ribosome, which is essential for protein synthesis. Many other ribozyme functions exist; for example, the hammerhead ribozyme performs self-cleavage[20] and an RNA polymerase ribozyme can synthesize a short RNA strand from a primed RNA template.[21]

Among the enzymatic properties important for the beginning of life are:

The ability to self-replicate, or synthesize other RNA molecules; relatively short RNA molecules that can synthesize others have been artificially produced in the lab. The shortest was 165 bases long, though it has been estimated that only part of the molecule was crucial for this function. One version, 189 bases long, had an error rate of just 1.1% per nucleotide when synthesizing an 11-nucleotide RNA strand from primed template strands.[22] This 189-base ribozyme could polymerize a template of at most 14 nucleotides in length, which is too short for self-replication but is a potential lead for further investigation. The longest primer extension performed by a ribozyme polymerase was 20 bases.[23]

The ability to catalyze simple chemical reactions, which would enhance the creation of molecules that are building blocks of RNA (i.e., a strand of RNA that would make creating more strands of RNA easier). Relatively short RNA molecules with such abilities have been artificially formed in the lab.[24][25]

The ability to conjugate an amino acid to the 3'-end of an RNA in order to use its chemical groups or provide a long-branched aliphatic side-chain.[26]

The ability to catalyze the formation of peptide bonds to produce short peptides or longer proteins. This is done in modern cells by ribosomes, each a complex of several RNA molecules known as rRNA together with many proteins. The rRNA molecules are thought to be responsible for the ribosome's enzymatic activity, as no amino acid molecules lie within 18 Å of the enzyme's active site.[13] A much shorter RNA molecule has been synthesized in the laboratory with the ability to form peptide bonds, and it has been suggested that rRNA has evolved from a similar molecule.[27] It has also been suggested that amino acids may have initially been involved with RNA molecules as cofactors enhancing or diversifying their enzymatic capabilities, before evolving to more complex peptides. Similarly, tRNA is suggested to have evolved from RNA molecules that began to catalyze amino acid transfer.[28]
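
The fidelity figure quoted in the list above (an error rate of 1.1% per nucleotide for the 189-base polymerase ribozyme) can be turned into a rough estimate of how often a copy would come out free of errors. The following is a minimal sketch, not a model taken from the cited work: it assumes that errors at each position are independent, and the function name is purely illustrative.

```python
def error_free_probability(error_rate: float, length: int) -> float:
    """Probability that a copy of `length` nucleotides contains no errors.

    Assumption (not from the source): errors occur independently at each
    position with the same per-nucleotide probability `error_rate`.
    """
    if not 0.0 <= error_rate <= 1.0:
        raise ValueError("error_rate must be between 0 and 1")
    if length < 0:
        raise ValueError("length must be non-negative")
    return (1.0 - error_rate) ** length


ERROR_RATE = 0.011  # 1.1% per nucleotide, the figure quoted in the text

# 11 and 14 nt are the product and template lengths mentioned in the text;
# 189 nt is the length of the ribozyme itself.
for n in (11, 14, 189):
    p = error_free_probability(ERROR_RATE, n)
    print(f"{n:>3} nt: P(error-free copy) = {p:.3f}")
```

Under this simplified assumption, short products are copied faithfully most of the time, whereas a hypothetical full-length copy of the 189-base ribozyme would be error-free only in a small minority of attempts, which illustrates why fidelity matters for self-replication.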

RNA is very similar to DNA, with only two chemical differences. The overall structures of RNA and DNA are so similar that one strand of DNA and one of RNA can bind to form a double-helical structure. This makes the storage of information in RNA possible in much the same way as the storage of information in DNA. However, RNA is less stable.

The major difference between RNA and DNA is the presence of a hydroxyl group at the 2'-position of the ribose sugar in RNA (illustration, right).[13] This group makes the molecule less stable because when not constrained in a double helix, the 2' hydroxyl can chemically attack the adjacent phosphodiester bond to cleave the phosphodiester backbone. The hydroxyl group also forces the ribose into the C3'-endo sugar conformation unlike the C2'-endo conformation of the deoxyribose sugar in DNA. This forces an RNA double helix to change from a B-DNA structure to one more closely resembling A-DNA.

RNA also uses a different set of bases than DNA: adenine, guanine, cytosine and uracil, instead of adenine, guanine, cytosine and thymine. Chemically, uracil is similar to thymine, differing only by a methyl group, and its production requires less energy.[29] In terms of base pairing, this has no effect: adenine readily binds uracil or thymine. Uracil is, however, one product of damage to cytosine, which makes RNA particularly susceptible to mutations that can replace a GC base pair with a GU (wobble) or AU base pair.
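
The mutation route just described can be traced step by step. The sketch below is illustrative only: it models strands as plain strings, uses hypothetical helper names (`deaminate`, `copy_strand`), and assumes that a uracil produced by damage is read as an ordinary U, pairing with A, the next time the strand is copied.

```python
# Watson-Crick partners used when an RNA template is copied.
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def deaminate(strand: str, position: int) -> str:
    """Return a copy of `strand` with the cytosine at `position` turned into uracil."""
    if strand[position] != "C":
        raise ValueError("only cytosine can be converted to uracil here")
    return strand[:position] + "U" + strand[position + 1:]


def copy_strand(template: str) -> str:
    """Return the partner strand, written position by position (not reversed),
    so that index i of the result pairs with index i of the template."""
    return "".join(COMPLEMENT[base] for base in template)


strand = "ACGU"
partner = copy_strand(strand)

# Step 0: intact duplex, position 1 is a C-G pair.
print("before damage:", strand[1], "-", partner[1])

# Step 1: cytosine damage leaves a U opposite the old G (a G-U wobble pair).
damaged = deaminate(strand, 1)
print("after damage: ", damaged[1], "-", partner[1])

# Step 2: copying the damaged strand places an A opposite the U (an A-U pair).
new_partner = copy_strand(damaged)
print("after copying:", damaged[1], "-", new_partner[1])
```

The three printed pairs follow the sequence given in the text: a GC pair, then a GU wobble pair, then an AU pair once the damaged strand has served as a template.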

RNA is thought to have preceded DNA, because of their ordering in the biosynthetic pathways. The deoxyribonucleotides used to make DNA are made from ribonucleotides, the building blocks of RNA, by removing the 2'-hydroxyl group. As a consequence a cell must have the ability to make RNA before it can make DNA.

The chemical properties of RNA make large RNA molecules inherently fragile, and they can easily be broken down into their constituent nucleotides through hydrolysis.[30][31] These limitations do not make use of RNA as an information storage system impossible, simply energy intensive (to repair or replace damaged RNA molecules) and prone to mutation. While this makes it unsuitable for current 'DNA optimised' life, it may have been acceptable for more primitive life.

Riboswitches have been found to act as regulators of gene expression, particularly in bacteria, but also in plants and archaea. Riboswitches alter their secondary structure in response to the binding of a metabolite. This change in structure can result in the formation or disruption of a terminator, truncating or permitting transcription respectively.[32] Alternatively, riboswitches may bind or occlude the Shine-Dalgarno sequence, affecting translation.[33] It has been suggested that these originated in an RNA-based world.[34] In addition, RNA thermometers regulate gene expression in response to temperature changes.[35]

The RNA world hypothesis is supported by RNA's ability to store, transmit, and duplicate genetic information, as DNA does. RNA can also act as a ribozyme, a special type of enzyme. Because it can perform the tasks of both DNA and enzymes, RNA is believed to have once been capable of supporting independent life forms.[13] Some viruses use RNA as their genetic material, rather than DNA.[36] Further, while nucleotides were not found in the Miller-Urey origin-of-life experiments, their formation in prebiotically plausible conditions has now been reported, as noted above;[16] the purine base adenine is merely a pentamer of hydrogen cyanide. Experiments with basic ribozymes, like Bacteriophage Qβ RNA, have shown that simple self-replicating RNA structures can withstand even strong selective pressures (e.g., opposite-chirality chain terminators).[37]
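
The remark that adenine is a pentamer of hydrogen cyanide can be checked with simple atom bookkeeping. The sketch below only compares elemental compositions (HCN against adenine, C5H5N5); it says nothing about the reaction mechanism, and the `oligomer` helper is a hypothetical name introduced for illustration.

```python
from collections import Counter

# Elemental composition of one hydrogen cyanide molecule (HCN).
HCN = Counter({"C": 1, "H": 1, "N": 1})

# Elemental composition of adenine (C5H5N5).
ADENINE = Counter({"C": 5, "H": 5, "N": 5})


def oligomer(monomer: Counter, n: int) -> Counter:
    """Composition of n monomers combined with no atoms gained or lost."""
    return Counter({element: count * n for element, count in monomer.items()})


pentamer = oligomer(HCN, 5)
print("5 x HCN :", dict(pentamer))
print("adenine :", dict(ADENINE))
print("same composition:", pentamer == ADENINE)
```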

Because there were no known chemical pathways for the abiogenic synthesis of nucleotides from the pyrimidine nucleobases cytosine and uracil under prebiotic conditions, some have thought that the earliest nucleic acids did not contain these nucleobases seen in life's nucleic acids.[38] The nucleobase cytosine has a half-life in isolation of 19 days at 100 °C (212 °F) and 17,000 years in freezing water, which some argue is too short on the geologic time scale for accumulation.[39] Others have questioned whether ribose and other backbone sugars could be stable enough to be found in the original genetic material,[40] and have raised the issue that all ribose molecules would have had to be the same enantiomer, as any nucleotide of the wrong chirality acts as a chain terminator.[41]
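
To see why these half-lives are considered short on a geologic time scale, the fraction of cytosine surviving after a given time can be estimated. This is a rough sketch that assumes simple first-order (exponential) decay, which is the usual meaning of a half-life but is stated here as an assumption; the elapsed times chosen are arbitrary illustrations.

```python
def fraction_remaining(elapsed: float, half_life: float) -> float:
    """Fraction of a compound left after `elapsed` time, assuming first-order decay.

    `elapsed` and `half_life` must be given in the same unit.
    """
    if half_life <= 0:
        raise ValueError("half_life must be positive")
    if elapsed < 0:
        raise ValueError("elapsed must be non-negative")
    return 0.5 ** (elapsed / half_life)


HALF_LIFE_100C_DAYS = 19.0       # half-life at 100 °C, from the text
HALF_LIFE_0C_YEARS = 17_000.0    # half-life in freezing water, from the text

# Illustrative time spans (assumptions, not from the source).
one_year_hot = fraction_remaining(365.0, HALF_LIFE_100C_DAYS)
million_years_cold = fraction_remaining(1_000_000.0, HALF_LIFE_0C_YEARS)

print(f"after 1 year at 100 °C:            {one_year_hot:.2e}")
print(f"after 1 million years near 0 °C:   {million_years_cold:.2e}")
```

One year at 100 °C corresponds to roughly 19 half-lives, and one million years in freezing water to nearly 59, so in both illustrative cases only a minute fraction of the starting material would remain unless it were continuously replenished.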

Pyrimidine ribonucleosides and their respective nucleotides have been prebiotically synthesised by a sequence of reactions that bypasses free sugars and assembles them in a stepwise fashion, going against the dogma that nitrogenous and oxygenous chemistries should be avoided. In a series of publications, the Sutherland group at the School of Chemistry, University of Manchester, has demonstrated high-yielding routes to cytidine and uridine ribonucleotides built from small two- and three-carbon fragments such as glycolaldehyde, glyceraldehyde or glyceraldehyde-3-phosphate, cyanamide and cyanoacetylene. One of the steps in this sequence allows the isolation of enantiopure ribose aminooxazoline if the enantiomeric excess of glyceraldehyde is 60% or greater, which is of possible interest with regard to biological homochirality.[42] This can be viewed as a prebiotic purification step, in which the compound spontaneously crystallises out from a mixture of the other pentose aminooxazolines. Aminooxazolines can react with cyanoacetylene in a mild and highly efficient manner, controlled by inorganic phosphate, to give the cytidine ribonucleotides. Photoanomerization with UV light allows for inversion about the 1' anomeric centre to give the correct beta stereochemistry. One problem with this chemistry is the selective phosphorylation of alpha-cytidine at the 2' position.[43] However, in 2009 the group showed that the same simple building blocks allow access, via phosphate-controlled nucleobase elaboration, to 2',3'-cyclic pyrimidine nucleotides directly, which are known to be able to polymerise into RNA.[44] This was hailed as strong evidence for the RNA world.[45] The paper also highlighted the possibility for the photo-sanitization of the pyrimidine-2',3'-cyclic phosphates.[44] A potential weakness of these routes is the generation of enantioenriched glyceraldehyde, or its 3-phosphate derivative (glyceraldehyde prefers to exist as its keto tautomer dihydroxyacetone).[citation needed]

On August 8, 2011, a report based on NASA studies of meteorites found on Earth was published, suggesting that building blocks of RNA (adenine, guanine and related organic molecules) may have been formed extraterrestrially in outer space.[46][47][48] On August 29, 2012, astronomers at Copenhagen University reported the first detection of a specific sugar molecule, glycolaldehyde, in a distant star system. The molecule was found around the protostellar binary IRAS 16293-2422, which is located 400 light years from Earth.[49][50] Glycolaldehyde is needed to form ribonucleic acid, or RNA, which is similar in function to DNA. This finding suggests that complex organic molecules may form in stellar systems prior to the formation of planets, eventually arriving on young planets early in their formation.[51]

"Molecular biologist's dream" is a phrase coined by Gerald Joyce and Leslie Orgel to refer to the problem of the emergence of self-replicating RNA molecules, as any movement towards an RNA world on a properly modeled prebiotic early Earth would have been continuously suppressed by destructive reactions.[52] It was noted that many of the steps needed for nucleotide formation do not proceed efficiently in prebiotic conditions.[53] Joyce and Orgel specifically applied the phrase to "a magic catalyst" that could "convert the activated nucleotides to a random ensemble of polynucleotide sequences, a subset of which had the ability to replicate".[52]

Joyce and Orgel further argued that nucleotides cannot link unless there is some activation of the phosphate group, whereas the only effective activating groups for this are "totally implausible in any prebiotic scenario", particularly adenosine triphosphate.[52] According to Joyce and Orgel, if the phosphate group were activated, the basic polymer product would have 5',5'-pyrophosphate linkages, while the 3',5'-phosphodiester linkages, which are present in all known RNA, would be much less abundant.[52] The associated molecules would also have been prone to the addition of incorrect nucleotides or to reactions with numerous other substances likely to have been present.[52] The RNA molecules would also have been continuously degraded by destructive processes, such as spontaneous hydrolysis, present on the early Earth.[52] Joyce and Orgel proposed to reject "the myth of a self-replicating RNA molecule that arose de novo from a soup of random polynucleotides"[52] and hypothesised about a scenario where the prebiotic processes furnish pools of enantiopure beta-D-ribonucleosides.[54]

Nucleotides are the fundamental molecules that combine in series to form RNA. They consist of a nitrogenous base attached to a sugar-phosphate backbone. RNA is made of long stretches of specific nucleotides arranged so that their sequence of bases carries information. The RNA world hypothesis holds that in the primordial soup (or sandwich), there existed free-floating nucleotides. These nucleotides regularly formed bonds with one another, which often broke because the change in energy was so low. However, certain sequences of bases have catalytic properties that lower the energy required to form their chain, enabling them to stay together for longer periods of time. As each chain grew longer, it attracted more matching nucleotides faster, so that chains formed faster than they broke down.

These chains have been proposed by some as the first, primitive forms of life.[55] In an RNA world, different sets of RNA strands would have had different replication outputs, which would have increased or decreased their frequency in the population, i.e. natural selection. As the fittest sets of RNA molecules expanded their numbers, novel catalytic properties added by mutation, which benefitted their persistence and expansion, could accumulate in the population. Such an autocatalytic set of ribozymes, capable of self-replication in about an hour, has been identified. It was produced by molecular competition (in vitro evolution) of candidate enzyme mixtures.[56]
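
The selection process described above, in which differing replication outputs shift the frequencies of RNA types in a population, can be illustrated with a toy calculation. This is a deliberately simple sketch and not a model from the cited studies: the three RNA types, their starting frequencies and their replication rates are all hypothetical, and mutation and degradation are ignored.

```python
from typing import List, Sequence


def next_generation(freqs: Sequence[float], rates: Sequence[float]) -> List[float]:
    """One round of replication: each type grows in proportion to its
    replication rate, then frequencies are rescaled to sum to 1."""
    weighted = [f * r for f, r in zip(freqs, rates)]
    total = sum(weighted)
    return [w / total for w in weighted]


# Hypothetical starting population: one common slow replicator, two rare faster ones.
freqs = [0.98, 0.01, 0.01]
rates = [1.0, 1.2, 1.5]  # hypothetical relative copies produced per round

for generation in range(30):
    freqs = next_generation(freqs, rates)

print([round(f, 3) for f in freqs])
```

Even though the fastest replicator starts at only 1% of the population in this example, it dominates after a few dozen rounds, which is the sense in which replication output alone can act as a selective force.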

Competition between RNA molecules may have favored the emergence of cooperation between different RNA chains, opening the way for the formation of the first protocell. Eventually, RNA chains developed with catalytic properties that help amino acids bind together (a process called peptide bonding). These amino acids could then assist with RNA synthesis, giving those RNA chains that could serve as ribozymes a selective advantage. The ability to catalyze one step in protein synthesis, aminoacylation of RNA, has been demonstrated in a short (five-nucleotide) segment of RNA.[57]

One of the problems with the RNA world hypothesis is to discover the pathway by which RNA became upgraded to the DNA system. Ken Stedman of Portland State University in Oregon may have found the solution. While filtering virus-sized particles from a hot acidic lake in Lassen Volcanic National Park, California, he discovered 400,000 pieces of viral DNA. Some of these, however, contained a protein coat of reverse transcriptase enzyme normally associated with RNA-based retroviruses. Virologists such as Luis Villarreal of the University of California, Irvine believe that this lack of respect for biochemical boundaries would have been characteristic of a pre RNA virus world up to 4 billion years ago.[58][59] This finding bolsters the argument for the transfer of information from the RNA world to the emerging DNA world before the emergence of the Last Universal Common Ancestor. The research suggests that the diversity of this virus world is still with us.

Additional evidence supporting the concept of an RNA world has resulted from research on viroids, the first representatives of a novel domain of "subviral pathogens."[61][62] Viroids are mostly plant pathogens, which consist of short stretches (a few hundred nucleobases) of highly complementary, circular, single-stranded, and non-coding RNA without a protein coat. Compared with other infectious plant pathogens, viroids are extremely small in size, ranging from 246 to 467 nucleobases. In comparison, the genomes of the smallest known viruses capable of causing an infection are about 2,000 nucleobases long.[63]

In 1989, Diener proposed that, based on their characteristic properties, viroids are more plausible "living relics" of the RNA world than are introns or other RNAs then so considered.[64] If so, viroids have attained potential significance beyond plant pathology to evolutionary biology, by representing the most plausible macromolecules known capable of explaining crucial intermediate steps in the evolution of life from inanimate matter (see: abiogenesis).

Apparently, Diener's hypothesis lay dormant until 2014, when Flores et al. published a review paper, in which Diener's evidence supporting his hypothesis was summarized.[65] In the same year, a New York Times science writer published a popularized version of Diener's proposal, in which, however, he mistakenly credited Flores et al. with the hypothesis' original conception.[66]

Pertinent viroid properties listed in 1989 are: 1. their small size, imposed by error-prone replication; 2. their high guanine and cytosine content, which increases stability and replication fidelity; 3. their circular structure, which assures complete replication without genomic tags; 4. existence of structural periodicity, which permits modular assembly into enlarged genomes; 5. their lack of protein-coding ability, consistent with a ribosome-free habitat; and 6. replication mediated in some by ribozymes—the fingerprint of the RNA world.[65]

The existence, in extant cells, of RNAs with molecular properties predicted for RNAs of the RNA World constitutes an additional argument supporting the RNA World hypothesis.

Eigen et al.[67] and Woese[68] proposed that the genomes of early protocells were composed of single-stranded RNA, and that individual genes corresponded to separate RNA segments, rather than being linked end-to-end as in present day DNA genomes. A protocell that was haploid (one copy of each RNA gene) would be vulnerable to damage, since a single lesion in any RNA segment would be potentially lethal to the protocell (e.g. by blocking replication or inhibiting the function of an essential gene).

Vulnerability to damage could be reduced by maintaining two or more copies of each RNA segment in each protocell, i.e. by maintaining diploidy or polyploidy. Genome redundancy would allow a damaged RNA segment to be replaced by an additional replication of its homolog. However for such a simple organism, the proportion of available resources tied up in the genetic material would be a large fraction of the total resource budget. Under limited resource conditions, the protocell reproductive rate would likely be inversely related to ploidy number. The protocell's fitness would be reduced by the costs of redundancy. Consequently, coping with damaged RNA genes while minimizing the costs of redundancy would likely have been a fundamental problem for early protocells.

A cost-benefit analysis was carried out in which the costs of maintaining redundancy were balanced against the costs of genome damage.[69] This analysis led to the conclusion that, under a wide range of circumstances, the selected strategy would be for each protocell to be haploid, but to periodically fuse with another haploid protocell to form a transient diploid. The retention of the haploid state maximizes the growth rate. The periodic fusions permit mutual reactivation of otherwise lethally damaged protocells. If at least one damage-free copy of each RNA gene is present in the transient diploid, viable progeny can be formed. For two, rather than one, viable daughter cells to be produced would require an extra replication of the intact RNA gene homologous to any RNA gene that had been damaged prior to the division of the fused protocell. The cycle of haploid reproduction, with occasional fusion to a transient diploid state, followed by splitting to the haploid state, can be considered to be the sexual cycle in its most primitive form.[69][70] In the absence of this sexual cycle, haploid protocells with a damage in an essential RNA gene would simply die.

This model for the early sexual cycle is hypothetical, but it is very similar to the known sexual behavior of the segmented RNA viruses, which are among the simplest organisms known. Influenza virus, whose genome consists of 8 physically separated single-stranded RNA segments,[71] is an example of this type of virus. In segmented RNA viruses, “mating” can occur when a host cell is infected by at least two virus particles. If these viruses each contain an RNA segment with lethal damage, multiple infection can lead to reactivation, provided that at least one undamaged copy of each virus gene is present in the infected cell. This phenomenon is known as “multiplicity reactivation”. Multiplicity reactivation has been reported to occur in influenza virus infections after induction of RNA damage by UV irradiation[72] and ionizing radiation.[73]
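
The benefit of fusion, or of multiple infection, can be illustrated with a simple probability estimate. The sketch below is not the cost-benefit analysis cited above; it assumes that each RNA segment is lethally damaged independently with the same probability, uses the 8 segments of influenza virus as the genome size, and picks a hypothetical damage probability purely for illustration.

```python
def haploid_viable(p_damage: float, n_segments: int) -> float:
    """Probability that a single genome has every segment undamaged."""
    return (1.0 - p_damage) ** n_segments


def pooled_viable(p_damage: float, n_segments: int, copies: int = 2) -> float:
    """Probability that, with `copies` genomes pooled in one cell, at least
    one undamaged copy of every segment is present."""
    at_least_one_intact = 1.0 - p_damage ** copies
    return at_least_one_intact ** n_segments


N_SEGMENTS = 8   # influenza virus genome segments, from the text
P_DAMAGE = 0.3   # hypothetical per-segment probability of lethal damage

alone = haploid_viable(P_DAMAGE, N_SEGMENTS)
fused = pooled_viable(P_DAMAGE, N_SEGMENTS, copies=2)

print(f"single genome viable:      {alone:.3f}")
print(f"two pooled genomes viable: {fused:.3f}")
```

Under these assumptions, pooling two damaged genomes raises the chance of having a complete undamaged gene set several-fold, which matches the qualitative argument for transient diploidy and for multiplicity reactivation.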

Patrick Forterre has been working on a novel hypothesis, called "three viruses, three domains":[74] that viruses were instrumental in the transition from RNA to DNA and the evolution of Bacteria, Archaea, and Eukaryota. He believes the last common ancestor (specifically, the "last universal cellular ancestor")[74] was RNA-based and evolved RNA viruses. Some of the viruses evolved into DNA viruses to protect their genes from attack. Through the process of viral infection into hosts, the three domains of life evolved.[74][75] Another proposal is that RNA synthesis might have been driven by temperature gradients, in the process of thermosynthesis.[76] Single nucleotides have been shown to catalyze organic reactions.[77]

The hypothesized existence of an RNA world does not exclude a "Pre-RNA world", where a metabolic system based on a different nucleic acid is proposed to pre-date RNA. A candidate nucleic acid is peptide nucleic acid (PNA), which uses simple peptide bonds to link nucleobases.[78] PNA is more stable than RNA, but its ability to be generated under prebiological conditions has yet to be demonstrated experimentally.

Threose nucleic acid (TNA) has also been proposed as a starting point, as has glycol nucleic acid (GNA); like PNA, both lack experimental evidence for their abiogenesis.

The iron-sulfur world theory proposes that simple metabolic processes developed before genetic materials did, and these energy-producing cycles catalyzed the production of genes.

Some of the difficulties in producing the precursors on Earth are bypassed by another alternative or complementary theory for their origin, panspermia. It raises the possibility that the earliest life on this planet was carried here from somewhere else in the galaxy, possibly on meteorites similar to the Murchison meteorite.[83] This does not invalidate the concept of an RNA world, but posits that this world or its precursors originated not on Earth but rather on another, probably older, planet.

There are hypotheses that are in direct conflict with the RNA world hypothesis. The relative chemical complexity of the nucleotide and the unlikelihood of it spontaneously arising, along with the limited number of combinations possible among four base forms and the need for RNA polymers of some length before enzymatic activity is seen, have led some to reject the RNA world hypothesis in favor of a metabolism-first hypothesis, in which the chemistry underlying cellular function arose first and the ability to replicate and facilitate this metabolism arose later. Another proposal is that the dual molecule system we see today, where a nucleotide-based molecule is needed to synthesize protein, and a protein-based molecule is needed to make nucleic acid polymers, represents the original form of life.[84] This theory is called the Peptide-RNA world, and offers a possible explanation for the rapid evolution of high-quality replication in RNA (since proteins are catalysts), with the disadvantage of having to postulate the formation of two complex molecules, an enzyme (from peptides) and an RNA (from nucleotides). In this Peptide-RNA World scenario, RNA would have contained the instructions for life while peptides (simple protein enzymes) would have accelerated key chemical reactions to carry out those instructions.[85] The study leaves open the question of exactly how those primitive systems managed to replicate themselves, something neither the RNA World hypothesis nor the Peptide-RNA World theory can yet explain, unless polymerases (enzymes that rapidly assemble the RNA molecule) played a role.[85]

The RNA world hypothesis, if true, has important implications for the definition of life. For most of the time that followed Watson and Crick's elucidation of DNA structure in 1953, life was largely defined in terms of DNA and proteins: DNA and proteins seemed the dominant macromolecules in the living cell, with RNA only aiding in creating proteins from the DNA blueprint.

The RNA world hypothesis places RNA at center stage when life originated. This has been accompanied by many studies[citation needed] in the last ten years that demonstrate important aspects of RNA function not previously known and support the idea of a critical role for RNA in the mechanisms of life. The RNA world hypothesis is supported by the observation that ribosomes are ribozymes: the catalytic site is composed of RNA, and proteins hold no major structural role and are of peripheral functional importance. This was confirmed with the deciphering of the 3-dimensional structure of the ribosome in 2001. Specifically, peptide bond formation, the reaction that binds amino acids together into proteins, is now known to be catalyzed by an adenine residue in the rRNA.