Making RNA in the prebiotic world

The RNA World hypothesis posits that RNA was one of the first self-replicating molecules leading to the origin of life. The nucleotide bases of RNA—A, U, C, and G—are chemically complex, and it has been unclear how the large purine bases A and G might have arisen on prebiotic Earth. Becker et al. show that the A and G bases can be synthesized easily and in high yield from prebiotically reasonable precursors, lending further support to the RNA World hypothesis.

Abstract

The origin of life is believed to have started with prebiotic molecules reacting along unidentified pathways to produce key molecules such as nucleosides. To date, a single prebiotic pathway to purine nucleosides has been proposed. It is considered inefficient because of its lack of regioselectivity and low yields. We report that the condensation of formamidopyrimidines (FaPys) with sugars provides the natural N-9 nucleosides with extreme regioselectivity and in good yields (60%). The FaPys are available from formic acid and aminopyrimidines, which are in turn available from prebiotic molecules that were also detected during the Rosetta comet mission. This nucleoside formation pathway can be fused to sugar-forming reactions to produce pentosides, providing a plausible scenario of how purine nucleosides may have formed under prebiotic conditions.

It is assumed that life originated from a simple set of small molecules. These prebiotic molecules can be found on comets and as components of Earth’s early atmosphere (1, 2). The RNA-world hypothesis (3, 4) posits that these molecules assembled to produce nucleosides, and later informational polymers, which are able to replicate themselves. A prebiotically plausible route to pyrimidine nucleosides provides these key building blocks in a stepwise reaction sequence in high yields (5). The situation is less clear for the purine nucleosides, which are not only central components of DNA and RNA but also constituents of adenosine triphosphate and guanosine triphosphate and substructures in many coenzymes (6).

Presently, the only prebiotic route to purine nucleosides proposes the condensation of the complete nucleobase, such as adenine (1, R1 = NH2, R2 = H), with ribose 2 in the molten state (Fig. 1A) (7). This reaction provides complex mixtures of purine ribosides with yields of about 4% for adenosine (β-fA). The reaction needs adenine 1 as its hydrochloride salt and subsequent equilibration of the mixture under basic (NH4OH) conditions (8). Because many of the N atoms of the purine skeleton can react, a regioselectivity problem is encountered, which is responsible for the low yields. In addition, the N-9 atoms of the purine heterocycles, which connect the nucleobase with the sugar in canonical purine ribosides, are particularly unreactive. This, together with the fact that the purine heterocycles themselves are available only in low yields under prebiotic conditions (0.5%) (9, 10), prompted us to search for an alternative prebiotic pathway (Fig. 1A, FaPy pathway).

Fig. 1. (A) The FaPy pathway starts with the formamidopyrimidines 11 to 13, which are prebiotically available from simple starting materials (encircled) via multi-aminopyrimidines. The molecules in red are derived directly from NH4CN. For comparison, Orgel’s pathway is shown involving coupling of the full nucleobases. Ribose is generated from glycolaldehyde and formaldehyde, formed from an early atmosphere containing humid CO2 (left). (B) Nitrogen reactivity analysis of the multi-aminopyrimidines and crystal structure of the monoprotonated 4,5,6-triaminopyrimidine 8 with the bond lengths and bond angles of the critical amino groups.

The FaPy pathway starts from aminopyrimidines, which are accessible from simple molecules such as NH4CN under prebiotically plausible conditions. Guanidine 3 (available from cyanamide and NH3), for example, reacts with the HCN trimer aminomalononitrile 4 to produce tetraaminopyrimidine 5 (72%) (11). Despite the high nitrogen content of such molecules, they are very robust. The aminopyrimidines are often electron rich and hence readily oxidized in an oxygen-containing atmosphere but stable in the absence of oxygen (2). Other amino-substituted pyrimidines may have formed when the HCN trimer 4 is hydrolyzed in the presence of formaldehyde to produce aminocyanoacetamide 6 (12), which reacts with guanidine 3 to produce triaminopyrimidinone 7 (53%) (12). Finally, triaminopyrimidine 8 is available by condensation of malononitrile 9 with thiourea 10 (84%) (13), followed by nitrosation in water and desulfuration (Ni and H2 or HCOOH) (see the supplementary materials) (14). Thiourea 10 is prebiotically available via addition of H2S to cyanamide or directly from ammonium thiocyanate (15). In contrast to the purines, the aminopyrimidines formed in this way (5, 7, 8) are highly soluble in water.

Although the aminopyrimidines 5, 7, and 8 feature multiple nucleophilic positions, a deeper reactivity analysis and symmetry relations uncover a proton-assisted reactivity guidance (red, green, and blue N atoms in Fig. 1B) that eliminates the regioselectivity problem. Because of the exocyclic amino groups, the two in-ring N atoms (red, Fig. 1B) are unusually basic (pKa ~ 7.5), which leads to monoprotonation even under slightly acidic conditions. Owing to the symmetry of the molecules, protonation of either one of the two ring N atoms blocks the reactivity of the exocyclic amino groups (2, 4, and 6, in blue, Fig. 1B) in the ortho and para positions. As such, the proton serves as a protecting group for most amines on the ring structure, leaving only the N-5 amino group (green, Fig. 1B) nucleophilic. To support this analysis, we crystallized monoprotonated triaminopyrimidine 8. Structural analysis shows short C–N bonds (1.33 Å) and planar geometry for amino groups 4 and 6, allowing full conjugation with the electron-deficient pyrimidine ring. In contrast, the N-5 amino group (green, Fig. 1B) has a longer C–N bond (1.41 Å) and pyramidal geometry (109°), hence showing increased sp3 character, which results in a nucleophilic lone pair oriented at an angle of 65° to the ring plane.
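As a rough illustration of the protonation argument above, the Henderson-Hasselbalch relation gives the fraction of an aminopyrimidine that carries a ring proton at a given pH. The sketch below assumes a single protonation equilibrium with the pKa of about 7.5 quoted in the text and ignores any second protonation; the function name and the pH grid are illustrative choices, not part of the original study.

```python
def fraction_protonated(pH: float, pKa: float = 7.5) -> float:
    """Fraction of a base present as its conjugate acid (BH+) at a given pH.

    Henderson-Hasselbalch for one equilibrium: [BH+]/[B] = 10**(pKa - pH).
    The default pKa of 7.5 is the approximate value quoted in the text.
    """
    return 1.0 / (1.0 + 10.0 ** (pH - pKa))


# Illustrative pH grid (an assumption, not experimental conditions).
for pH in (5.5, 6.5, 7.5, 8.5, 9.5):
    print(f"pH {pH:4.1f}: {fraction_protonated(pH):6.1%} monoprotonated")
```

Under these assumptions, about 90% of the molecules are monoprotonated one pH unit below the pKa and about half at pH 7.5, which is consistent with the proton acting as a protecting group under slightly acidic conditions and being released under neutral to slightly basic conditions.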

When we heated aminopyrimidines 5, 7, and 8 with formic acid or formamide, we obtained in all cases, in agreement with this analysis, regioselectively only the N-5 formylated FaPy product (11 to 13, Fig. 1A) in excellent yields (70 to 90%). Formylation provides the C1 unit needed for later purine formation. It also acts as a protecting group for the 5-NH2 group. The amino groups 2, 4, and 6 are now, under neutral conditions, available for a condensation reaction with a sugar. The amino groups in ortho position to the formamide are symmetry-related by a C2 axis (in 11 and 13), which eliminates the second regioselectivity problem. Reaction of either one with ribose provides the same product. Indeed, when we reacted the formamides 11 to 13 with ribose 2, followed by stirring of the mixture under slightly basic conditions, we obtained in all cases regioselectively only the N-9–reacted α/β-furanosides and α/β-pyranosides (Fig. 2). To assign the compounds detected by high-performance liquid chromatography and mass spectrometry (HPLC-MS), we synthesized all expected α- and β-pyranosides and furanosides (Fig. 2 and supplementary materials) and performed coinjection studies. To further support the data regarding adenosine, we also isolated this compound (β-fA) by HPLC and confirmed the correct structure by nuclear magnetic resonance (fig. S1). The β-anomers, not the α-anomers, are observed as the main products in all experiments. The reaction mechanism involves condensation of FaPy compounds 11 to 13 with ribose to produce an imine intermediate, followed by sugar-ring closure to produce the FaPy ribosides. A subsequent second cyclization under mildly basic conditions provides the purine skeleton (Fig. 2A). This cyclization can occur even without previous reaction with the sugar, which generates the corresponding (sugar-free) nucleobase as a side product. The formed heterocycles can, however, be prebiotically recycled. Formamide in the presence of TiO2 or, alternatively, riboflavin in the presence of light was shown to allow degradation of the purines back into the corresponding FaPy compounds (Fig. 1A and fig. S2) (16, 17).

We studied purine nucleoside synthesis in detail and found that slightly basic conditions gave the highest yields. Reaction of 13 with 2 in pure ammonia solutions (0.5 M) provided the main purine nucleosides β-pyranosyladenosine (β-pA) (35%) and β-fA (12%) in good yields, although ammonia will certainly react with ribose as well (Fig. 2C). We found that basic amino acids (0.5 M) used in place of ammonia, such as lysine (β-pA, 13% and β-fA, 4%) and arginine (β-pA, 22% and β-fA, 6%), gave satisfactory yields as well (fig. S3, A and B). Good yields were also obtained when we used borax. In this case, the FaPy building block 13 reacted with ribose 2 (Fig. 2B) to produce mainly furanosides (18) with the canonical β-adenosine (β-fA) formed as the main product (20%). Similar results were obtained with a carbonate/borate buffer (19). Under these conditions, the yields for β-pA and β-fA were 15% and 18%, respectively (fig. S3C). Silicate buffers, reported to stabilize pentoses, however, failed in our hands (20). The FaPy pathway also provides direct access to diaminopurine and guanine ribosides upon reaction of FaPy building block 11 (Fig. 2D) and 12 (Fig. 2E) with ribose 2. Although we did not optimize the reaction conditions for these nucleosides, the purine β-furanosides are obtained in yields between 4 and 6% when the reaction is equilibrated with alkylamines. The reaction of 11 with 2 directly provides some guanosine via hydrolysis of the diaminopurine moiety. Formation of the alternative hydrolysis product isoguanosine was not observed (21, 22). Importantly, the FaPy pathway provides canonical β-adenosine in yields of up to 20%. The highest total yields for N-9 ribosides, of up to 60%, were achieved using simple amines as bases that were also discovered on comet 67P/Churyumov-Gerasimenko (Fig. 2, B to E) (1).
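To put the quoted step yields side by side, the following sketch multiplies the fractional yields of consecutive steps for the two routes. It uses only figures stated in the text (about 0.5% for prebiotic purine formation, about 4% for adenosine from adenine and ribose, 70 to 90% for formylation, and up to 20% for β-fA from FaPy 13). It assumes that step yields simply multiply, and it starts the FaPy route at the aminopyrimidine because not all upstream yields are given here, so the comparison is indicative rather than like for like. The function and variable names are illustrative.

```python
from math import prod


def overall_yield(step_yields):
    """Multiply fractional step yields to get the yield over the whole sequence."""
    return prod(step_yields)


# Route via the preformed nucleobase: purine formation (~0.5%), then ribosylation (~4%).
nucleobase_route = overall_yield([0.005, 0.04])

# FaPy route, counted from the aminopyrimidine (assumption: upstream steps omitted):
# formylation (70 to 90%), then beta-fA formation (up to 20%).
fapy_low = overall_yield([0.70, 0.20])
fapy_high = overall_yield([0.90, 0.20])

print(f"nucleobase route: {nucleobase_route:.3%}")
print(f"FaPy route:       {fapy_low:.0%} to {fapy_high:.0%}")
```

With these inputs the nucleobase route comes out near 0.02% and the two FaPy steps near 14 to 18%, which illustrates why the low availability of the purine heterocycles weighs so heavily on the older route.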

We next asked whether the FaPy pathway can be linked to potential prebiotic sugar syntheses (Fig. 3). An attractive route involves the formaldehyde and glycolaldehyde 14–based formose reaction, e.g., as a mineral-guided version (19). Formaldehyde and small amounts of 14 are both formed by electrical discharge of humid CO2 (23). The formed formaldehyde can further react with HCN by photoreduction to produce 14 (24). A central reaction in the mineral-guided sugar formation is an aldol addition of 14 with formaldehyde, which produces glyceraldehyde 15. This can react with a second molecule of 14 to give pentose sugars (Fig. 1A) (19). When we reacted 14 and 15 (both 16.7 mM) in the presence of Ca(OH)2 (23 mM) (25) with FaPy 13, we indeed found purine pentosides as the main products (estimated yield, 21%). Quantification of D/L β-adenosine (D/L-β-fA, Fig. 3) gave 0.9% (based on 13). Despite overlapping signals, we were also able to estimate a yield of 2.7% for the D/L-β-pA. When we performed the reaction in the presence of borax, the yield of D/L-β-fA among the aldopentose products ultimately increased to 1.3% (18). Product identification was achieved by HPLC-MS and coinjection studies (Fig. 3C). Small amounts of purine threosides and erythrosides were also found (fig. S4).

Fig. 3. (A) Glycolaldehyde 14 and glyceraldehyde 15 react in the presence of Ca(OH)2 and formamidopyrimidine 13 to give tetrosides and pentosides. (B) HPLC analysis (UV detection) of the reaction. The four main peaks (light blue, peak at 24.30 min, two peaks overlapping) are the α- or β-pyranosyl isomers of the aldopentoses (arabinose, lyxose, ribose, and xylose). (C) Detection of the purine pentosides by HPLC-MS with a specific mass filter and coinjection of synthetic material.

The FaPy route delivers ribosides even directly from formaldehyde and glycolaldehyde 14 (Fig. 4). We mixed formaldehyde (0.21 M) and glycolaldehyde (0.40 M) with Ca(OH)2 (0.5 M), borax (0.125 M), or NH3 (0.5 M) in water and heated the mixture for 1 hour at 65°C. The FaPy compound 13 (0.12 M) was added and the mixture was heated in an open vessel for another 8 hours at 100°C. Water (or 0.5 M NH3) was added to the obtained solid, and the mixture was heated for another 2 to 3 days at 100°C. Reactions containing borax or Ca(OH)2 gave complex spectra with only small signals for nucleosides. In the experiments with NH3, however, purine threosides and erythrosides were formed as the main nucleoside components, with the threose nucleosides slightly favored (19). To identify the tetrosides, we again performed coinjection studies (Fig. 4, B and C). Besides the purine tetrosides, we were able to identify D/L-β-pA and D/L-β-fA (D/L-adenosine) in this one-pot reaction (Fig. 4D), with a preference for the β-pyranosyl compound (D/L-β-pA, 0.2% based on 13).
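The concentrations quoted for the one-pot experiment allow a simple limiting-reagent estimate of how much sugar could form at most. The sketch below assumes that all concentrations refer to one common solution volume, that a pentose (C5) consumes one formaldehyde (C1) and two glycolaldehyde (C2) molecules via glyceraldehyde, and that a tetrose (C4) consumes two glycolaldehyde molecules. Side reactions are ignored, and the helper name and dictionary keys are illustrative rather than taken from the original work.

```python
def max_product(conc, stoich):
    """Upper bound on product concentration (M) set by the limiting reagent.

    conc:   reagent name -> concentration in M
    stoich: reagent name -> molecules consumed per product molecule
    """
    return min(conc[name] / n for name, n in stoich.items())


# Concentrations quoted in the text (assumed to share one solution volume).
conc = {"formaldehyde": 0.21, "glycolaldehyde": 0.40, "FaPy_13": 0.12}

# Pentose (C5) = 1 formaldehyde (C1) + 2 glycolaldehyde (C2), via glyceraldehyde.
pentose_cap = max_product(conc, {"formaldehyde": 1, "glycolaldehyde": 2})

# Tetrose (C4) = 2 glycolaldehyde (C2); assumed pathway that uses no formaldehyde.
tetrose_cap = max_product(conc, {"glycolaldehyde": 2})

# FaPy 13 available per pentose that could form at most.
fapy_per_pentose = conc["FaPy_13"] / pentose_cap

print(f"pentose ceiling: {pentose_cap:.2f} M")
print(f"tetrose ceiling: {tetrose_cap:.2f} M")
print(f"FaPy 13 per pentose ceiling: {fapy_per_pentose:.2f}")
```

Both ceilings draw on the same glycolaldehyde pool, so they are alternatives rather than additive amounts. Under these assumptions the formaldehyde-to-glycolaldehyde ratio is close to the 1:2 needed for pentoses, yet every tetrose formed removes two glycolaldehyde molecules from that pool.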

Overall, we show that the reported FaPy pathway with the corresponding aminopyrimidine (11, 12) precursor molecules provides a feasible pathway to purine nucleosides that is compatible with early Earth conditions and that provides adenosine under a variety of conditions in yields of up to 20%. The starting materials are small organic molecules such as HCN, NH3, and particularly formic acid derivatives that were all discovered on comets like 67P/Churyumov-Gerasimenko (1).

Acknowledgments: We thank the Deutsche Forschungsgemeinschaft (SFB749, SFB646, and SFB1032) and the Excellence cluster EXC114 (Center for Integrated Protein Science) for financial support. Additional figures and synthetic procedures can be found in the supplementary materials. The crystallographic data (CCDC 1416691) can be obtained from The Cambridge Crystallographic Data Centre via www.ccdc.cam.ac.uk.