Abstract

The database of RNA sequences is exploding, but knowledge of energetics, structures, and dynamics lags behind. All‐atom computational
methods, such as molecular dynamics, hold promise for closing this gap. New algorithms and faster computers have accelerated
progress in improving the reliability and accuracy of predictions. Currently, the methods can facilitate refinement of experimentally
determined nuclear magnetic resonance and x‐ray structures, but are ‘unreliable’ for predictions based only on sequence. Much
remains to be discovered, however, about the many molecular interactions driving RNA folding and the best way to approximate
them quantitatively. The large number of parameters required means that a wide variety of experimental results will be required
to benchmark force fields and different approaches. As computational methods become more reliable and accessible, they will
be used by an increasing number of biologists, much as x‐ray crystallography has expanded. Thus, many fundamental physical
principles underlying the computational methods are described. This review presents a summary of the current state of molecular
dynamics as applied to RNA. It is designed to be helpful to students, postdoctoral fellows, and faculty who are considering
or starting computational studies of RNA. WIREs RNA 2017, 8:e1422. doi: 10.1002/wrna.1422

Images

Free energy integration. This example shows the alchemical transformation of a G–C pair to an iG–iC pair, where the amino and carbonyl groups are transposed relative to G and C. In free energy integration, the system is transformed from one molecule to another, where λ indicates the progress. In this example, λ = 0 is the G–C pair and λ = 1 is the iG–iC pair. For intermediate values of λ, the pair is partially G–C and partially iG–iC.

Umbrella sampling can sample across barriers in landscapes. The x‐axis of these plots is a reaction coordinate, which can be any quantity that describes progress along the conformational change pathway, such as a distance, angle, or dihedral angle. The y‐axis is the Gibbs free energy change. In panel (a), a molecular landscape is shown, where there are two low free energy basins and a barrier between them. If this barrier is much larger than thermal energy, then molecular dynamics will be slow to cross it. In panel (b), a restraint energy (red) is added to the landscape to lower the barrier between the two low free energy states. In this way, the barrier can be easily crossed, as shown in the final landscape (green). Although in theory this would work to enable sampling, in practice the determination of the height and shape of the barrier is the goal of the simulations, hence it is not clear what potential would suffice to use as the restraint. Panel (c) shows that a series of umbrella potentials (red) can be used to guide the molecule across the barrier. Each restraint is used in a separate, equilibrium simulation, and therefore each independent equilibrium simulation will sample a distinct region of the reaction coordinate. In panel (d), each simulation has been analyzed to determine the free energy for each equilibrium simulation, but an arbitrary offset results for each. To recover the original landscape (panel a), the free energies are ‘stitched’ together using the weighted histogram analysis method.

This figure shows the refinement process that RNA force fields have undergone from the early 1990s to the present. The black bars show these connections as a cladistic tree would, and their length is proportional to the number of years between publications. AMBER FF94, FF99, Parmbsc0, and FF10 are described in Refs , , , and , respectively. These record the development of the standard AMBER force field for RNA (note that more recent standard AMBER force fields for RNA are identical to FF10). χ Yildirim and ParmTor are described in Refs and , respectively. Chen and Garcia's force field is described in Ref , and is based on the hypothesis that FF99 overemphasizes stacking which they reduce by scaling down van der Waals (VDW) radii and well depths to 95 and 80%, respectively, to agree with quantum mechanics (QM) (CCSD(T)) calculations. Cheatham's group hypothesized that phosphate oxygens might be incorrectly sized, based on work from the Case group, and produced Amber + vdW BB and Amber + vdW. All use parameters described in Ref . AmberPDB is the result of a new approach for force‐field fitting, described in Ref , where targeted metadynamics (TMD) simulations enforce a free energy curve along degrees of freedom explicitly parameterized by the force field, whereupon those parameters are fit to the TMD free energy curve. In this case the enforced free energy curve was the log‐odds of dihedral angles for α, β, ε, and from dinucleotides in the PDB. Amber + Aytenfisu is a refit of the dihedral parameters using a simultaneous linear regression to match QM energies calculated for an ensemble of dinucleotides amassed from the PDB and sampled from simulations, described in Ref . CHARMM has also offered force fields tuned for nucleic acids, with CHARMM22, 27, and 36 described in Refs , and , and , respectively.

An example of RNA sequence (left), secondary structure (middle), and tertiary structure (right). The secondary structure records base‐pairing associations for the sequence which, being mostly more favorable than other contacts, constrain tertiary structure. This is what is meant by the observation that RNA structure is hierarchical. One goal of computational methods is to accurately predict 3D structure from sequence.

Two structures commonly generated in molecular dynamics simulations of 5’GACC. (Left) An A‐form like structure consistent with nuclear magnetic resonance (NMR) spectra. (Right) A structure with G1 intercalated between C3 and C4; NMR does not detect this structure.

Results from tests of different force fields used to predict structure of UUCG tetraloop. Cluster 5 agrees best with the nuclear magnetic resonance structure, which is shown in green. (Reprinted with permission from Ref . Copyright 2015 Cold Spring Harbor Laboratory Press for the RNA Society)

Nuclear magnetic resonance (NMR) structures and free energy increments at 37°C for internal loops in 5′GCUGA¯GGCU2,5′GCGGA¯UGCU2,5′GCGGA¯CGC2, and 5′GGCGA¯GCC2. Note that the NMR spectra do not eliminate the possibility of up to 10% of a minor structure. Values for loop ΔG∘ are modified from previously published values because nonloop nearest neighbors were subtracted using parameters from Xia et al. and Chen et al.

Comparison between experimental and predicted change in ΔG∘ (kcal/mol) at 300 K for duplex formation when GC pairs are replaced by iGiC. The thermodynamic cycle requires that ΔG4∘−ΔG2∘−ΔG1∘+2ΔG3∘=0 so that ΔG4∘−ΔG1∘=ΔG2∘−2ΔG3∘. Revisions to FF99 for parameters of torsions, χ, α, γ, ε, ζ, and β increased agreement between experiment and prediction by about 10 kcal/mol.

Lennard‐Jones, JE. On the determination of molecular fields. II. From the equation of state of a gas. Proc R Soc Lond A Math Phys Eng Sci 1924, 106:463–477. Available at: http://rspa.royalsocietypublishing.org/content/106/738/463.