Experimental Average Values

To compare the global analysis results from MD simulations stored in our database with experimental values, we built a multi-step pipeline to obtain those values.

Starting from all the PDB files classified as Nucleic acid, we built a complete dataset of duplex structures following these steps:

Splitting NMR models into separate files

Identifying DNA/RNA chains that form duplexes and extracting them into separate files

Applying symmetry to those PDB files that need a transformation to obtain the final biomolecule (see PDB REMARK 350)
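The first and third steps above can be sketched in a few lines of Python. This is a minimal illustration, not the pipeline's actual code: the function names split_models, parse_biomt and apply_operator are hypothetical, the parser relies on whitespace-separated REMARK 350 BIOMT records, and it assumes the file describes a single biomolecule.

```python
def split_models(pdb_text):
    """Return one PDB-formatted string per MODEL/ENDMDL block."""
    models, current = [], []
    for line in pdb_text.splitlines():
        record = line[:6].strip()
        if record == "MODEL":
            current = []
        elif record == "ENDMDL":
            models.append("\n".join(current + ["END"]))
            current = []
        elif record in ("ATOM", "HETATM", "TER"):
            current.append(line)
    if current:  # file without MODEL/ENDMDL records: treat as a single model
        models.append("\n".join(current + ["END"]))
    return models


def parse_biomt(pdb_text):
    """Collect REMARK 350 BIOMT operators as {serial: rows of [r1, r2, r3, t]}.

    Assumption: only one BIOMOLECULE block is present, so each operator
    serial number appears with exactly three rows (BIOMT1..BIOMT3).
    """
    operators = {}
    for line in pdb_text.splitlines():
        fields = line.split()
        if (len(fields) == 8 and fields[:2] == ["REMARK", "350"]
                and fields[2] in ("BIOMT1", "BIOMT2", "BIOMT3")):
            operators.setdefault(int(fields[3]), []).append(
                [float(value) for value in fields[4:]])
    return operators


def apply_operator(rows, xyz):
    """Apply one operator (rotation plus translation) to a coordinate triple."""
    return tuple(
        sum(row[j] * xyz[j] for j in range(3)) + row[3] for row in rows
    )
```

Each string returned by split_models could then be written to its own file, and apply_operator could be mapped over the atom coordinates of a chain to generate its symmetry mate.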

The next step in the pipeline applies a list of filters to the resulting set of duplex nucleic acid structures. These filters act as a pre-processing stage for the analyses we are interested in and include, for example, checks for differing strand lengths, possible sequence mismatches, and the presence of the essential atoms needed for subsequent analyses. The final step runs the Curves+ program to extract helical parameters from the generated PDB files.
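The kind of filters described above could look like the following sketch. Everything in it is an assumption made for illustration: the function name check_duplex, the representation of a strand as a list of (base, atom names) tuples, and in particular the ESSENTIAL_ATOMS list, since the atoms actually required by the pipeline are not stated here.

```python
# Placeholder list: the atoms the real pipeline requires are not specified.
ESSENTIAL_ATOMS = ("C1'", "C4'", "O4'")

WATSON_CRICK = {("A", "T"), ("T", "A"), ("G", "C"), ("C", "G"),
                ("A", "U"), ("U", "A")}


def check_duplex(strand1, strand2, essential=ESSENTIAL_ATOMS):
    """Return a list of problems; an empty list means the duplex passes.

    Both strands are given 5'->3' as lists of (base, atom_names) tuples.
    """
    problems = []
    if len(strand1) != len(strand2):
        problems.append("strand lengths differ")
    # Strands are antiparallel: residue i of strand 1 pairs with residue i
    # counted from the 3' end of strand 2.
    for i, (res1, res2) in enumerate(zip(strand1, reversed(strand2)), start=1):
        if (res1[0], res2[0]) not in WATSON_CRICK:
            problems.append(f"mismatch at pair {i}: {res1[0]}-{res2[0]}")
    for label, strand in (("strand 1", strand1), ("strand 2", strand2)):
        for i, (base, atoms) in enumerate(strand, start=1):
            missing = sorted(set(essential) - set(atoms))
            if missing:
                problems.append(
                    f"{label} residue {i} ({base}) lacks {', '.join(missing)}")
    return problems
```

A structure would be kept only when check_duplex returns an empty list; returning the reasons rather than a boolean makes it easy to report why a given PDB entry was discarded.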

As an example of the results, the following table shows the average values of the base-pair step helical parameters computed over the complete list of experimental structures, together with values taken from similar previous work in the group (P. Dans et al., NAR 2012, 40, 10668-10678).

Amount of data (N), average values and standard deviations of the six base-pair step parameters for the 10 unique base-pair steps (bps).

For each step, the values from the cited paper for naked-DNA structures (first row), PDB structures (second row), and MD simulations with parmbsc0 (third row) are shown together with our new values computed for the set of filtered PDB files (fourth row).

Values for simulations were obtained from time averages computed for individual steps in each sequence.
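To make the table layout concrete, the sketch below shows one way to group per-step values into the 10 unique base-pair steps and to compute N, the average and the standard deviation for each. It is an illustration under stated assumptions: canonical_step and summarize are hypothetical names, the input is a plain list of (step, value) pairs, and the population standard deviation (pstdev) is used because the estimator behind the table is not specified. Sign conventions for strand-dependent parameters are ignored here.

```python
from collections import defaultdict
from statistics import mean, pstdev

COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def canonical_step(step):
    """Map a dinucleotide step to its unique representative.

    A step and its reverse complement describe the same base-pair step
    (e.g. TT and AA), so the 16 dinucleotides reduce to 10 unique steps.
    """
    reverse_complement = "".join(COMPLEMENT[base] for base in reversed(step))
    return min(step, reverse_complement)


def summarize(observations):
    """Return {unique step: (N, average, standard deviation)}."""
    grouped = defaultdict(list)
    for step, value in observations:
        grouped[canonical_step(step)].append(value)
    return {
        step: (len(values), mean(values), pstdev(values))
        for step, values in sorted(grouped.items())
    }
```

For instance, summarize([("CG", 3.0), ("CG", 3.4), ("TT", 3.5)]) reports two observations under CG and files the TT value under AA.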

These computed experimental values are used for comparison in the global analysis plots (see the analyses section). In the following example, the vertical red line shows the experimental average value of the Rise helical parameter, overlaid on the histogram of the average Rise values stored in the database for a particular base-pair step (in this case CG). The vertical blue line shows the average of the histogram data (MD simulation data).