.cndo File Converter

We have developed the CanDo file format (.cndo) that describes the sequence, topology, and geometry of a programmed DNA assembly. We also provide a set of MATLAB scripts that convert a caDNAno design to a .cndo file.

Users of this tool are kindly requested to cite the following reference:

Here we use a double crossover (DX) tile, designed in caDNAno [6] on a honeycomb [3] lattice, to demonstrate the use of this .cndo file converter. The secondary structure is stored in the file tutorial_v2.json, and the nucleotide sequences are stored in the file tutorial_v2.csv. The following steps are run in MATLAB.

Step 1. Add the directory in the downloaded ZIP package to the MATLAB path.

>> addpath json2cndo

Step 2. Run the MATLAB function json2cndo.m. The syntax of this function is

[] = json2cndo(jsonPATH, csvPATH, cndoPATH, latticeType)

The first and second arguments are the paths to the .json and .csv files, respectively. The third argument is the path to the .cndo file to be created. The last argument must be either 'honeycomb' if the design is on a honeycomb [3] lattice or 'square' if the design is on a square [5] lattice. We assume that the .json and .csv files are saved in the current directory, so each file name can be passed directly as its path.
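A call for the tutorial files might look like the following sketch. The input file names and the lattice type come from the text above; the output name tutorial_v2.cndo is an assumption, and the existence checks are added only so that the snippet fails gracefully when the package or the input files are not available.

```matlab
% Input names are from the tutorial; the output name is an assumption.
jsonPATH    = 'tutorial_v2.json';
csvPATH     = 'tutorial_v2.csv';
cndoPATH    = 'tutorial_v2.cndo';   % assumed output file name
latticeType = 'honeycomb';          % the DX tile is on a honeycomb lattice

% Guard so the snippet fails gracefully outside the tutorial directory.
haveConverter = exist('json2cndo', 'file') == 2;
haveInputs    = exist(jsonPATH, 'file') == 2 && exist(csvPATH, 'file') == 2;

if haveConverter && haveInputs
    json2cndo(jsonPATH, csvPATH, cndoPATH, latticeType);
else
    fprintf('json2cndo or the tutorial input files were not found.\n');
end
```

For a design on a square lattice, the only change would be latticeType = 'square'.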

Each of the fields dnaTop(1), …, dnaTop(n_nt) represents a nucleotide and comprises five subfields. The subfield id is a positive integer that uniquely identifies the current nucleotide. The subfields up, down, and across are integers giving the unique identifiers of the neighboring nucleotide in the 5′-direction, the neighboring nucleotide in the 3′-direction, and the complementary nucleotide, respectively. If any of these nucleotides does not exist, the corresponding subfield equals -1. The subfield seq is the base identity of the current nucleotide: 'A', 'T', 'G', 'C', or another letter.
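To illustrate how the up and down subfields encode strand connectivity, the sketch below builds a toy dnaTop array and walks each strand from its 5′-end to its 3′-end. The data are hypothetical (two 2-nt strands forming two basepairs), and the variable names other than the five subfields are chosen for illustration only.

```matlab
% Toy topology (hypothetical data): two 2-nt strands forming two basepairs.
dnaTop = struct( ...
    'id',     {1, 2, 3, 4}, ...
    'up',     {-1, 1, -1, 3}, ...
    'down',   {2, -1, 4, -1}, ...
    'across', {4, 3, 2, 1}, ...
    'seq',    {'A', 'C', 'G', 'T'});

ids = [dnaTop.id];

% A 5'-end is a nucleotide with no neighbor in the 5'-direction (up == -1).
fivePrimeEnds = ids([dnaTop.up] == -1);

strands = cell(1, numel(fivePrimeEnds));
for s = 1:numel(fivePrimeEnds)
    current = fivePrimeEnds(s);
    strandSeq = '';
    while current ~= -1
        nt = dnaTop(ids == current);   % look up by id, not by array position
        strandSeq(end + 1) = nt.seq;   % append the base in 5'-to-3' order
        current = nt.down;             % step to the 3'-neighbor
    end
    strands{s} = strandSeq;
end

disp(strands);   % the two strands read 'AC' and 'GT'
```

This walk only finds strands that have a 5′-end. A circular strand has no nucleotide with up equal to -1 and would need a separate stopping condition.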

The remaining fields represent the geometry of the DNA assembly. The field dNode(i) consists of the Cartesian coordinates e0 of the center of the reference frame for the i-th basepair. The field triad(i) stores the three axes e1, e2, and e3 of the reference frame for the i-th basepair. The reference frame is defined using the 3DNA convention (Figure 2). The field id_nt(i) has two elements: id1 is the unique identifier of the preferred nucleotide in the i-th basepair (yellow in Figure 2), and id2 is the unique identifier of the other nucleotide in the i-th basepair (blue in Figure 2).

Figure 2. Definition of the center and three axes of the reference frame using the 3DNA convention. The axes e1, e2, and e3 point to the major groove, the preferred nucleotide (yellow), and along the duplex axis towards the 3′-direction of the strand with the preferred nucleotide, respectively.
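As a rough illustration of how e0 and the triad can be used together, the sketch below assembles the three axes into a matrix, checks that they form an orthonormal right-handed frame, and maps a point from frame coordinates to global coordinates. The numerical values are hypothetical, and the orthonormal, right-handed property is an assumption based on the 3DNA convention rather than something stated above.

```matlab
% Hypothetical reference frame for one basepair (not taken from a real design).
theta = pi / 6;
e0 = [1.0; 2.0; 3.0];                  % center of the reference frame
e1 = [cos(theta);  sin(theta); 0];     % towards the major groove
e2 = [-sin(theta); cos(theta); 0];     % towards the preferred nucleotide
e3 = [0; 0; 1];                        % along the duplex axis

R = [e1, e2, e3];                      % axes stored as columns

% Assumption: triads are orthonormal and right-handed (e1 x e2 = e3).
isOrthonormal = norm(R.' * R - eye(3)) < 1e-9;
isRightHanded = norm(cross(e1, e2) - e3) < 1e-9;

% Map a point from frame coordinates to global coordinates.
pLocal  = [0; 1; 0];                   % one unit along e2
pGlobal = e0 + R * pLocal;

fprintf('orthonormal: %d, right-handed: %d\n', isOrthonormal, isRightHanded);
```

A check of this kind can be a quick sanity test on triad rows read from a file before they are used in further calculations.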

Line 3: a string dnaTop,id,up,down,across,seq as the header of the fields dnaTop(1), …, dnaTop(n_nt)

Lines 4 to n_nt+3: six subfields separated by commas, which are the serial number (1, 2, …, n_nt), id, up, down, across, and seq.

Line n_nt+4: an empty line

Line n_nt+5: a string dNode,"e0(1)","e0(2)","e0(3)" as the header of the fields dNode(1), …, dNode(n_bp)

Lines n_nt+6 to n_nt+n_bp+5: four subfields separated by commas, which are the serial number (1, 2, …, n_bp) of the basepair and the three Cartesian coordinates e0 of the center of the reference frame.

Line n_nt+n_bp+6: an empty line

Line n_nt+n_bp+7: a string triad,"e1(1)","e1(2)","e1(3)","e2(1)","e2(2)","e2(3)","e3(1)","e3(2)","e3(3)" as the header of the fields triad(1), …, triad(n_bp)

Lines n_nt+n_bp+8 to n_nt+2*n_bp+7: ten subfields separated by commas, which are the serial number (1, 2, …, n_bp) of the basepair and the nine components of the three axes e1, e2, and e3 of the reference frame.

Line n_nt+2*n_bp+8: an empty line

Line n_nt+2*n_bp+9: a string id_nt,id1,id2 as the header of the fields id_nt(1), …, id_nt(n_bp)

Lines n_nt+2*n_bp+10 to n_nt+3*n_bp+9: three subfields separated by commas, which are the serial number (1, 2, …, n_bp) of the basepair, id1, and id2.
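The layout above can be read back with a few lines of MATLAB. The following is a minimal sketch, not the reader used by the converter: it locates each block by its header name and reads rows until the next empty line, so it does not depend on the content of lines 1 and 2, which are filled with placeholders here. The toy content and all variable names are hypothetical.

```matlab
% Toy file content (hypothetical data). Lines 1 and 2 are placeholders.
sample = { ...
    'placeholder for line 1', ...
    'placeholder for line 2', ...
    'dnaTop,id,up,down,across,seq', ...
    '1,1,-1,2,4,A', ...
    '2,2,1,-1,3,C', ...
    '3,3,-1,4,2,G', ...
    '4,4,3,-1,1,T', ...
    '', ...
    'dNode,"e0(1)","e0(2)","e0(3)"', ...
    '1,0,0,0', ...
    '2,0,0,3.4', ...
    '', ...
    'triad,"e1(1)","e1(2)","e1(3)","e2(1)","e2(2)","e2(3)","e3(1)","e3(2)","e3(3)"', ...
    '1,1,0,0,0,1,0,0,0,1', ...
    '2,1,0,0,0,1,0,0,0,1', ...
    '', ...
    'id_nt,id1,id2', ...
    '1,1,4', ...
    '2,2,3'};
txt = strjoin(sample, char(10));   % for a real file: txt = fileread(cndoPATH);

% Split into lines, keeping empty lines because they terminate each block.
rows = strtrim(regexp(txt, '\r?\n', 'split'));

% Collect the data rows that follow each header, up to the next empty line.
blockNames = {'dnaTop', 'dNode', 'triad', 'id_nt'};
block = struct();
for k = 1:numel(blockNames)
    header = [blockNames{k}, ','];
    first = find(strncmp(rows, header, numel(header)), 1) + 1;
    last = first;
    while last <= numel(rows) && ~isempty(rows{last})
        last = last + 1;
    end
    block.(blockNames{k}) = rows(first:last - 1);
end

% Topology block: the first subfield of each row is the serial number.
splitRow = @(r) regexp(r, ',', 'split');
n_nt = numel(block.dnaTop);
dnaTop = struct('id', cell(1, n_nt), 'up', [], 'down', [], 'across', [], 'seq', '');
for i = 1:n_nt
    f = splitRow(block.dnaTop{i});
    dnaTop(i).id     = str2double(f{2});
    dnaTop(i).up     = str2double(f{3});
    dnaTop(i).down   = str2double(f{4});
    dnaTop(i).across = str2double(f{5});
    dnaTop(i).seq    = f{6};
end

% Numeric blocks: convert every row to numbers and drop the serial number.
toMatrix = @(c) cell2mat(cellfun(@(r) str2double(splitRow(r)), c(:), ...
    'UniformOutput', false));
dNode = toMatrix(block.dNode);  dNode = dNode(:, 2:end);   % n_bp x 3
triad = toMatrix(block.triad);  triad = triad(:, 2:end);   % n_bp x 9
id_nt = toMatrix(block.id_nt);  id_nt = id_nt(:, 2:end);   % n_bp x 2
n_bp = size(dNode, 1);

fprintf('n_nt = %d, n_bp = %d\n', n_nt, n_bp);
```

Searching for the headers rather than hard-coding line numbers keeps the sketch tolerant of small layout differences. It still assumes that every block is present and that no field contains a comma.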

As an example, the file example_4way_junction.cndo describes a stacked-X four-way junction (Figure 3). The numbers of nucleotides and basepairs are n_nt = 64 and n_bp = 32, respectively. This file may be opened using a text editor or Microsoft Excel. In Microsoft Excel 2010 or 2013, click the button “From Text” in the “Data” tab, open the .cndo file (using the “All Files (*.*)” filter), and choose comma as the delimiter.
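For this example, the line-number formulas from the layout above can be evaluated directly. The sketch below does this for n_nt = 64 and n_bp = 32; the structure name layout and its field names are chosen for illustration.

```matlab
n_nt = 64;   % number of nucleotides in the four-way junction example
n_bp = 32;   % number of basepairs in the four-way junction example

% Line numbers of each header and [first, last] data row, per the layout.
layout = struct( ...
    'dnaTopHeader', 3, ...
    'dnaTopRows',   [4, n_nt + 3], ...
    'dNodeHeader',  n_nt + 5, ...
    'dNodeRows',    [n_nt + 6, n_nt + n_bp + 5], ...
    'triadHeader',  n_nt + n_bp + 7, ...
    'triadRows',    [n_nt + n_bp + 8, n_nt + 2*n_bp + 7], ...
    'idntHeader',   n_nt + 2*n_bp + 9, ...
    'idntRows',     [n_nt + 2*n_bp + 10, n_nt + 3*n_bp + 9]);

fprintf('dnaTop rows: lines %d to %d\n', layout.dnaTopRows);   % 4 to 67
fprintf('dNode rows:  lines %d to %d\n', layout.dNodeRows);    % 70 to 101
fprintf('triad rows:  lines %d to %d\n', layout.triadRows);    % 104 to 135
fprintf('id_nt rows:  lines %d to %d\n', layout.idntRows);     % 138 to 169
```

The last id_nt row therefore falls on line 169 of the file.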

Figure 3. Sequence and topology information of a stacked-X four-way junction. The four DNA strands are shown in gray with the 3′-ends represented by arrowheads. The unique identifier (dnaInfo.dnaTop(i).id) and identity (dnaInfo.dnaTop(i).seq) of the i-th nucleotide are colored in blue. The basepair indices are colored in green.