Software

Download

Compiling

Windows and Mac OS X users can skip this step by downloading the above precompiled binary and extracting it to the desired location.

We recommend compiling the source code with the GCC compiler, or with MinGW on Windows. On Mac OS X, use the GCC compiler delivered with the Xcode Tools. The code also compiles with Microsoft Visual C++.

For older versions of GCC (prior to 4.2) or for Visual C++, you may additionally require the Boost libraries.

If using GCC/MinGW, the code can be compiled using the included Makefile:

Extract the source code to the target directory <target>.

Open a shell (or a Windows command prompt) and change to the directory using cd <target>.

Compile the sources by typing make. If using GCC prior to 4.2 and Boost, you should supply the Boost include path using make -e INCLUDE=<path>.

There should now be a subdirectory bin containing the executable file cantata (or cantata.exe). This executable can be copied to a location in the search path.

Usage — A short tutorial

The following examples outline the usage of the program based on the mammalian cell cycle network [1]. The files used in this tutorial can be downloaded here. The ZIP archive contains three files:

cellcycle.txt: The original mammalian cell cycle network model (Fauré et al. [1]) in the BoolNet network file format. In short, each line of the file describes the dependencies of one gene, starting with the target gene, and followed by a separator sign (,) and the Boolean transition function. The transition function consists of gene names, combined by the Boolean operators AND (&) and OR (|). Genes or expressions can be negated by the ! sign.

cellcycle_truncated.txt: A "draft" version of the cell cycle model in which one important dependency has been deleted: The transition function for gene CycB has been modified from

CycB, (! Cdc20 & ! Cdh1)

to

CycB, (! Cdc20)

That is, CycB lacks the dependency on Cdh1, which changes the dynamic behaviour of the model.

cellcycle_rules.txt: The rule file specifying the desired dynamics of the network for reconstruction. In this case, it contains two rules specifying the steady-state attractor and the 7-state cycle:

Each rule starts with the keyword Attractor: or Chain:, which specifies the type of the rule (attractor or time series). The next section, Initial condition:, specifies which start states should yield the corresponding attractor (or chain). The condition is a space-separated list of gene values. If a gene name is preceded by !, the gene is inactive (0); otherwise it is active (1). In this case, all states with CycD=1 yield the 7-state attractor, and all states with CycD=0 yield the steady-state attractor. The section State specifications: marks the beginning of the state specification list. Each of the following lines describes the desired gene values for one attractor state, or for several successive attractor states if not all genes are specified. A specification entry has the same format as the initial condition.
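As a rough illustration of the two formats described above, the following Python sketch parses one network line, evaluates its transition function, and reads a condition. It is not part of CANTATA or BoolNet. The helper names parse_network_line, evaluate and parse_condition are hypothetical, and the sketch assumes that gene names are valid Python identifiers and that the ! sign is attached directly to the gene name in conditions.

```python
def parse_network_line(line):
    """Split 'target, function' at the first separator sign (,)."""
    target, expression = line.split(",", 1)
    return target.strip(), expression.strip()


def evaluate(expression, state):
    """Evaluate a transition function for a state (dict: gene name -> bool).

    Assumption: the expression uses only gene names, parentheses, !, & and |.
    eval() is acceptable for a sketch on trusted files, not for untrusted input.
    """
    translated = (
        expression.replace("!", " not ")
        .replace("&", " and ")
        .replace("|", " or ")
        .strip()
    )
    return bool(eval(translated, {"__builtins__": {}}, dict(state)))


def parse_condition(text):
    """Turn a space-separated condition such as '!Cdc20 Cdh1' into a dict."""
    values = {}
    for token in text.split():
        if token.startswith("!"):
            values[token[1:]] = False  # preceded by !: gene is inactive (0)
        else:
            values[token] = True       # otherwise: gene is active (1)
    return values


# The two versions of the CycB function from the tutorial files.
_, original = parse_network_line("CycB, (! Cdc20 & ! Cdh1)")
_, truncated = parse_network_line("CycB, (! Cdc20)")

# A state in which Cdc20 is inactive and Cdh1 is active.
state = parse_condition("!Cdc20 Cdh1")

print(evaluate(original, state))   # False: Cdh1 represses CycB
print(evaluate(truncated, state))  # True: the dependency on Cdh1 is missing
```

In this state the two versions of the CycB function disagree, which is the kind of difference the rule file is meant to expose.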

In the following, we assume that these three files have been extracted to a directory, and that cantata is in your search path.

First, we can check whether the network models match these expectations. We begin by verifying the original ("true") network model. Open a shell/command prompt, change to the directory where the network files are located, and type:

cantata --validate -n cellcycle.txt -r cellcycle_rules.txt

Here, --validate tells the program to validate a model specified in parameter -n according to a list of rules specified in parameter -r.

In the validation output for the truncated network draft, the program first prints the attractors of the network and their optimal matchings with the rules. The perturbed network draft has two 4-state attractors, which are matched with the 1-state attractor and the 7-state attractor of the original network. Below each matching, the violations are listed; in this example, there are many. For each violation, CANTATA prints the number of the state specification and the violating gene. Furthermore, the start states that lead to the matching and cause the violations are printed. By specifying -c 5, we tell the program to print only the first 5 violating states.
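If both network files should be validated in one go, the call can be scripted. The following Python sketch only assembles the options named in this tutorial (--validate, -n, -r and -c) and passes them to cantata. The function names and the max_violating_states parameter are hypothetical, and nothing is assumed about the format of the program's output, which is simply printed as returned.

```python
import shutil
import subprocess


def validate_command(network_file, rules_file, max_violating_states=None):
    """Build the argument list for a validation run."""
    command = ["cantata", "--validate", "-n", network_file, "-r", rules_file]
    if max_violating_states is not None:
        # -c limits the number of violating start states that are printed.
        command += ["-c", str(max_violating_states)]
    return command


def run_validation(network_file, rules_file, max_violating_states=None):
    """Run the validation and return its output, or None if cantata is not found."""
    if shutil.which("cantata") is None:
        return None
    completed = subprocess.run(
        validate_command(network_file, rules_file, max_violating_states),
        capture_output=True,
        text=True,
    )
    return completed.stdout


if __name__ == "__main__":
    for network in ("cellcycle.txt", "cellcycle_truncated.txt"):
        report = run_validation(network, "cellcycle_rules.txt", max_violating_states=5)
        print(network)
        print("cantata not found in search path" if report is None else report)
```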

Now, let us try to reconstruct the true model from the disrupted network draft using the CANTATA optimization algorithm. This algorithm is started using the main option --optimize:

The parameter -o result.txt tells the program to write the results to a file result.txt. We set the number of iterations to 1000 (which is the default value) using -ni 1000. When the optimization process is complete, the file result.txt contains a header summarizing the algorithm's configuration, followed by a list of candidate networks with their three objective values, e.g.

In the example printed here, the first fitness value of both resulting candidate network models is 0, which indicates that the networks match the rules perfectly. In this case, the second resulting network is equivalent to the true network, which means that the deleted dependency of CycB on Cdh1 was reconstructed and no further changes were applied. The first candidate also recovers this dependency, but changes an & to a | in the function for p27. Depending on the random initialization of the algorithm, you might get a different result when running the example.

The result file cannot be read directly by BoolNet, as it contains multiple candidate network models and additional annotation (the header and objective scores). If the candidate networks are to be analyzed in BoolNet, CANTATA can write them to separate network files. For example,

writes all candidates that match the rules perfectly to files candidate_0.txt, candidate_1.txt, candidate_2.txt, ..., i.e. the %d marker is replaced by a running number. The parameter -me sets a threshold for the first objective, i.e. only candidates whose score in the first objective is less than or equal to this value are written to files. As we set this threshold to 0 (which is the default value), only candidate network models that match the rule set perfectly are written to files.
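The interplay of the %d marker and the -me threshold can be sketched in a few lines of Python. This is an illustration of the described behaviour, not CANTATA's implementation. The function name select_output_files is hypothetical, the pattern candidate_%d.txt is inferred from the file names above, and the sketch assumes that the running number counts only the candidates that are actually written.

```python
def select_output_files(first_objectives, pattern="candidate_%d.txt", max_error=0):
    """Return the file names for all candidates that pass the threshold.

    first_objectives: the first objective value of each candidate, in order.
    max_error: plays the role of -me; candidates with a larger score are skipped.
    """
    names = []
    for score in first_objectives:
        if score <= max_error:
            # Assumption: the running number counts written candidates only.
            names.append(pattern % len(names))
    return names


# Two perfect candidates and one with a remaining error of 3.
print(select_output_files([0, 0, 3]))
# ['candidate_0.txt', 'candidate_1.txt']

# Raising the threshold also admits the third candidate.
print(select_output_files([0, 0, 3], max_error=3))
# ['candidate_0.txt', 'candidate_1.txt', 'candidate_2.txt']
```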

This tutorial covers only parts of the options available in CANTATA. A full description of the command line options and the file formats is available here.