Software

"'Fitmodel' estimates the parameters of various codon-based models of substitution, including those described in Guindon, Rodrigo, Dyer and Huelsenbeck (2004). These models are especially useful as they accommodate site-specific switches between selection regimes without a priori knowledge of the positions in the tree where changes of selection regimes occurred.

The program will ask for two input files: a tree file and a sequence file. The tree should be unrooted and in NEWICK format. The sequences should be in PHYLIP interleaved or sequential format. If you are planning to use codon-based models, the sequence length should be a multiple of 3. The program provides four types of codon models: M1, M2, M2a, and M3 (see PAML manual). Moreover, M2, M2a and M3 can be combined with 'switching' models (option 'M'). Two switching models are implemented: S1 and S2. S1 constrains the rates of change between dN/dS values to be uniform (e.g., the rate of change between negative and positive selection is constrained to be the same as the rate of change between neutrality and positive selection), while S2 allows for different rates of change between the different classes of dN/dS values.

If you are using a 'switching' model, 'fitmodel' will output files with the following names: your_sequence_file_trees_w1, your_sequence_file_trees_w2, your_sequence_file_trees_w3 and your_sequence_file_trees_wbest. The w1, w2 and w3 files give the estimated tree with probabilities of w1, w2, and w3 (three maximum likelihood dN/dS ratio estimates) calculated on each edge of the tree and for each site. Hence, the first tree in one of these files reports the probabilities calculated at the first site of the alignment. Instead of probabilities, the wbest file allows you to identify which of the three dN/dS ratios is the most probable on any given edge, at any given site. A branch with label 0.0 means that w1 is the most probable class, 0.5 indicates that w2 is the most probable and 1.0 means that w3 has the highest posterior probability." (README.txt)
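
Two details from the README excerpt above lend themselves to a quick check before and after a run: the requirement that sequence lengths be a multiple of 3 for codon models, and the meaning of the 0.0, 0.5 and 1.0 branch labels in the wbest file. The Python sketch below is only an illustration and is not part of 'fitmodel'; the function names and the toy alignment are hypothetical, and it assumes the sequences and labels have already been read from the files by some other means.

```python
# Hypothetical helpers, not part of fitmodel itself.

# Branch labels used in the wbest file, as described in the README excerpt.
WBEST_LABELS = {0.0: "w1", 0.5: "w2", 1.0: "w3"}


def sequences_not_multiple_of_three(sequences):
    """Return the names of sequences whose length is not a multiple of 3."""
    return [name for name, seq in sequences.items() if len(seq) % 3 != 0]


def most_probable_class(label, tol=1e-9):
    """Map a wbest branch label (0.0, 0.5 or 1.0) to its dN/dS class name."""
    value = float(label)
    for expected, name in WBEST_LABELS.items():
        if abs(value - expected) <= tol:
            return name
    raise ValueError("unexpected wbest label: %r" % (label,))


# Toy alignment (made-up data): taxon_b is one base short of a full codon.
alignment = {"taxon_a": "ATGGCCAAA", "taxon_b": "ATGGCCAA"}
print(sequences_not_multiple_of_three(alignment))
print(most_probable_class("0.5"))
```

Any name returned by the first helper would need to be fixed in the alignment before choosing a codon-based model.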

"Bowtie is an ultrafast, memory-efficient short read aligner geared toward quickly aligning large sets of short DNA sequences (reads) to large genomes. It aligns 35-base-pair reads to the human genome at a rate of 25 million reads per hour on a typical workstation. Bowtie indexes the genome with a Burrows-Wheeler index to keep its memory footprint small: for the human genome, the index is typically about 2.2 GB (for unpaired alignment) or 2.9 GB (for paired-end or colorspace alignment). Multiple processors can be used simultaneously to achieve greater alignment speed. Bowtie can also output alignments in the standard SAM format, allowing Bowtie to interoperate with other tools supporting SAM, including the SAMtools consensus, SNP, and indel callers. Bowtie runs on the command line." (http://bowtie-bio.sourceforge.net/manual.shtml)

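As a rough planning aid, the throughput quoted in the Bowtie manual excerpt above (25 million 35-base-pair reads per hour against the human genome on a typical workstation) can be turned into a run-time estimate. The Python sketch below is simple arithmetic under that single quoted rate; the function name is hypothetical, and actual speed on OSC systems will depend on read length, genome, options and processor count.

```python
# Rate quoted in the Bowtie manual for 35-bp reads against the human genome.
READS_PER_HOUR = 25000000


def estimated_hours(n_reads, reads_per_hour=READS_PER_HOUR):
    """Estimate alignment wall time in hours at a fixed reads-per-hour rate."""
    return n_reads / float(reads_per_hour)


# Example: 100 million reads at the quoted rate.
print(estimated_hours(100000000))
```
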
Several of our software packages require that users complete a form that is then kept on record here before we can grant access to the software. Academic users should download, print, complete, and return the form to OSC. Non-academic users should contact OSC Help for assistance.

Python is a high-level, multi-paradigm programming language that is both easy to learn and useful in a wide variety of applications. Python has a large standard library as well as a large number of third-party extensions, most of which are completely free and open source. We highly recommend using Python 2.7.1, as we have added a lot of Python packages and tuned them to perform well on our systems.

Computational stochastic approaches (Monte Carlo methods) based on random sampling are becoming extremely important research tools, not only in their "traditional" fields such as physics, chemistry or applied mathematics but also in the social sciences and, more recently, in various branches of industry. One indication of their importance is that Monte Carlo calculations consume about one half of supercomputer cycles. An indispensable ingredient for reliable and statistically sound calculations is the source of pseudorandom numbers. SPRNG provides a scalable package for parallel pseudorandom number generation that is designed to be easy to use on a variety of architectures, especially in large-scale parallel Monte Carlo applications.

SPRNG 1.0 provides each of the SPRNG random number generators in its own library. For most users this is acceptable, as one rarely uses more than one type of generator in a single program. However, if the user needs to combine several generator types in one program, SPRNG 2.0 provides that flexibility. In all other respects, SPRNG 1.0 and SPRNG 2.0 are identical.
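
To illustrate the kind of calculation SPRNG is aimed at, the Python sketch below estimates pi by random sampling, with each simulated worker drawing from its own generator. It uses Python's standard `random` module as a stand-in and does not call SPRNG; the function names and the seeding scheme (base seed plus stream index) are assumptions made for illustration. Seeding ordinary generators this way does not guarantee statistically independent streams, which is the problem a dedicated parallel generator package is meant to address.

```python
import random


def estimate_pi(n_samples, rng):
    """Estimate pi from the fraction of random points inside the unit quarter circle."""
    inside = 0
    for _ in range(n_samples):
        x, y = rng.random(), rng.random()
        if x * x + y * y <= 1.0:
            inside += 1
    return 4.0 * inside / n_samples


def estimate_pi_multi_stream(n_streams, n_samples_per_stream, base_seed=12345):
    """Average per-stream estimates; each stream gets its own generator instance."""
    estimates = []
    for stream_id in range(n_streams):
        # Illustrative seeding only: distinct seeds, not provably independent streams.
        rng = random.Random(base_seed + stream_id)
        estimates.append(estimate_pi(n_samples_per_stream, rng))
    return sum(estimates) / len(estimates)


pi_hat = estimate_pi_multi_stream(4, 50000)
print(pi_hat)
```

In a real parallel run each stream would live in a separate process or MPI rank; the loop here only mimics that structure serially.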

This page documents usage of the ScaLAPACK library installed by OSC from source. An optimized implementation of ScaLAPACK is included in MKL; see the software documentation page for Intel Math Kernel Library for usage information.

R is a language and environment for statistical computing and graphics. It is similar to the S language and environment developed at Bell Laboratories (formerly AT&T, now Lucent Technologies). R provides a wide variety of statistical and graphical techniques, and is highly extensible.