R'MES Online User Guide

Aim

The main question R'MES addresses is: "Does this motif occur in that biological sequence with an expected frequency?" In other words, could it be observed so many times, or so few times, just by chance? When the answer is no, the motif is a candidate for having a particular biological meaning, but only a candidate: statistical significance is not equivalent to biological significance.
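To make the question concrete, here is a minimal sketch in Python. It counts the overlapping occurrences of a word and compares that number with the count expected if letters were drawn independently with the sequence's own letter frequencies. The independence model is a simplifying assumption chosen for illustration only; it is not the Markov-model methodology used by R'MES, and the function names are hypothetical.

```python
from collections import Counter
from math import prod


def count_overlapping(sequence: str, word: str) -> int:
    """Count occurrences of `word` in `sequence`, overlaps included."""
    k = len(word)
    return sum(1 for i in range(len(sequence) - k + 1) if sequence[i:i + k] == word)


def expected_count_independent(sequence: str, word: str) -> float:
    """Expected count of `word` if letters were independent (illustrative model only)."""
    n = len(sequence)
    freq = Counter(sequence)
    # Probability of the word = product of the frequencies of its letters.
    p_word = prod(freq[letter] / n for letter in word)
    # Number of possible start positions times the word probability.
    return (n - len(word) + 1) * p_word


seq = "acgtacgtaaaa"
print(count_overlapping(seq, "aa"))           # observed count
print(expected_count_independent(seq, "aa"))  # expected count under independence
```

A large gap between the two numbers is what a significance test then has to quantify; the comparison alone says nothing about how unusual the gap is.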

A brief presentation of the statistical method used in R'MES to evaluate the significance of a motif frequency in a sequence can be found here or in the user guide. For more details about the methodology, please refer to the following tutorial (pdf) or to the book DNA, Words and Models by Robin, Rodolphe and Schbath, published by CUP in 2005 (or by BELIN in 2003 for the French version).

System Requirements

R'MES comes as a source distribution only, and needs to be compiled before use.

R'MES is written in C and C++. Our distribution was specifically designed to be compiled with the GNU GCC compiler. It has been tested on a variety of Unix platforms (Linux, Solaris, MacOS X).

R'MES' installation procedure follows the GNU package distribution standards. After downloading rmes-<version>.tar.gz (where <version> stands for the version number), unpack the archive and, from the resulting directory, run the usual GNU sequence (typically ./configure, make, then make install) to install R'MES in the default location (usually /usr/local).

For more details, refer to the user guide or to the INSTALL file included in the source distribution.

Running R'MES

For a complete description of everything R'MES offers, please refer to the user guide. It starts with the most basic use case of R'MES (calculating the scores of exceptionality of all the words of a given length in a given sequence under a given Markov model) and then describes the other possible cases with their associated options (using degenerate words, analyzing coding DNA sequences, using customized alphabets, finding exceptionally skewed motifs and studying clumps of motifs).

R'MES is run from a command line of the following form:
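The basic use case can be illustrated with a minimal sketch, assuming a first-order Markov model. The expected count of a word is estimated from the observed counts of its overlapping 2-letter subwords divided by the counts of its interior letters, a standard plug-in estimator. This is a simplified illustration, not R'MES's implementation: it stops at observed versus expected counts and does not compute a score of exceptionality. All function names are hypothetical.

```python
from collections import Counter


def count_words(sequence: str, k: int) -> Counter:
    """Count every overlapping word of length k in the sequence."""
    return Counter(sequence[i:i + k] for i in range(len(sequence) - k + 1))


def expected_count_m1(sequence: str, word: str) -> float:
    """Plug-in estimate of the expected count of `word` (length >= 2), order-1 Markov model."""
    pairs = count_words(sequence, 2)
    letters = count_words(sequence, 1)
    expected = 1.0
    # Numerator: counts of the overlapping 2-letter subwords of the word.
    for i in range(len(word) - 1):
        expected *= pairs[word[i:i + 2]]
    # Denominator: counts of the interior letters of the word.
    for letter in word[1:-1]:
        if letters[letter] == 0:
            return 0.0
        expected /= letters[letter]
    return expected


seq = "acgacgacgtt"
observed = count_words(seq, 3)
for word in sorted(observed):
    print(word, observed[word], round(expected_count_m1(seq, word), 3))
```

Each printed line gives a 3-letter word, its observed count, and its estimated expected count.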

rmes [options] -s <filename> -o <string>

where

-s <filename>, --seq <filename>

sets the sequence file in FASTA or GenBank format,

-o <string>, --out <string>

specifies the prefix for output files.

All the options can be obtained by typing:

rmes --help

and are described below.

One option is required, however: the one specifying which approximation of the word count distribution is used to evaluate the p-value. It must take one of the following values:

--gauss

Use the Gaussian approximation,

--poisson

Use the Poisson approximation for the number of clumps,

--compoundpoisson

Use the compound Poisson approximation,

--skew

Use the Gaussian method and compute the additional scores for the skew.

(value required) Specify a string to be used as alphabet for the sequences

-z or --compress

Compress output files.

-v or --version

Displays version information and exits.
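For scripted runs, a minimal Python sketch of assembling such a command line is shown below. It uses only options listed in this section (-s, -o, -z and the four approximation options); any further options a real run needs (see rmes --help) are omitted. The helper name build_rmes_command and the file names are hypothetical.

```python
# Approximation options listed in this section; exactly one is required.
APPROXIMATIONS = ("gauss", "poisson", "compoundpoisson", "skew")


def build_rmes_command(seq_file, out_prefix, approximation, compress=False):
    """Assemble an rmes argument list from options documented in this section."""
    if approximation not in APPROXIMATIONS:
        raise ValueError(f"approximation must be one of {APPROXIMATIONS}")
    command = ["rmes", f"--{approximation}", "-s", seq_file, "-o", out_prefix]
    if compress:
        command.append("-z")  # compress output files
    return command


# Hypothetical input file and output prefix.
command = build_rmes_command("genome.fasta", "results", "gauss", compress=True)
print(" ".join(command))
# To launch it (requires rmes to be installed and on the PATH):
#   import subprocess
#   subprocess.run(command, check=True)
```

Building the command as a list rather than a single string avoids shell quoting problems when file names contain spaces.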

Utilities

Three utilities are provided in the R'MES package:

rmes.format displays the results contained in an output file generated by the rmes command. It produces a table with the motifs sorted according to their exceptionality scores (see the usage information).

rmes.gfam generates family files when the corresponding families are degenerate DNA motifs that can be written using the bases a, c, g, t and n (see the usage information).

rmes.composition reports the length of a sequence and its composition (see the usage information).
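To illustrate the ideas behind the last two utilities, here is a minimal Python sketch. It expands a degenerate motif in which n stands for any base into the family of words it matches, and it computes the length and letter composition of a sequence. It does not reproduce the file formats or options of rmes.gfam and rmes.composition, and the function names are hypothetical.

```python
from collections import Counter
from itertools import product

DNA_BASES = "acgt"


def expand_degenerate(motif: str) -> list[str]:
    """List all words matched by a motif where 'n' stands for any base."""
    choices = [DNA_BASES if base == "n" else base for base in motif.lower()]
    return ["".join(word) for word in product(*choices)]


def composition(sequence: str) -> tuple[int, dict[str, int]]:
    """Return the sequence length and the count of each letter."""
    counts = Counter(sequence.lower())
    return len(sequence), dict(sorted(counts.items()))


print(expand_degenerate("anc"))   # ['aac', 'acc', 'agc', 'atc']
print(composition("acgtacgtaa"))  # (10, {'a': 4, 'c': 2, 'g': 2, 't': 2})
```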

Citing R'MES

If you have used R'MES or want to refer to it, please cite the following reference: