This option allows the user to select whether the trace names used for
the samples should be the same as their file names or should be the
names stored inside the files.

Phred

Phred is a base caller which also assigns confidence values to each base.
Generally the data passed to pregap4 has already been base called. However,
not all base callers assign confidence values, so it can be useful to
apply phred or ATQA (which does not call bases but does assign confidence
values). Alternatively, "Estimate Base Accuracies" can be applied. This is a
simple program that provides numerical values reflecting the signal-to-noise
ratio for each base, and these values can be used instead of confidence values.
(Note that if quality clipping is used, its score thresholds depend on
whether confidence values or eba values are used.)

Trace Format Conversion

This option can be used to convert bulky files such as those of ABI to a
compact format such as SCF or ZTR without loss of the data required for
trace display.

Initialise Experiment Files

The input to gap4 and several of the other programs used here is a data
format known as Experiment file format. This step, which has no
configurable parameters, is essential for mutation data processing.

Augment Experiment Files

The section on Reference Traces outlined the use of "Naming Schemes" for
associating pairs of forward and reverse readings, and for assigning
reference traces. The naming scheme must be loaded from pregap4's File
menu. "Augment Experiment Files" must be activated in order for the
naming scheme to be applied. No parameters need be set.

Quality Clip

The reliability of the base calls varies with position along the sequence:
the data is less reliable near both ends. The "Quality Clip" option
trims the ends of the sequences by analysing their confidence values or
accuracy estimates (if present), or the density of unknown bases in the
sequence. Other processing programs work more reliably when they observe
these "clip points".
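
The idea of clipping on confidence values can be illustrated with a short
Python sketch. This is not the algorithm used by pregap4's Quality Clip
module; it is a simple sliding-window average under assumed settings. The
function name quality_clip and the parameters window and min_mean are
hypothetical and chosen only for illustration.

```python
def quality_clip(conf, window=5, min_mean=15.0):
    """Return (left, right) so that conf[left:right] is the retained region.

    Illustrative only: keeps the span from the first window whose mean
    confidence reaches min_mean to the last such window.
    """
    n = len(conf)
    if n < window:
        return (0, 0)  # too short to evaluate: clip everything

    # True for each window start whose mean confidence is acceptable.
    good = [
        sum(conf[i:i + window]) / window >= min_mean
        for i in range(n - window + 1)
    ]
    if not any(good):
        return (0, 0)  # no acceptable window: clip everything

    first = good.index(True)
    last = len(good) - 1 - good[::-1].index(True)
    return (first, last + window)


# Made-up confidence values: poor at both ends, good in the middle.
conf = [2, 3, 4, 20, 30, 30, 30, 30, 25, 20, 5, 3, 2]
left, right = quality_clip(conf, window=3, min_mean=15)
print(left, right)
```

Raising min_mean or widening the window in this sketch trims more of each
end, which mirrors the way the score thresholds control how much is clipped.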

Reference Traces

As explained above it is necessary to specify a reference trace (preferably
one for each strand of the data if processing data from both strands). The
reference sequence can also be set here. Note that even if our suggestion
to preload the reference traces into the gap4 database is followed, it is
still necessary to specify them here for use by the mutation detection
modules.

Trace Difference

This is the program which compares the patient and reference traces to
search for possible mutations. It adds data to the experiment files
to mark each predicted mutation, and this data will appear as tags in the gap4
database. It can also create a new trace file containing the difference
between the reference and the sample. The numerical parameters control the
sensitivity of the algorithms, and hence the balance between the numbers
of false positive and false negative results.
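
The trade-off controlled by a sensitivity parameter can be shown with a
minimal Python sketch. It is not the Trace Difference algorithm: it assumes
a single trace channel, and that the sample and reference are already
aligned and scaled to each other. The names trace_difference and
flag_positions, the threshold parameter and the data are all hypothetical.

```python
def trace_difference(sample, reference):
    """Point-by-point difference of two aligned, equally scaled traces."""
    return [s - r for s, r in zip(sample, reference)]


def flag_positions(diff, threshold):
    """Positions where the absolute difference reaches the threshold."""
    return [i for i, d in enumerate(diff) if abs(d) >= threshold]


# Made-up amplitudes for one channel of a sample and a reference trace.
sample = [10, 50, 10, 12, 48, 11]
reference = [10, 50, 10, 40, 20, 10]

diff = trace_difference(sample, reference)

# A strict threshold reports only the large differences ...
print(flag_positions(diff, threshold=20))
# ... while a permissive one also reports small, probably spurious, ones.
print(flag_positions(diff, threshold=1))
```

In this sketch a low threshold produces more candidates (more false
positives), and a high threshold produces fewer (more false negatives).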

Heterozygote Scanner

This is the program which compares the patient and reference traces to
search for possible heterozygous bases. It adds data to the experiment files
to mark each predicted heterozygous base, and this data will appear as tags
in the gap4 database. The numerical parameters control the sensitivity of
the algorithms, and hence the balance between the numbers of false positive
and false negative results.
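
As a rough illustration of what a heterozygous base looks like in trace
data, the Python sketch below flags positions where the sample shows two
strong peaks and the reference shows only one. This is an assumed, simplified
criterion and not the method used by the Heterozygote Scanner. The names
second_peak_ratio and scan_heterozygotes, the min_ratio parameter and the
per-base amplitude dictionaries are hypothetical.

```python
def second_peak_ratio(amps):
    """Ratio of the second-highest to the highest channel amplitude."""
    top, second = sorted(amps.values(), reverse=True)[:2]
    return second / top if top > 0 else 0.0


def scan_heterozygotes(sample, reference, min_ratio=0.5):
    """Positions with a strong second peak in the sample but not the reference."""
    hits = []
    for pos, (s, r) in enumerate(zip(sample, reference)):
        if second_peak_ratio(s) >= min_ratio and second_peak_ratio(r) < min_ratio:
            hits.append(pos)
    return hits


# Made-up peak amplitudes per base position, one value per channel.
sample = [
    {"A": 100, "C": 5, "G": 3, "T": 2},
    {"A": 60, "C": 4, "G": 55, "T": 3},   # two strong peaks: A and G
]
reference = [
    {"A": 100, "C": 4, "G": 4, "T": 3},
    {"A": 110, "C": 4, "G": 5, "T": 3},
]

print(scan_heterozygotes(sample, reference))
```

Lowering min_ratio in this sketch makes the scan more sensitive at the cost
of more false positives.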

Gap4 shotgun assembly

In order to report the positions of mutations relative to the reference
sequence, and to compare sets of samples from patients, it is
necessary to perform multiple sequence alignment on the data. This is termed
"assembly" and is usually performed by gap4, although other programs can be
operated via pregap4. If following the suggestion to preload the reference
sequence to a temporary database for each batch, supply the name of this
database here. Otherwise a new database should be named and created
from this option. (If this strategy is adopted, make sure that the reference
sequence and the reference traces are assembled!) The parameters
that control the assembly process are described elsewhere.

Note that pregap4 has the facility to save its configuration and parameter
settings. This means that the current configuration will be set
automatically the next time the program is used (and hence the steps just
described need to be performed only once). In addition, pregap4 can be run
non-interactively by typing a single line on the command line. Taken
together, these two capabilities mean that only one line need be typed in
order to process all subsequent batches of data (assuming the file names
are reused, which is easy to arrange).

This page is maintained by
staden-package.
Last generated on 22 October 2002.
URL: http://www.mrc-lmb.cam.ac.uk/pubseq/manual/mutations_unix_11.html