Interpreting ABI 377 Chromatograms

Last updated on May 28, 1996

Introduction

The chromatograms produced by automated sequencing machines contain sequence that
typically begins very close to the primer and extends well beyond the limit of
accurate basecalling. Both ends of the sequence derived from the chromatogram need to
be trimmed to remove erroneous and ambiguous bases. In addition, some bases within the
'accurate' range are called incorrectly or are called as 'N'. Often
this is the result of the chemistry and enzyme used for the sequencing reaction. Many
of these errors and ambiguities can be resolved by inspecting the traces. Listed below
are several excerpts of chromatograms that serve as examples of
how to trim sequences and resolve ambiguous basecalls. Also available are
a high quality chromatogram image (gif format), an example
chromatogram file generated at the UCCRC-DSF, and
a Microsoft Word 5 file of examples, compiled at Iowa State University, showing some of
the more common basecalling problems.
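
As a rough illustration of end trimming, the sketch below keeps the part of a basecalled
read that lies between the first and the last window containing few 'N' calls. The
function name, the window size and the 'N' threshold are assumptions made for this
example, not settings of the ABI software, and the result is only a starting point
for inspection of the traces.

```python
def trim_by_n_density(seq, window=20, max_n=1):
    """Return (start, end) so that seq[start:end] is the trimmed read.

    Hypothetical heuristic: the kept region runs from the first window to the
    last window that contains at most `max_n` 'N' calls.
    """
    seq = seq.upper()
    if len(seq) < window:
        return (0, 0)
    # Start positions of every window that is clean enough to keep.
    good = [i for i in range(len(seq) - window + 1)
            if seq[i:i + window].count("N") <= max_n]
    if not good:
        return (0, 0)
    start, end = good[0], good[-1] + window
    # Do not let the kept region begin or end on an ambiguous call.
    while start < end and seq[start] == "N":
        start += 1
    while end > start and seq[end - 1] == "N":
        end -= 1
    return (start, end)


# Made-up read: noisy start, clean middle, noisy end.
read = "NNANNTNN" + "ACGT" * 10 + "ANNTNNGNNA"
start, end = trim_by_n_density(read, window=8, max_n=1)
trimmed = read[start:end]
```

A noisy stretch in the middle of the read is not removed by this sketch. It also cannot
detect bases that were called confidently but incorrectly, for example under a dye-blob.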

Examples

This is an example of a good chromatogram showing well-resolved peaks and
no ambiguities. Generally the first several hundred bases of a chromatogram
will look like this.

This is the start of a chromatogram showing peaks corresponding to unincorporated
dye-terminators (dye-blobs) superimposed over and partially obscuring the real peaks.
In particular, notice the prominent double 'T' blobs (red) from positions 4 to 9, and the
paired 'G' and 'C' blobs (black and blue) covering positions 20 to 23.
Depending on incorporation and washing efficiency, dye-blobs can range in size from nothing at all
to major peaks covering several real peaks. These dye-blobs appear at specific
positions in the chromatogram, mostly interfering with the sequences within 30
to 40 bases from the primer (typically vector), but occasionally appearing up to
several hundred bases from the primer.

This is a region of a chromatogram fairly far along the sequence where some bases
in runs of 2 or more are no longer visible as single peaks. Many peaks are beginning
to broaden and smear into one another, interpretation of the peaks has become more
difficult, and the basecalling software has begun to use 'N's.

This is a region of a chromatogram where the traces have become too ambiguous for
accurate basecalling. While some parts of this region can be
useful for linking to existing sequences following manual editing, the region as a
whole should not be considered accurate. Note that some editing changes have been made
to the chromatogram and appear in the upper line in magenta.

These are examples from several chromatograms showing weak 'G' peaks that
have been called incorrectly or called as 'N's. Most often weak 'G' peaks follow
multiple 'A' peaks, as seen in frames 1, 2, 3 and 5 (corrected bases appear in the
upper row). However, they can also appear after single 'A' peaks (frames 3 through 6)
and occasionally after single 'C' peaks (frame 7). Frames 3 and 5 each show two weak
'G' peaks after single or multiple 'A' residues.
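
The pattern described above suggests a simple way to shortlist positions for manual
checking. The sketch below lists 'N' calls whose nearest preceding called base is 'A'
or 'C'. The function name and its parameter are hypothetical. It finds only weak 'G'
peaks that were called as 'N', not those called as a wrong base, and each candidate
still has to be confirmed against the trace.

```python
def weak_g_candidates(seq, preceding="AC"):
    """Return 0-based positions of 'N' calls that may hide a weak 'G' peak.

    An 'N' is reported when the nearest called (non-'N') base before it is in
    `preceding`, so runs such as 'ANN' yield both 'N' positions.
    """
    seq = seq.upper()
    hits = []
    last_called = None  # most recent non-'N' base seen so far
    for i, base in enumerate(seq):
        if base == "N":
            if last_called is not None and last_called in preceding:
                hits.append(i)
        else:
            last_called = base
    return hits


# Made-up basecalls, for illustration only.
positions = weak_g_candidates("TTAAANCTANNGTCNA")
```

Passing `preceding="A"` restricts the search to the more common case of a weak 'G'
peak following one or more 'A' peaks.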

This is an example of a chromatogram with several dye-blobs compared to the corresponding
raw gel image of its own lane and the two adjacent lanes. Note that the dye-blobs vary in
intensity and can even partially obscure bands in adjacent lanes. In this particular example,
an identical sample run two lanes down has virtually no dye-blobs and can be used to correct
the dye-blob sample. The corrections are shown below the original base calls. Note that the
strongest 'C' dye-blob obscures two actual 'C' residues. In some of the
other cases the dye-blob can be recognized and the true bases identified under the broad dye-blob
peak. (Colors of the raw image bands correspond to the following bases: blue-G, red-C,
green-A, and yellow-T.)