Software

A fast and accurate program, written in C, that trims adapters, barcodes, and low-quality regions from
next-generation sequencing reads and bins the reads. The search algorithm is based on Eugene Myers' fast bit-vector algorithm for approximate string matching.
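
To illustrate the kind of search involved, the following is a minimal sketch of bit-vector approximate matching in the style of Myers' algorithm. It is not the program's source code: the function name `myers_search`, its signature, and the 64-character pattern limit are assumptions made for this example. It reports the smallest edit distance between a pattern (for instance an adapter) and any substring of a read.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Illustrative sketch only (hypothetical name and signature).
 * Finds the minimum edit distance between `pat` and any substring of
 * `text`. Each text character updates one column of the dynamic
 * programming matrix, stored as bit-vectors of vertical differences.
 * Returns -1 if the pattern is empty or longer than 64 characters.
 * If `best_end` is not NULL, it receives the 1-based end position of
 * the first best match in `text`.
 */
int myers_search(const char *pat, const char *text, size_t *best_end)
{
    size_t m = strlen(pat);
    size_t n = strlen(text);
    if (m == 0 || m > 64)
        return -1;

    /* peq[c] has bit i set when pat[i] == c. */
    uint64_t peq[256] = {0};
    for (size_t i = 0; i < m; i++)
        peq[(unsigned char)pat[i]] |= (uint64_t)1 << i;

    uint64_t pv = ~(uint64_t)0;             /* vertical deltas of +1 */
    uint64_t mv = 0;                        /* vertical deltas of -1 */
    uint64_t top = (uint64_t)1 << (m - 1);  /* bit of the last pattern row */
    int score = (int)m;                     /* distance in the last row */
    int best = (int)m;
    size_t end = 0;

    for (size_t j = 0; j < n; j++) {
        uint64_t eq = peq[(unsigned char)text[j]];
        uint64_t xv = eq | mv;
        uint64_t xh = (((eq & pv) + pv) ^ pv) | eq;
        uint64_t ph = mv | ~(xh | pv);      /* horizontal deltas of +1 */
        uint64_t mh = pv & xh;              /* horizontal deltas of -1 */

        if (ph & top)
            score++;
        else if (mh & top)
            score--;

        /* Shifting in a zero lets a match start at any text position. */
        ph <<= 1;
        mh <<= 1;
        pv = mh | ~(xv | ph);
        mv = ph & xv;

        if (score < best) {
            best = score;
            end = j + 1;
        }
    }

    if (best_end != NULL)
        *best_end = end;
    return best;
}
```

For example, `myers_search("ACGT", "TTACGTTT", &end)` returns 0 with `end` set to 6. A trimming tool could, under these assumptions, accept a hit when the returned distance falls below a chosen error threshold and cut the read at the reported position. Patterns longer than one machine word would need the blocked variant of the algorithm, which this sketch omits.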

Low sequencing quality inevitably leads to errors in base calling,
which in turn results in wrong genotypes.
In addition, heterozygous alleles are co-amplified and
sequenced in the same Sanger sequencing reaction, so their sequences often
contain ambiguous bases, which usually have a higher error rate at the base-calling stage.
The chromatograms from which the sequences are called are
the ultimate source for resolving these ambiguities.
Reading chromatograms manually, especially those of heterozygous
sequences, is laborious and error-prone.

I developed a program
that genotypes automatically, working directly from the chromatograms.
The program is highly accurate.
An online version of the program is here.
The program needs two files:

a text file that contains the names of
the genotypes and the corresponding sequences,

and the chromatogram file.

The program searches every sequence in the first file against the chromatogram file to find
the best match.
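
Since the actual algorithm is unpublished, the following is only a toy sketch of the general idea of picking the candidate that best fits a trace. Everything in it is an assumption made for illustration: the `Peak` type (one set of A, C, G, T peak intensities per base position), the scoring rule, the function names, and the requirement that each candidate is already aligned to the trace and has the same length. Heterozygous positions are written with two-base IUPAC codes such as R (A or G).

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical simplified trace: peak intensities for A, C, G, T. */
typedef struct {
    double ch[4];
} Peak;

/* Map a base or two-base IUPAC code to a bitmask over A, C, G, T. */
static unsigned iupac_mask(char c)
{
    switch (c) {
    case 'A': return 1;
    case 'C': return 2;
    case 'G': return 4;
    case 'T': return 8;
    case 'R': return 1 | 4;  /* A or G */
    case 'Y': return 2 | 8;  /* C or T */
    case 'S': return 2 | 4;  /* C or G */
    case 'W': return 1 | 8;  /* A or T */
    case 'K': return 4 | 8;  /* G or T */
    case 'M': return 1 | 2;  /* A or C */
    default:  return 0;      /* unsupported code */
    }
}

/*
 * Score one position between 0 and 1: compare the observed share of
 * signal in each channel with the share expected for the code (equal
 * shares across the bases the code allows).
 */
static double position_score(const Peak *p, char code)
{
    unsigned mask = iupac_mask(code);
    int nbases = 0;
    double total = 0.0;
    for (int b = 0; b < 4; b++) {
        total += p->ch[b];
        if (mask & (1u << b))
            nbases++;
    }
    if (total <= 0.0 || nbases == 0)
        return 0.0;

    double l1 = 0.0;
    for (int b = 0; b < 4; b++) {
        double observed = p->ch[b] / total;
        double expected = (mask & (1u << b)) ? 1.0 / nbases : 0.0;
        double d = observed - expected;
        l1 += d < 0.0 ? -d : d;
    }
    return 1.0 - 0.5 * l1;
}

/*
 * Return the index of the candidate with the highest total score, or
 * -1 if no candidate has the same length as the trace.
 */
int best_genotype(const Peak *peaks, size_t n,
                  const char *const *seqs, size_t count)
{
    int best = -1;
    double best_score = -1.0;
    for (size_t g = 0; g < count; g++) {
        if (strlen(seqs[g]) != n)
            continue;
        double s = 0.0;
        for (size_t i = 0; i < n; i++)
            s += position_score(&peaks[i], seqs[g][i]);
        if (s > best_score) {
            best_score = s;
            best = (int)g;
        }
    }
    return best;
}
```

With a three-position trace whose middle peak shows roughly equal A and G signal, the candidate `ART` scores higher than `AAT` or `AGT` under this rule. A real tool would also have to parse the chromatogram file format and locate each candidate within the trace, both of which this sketch leaves out.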

The algorithm itself has not been published.
It was used in the following publications: