Genetic studies are increasingly based on short, noisy
sequences from next generation scanners. Typically complete DNA
sequences are assembled by matching short NextGen
sequences against reference genomes. Despite
considerable algorithmic gains since the turn of the
millennium, matching both single-ended and paired-end
strings to a reference remains computationally
demanding. Furthermore, tailoring bioinformatics tools to
each new task or scanner remains highly skilled and
labour-intensive work. With this in mind, we recently
demonstrated an automated technique, based on genetic
programming (GP), which generated a version of the
state-of-the-art alignment tool Bowtie2 that was
considerably faster on short sequences produced by a
scanner at the Broad Institute and released as part of
the 1000 Genomes Project.

Results:

Bowtie2GP and the original Bowtie2 release were
compared on bioplanet's GCAT synthetic benchmarks.
The Bowtie2GP enhancements were also applied to the latest
Bowtie2 release (2.2.3, 29 May 2014), and the resulting
version retained both the GP and the manually introduced
improvements.

Conclusions:

On both single-ended and paired-end synthetic next
generation DNA sequence GCAT benchmarks, Bowtie2GP runs
up to 45 percent faster than Bowtie2. The loss in
accuracy can be as little as 0.2--0.5 percent but is up to
2.5 percent for longer sequences.",