Thirty years ago I was first introduced to the DNA assembly problem and I have been captivated by it ever since. So on the occasion of the Senior Scientist award, I thought I would speak on this problem that has been a consistent thread throughout my career.

I will give a brief history, from Sanger to today, of the technology and algorithmic approaches to the problem, weaving in the ideas of string graphs and de Bruijn graphs, and the surprising transition from skepticism of whole-genome shotgun sequencing to an irrational acceptance of NGS whole-genome shotgun over short reads.

Fortunately, the future portends better, with long-read sequencers beginning to come into play. The unusually high error rates associated with these new technologies imply that some aspects of the assembly problem are harder than ever, but because the error is truly random (unlike any previous technology), the ideal of near-perfect de novo assembly is again possible. I will conclude with a description of our recent algorithmic work on an assembler we call the Dazzler (the Dresden AZZembLER) that can assemble 1-10Gb genomes directly from a shotgun, long-read data set produced by PacBio RS II sequencers.

Biography:

In the 1980s Gene Myers developed many efficient algorithms for sequence comparison and search, used, for example, in BLAST and UNIX diff. With Udi Manber, he invented suffix arrays that enable the Burrows-Wheeler transform needed in today's space-efficient indices, especially for genomic data. Myers developed the overlap-layout-consensus paradigm for DNA sequencing, ultimately perfecting the string graph approach used at Celera to successfully assemble the fly, human, and mouse genomes. With Jim Weber, he was the first to propose paired-end whole-genome shotgun sequencing of the human genome, the paradigm by which most genomes are sequenced today. Recently he has focused on the construction of novel microscopes and software for building single-cell expression atlases across developmental epochs.

Myers has been a professor at the University of Arizona and UC Berkeley, a vice president at Celera Genomics, and a group leader for HHMI and the Max-Planck Society. He is a member of the National Academy of Engineering, USA, the National Academy of Germany, and won the ACM Kanellakis Prize in 2002.