Tuesday, July 17, 2012

And so it begins ... Compressive Genomics

You probably recall last month's "What is Faster than Moore's Law and Why You Should Care", where we noted two facts. One, the rapid rise of computing power and imaging capabilities is making it difficult for our understanding of the data to keep up. Two, a new technology (sequencing) promises to deliver larger datasets at an even faster pace. As stated then, our only recourse is to develop better algorithms... fast. Here is an instance of that need being addressed in a new Nature Biotechnology paper entitled Compressive genomics by Po-Ru Loh, Michael Baym, and Bonnie Berger. The introduction starts with:

In the past two decades, genomic sequencing capabilities have increased exponentially [1, 2, 3], outstripping advances in computing power [4, 5, 6, 7, 8]. Extracting new insights from the data sets currently being generated will require not only faster computers, but also smarter algorithms. However, most genomes currently sequenced are highly similar to ones already collected [9]; thus, the amount of new sequence information is growing much more slowly.

Here we show that this redundancy can be exploited by compressing sequence data in such a way as to allow direct computation on the compressed data using methods we term 'compressive' algorithms. This approach reduces the task of computing on many similar genomes to only slightly more than that of operating on just one. Moreover, its relative advantage over existing algorithms will grow with the accumulation of genomic data. We demonstrate this approach by implementing compressive versions of both the Basic Local Alignment Search Tool (BLAST) [10] and the BLAST-Like Alignment Tool (BLAT) [11], and we emphasize how compressive genomics will enable biologists to keep pace with current data.
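
To make the idea of computing directly on compressed data a little more concrete, here is a toy Python sketch. It is not the authors' compressive BLAST or BLAT: it assumes equal-length sequences that differ only by a few substitutions, and the names Cluster, compress, compressive_search and max_diffs are made up for this illustration. Similar sequences are stored as one representative plus a short list of differences; a query is first matched coarsely against the representatives with a relaxed mismatch threshold, and only the surviving windows are checked exactly against each linked sequence.

```python
from dataclasses import dataclass, field


@dataclass
class Cluster:
    """One representative stored in full, plus per-member substitutions."""
    rep: str
    # member name -> {position: base} where the member differs from rep
    members: dict = field(default_factory=dict)


def compress(genomes, max_diffs=3):
    """Greedily group equal-length sequences within max_diffs substitutions."""
    clusters = []
    for name, seq in genomes.items():
        for c in clusters:
            if len(c.rep) != len(seq):
                continue
            diffs = {i: b for i, (a, b) in enumerate(zip(c.rep, seq)) if a != b}
            if len(diffs) <= max_diffs:
                c.members[name] = diffs
                break
        else:
            # No close representative found: this sequence starts a new cluster.
            clusters.append(Cluster(rep=seq, members={name: {}}))
    return clusters


def compressive_search(clusters, query, max_diffs=3):
    """Find exact occurrences of query without decompressing whole sequences."""
    hits = []
    m = len(query)
    for c in clusters:
        for start in range(len(c.rep) - m + 1):
            window = c.rep[start:start + m]
            # Coarse step: a member can match exactly only if the
            # representative is within max_diffs mismatches of the query.
            if sum(a != b for a, b in zip(window, query)) > max_diffs:
                continue
            # Fine step: patch the window with each member's differences.
            for name, diffs in c.members.items():
                local = list(window)
                for pos, base in diffs.items():
                    if start <= pos < start + m:
                        local[pos - start] = base
                if "".join(local) == query:
                    hits.append((name, start))
    return sorted(hits)


# Hypothetical toy data: "b" is a one-substitution variant of "a".
genomes = {"a": "ACGTACGTAC", "b": "ACGTTCGTAC", "c": "GGGGCCCCAA"}
clusters = compress(genomes)
print(len(clusters), compressive_search(clusters, "GTTC"))
```

The coarse pass scans one representative per cluster rather than every sequence, which is the sense in which the cost of searching many similar genomes approaches that of searching one. Real tools have to deal with insertions, deletions and approximate alignment scores, all of which this sketch ignores.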

It might even be a good idea to have a session on the subject at the next BASP meeting or even sooner. We need to get that conversation going on a large scale as we don't have much time before the field begins to crumble into many little subfields.

For videos on issues related to biology, compressive sensing, and streaming algorithms, you may want to watch: