Compose :: Melbourne Speaker - Justin Bedő

Compose :: Melbourne will feature many excellent speakers. One of the speakers in this year's lineup is Justin Bedő. If you want to see the whole lineup, look here!

1:30pm - Justin Bedő

BioShake: a Haskell EDSL for bioinformatics pipelines

Recently there have been great advances in biology arising from rapid technological progress, notably in areas such as genomics. These technologies have drastically increased the use of computing in data processing and analysis. It is now common to spend many hours of compute time processing biological data in what is known as a *bioinformatics pipeline*.

Bioinformatics pipelines are typically composed of numerous programs and stages, loosely coupled through intermediate files. They tend to be quite complex and require substantial computational time. A good pipeline must therefore manage intermediate files, guarantee reentrability (the ability to re-enter and continue a partially run pipeline), and provide a clear syntax that makes pipelines easy to describe and understand.
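To make the idea of re-entering a partially run pipeline concrete, here is a minimal sketch in plain Haskell. It is not BioShake's API: the `Stage` type, the `pending` function, and the file names are all hypothetical, and the "output already exists" check is a simplification of what a real tool would do (for example, comparing timestamps or content hashes).

```haskell
module Main where

-- A stage reads some input files and writes one output file.
-- Hypothetical type for illustration only; not taken from BioShake.
data Stage = Stage
  { stageName :: String
  , inputs    :: [FilePath]
  , output    :: FilePath
  }

-- Given the files already on disk, return the stages that still need to run.
-- A stage is skipped when its output already exists, which is what lets a
-- partially run pipeline be re-entered without redoing finished work.
pending :: [FilePath] -> [Stage] -> [Stage]
pending existing = filter (\s -> output s `notElem` existing)

-- A toy three-stage pipeline coupled through intermediate files.
pipeline :: [Stage]
pipeline =
  [ Stage "align" ["sample.fastq"] "sample.sam"
  , Stage "sort"  ["sample.sam"]   "sample.bam"
  , Stage "call"  ["sample.bam"]   "sample.vcf"
  ]

main :: IO ()
main = do
  -- Simulate a run that was interrupted after alignment finished.
  let onDisk = ["sample.fastq", "sample.sam"]
  -- Print the names of the stages that remain to be run.
  mapM_ (putStrLn . stageName) (pending onDisk pipeline)
```

In this simulated state only the `sort` and `call` stages remain, because the output of `align` is already present.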

This problem resembles that of building software artefacts, which is common in computer science. Notable tools in that space are the ubiquitous *make* and, more recently, *Shake*, a robust build tool implemented as an embedded domain specific language (EDSL) in Haskell. However, bioinformatics pipelines have some unique properties that do not fit well into current build tools...