About the workshop
RECOMB-seq is a workshop held two days prior to the main conference, with a distinct paper submission process and program. Our goal is to bring together the community of scientists working on methods for massively parallel sequencing.

Scope
The recent revolution in sequencing technology has opened the door to a myriad of new applications and bio-medical discoveries. Projects are under way to sequence thousands of individuals (the 1000 Genomes Project), tens of thousands of vertebrate species (the Genome10K project), and the whole microbial ecosystem that lives in our bodies (the Human Microbiome Project). Simultaneously, the novelty and complexity of the data have highlighted the challenges and limitations of current methods. As the technology continues to evolve and approaches its third generation, the challenges facing the community are becoming increasingly computational.

We invite contributions describing new methodology for all aspects of massively parallel sequencing data, including, but not limited to:

Bio-medical applications, including cancer genomics

Discovery and genotyping of genomic variants, including SNPs, indels, and structural variants

Local and de novo sequence assembly

RNA sequencing, including the analysis of RNA expression and novel transcript assembly

In many applications of NGS, sequenced samples contain multiple distinct but sometimes very similar sequences. This talk addresses two such applications: reconstructing viral quasispecies from shotgun and amplicon 454 Life Sciences reads, and reconstructing transcriptomes from single and paired RNA-Seq reads.
Error-prone replication of RNA viruses with high mutation rates creates a diverse population of closely related variants known as quasispecies. Understanding the quasispecies makes it possible to manufacture more effective drugs and vaccines and to implement cost-saving metrics for infected patients. Reconstructing the quasispecies spectrum is difficult because conserved regions in the genome can extend beyond the read length. We will compare a recently proposed approach, reconstruction from amplicon reads obtained from predefined overlapping genome regions, with the more developed approach of reconstruction from shotgun reads randomly distributed along the entire genome.
Transcriptome reconstruction and quantification from RNA-Seq reads can be greatly improved by using existing partial annotation as well as by fully utilizing paired reads. We describe a framework that removes reads that can be explained by existing annotation and focuses on the remaining reads for discovering novel transcripts and estimating their frequencies. We next describe an optimization approach that minimizes the number of selected candidate transcripts such that mapped paired reads closely follow the known insert length distribution.
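As a rough illustration of the selection idea described above, the following minimal sketch picks a small set of candidate transcripts so that every explainable read pair has a fragment length close to the expected insert length. It is not the speaker's method: it substitutes a simple greedy set-cover heuristic for the optimization, and all names (`fragment_length`, `explains`, `select_transcripts`), the coordinate conventions, and the `mean`, `sd`, and `k` parameters are hypothetical assumptions made for this example.

```python
from typing import Dict, List, Optional, Tuple

Exon = Tuple[int, int]        # half-open genomic interval [start, end)
Transcript = List[Exon]       # exons sorted by start coordinate
ReadPair = Tuple[int, int]    # outer ends of the fragment, half-open


def to_transcript_coord(transcript: Transcript, pos: int) -> Optional[int]:
    """Map a genomic position to its offset in the spliced transcript."""
    offset = 0
    for start, end in transcript:
        if start <= pos < end:
            return offset + (pos - start)
        offset += end - start
    return None  # position lies in an intron or outside the transcript


def fragment_length(transcript: Transcript, pair: ReadPair) -> Optional[int]:
    """Fragment length implied by a read pair if it came from this transcript."""
    left = to_transcript_coord(transcript, pair[0])
    right = to_transcript_coord(transcript, pair[1] - 1)
    if left is None or right is None:
        return None
    return right - left + 1


def explains(transcript: Transcript, pair: ReadPair,
             mean: float, sd: float, k: float = 2.0) -> bool:
    """Assumed criterion: implied length within k standard deviations of the mean."""
    length = fragment_length(transcript, pair)
    return length is not None and abs(length - mean) <= k * sd


def select_transcripts(candidates: Dict[str, Transcript], pairs: List[ReadPair],
                       mean: float, sd: float) -> List[str]:
    """Greedy set cover: repeatedly take the transcript explaining the most
    still-unexplained read pairs (ties broken alphabetically)."""
    covers = {name: {i for i, p in enumerate(pairs) if explains(t, p, mean, sd)}
              for name, t in candidates.items()}
    uncovered = set().union(*covers.values())
    chosen: List[str] = []
    while uncovered:
        best = max(sorted(covers), key=lambda n: len(covers[n] & uncovered))
        chosen.append(best)
        uncovered -= covers[best]
    return chosen


# Toy data: T2 differs from T1 by one extra 30-base exon.
candidates = {
    "T1": [(0, 100), (200, 300)],
    "T2": [(0, 100), (150, 180), (200, 300)],
}
pairs = [(50, 250), (20, 170), (10, 210)]
print(select_transcripts(candidates, pairs, mean=100, sd=5))  # prints ['T1', 'T2']
```

In the toy data, the first and third pairs imply a 100-base fragment only on T1, while the second pair ends inside the exon unique to T2, so both transcripts are needed. A greedy cover does not guarantee the minimum number of transcripts; it is used here only because it is short and easy to follow.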

Pavel PEVZNER,
University of California, San Diego, US
"SPAdes: a New Genome Assembly Algorithm and its Applications to Single-Cell Sequencing"

The lion's share of bacteria in various environments cannot be cloned in the laboratory and thus cannot be sequenced using existing technologies. A major goal of single-cell genomics is to complement gene-centric metagenomic data with whole-genome assemblies of uncultivated organisms. Assembly of single-cell data is challenging because of highly non-uniform read coverage as well as elevated levels of sequencing errors and chimeric reads. We describe SPAdes, a new assembler for both single-cell and standard (multicell) assembly, and demonstrate that it improves on the recently released E+V-SC assembler (specialized for single-cell data) and on the popular assemblers Velvet and SOAPdenovo (for multicell data). SPAdes generates single-cell assemblies, providing information about genomes of uncultivatable bacteria that vastly exceeds what may be obtained via traditional metagenomics studies.