Monday, July 16, 2012

Dianhui (Dennis) Zhu presented "Genomic data analysis with hadoop". He talked about using the Hadoop framework to search for patterns in genomic sequence datasets. The talk was based on his three-year project at Baylor, which started using Hadoop a year ago. Dennis is a Senior Scientific Programmer at HGSC.

The interesting technical problem that Dennis showed was how to break a sequence into chunks before it reaches the Mapper. This is usually trivial in regular applications, but it is quite hard with the unbounded, unstructured data of the genome. The audience analyzed the actual code, asked many questions, and wanted to compare it to existing open source projects.
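To make the chunking problem concrete, here is a minimal sketch in plain Python. It is not Dennis's code, and it does not use Hadoop. It only illustrates one common way to split an unstructured sequence so that a pattern spanning a chunk boundary is not lost: let consecutive chunks overlap by the pattern length minus one. The names `chunk_sequence` and `find_pattern`, and the choice of exact string matching, are assumptions made for illustration.

```python
def chunk_sequence(seq, chunk_size, overlap):
    """Yield (offset, chunk) pairs; consecutive chunks share `overlap` characters."""
    if chunk_size <= overlap:
        raise ValueError("chunk_size must be larger than overlap")
    step = chunk_size - overlap
    for start in range(0, len(seq), step):
        yield start, seq[start:start + chunk_size]
        # Stop once a chunk has reached the end of the sequence.
        if start + chunk_size >= len(seq):
            break


def find_pattern(seq, pattern, chunk_size):
    """Return start positions of `pattern` in `seq`, searching chunk by chunk.

    Each chunk plays the role of one mapper input (an assumption for this sketch).
    The overlap of len(pattern) - 1 keeps boundary-spanning matches inside a chunk.
    """
    overlap = len(pattern) - 1
    hits = []
    for offset, chunk in chunk_sequence(seq, chunk_size, overlap):
        pos = chunk.find(pattern)
        while pos != -1:
            hits.append(offset + pos)  # translate back to global coordinates
            pos = chunk.find(pattern, pos + 1)
    return hits


if __name__ == "__main__":
    print(find_pattern("ACGTACGTACGT", "GTAC", chunk_size=5))
```

In a real Hadoop job the hard part is doing this split on the input side, where the framework cuts files at byte offsets without knowing anything about the sequence; the sketch above only shows why an overlap is needed.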

Sunday, July 8, 2012

SHMcloud™ Press Release 7/9/12

FreeEed™ is now in the cloud!

eDiscovery processing: text extraction, culling, and native/text and metadata CSV delivery.

Special introductory offer until August 15: $1 per machine-hour. How fast is that? At a recent show we processed 100 GB of Enron data in 1 hour for under $100, as seen here.

How can you get started?
⇢ Just go to here,
⇢ download the SHMcloud™ Player,
⇢ and start!