The use of convert2bed for this file offers a 9.1x speed improvement. Other large BAM files show similar conversion speedups.

Further time reductions are conferred with use of bam2bedcluster and bam2starchcluster scripts (TBA) which make use of GNU Parallel or a Sun Grid Engine job scheduler, reducing conversion time even further by breaking conversion tasks down by chromosome.

Like this:

For scientific work, I have used matrix2png to make a nice PNG image from a text-formatted matrix of data values. PNG looks great on the web, but it doesn’t translate well to making publication-quality figures.

My thought was to take matrix2png and — with the help of Haru (libharu) — turn it into matrix2pdf. Maybe I can get this going on Github.

The end goal is to be able to use an iPhone to read LED displays, as commonly found on meters, etc. and then do something useful with that data (upload it somewhere, tagged with geodata). An aggregate of hundreds or thousands of users could conceivably collect data useful for themselves and also for the group as a whole.

Like this:

I wrote a data extraction utility which uses PolarSSL to export a Base64-encoded SHA-1 digest of some internal metadata (a string of JSON-formatted data), to help validate archive integrity:

$ unstarch --sha1-signature .foo
7HkOxDUBJd2rU/CQ/zigR84MPTc=

So far, so good.

But now I want to validate that the metadata are being digested correctly through some independent means, preferably via the command-line, so that I can perform regression testing. I can use the openssl, xxd and base64 tools together to test that I get the same answer:

As a note to myself: I end up stripping the trailing newline from the JSON output of unstarch because this is what the PolarSSL library ends up digesting. This very nearly had me doubting whether PolarSSL was working correctly, or whether my command-line test was correct!