October 7, 2008

I’ve started work on my first application that uses BioSharp. It is called RestrictionFinder, and its purpose is to help find a pair of restriction enzymes that give distinct cleavage patterns when an insert is present in a plasmid in the forward direction, reverse direction, or absent. It also has the ability to limit the search to pairs of enzymes with compatible buffers.
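To make the idea concrete, here is a minimal Python sketch of the check, using plain strings rather than BioSharp. It is not the RestrictionFinder code. The function names, the toy sequences, and the simplification that each enzyme cuts at the first base of its recognition site are all assumptions for illustration. It also only scans one strand, which is sufficient for palindromic sites such as EcoRI (GAATTC) and BamHI (GGATCC).

```python
# Hypothetical helper names; this is a plain-string illustration, not BioSharp.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna):
    """Reverse complement of an uppercase/lowercase DNA string (returned uppercase)."""
    return dna.upper().translate(COMPLEMENT)[::-1]


def find_sites(sequence, site):
    """Start indices of every occurrence of `site` in a circular sequence."""
    n = len(sequence)
    # Append the first few bases so matches spanning the origin are found.
    wrapped = sequence + sequence[:len(site) - 1]
    return [i for i in range(n) if wrapped.startswith(site, i)]


def fragment_sizes(sequence, sites):
    """Sorted fragment lengths after cutting a circular sequence at all sites."""
    n = len(sequence)
    cuts = sorted({pos for site in sites for pos in find_sites(sequence, site)})
    if not cuts:
        return [n]  # uncut plasmid
    # Distance from each cut to the next one, wrapping around the circle.
    # A single cut gives a distance of 0 modulo n, which means the full length.
    return sorted((cuts[(i + 1) % len(cuts)] - cut) % n or n
                  for i, cut in enumerate(cuts))


def is_diagnostic(backbone, insert, position, sites):
    """True if empty, forward, and reverse constructs give three distinct patterns."""
    backbone, insert = backbone.upper(), insert.upper()
    forward = backbone[:position] + insert + backbone[position:]
    reverse = backbone[:position] + reverse_complement(insert) + backbone[position:]
    patterns = {tuple(fragment_sizes(s, sites)) for s in (backbone, forward, reverse)}
    return len(patterns) == 3


# Toy data: one EcoRI site in the backbone, one off-centre BamHI site in the insert.
backbone = "GAATTC" + "A" * 20
insert = "TT" + "GGATCC" + "T" * 12

print(is_diagnostic(backbone, insert, 10, ["GAATTC", "GGATCC"]))  # True
print(is_diagnostic(backbone, insert, 10, ["GAATTC"]))            # False
```

With EcoRI alone, the forward and reverse constructs both linearize to the same length, so the orientation cannot be told apart. Adding BamHI, whose site sits off-centre in the insert, makes the fragment sizes depend on the orientation.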

Here is a screen shot of the sequence entry form:

[Screenshot: Sequence Entry Form]

If the sequence contains uppercase and lowercase, the lowercase is assumed to be the insert, and the start/end positions are set automatically.
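The case convention is easy to sketch. The following Python snippet is only an illustration of the rule described above, not the form's actual code. The function name, the 1-based inclusive positions, and the choice to reject more than one lowercase run are assumptions.

```python
def find_insert(sequence):
    """Return (start, end) of the lowercase run, 1-based and inclusive, or None.

    Assumes the insert is a single contiguous lowercase stretch.
    """
    lower = [i for i, base in enumerate(sequence) if base.islower()]
    if not lower:
        return None  # all uppercase: no insert marked
    start, end = lower[0], lower[-1]
    if end - start + 1 != len(lower):
        raise ValueError("lowercase bases do not form one contiguous run")
    return start + 1, end + 1


print(find_insert("ACGTaacgttGGCC"))  # (5, 10)
print(find_insert("ACGTGGCC"))        # None
```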

As part of the solution, I needed a small database (just a file, really) of enzymes and their buffers. I could not find a readily available file for this, so I wrote a small console app that extracts the data from the REBASE web site.

This is still a work in progress, but the source code is checked into the BioSharp SVN repository.

September 26, 2008

I’ve uploaded the first incarnation of the BioSharp web site. It is still a bit thin, but at least the API docs are available.

The next step in the project will be to work on an application that was the whole motivation for BioSharp. As that progresses, I’m sure I’ll be continuing to port bits. According to my port status page, I’m just under 5% done…only 1413 classes left to port. Even at this point, though, the library has some useful functionality, as demonstrated by a few of my earlier posts.

If you are interested in seeing a specific module ported over, don’t hesitate to add a comment here or post in the forums. I’ll also look at getting a mailing list set up sometime soon.

September 23, 2008

The BioSharp port is still moving forward. I have enough functionality now to be able to create a DNA sequence, find a subsequence within that sequence, and create a new sequence with the subsequence flipped around.

For example, it can take “aacgaa”, search for “cg”, flip it around, and create the new sequence “aagcaa”. It would be trivial to do this just by string manipulation; hopefully the investment in the library will be worth it.
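For comparison, the string-manipulation version might look like the Python sketch below. It is not BioSharp code, and the function name is made up for illustration. "Flip" is taken here to mean a plain reversal of the first match, which is what the example above shows.

```python
def flip_subsequence(sequence, target):
    """Reverse the first occurrence of `target` within `sequence`.

    Returns the sequence unchanged if `target` is not found.
    """
    index = sequence.find(target)
    if index == -1:
        return sequence
    end = index + len(target)
    return sequence[:index] + sequence[index:end][::-1] + sequence[end:]


print(flip_subsequence("aacgaa", "cg"))  # aagcaa
```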

September 1, 2008

I stumbled across the Bioinformatics Group at the UofM, and realized that I met the president at a birthday party for a mutual friend a few months ago. I may have the opportunity to contribute to a project or two in the coming semester(s), so I started reading a bit about bioinformatics (again).

I went looking for some code, and found a framework called BioPerl, which seems fairly popular. My Perl skills have atrophied over the years, so when I found BioJava, I was a bit more excited. It provides a number of useful functions, and seems fairly active. There is also a related database project, BioSQL, for which BioPerl and BioJava (along with BioRuby and Biopython) provide language bindings. BioJava even uses Hibernate as its O/R mapping layer.

Since I like to work in C#, I started playing around with porting BioJava to C#. It’s a huge project, but it’s also a great way to see how BioJava is put together. I’ve managed to get far enough that I can transcribe DNA to RNA using the following code:

Yup. A few dozen classes and a few hundred lines of code, and I can replace t’s with u’s. Pretty exciting, eh?
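Stripped of the framework, the operation itself is a one-liner. This Python sketch is not the BioSharp listing, just the bare string substitution, with a hypothetical function name:

```python
def transcribe(dna):
    """Transcribe a DNA string to RNA by replacing t with u, preserving case."""
    return dna.replace("t", "u").replace("T", "U")


print(transcribe("atgcat"))  # augcau
```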

Actually, I think it is pretty cool. I’m close to having code working that will let me translate the RNA to a protein sequence or form the complement of a DNA strand. Not rocket science, but I’ve only begun to scratch the surface. The framework can read sequence files (BLAST, FASTA), edit large sequences efficiently, do pairwise alignment, and a whole lot more.
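As a point of reference for the complement step, here is what it amounts to on plain strings. This is a Python sketch with made-up function names, not the BioSharp or BioJava API, and it assumes the sequence contains only a, c, g, and t.

```python
# Translation table mapping each base to its Watson-Crick partner, in both cases.
COMPLEMENT_TABLE = str.maketrans("acgtACGT", "tgcaTGCA")


def complement(dna):
    """Complement each base in place, keeping the original order and case."""
    return dna.translate(COMPLEMENT_TABLE)


def reverse_complement(dna):
    """The opposite strand read in its own 5' to 3' direction."""
    return complement(dna)[::-1]


print(complement("atgc"))          # tacg
print(reverse_complement("atgc"))  # gcat
```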