01 September 2008

Today was my first day as a bioinformatician at the Center for the Study of Human Polymorphisms (CEPH, http://www.cephb.fr/en/cephdb), and I want to thank my former colleagues Christine K and Philippe Gesnouin (philguess on twitter/FF), who helped me find this position. It's a short-term contract (one year).

The CEPH is located in Paris, near the St-Louis Hospital and the "Place de la République". It maintains a database of genotypes for genetic markers that have been typed on the CEPH reference family resource for linkage mapping (Genomics 6: 575-577, 1990; Science 265: 2049-2054, 1994). The CEPH works in conjunction with the National Center of Genotyping (CNG/Evry), where I also worked eight years ago, and both centers are managed by Dr Mark Lathrop. One of my first objectives is to develop a set of tools around OPERON with the help of its author, Mario Foglio.

As far as I've understood Operon today (I may be wrong), it is a C program handling a large set of genotypes (among other things...) using BerkeleyDB as a storage engine (I blogged about BerkeleyDB a few posts ago). It seems that, with this strategy, the genotypes can be accessed quickly using something like fseek(table,sizeof(genotype_t)*(sample_count*marker_index+sample_index),SEEK_SET): each genotype is a fixed-size record whose position is computed directly from its marker and sample indices.
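To make that fseek idea concrete, here is a minimal sketch in C of fixed-size records stored marker by marker in a plain file. It is only my guess at the principle: the genotype_t layout (two one-byte alleles) and the helper names genotype_offset, write_genotype and read_genotype are hypothetical, and the real Operon code goes through BerkeleyDB rather than a flat file.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical fixed-size record: the real Operon layout is not known here. */
typedef struct {
    uint8_t allele1;
    uint8_t allele2;
} genotype_t;

/* Byte offset of the genotype of one sample for one marker,
 * assuming all samples of marker 0 come first, then marker 1, and so on. */
long genotype_offset(long sample_count, long marker_index, long sample_index) {
    assert(sample_count > 0);
    assert(marker_index >= 0);
    assert(sample_index >= 0 && sample_index < sample_count);
    return (long)sizeof(genotype_t) * (sample_count * marker_index + sample_index);
}

/* Store one genotype at its computed position. Returns 0 on success, -1 on error. */
int write_genotype(FILE *table, long sample_count, long marker_index,
                   long sample_index, const genotype_t *g) {
    long offset = genotype_offset(sample_count, marker_index, sample_index);
    if (fseek(table, offset, SEEK_SET) != 0) return -1;
    return fwrite(g, sizeof *g, 1, table) == 1 ? 0 : -1;
}

/* Fetch one genotype without scanning the file. Returns 0 on success, -1 on error. */
int read_genotype(FILE *table, long sample_count, long marker_index,
                  long sample_index, genotype_t *out) {
    long offset = genotype_offset(sample_count, marker_index, sample_index);
    if (fseek(table, offset, SEEK_SET) != 0) return -1;
    return fread(out, sizeof *out, 1, table) == 1 ? 0 : -1;
}
```

With a table opened in binary read/write mode, read_genotype(table, sample_count, m, s, &g) jumps straight to the record for marker m and sample s, so the cost of a lookup does not depend on the size of the table.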

As a Java programmer, I wish I could write a wrapper around the Operon C API: it would be useful for embedding this model in a web container (servlet, JSP) or for writing a Swing interface. My first ideas to achieve this are:
* using JNI (Java Native Interface, which allows calling C from Java) to write a Java wrapper around the C API
* reading the data in the BerkeleyDB files using the BerkeleyDB Java API
* ...