Friday, March 31, 2006

An InChI (or the FAQ) is a line notation for a molecular structure that was recently developed by the NIST and the IUPAC. In principle, InChIs can be applied to proteins too (see below), but because proteins would give lengthy InChIs and are quite well defined in terms of connectivity anyway, they are better described by their amino acid sequence.

The March 2006 issue of CDK News, the Chemistry Development Kit project newsletter, will be released later today and has, for the second time, the requirement that authors provide InChIs for molecular structures mentioned in the articles. What differs from the previous issue is how InChIs are marked up in LaTeX. I've set up an \inchi{} command for this that automatically creates a Google search query as a link behind the InChI.

Now, googling for InChIs only works if one removes the InChI= part of the InChI. As an example I will show how it works for methane. The InChI for this compound is InChI=1/CH4/h1H4, so in LaTeX one enters \inchi{1/CH4/h1H4}. This will create a link like: InChI=1/CH4/h1H4.
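To illustrate the idea outside LaTeX, here is a minimal Python sketch of the same two steps: drop the InChI= prefix, then build a search link from what remains. The function name inchi_search_url and the exact Google URL layout are my assumptions for illustration; this is not the macro used for CDK News.

```python
from urllib.parse import urlencode

INCHI_PREFIX = "InChI="


def inchi_search_url(inchi: str) -> str:
    """Return a Google search URL for an InChI (hypothetical helper).

    The 'InChI=' prefix is dropped first, because searching only
    works on the part that follows it.
    """
    if inchi.startswith(INCHI_PREFIX):
        query = inchi[len(INCHI_PREFIX):]
    else:
        query = inchi
    # urlencode percent-escapes the slashes in the InChI layers.
    # The URL layout below is an assumption, not taken from the macro.
    return "https://www.google.com/search?" + urlencode({"q": query})


if __name__ == "__main__":
    # Methane, the example used above.
    print(inchi_search_url("InChI=1/CH4/h1H4"))
```

The function accepts the InChI with or without the prefix, which mirrors how \inchi{1/CH4/h1H4} takes only the part after InChI=.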

BTW, if you are interested in InChIs for proteins, here is the InChI for 1CRN, created with OpenBabel:

Saturday, March 25, 2006

As of April 3, I will be working as a postdoc in the group of Christoph Steinbeck at the Cologne University BioInformatics Center, or simply CUBIC, for a year. Though no exact plans have been decided upon, the work will include CDK, CML, ontologies, Bioclipse, semantic web technologies, Jmol, and other interesting things. Research areas will at least include QSAR, but I hope to touch bits of bioinformatics too.

Saturday, March 18, 2006

Dan (the original Jmol author) has an interesting blog series: How to make money from Open Source scientific software I, II and III. Three more blog items are planned. They deal with how to make money from open source scientific software. He wants to be able to skeptically review the software in his field, hence open source. But open source software development, at least in chemistry, needs funding, because there are too few people working on such software on a voluntary basis.

The articles discuss possible scenarios. Article I discusses 'Sell hardware' that comes with open source software, and article II discusses the 'Sell services' scenario, which still works in the GNU/Linux OS world. He argues that selling support does not fit the chem-bla-ics world: "First, scientific software targets a relatively small group of users, and at the same time, the development and support costs are often quite large." and "Why would a researcher spend $10000 on a support contract if the problem could be solved by throwing a graduate student at the open source version of the code for a few months?" Interesting arguments indeed.

Instead, he suggests, the service sold should be knowledge. The open source based company should sell knowledge and should solve customer problems using open source software. Each problem will come with specific needs, allowing indirect funding of open source development. And, yes, this is indeed how open source chemo-/bioinformatics software is currently developed: as a means to solve scientifically challenging problems.

He discusses the advantages and problems of open source, and mentions the often lacking user-friendly GUI (true), and the lack of literature to validate the program. It was unclear to me whether the last argument applied to the free tools or to the open source programs; I thought open source projects like the CDK, JOELib, Jmol and PyMol were quite strong in this area, at least compared to the commercial software I have seen.

Today it hit Debian unstable, so I upgraded my sid32 chroot and had Cacao run Jmol. I had some memory issues opening a small molecule [1], and the rendering speed was a factor of 100 or so slower than with Sun's JVM, but it runs!

Using the command cacao -Xmx512M -jar Jmol.jar triplebond.mol I got:

Note the exceptions copied to the console. I'll paste the full stack trace as a comment. Many thanks to the Classpath team!

After Kalzium and kfile_chemical, KDE has now been extended with kparts for 3D structure and spectrum display: Kryomol. It is written in C++ and licensed under the GPL. It supports several chemistry formats, among which are quantum chemical formats like Gaussian03, NWChem and ACES, and 3D structure formats such as MDL molfile and XYZ.

This blog deals with chemblaics in the broader sense. Chemblaics (pronounced chem-bla-ics) is the science that uses computers to solve problems in chemistry, biochemistry and related fields. The big difference between chemblaics and areas such as chem(o)?informatics, chemometrics, computational chemistry, etc, is that chemblaics only uses open source software, open data, and open standards, making experimental results reproducible and validatable. And this is a big difference!

About Me

Assistant professor at the Dept of Bioinformatics - BiGCaT at NUTRIM, Maastricht University, studying biology at an unsupervised and atomic level. Open Science is my main hobby resulting in participation in, among many others, Bioclipse, CDK and WikiPathways. ORCID:0000-0001-7542-0286. Posts on G+ are personal.
