Saturday, February 25, 2012

This is more like the update interval I had originally in mind: one month per version. In fact, this is the first edition for the same CDK release as the previous edition, and thus is edition 1.4.7-1. Still, this release adds 12 new pages, consisting mostly of a new Chapter 14, about molecular descriptors, and the CDK API for descriptor calculation:

Other new content includes a short code example for generating 3D coordinates.

LaTeXSearch lets you search for mathematical equations in literature, and also is helpful in providing the LaTeX source of those equations. For example, in the above screenshot I searched for RMSE. But there are more interesting queries, like all equations with ħ or with Q2.

One of the things is that the CiTO data added via a given account can be downloaded as triples:

The second is that they are improving the graphics used to visualize it. For example, they added an 'Expand' link, which I found when they tweeted that they have a hidden drag-n-drop feature (which I haven't found yet, though). Clicking that link will show you the following:

Because CiteULike takes advantage of the inverses of the CiTO predicates, the citations show up with the cited paper too, which is less suitable for the top-down flow graphics:
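How a citation ends up on the cited paper's side can be sketched in a few lines of Python. This is only an illustration and not how CiteULike implements it: the paper IRIs are made up, the function name is hypothetical, and the mapping covers just three CiTO predicate pairs that I picked by hand.

```python
# CiTO namespace; the predicate pairs below are a small, hand-picked subset.
CITO = "http://purl.org/spar/cito/"

INVERSE_OF = {
    CITO + "cites": CITO + "isCitedBy",
    CITO + "extends": CITO + "isExtendedBy",
    CITO + "confirms": CITO + "isConfirmedBy",
}


def with_inverse_triples(triples):
    """Return the input triples plus one inverse triple per known predicate."""
    result = list(triples)
    for subject, predicate, obj in triples:
        inverse = INVERSE_OF.get(predicate)
        if inverse is not None:
            # The inverse statement swaps subject and object.
            result.append((obj, inverse, subject))
    return result


# Hypothetical paper IRIs, for illustration only.
paper_a = "http://example.org/paper/A"
paper_b = "http://example.org/paper/B"

triples = [(paper_a, CITO + "cites", paper_b)]
for triple in with_inverse_triples(triples):
    print(triple)
```

The second triple printed has paper B as its subject, which is why the citation also appears when you look at the cited paper.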

To make this advertorial a bit balanced: not all my wishes have been implemented yet, and the next one up from my perspective should be Linked Data. There is some Linked Data embedded as RDFa, but the latter is not turning out to be the killer feature I had hoped for, and regular RDF entry points should be used.

Each CiteULike entry (post) should get a unique IRI (or URI), and opening that link should give RDF about that post (wish #10). That is dereferenceability. The RDF can be, for example, in BIBO, but there are many alternatives, and I have not been keeping up with which is best (please leave a comment if you have an opinion on that).
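From the client side, dereferencing such an IRI could look roughly like the sketch below. It only builds the HTTP request and does not send it. The post IRI is a placeholder and the function name is hypothetical; whether CiteULike would answer with RDF/XML, Turtle, or something else is an assumption.

```python
from urllib.request import Request


def build_rdf_request(post_iri, mime_type="application/rdf+xml"):
    """Build (but do not send) a GET request that asks for RDF via the Accept header."""
    return Request(post_iri, headers={"Accept": mime_type}, method="GET")


# Placeholder IRI; a real post IRI would go here.
request = build_rdf_request("http://example.org/post/123")
print(request.get_method(), request.full_url, request.get_header("Accept"))
```

Passing the request to `urllib.request.urlopen` would actually fetch the data. A server that supports content negotiation can then return RDF to this client and HTML to a browser, both from the same IRI.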

Saturday, February 18, 2012

I am happy that the Chemical blogspace is back in business. There were issues with the database earlier this week, but the website is back up.
I am happy because of the pointers to interesting discussions and papers. For example, this funny entry in PubChem:

Really interesting too, is the difference in InChI between the Chemical blogspace and the PubChem entry.

Oh, and welcome back ChemBark! Check that URL; ChemBark is one of the oldest blogs on Chemical blogspace. Id = 8!

Friday, February 17, 2012

Below are the slides of the Chemical Interoperability presentation I gave this morning in Utrecht in a parallel meeting of the NBIC. The name was copied from the suggestion by Christine, who invited me.

BridgeDB was recently published (doi:10.1186/1471-2105-11-5) and is a combination of Open Source, a web service, and Creative Commons-licensed mapping data that links many bio- and cheminformatics-related databases. To learn more about the BridgeDB source code, I created a Bioclipse manager (see doi:10.1186/1471-2105-10-397) to make BridgeDB functionality (both the library and the web service) accessible in the JavaScript environment. This is one of the scripts you can now try (of course, this is not how it will be used in Open PHACTS):

As you may or may not know, I'm on the editorial board of the yet-to-kick-off journal Open Research Computation (ORC, official website). People are scared to submit, and even the editors are reluctant to submit work. I myself have used the excuse of having no time for not submitting anything yet.

Indeed, application papers are the extra sugar, but projects and project deadlines favor a slightly different kind of paper. And, some uncertainty lies in the fact that ORC may not reach the same impact the NAR database special issue has.

However, I call upon everyone in the Open Science community to submit a paper to ORC, describing your documented software and how it is tested. We need a sufficiently filled pipeline to make this happen.

Ideas

your CRAN/BioConductor package

your software you used for a data analysis of an already published paper (clue: it gives you a chance to cite that paper in a meaningful way)

your Cytoscape, Bioclipse, Taverna plugin that was too small for a BMC Bioinformatics / JChemInf paper

Well, surprise me! If you are uncertain about the minimal publishable unit for ORC, please contact Cameron.

Sunday, February 05, 2012

Since we are getting more and more in trouble with SourceForge :( I started looking into the more standard git code review environment, called Gerrit, so that we can use that for the CDK. That took a huge learning curve, lurking, googling, seeing what Avogadro was doing (resulting in my first ever, trivial Avogadro patch), and a major headache. This is what my patch looks like in Avogadro's Gerrit install:

As you can see, Marcus reviewed my patch and approved it. I am not sure if they have configured Gerrit to automatically push to GitHub, but that is an option.

It turns out I could not find documentation on how to set up Gerrit for a GitHub project, but I ended up installing it anyway. That basically consists of setting up MySQL (or equivalent) with a user account and database, creating a Linux account, and then following the install instructions for the .war file. The next step is to register an account; it nicely picks up a Google Account, but OpenID seems to be supported too. The first account automatically becomes the administration account, and that is a good choice indeed.

Finding the right documentation for creating a new project from an existing project was tricky, and I ended up with this instruction. However, typing the second step in that:

    at org.eclipse.jgit.transport.PackParser.verifySafeObject(PackParser.java:959)
    at org.eclipse.jgit.transport.PackParser.whole(PackParser.java:940)
    at org.eclipse.jgit.transport.PackParser.indexOneObject(PackParser.java:858)
    at org.eclipse.jgit.transport.PackParser.parse(PackParser.java:467)
    at org.eclipse.jgit.storage.file.ObjectDirectoryPackParser.parse(ObjectDirectoryPackParser.java:178)
    at org.eclipse.jgit.transport.ReceivePack.receivePack(ReceivePack.java:832)
    at org.eclipse.jgit.transport.ReceivePack.service(ReceivePack.java:665)
    ... 15 more

It seems I am running into this known bug :( I added my experience with Gerrit 2.2.2 to the report, but the original issue is over a year old. There are few comments, let alone a workaround, so this project is now on hold... :(

Update: I figured out how to set up a git repository manually on the Gerrit server, and managed to push a patch for review:
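For those who have not used Gerrit before: a patch is submitted for review by pushing to the special `refs/for/<branch>` reference instead of to the branch itself. The sketch below only builds that command. The remote name `origin` and the branch `master` are assumptions and not necessarily what I used here, and the function name is made up for this example.

```python
def gerrit_review_push_command(remote="origin", branch="master"):
    """Build the git command that pushes HEAD to Gerrit's review ref for a branch."""
    return ["git", "push", remote, "HEAD:refs/for/" + branch]


command = gerrit_review_push_command()
print(" ".join(command))
```

Running it for real would be a matter of `subprocess.run(command, check=True)` inside a clone that has the Gerrit server configured as a remote.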

This blog deals with chemblaics in the broader sense. Chemblaics (pronounced chem-bla-ics) is the science that uses computers to solve problems in chemistry, biochemistry and related fields. The big difference between chemblaics and areas such as chem(o)?informatics, chemometrics, computational chemistry, etc, is that chemblaics only uses open source software, open data, and open standards, making experimental results reproducible and validatable. And this is a big difference!

About Me

Assistant professor at the Dept of Bioinformatics - BiGCaT at NUTRIM, Maastricht University, studying biology at an unsupervised and atomic level. Open Science is my main hobby resulting in participation in, among many others, Bioclipse, CDK and WikiPathways. ORCID:0000-0001-7542-0286. Posts on G+ are personal.
