
Sunday, January 29, 2012

Moving country is exhausting. Living in a house full of boxes for a few weeks. Finding a house. Changing culture. Maybe it's a linguistic thing, but EU countries do not share the same culture. OK, we too have a McDonald's on every corner, but that's about it. But returning to The Netherlands was a culture shock. A shock? Yes. I thought I knew the country I lived in for most of my life.

Then, switching position. Posthopping (=post-doc here and there, attempting to find some local optimum where you both work on exciting things and try to set up a research group) around Europe (I have pensions in four EU states now), while trying to keep writing papers and, on top of that, trying to do something that actually has impact on our science, means double work for the last three months of one post-doc position and the first three months of the next: finding your way around at the new university, while finishing those studies that were almost finished, in random, unpredictable order.

And, of course, being annoyed if your prime minister then claims he sometimes cannot get his work done in 40 hours. Well, one would actually think that the prime minister of a country in an economic crisis, with people eating up all their hard-worked-for savings just to get by, would do his very best to turn the future of the country around... oh well...

Sometimes I really wonder what I'm doing.

And then, in a spare hour here and there, doing something for myself. Like writing up this post, in an attempt to give it all a place. Or finishing up a further paragraph of my book(let), or working on my contributions to the Pharmaceutical Bioinformatics book (molecular representation, semantic web for the life sciences). For my own Groovy Cheminformatics book(let): seventy more pages, and it's a book. Hard-cover, and I can start touring around Europe. BTW, I enjoy and can recommend reading Reinventing Discovery. Done the first 30 pages or so, and I keep wondering how those examples can be scaled down to cheminformatics.

Sometimes I really wonder why I keep working in an area that everyone just takes for granted and hardly cares about.

I'm tired, and this is slowly becoming a really boring and depressing blog post. That's a shame, because I have had a really great time with Roland Grafström and Bengt Fadeel, working among and with one of the greatest, most enthusiastic research teams I have seen around Europe. Having to leave that makes me sad too. In fact, I have never ever been homesick, and now, going back to the country I grew up in, I am homesick. Well, it's a feeling I don't like.

Weirdly, with my many really exciting research ideas, my exciting daily work at BiGCaT (which is now in Open PHACTS), and the network in The Netherlands, I have much to enjoy here. Yes, it is again hopping to another application area of cheminformatics, after the interaction of cheminformatics and chemometrics (my thesis), more fundamental cheminformatics, metabolite identification, pharmaceutical research, toxicity, and now back to drug discovery but also the metabolome. But I love the complexity of the metabolome, and have so much detailed insight into the other fields now... oh, the endless possibilities!

And then I remember why I am doing this to myself.

All the endless possibilities! All the research we can do so much better than it is done now! The more accurate answers we can get, and actually being in a situation where we can start identifying the limitations of cheminformatics! Ha, and you know I love to look beyond the edge of the world.

But, then I realize again that I need funding, and wonder how I can live my dream, if no one believes in it.

Not that I have been completely unsuccessful. Au contraire. I did get funding, for travel on many occasions, and recently small bits for research too. But I am really eager to get some funding to have others research the ideas I have, rather than working on them all myself. And eager to get a fixed position. Though I am grateful to Chris Evelo for offering the three-year position I am in now.

Next time someone starts talking about interdisciplinary research, get a trout out of your bag. Interdisciplinary research is a buzz word that only works when you already have a single-disciplinary fixed position. Advice to students: never start an interdisciplinary research topic. You will never be the expert people will want to fund, because interdisciplinary research can simply be done by single-discipline experts in a collaboration, and much better than you could, with your years of experience (n=1).

I also now realize that strengthening another project is also no good for your own career. Your hard work will just go to that project. You can contribute as much to some project as you like, but the corresponding Dr. Who will get the fame. No wonder people rename, brand, and use rather than collaborate. We desperately need #altmetrics.

Yes, I realize this applies to the CDK too. I am trying hard to get recognition for those who deserve it. But who reads a copyright statement? Who remembers blog posts with change logs and statistics on who did the work? Scientists in charge of funding remember only the top person.

Ha, you see that pattern applies to publishing too, right? Scientists only too often care more about the JIF of the top concept, the journal, than the actual work, your actual damn paper.

Oh well, fortunately it's almost Monday again, so that I can focus on science again, and don't have to think about these things.

And, I am deeply grateful to all who publicly support my output. A citation to one of my papers, a public review of my book, a new tool that stands on the shoulders of my work! That makes a difference!

Then I remember again why I am doing all this. I can make a difference.

Monday, January 16, 2012

Yeah, I did it. I made the first new development release (1.5.0) for the CDK after the fork of the stable 1.4.x series. It had to happen after the removal of IMolecule and IMoleculeSet. Well, in fact, while the list just lists all the patches specific for the current master branch, it is still fairly long. Then again, quite a few of my 'commits' are probably just merges.

Just to be clear, this is a development release, and until we freeze this branch, we expect, and actually intentionally add, API changes. Of course, we intend those to improve things, so please shout out your wishes.

First of all, this release removed IMolecule and IMoleculeSet. That was a big effort, explaining why Rajarshi and I have so many commits in this release. This release also adds the LINGO fingerprint type and an atomic signature-based fingerprint. It removes the nonotify module, as the silent module should be used instead. IChemObjectIO now extends Closeable, making it more Java7-friendly. Also noteworthy is the API for raw fingerprints, which are not of fixed length, but have keys and counts. On the implementation side, the QueryAtomContainer now uses local, custom implementations of IAtom and IBond, making the module independent from the data module, freeing the rest of the library for more code clean up.
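To illustrate why extending Closeable is Java7-friendly: anything that implements java.io.Closeable can be used in a try-with-resources statement, which closes it automatically. The sketch below uses only the Java standard library; DemoReader is a hypothetical stand-in I made up for this example, not the real IChemObjectIO API, so take it as an illustration of the language feature rather than of CDK itself.

```java
import java.io.Closeable;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

public class CloseableReaderSketch {

    // Hypothetical stand-in for a reader class; NOT a CDK class.
    // It only records what happens to it, so we can see the call order.
    static class DemoReader implements Closeable {
        private final List<String> log;

        DemoReader(List<String> log) {
            this.log = log;
            log.add("open");
        }

        String read() {
            log.add("read");
            return "molecule";
        }

        @Override
        public void close() throws IOException {
            log.add("close");
        }
    }

    // Java 7 try-with-resources: close() is called automatically when the
    // block exits, also when an exception is thrown inside it.
    static List<String> readWithAutoClose() throws IOException {
        List<String> log = new ArrayList<>();
        try (DemoReader reader = new DemoReader(log)) {
            reader.read();
        }
        return log;
    }

    public static void main(String[] args) throws IOException {
        System.out.println(readWithAutoClose()); // prints [open, read, close]
    }
}
```

Before Java 7 the same guarantee needed an explicit try/finally block with a close() call in it; with Closeable, the compiler generates that for you.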

Sunday, January 15, 2012

Six months was not quite the amount of time I anticipated between the third and fourth edition, but I finally managed to upload edition 1.4.7-0 of my Groovy Cheminformatics book. The first three editions sold 37 copies, including two for myself. Enough to feel supported and to continue working on it.

So, this new edition is again thicker, summing up to 152 pages now, which is 28 pages more than the 3rd edition. Indeed, the table of contents is more than half a page longer in itself, though, just barely, still fitting on four pages. In fact, I had to remove one (new) subsection title, because it would otherwise take two further pages.

The new content is again a mix of sections and chapters. While writing new chapters, I find myself realizing I need to cover more basics. Those typically get added as new sections. I did not get many feature requests, except for one email pointing out that the text promised to explain how to interpret and handle failing atom type perception, which explains one of the new sections. The full list of new content is:

Section 2.1.4: explaining the three flavors of atomic coordinates

Extended Section 2.2: added detail about electron counts of bonds (partly in reply to this post by Rich)

Chapter 6 "IChemObjectBuilders": four pages explaining the four alternative builders CDK 1.4.7 has

Section 7.8: a new section with recipes on how to post-process read input, for now discussing MDL molfiles only. It talks about what information is present in the file format, and what steps must be undertaken to add missing information

Section 8.2.4 "No atom type perceived?!"

Section 11.4: describes how to depict aromatic rings

Section 11.5: describes how to change the background color of depictions

Section 13.4: explains how to calculate the Van der Waals volume of molecules

Section 18.1.3: discussing the API improvement in the iterating readers

Appendix C: a list of all descriptors provided by the CDK

Appendix D: a list of file formats known by the CDK, indicating which have readers and writers

On top of that, I improved other bits of the book too, such as the resolution of the depictions of molecules, as well as those of various diagrams. Also the number of scripts has seriously gone up, from 94 to 134!

Appendix C is a prelude to a chapter I am already writing, but have not finished yet: a chapter about descriptor calculation. But since I just started a new post-doc position, it may take another six months for that chapter to make it into print.

Sunday, January 08, 2012

I found in the Groovy JChemPaint repository a script I had not blogged about yet, explaining how to change the default background color. It's fairly simple, and just uses parameters. Starting from the common pattern to set up a renderer, you set the background parameter:

    backgroundColor = Color.lightGray;
    model = renderer.getRenderer2DModel()
    model.set(
      BasicSceneGenerator.BackgroundColor.class,
      backgroundColor
    )

The full script can be found here. The resulting output looks like that given below.


This blog deals with chemblaics in the broader sense. Chemblaics (pronounced chem-bla-ics) is the science that uses computers to solve problems in chemistry, biochemistry and related fields. The big difference between chemblaics and areas such as chem(o)?informatics, chemometrics, computational chemistry, etc, is that chemblaics only uses open source software, open data, and open standards, making experimental results reproducible and validatable. And this is a big difference!

About Me

Assistant professor at the Dept of Bioinformatics - BiGCaT at NUTRIM, Maastricht University, studying biology at an unsupervised and atomic level. Open Science is my main hobby resulting in participation in, among many others, Bioclipse, CDK and WikiPathways. ORCID:0000-0001-7542-0286. Posts on G+ are personal.

Cookies

In the EU there is a directive upcoming requiring websites to warn people about HTTP cookies. This website uses the Blogger.com platform, Google Adsense (not that it is actually paying anything significant), and a few scripts to count how often a blog post was tweeted, using Topsy and LinkedIn. These services undoubtedly make use of cookies, which you can disallow in your browser.