
Friday, July 31, 2015

WikiPathways and two estrone-x,y-quinones added to Wikidata

WikiPathways does a lot of curation, with a team that keeps growing. A number of regular jobs are performed weekly by one of a group of some 15-20 curators. On top of that, some curators, e.g. Kristina Haspers, do much more than this weekly task. Since I joined the BiGCaT team of Chris Evelo in Maastricht, I have been looking into the metabolites and other small molecules, and have done quite a bit of work to make that information machine readable. See, for example, these open notebook science posts.

This curation is partly supported by tools, e.g. bots and tests. The tests, among other things, run nightly on a Jenkins instance (in various configurations). One of the bots creates this report, which Martina Kutmon recently reminded me of. Starting at the end of that report, I browsed it for metabolites that were not recognized (for various reasons). My eyes fell on two compounds in the estrogen metabolism pathway, originally created by Pieter Giesbertz: estrone-2,3-quinone and estrone-3,4-quinone (in green):

The website was not showing mappings to other databases for the PubChem cross-references. A quick check confirmed that HMDB, KEGG and ChEBI did not have matching compounds. HMDB does have an entry for one of them, judging by the name, but its chemical graph has undefined stereochemistry. That certainly explains why it did not map to the PubChem compound ID. And, indeed, PubChem does have the HMDB entry as a substance, but not linked to a compound. So, I added both to Wikidata: Q20739847 and Q20742851.

Then, when I make the next metabolite ID mapping database for BridgeDb, it will have mappings from the WikiPathways cross-references for these two compounds to, at the time of writing, ChemSpider, and for one of the two to its CAS registry number. Please also note that Wikidata allowed me to store the source of this information.
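To give an idea of what such a mapping extraction could look like, here is a minimal Python sketch. It is not the code used for the BridgeDb mapping database; it assumes the public Wikidata SPARQL endpoint at query.wikidata.org and the Wikidata properties P662 (PubChem CID), P661 (ChemSpider ID) and P231 (CAS Registry Number). The function names and the User-Agent string are made up for illustration.

```python
import json
import urllib.parse
import urllib.request

# Assumption: the public Wikidata SPARQL endpoint, which predefines
# the wd: and wdt: prefixes used in the query below.
WIKIDATA_SPARQL = "https://query.wikidata.org/sparql"

# Wikidata items for the two estrone quinones mentioned in the post.
ITEMS = ["Q20739847", "Q20742851"]

# Identifier properties of interest.
PROPS = {
    "P662": "PubChem CID",
    "P661": "ChemSpider ID",
    "P231": "CAS Registry Number",
}


def build_query(items, props):
    """Build a SPARQL query listing the given identifier properties per item."""
    item_values = " ".join(f"wd:{qid}" for qid in items)
    prop_values = " ".join(f"wdt:{pid}" for pid in props)
    return (
        "SELECT ?item ?prop ?value WHERE { "
        f"VALUES ?item {{ {item_values} }} "
        f"VALUES ?prop {{ {prop_values} }} "
        "?item ?prop ?value . }"
    )


def parse_bindings(result):
    """Turn SPARQL JSON results into {item: {identifier name: [values]}}."""
    mappings = {}
    for row in result["results"]["bindings"]:
        # URIs end in the Q or P identifier, e.g. .../entity/Q20739847
        qid = row["item"]["value"].rsplit("/", 1)[-1]
        pid = row["prop"]["value"].rsplit("/", 1)[-1]
        name = PROPS.get(pid, pid)
        mappings.setdefault(qid, {}).setdefault(name, []).append(row["value"]["value"])
    return mappings


def fetch_mappings(items=ITEMS, props=PROPS):
    """Run the query against the endpoint (needs network access)."""
    params = urllib.parse.urlencode(
        {"query": build_query(items, props), "format": "json"}
    )
    request = urllib.request.Request(
        f"{WIKIDATA_SPARQL}?{params}",
        headers={"User-Agent": "mapping-sketch/0.1 (illustrative example)"},
    )
    with urllib.request.urlopen(request) as response:
        return parse_bindings(json.load(response))
```

Calling `fetch_mappings()` returns, per Wikidata item, whatever identifiers are stored at that moment, so the result will change as others add more cross-references to these two items.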


This blog deals with chemblaics in the broader sense. Chemblaics (pronounced chem-bla-ics) is the science that uses computers to solve problems in chemistry, biochemistry and related fields. The big difference between chemblaics and areas such as chem(o)?informatics, chemometrics, computational chemistry, etc, is that chemblaics only uses open source software, open data, and open standards, making experimental results reproducible and validatable. And this is a big difference!

About Me

Assistant professor at the Dept of Bioinformatics - BiGCaT at NUTRIM, Maastricht University, studying biology at an unsupervised and atomic level. Open Science is my main hobby resulting in participation in, among many others, Bioclipse, CDK and WikiPathways. ORCID:0000-0001-7542-0286. Posts on G+ are personal.
