
Sunday, April 16, 2017

First (I have never blogged much about risk and hazard), I am not a toxicological expert nor a regulator. I have the deepest respect for both, as these studies are among the most complex I am aware of. They make rocket science look dull. However, I have quite some experience with the relation between chemical structure and properties, and with knowledge integration, which is a prerequisite for understanding that relation. Nothing I do says what the right course of action is. Any new piece of knowledge (or technology) has pros and cons. It is science that provides the evidence to support finding the right balance. It is science I focus on.

The case
The AD national newspaper reported spilling of the compound with the name GenX in the environment and reaching drinking water. This was picked up by other newspapers, like de VK. The chemistry news outlet C2W commented on the latter on Twitter:

Translated, the tweet reports that we do not know if the compound is dangerous. Now, to me, there are then two things: first, any spilling should not happen (I know this is controversial, as people are more than happy to repeatedly pollute the environment, just because of self-interest and/or laziness); second, what do we know about the compound? In fact, what is GenX even? It certainly won't be "generation X", though we don't actually know the hazard of that either. (We have IUPAC names, but just like with the ACS disclosures, companies like to make up cryptic names.)

But having worked on predictive toxicology and data integration projects around toxicology, and simply having a chemical interest, I started searching for what we know about this compound.

A side topic... if you have not looked at hypothes.is yet, please do. It allows you to annotate (yes, there are more tools that allow that, but I like this one), which I have done for the VK article:

I had a look around on the web for information, and there is not a lot. A Wikidata page with further identifiers then helps with tracking your steps. Antony Williams, previously of ChemSpider fame and now working on the EPA CompTox Dashboard, added the DTX substance IDs, but the entries in the dashboard will not show up for another bit of time. For FRD-903 I found growth inhibition data in ChEMBL.

I reported this slide, as the worry seems to be about drinking water, so oral toxicity seems appropriate (note, this is only acute toxicity). The LD50 is the median lethal dose, but is only measured for mouse and rat (these are models for human toxicity, but only models, as humans are just not rats; well, not literally, anyway). Also, >1 gram per kilogram body weight ("kg bw"; assumption) seems pretty high. In my naive understanding, the rat may be the canary in the coal mine. But let me refrain from drawing any conclusions. I leave that to the experts on risk management!

Experts like those from the Dutch RIVM, which wrote up this report. One piece of information they say is missing is that of biodistribution: "waar het zich ophoopt", or in English, where the compound accumulates.

Friday, April 14, 2017

I believe Stu Borman was the first to cover the Division of Medicinal Chemistry’s First Time Disclosures symposium for C&EN, but it was Carmen Drahl who began the practice of hand-drawing and tweeting the clinical candidates as they were disclosed in real time. This seems like an oddball practice to folks who aren’t at the meeting. Why not just take a picture of the relevant slide? Well, that’s against the rules: There are signs all over the ACS National Meeting stating that photos, video, and audio recording of presentations are strictly prohibited. In San Francisco, symposium organizer Jacob Schwarz repeatedly reminded attendees that this was the case. Carmen’s brilliant idea to get around this rule was to simply draw the structures as they were presented, snap a photo, and then tweet it out.

I’ve inherited the task since Carmen left the magazine a couple of years ago. I find it incredibly stressful. For an event that’s billed as a disclosure, the actual disclosing is fairly fleeting. The structures are often not on the screen for very long, and I’m never confident that I’ve got it 100% right. Last year in San Diego I tweeted out one structure and I heard the following day from Anthony Melvin Crasto, a chemist in India, that based on the patent literature he thought I had an atom wrong. I was certain that I had written this structure correctly, so I contacted the presenting scientist. He had disclosed the wrong structure!

I agree that there should be some sort of database established afterwards, and I think you all have done great work on that front. I think you’ll find the pharmaceutical companies reluctant to help you out in any way. They guard these compounds so fiercely that it often makes me wonder why we have this symposium to begin with.

At the American Chemical Society meetings drug companies disclose recent new drugs to the world. Normally, the chemical structures are already out in the open, often as part of patents. But because these patents commonly discuss many compounds, the disclosures are a big thing.

I drew the structures in Bioclipse 2.6.2 (which has CDK 1.5.13) and copy-pasted the SMILES and InChIKey into the spreadsheet. Of course, it is essential to get the stereochemistry right. The stereochemistry of the compounds was discussed on Twitter, and we think we got it right. But we cannot be 100% sure. For that, it would have been hugely helpful if the disclosures included the InChIKeys!

As I wrote before, I see Wikidata as a central resource in a web of linked chemical data. So, using the same code I used previously to add disclosures to Wikidata, I created Wikidata items for these compounds, except for one that was already in the database (see the right image). The code also fetches PubChem compound IDs, which are also listed in this spreadsheet.
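To give an idea of what a lookup against Wikidata can look like, here is a minimal sketch in Python. It is not the Bioclipse code mentioned above; the function names are made up for this illustration. It assumes the Wikidata properties P235 (InChIKey) and P662 (PubChem CID) and the public query service endpoint, and it only builds the query and URL, without sending anything.

```python
import re
from urllib.parse import urlencode

# Public Wikidata Query Service endpoint.
WDQS_ENDPOINT = "https://query.wikidata.org/sparql"

# A standard InChIKey has 14 + 10 + 1 uppercase letters, separated by hyphens.
INCHIKEY_PATTERN = re.compile(r"^[A-Z]{14}-[A-Z]{10}-[A-Z]$")


def build_lookup_query(inchikey: str) -> str:
    """Return a SPARQL query finding Wikidata items with this InChIKey.

    Hypothetical helper for this post; assumes P235 = InChIKey and
    P662 = PubChem CID.
    """
    if not INCHIKEY_PATTERN.match(inchikey):
        raise ValueError(f"Not a well-formed InChIKey: {inchikey!r}")
    return (
        "SELECT ?compound ?pubchemCid WHERE {\n"
        f'  ?compound wdt:P235 "{inchikey}" .\n'
        "  OPTIONAL { ?compound wdt:P662 ?pubchemCid . }\n"
        "}"
    )


def build_lookup_url(inchikey: str) -> str:
    """Return the GET URL that would run the query and ask for JSON results."""
    params = {"query": build_lookup_query(inchikey), "format": "json"}
    return WDQS_ENDPOINT + "?" + urlencode(params)


if __name__ == "__main__":
    # Water is used as a neutral example compound here.
    print(build_lookup_query("XLYOFNOQVPJJNP-UHFFFAOYSA-N"))
```

Checking for an existing item first is what avoids creating a duplicate for a compound that is already in the database.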

The Wikidata IDs link to the SQID interface, giving a friendly GUI, one that I actually brought up before too. That said, until people add more information, it may be a bit sparsely populated:

But others are working on this series of disclosures too, so keep an eye on this blog post, as others may follow up with further information!

Saturday, April 01, 2017

Enjoying my Saturday morning (you can actually track down that I write more blog posts then than at any other time of the week) with a coffee (no, not beer, Christoph). Wanted to complete my Scholia profile (great work by Finn, arxiv:1703.04222, happy to have contributed ideas and small patches) a bit more (or perhaps that of the Journal of Cheminformatics), as that relaxes me, and nicely complements rerunning some Bioclipse scripts to add metabolite/compound data to Wikidata (e.g. this post). Because this afternoon I want to do some serious work, like write up outlines for a few cool grant applications. And if lucky, I may be able to do a bit of work on this below-the-radar project.

Now, the reason I am blogging this (and meanwhile, adding four new DTXSIDs to Wikidata) is two observations. First, I had not blogged about Bookmetrix yet, a cool project that reports the impact of book chapters. I always considered the ROI on writing book chapters to be not so high, but then I saw the #altmetrics for this chapter:

Five citations is not a lot, but then I do not cite book chapters much either. But look at that number of downloads, 2.39 thousand! Wow!

But there is another angle to that. We regularly report our societal impact, nowadays. It's part of the Dutch Standard Evaluation Protocol, or at least selected by our research institute as something to assess researchers on. Hang on, no, citations are not part of that category. But this is: the chapter is sold for about 50 euro. Seriously? Yes, seriously. And apparently 2.39K people bought this chapter. I am not sure if I need to assume that this is mostly people buying the full book, which would make the chapter a lot cheaper. But the full book reports download numbers of above 50 thousand, so it seems not. Now, let's assume that a good part of the bought copies is via package deals and the average payment is half the price. That may sound high, but we ignore the 50k downloads for the full book to compensate for that.

Doing that math means that our joint book chapter contributed 60k euro to the European market. That's a full job the four of us created with this single book chapter. I'm impressed.
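For those who want to redo that math, here is the back-of-the-envelope calculation as a small Python sketch. The numbers are the ones from the text above; the "half the price" factor is the assumption made there, not a measured value.

```python
# Back-of-the-envelope estimate using the numbers from the post.
chapter_downloads = 2390       # "2.39 thousand" downloads reported by Bookmetrix
list_price_eur = 50            # approximate price of the chapter
average_paid_fraction = 0.5    # assumption: average payment is half the price

estimated_revenue_eur = chapter_downloads * list_price_eur * average_paid_fraction

print(f"Estimated revenue: {estimated_revenue_eur:,.0f} euro")
# prints "Estimated revenue: 59,750 euro", which rounds to the 60k above
```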


This blog deals with chemblaics in the broader sense. Chemblaics (pronounced chem-bla-ics) is the science that uses computers to solve problems in chemistry, biochemistry and related fields. The big difference between chemblaics and areas such as chem(o)?informatics, chemometrics, computational chemistry, etc, is that chemblaics only uses open source software, open data, and open standards, making experimental results reproducible and validatable. And this is a big difference!

About Me

Assistant professor at the Dept of Bioinformatics - BiGCaT at NUTRIM, Maastricht University, studying biology at an unsupervised and atomic level. Open Science is my main hobby resulting in participation in, among many others, Bioclipse, CDK and WikiPathways. ORCID:0000-0001-7542-0286. Posts on G+ are personal.
