
Saturday, October 23, 2010

The Answer to: Are these organic molecules the same?

Ten days ago I asked my readers whether two molecules were the same or not. When I was asked "Are these organic molecules the same?", I had guessed they were not. The people who replied to my post were quite convinced they were, and Peter gave the context of the pub quiz: assumptions may not be correct.

Indeed, I assumed there were hydrogens missing (implicit), and that line corners indicate places where carbons are. But the key to this problem was that I also assumed that the E/Z stereochemistry for the two double bonds was properly defined. Or, more accurately, I assumed that because I was comparing the two molecules, the E/Z stereochemistry for the double bond between the rings was identical in both drawings. We all did.

Under that assumption, these two molecules are indeed not the same. However, if the E/Z stereochemistry is actually not the same for that double bond, ... well, you get the point. Perhaps this was not the best of examples, as it is quite conventional to use 2D coordinates to determine E/Z stereochemistry... we even have a special drawing style to indicate the E/Z stereochemistry is unknown. Then again, how often does the organic chemist really use that?
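To make that convention concrete, here is a minimal sketch of how software might read E/Z from 2D coordinates. It is not taken from any particular toolkit: the function names `side` and `double_bond_config` are made up for illustration, and it assumes you already know which substituent on each end of the double bond has the highest priority (the priority ranking itself is not handled here).

```python
from typing import Tuple

Point = Tuple[float, float]


def side(c1: Point, c2: Point, p: Point) -> float:
    """Signed area test: positive if p lies to the left of the line c1 -> c2,
    negative if it lies to the right, zero if p is on the line."""
    return (c2[0] - c1[0]) * (p[1] - c1[1]) - (c2[1] - c1[1]) * (p[0] - c1[0])


def double_bond_config(a: Point, c1: Point, c2: Point, b: Point,
                       tol: float = 1e-6) -> str:
    """Classify the double bond c1=c2 from 2D coordinates.

    a is the highest-priority substituent on c1 and b the highest-priority
    substituent on c2 (assumed to be known by the caller).
    Returns "Z" if a and b are drawn on the same side of the bond,
    "E" if they are on opposite sides, and "undefined" if either
    substituent is drawn in line with the bond.
    """
    s1 = side(c1, c2, a)
    s2 = side(c1, c2, b)
    if abs(s1) < tol or abs(s2) < tol:
        return "undefined"
    return "Z" if s1 * s2 > 0 else "E"


# Hypothetical coordinates for a double bond drawn along the x axis.
c1, c2 = (0.0, 0.0), (1.0, 0.0)
print(double_bond_config((-0.5, 0.87), c1, c2, (1.5, 0.87)))   # same side
print(double_bond_config((-0.5, 0.87), c1, c2, (1.5, -0.87)))  # opposite sides
```

The point of the sketch is that the answer depends entirely on how the drawing was laid out: flip one substituent in the picture and the label flips with it, whether or not the author meant anything by it.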

A more convincing example was also drawn in the pub, and I should have given that one. Peter posted those later. These involve spiro compounds. Here too, I assumed that the stereochemistry around the spiro carbon was identical. My bad. There was one person in the pub who spotted the problem: David Jessop.

The underlying issue, of course, is those stupid 2D drawings. Jmol has been around for more than 10 years now (and non-free tools too), and we still use 2D drawings... why, oh why? 3D coordinates and explicit hydrogens, that is what our molecular data should be represented with. Henry does this right, over and over again, in his brilliant blog. Well, most of the time anyway. Look for the 'Click for 3D' statements behind the figures, and just give it a try, e.g. in this post on I(CN)7.

4 comments:

I've sat and read this posting 4 times and I still fail to see the argument that is being made here. The discussion of E/Z geometries is meaningless in this case. To designate geometry to an alkene you must be able to assign a set of priorities to the substituents and then assess the relative positions. On both of the alkenes in this molecule the cyclohexyl rings mean that on one (or both) ends of the alkene there are substituents with exactly the same priority, which precludes assignment of geometric stereochemistry.

I understand that the representation of the trisubstituted alkene has a direct bearing on how you interpret the stereochemical wedge which has been included on the methyl group. I could argue that the alkenes are perfectly correctly represented and it is the application of the stereochemical wedge which is incorrect.

I'm not saying that 2D representation doesn't have drawbacks, but what is being highlighted here is really a case of authors not taking care to ensure that the structure depictions they generate convey the information that they intend, and this will be a problem no matter how you choose to depict molecules.

Egon, I take your point completely. While statistically there probably is one issue of a journal out there where two of the papers have correct, completely unambiguous representations of molecules, it is sadly likely to be tricky to find. And I know that I can list several cases where I have been appalled by the representations included in papers. But I would say that pretty much all of these cases could have been avoided if the 2D representation had been produced correctly.

Because 3D structures are more complex than 2D drawings (I know that this is a generalisation), more time and care need to be put into their generation and also their interpretation. I think that in many cases the flawed attitude to structure representation in 2D would be continued in the 3D models.

In many respects the problem is that many chemists don't fully appreciate the structure as a fundamental unit of knowledge transfer. In many cases they weight arbitrary (derivative) information such as names or registry identification numbers as being of equal importance to a structure.

And I think that there is a need for a huge campaign to educate the wider chemical community about best practice, highlighting flaws in some of the ways in which we think about data, its storage and reuse.

