As part of Materials Week at the University of Warwick, I was asked to talk about social media and how it is used by scientists (and of course I threw in a bit about how journals use it too). Because it is what I know best, the majority of my talk focused on Twitter, with a side of Tumblr, Facebook, YouTube and blogging. And I figured that the best way to find out the reasons why scientists use Twitter was to ask them, and so…

Chemtweeps! I'm giving a talk on Monday & a large part of it will be about social media and Twitter in particular… (1/2)

The response was great (and was kindly Storified by @CrimsonAlkemist). I’m certainly not the first to throw this question out there on Twitter; I know others have done it before and people have also blogged about why they use Twitter too (for example, see this post by @Alexis_Verger). Please do point to other such posts in the comments as well – I imagine there are plenty of others out there.

Anyway, the presentation that I threw together can be found by clicking on the image below; it’s essentially a series of screenshots that I talked around. I think the talk needs refining somewhat, but this is a good place to start and I’ll hone it from here if I give it again in the future… if you wish to use any of the slides yourself, please feel free to do so.

The problems with impact factors are well known – I could give you a long list of things to read that explain why, but just start with this blog post from Stephen Curry and go from there.

I have a slide that I use in my talks that sums up one particular problem – that the impact factor (IF) of any given journal tells you absolutely nothing about any given article in that journal. For example, the current IF of Organometallics is just over 4, whereas Nature’s is more than 10 times that at just over 41. But does that mean that every Nature paper is 10 times ‘better’ than every Organometallics paper? (Answer: of course not! – and how on Earth would you measure ‘better’ anyway?). It also doesn’t guarantee that a particular Nature paper will have received more citations than any given Organometallics paper (after all, a wide distribution of citations makes up an IF). Considering the perverse incentives in science, however, I wonder how many people would rather have on their CV an Organometallics paper that has received 50 citations in a year instead of a Nature paper that has garnered only 10 in the same period of time?

Anyway, I digress. The slide I have looks at things from a different point of view. Wouldn’t it be interesting if you could take exactly the same paper and publish it at roughly the same time in a bunch of different journals? Take your fancy-metal-catalyzed-cross-coupling-based synthesis of tenurepleaseamycin and submit it to (and have it published in) Angewandte, JACS, Nature Chem, Science, JOC, Tet Lett and Doklady Chemistry and then sit back and see how the citations roll in. Of course, it’s the same paper – it’s not a better paper in one journal than another, so it will get cited roughly equally in all journals, right? Well, all you can really do is speculate, because if you did try to do exactly that you’d end up really annoying some chemistry-journal editors and you might not get the paper published anywhere (well, I can think of a few places that would probably still take it, but discretion is the better part of valour and all that).

Well, never fear! The experiment has been done. Although it wasn’t an experiment, it wasn’t done for the purpose of comparing citations in different journals and it’s happened more than once. It turns out that in medical publishing, editorials/white papers occasionally get published in more than one journal. So, say hello to ‘Clinical Trial Registration — Looking Back and Moving Ahead’. A few years back, I looked at the citations this paper had received in a range of different journals and the IFs of those journals – the slide from my talk with all of the data on it is shown below.

There’s a pretty good correlation between the number of citations that this identical paper received in each journal and the IFs of those journals. Of course, perhaps more people read the New England Journal of Medicine than the Medical Journal of Australia and so a wider audience will likely mean a wider potential-citation pool. Whatever the reasons (and it’s not all that difficult to come up with others), the slide shows how silly it is to assume that the IF of a journal has any bearing on how good any particular paper in that journal is. As I have said before, the only way to figure out if a paper is any good is to actually read the damn thing – the name (or IF) of the journal in which a paper is published should never act as a proxy for how awesome (or not) a paper is.

So, as well as pointing out one specific flaw in the IF, showing this slide also allows me to make a joke about how the correlation would be even better if it wasn’t for some (imaginary, I hasten to add) Croatian citation ring… I apologize if I have offended any Croatian doctors who happen to be reading this… but the joke usually gets a laugh.

Over at my day job, I recently looked at the distribution of citations that 2012 and 2013 Nature Chemistry papers (Articles, Reviews and Perspectives) received in 2014 – essentially the citations that are used to calculate the 2014 impact factor of the journal. I would recommend having a read of that post before ploughing through this one. I’ve now done the analysis for five other general chemistry journals, just to see how they all stack up. In each case, the data is from Web of Science (All Databases) and is refined by document types ‘Article’ and ‘Review’. In the Sceptical Chymist post I also did the calculation for Nature Chemistry after removing the Review articles from the data, but haven’t done that here.
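To make the link between a citation distribution and an impact factor concrete, here is a minimal Python sketch. It assumes a simplified IF, namely the mean number of citations received in one year by the papers from the two preceding years; the official calculation has its own rules about which items count. The citation counts are hypothetical and are not data from any of the journals discussed here.

```python
from statistics import mean, median

def simple_impact_factor(citations_per_paper):
    """Simplified IF: mean citations per paper (an assumption, not the official formula)."""
    return mean(citations_per_paper)

# Hypothetical citation counts for ten papers; one highly cited paper skews the mean.
hypothetical_citations = [0, 1, 1, 2, 2, 3, 4, 5, 12, 70]

journal_mean = simple_impact_factor(hypothetical_citations)
journal_median = median(hypothetical_citations)

# Number of papers that actually reach the journal-level average.
papers_at_or_above_mean = sum(1 for c in hypothetical_citations if c >= journal_mean)
```

In this made-up set only two of the ten papers reach the journal-level mean, which is the sort of skew that plotting the full distribution makes visible.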

So, here is what Nature Chemistry looks like:

And here’s JACS, Angewandte Chemie (the International Edition), Chemical Science, Chem Comm and Chem Eur J (note that because of the wildly different volume of content across the 6 journals, the scale on the y-axis changes quite significantly – as does the smoothness of the distribution; also, for Chem Comm and Chem Eur J, I have included magnified sections of the later portions of the distributions):

One way that you can compare journals that publish vastly different numbers of papers is to look at the percentage of published items that have at least a given number of citations. For example, each journal has 100% of papers with 0 or more citations, but what does the percentage drop to when you consider papers with 1 or more citations? If 5% of a journal’s papers have 0 citations in 2014, then the second point plotted on the graph would appear at 95% (i.e., 95% of papers would have one or more citations). If you do this analysis for the 6 journals above, this is what you find:

If you stack these graphs on top of one another, you can then compare (for the most part) across the 6 journals:

It’s interesting to note that JACS compares favourably to Angewandte, even though Angewandte publishes far more review-type articles, and also note how Chemical Science is not all that far behind Angewandte when you do this sort of analysis.
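For anyone who wants to reproduce the ‘percentage of papers with n or more citations’ curves described above, here is a minimal Python sketch. The function name and the example citation counts are hypothetical; the only thing taken from the description is the calculation itself.

```python
from collections import Counter

def percent_with_at_least(citation_counts):
    """Entry n of the result is the percentage of papers with n or more citations."""
    total = len(citation_counts)
    if total == 0:
        return []
    histogram = Counter(citation_counts)
    remaining = total  # papers with at least n citations, starting at n = 0
    percentages = []
    for n in range(max(citation_counts) + 1):
        percentages.append(100.0 * remaining / total)
        remaining -= histogram.get(n, 0)  # drop the papers with exactly n citations
    return percentages

# Hypothetical journal: 20 papers, one of which (5%) has no citations.
example_counts = [0] + [1] * 4 + [2] * 5 + [3] * 6 + [5] * 4
curve = percent_with_at_least(example_counts)
```

Because every curve starts at 100% regardless of how many papers a journal publishes, curves for journals of very different sizes can be drawn on the same axes.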

So, here’s my obligatory Back-to-the-Future Day post and, because it is me doing this, it’s obviously about chemistry publishing. I figured I’d compare one issue of a journal published in 1985, with an issue published in 2015. Because the last time I looked at chemistry publications over a particular period of time I chose JACS, I thought I’d do Angewandte Chemie (the English edition) this time so that my friends over at Wiley don’t feel all left out. So, I looked at the October issue from 1985 (yes folks, there was only one issue of Angewandte each month in those prehistoric times) and compared it with the October 26th issue from 2015 (which is 5 days from now – and that seems appropriate considering the time-travel inspired nature of this post).

I just looked at the ‘Communications’ section of the issue in each case (that’s 27 papers from the 1985 issue and 51 papers from the 2015 issue) and this is what I found from these – admittedly tiny – samples:

1. Papers now have more authors on them than 30 years ago. The average (mean) for a paper in the 1985 issue was 3.07 authors, whereas it is more than double that in the 2015 issue at 6.37 authors per paper (the medians are 3 and 6, respectively).

2. Papers are now longer than they were 30 years ago. The average page extent for a paper in the 1985 issue was 2.15 pages, whereas it is now more than double that in the 2015 issue at 4.86 pages per paper (that’s just based on page ranges, not full printed pages in the journal). For comparison, the medians are 2 and 5, respectively.

3. The geographical spread of corresponding authors is much greater now than it was 30 years ago. In 1985, German authors dominated Angewandte Chemie, but that’s not true anymore it seems – just look at the charts below.

Breakdown of geographical location of corresponding authors in Angewandte Chemie.

As I mentioned above, these are really small samples so do take the analysis with a pinch of salt. That said, @fxcoudert has looked at these trends in more depth in the past and I highly recommend that you go and check out these two blog posts here and here.

This is a follow-up post to yesterday’s that looked at word clouds made up from the titles of JACS papers from the last 115 years.

Jake Yeston commented on Twitter about the lack of catalysis-based words in the clouds. This is something that also caught my eye and I’ve now had a chance to dig a little deeper into this.

The way the word clouds work (the ones you can make using Wordle at any rate) is by counting exact copies of the same word and then scaling the size of the word in the cloud in proportion to the number of times it appears in the input text. So, if you look closely at the word clouds from yesterday’s post, you will see ‘reaction’ and ‘reactions’ both appearing in the same word cloud. Similarly, acid and acids, complex and complexes, study and studies, and so on. Wordle also does not separate hyphenated words, so you will see things like ‘gas-phase’ and ‘electron-transfer’.
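The counting behaviour described above can be mimicked in a few lines of Python. This is only a rough sketch of exact-match counting and is not Wordle’s actual code; the function name and the two example titles are made up, and the lowercasing step is an assumption.

```python
from collections import Counter

def count_exact_words(titles):
    """Count whitespace-separated tokens exactly as written (lowercased)."""
    counts = Counter()
    for title in titles:
        for token in title.lower().split():
            # Strip surrounding punctuation but leave internal hyphens alone.
            token = token.strip(".,;:()[]'\"")
            if token:
                counts[token] += 1
    return counts

# Two made-up titles for illustration.
titles = [
    "Gas-phase reactions of an acid",
    "A reaction of acids in the gas phase",
]
counts = count_exact_words(titles)
```

Here ‘reaction’ and ‘reactions’ end up as separate entries with a count of one each, and ‘gas-phase’ is counted separately from ‘gas’ and ‘phase’.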

What does this mean for catalysis? Well, I started looking through the titles for the 2010-2014 data and found all of the following words (and there are probably other variants that I missed):

This means that catalysis is being spread quite thin and not being lumped together as a single entry in the word clouds. But it gets worse. In the 2010-2014 cloud, if you look carefully you can find ‘palladium-catalyzed’… and remember what I said above about Wordle not separating hyphenated words? Not only is ‘palladium-catalyzed’ counted separately from ‘palladium’ and ‘catalyzed’, but also separately from things like ‘Pd-catalyzed’ too. And obviously you get lots of different ‘X-catalyzed’ terms, such as ‘gold-catalyzed’, ‘Rh-catalyzed’, ‘copper-catalyzed’, and so on. There’s an awful lot of catalysis going on, it just isn’t adequately captured in the word clouds. On the other hand, consider the word ‘synthesis’ — sure, it might lose some of its count to ‘synthetic’, but that’s about it; there aren’t anywhere near as many derivatives of ‘synthesis’ as there are of ‘catalysis’.

To get a sense of how much catalysis (in any and all of its guises) has been published in JACS down the years, I went back to the lists of titles and then searched for ‘catal’ as a fragment. For comparison, I did the same for ‘synth’ and what I found is plotted below.

In the 2000s, ‘catal’ words were almost level with ‘synth’ words, and by the end of the current decade, it looks very much like they will be in the lead. Is this the decline of synthesis?
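A fragment search of the kind described above might look like the following Python sketch. It assumes that a title is counted once if it contains the fragment anywhere, which may not be exactly how the numbers in the plot were tallied; the titles are invented for illustration.

```python
def titles_containing(titles, fragment):
    """Number of titles that contain the fragment, ignoring case."""
    fragment = fragment.lower()
    return sum(1 for title in titles if fragment in title.lower())

# Invented titles standing in for one period of a journal.
example_titles = [
    "Palladium-catalyzed cross-coupling of aryl halides",
    "Total synthesis of a hypothetical natural product",
    "Gold catalysis in the synthesis of heterocycles",
    "Pd-Catalyzed C-H activation",
]

catal_count = titles_containing(example_titles, "catal")
synth_count = titles_containing(example_titles, "synth")
```

Searching on a fragment picks up ‘catalysis’, ‘palladium-catalyzed’ and ‘Pd-Catalyzed’ alike, which is exactly what exact-word counting misses.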

Now, as I pointed out in yesterday’s post, it seems as though chemists really have a thing for acid and acids. Those words dominate the clouds in the early-to-mid part of the 20th century. On Twitter, Cafer Yavuz suggested that ‘base’ and ‘basic’ might be excluded as part of the set of common words, but I don’t think that is the case. Wanting to get a sense of acid vs base, I repeated the ‘catal’/‘synth’ analysis for these words. The results are plotted below:

The analysis is not perfect, partly because ‘base’ and ‘basic’ can have different meanings (more so than acid and acidic), and ‘base’ is also a fragment of ‘based’ which might be adding to its total. Nevertheless, something interesting appears to be happening. When it comes to acids and bases, it seems that the balance of power (in JACS at least) is shifting — where acids once ruled supreme, bases took the crown in the 2000s and seem to be consolidating their position in the current decade.
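The ‘base’ versus ‘based’ caveat is easy to demonstrate. The Python sketch below compares a plain fragment count with a whole-word count using regular-expression word boundaries; the function names and the titles are hypothetical, and this is not necessarily how the analysis above was done.

```python
import re

def count_fragment(titles, fragment):
    """Count every occurrence of the fragment, even inside longer words."""
    return sum(title.lower().count(fragment.lower()) for title in titles)

def count_whole_words(titles, words):
    """Count occurrences of the given words only when they stand alone."""
    pattern = re.compile(
        r"\b(?:" + "|".join(re.escape(w) for w in words) + r")\b",
        re.IGNORECASE,
    )
    return sum(len(pattern.findall(title)) for title in titles)

# Invented titles for illustration.
example_titles = [
    "A Schiff base complex",
    "Graphene-based sensors",
    "Lewis bases in catalysis",
]

fragment_hits = count_fragment(example_titles, "base")
whole_word_hits = count_whole_words(example_titles, ["base", "bases"])
```

In this toy set the fragment search finds three hits but only two of them are about bases; the third comes from ‘based’.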

If you have any questions about the analysis (or other things you want me to look for in the titles), just leave a comment or drop me a line on Twitter. Similarly, if you want the raw data, drop me a line by e-mail, I’m happy to share.

When Nature Chemistry celebrated its 5th anniversary last year, we put together a word cloud (using Wordle) featuring the 150 words that appeared most often in the titles of the papers we had published up to that point. That was a collection of just under 600 papers, but a clear winner did emerge — ‘synthesis’ was the word used in titles more than any other (excluding some common words such as ‘from’, ‘by’, ‘to’, ‘with’, ‘and’, ‘so’, ‘on’…). It seems that a large part of chemistry is still very much about making things, and that reminds me of one of my favourite chemistry quotes:

The Nature Chemistry title-word cloud was not based on a particularly large data set, however, and is also from a very recent period. I wondered if the titles of chemistry papers have changed much over time, and so I decided to look to a journal with a lot more history. I wanted it to be a general chemistry journal to ensure there was no intrinsic bias towards words associated with a particular sub-field within chemistry and so I turned to the Journal of the American Chemical Society (JACS).

The date range I chose is somewhat arbitrary, but round numbers have a certain appeal and so I started at 1900 and worked my way up to 2014, the most recent complete year of JACS papers. This amounted to a little over 168,000 article titles and just shy of 2,000,000 words in total. I may well do more analysis in time, but first of all I decided to break down the data into decades (including a half-decade of 2010-2014 to cover the most recent papers) and look at the most popular 150 words for titles in each given period (excluding the same common words as we did when analysing the titles of Nature Chemistry papers).

Note that the size of each word corresponds to the number of times it appears in titles in that period — the larger it is, the more it is used. I have not combined words with the same root and nor have I combined singular and plural versions of the same word. I have made everything lowercase for the sake of simplicity though (otherwise ‘Synthesis’ appears as a separate entry to ‘synthesis’). Also, the number of papers published varies a lot between decades, so comparing the sizes of words between different clouds is meaningless.
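As a rough illustration of the procedure described above (lowercase everything, drop common words, then take the most frequent words per period), here is a minimal Python sketch. The common-word set is only partly taken from the list mentioned earlier; ‘of’, ‘the’, ‘a’ and ‘in’ are assumptions, as are the function names and the example records.

```python
from collections import Counter, defaultdict

# 'from' ... 'on' come from the list in the text; the rest are assumed.
COMMON_WORDS = {"from", "by", "to", "with", "and", "so", "on", "of", "the", "a", "in"}

def decade_label(year):
    """Map a year to a label such as '1900-1909'."""
    start = (year // 10) * 10
    return f"{start}-{start + 9}"

def top_words_by_decade(records, top_n=150):
    """records is an iterable of (year, title) pairs."""
    counts = defaultdict(Counter)
    for year, title in records:
        for word in title.lower().split():
            if word not in COMMON_WORDS:
                counts[decade_label(year)][word] += 1
    return {decade: c.most_common(top_n) for decade, c in counts.items()}

# Made-up records for illustration.
example_records = [
    (1905, "The Determination of Sulphur in Oil"),
    (1908, "Determination of Sugar in Milk"),
    (1912, "Sulfur and Acid"),
]
top = top_words_by_decade(example_records, top_n=3)
```

Note that this sketch would label the half-decade 2010-2014 as ‘2010-2019’, and that it does nothing about punctuation or word roots.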

This is what I found:

1900-1909

So, chemists at the start of the 20th century (yes, I know the century started on January 1st, 1901, but just go with it) were a determined bunch who liked to study milk, oil, wheat, sugar and urine — perhaps not all at the same time. Also, note the presence of a decent-sized ‘sulphur’. Yes, sulphur, with a ‘ph’. And remember, this is JACS, with all its American-ness. There’s not a hint of a ‘sulf’ to be found in JACS titles in this decade!

1910-1919

Still a healthy dose of determination, but also a lot of acid. And now ‘sulphur’ has become ‘sulfur’ — in fact, there are 143 ‘sulf’-based words and only 17 ‘sulph’ ones in titles from this decade.

1920-1929

Acid still looms large, but a lot of derivatives and compounds now too. Note that there is a lot more preparation than there is synthesis.

1930-1939

Seriously, what is it with chemists and acid? Compounds and derivatives remain popular and it seems as though synthesis is catching up a little with preparation.

1940-1949

The age of synthesis is upon us. And note the appearance of the word ‘spectra’ too. Also, ‘esters’, what’s going on there?

1950-1959

Synthesis remains dominant, but words such as ‘kinetics’ and ‘mechanism’ are growing larger, suggesting that there is an increasing drive to understand reactions as well. And ‘stereochemistry’ rears its head in the cloud for the first time.

1960-1969

Synthesis is not quite as prominent in the 1960s, but still a popular word in the titles of JACS papers. A new (and quite prominent) entry is ‘resonance’, along with ‘magnetic’, and note that both ‘nuclear’ and ‘proton’ are there too, reflecting the growing use of NMR as a technique to characterize chemical compounds. Another notable entry: ‘carbonium’ (the old name for carbocations), which was an active area of research at this time.

1970-1979

Chemists’ fascination with acid finally seems to be wearing off somewhat. And ‘complexes’ is now much more prominent. I suspect that this is a result of host–guest chemistry really taking off in the 1970s and the word ‘complex’ being associated with many more things than just traditional metal-coordination compounds.

1980-1989

There’s a fairly sizeable entry for ‘total’, and the vast majority of the time it is used in the context of ‘total synthesis’, while ‘synthesis’ itself dominates once more. Also note that the popularity of the word ‘via’ is increasing and both ‘novel’ and ‘new’ are well used (‘new’ seems to be a fairly constant presence in titles throughout the decades).

1990-1999

There’s still an awful lot of synthesis going on.

2000-2009

Nanotubes and nanoparticles make an appearance in the top 150 for the first time — nano comes of age? Other notable first-time entries (although small) are ‘supramolecular’, ‘self-assembly’ and ‘quantum’; I’m a little surprised it took so long.

2010-2014

Synthesis remains at the top, but look at the topics creeping into the top 150. ‘Metal–organic’ and ‘framework’ herald the growing popularity of MOFs and, although it’s easy to miss, there is also a little innocuous ‘graphene’ creeping into the picture at the bottom. ‘C–H’, which is usually found in titles in the context of C–H activation, is growing in size too. And finally, chemists’ love of ‘via’ is sealed!

To summarize, here are the top-ten words for each period:

(EDIT added June 3rd: I forgot to mention when I first posted this that for the top-ten lists I did combine simple singular and plural versions of the same word, so ‘reaction’ is actually ‘reaction’ and ‘reactions’ combined. Same goes for study/studies, complex/complexes, acid/acids and some of the others. What I did not do, however, is go beyond that and combine words that share the same root, so ‘synthesis’ and ‘synthetic’ have not been counted together and nor have ‘molecule’ and ‘molecular’, for example.)
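Combining simple singular and plural forms, as described in the note above, can be sketched in Python with an explicit lookup table. The table, the function name and the counts below are hypothetical; an explicit table is used here because a blanket ‘strip the final s’ rule would mangle words such as ‘synthesis’.

```python
from collections import Counter

# Hypothetical lookup table covering the pairs mentioned in the text.
PLURAL_TO_SINGULAR = {
    "reactions": "reaction",
    "studies": "study",
    "complexes": "complex",
    "acids": "acid",
}

def merge_plurals(counts):
    """Fold plural counts into their singular form; leave other words alone."""
    merged = Counter()
    for word, n in counts.items():
        merged[PLURAL_TO_SINGULAR.get(word, word)] += n
    return merged

# Made-up counts for illustration.
raw_counts = Counter({"reaction": 3, "reactions": 2, "synthesis": 4, "synthetic": 1})
merged_counts = merge_plurals(raw_counts)
```

Words that merely share a root, such as ‘synthesis’ and ‘synthetic’, stay separate in this sketch, matching the description in the note.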

Just to give you a sense of scale, if you don’t exclude the really common words, the top-20 words for the last full decade (2000-2009) are shown below (and remember that the words are scaled relative to the number of times they appear – the larger the word, the more times it appears in JACS titles).

So, the most common word in JACS titles is probably ‘of’ or, more meaningfully, ‘synthesis’.

(EDIT added June 3rd: there’s now a follow-up post, with some cautionary notes about word clouds and how they can miss some concepts…)

Cyclohexane is undoubtedly an iconic molecule. Many of us learned to draw it (with varying degrees of proficiency) very early on in our organic chemistry classes as we were introduced to chairs, boats, half-chairs, twist-boats, cis, trans, A-values, conformation and, of course, axial and equatorial. Cyclohexane has six equatorial C–H bonds around the circumference of the ring and six axial C–H bonds, three pointing up and three pointing down.

A 1953 Nature letter by Barton, Hassel, Pitzer and Prelog notes (as shown below) that the labels first suggested for the different C–H bonds on the cyclohexane ring were ɛ (epsilon) for what we now call axial, and χ (chi) for what we refer to as equatorial these days:

The citation (the subscript ‘2’ in the excerpt above) is to a paper by Odd Hassel published in 1943 in something called Tidsskr. Kjemi, Bergv. Met. which turns out to be the Norwegian journal Tidsskrift For Kjemi Bergvesen og Metallurgi (I’m sure that’s cleared it up for you…). Even if my Norwegian was up to scratch (it isn’t), I thought it would be somewhat tricky to track down a copy of this article to find out the reasoning behind the choice of those particular descriptors.

Fortunately, the article was later translated into English and published in Topics in Stereochemistry in 1971, along with an article by Derek Barton that had originally been published in 1950 in the journal Experientia.

Hassel’s paper — The cyclohexane problem — explains the origin of the descriptors as follows:

For those of you paying attention, you may have noticed a problem. Whereas the Nature paper refers to ɛ and χ, the original Hassel paper refers to ɛ and κ (kappa). And based on the Greek origins of the descriptors, it is clear that it should be κ and not χ. Go back and look at the excerpt from the Nature paper above; even squinting a bit, it would need to be a very charitable interpretation to say that the symbol in question is κ and not χ. It seems that something went awry in the publication process (a couple of book chapters also confirm that the original descriptors were ɛ and κ).

One of these book chapters also pointed me in the direction of a 1954 Science paper, which shared the same authors (Barton, Hassel, Pitzer and Prelog) — and the same title — as the earlier 1953 Nature paper. On closer inspection, the letter in Science is, with one important exception, exactly the same as the one that appeared in Nature. See if you can spot the difference in this excerpt from the Science paper:

So, Science got it right; a kappa (κ) and not a chi (χ) for the equatorial bonds. Perhaps more remarkable, however, is that both Science and Nature published the *same* letter (the Nature letter was published on Dec 12, 1953 and the one in Science on Jan 1, 1954). I wonder whether the editors knew of the dual publication…? Anyway, what was the ultimate purpose of this letter, this letter that was deemed so important that it should be published in both Science and Nature? Well, it was essentially just a proposal of new nomenclature for the different C–H bonds in cyclohexane.

After noting that the epsilon/kappa descriptors were difficult to remember, the authors pointed out that alternative nomenclature had been suggested by Beckett, Pitzer and Spitzer in a 1947 JACS paper. Basing their terms on geography rather than Greek, they had suggested the now-familiar ‘equatorial’ (e) for the C–H bonds around the equator of the ring and ‘polar’ (p) for the C–H bonds pointing either north or south away from the mean plane of the ring.

As highlighted in the Barton/Hassel/Pitzer/Prelog letter, however, the word ‘polar’ has another — very different — meaning in chemistry, and in an effort to prevent any confusion, they suggested that instead of polar, a better term would be ‘axial’. In another twist, the proposal to use ‘axial’ was actually made by Christopher Ingold who, despite this contribution, is only acknowledged in the Nature/Science letter (see below), rather than sharing in the authorship.

Considering that ‘equatorial’ was already suggested in the earlier Beckett, Pitzer and Spitzer JACS paper and ‘axial’ is Ingold’s idea, the role of Barton, Hassel and Prelog appears to be one of making an authoritative plea (with Pitzer) to the community for a new standard to be adopted, rather than defining the new nomenclature themselves.

UPDATE 11/05/2015 – here’s an interesting post about Hermann Sachse and his attempts to get his ideas about the conformation of cyclohexane across to the wider chemistry community towards the end of the 19th century.