Making data play nicely with others

Over the last few weeks, I’ve been helping my boss out with a review paper. Mercifully, I’m not actually writing much of it; I’m mostly just helping out around the edges by sorting out references and digging through them for useful metrics by which we can compare techniques. This has led to much frustration. Enough, in fact, to make me audibly say “Why the f$%k would you do that?!!” more than once.

As far as I can tell, a large percentage of the scientific community are incapable of using standardised units.

Now, my writing on this subject is fairly recently informed by the review paper, but these issues aren’t specific to my field or even the technique I was looking at – they are rife in all of the sprawling tendrils of scientific publishing.

First is the ever-present misuse of units. I’m not so much talking about the arguments between metric and imperial. That’s a whole different area of brain-melting stupidity (for the love of god, America, give them up already). No, this is the much worse crime of ignoring the standards used by the very papers you are citing in your own area.

Each area and technique tends to slowly build up a set of specific metrics for testing performance or properties. It can take a few years, but after a few tens of papers on a specific subject, a general consensus will emerge about the best units. So if you’re writing a paper in this area, how about doing everyone a solid and actually using those units?

Example: In most papers about daiquiri production, everyone reports their experiments in cherries per cherry daiquiri. Which is great: I often have lots of cherries, and knowing which technique gives the best cherries-to-daiquiri ratio is pretty important. However, one new paper decides to be clever, wants to make its numbers look different, and quotes its machine’s daiquiri production as number of daiquiris per stalk of 5 cherries. Clearly the author is totally insane, but now whenever I read their papers I will have to keep recalculating their stupid figure to make it fit with everyone else’s. Or, far more likely, I’ll not bother and file it under “crazy person” papers – with the people that try to make kale daiquiris.
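To be fair, the recalculation itself is trivial, which is rather the point. Here is a minimal Python sketch of it, assuming a stalk always holds exactly 5 cherries as in the made-up example above. The function name and the 0.25 figure are hypothetical and purely for illustration.

```python
def cherries_per_daiquiri(daiquiris_per_stalk: float, cherries_per_stalk: int = 5) -> float:
    """Convert 'daiquiris per stalk of cherries' into 'cherries per daiquiri'.

    Assumes every stalk carries the same number of cherries (5 by default).
    """
    if daiquiris_per_stalk <= 0:
        raise ValueError("daiquiris_per_stalk must be positive")
    # One stalk is cherries_per_stalk cherries, so invert and rescale.
    return cherries_per_stalk / daiquiris_per_stalk


# Hypothetical figure from the awkward paper: 0.25 daiquiris per stalk of 5 cherries.
converted = cherries_per_daiquiri(0.25)
print(f"{converted} cherries per daiquiri")
```

Five minutes of work for the author, or five minutes for every single reader, forever.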

In one real, non-daiquiri-related example, a paper cited a list of 4 other pieces of work, each with its performance quoted in one unit, only to then write “…all of which compare well with our performance of 0.6 <totally different unit>”. That was one of the moments when I swore.

Sometimes the reason is just plain and simple laziness, I expect. Converting units is hard and might take someone a whole 5 minutes, and who has the time for that when writing a paper? Sometimes it might be because of some soap-opera-style disagreement between academics. But there are also obvious cases of people deliberately using units to obfuscate poor data that doesn’t really match what they want.

This problem also isn’t limited to units – naming is equally a victim of this nonsense.

My current project uses a material with the very handy acronym TSPP. When I started the project, I tried to buy some of this TSPP, only to find that the only people who called it TSPP were the one group we were working with. Literally everyone else called it TPPS. I’ve never got to the bottom of exactly why this happened. I’m assuming there was a typo somewhere, but they are now so committed to the new acronym that they are unwilling to change back (I even asked about it).

I’ve also seen papers where people rename techniques just so that they sound new and not like a rehash of an existing technique. I’ve seen others where old(ish) techniques are given a name suspiciously similar to that of one of the paper’s authors.

But whatever the reason, the result is the same – 6 different names for every technique, chemical and material, where a single name would make life easier for everyone. Especially those of us having to read through lots of papers while doing a literature review.

So my plea to the journal editors of the world – please, rise up against this nonsense and start calling out scientists on their arbitrary units and naming.