Here’s an excerpt from my chapter “Blood, sweat and urine” from The Bad Data Handbook. Have a lovely Christmas!

I spent six years working in the statistical modeling team at the UK’s Health and Safety Laboratory. A large part of my job was working with the laboratory’s chemists, looking at occupational exposure to various nasty substances to see if an industry was adhering to safe limits. The laboratory gets sent tens of thousands of blood and urine samples each year (and sometimes more exotic fluids like sweat or saliva), and has its own team of occupational hygienists who visit companies and collect yet more samples.

The sample collection process is known as “biological monitoring.” This is because when the occupational hygienists get home and their partners ask “How was your day?,” “I’ve been biological monitoring, darling” is more respectable to say than “I spent all day getting welders to wee into a vial.”

In 2010, I was lucky enough to be given a job swap with James, one of the chemists. James’s parlour trick is that, after running many thousands of samples, he can tell the level of creatinine in someone’s urine with uncanny accuracy, just by looking at it. This skill was only revealed to me after we’d spent an hour playing “guess the creatinine level” and James had suggested that “we make it more interesting.” I’d lost two packets of fig rolls before I twigged that I was onto a loser.

The principle of the job swap was that I would spend a week in the lab assisting with the experiments, and then James would come to my office to help out generating the statistics. In the process, we’d both learn about each other’s working practices and find ways to make future projects more efficient.

In the laboratory, I learned how to pipette (harder than it looks), and about the methods used to ensure that the numbers spat out of the mass spectrometer were correct. So as well as testing urine samples, within each experiment you need to test blanks (distilled water, used to clean out the pipes, and also to check that you are correctly measuring zero), calibrators (samples of a known concentration for calibrating the instrument), and quality controllers (samples with a concentration in a known range, to make sure the calibration hasn’t drifted). On top of this, each instrument needs regular maintaining and recalibrating to ensure its accuracy.

Just knowing that these things have to be done to get sensible answers out of the machinery was a small revelation. Before I’d gone into the job swap, I didn’t really think about where my data came from; that was someone else’s problem. From my point of view, if the numbers looked wrong (extreme outliers, or otherwise dubious values) they were a mistake; otherwise they were simply “right.” Afterwards, my view is more nuanced. Now all the numbers look like, maybe not quite a guess, but certainly only an approximation of the truth. This measurement error is important to remember, though for health and safety purposes, there’s a nice feature. Values can be out by an order of magnitude at the extreme low end for some tests, but we don’t need to worry so much about that. It’s the high exposures that cause health problems, and measurement error is much smaller at the top end.
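
To make the roles of blanks, calibrators and quality controllers a little more concrete, here is a minimal Python sketch of how the three checks might fit together in a single run. It is only an illustration: the straight-line calibration, the function names, the tolerances and every number in it are assumptions invented for the example, not the laboratory's actual procedure or data.

```python
def fit_calibration(known_concs, responses):
    """Fit a least-squares line: response = slope * concentration + intercept."""
    n = len(known_concs)
    mean_x = sum(known_concs) / n
    mean_y = sum(responses) / n
    sxx = sum((x - mean_x) ** 2 for x in known_concs)
    sxy = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(known_concs, responses))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def to_concentration(response, slope, intercept):
    """Invert the calibration line to turn an instrument response into a concentration."""
    return (response - intercept) / slope


def run_is_acceptable(slope, intercept, blank_responses, qc_responses,
                      qc_range, blank_tolerance):
    """Blanks should read close to zero; quality controls should fall in their known range."""
    blanks_ok = all(
        abs(to_concentration(r, slope, intercept)) <= blank_tolerance
        for r in blank_responses
    )
    low, high = qc_range
    qcs_ok = all(
        low <= to_concentration(r, slope, intercept) <= high
        for r in qc_responses
    )
    return blanks_ok and qcs_ok


# Hypothetical calibrators: known concentrations and the responses they produced.
slope, intercept = fit_calibration([0.0, 5.0, 10.0, 20.0], [1.0, 11.0, 21.0, 41.0])

# Hypothetical run where blanks read near zero and both quality controls are in range.
good_run = run_is_acceptable(slope, intercept,
                             blank_responses=[1.1, 0.9],
                             qc_responses=[16.8, 17.4],
                             qc_range=(7.0, 9.0),
                             blank_tolerance=0.5)

# Hypothetical run where a quality control reads far too high, suggesting drift.
drifted_run = run_is_acceptable(slope, intercept,
                                blank_responses=[1.1, 0.9],
                                qc_responses=[25.0],
                                qc_range=(7.0, 9.0),
                                blank_tolerance=0.5)

print(good_run, drifted_run)
```

A run that fails either check would be a prompt to recalibrate and retest rather than to trust the sample values. Real assays may well use a different curve shape or acceptance rule; the point is only that every reported number sits on top of these checks.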