Data: Knowing your means from your medians, and why you should tell the reader

It’s a fascinating data set to explore, slicing and dicing the data into regions, professions, gender, local authorities, constituencies and percentiles within earnings – to name but a few.

As with a lot of data sets to come out of government, averages were presented as both the mean average (in the case of salaries, this would be the total earnings in an area divided by the number of people earning) and the median average (the salary earnt by the person in the middle if all earners were lined up in order, with the lowest earner at one end and the highest earner at the other).
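To make the two definitions concrete, here is a minimal sketch in Python. The salary figures are invented purely for illustration and are not taken from the government data; the point is only to show how a single very high earner pulls the mean upwards while leaving the median where it was.

```python
from statistics import mean, median

# Hypothetical annual salaries in pounds, invented for illustration only
salaries = [18_000, 21_000, 24_000, 27_000, 160_000]

# Mean: total earnings divided by the number of people earning
mean_salary = mean(salaries)

# Median: the salary of the person in the middle once everyone is sorted
median_salary = median(salaries)

print(f"Mean:   £{mean_salary:,.0f}")
print(f"Median: £{median_salary:,.0f}")
```

In this made-up example the mean comes out at £50,000, a figure that four of the five earners are nowhere near, while the median is £24,000.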

The Post chose to use the median average, as this helps reduce the impact of an extreme salary at one end or the other. And in doing so, political editor Jonathan Walker also explained why it did that:

Not only does this explain the difference between mean and median better than I could, I think it also sets a useful precedent for other journalists dealing with data. As well as making clear the source of the data, explaining to the reader which data has been chosen and why can only be a good thing.

Definitely good on the explanation and data source, but less so on the typos and seeming conflation of figures at the end. Surely rather than £25,8879 he means the £25,879 he mentions at the beginning? And if only 40 percent earn £22k or more, how can 60 percent earn £29k or more?

Thanks John and Andy. Good point. The data in question splits salaries into percentiles – so those at the 40% mark earned £22k, while those at the 60% mark earned £29k. That is different to saying ‘40% earned this’ and ‘60% earned this’: the person at the 40% mark earns more than the lowest-paid 40% of people, so roughly 60% earn that amount or more. Likewise, the top 20% of earners would be at the 80% mark and upwards.
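For anyone who wants to see the percentile point worked through, here is a rough sketch in Python. Again, the salaries are invented for illustration and are not the published figures, and `share_earning_at_least` is just a helper made up for this example. It uses the standard library's `statistics.quantiles` to find the 40% and 60% marks, then counts what share of people earn at least that much.

```python
from statistics import quantiles

# Hypothetical annual salaries in pounds, invented for illustration only
salaries = [14_000, 17_000, 20_000, 22_000, 25_000,
            29_000, 33_000, 40_000, 55_000, 120_000]

# n=10 returns the nine cut points between deciles:
# index 3 is the 40th percentile, index 5 is the 60th
deciles = quantiles(salaries, n=10)
p40, p60 = deciles[3], deciles[5]

def share_earning_at_least(threshold):
    """Fraction of earners whose salary is at or above the threshold."""
    return sum(s >= threshold for s in salaries) / len(salaries)

print(f"40% mark: £{p40:,.0f}, earned or exceeded by {share_earning_at_least(p40):.0%}")
print(f"60% mark: £{p60:,.0f}, earned or exceeded by {share_earning_at_least(p60):.0%}")
```

The salary at the 40% mark is lower than the one at the 60% mark, and more people earn at least that much, which is the opposite of what ‘40% earned this’ would suggest.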