Inside the mind of Hans Rosling

We all love Hans Rosling, the world’s most famous data guru. Yesterday, at the Skoll World Forum, I interviewed him for More or Less on the BBC World Service, and loyal listeners and podcast subscribers will get to hear what we talked about in a week or two.

But before the tape started rolling I had a more intimate conversation with Hans about how he began his remarkable career, and what his influences and inspirations were. Here’s a taste. Hans was a 24-year-old medical student with top grades in Sweden, taking a study year in Bangalore:

“I saw the lecturer put up a slide for discussion, I thought, ‘that’s kidney cancer, I’ll keep quiet and let the Indian students talk before I explain’. In six minutes, they had exhausted all my knowledge. In those moments I realised the Indian students were better than me. I had always been in the top quarter. In Bangalore, I was bottom quarter. And that was when I realised how racist we all were, how we thought we were better because we had been born in a richer country, with better institutions.”

1. The data seems very accurate considering it goes back as far as 1810. Basically: What is/are the source/s?

He gave me a source at Gapminder (his website, I believe), and the source was clearly labelled “WE DISCOURAGE THE USE OF THIS DATASET FOR STATISTICAL ANALYSIS” (Source: Gapminder spreadsheet for Life Expectancy, 31st Oct 2011.)

2. Population from 1810 to the present day remains constant in all countries. Not even relative change, AFAICS.

3. Maybe a small point but x-axis units incomplete: $400 *pa* presumably.

More generally, I added:

a) My main concern is that the presentation is *so* seductive that the audience does not question the source and/or interpretation.

b) The uncritical enthusiasm of responses is not healthy. “Wow. Look at this!” Not “I wonder where he got the stats for Guizhou.”

c) Summarising: Data Representation is only as good or as true as its source…

d) The uncritical response it seems to elicit in a lay audience means that in the wrong hands this approach could do a lot of harm.

e) Given two competing statistical arguments, one presented like this would wipe the floor with the other. That’s not good!

f) I hope these points are substantial. Thanks for getting in touch. I think there are serious issues here. I am not a specialist.

I sympathise with the criticism of data visualisations with ropey data behind them – there are many of these around. But the Gapminder data is all very clearly sourced on the Gapminder website, usually from highly reputable international organisations. Rosling has been a leading campaigner for open data. I really struggle to see the objection, beyond the fact that Rosling did not engage interminably on Twitter, which I can attest is quite impossible for someone with tens of thousands of followers and a hectic travel schedule.
If there are serious flaws in the Gapminder data I’d be fascinated to hear about them, but your main objection seems to be that it’s so slickly done it can’t possibly be accurate. I don’t buy that.

“your main objection seems to be that it’s so slickly done it can’t possibly be accurate. I don’t buy that.”

I don’t buy that either which is why I didn’t say it, or anything close to it.

Please do me the courtesy of reading what I actually wrote.

If it’s interminable then I’ll distill from it three objections:

1. The Gapminder dataset still says:

“WE DISCOURAGE THE USE OF THIS DATASET FOR STATISTICAL ANALYSIS” (Their capitals)

Doesn’t this fundamentally discredit the whole presentation?

2. Why do the population sizes not vary? They don’t even vary relative to each other so they’ve not been standardised in some way.

3. The data is *so* seductively presented that nobody is asking questions about its veracity. This is a general point about these sorts of visualisations. In my experience the more seductive the presentation the less likely the stats are to be questioned. See e) above. The prettiest picture wins. Not the soundest argument.