... however, the more you know, the more in awe you become at your own ignorance! "...those who feel certainty are stupid, and those with any imagination and understanding are filled with doubt and indecision." --Bertrand Russell

Tuesday, February 21, 2017

What is up with all of the country ranking "Indexes"?

A fairly recent and continuing rage among data wonks is the creation of ranking indices (or indexes). Of course, we have had indexes for many, many years; an old example is the Dow Jones Industrial Average. We also use similar techniques to compute FIFA World Rankings, university rankings, and more. What is more recent is the creation of myriad indices for countries around the world (as well as US states), which are used to rank them on every conceivable topic. Here is a random selection:

The Economic Freedom Index

The Global Peace Index

The Good Country Index

The Global Competitiveness Index

Rule of Law Index

World Press Freedom Index

Happiness Index

Global Innovation Index

Global Entrepreneurship Index

Social Progress Index

Corruption Perceptions Index

Animal Protection Index

The Big Mac Index

Global Hunger Index

Quality of Life Index

Quality of Death Index

I could go on listing these for days! Basically, if you want to create an index, sit down and decide on some factors that you think would go into explaining an idea such as "Happiness". Collect some data on these factors, adjust them to a similar scale, and then figure out a way to combine them into a sort of weighted average. A BurkeyAcademy subscriber wrote in to ask me about the United Nations Development Programme's "Human Development Index". He wanted to know how it was calculated. Let's have a look.
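Before looking at the HDI itself, here is a minimal Python sketch of that general recipe: rescale each factor to run from zero to one, then take a weighted average. Everything in it is an assumption made for illustration. The countries, the factor values, the weights, and the helper names `min_max_rescale` and `build_index` are all invented, and min-max rescaling is just one common way to put factors on a similar scale. It is not how any particular published index is built.

```python
# Hypothetical data: these countries, values, and weights are invented
# purely to illustrate the recipe. They come from no real index.
factors = {
    "Country A": {"life_expectancy": 80.0, "years_schooling": 12.0, "income": 40000.0},
    "Country B": {"life_expectancy": 70.0, "years_schooling": 8.0, "income": 10000.0},
    "Country C": {"life_expectancy": 60.0, "years_schooling": 4.0, "income": 2000.0},
}

# Subjective weights chosen by the (imaginary) index creator; they sum to 1.
weights = {"life_expectancy": 0.4, "years_schooling": 0.3, "income": 0.3}


def min_max_rescale(values):
    """Rescale a list of numbers so the smallest is 0 and the largest is 1."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]


def build_index(factors, weights):
    """Return {country: score}, where score is a weighted average of rescaled factors."""
    names = list(factors)
    scores = {name: 0.0 for name in names}
    for factor, weight in weights.items():
        raw = [factors[name][factor] for name in names]
        for name, scaled in zip(names, min_max_rescale(raw)):
            scores[name] += weight * scaled
    return scores


index = build_index(factors, weights)

# Rank countries from highest to lowest index value.
ranking = sorted(index, key=index.get, reverse=True)
for place, name in enumerate(ranking, start=1):
    print(place, name, round(index[name], 3))
```

Because the weights sum to one and every rescaled factor lies between zero and one, the resulting index also lies between zero and one. Changing the weights can reorder the ranking, which is exactly where the creator's judgment enters.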

Looking at their 2015 report, they list the variables used to construct their index: life expectancy, expected years of schooling (for young children now), mean years of schooling for all in the country, and GNI per capita. They also list the difference between a country's GNI ranking and its Human Development ranking, though it is unclear if (or how) they use this in their ranking (they could include this difference in a recursive fashion using several steps).

In order to figure out how this was calculated, my go-to tool is a regression. Regression attempts to discover the formula for a relationship, if you assume the function's basic form. Using a subset of 34 countries gives the following results:

For the uninitiated, the important things are these: the "Multiple R-squared" of 0.9966 tells us that our equation is an almost perfect fit, and the "Estimates" give us the following formula:

It appears that the "GNI rank - HDI rank" does help predict their ranking index. Exactly how they used it, I don't know; they might explain it in the document, but I did not read it fully. So now we understand, at least approximately, how they calculated it. What we do not know is exactly why they chose these particular numbers as weights. Again, perhaps they mention it in the 250-page report, but generally these weights are chosen subjectively based on how important the creator believes each factor to be, and the result is then normalized to be an index between zero and one or between zero and 100.
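To make the regression idea concrete, here is a rough Python sketch of recovering hidden weights from a published index with ordinary least squares. It is not the regression reported above. The six data rows and the "true" weights are made up, the index is assumed to be an exact linear function of two factors, and the helpers `solve_linear_system` and `ols` are hypothetical names written for this example using only the standard library.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):  # back substitution
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


def ols(rows, y):
    """Least-squares fit of y on rows (plus an intercept) via the normal equations.

    Returns (coefficients, r_squared); coefficients[0] is the intercept.
    """
    X = [[1.0] + list(r) for r in rows]
    k = len(X[0])
    xtx = [[sum(row[i] * row[j] for row in X) for j in range(k)] for i in range(k)]
    xty = [sum(row[i] * yi for row, yi in zip(X, y)) for i in range(k)]
    beta = solve_linear_system(xtx, xty)
    fitted = [sum(b * v for b, v in zip(beta, row)) for row in X]
    mean_y = sum(y) / len(y)
    ss_res = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return beta, 1.0 - ss_res / ss_tot


# Hypothetical inputs: (life expectancy, mean years of schooling) for six
# invented countries. These are NOT values from the 2015 report.
rows = [(82.0, 12.5), (75.0, 10.0), (68.0, 7.5),
        (60.0, 5.0), (71.0, 11.0), (79.0, 8.0)]

# Made-up "secret" formula the index creator is assumed to have used.
true_intercept, true_w_life, true_w_school = 0.05, 0.006, 0.02
published_index = [true_intercept + true_w_life * le + true_w_school * sch
                   for le, sch in rows]

coefficients, r_squared = ols(rows, published_index)
print("estimated intercept and weights:", [round(c, 4) for c in coefficients])
print("R-squared:", round(r_squared, 4))
```

Since the invented index is an exact linear combination, the fit recovers the weights and the R-squared comes out essentially equal to one. With a real index, an R-squared slightly below one (such as the 0.9966 above) suggests the assumed functional form is close to, but not exactly, what the creators used.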