In anticipation of the Soccer World Cup finals, I couldn’t resist a little pre-emptive analysis of historic football (soccer) World Cup results, which may give some insight into what might happen over June-July 2014.

The dataset is gleaned from the World Cup and FIFA/Coca-Cola World Ranking sites. For each World Cup I have compiled a collection of statistics for each team (goals for, goals against, penalties, shots on goal, games won, drawn and lost…). For each tournament, we have the final finishing position of the top four teams. The statistic which is probably most telling is the FIFA world ranking for the team in the year preceding the finals tournament.

FIFA world rankings are available from the 1994 World Cup onwards, that is, for the last five World Cups (Figure 1). In those five tournaments, no winner entered the finals ranked lower than 12th (that was Italy in 2006), and Brazil has won twice from a starting rank of 3. We can fit a simple linear regression to these meagre 20 data points (R² = 0.53; that is, the ranking preceding the World Cup explains about half the variance in finishing position). If the World Cup is the ultimate test of world ranking, then this simple analysis suggests that the FIFA world ranking system is pretty good at identifying the best teams even though they rarely play each other. Also, the general increase (left to right in Figure 1) within each tournament (especially 2002 and 2010) suggests that the World Cup top four tend to follow the FIFA world ranking order. If the last five World Cups are anything to go by, then a win from a team ranked outside the top 12 would be unprecedented.
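For anyone who wants to try this kind of fit themselves, here is a minimal sketch of a simple linear regression and its R² in plain Python. The function name `fit_line` is my own invention for illustration, and the toy numbers are placeholders, not the 20 data points behind Figure 1; swap in the pre-tournament rankings (x) and finishing positions (y) to repeat the analysis.

```python
from statistics import mean


def fit_line(xs, ys):
    """Least-squares fit of y = slope * x + intercept.

    Returns (slope, intercept, r_squared). Hypothetical helper for
    illustration; not the tool used to produce Figure 1.
    """
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length sequences with at least 2 points")

    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("all x values are identical; slope is undefined")
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))

    slope = sxy / sxx
    intercept = y_bar - slope * x_bar

    # R squared = 1 - (residual sum of squares / total sum of squares)
    ss_res = sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - y_bar) ** 2 for y in ys)
    r_squared = 1.0 - ss_res / ss_tot if ss_tot else 1.0
    return slope, intercept, r_squared


# Placeholder values for illustration only, NOT the real World Cup data.
toy_rankings = [1, 2, 3, 4]   # x: FIFA ranking before the tournament
toy_finishes = [1, 2, 3, 4]   # y: finishing position in the top four
slope, intercept, r2 = fit_line(toy_rankings, toy_finishes)
print(f"slope={slope:.2f} intercept={intercept:.2f} R2={r2:.2f}")
```

An R² near 1 means the ranking almost fully determines the finishing position, while a value near 0.5, as reported above, means it accounts for roughly half of the variation.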

Figure 1: World Cup finishing position vs FIFA ranking in preceding year

Figure 1 only considers the last five World Cups, a very small dataset. There have been 19 World Cup tournaments. Looking at the data from all 19, the winner has never had a tournament goal difference (tournament goals for minus goals against) of less than six (Figure 2). I have compiled the goal difference for the key contenders from their last six games (about the last six months) (Figure 3).
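The goal difference arithmetic is simple enough to sketch in a few lines of Python. The helper names and the scorelines below are made up for illustration (they are not any real team's results); the only number taken from the analysis above is the threshold of six.

```python
# From the analysis above: no winner in 19 tournaments had a
# tournament goal difference of less than six.
WINNER_MIN_GOAL_DIFFERENCE = 6


def goal_difference(matches):
    """Goals for minus goals against, summed over (goals_for, goals_against) pairs."""
    return sum(goals_for - goals_against for goals_for, goals_against in matches)


def recent_form(matches, window=6):
    """Goal difference over the most recent `window` matches (list is oldest first)."""
    return goal_difference(matches[-window:])


def meets_winner_threshold(tournament_matches, threshold=WINNER_MIN_GOAL_DIFFERENCE):
    """True if a tournament goal difference is at least the historical minimum for winners."""
    return goal_difference(tournament_matches) >= threshold


# Invented scorelines for illustration only, oldest match first.
toy_matches = [(2, 0), (1, 1), (3, 1), (0, 1), (4, 0), (2, 2), (1, 0)]

print("Goal difference, all matches:", goal_difference(toy_matches))
print("Recent form, last six matches:", recent_form(toy_matches))
print("Meets winner threshold:", meets_winner_threshold(toy_matches))
```

Keeping the window as a parameter makes it easy to check whether the picture changes with, say, the last ten matches instead of six.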

Figure 2: Tournament goal difference for 19 World Cups

Figure 3: Recent form: goal difference from the last six matches (in brackets is the FIFA world ranking for 2013).

If recent form is anything to go by, then Brazil deserves the attention it is getting. Otherwise, the top six ranked teams for 2013 are all in a sweet spot, with positive goal differences based on recent form. The big surprise here is France, which sits well outside the top of the FIFA world rankings, but with some recent high-scoring wins France might just be hitting form at the right time (although some of those recent opponents were weaker sides).

This is, of course, a very superficial analysis of a very complex game, but I’m keen to see what the next month brings. By then Truii.com will have an updated World Cup dataset, plus others, available for your data wrangling pleasure through our public access data libraries.

… Postscript (July 14, 2014) …

Well, it is all over and Germany won. It seems that the statistics on recent form gave us too much hope for France and Brazil, and we should have stuck with the simple prediction based on FIFA ranking. Figure 4 shows that, as in the previous five World Cups, the FIFA ranking in the year preceding the tournament was once again a good predictor of the 2014 FIFA World Cup winner.

Figure 4: Updated version of Figure 1 after the 2014 World Cup. True to form, the FIFA world ranking in the year preceding the World Cup was again a good predictor of the result.

Looking forward to 2018…

The visualizations are easy to make. All you need to do is create a free Truii account to create and publish your own.