In Part 1 of this series, I looked at whether the combination of a changed group draw system and the failure of a few high-profile teams to qualify meant that the group stage of the 2018 FIFA World Cup would be more boring than in previous tournaments. After deciding on ELO scores as a reasonable way to quantify team strength, I looked at the groups and decided that while there was one unusually weak group, the others were generally in line with those seen at previous tournaments.

In this post, I will look at which teams were lucky and got an easier draw and which ones did not. To recap: 32 teams qualified and had to be sorted into 8 groups of 4. To do this, the teams were sorted into four ‘pots’ according to their FIFA rankings, with Pot 1 consisting of the hosts and the 7 strongest teams, Pot 2 containing the 8 next strongest teams and so on. The groups were then determined by drawing one team out of each pot using lots. There were a couple of further constraints, for example to avoid a situation in which one group contains only Asian countries: each group could contain no more than two European countries and no more than one country from each of the other confederations (Africa, Asia, South America and North & Central America, with Oceania’s representative New Zealand having failed to qualify). This is the result:

Every time there is a group draw for one of these big tournaments, there is much debate over who got lucky and who didn’t. This seems as good a question as any to approach analytically, so in this post I outline two ways of looking at this question using ELO scores.

Group draw distributions

Before the draw, there is some set of possible groups in which any of the teams could find itself. Take Australia, for example. The Socceroos play in the Asian Football Confederation and, with a relatively low FIFA ranking, found themselves in Pot 4 for the draw. That meant that their possible sets of three opponents included all combinations of one non-Asian Pot 1 team, one non-Asian Pot 2 team and one non-Asian Pot 3 team, subject to the confederation limits described above. As in Part 1, I use ELO scores to measure the strength of teams quantitatively, and so we can calculate an average ELO score for each of the possible combinations. The easiest possible group was the one with the lowest average ELO score (1711: Russia, Mexico and Tunisia), and the hardest one had the highest average ELO (1976: Brazil, Spain and Denmark). In between those extremes there is a whole distribution of opponents’ average ELO scores that the Socceroos might have had to face.
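The enumeration described above can be sketched in a few lines. This is a minimal illustration in Python (the original analysis was done in R), and the team names, confederations and ratings below are invented placeholders rather than the real 2018 pots. It lists every opponent set that satisfies the confederation rule and sorts the sets by average rating.

```python
from itertools import product
from statistics import mean

# Hypothetical pots: (team, confederation, rating). All values are made up.
POT1 = [("P1-A", "UEFA", 2000), ("P1-B", "CONMEBOL", 2100), ("P1-C", "UEFA", 1700)]
POT2 = [("P2-A", "UEFA", 1950), ("P2-B", "CONCACAF", 1850), ("P2-C", "CONMEBOL", 1900)]
POT3 = [("P3-A", "UEFA", 1800), ("P3-B", "CAF", 1650), ("P3-C", "AFC", 1750)]

def allowed(group):
    """Confederation rule: at most two UEFA teams, at most one from any other."""
    counts = {}
    for _, conf, _ in group:
        counts[conf] = counts.get(conf, 0) + 1
    return all(n <= (2 if conf == "UEFA" else 1) for conf, n in counts.items())

def opponent_averages(team, pots):
    """Sorted list of (average opponent rating, opponent names) for every valid draw."""
    results = []
    for opponents in product(*pots):
        if allowed((team,) + opponents):
            names = tuple(name for name, _, _ in opponents)
            results.append((mean(r for _, _, r in opponents), names))
    return sorted(results)

pot4_team = ("P4-X", "AFC", 1700)  # stands in for a Pot 4 team from Asia
averages = opponent_averages(pot4_team, [POT1, POT2, POT3])
easiest, hardest = averages[0], averages[-1]
print(len(averages), "valid opponent sets")
print("easiest:", easiest)
print("hardest:", hardest)
```

Note that the sketch treats every valid combination as one entry in the distribution. Whether the real draw procedure makes all combinations equally likely is a separate question that it does not address.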

After the draw, Australia’s actual group opponents are France, Peru and Denmark, for an average ELO score of 1885. One way to judge whether this was a lucky or an unlucky draw is to compare this realisation to the distribution of possible draws. I do this in our first chart:

This shows a histogram of average group opponent ELO scores that Australia could have faced, with the red line indicating the actual draw they got. We can see that the realisation was slightly to the right-hand side of the distribution of possibilities, making for a group that was somewhat more difficult than they might have expected before the draw.
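One way to put a number on "slightly to the right-hand side" is the share of possible draws that would have been easier than the actual one. The sketch below is in Python and uses a made-up list of possible averages. Only the value 1885 for Australia's actual draw comes from the text, and counting each possible draw equally is an assumption.

```python
def share_easier(possible_averages, actual_average):
    """Fraction of possible draws with a lower opponents' average than the actual one."""
    easier = sum(1 for avg in possible_averages if avg < actual_average)
    return easier / len(possible_averages)

# Made-up distribution of opponents' average ratings; not the real 2018 values.
possible = [1720, 1760, 1800, 1820, 1850, 1870, 1900, 1930, 1950, 1970]
share = share_easier(possible, 1885)
print(f"{share:.0%} of the possible draws would have been easier")
```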

Now that we have looked at a single example, I repeat the exercise for all the teams in the World Cup to create the next chart:

Each panel shows the distribution and realisation for a different team, arranged in their respective groups. A few things stand out:

It matters which pot you were put in before the draw. Teams in Pot 1 (like Brazil, Germany or Poland) have distributions further to the left, while teams in Pot 4 (like Japan or Nigeria) have distributions further to the right. This is just a reflection of the fact that you cannot be put into a group with another team from your pot: if you’re in a strong pot, you won’t face another ‘strong’ team and vice versa. This is also where the home advantage really kicks in: Russia would normally have been in one of the lower pots, but as hosts they automatically avoid all the Pot 1 teams, resulting in a more benign distribution.

As noted in Part 1, Group A is unusually weak. It has an average ELO score of 1695, not at all far from literally the lowest possible (1683: Russia, Mexico, Tunisia, Saudi Arabia). This in turn means that all the teams in this group were very lucky – most of all Uruguay, which not only drew Russia as its Pot 1 opponent, but also got two other opponents near the bottom of their respective pots.

The big favourites generally drew groups that were more or less in line with the average they might have expected. Brazil, Germany, Argentina and France all got sets of opponents with broadly-similar average ELO scores. One exception is Portugal, which drew the short straw in the form of Spain being its opponent selected out of Pot 2.

Tournament structure and luck

However, the difficulty of group stage opponents is not the only way to evaluate whether or not a team had a lucky draw. It also matters which group you’re drawn in, and who is drawn into the ‘adjacent’ group. This is because of the effect of tournament structure on the path to the final.

The top two teams out of each group move on to the Round of 16, the first of the knock-out stages. Their opponents are worked out as follows: the winner of Group A plays the runner-up of Group B, the runner-up of Group A plays the winner of Group B and so on. So a team might be drawn into an easy group, but if it then has to play one of the strongest teams in the Round of 16 (e.g. because its adjacent group is the one that has both Portugal and Spain in it), this is not as good as having to play the runner-up from a weak group.
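The pairing rule is mechanical enough to write down. The following Python sketch uses placeholder labels such as "1A" (winner of Group A) and "2B" (runner-up of Group B). The function name and data layout are illustrative choices, and the order of the returned fixtures says nothing about how the later bracket is arranged.

```python
def round_of_16(groups):
    """Pair winners and runners-up of adjacent groups (A with B, C with D, ...).

    `groups` maps a group label to its (winner, runner_up) pair.
    """
    labels = sorted(groups)
    fixtures = []
    for first, second in zip(labels[0::2], labels[1::2]):
        fixtures.append((groups[first][0], groups[second][1]))  # winner vs runner-up
        fixtures.append((groups[second][0], groups[first][1]))  # and the reverse
    return fixtures

qualified = {g: (f"1{g}", f"2{g}") for g in "ABCDEFGH"}
fixtures = round_of_16(qualified)
for home, away in fixtures:
    print(home, "v", away)
```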

So how does one summarise all of these possibilities into one chart? We are interested in one metric that isolates the effect of the tournament’s structure on how well teams might do. To get this, let’s start with a structure that has no impact at all: imagine all the 32 teams playing in a never-ending league, in which every team plays every other team again and again. Let’s, at least for now, make a big assumption: the team with the higher ELO score always wins. If this is true, then our league will end up with a ranking exactly equal to the ELO rankings.

If some alternative tournament structure results in some teams not being able to play some other teams, so that the lucky teams avoid the more difficult opponents and the unlucky ones don’t get any easy games, then we would expect the final ranking to look different in some way. The difference between the two is the effect of the tournament structure.

So let’s run through the tournament assuming that the team with the higher ELO score always wins. This diagram skips the group stages, but you can see what happened – the two highest-ELO teams from each group made it through. The bolded team wins the match and progresses (and an admission – this is the only chart in this post I didn’t do in R, because it was just too much of a hassle to create a tree just for this):

So good news for Brazil then! As the team with the highest ELO score at the time I downloaded the data, they win the tournament. Germany loses the final, while Spain wins the match for third place to win Bronze.

Anyway, to translate all of this into a ranking, I assign one point to finishing last in the group. The third-placed team gets two points, and the two teams that make it into the knock-outs get three each. Another point for winning the Round of 16 match, another for the Quarterfinal and so on. With each team having been assigned a point value at the end of the tournament, we can produce a ranking and compare it to that implied by the ELO scores by themselves.
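The pretend tournament and its points scheme can be sketched as follows. This is a Python illustration with four invented groups, not the real 2018 data or the original R code. Two things are assumptions of the sketch: the bracket puts the two qualifiers from each group into opposite halves, and the match for third place is left out because the text does not say how it is scored.

```python
def play_tournament(groups):
    """Deterministic tournament: the higher-rated team always wins.

    `groups` is a list of groups, each a dict {team: rating}. Returns {team: points}.
    """
    rating = {team: r for group in groups for team, r in group.items()}
    points = {}
    qualifiers = []
    for group in groups:
        ranked = sorted(group, key=group.get, reverse=True)
        points[ranked[3]] = 1                      # last in the group
        points[ranked[2]] = 2                      # third in the group
        points[ranked[0]] = points[ranked[1]] = 3  # both qualifiers
        qualifiers.append((ranked[0], ranked[1]))

    # First knock-out round: each winner meets the runner-up of the adjacent group.
    # Assumption: the two ties from a pair of groups go into opposite bracket halves.
    first_half, second_half = [], []
    for (w1, r1), (w2, r2) in zip(qualifiers[0::2], qualifiers[1::2]):
        first_half.append((w1, r2))
        second_half.append((w2, r1))
    fixtures = first_half + second_half

    while fixtures:
        winners = [max(pair, key=rating.get) for pair in fixtures]
        for team in winners:
            points[team] += 1                      # one extra point per knock-out win
        # Neighbouring winners meet in the next round; empty once one team is left.
        fixtures = list(zip(winners[0::2], winners[1::2]))
    return points

toy_groups = [
    {"A1": 2000, "A2": 1700, "A3": 1650, "A4": 1600},
    {"B1": 1950, "B2": 1900, "B3": 1850, "B4": 1800},
    {"C1": 2100, "C2": 1750, "C3": 1720, "C4": 1500},
    {"D1": 1980, "D2": 1960, "D3": 1940, "D4": 1550},
]
points = play_tournament(toy_groups)
print(sorted(points.items(), key=lambda kv: -kv[1]))
```

In this toy data, A2 (rating 1700) qualifies from a weak group and earns three points, while B3 (rating 1850) finishes third in a stronger group and earns only two. That is the kind of structural luck the ranking is meant to expose.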

I do this in the following chart, which shows a scatterplot of the teams, with ELO ranking on the x-axis and our pretend-tournament ranking on the y-axis. Brazil is the team with the highest ELO and the most points in our World Cup, and so it ends up in the top right corner of our chart. I drew a 45-degree diagonal, with any teams that lie on this line having the exact same ranking in our pretend World Cup as they would on ELO score alone. Any countries to the left of the line are ‘lucky’: they finish higher in our World Cup than their ELO score would imply. The opposite is true for countries to the right of the line.
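The comparison behind the scatterplot amounts to ranking the teams twice and looking at the difference. The Python sketch below uses four invented teams. Giving tied teams the average of their positions is an assumption of the sketch, since the text does not say how ties in tournament points are ranked.

```python
def ranks(scores):
    """Rank teams from 1 (best); tied teams share the average of their positions."""
    ordered = sorted(scores, key=scores.get, reverse=True)
    result = {}
    i = 0
    while i < len(ordered):
        j = i
        while j < len(ordered) and scores[ordered[j]] == scores[ordered[i]]:
            j += 1
        shared = (i + 1 + j) / 2  # average of positions i+1 .. j
        for team in ordered[i:j]:
            result[team] = shared
        i = j
    return result

def luck(ratings, points):
    """Positive values: the team ranks higher in the tournament than on rating alone."""
    by_rating, by_points = ranks(ratings), ranks(points)
    return {team: by_rating[team] - by_points[team] for team in ratings}

# Made-up ratings and tournament points for four teams.
ratings = {"W": 2000, "X": 1900, "Y": 1800, "Z": 1700}
points = {"W": 4, "X": 2, "Y": 3, "Z": 3}
result = luck(ratings, points)
print(result)
```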

What do we learn? For the very top teams, the tournament structure doesn’t matter much. If they always win, because they have the highest ELO scores, then they beat everyone regardless of whether they play in a tournament or a continuous league. The degree of “luck” seen on the chart at the next ‘level’ down is largely a function of every team eliminated in the Quarterfinals being knocked out at the same stage, regardless of its ELO score. In a league France would expect to be ranked more highly than Colombia (because the two would play each other directly and due to its higher ELO score France would win). In the tournament both are eliminated in the Quarters and get the same points. In that sense, Colombia derives a relative advantage from the tournament structure.

The same is true at the other levels too, but the effect gets stronger for teams eliminated earlier, simply because there are more of them. To look at the example of the Socceroos again, they are relatively “unlucky” by this measure, because they won’t get the chance to play a whole bunch of teams with lower ELO scores than themselves.

Once again Group A stands out though. Russia in particular, which has the second-highest ELO in that group and therefore makes the Round of 16, gets a vastly higher rank than its ELO score would justify. Egypt, which finishes third in their group when they would have finished last in others, also derives a relative advantage.

Summary

In this post I looked at who got lucky in the World Cup draw, using two different methods. Both approaches point to the same two conclusions:

The tournament structure particularly benefits teams that have a high FIFA ranking and are therefore put into the higher pots. By construction they avoid playing other strong teams until later in the tournament, which makes it easier for them to progress further before being eliminated.

In the 2018 World Cup, the teams that ended up in the relatively weak Group A can consider themselves lucky. Either being the hosts or being drawn into a group with them means that the strongest teams are also avoided until the knock-out stages.

There are some weaknesses though, especially in the second approach. Look at the last of the Round of 16 games: Belgium against Colombia. In our little pretend-World Cup, the latter wins, due to its ELO score being 2 (!) points higher. That tiny difference is enough to decide the match, and in our evaluation it makes no difference whether the gap is 2 points or 200.

That makes for a pretty boring tournament! In the next post, I plan to introduce a bit of uncertainty. By adding a random element to encounters, and then repeating the simulation a bunch of times, I can generate distributions of outcomes that will hopefully add some interesting insights into teams’ chances.

As with Part 1, none of this is gambling advice. I’m happy to make my data and (poorly written!) R code available on request.

Analysing the World Cup 2018, Part 1: Worth Watching?

https://illiquidideas.wordpress.com/2017/12/09/analysing-the-world-cup-2018-part-1-worth-watching/
Sat, 09 Dec 2017 22:18:19 +0000

It should go without saying that none of the below is gambling advice.

Some background

Last week we had the draw for the groups of the 2018 FIFA World Cup, to be held in June and July next year in Russia. I tend to get a little too excited about these big football tournaments (excited enough to follow the draw via a live ticker), and I’m looking forward to the big event.

But no sooner was the draw complete than some people complained that we were in for a boring group stage, and that there was little worth watching until the knock-outs. In the spirit of teaching myself some R, I decided to investigate. The result is a planned series of three posts. In this first one, I look at how this World Cup compares to recent tournaments. The next two posts are concerned with who got a lucky draw and who didn’t, and simulating the tournament to see who might do well.

There are 32 teams in the World Cup. Aside from the hosts Russia, who automatically qualified, they all went through a qualification process that started in 2015. That process is done via the regional football associations (e.g. UEFA in Europe), and if you want the specifics of how it went, wiki has it all.

The World Cup itself starts with a group stage, consisting of eight groups (labelled A through to H). From each group, the top two teams progress to the next round. After this, teams progress through the knock-out stages: Round of 16, Quarterfinals, Semifinals and the Final. The two losers of the semifinals play each other in a match for third place.

Is the group stage especially boring this time?

So, how excited should we be about the group stage of this tournament? If you’re pretty sure that you’ll watch most (or all?) of the matches, are you in for a world of boredom for the first couple of weeks? If you’re living somewhere where catching a game involves setting an alarm, is it worth it? Or should we just tune out until the knock-outs start?

Why even ask? Well, the question is made more interesting because of a change in the method for deciding which teams go into each group. It’s done by lots: first, the 32 teams are ranked according to their FIFA rankings and sorted into four “pots”, with Pot 1 having the 8 strongest teams and Pot 4 the 8 weakest (there are a few other rules to stop some teams from the same regions all ending up in the same group, but that’s the general idea). Then they get some football celebs to draw lots in a lengthy ceremony, with each group getting one team from each pot.

This is different from previous editions, since before only the strongest pot was selected using rankings, and the other pots were allocated by the regions from which the teams hail. The outcome should be that the groups will be more even this time… we should have been less likely to see groups with lots of strong teams (a.k.a ‘Groups of Death’) and, conversely, fewer weak groups as well. Given that two teams progress, and we’re more likely to see an obvious a priori ranking of teams within groups, one might think that it will be unusually obvious who will progress and the group stage will offer little in the way of excitement.

There is one other mechanic at work that might make this even worse, and that is the number of strong teams that failed to make it this far. In 2018, the World Cup will happen without Chile, Italy, the Netherlands or the United States. All of these are not only strong teams that would make a lot of people’s lists to make it out of their respective tournament groups (obvious dramas during their qualifying campaigns notwithstanding), but they also bring interesting and distinctive play styles and philosophies. While we all have a team or two that we enjoy watching fail, one can reasonably worry that the tournament will be poorer for their not being there.

So the question we want to answer is: in light of the changed group draw system, and the failure of several big names to qualify, is this tournament’s group stage going to be a) weaker and b) more unevenly matched than previous editions?

ELO Scores

How do we go about answering? First, we need some sort of system to measure how strong the teams are. I’m not an expert here, but one useful way would seem to be the ELO ratings (which in my case I just pulled off Wikipedia in the first week of Dec 2017). They seem to be the standard way of quantifying football team strength, and there is a very detailed history easily available. We wouldn’t want to use the FIFA rankings because a) they’re bad and b) since they were used to determine the group seeding in the first place, we wouldn’t learn anything new by using them to analyse the groups.

But can ELO scores tell us anything about whether teams are likely to do well at the World Cup? To figure that out, I looked at past World Cups. The tournament has had its current ‘8 groups of 4’ format since 1998, which means we have five previous tournaments’ worth of data. This first chart shows (pre-tournament) ELO scores of the teams that participated in these on the x-axis, and the points they actually earned in the group stages on the y-axis. The fitted line is from a simple linear OLS, with the grey area being a 95% confidence interval around it.
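For readers who want to see what the fitted line involves, here is a minimal least-squares sketch in Python (the original chart was produced in R). The five data points are invented for illustration and are not the historical World Cup data. The sketch computes only the slope and intercept, not the confidence band.

```python
def ols_fit(x, y):
    """Least-squares slope and intercept for y = intercept + slope * x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

# Made-up (rating, group-stage points) pairs; not the historical data.
elo = [1600, 1700, 1800, 1900, 2000]
pts = [1, 3, 4, 6, 7]
slope, intercept = ols_fit(elo, pts)
print(f"points = {intercept:.1f} + {slope:.3f} * ELO")
```

A positive slope, as in this toy data, is what "a positive relationship" means in the discussion of the chart.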

This is reassuring in two ways: first, it looks like there is a positive relationship, and it should be ok to use ELO scores in subsequent analysis. And second, while ELO scores tell you something about how teams are likely to perform relative to one another, there is plenty of variability around that. Which is nice, because otherwise it’d make for a pretty boring event!

Is this World Cup weaker than previous ones?

Having convinced ourselves that ELO scores are a reasonable way of quantifying the relative strength of teams, we can move on to comparing the 2018 edition to previous World Cups.

How would we do this? Well, from the perspective of wanting to see interesting matches, especially early on, the average of the tournament as a whole doesn’t matter that much. What matters is which teams will actually face each other in their respective groups, and so we want to know whether this tournament has unusually weak, or unusually uneven groups. The following chart helps answer that question – it shows the distribution of ELO scores for each of the eight groups for each tournament since 1998. The fat dot is the average ELO score of the group: the further to the left, the lower the average and the weaker the group. The length of the coloured line tells you the distribution of teams in the group: its left end point is the ELO score of the weakest team, and its right end is the strongest team’s score.
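Each line in that chart boils down to three numbers per group: the weakest team's score, the group average and the strongest team's score. A small Python sketch with invented ratings (not the real group data) shows the calculation.

```python
from statistics import mean

def summarise_group(ratings):
    """Return the (weakest, average, strongest) rating of one group."""
    return min(ratings), mean(ratings), max(ratings)

# Hypothetical ratings for two groups; not the real 2018 values.
groups = {"X": [1650, 1700, 1750, 1900], "Y": [1800, 1850, 1900, 2050]}
summary = {label: summarise_group(r) for label, r in groups.items()}
for label, (low, avg, high) in summary.items():
    print(f"Group {label}: {low} to {high}, average {avg}")
```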

So what can we see? First of all, there is indeed an unusually weak group this tournament: Group A. This shouldn’t come as a huge surprise, given that this is the group that contains Uruguay, Russia, Egypt and Saudi Arabia. Partly this is because the host country automatically gets put into Pot 1, and therefore cannot be drawn in the same group as one of the strongest teams. This meant that Russia (ELO score 1681) would take the place of one team with a minimum score of 1821 (Poland, the lowest-ranked of the Pot 1 teams), and drag the average of its group lower. As pointed out by 538, in addition to this, the other teams drawn into the group also had lower ELO scores. This need not have happened: the strongest possible group that could have been drawn with Russia would have been with Spain, Denmark and Serbia, with a group average of 1817, as opposed to the 1695 they actually got.

But the benefit of that is that the average of all the other groups is correspondingly higher, and the range of the other groups will also be a bit tighter at the margin (Saudi Arabia and Egypt would have been the weakest team in any other groups, and Russia would not have been far off). This has helped compensate for the failure of some high ELO teams to make the tournament, and made for tighter groups overall. So on balance, with the exception of Group A, the groups of the 2018 World Cup look no worse than those of the past few editions.

Which groups should you watch?

Most people who watch the World Cup will have one or more teams they support, and so they’ll watch matches involving those teams. But what about all the other games? What about the neutral viewer?

What makes a group worth watching? Obviously one would, all else equal, prefer to see the best teams play. But maybe even more important is unpredictability: a strong group as measured by the average ELO score might end up being quite boring if the differences between the teams are so large that you know ahead of time who will reach the knock-outs. What you want is to not know ahead of time who will take the top two spots. One measure of this is the difference between the second- and third-rated teams. The closer they are, the more uncertain the group. So how do the 2018 groups stack up?

This chart plots the average ELO score for the group on the x-axis and the difference between second- and third-ranked teams (as measured by pre-tournament ELO scores) on the y-axis. The green dots are the 2018 groups, with the “WC18” dot denoting the average for the tournament as a whole. By our measure, the best groups to watch are the ones near the bottom right: they have a high average ELO score and more than two teams strong enough to make it through. By that measure, the group to tune in for is F (Germany, Mexico, Sweden and South Korea), where the second spot in the group behind Germany is very much in play. Groups C (France, Peru, Denmark and Australia) and D (Argentina, Croatia, Iceland and Nigeria) are in similar shape, with one strong team expected to finish first and the other spot up for grabs. Group G (Belgium, England, Panama and Tunisia) is potentially the least exciting group by this measure: Belgium and England have substantially higher ELO scores than Panama or Tunisia and are likely to cruise through. Not captured by our statistic, but also relevant, is the fact that the second-placed team in this group goes on to face the winner of Group H (Pot 1 team: Poland), which is not so difficult a match as to be avoided at all costs.
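The two quantities plotted in that chart are easy to compute for any group. The Python sketch below uses two invented groups, labelled "open" and "settled" for illustration. Neither the ratings nor the labels come from the real 2018 data.

```python
def group_metrics(ratings):
    """Average rating and the gap between the second- and third-rated teams."""
    ordered = sorted(ratings, reverse=True)
    return sum(ordered) / len(ordered), ordered[1] - ordered[2]

# Hypothetical groups: one with a close race for second place, one without.
groups = {
    "open": [2050, 1850, 1840, 1700],
    "settled": [2000, 1950, 1700, 1650],
}
metrics = {label: group_metrics(r) for label, r in groups.items()}
most_open = min(metrics, key=lambda g: metrics[g][1])
print(metrics)
print("smallest gap between second and third:", most_open)
```

A small gap combined with a high average corresponds to the bottom right of the chart.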

Summary

So what have we learned? Three things:

ELO scores are a quantitative measure of relative team strength that is easy to source and has at least some useful correlation with subsequent team performance, so it seems a good basis for analysis;

the 2018 World Cup has one very weak group, but is not really worse on average than previous editions… and because of this weak group, all others are a bit tighter and stronger than they would have been otherwise; and

if you’re a neutral viewer who wants to see good teams play in an uncertain group, and you need to economise on your time (or, depending on the timezone: sleep), you could do worse than prioritising groups F, C and D.

None of this is a guarantee that the games themselves will be any good, of course! Sometimes a close situation in which teams have a lot to lose makes for tight matches in which teams work harder to stop conceding goals than they do to score them. And of course, ELO scores are hardly the be-all and end-all of things: the bottom right points on our first chart above tell you that big upsets happen. But, short of watching all games, this little investigation seems as good a starting point as any to make sure you get a good experience out of watching the tournament.

In the next post of the series, I’m going to have a look at which teams got lucky during the draw, and which ones did not.

PS: As mentioned above, the data for this was pulled from wiki in early December, and the charts and analysis were done in R. I’m happy to share the code on request.

Employment Hysteresis from the Great Recession (In full-population longitudinal data, I find that exposure to a 1-percentage-point-larger 2007-2009 local unemployment shock caused working-age individuals to be 0.4 percentage points less likely to be employed at all in 2015, evidently via labor force exit)

On migrants moving in: Tipping and the effects of segregation (‘We examine the effect of ethnic residential segregation on short- and long-term education and labour market outcomes of immigrants and natives’; large pdf)