
Wednesday, 4 September 2013

In my experience, there's no such thing as luck... Except in football.

Football's back! Hurrah!

There'll be an update on the predictive model in the near future, but suffice to say for the moment that it's not dead, which is a huge relief. The predictions will be moving off Wallpapering Fog so that I can keep accessing Opta's data, but we'll be giving the bookies another good pasting from a new home this season, starting in a month or so's time. Stay tuned.

In the meantime, I've been thinking about luck.

There's obviously a lot of luck in football. Games are low scoring and that means one extra goal is very valuable, which is why we celebrate scoring so much vs. a sport like basketball, where it almost makes more sense to celebrate thwarted opposition attacks than your own successful ones.

The margins in football are small. Overall, in the past five seasons, 9.2% of shots were goals and 1-1 is the most likely scoreline in a single game. In that context, the possibility of an additional 'lucky' shot leading to a 2-1 win, rather than a 1-1 draw, is very real.

I'm far from the first person to work this out.

James Grayson has posted an excellent analysis, which concludes that the average team in the English Premier League has a random points variation of +/- 8 points. That is, whatever your 'correct' league placing, in any given season you'll do 8 points better or worse than that, just through pure chance.

Zach Slaton follows up this work on Forbes, extending it into Arsenal's chances of a top four finish and speculating about whether they have been lucky in recent years with Champions League qualification.

And The Numbers Game concludes that football results are 50% luck. (Say the reviews. Hands up, I haven't read it yet, so I can't treat this claim fairly. I've ordered a copy.)

I don't like luck. Or rather, I feel strongly that if a large proportion of football results are down to luck, then we need to be very sure we measure that proportion correctly. If nothing else, how do you tell a manager whose job hinges on avoiding relegation that he might as well flip a coin, because it's out of his control? Or tell a mid table side that they'll finish anywhere between eighth and sixteenth and there's nothing they can do about it? I know that's not exactly how probability works, but it may well be how the analysis is perceived. Which is important.

I want to have a crack at this question myself using a bottom-up approach rather than looking at points variances across a league. It might come out with exactly the same answer, but I like building analyses from the ground up because you can see the assumptions more easily and it's often easier to communicate what you did, rather than asking for an audience to trust in a complex formula.

So how do we build a ground up analysis of luck?

I'm going to begin with shooting, which is an assumption running through this whole analysis. You'll see why in a second.

Across the past five seasons, the average EPL team had 14.3 shots per game and scored with 9.2% of those. We can take the 14.3 shots and do some basic (ahem, I didn't have to revise these methods at all, honest) probabilities...

In simple terms, a shot outcome is 1 or 0. Either the shot is a goal (1) or it isn't (0).

We've just seen that the chance of a single shot going in is 9.2%, so the chance of it not going in is 90.8%.

The chance of a team having 14 shots and not scoring with any of them is 0.908 * 0.908 * 0.908... (14 times over)

= 0.908^14

≈ 0.26

So for an average team, taking an average number of average quality shots, there's a 26% chance they don't manage to score at all in any one game.
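If you'd rather check that sum than take it on trust, here's a minimal Python sketch. The variable names are mine, and it assumes, as above, that every shot is an independent 9.2% chance and that 14.3 shots per game is rounded down to 14.

```python
# Figures quoted in the post: 9.2% conversion and 14 shots (14.3 rounded down).
conversion_rate = 0.092
shots = 14

# Every shot has to miss, so multiply the miss probability by itself 14 times.
p_no_goals = (1 - conversion_rate) ** shots

print(f"Chance of failing to score: {p_no_goals:.2f}")
```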

If you want to see what the chances are that they score once, or twice, or more, then you need the binomial distribution. It gets a bit complicated and you need to use a distribution, because any combination of the 14 shots might go in, or not, and that's a lot of different combinations.

Here are the chances for a team that takes 14 shots in a game, of scoring different numbers of goals.
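These chances can be recomputed with a short sketch of the binomial sum. It isn't the spreadsheet behind the original figures: the function name `goal_probabilities` is made up for illustration, and it assumes every shot is independent with the same 9.2% chance of going in.

```python
from math import comb


def goal_probabilities(shots, conversion_rate):
    """Binomial chance of scoring 0, 1, ..., `shots` goals from `shots` attempts."""
    miss_rate = 1 - conversion_rate
    return [
        comb(shots, goals) * conversion_rate**goals * miss_rate ** (shots - goals)
        for goals in range(shots + 1)
    ]


probabilities = goal_probabilities(shots=14, conversion_rate=0.092)

# Only the first few rows matter; beyond five goals the chances are tiny.
for goals, chance in enumerate(probabilities[:6]):
    print(f"{goals} goals: {chance:.1%}")
```

`comb(14, goals)` counts how many different sets of shots could be the ones that go in, which is the "lot of different combinations" problem mentioned above.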

And here comes a big assumption. In reality, there are 'score effects' in football, which this analysis isn't going to consider. Score effects mean that certain scores are 'sticky' because they encourage teams to sit back and defend, while other scorelines let a team relax and attack. Think about what happens when an important game is at 1-0 going into the last 10 minutes, compared to if one team is already 4-0 up.

If we ignore score effects, what might the result look like, for two exactly equal and average teams playing against each other? They'll draw, right? Well no, not usually...

They'll have 14.3 shots each (rounded to 14 for the sums) and score 9.2% of those.

Here are the possibilities for a draw:

So two hypothetical teams which are exactly identical in every way and take an average number of shots each, will only draw 27% of the time.

The other 73% of the time, they share the wins: 36.5% each.
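As a rough check on those numbers, here's a sketch that adds up the chance of every level scoreline (0-0, 1-1, 2-2 and so on) for two identical teams. It assumes 14 independent shots each at 9.2% and no score effects, and `goal_probabilities` is a helper name of my own.

```python
from math import comb


def goal_probabilities(shots, conversion_rate):
    """Binomial chance of scoring 0, 1, ..., `shots` goals."""
    miss_rate = 1 - conversion_rate
    return [
        comb(shots, goals) * conversion_rate**goals * miss_rate ** (shots - goals)
        for goals in range(shots + 1)
    ]


team = goal_probabilities(14, 0.092)

# A draw needs both sides to land on the same goal count.
# The teams are identical, so each level scoreline has chance P(k) * P(k).
p_draw = sum(chance * chance for chance in team)

# Identical teams split whatever is left evenly.
p_win_each = (1 - p_draw) / 2

print(f"Draw: {p_draw:.1%}, each team wins: {p_win_each:.1%}")
```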

This is, I think, the extreme example of luck in football. The game should be a draw, but 73% of the time it won't be. You could say that in 73% of these games, the result is being determined by chance. Overall of course, it evens out over a large number of games, but it doesn't in a single game and these teams will only play each other twice per season.

I was drawn into this topic because I didn't intuitively like the levels of luck that were being suggested and now I'm saying some results are 73% luck. Damn.

If you double the shot conversion rate, you get even fewer draws, coming out at 20% of games. Teams draw most often when either they don't shoot, or when the shot conversion rate is very low, which makes sense. At only five shots per team, the chances of a draw increase to 48%, with a lot more 0-0 and 1-1 results.
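The same sum can be wrapped in a function to try out different shot counts and conversion rates. This is a sketch under the same assumptions (two identical teams, independent shots, no score effects), and `draw_probability` is a name I've invented for it.

```python
from math import comb


def draw_probability(shots, conversion_rate):
    """Chance that two identical teams finish level, ignoring score effects."""
    miss_rate = 1 - conversion_rate
    return sum(
        (comb(shots, k) * conversion_rate**k * miss_rate ** (shots - k)) ** 2
        for k in range(shots + 1)
    )


scenarios = [
    ("average game", 14, 0.092),
    ("double the conversion rate", 14, 0.184),
    ("only five shots each", 5, 0.092),
]
for label, shots, rate in scenarios:
    print(f"{label}: {draw_probability(shots, rate):.0%} draws")
```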

That's enough about draws. What if we look at wins?

We should pause here for a moment and define what we mean by 'luck'. I'm taking it to mean that the best team doesn't win, even if that best team is only very slightly better than their opponent. If the lower skilled team wins or draws, purely through how the dice rolls on shot conversion, then they were lucky.

This isn't totally satisfactory and goes to the heart of why I don't really like talking about luck. It's easier to call the random variation in a team's total season points 'luck', than to label a single game 'lucky' but the concept is the same - it's about winning points you don't deserve, or losing points that you do. If a non-league underdog beats a Premier League team 1-0, having taken only one shot to the superior team's twenty shots, were they lucky to win? I think most people would agree that yes, they got lucky. The newspaper sports pages wouldn't, they'd call it a giant killing, but then, if Goliath should squash David nine times out of ten, David got lucky.

Luck kills much of the narrative that we love about football. I watched Exeter City get a 0-0 draw away at Old Trafford in 2005 and it's the best game I've ever been to. Were we lucky? Of course we were. On a different day, Scholes scores and Man U win. On most days, Exeter would get battered.

But to just label that result 'lucky' and dismiss it, is to dismiss a lot of what makes football great. If a manager sets up a team to have a 1% chance of winning and wins, he's lucky. But 20% chance? 30%? I'd prefer to say you're giving yourself more chance, than that you're lucky.

However, for this analysis we need a dividing line. For the rest of this post, if you beat or draw with a superior team, purely through the way the dice rolls on shot conversion, then you're lucky.

The draw example above was a fun starting point but it's not really sensible because you'll never really have two exactly matched teams. It can give us a good idea of maximum luck though because if we assume that Team A is very, very slightly better than Team B, then we get:

Team A should win. They're the marginally better team and Team B can only beat them by being lucky. The chances Team A don't win are 63.5% (the chance that Team B wins, plus the chance of a draw).

So in a very evenly matched game between two statistically average teams, we've got 63.5% of the result being down to luck.

Let's ditch the hypothetical average team and bring in some real ones. Here are average shots per game and goal conversion rates for each EPL team, averaged across the past five years.

Properly calculating the chances that a team will win a game, rather than draw or lose it, is slightly fiddly, because there are many combinations of scores that will win you a game. We work out the chance that a team will score 1 goal, with the opposition only getting 0, then of the team getting 2 goals, with the opposition only getting 1 or 0, then of getting 3 goals, with the opposition scoring fewer than that... and so on. Then we add up all those different combinations that would give you a win and you have the overall chances of winning.
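In code, that adding up is a loop over every possible scoreline. Here's a sketch, assuming the two teams' goal counts are independent binomials. The function names are mine, and the "stronger side" figures (17 shots at 11%) are hypothetical placeholders, not any real team's numbers.

```python
from math import comb


def goal_probabilities(shots, conversion_rate):
    """Binomial chance of scoring 0, 1, ..., `shots` goals."""
    miss_rate = 1 - conversion_rate
    return [
        comb(shots, goals) * conversion_rate**goals * miss_rate ** (shots - goals)
        for goals in range(shots + 1)
    ]


def match_outcomes(shots_a, rate_a, shots_b, rate_b):
    """Return (P(A wins), P(draw), P(B wins)) by summing over all scorelines."""
    win = draw = loss = 0.0
    for a, p_a in enumerate(goal_probabilities(shots_a, rate_a)):
        for b, p_b in enumerate(goal_probabilities(shots_b, rate_b)):
            joint = p_a * p_b  # chance of this exact scoreline, a-b
            if a > b:
                win += joint
            elif a == b:
                draw += joint
            else:
                loss += joint
    return win, draw, loss


# HYPOTHETICAL stronger side (17 shots at 11%) against the league-average team.
win, draw, loss = match_outcomes(17, 0.11, 14, 0.092)
print(f"Win {win:.0%}, draw {draw:.0%}, lose {loss:.0%}")
```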

To see what that looks like, here are Manchester United's goal chances - based on the table above - vs. the average team that we saw earlier.

Based on these chances, Manchester United would be expected to beat a statistically average team 55% of the time, with 22% of games drawn and 23% lost.

(I know the opposition are unlikely to get their regulation 14 shots against Man U. We'll get to that in a minute...)

If we say that Man U are the better team and that the fairest result is a Man U win, then the opposition are 'lucky' 45% of the time, when the game ends in a draw or a loss for United.

Now we've got a spread of luck, from 63.5% when two teams are almost equally matched, to 45% when Man U are playing. That feels better. Man U have historically dominated games and left less space for random chance.

We can do this exercise for the whole list of teams above. How much luck is there in the result, for each EPL team, when playing against an average opponent who gets off 14 shots, with a conversion rate of 9.2%?
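Here's how I'd sketch that calculation. It treats the "luck coefficient" as one minus the chance that the more likely winner actually wins, which is my reading of the definition above. The two teams in the dictionary are hypothetical, purely to show the shape of the output; they are not the real five-year averages.

```python
from math import comb


def goal_probabilities(shots, conversion_rate):
    """Binomial chance of scoring 0, 1, ..., `shots` goals."""
    miss_rate = 1 - conversion_rate
    return [
        comb(shots, goals) * conversion_rate**goals * miss_rate ** (shots - goals)
        for goals in range(shots + 1)
    ]


def luck_coefficient(shots_a, rate_a, shots_b, rate_b):
    """1 - P(the likelier winner wins): the share of results the weaker side 'steals'."""
    goals_a = goal_probabilities(shots_a, rate_a)
    goals_b = goal_probabilities(shots_b, rate_b)
    p_a_wins = sum(pa * pb for a, pa in enumerate(goals_a)
                   for b, pb in enumerate(goals_b) if a > b)
    p_b_wins = sum(pa * pb for a, pa in enumerate(goals_a)
                   for b, pb in enumerate(goals_b) if b > a)
    return 1 - max(p_a_wins, p_b_wins)


AVERAGE_OPPONENT = (14, 0.092)  # 14 shots at 9.2%, as quoted in the post

# HYPOTHETICAL attacking figures: (shots per game, conversion rate).
teams = {"Strong FC": (20, 0.12), "Weak Town": (9, 0.07)}

for name, (shots, rate) in teams.items():
    print(f"{name}: {luck_coefficient(shots, rate, *AVERAGE_OPPONENT):.0%} luck")
```

Taking the larger of the two win chances is what makes luck work both ways: a heavy favourite and a heavy underdog both leave little room for a surprise.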

Luck works both ways. Manchester United have a low luck coefficient because they're likely to win; Middlesbrough also do, but because they're likely to lose.

Following me so far? There's still far too much luck here though, because we haven't introduced defence yet.

As I mentioned earlier, Manchester United don't let the average team take 14 shots, or let them convert those shots at 9.2%. They defend much better than that.

Here's each team's attacking performance - that we saw earlier - and now also average defensive performance. The extra numbers show us how many shots the average opponent manages to hit against each team and how many of those go in. Again this is all summarising over the past five years of the English Premier League.

The final piece of the puzzle is to set up each team against their own average opponent, instead of always using the general average opponent across the whole league.
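To make the difference concrete, here's a sketch that gives a team a profile with both attacking and defensive numbers, then compares its luck against the general league-average opponent with its luck against its own average opponent. The profile values are hypothetical, and `TeamProfile` and `luck` are names invented for this example.

```python
from math import comb
from typing import NamedTuple


class TeamProfile(NamedTuple):
    shots_for: int        # shots the team takes per game
    rate_for: float       # share of those shots that go in
    shots_against: int    # shots the team concedes per game
    rate_against: float   # share of conceded shots that go in


def goal_probabilities(shots, rate):
    """Binomial chance of scoring 0, 1, ..., `shots` goals."""
    return [comb(shots, k) * rate**k * (1 - rate) ** (shots - k)
            for k in range(shots + 1)]


def luck(shots_a, rate_a, shots_b, rate_b):
    """1 - P(the likelier winner wins), ignoring score effects."""
    goals_a = goal_probabilities(shots_a, rate_a)
    goals_b = goal_probabilities(shots_b, rate_b)
    p_a_wins = sum(pa * pb for a, pa in enumerate(goals_a)
                   for b, pb in enumerate(goals_b) if a > b)
    p_b_wins = sum(pa * pb for a, pa in enumerate(goals_a)
                   for b, pb in enumerate(goals_b) if b > a)
    return 1 - max(p_a_wins, p_b_wins)


# HYPOTHETICAL dominant side: shoots a lot and concedes few, poor-quality shots.
dominant = TeamProfile(shots_for=17, rate_for=0.11, shots_against=10, rate_against=0.08)

luck_vs_league_average = luck(dominant.shots_for, dominant.rate_for, 14, 0.092)
luck_vs_own_opponent = luck(dominant.shots_for, dominant.rate_for,
                            dominant.shots_against, dominant.rate_against)

print(f"vs league-average opponent: {luck_vs_league_average:.0%}")
print(f"vs own average opponent:    {luck_vs_own_opponent:.0%}")
```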

Manchester United's own and opposition scoring chances now look like this, which feels a lot more like a real picture of Sir Alex on a Saturday afternoon, against a mid-table side:

Here's what that does to the luck coefficients for each team:

We end with a spread of results that are happening by chance, from Manchester United at the bottom end, where their average opponent can achieve a result through luck only 32% of the time, to Fulham where a very high proportion of the result - 63% - is being governed by chance.

It's worth stating that this doesn't mean Fulham are particularly 'lucky', or even that they're a bad team. It just means that they're incredibly average, so when pitted against an average opponent, the result could be anything.

That's it for single game probabilities and if you've got any feedback on the methodology, or you want to tell me I've got my sums wrong, please jump on the comments below. The next step is to take these and apply them to a full season, to see if Arsenal really are fortunate to keep making that top four and also maybe try to work out whether relegation really is a lottery.

Thanks for the comment. Completely agree that the results are intuitive and in fact, if they weren't, I'd assume I'd done something wrong in my maths! The aim was to put some hard numbers on that intuition, which I hope this is a first step towards achieving.

Don't worry on the predictive model... I can't run it until I have more data on the new players in the EPL, but it will be coming back soon. I'll post here when it's up in its new home.

Interesting and very transparent analysis. "David v Goliath" reference brings up an interesting point about tactics. David played the game that suited him (slingshot instead of armour) and made Goliath's strengths useless (to a point). I'm not sure I'd call it entirely luck. Calling it luck assumes no tactics or strategic risk-taking. Same applies to football too, IMO.

So bringing it back to the prediction model and using the predictions to beat the bookies, how would this be applied?

Would you suggest that where there are two teams of an 'average' nature with high luck proportions the game should be steered clear of? Or could this be made more scientific and be used to model a certainty score for the predictions which arise from the model? So the predictions are produced using the existing model but then there is a separate measure based on the luck factor which says how likely the prediction is to occur?

I'm not so sure this one has an application to the prediction model, so much as setting a limit for its accuracy.

The bookies' odds and my predictive model both score just above 50% of results called correctly. This luck analysis should be a way to see what's the best you could ever do with the model, because you can't predict the lucky part of the result.

In terms of betting, if two teams are exactly evenly matched but the bookies don't think they are, you've still got a margin, even though there's a lot of luck in the final result.

I don't understand why you changed the definition of luck normally used. The definition you use is not continuous, in the sense that the luck for 2 equal teams is calculated as the probability of any win, whereas the luck for 2 almost equal teams is calculated as the probability of the slightly better team not winning. You should keep using the definition of variance not explained by the difference in skill, as in: http://jameswgrayson.wordpress.com/2013/04/07/the-variance-of-points-in-the-premiership-and-la-liga/