Statistical insignificance

nfl

My gut instinct tells me that NFL running backs are some of the most poorly treated athletes in the world of sport. The rules protecting running backs from dangerous hits are much less strict than those for wide receivers or quarterbacks, which leads to a far larger number of career-ending injuries. Teams know how fragile running backs are and are less inclined to offer them guaranteed money in their contracts (money that is paid even if a player cannot keep playing because of an injury), which significantly lowers the career earnings of an unlucky running back. On top of that, coaches treat running backs as expendable because of the simplicity of their task, and will often drop an injured one for a healthier model; given the pyramid-scheme nature of the NFL, there will always be one available. Given enough data and enough time, I would like to prove that all of the above is true.

However, for this post I want to show that a running back’s age affects their ability to play in the league: not only that a running back becomes less likely to have a job as they get older, but that simply being on the wrong side of 30 dramatically reduces their chance of having one.

What I have: A database of all players currently active in the league, and a historic database of all drafted players.

What I’m going to do with it: Show that running backs over 30 are disproportionately cut from NFL rosters compared to other skill positions.

Do you think that if you flipped a coin in a mint, it would show heads more often than tails? Imagine we set up a small coin-stadium in or adjacent to the mint where the coin was made, where other coins would sit around watching the coin get flipped. Say we first flipped the coin a bunch of times outside the stadium and showed that it was roughly 50/50 whether it came up heads or tails, but then we went back to this mint-stadium, flipped the coin 3,879 times, and it turned up heads 2,219 times. With a simple statistical test, you can show that the probability of a 50/50 coin giving this result in the stadium is 0.000000000256%.
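For anyone who wants to try the coin test themselves, here is a minimal sketch in Python. The post does not say which test produced the figure above, so this assumes a one-sided exact binomial test against a fair coin; the function names are my own, and the number it prints need not match the one quoted.

```python
from fractions import Fraction
from math import comb, sqrt


def binomial_upper_tail(n: int, k: int) -> float:
    """P(X >= k) for X ~ Binomial(n, 1/2), computed exactly with integers."""
    # Count the outcomes with at least k heads, then divide by all 2^n outcomes.
    favourable = sum(comb(n, i) for i in range(k, n + 1))
    return float(Fraction(favourable, 2 ** n))


def z_score(n: int, k: int) -> float:
    """Number of standard deviations between k heads and the expected n/2."""
    return (k - n / 2) / (sqrt(n) / 2)


# Counts taken from the coin-stadium example in the text.
flips, heads = 3879, 2219
p_one_sided = binomial_upper_tail(flips, heads)

print(f"z = {z_score(flips, heads):.2f}")
print(f"one-sided p = {p_one_sided:.3e}")
```

The exact sum avoids floating-point underflow for tails this small; a two-sided test or a normal approximation would give a different number, which is one reason the choice of test matters.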

Football is not a coin. However every team – no matter how good or bad – plays 16 games in the regular season: 8 of those at their own stadium and 8 of those at an opponent’s stadium, so a good team will play at home as much as a bad team will. Yet when you run through the stats, the ‘home field advantage’, i.e. that the home team is more likely to win than the away team, is more statistically significant than the detection of the Higgs boson.
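Significance in physics is usually quoted in "sigma", so comparing the two claims means putting both on that scale. The sketch below is my own illustration, not the analysis from the post: it assumes a normal approximation to a fair-coin null and a one-sided tail, and the function names are hypothetical. The 5-sigma threshold is the conventional discovery standard in particle physics.

```python
from math import erfc, sqrt


def sigma_to_p(sigma: float) -> float:
    """One-sided tail probability of a standard normal beyond `sigma`."""
    return 0.5 * erfc(sigma / sqrt(2))


def wins_to_sigma(home_wins: int, games: int) -> float:
    """Sigma of `home_wins` out of `games` if home and away were 50/50.

    Assumes no ties and independent games, which is a simplification.
    """
    return (home_wins - games / 2) / (sqrt(games) / 2)


# The conventional particle-physics discovery threshold.
print(f"p at 5 sigma = {sigma_to_p(5.0):.3e}")
```

To make the comparison, pass your own home-win and game counts to `wins_to_sigma` and check whether the result exceeds 5.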

What I’ve got: 14 years of regular season NFL data (2000-2014) – a few thousand games, half a million plays.

What I’m going to do with it: Try and find which bits of a football game are affected by ‘home field advantage’ in a (fairly) rigorous manner.