Modifying Football's Pythagorean Theorem

In his 1980 Baseball Abstract, Bill James introduced Pythagorean Winning Percentage, a statistic that could predict a team's future winning percentage more precisely than its simple win/loss record. James stated that using runs scored and runs allowed provided a more accurate look at a team's record, because it filtered out the luck factors involved in games. The name stems from the formula, which reminded James of the Pythagorean theorem.

According to James,

expected winning percentage = Runs Scored^2 / (Runs Scored^2 + Runs Allowed^2)

Later attempts found that an exponent of 1.82 gives better results than 2. The formula can be used in other sports as well, with an exponent of 14 (or 16.5) for basketball and 2.37 for football.
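As a quick illustration of the formula, here is a minimal Python sketch. The function name and the sample team (400 points scored, 300 allowed, 16 games) are made up for the example and are not from James' work.

```python
def pythagorean_win_pct(scored, allowed, exponent=2.0):
    """Expected winning percentage from scoring totals.

    scored / allowed are season totals (runs or points);
    exponent is 2 in James' original formula.
    """
    s = scored ** exponent
    a = allowed ** exponent
    return s / (s + a)


# Hypothetical football team: 400 points scored, 300 allowed, 16 games.
pct = pythagorean_win_pct(400, 300, exponent=2.37)
expected_wins = 16 * pct
print(f"expected win pct: {pct:.3f}, expected wins: {expected_wins:.2f}")
```

Swapping the exponent argument is all it takes to move between sports, which is what makes the choice of exponent the interesting question.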

But that latter number is the subject of this article. Is 2.37 really the best exponent to gauge a team's success?

Many attempts have been made to find the perfect exponent for the formula, with two efforts leading the way: Clay Davenport's Pythagenport formula and David Smyth's Pythagenpat formula. Both compute a separate exponent for each team rather than fixing one exponent for the whole league. Davenport's formula is 0.45 + 1.5 * log(RPG), where RPG is the total of runs scored and runs allowed per game played. Smyth's formula is simpler: RPG^0.285.
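The two per-team exponents can be sketched in Python as follows. One assumption to flag: the article writes only "log", and this sketch treats it as a base-10 logarithm. The function names are illustrative.

```python
import math


def runs_per_game(scored, allowed, games):
    """RPG: runs scored plus runs allowed, per game played."""
    return (scored + allowed) / games


def pythagenport_exponent(rpg):
    """Davenport's exponent. Assumes 'log' means log base 10."""
    return 0.45 + 1.5 * math.log10(rpg)


def pythagenpat_exponent(rpg, power=0.285):
    """Smyth's exponent: RPG raised to a fixed power."""
    return rpg ** power


# Hypothetical baseball team: 750 runs scored, 700 allowed, 162 games.
rpg = runs_per_game(750, 700, 162)
print(pythagenport_exponent(rpg), pythagenpat_exponent(rpg))
```

Either exponent is then plugged into the Pythagorean formula in place of the fixed 2.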

I used James' and Smyth's formulas to find the optimal exponent for football. Is it a definite number, or one that varies per team?

The following table shows (warning: this is a mouthful) the per-season average of the total difference between teams' Pythagorean wins and actual wins. (In essence, I went back to 2002, when the divisions were realigned, found the difference between Pythagorean wins and actual wins for every team using various exponents, added all of the differences up, then divided by six, the number of seasons from 2002 through last year.)

Exponent    Difference
2.38        32.035
2.39        32.008
2.40        31.981
2.60        31.587
2.61        31.580
2.62        31.577
2.63        31.576
2.64        31.575
2.65        31.577
2.66        31.578
2.67        31.581
2.68        31.586

As the table shows, an exponent of 2.64 yielded the smallest difference, lower than that of the standard 2.37 exponent. In fact, several other exponents also produced smaller differences than 2.37.
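The search behind the table can be sketched roughly as follows. This is a sketch under assumptions, not the exact procedure used for the numbers above: it takes "difference" to mean the absolute gap between Pythagorean wins and actual wins, and the two-team "seasons" are made-up toy data rather than real NFL results.

```python
def pythagorean_wins(scored, allowed, games, exponent):
    """Expected wins over a season for one team."""
    s = scored ** exponent
    a = allowed ** exponent
    return games * s / (s + a)


def total_difference(season, exponent):
    """Sum over teams of |Pythagorean wins - actual wins|.

    season is a list of (scored, allowed, wins, games) tuples.
    Using the absolute value is an assumption of this sketch.
    """
    return sum(
        abs(pythagorean_wins(pf, pa, g, exponent) - w)
        for pf, pa, w, g in season
    )


def average_difference(seasons, exponent):
    """Average of the per-season totals."""
    return sum(total_difference(s, exponent) for s in seasons) / len(seasons)


def best_exponent(seasons, candidates):
    """Candidate exponent with the smallest average difference."""
    return min(candidates, key=lambda e: average_difference(seasons, e))


# Toy data only: two made-up seasons of two teams each.
toy_seasons = [
    [(400, 300, 10, 16), (300, 400, 6, 16)],
    [(380, 310, 10, 16), (310, 380, 6, 16)],
]
candidates = [round(2.00 + 0.01 * i, 2) for i in range(101)]  # 2.00 to 3.00
print(best_exponent(toy_seasons, candidates))
```

With real data, each season would hold all 32 teams, and the list of seasons would run from 2002 onward.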

Then there's the Smyth method. Because it was developed for baseball, I had to find the power that, when applied in the formula, produced an average exponent of 2.64 (the optimal exponent in the James method). For football, that ideal power is 0.26 rather than .285. Here's how the Smyth method compared to the James method with the 2.64 exponent.

Method      Difference
James       31.575
Smyth       31.441
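The calibration step for the Smyth method, finding the power whose resulting exponents average out to a target such as 2.64, could be done with a simple bisection. The article does not say how the 0.26 figure was actually found, so this is only one plausible way to do it; the function names and the sample points-per-game values are hypothetical.

```python
def mean_exponent(ppg_values, power):
    """Average Smyth-style exponent (value ** power) across teams.

    ppg_values holds each team's combined points scored and
    allowed per game, the football counterpart of RPG.
    """
    return sum(v ** power for v in ppg_values) / len(ppg_values)


def calibrate_power(ppg_values, target, lo=0.0, hi=1.0, tol=1e-9):
    """Bisection search for the power whose mean exponent hits target.

    Relies on mean_exponent rising with power, which holds when
    every value is above 1, and on target lying between the mean
    exponents at lo and hi.
    """
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if mean_exponent(ppg_values, mid) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


# Made-up combined points per game for three teams.
sample_ppg = [36.0, 42.0, 48.0]
print(calibrate_power(sample_ppg, target=2.64))
```

The resulting power would then replace .285 in Smyth's formula before computing each team's Pythagorean wins.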

The Smyth method with the 0.26 exponent generated a lower difference than the James formula: 31.441 versus 31.575, a gap of 0.134, which is large next to the thousandths separating the best fixed exponents. Using a different exponent for each team turned out better than one particular number because the more points scored in a game by any one team, the higher the exponent should be (which is why basketball has such a large exponent). And contrary to popular belief, 2.37 is not the best exponent: quite a few others produce a smaller difference between Pythagorean wins and actual wins.