Ballpark Mathematics

April 21, 2009

Like the dawn of a new day, the start of the baseball season carries with it tremendous promise. These first few weeks provide a reprieve from the breakneck pace of March Madness, where every team is burdened with the knowledge that one loss is all it takes to deny it total victory. Instead, the major leagues are a product of the season in which they begin, and just as the warming weather invites us to spend weekend afternoons on grassy knolls looking for shapes in the clouds, so too do the opening games of the baseball season encourage us to let our hair down and reacquaint ourselves with this traditional American pastime.

Eventually, however, spring must give way to summer, and summer must give way to fall. As the days grow shorter, so does the window of opportunity for a team to make it into the playoffs, making every game in that final stretch increasingly important for teams on the cusp of earning a trip to the postseason. As with many things in life, the factors that determine which of those teams will make it through to the playoffs are not entirely within their control. Of course they must play to the best of their ability, but they must also hope that the teams in the race with them will falter.

For many, this is what makes the rush to the postseason so exciting. However, if you can mine the treasure trove of data that MLB generates every year, perhaps you can take some of the mystery out of which teams will ultimately prevail.

With this goal in mind, Professor Bruce Bukiet of the New Jersey Institute of Technology developed a mathematical model nine years ago to help predict how well each baseball team will perform throughout the season. His model has beaten the odds in several of the past years, and it has garnered enough recognition to earn this article on Yahoo News, which was posted toward the beginning of the month.

Predictions versus actuals for the National League Central. The rest of the prediction data can be found here.

What parameters does Prof. Bukiet use to model the season? The model itself seems to be somewhat shrouded in secrecy, but according to the article, "his model computes the probability of a team winning a game against another team with given hitters, bench, starting pitchers, relievers and home field advantage."
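Since the actual model isn't public, here is a toy sketch of what "computing the probability of a team winning a game against another team" could look like. To be clear, this is not Prof. Bukiet's method. It swaps in the well-known log5 estimate, which takes two teams' overall winning percentages, and adds a flat home field bonus. The function names, the team labels, the ratings, and the `home_edge` value are all made up for illustration, and the sketch ignores hitters, bench, and pitchers entirely.

```python
def log5(p_a: float, p_b: float) -> float:
    """Log5 estimate of the chance that team A beats team B.

    p_a and p_b are each team's overall winning percentage (0 to 1).
    Undefined when both are 0 or both are 1 (the denominator is zero).
    """
    return (p_a - p_a * p_b) / (p_a + p_b - 2 * p_a * p_b)


def home_win_probability(p_home: float, p_away: float,
                         home_edge: float = 0.04) -> float:
    """Log5 estimate plus a flat home field bonus, clamped to [0, 1].

    home_edge is an assumed value, not taken from any published model.
    """
    return min(1.0, max(0.0, log5(p_home, p_away) + home_edge))


def expected_wins(team: str, schedule: list[tuple[str, str]],
                  strength: dict[str, float],
                  home_edge: float = 0.04) -> float:
    """Sum a team's win probabilities over a list of (home, away) games."""
    total = 0.0
    for home, away in schedule:
        p_home = home_win_probability(strength[home], strength[away], home_edge)
        if team == home:
            total += p_home
        elif team == away:
            total += 1.0 - p_home
    return total


if __name__ == "__main__":
    # Hypothetical ratings and a tiny four-game schedule, purely illustrative.
    strength = {"Team A": 0.580, "Team B": 0.450}
    schedule = [("Team A", "Team B"), ("Team A", "Team B"),
                ("Team B", "Team A"), ("Team B", "Team A")]
    for name in strength:
        print(name, round(expected_wins(name, schedule, strength), 2))
```

Adding up per-game probabilities over a full schedule gives an expected win total for each team, which is the kind of number a season prediction ultimately boils down to. A more serious attempt would replace the single winning percentage with the lineup and pitching inputs the article mentions.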

So, if you're a Yankees fan or a Red Sox fan, this mathematical model has good news for you this year. The Cubs are poised for a good season as well. Of course, there's no guarantee that the model will give accurate predictions (and if you're a Giants or A's fan, you should hope that it won't), but based on its historical performance, the evidence suggests that these predictions will outperform other expert predictions.

Whether your team fares well under the model or not, the fact that this model can consistently beat the odds speaks to the power of mathematical modeling when the right parameters are used. Of course, you may be skeptical about the robustness of this model, given that it has stumbled at times and that information about it is kept somewhat secret.

So, if you think your team is undervalued, I'd encourage you to make a model that you think can best this one. Especially if your model predicts that the Giants will get their act together, already.
