Monday, June 4, 2012

Statistical Anomalies in Baseball - it's all in the presentation

Update, June 20: Here's a link to a related article in The Economist, about the statistics of perfect games, "An Imperfect Measure of Excellence."

Johan Santana's no-hitter for the Mets on Friday, the first in the 51 years of the team's existence, has been getting a lot of local press. A LOT - four long articles in yesterday's New York Times print edition, for example. But, at least according to this article in Slate, it's the no-hitter that's the statistical fluke, not the long wait for one. As Jim Pagels, the writer, puts it:

Johan Santana’s no-hitter is certainly remarkable—but not for the reasons the sports media are citing today. It’s noteworthy because a no-hitter itself is incredibly rare—not as rare as a four home run game, hitting for the cycle, or an unassisted triple play, but infrequent enough that a team can fairly easily go decades without one. Such a streak, in fact, isn’t statistically that improbable.

According to his calculations, the Mets had about a 1 in 100 chance of such a long streak, only slightly below the average. There's a more detailed look at the issue here, in Baseball Prospectus. This analysis is also more nuanced, coming as it does from a fan's perspective. The author, Craig Glaser, argues that, while the Mets were overdue for a no-hitter, the streak wouldn't become really rare until the Mets had played 10,000 or so games.
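For anyone who wants to play with the numbers, here is a rough sketch of the kind of calculation behind claims like these. It is not the method used in either article. It assumes every game is an independent trial with the same small chance of a no-hitter, and the per-game rate `P_ASSUMED` below is a made-up placeholder, not a figure from Slate or Baseball Prospectus. Under those assumptions, the chance of a drought lasting n games is simply (1 - p)^n.

```python
def prob_no_hitter_drought(games: int, p_per_game: float) -> float:
    """Chance that a team throws zero no-hitters across `games` games.

    Assumes each game is independent and has the same probability
    `p_per_game` of being a no-hitter for that team (a simplification).
    """
    return (1.0 - p_per_game) ** games


# Hypothetical per-game no-hitter rate for a single team.
# This is an illustrative assumption, not a value from either article.
P_ASSUMED = 1 / 1500

for games in (2_000, 5_000, 8_000, 10_000):
    chance = prob_no_hitter_drought(games, P_ASSUMED)
    print(f"{games:>6} games without a no-hitter: {chance:.4%}")
```

Swap in a different rate for `P_ASSUMED` and the answers move a lot, which is really the point of this post: how surprising the streak looks depends on the assumptions you bring to it.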

Bottom line? An unusual individual achievement, absolutely. An unusually long streak, even a curse, that has now been broken? The answer, as usual, depends on the framework of the question.