Here at FanGraphs, we’re big fans of Sean Smith’s CHONE projection system. It’s proven to be equally or more accurate than any other projection system out there, he provides the data for free on his site, and he’s a good guy.

So, today, with news from the game at a virtual standstill, I figured we should take a look at the projected standings that he recently put up, based on the CHONE forecasts and his playing time estimates.

None of the projections should be all that surprising. The East divisions are very good, the West divisions are not, and the AL is still better than the NL. Perhaps fans in LA will be surprised by the lack of wins projected for their two teams, but as we’ve talked about, the Angels’ win total was built on a house of cards last year, and the Dodgers haven’t re-signed Manny Ramirez yet.

I know that whenever a projection system is published, a bunch of you guys immediately look at the results and proclaim they’re too low, because the projected leaders have fewer wins/home runs/strikeouts/whatever than the previous historical leaders. As we always try to explain, that’s because of regression to the mean. We understand that the final AL West winner isn’t likely to have 85 wins. Even CHONE will agree with you on that.

Let’s walk through an example, shall we? The Angels are projected to win the AL West with an 85-77 record. But that’s just a mean projection, based on the range of probabilities of the Angels winning anywhere between 60 and 110 games. Obviously, at the extremes, the odds are very tiny, so the distribution of the probabilities will look like a bell curve. Actually, let’s just show it to you.

At each win total between 81 and 89, there’s a greater than 5% chance of that win total occurring, if we agree that the Angels are a true talent 85 win team. There’s less than a 1% chance of each win total below 72 or above 98, but those are still possibilities, even if they’re pretty unlikely. Those are the individual probabilities. Now let’s look at the cumulative probability.
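The article doesn’t say how this distribution was built. As a rough sketch, assuming each of the 162 games is an independent trial won with probability 85/162 (a binomial model, which is an assumption here and not necessarily what CHONE or the author used), the individual probabilities could be computed like this. The names `win_pmf` and `angels` are made up for illustration.

```python
from math import comb

GAMES = 162  # games in a regular season


def win_pmf(wins, true_talent_wins, games=GAMES):
    """Probability of exactly `wins` wins, ASSUMING every game is an
    independent trial with win probability true_talent_wins / games."""
    p = true_talent_wins / games
    return comb(games, wins) * p**wins * (1 - p) ** (games - wins)


# Individual probabilities for a team assumed to have 85-win true talent.
angels = {w: win_pmf(w, 85) for w in range(GAMES + 1)}

# Print a slice of the curve, from 60 to 110 wins in steps of five.
for w in range(60, 111, 5):
    print(f"{w:3d} wins: {angels[w]:6.2%}")
```

Under this assumed model the curve peaks at 85 wins and falls off on both sides, which is the bell shape described above.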

We find 76 wins at the 90% mark. In other words, we’d expect this Angels team to win at least 76 games 90% of the time. 50% gets you to 85 wins, while 10% gets you to 93 wins. So, while 85 wins is the mean for the Angels, and they have the highest mean of any team in the division, CHONE is not predicting that the division winner will finish with 85 wins.
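The cumulative marks can be sketched the same way. This again assumes a simple binomial model of 162 independent games (an assumption, not a method confirmed by the article), and the helper names `prob_at_least` and `wins_cleared_with_prob` are hypothetical.

```python
from math import comb

GAMES = 162  # games in a regular season


def win_pmf(wins, true_talent_wins, games=GAMES):
    """Probability of exactly `wins` wins under the ASSUMED binomial model."""
    p = true_talent_wins / games
    return comb(games, wins) * p**wins * (1 - p) ** (games - wins)


def prob_at_least(wins, true_talent_wins, games=GAMES):
    """Probability of finishing with `wins` or more wins."""
    return sum(win_pmf(w, true_talent_wins, games) for w in range(wins, games + 1))


def wins_cleared_with_prob(level, true_talent_wins, games=GAMES):
    """Largest win total the team reaches or beats with probability >= level."""
    for wins in range(games, -1, -1):
        if prob_at_least(wins, true_talent_wins, games) >= level:
            return wins
    return 0


# The 90%, 50% and 10% marks for an assumed 85-win true-talent team.
marks = {level: wins_cleared_with_prob(level, 85) for level in (0.90, 0.50, 0.10)}
for level, wins in marks.items():
    print(f"{level:.0%} of the time: at least {wins} wins")
```

Because win totals are whole numbers, a mark can shift by one win depending on whether the cutoff counts "at least" or "more than" a given total, so a result within a win of the figures quoted above is consistent with them.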

The Angels have a 19.3% chance of winning 90+ games, based on this distribution. But they’re not the only team in the division. The A’s, with their projected 81-81 record, have a 6.7% chance of winning 90+ games in 2009. The Mariners, with their 78-84 projected record, have a 2.4% chance of winning 90+ games. And the Rangers, projected at 72-90, have a 0.1% chance of winning 90+ games. The sum of these probabilities is 28.5%, which slightly overstates the combined chance because it counts twice any season in which two teams both reach 90 wins. In other words, despite projecting the best team in the division to win 85 games, CHONE is still saying that there’s a better than one-in-four chance that the division winner will win 90+ games.
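Adding the four probabilities is a quick way to combine them, but a straight sum counts a season twice when two teams both get to 90. A small sketch of the difference, using the percentages quoted above as inputs and treating the four teams as independent (a simplifying assumption, since division rivals play each other often); the variable names are illustrative only.

```python
# Chance of a 90+ win season for each AL West team, as quoted in the article.
TAIL_90_PLUS = {
    "Angels": 0.193,
    "A's": 0.067,
    "Mariners": 0.024,
    "Rangers": 0.001,
}

# Straight sum: an upper bound on the chance that at least one team gets there.
simple_sum = sum(TAIL_90_PLUS.values())

# ASSUMING independence: 1 minus the chance that every team falls short.
all_fall_short = 1.0
for prob in TAIL_90_PLUS.values():
    all_fall_short *= 1.0 - prob
at_least_one = 1.0 - all_fall_short

print(f"sum of the four chances:         {simple_sum:.1%}")
print(f"at least one team (independent): {at_least_one:.1%}")
```

The two figures differ by about two percentage points, so the point stands either way: a division whose best team projects to 85 wins still has a better than one-in-four chance of producing a 90-win champion.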

Hopefully, this is somewhat helpful – when you look at projected standings, they are giving you relative strength from team to team. In every division, it’s pretty likely that some team will outperform their expected win total, and that the division winner will end up with more wins than the mean of the projected top team. This is not a flaw of projection systems – it is a reality of math.

[CHONE has] “proven to be equally or more accurate than any other projection system out there…”

Baseball Prospectus makes the same claim about their PECOTA projection system. What is the above statement based on? I’m not trying to be argumentative, but since these are two competing claims, I’m curious. Thanks.

BP is also a bunch of people that really care about baseball. It’s not an infomercial and it’s unfair to treat the people at BP like they don’t really believe in their product.

8 years 9 months ago

Fresh Hops

In the article Dave links there’s a really nice response to the idea that this or that projection system is better:
“I hope that more researchers evaluate the forecasting systems like David did, so that we can stop with the nonsense that one system is better than the other. As I’ve said in the past, it is a very dubious claim, akin to saying a team that wins 85 games is better than a team that wins 84 games. Marcel will win 83 or 84 games, the good forecasting systems will win 84 or 85, and the not so good will win 81 or 82.”

There’s far too much that’s unknown in baseball to create any kind of super projection system that can actually predict the season. For any projection system, random chance and unknowns will better explain deviations from predictions than shortcomings of the model. Might CHONE or PECOTA be ever so slightly more accurate than the other? Sure. But in any given season, CHONE and PECOTA are probably more likely to agree with one another than they are with reality, because random chance (something which by definition neither can predict) has more influence on the final standings than non-chance factors that these systems account for differently.