Evaluation of 2009 Pitcher Forecasts

To see how our projection system and several others performed, we compared the accuracy of each system’s 2009 pitcher projections to Marcel the Monkey’s projections. We have previously examined hitter projections, and you can see that study here.

We looked at a pool of 512 pitchers: all pitchers with at least 10 IP and prior MLB stats, or at least 25 IP and no prior MLB stats, so long as two or more systems provided a projection. When a system did not provide projections for some of these 512 pitchers, it was assigned the Marcel projection for those pitchers, so it could only beat (or lose to) Marcel on the projections it actually made. Of these 512 pitchers, 51 did not have prior major league stats and, as a result, all received the same projection from Marcel.

Oliver was the most comprehensive system, with projections for 510 of the 512 players. Steamer projected only 43% of the players (covering 60% of the innings) and Sporting News projected only 28% of the players (42% of the innings). CAIRO, Chone, ZiPS and PECOTA all projected at least 95% of the innings.
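The pool rule and the Marcel fallback described above can be sketched roughly as follows. This is only an illustration: the function names, the dictionary layout and the sample numbers are all made up here, not taken from the actual analysis.

```python
def in_pool(ip, has_prior_mlb_stats, n_systems_projecting):
    """Pool rule from the text: 10+ IP with prior MLB stats, or 25+ IP
    without them, provided at least two systems projected the pitcher."""
    min_ip = 10 if has_prior_mlb_stats else 25
    return ip >= min_ip and n_systems_projecting >= 2


def fill_with_marcel(system_proj, marcel_proj):
    """Use the system's own projection where it exists and fall back to
    Marcel's projection for every pitcher the system skipped."""
    return {pid: system_proj.get(pid, marcel_value)
            for pid, marcel_value in marcel_proj.items()}


# Made-up ERA projections keyed by made-up pitcher ids.
marcel = {"p1": 4.20, "p2": 4.55, "p3": 3.90}
partial = {"p1": 3.95}  # this hypothetical system only projected p1
filled = fill_with_marcel(partial, marcel)
print(filled)
```

On the filled-in pitchers the system and Marcel are identical, so any difference in accuracy comes only from the projections the system actually made.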

To make these results more compelling, and to give a sense of whether the differences between these systems are meaningful, we’re presenting the success of each system relative to Marcel as a W-L record over a 162-game schedule.

ERA Projections

System            W     L     W%     z score
ZiPS            100    62    0.618    3.01
CAIRO            95    67    0.585    2.18
Chone            91    71    0.564    1.62
Steamer          90    72    0.558    1.47
PECOTA           88    74    0.541    1.05
Oliver           82    80    0.503    0.08
Marcel           81    81    0.500    –
Fantistics       80    82    0.492   -0.20
Sporting News    73    89    0.454   -1.18

This is a big win for ZiPS and a very strong showing for CAIRO. Steamer and PECOTA beat Marcel soundly, and Oliver and Fantistics hold their own with the monkey. Not a great year for Sporting News.

How should you interpret these standings? The W-L records show you how well each system did when compared to the Marcel system (the baseline projection system). How likely is it that ZiPS was really no better than Marcel and simply got really lucky this year? Just as likely as it is that a 100-62 team was really just a .500 club that got lucky. Not very likely.

For the statistically inclined, what I did was calculate a z-score for the hypothesis that each system was as good as Marcel and match that up with the z-score for the hypothesis that a team with a given winning percentage is really a .500 team.
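As a rough sketch of the second half of that mapping, here is how a z-score can be turned into a 162-game record. It assumes the usual normal approximation for a true .500 team, where winning percentage has a standard deviation of 0.5 / sqrt(162). The function name is made up, and how each system's z-score against Marcel was computed is not shown here.

```python
import math

GAMES = 162


def z_to_record(z, games=GAMES):
    """Convert a z-score into the W-L record of a team that is the same
    number of standard deviations away from .500 over `games` games."""
    # Standard deviation of winning percentage for a true .500 team.
    sd = 0.5 / math.sqrt(games)
    win_pct = 0.5 + z * sd
    wins = round(win_pct * games)
    return wins, games - wins, round(win_pct, 3)


print(z_to_record(3.01))   # ZiPS ERA line: (100, 62, 0.618)
print(z_to_record(-7.71))  # CAIRO BB/9 line: (32, 130, 0.197)
```

Because wins are rounded to whole games, two systems with slightly different z-scores can end up with the same W-L record but different W%.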

None of the systems demonstrated a statistically significant ability to project the ERAs of the 51 players without major league stats. Granted, 51 players may simply be too small a sample in which to expect significant results when predicting ERA, particularly for guys who didn’t throw that many innings. This is why I am excited about Oliver and PECOTA projecting several years forward in time. These multi-year projections might prove useless but, without them, I think it will be difficult to discern which systems have the best grasp on minor league performances and scouting.

So, ZiPS wins big. Does this mean that you should simply use ZiPS pitching projections and you’ll dominate your fantasy league? ERA is, of course, only one of five categories in most fantasy leagues, which also rank teams in strikeouts, wins, saves and WHIP. Standings Gain Points (or SGPs) is a system used to determine how valuable a pitcher was across all five categories. How good were these systems at projecting total fantasy value?

SGP Projections

System            W     L     W%     z score
Fantistics       87    75    0.536    0.92
Marcel           81    81    0.500    –
Steamer          81    81    0.498   -0.06
Sporting News    81    81    0.498   -0.06
PECOTA           73    89    0.453   -1.19
CAIRO            70    92    0.430   -1.79

Only Fantistics beats Marcel. In fairness, only Fantistics and Sporting News are aimed primarily at the fantasy audience. The PECOTA projections used here are from their weighted mean spreadsheet and not from their depth charts, which are better tailored to the fantasy game. Oliver, Chone and ZiPS aren’t shown here since they didn’t predict saves, one of the components of SGPs.

IP Projections

System                 W     L     W%     z score
Fantistics            96    66    0.595    2.41
Sporting News         93    69    0.572    1.82
Pecota Depth Charts   91    71    0.564    1.62
Community             81    81    0.500    –

Looking here only at the systems that try to predict playing time, we see the main reason Fantistics is so successful at predicting fantasy value. Just as it did for hitters, Fantistics excelled at projecting playing time for pitchers. The “Community” line represents the fan-projected playing time that Marcel relied on.

Rounding out the fantasy stats, let’s look at wins, saves and WHIP.

Win Projections

System            W     L     W%     z score
Fantistics      102    60    0.627    3.23
PECOTA           95    67    0.586    2.20
Sporting News    92    70    0.568    1.74
Steamer          91    71    0.564    1.64
CAIRO            84    78    0.518    0.45
Chone            82    80    0.504    0.11
Marcel           81    81    0.500    –
ZiPS             80    82    0.491   -0.22

Another big win for Fantistics, and PECOTA looks very strong here as well. Not Marcel’s finest hour.

Before looking at saves I offer the following disclaimer: Steamer left projecting saves to the last minute (right before Dash and Peter and the rest of the Steamers were departing for Florida to tune up for high school baseball season). So, to the extent that Steamer had a system for projecting saves it was “Hey, Peter, come up with some save numbers for the guys you think are closers.” “Okay.” How did he do?

Save Projections

System            W     L     W%     z score
Sporting News    99    63    0.610    2.80
Fantistics       98    64    0.602    2.60
PECOTA           92    70    0.571    1.80
Steamer          91    71    0.562    1.57
Marcel           81    81    0.500    –
CAIRO            71    91    0.437   -1.60

Not bad, Peter. Not bad. He didn’t keep up with the experts at Sporting News and Fantistics but he did beat Marcel.

WHIP Projections

System            W     L     W%     z score
ZiPS             98    64    0.605    2.66
Steamer          94    68    0.583    2.11
PECOTA           91    71    0.561    1.55
Chone            90    72    0.557    1.46
Oliver           89    73    0.552    1.32
Fantistics       85    77    0.527    0.68
CAIRO            83    79    0.511    0.29
Sporting News    82    80    0.507    0.18
Marcel           81    81    0.500    –

This impressive showing by ZiPS is no surprise after its dominance in ERA. My understanding is that Marcel regresses hits for pitchers no more than any other counting stat. This makes it an easy target when it comes to WHIP.

Now that we’ve covered the fantasy baseball categories, let’s try to figure out what makes ZiPS so good at projecting ERAs by looking at the components of ERA.

K/9 Projections

System            W     L     W%     z score
Oliver           92    70    0.571    1.79
PECOTA           92    70    0.568    1.73
Chone            89    73    0.552    1.33
CAIRO            85    77    0.525    0.64
Steamer          85    77    0.524    0.60
ZiPS             82    80    0.505    0.13
Marcel           81    81    0.500    –
Sporting News    78    84    0.481   -0.48
Fantistics       74    88    0.455   -1.16

Marcel divides this category nicely into two groups: formula-based systems above and rotisserie experts below. Oliver and PECOTA take this category, and we’re no closer to figuring out what ZiPS is doing better than everyone else.

Strikeout rate projections provide a better basis on which to judge how well these systems evaluate minor league stats. Comparing the systems that project a wider pool of players to each other we get:

System    Correlation with actual K/9
Oliver    0.535
CAIRO     0.433
PECOTA    0.409
ZiPS      0.210

Oliver does thrive at applying MLEs to determine strikeout rates.
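For readers who want to run this kind of comparison themselves, here is a minimal sketch of a correlation between projected and actual K/9. It assumes a plain Pearson correlation, which may or may not be exactly what was used above, and the sample numbers are invented.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Invented projected and actual K/9 values for four pitchers.
projected_k9 = [6.1, 7.4, 8.8, 5.5]
actual_k9 = [5.8, 8.1, 8.0, 6.2]
print(round(pearson_r(projected_k9, actual_k9), 3))
```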

BB/9 Projections

System            W     L     W%     z score
Marcel           81    81    0.500    0.00
ZiPS             81    81    0.499   -0.01
Oliver           81    81    0.498   -0.06
Steamer          79    83    0.487   -0.34
PECOTA           76    86    0.468   -0.82
Chone            69    93    0.428   -1.83
Sporting News    68    94    0.422   -1.99
Fantistics       65    97    0.402   -2.50
CAIRO            32   130    0.197   -7.71

Two things jump out at us here. First, none of the systems beat Marcel! Okay, ZiPS and Oliver are essentially tied with Marcel, but no one beats it. Second, what the heck happened to CAIRO? CAIRO excelled at projecting ERAs, but how did it manage that with these walk projections? I went back to the original CAIRO spreadsheet to make sure I hadn’t messed up the data somehow, but I can’t find anything wrong with it. CAIRO simply had very extreme BB/9 projections. Here are the distributions of BB/9 projections for Marcel and CAIRO side by side:

BB/9 Percentile    Marcel (BB/9)    CAIRO (BB/9)
97.5%              2.24             1.15
90%                2.74             1.77
75%                3.13             2.31
50%                3.47             2.88
25%                3.75             3.53
10%                4.07             3.98
2.5%               4.42             4.57

The good news for CAIRO? This didn’t seem to affect their ERA projections as much as I would have expected. According to the equation for FIP, every additional BB/9 should increase ERA by a third of a run. And, controlling for K/9 and HR/9, an additional BB/9 led to an increase of 0.31 to 0.49 in ERA for the other systems. CAIRO ERAs only increased by 0.19 for every additional BB/9.

HR/9 Projections

System            W     L     W%     z score
PECOTA           97    65    0.600    2.54
CAIRO            96    66    0.590    2.28
Chone            95    67    0.587    2.21
ZiPS             94    68    0.580    2.04
Oliver           85    77    0.525    0.63
Marcel           81    81    0.500    –
Steamer          78    84    0.481   -0.48
Fantistics       57   105    0.352   -3.78

PECOTA, CAIRO, Chone and ZiPS all win big here. Steamer used a Marcel-like system to project HR/9, but for 2010 we’ll be using FB% and LD% to project HRs and, hopefully, this will be a major area of improvement for us. HR/9 is likely an afterthought for Fantistics since it’s not typically a fantasy category. Sporting News doesn’t even bother to project it. Fantistics, along with ZiPS, projected HR/9 most aggressively (the broadest distribution) while Marcel and Steamer, limited by their ignorance of batted ball data, projected the narrowest ranges.

Now let’s wrap K/9, HR/9 and BB/9 into a statistic for overall pitching goodness, which I’ll call ~FIP. I’d use FIP, but not all of these systems projected IBB and HBP, so I just left them out and calculated:

~FIP = (HR*13 + BB*3 - K*2) / IP
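Here is a small sketch of that calculation, using the formula exactly as written (no league constant, no IBB or HBP). The function names and the sample stat line are made up. The per-nine version shows where the "third of a run per BB/9" figure comes from: the walk coefficient of 3 divided by 9 innings.

```python
def approx_fip(hr, bb, k, ip):
    """~FIP from counting stats: (HR*13 + BB*3 - K*2) / IP."""
    return (13 * hr + 3 * bb - 2 * k) / ip


def approx_fip_from_rates(hr9, bb9, k9):
    """The same quantity from per-nine-inning rates."""
    return (13 * hr9 + 3 * bb9 - 2 * k9) / 9


# Invented line: 200 IP, 20 HR, 60 BB, 180 K.
print(approx_fip(20, 60, 180, 200))

# One extra BB/9 with HR/9 and K/9 held fixed adds 3/9 of a run.
delta = approx_fip_from_rates(0.9, 3.7, 8.1) - approx_fip_from_rates(0.9, 2.7, 8.1)
print(delta)
```

Since the constant is left out, ~FIP is not on the ERA scale. It is only meant for comparing one projection against another and against the actual ~FIP.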

~FIP Projections

System            W     L     W%     z score
Marcel           81    81    0.500    –
ZiPS             79    83    0.489   -0.29
Steamer          78    84    0.479   -0.54
PECOTA           77    85    0.478   -0.56
CAIRO            76    86    0.468   -0.83
Oliver           75    87    0.465   -0.88
Chone            72    90    0.442   -1.47

That’s right, none of these systems projected real ~FIP better than Marcel’s ~FIP. In fact, if Marcel had simply made these ~FIP predictions its ERA predictions, only CAIRO and ZiPS would have beaten Marcel in predicting ERA. Actually, in light of Tango’s recent blog entry, I should point out that Marcel would do better than three of these systems (and do better itself!) if it just relied on its BB and K predictions and calculated ERA = 5.4 + 3*(BB-K)/IP.
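As a quick illustration of that last formula, here is a sketch with an invented stat line. The function name is made up; the 5.4 constant and the coefficient of 3 come straight from the formula above.

```python
def era_from_bb_k(bb, k, ip):
    """ERA estimate from walks and strikeouts only: 5.4 + 3*(BB-K)/IP."""
    return 5.4 + 3 * (bb - k) / ip


# Invented projection: 60 BB and 180 K over 200 IP.
print(round(era_from_bb_k(60, 180, 200), 2))
```

A pitcher with as many walks as strikeouts lands at 5.40, and every strikeout per inning beyond his walk rate takes three runs off that figure.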

For Steamer, I think the big lessons from all of this are:

Use FB% and LD% to project HR rates.

Don’t use a component ERA formula that doesn’t work as well as FIP.

Just let Peter project saves.

All of these systems have weaknesses.

With some much-needed tinkering, we hope Steamer will be significantly improved for the 2010 season. We plan to run this analysis again after the 2010 season to see if our changes paid off. Along those lines, I am going to try to collect projections (ideally, with MLBAM IDs) before the season starts so that I’ll have the results right after the season ends. If you have projections with MLBAM IDs that you want included in next year’s analysis, we’d be happy to oblige.