In two previous articles, I considered the ability of freely available forecasts to predict hitter performance (part 1 and part 2), and how forecasts can be used to predict player randomness (here). In this article, I look at the performance of the same six forecasts as before (ZIPS, Marcel, CHONE, Fangraphs Fans, ESPN, CBS), but instead look at starting pitchers’ wins, strikeouts, ERA, and WHIP.

Results are quite different from those for hitters. ESPN is the clear winner here, with the most accurate forecasts and the most unique and relevant information. Fangraphs Fans projections are highly biased, as with the hitters, yet they add a large amount of distinct information, and thus are quite useful. Surprisingly, the mechanical forecasts are, for the most part, failures. While ZIPS has the least bias, it is encompassed by other models in every statistic.* Marcel and CHONE are also poor performers, with no useful and unique information and with higher bias.

We see from Table 1 that each forecasting system performs about the same when it comes to forecasting wins. A simple average of the forecasts or an optimally weighted average (see Table 4) does much better. For strikeouts, the results are similar, with CBS as a bit of an outlier. For ERA, the non-technical forecasts (ESPN, Fans, and CBS) each perform better than the mechanical forecasts. For WHIP, all are about the same, with the Fans at the bottom.

Table 1: RMSE

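| Forecast | Wins | Ks | ERA | WHIP |
| --- | --- | --- | --- | --- |
| Marcel | 2.94 | 30.43 | 0.671 | 0.115 |
| ZIPS | 3.02 | 29.85 | 0.684 | 0.116 |
| CHONE | 3.01 | 28.85 | 0.678 | 0.111 |
| Fangraphs Fans | 2.95 | 28.81 | 0.652 | 0.125 |
| ESPN | 2.97 | 28.04 | 0.657 | 0.107 |
| CBS | 3.01 | 32.28 | 0.666 | 0.111 |
| Simple Average | 2.73 | 27.36 | 0.653 | 0.108 |
| Weighted Average | 2.65 | 26.48 | 0.640 | 0.105 |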

Table 2 shows that, as a whole, bias is only a small part of the forecasting error, unlike for hitters where it can be quite large. The one exception is ERA, where the non-technical forecasts are over-optimistic by 0.05-0.08 points. There isn’t much to see here, which frankly, is a good thing.

Table 2: Bias

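| Forecast | Wins | Ks | ERA | WHIP |
| --- | --- | --- | --- | --- |
| Marcel | -0.45 | -5.13 | -0.012 | -0.011 |
| ZIPS | 0.17 | 0.79 | 0.032 | -0.006 |
| CHONE | -0.59 | -3.31 | 0.043 | 0.002 |
| Fangraphs Fans | 0.72 | 6.62 | -0.079 | 0.023 |
| ESPN | 0.66 | 5.37 | -0.065 | -0.011 |
| CBS | 0.86 | 9.89 | -0.055 | -0.018 |
| Simple Average | 0.23 | 2.37 | -0.023 | -0.003 |
| Weighted Average | 0.00 | 0.00 | 0.000 | 0.000 |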
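
As a quick arithmetic check on Table 2: if bias is the mean forecast error (my reading; the definition is not spelled out here), then the bias of an equal-weight average forecast should equal the average of the six individual biases. The sketch below copies the individual rows from Table 2 and compares their mean with the reported Simple Average row. The variable names are only illustrative.

```python
# Individual biases copied from Table 2, in the order
# Marcel, ZIPS, CHONE, Fangraphs Fans, ESPN, CBS.
table2_bias = {
    "Wins": [-0.45, 0.17, -0.59, 0.72, 0.66, 0.86],
    "Ks": [-5.13, 0.79, -3.31, 6.62, 5.37, 9.89],
    "ERA": [-0.012, 0.032, 0.043, -0.079, -0.065, -0.055],
    "WHIP": [-0.011, -0.006, 0.002, 0.023, -0.011, -0.018],
}

# The "Simple Average" row as reported in Table 2.
reported_simple_average = {"Wins": 0.23, "Ks": 2.37, "ERA": -0.023, "WHIP": -0.003}

# Mean of the six individual biases for each statistic.
mean_of_biases = {stat: sum(vals) / len(vals) for stat, vals in table2_bias.items()}

for stat, value in mean_of_biases.items():
    print(f"{stat}: mean of individual biases = {value:.4f}, "
          f"reported simple average = {reported_simple_average[stat]}")
```

The two agree to within the rounding of the published figures, which is consistent with bias being measured as an average of forecast minus actual.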

Table 3 presents the bias-corrected RMSEs for each stat. The bias correction is done by subtracting the bias from each forecast, then re-computing the forecast errors. Unlike the hitters, where bias corrections mattered quite a bit, they don’t seem to affect the forecast rankings for pitchers. We still see that CBS, ESPN, and the Fangraphs Fans seem to do the best.

Table 3: Bias-corrected RMSE

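| Forecast | Wins | Ks | ERA | WHIP |
| --- | --- | --- | --- | --- |
| Marcel | 2.88 | 29.74 | 0.671 | 0.114 |
| ZIPS | 3.01 | 29.83 | 0.683 | 0.115 |
| CHONE | 2.92 | 28.54 | 0.676 | 0.111 |
| Fangraphs Fans | 2.80 | 27.57 | 0.644 | 0.122 |
| ESPN | 2.85 | 27.21 | 0.652 | 0.106 |
| CBS | 2.80 | 29.78 | 0.663 | 0.109 |
| Simple Average | 2.71 | 27.20 | 0.653 | 0.108 |
| Weighted Average | 2.65 | 26.48 | 0.640 | 0.105 |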
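
For readers who want to reproduce this kind of comparison, here is a minimal sketch of the three quantities reported in Tables 1 through 3: RMSE, bias, and bias-corrected RMSE. It assumes bias is the mean of forecast minus actual, which matches reading a negative ERA bias as over-optimistic. The function names and the ERA numbers are made up for illustration and are not taken from the data used in this article.

```python
import math

def bias(forecast, actual):
    """Mean forecast error, with error defined as forecast minus actual."""
    return sum(f - a for f, a in zip(forecast, actual)) / len(actual)

def rmse(forecast, actual):
    """Root mean squared error of the forecast."""
    return math.sqrt(sum((f - a) ** 2 for f, a in zip(forecast, actual)) / len(actual))

def bias_corrected_rmse(forecast, actual):
    """Subtract the bias from every forecast, then recompute the RMSE."""
    b = bias(forecast, actual)
    corrected = [f - b for f in forecast]
    return rmse(corrected, actual)

# Toy ERA data for five hypothetical pitchers (invented numbers).
actual = [3.10, 4.25, 3.80, 4.90, 3.45]
system_a = [3.40, 4.00, 3.60, 4.20, 3.70]
system_b = [3.00, 3.90, 3.50, 4.40, 3.30]

# Equal-weight ("simple average") combination of the two systems.
simple_avg = [(a + b) / 2 for a, b in zip(system_a, system_b)]

for name, fc in [("A", system_a), ("B", system_b), ("Simple average", simple_avg)]:
    print(f"{name}: bias={bias(fc, actual):+.3f}, "
          f"RMSE={rmse(fc, actual):.3f}, "
          f"bias-corrected RMSE={bias_corrected_rmse(fc, actual):.3f}")
```

With these definitions, the bias correction can never increase a system's RMSE, so a system that improves a lot after correction is one whose errors were mostly a constant offset.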

Table 4 shows the optimal forecast weights. These are the result of forecast encompassing tests that recursively drop the forecasts that have the least amount of unique information in them. By this metric, the mechanical forecasts are nearly worthless. Marcel, ZIPS, and CHONE forecasts have no unique information for any statistic when compared to the Fans, ESPN, and CBS forecasts. Put another way—if someone had the Fangraphs Fans, ESPN, and CBS forecasts, they couldn’t add any value by adding one of the mechanical forecasts.

Table 4: Optimal forecast weights

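| Forecast | Wins | Ks | ERA | WHIP |
| --- | --- | --- | --- | --- |
| Marcel | 0 | 0 | 0 | 0 |
| ZIPS | 0 | 0 | 0 | 0 |
| CHONE | 0 | 0 | 0 | 0 |
| Fangraphs Fans | 0.29 | 0.45 | 0.88 | 0 |
| ESPN | 0.22 | 0.55 | 0.12 | 0.71 |
| CBS | 0.49 | 0 | 0 | 0.29 |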
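
To give a feel for what an encompassing test does, here is a minimal two-forecast sketch. It is not the exact procedure behind Table 4, which recursively compares all six systems; it is the textbook pairwise version, in which the error of forecast A is regressed on the difference between forecast B and forecast A. The slope is the weight B would receive in a combined forecast, and a slope indistinguishable from zero means A encompasses B. The function name and the win totals are hypothetical.

```python
import math

def encompassing_test(actual, f_a, f_b):
    """Pairwise forecast encompassing regression (no intercept).

    Fits  actual - f_a = lam * (f_b - f_a) + u  by least squares.
    lam is the weight on B in the combination (1 - lam) * f_a + lam * f_b.
    Returns (lam, t_stat); a t_stat near zero suggests B adds nothing to A.
    """
    e_a = [y - a for y, a in zip(actual, f_a)]   # errors of forecast A
    d = [b - a for a, b in zip(f_a, f_b)]        # where B disagrees with A
    sdd = sum(x * x for x in d)
    lam = sum(x * e for x, e in zip(d, e_a)) / sdd
    resid = [e - lam * x for e, x in zip(e_a, d)]
    n = len(actual)
    sigma2 = sum(r * r for r in resid) / (n - 1)  # one estimated parameter
    se = math.sqrt(sigma2 / sdd)
    return lam, lam / se

# Toy win totals for six hypothetical pitchers (invented numbers).
actual = [12, 9, 15, 7, 11, 14]
forecast_a = [11, 10, 13, 9, 11, 12]
forecast_b = [13, 8, 15, 8, 10, 15]

lam, t_stat = encompassing_test(actual, forecast_a, forecast_b)
print(f"weight on B: {lam:.3f}, t-statistic: {t_stat:.2f}")
```

In a recursive version, one would repeat this kind of test across all systems, drop the forecast whose weight is least distinguishable from zero, and re-estimate until every remaining forecast carries a meaningful weight. The weights in each column of Table 4 sum to 1, which is what a combination of this form would produce.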

So what does this article tell us?

1) ESPN is really good at predicting pitcher performance.

2) Mechanical forecasts are bad at predicting pitcher performance.

3) Fangraphs Fans projections add a large amount of information that you can’t get anywhere else.

Thanks for reading!

Next up: 2011 forecasts!

*For descriptions of some of the technical terms and concepts here, please consult the earlier articles in this series, here and here.

ESPN is the best pitcher forecasting system? Are you sure there isn’t some sort of weird selection bias here? How do you determine the player pools that you assess? By the way, it seems incorrect to me to assess different pools for different systems. You have to pick a player pool and see how each system does with that pool. Then the problem becomes “what do you do with players that a system doesn’t project?” If CBS and ESPN aren’t projecting certain players, I’m worried that they’re getting an unfair edge because the players they aren’t projecting are the hardest…

@philo: I was pretty surprised myself. ESPN is the best? Really??

You’re absolutely right that you can’t consider different pools of players and then try to compare the metrics. That’s why the pitchers I consider are those with forecasts from each projection system, so the player pool is the same for each metric (146 pitchers total). Hopefully this eliminates any selection bias across different projection systems.

8 years ago

Member

Half Full

Nice, I’ve been looking forward to your pitching weights. Since reading your hitting articles I’ve begun compiling a simple average between MARCEL, FANS, ESPN, RotoChamps, CAIRO, and Bill James (ESPN.csv from your site). It looks like I’ll be dropping the mechanical forecasts for pitching thanks to this latest finding. I decided to exclude CBS since their Excel format for names was {last name, first name}, unlike all the others. Is there any way to get around this? I’ve been using AVERAGEIF in Excel to compile averages to one sheet. One final note: In the comments of your last hitters forecast…

8 years ago

Member

evo34

I like the idea here, and appreciate the work. But the forward-looking conclusions simply aren’t supported by looking at a single year of data. To say, “If someone had the Fangraphs Fans, ESPN, and CBS forecasts, they couldn’t add any value by adding one of the mechanical forecasts.” and “1) ESPN is really good at predicting pitcher performance. 2) Mechanical forecasts are bad at predicting pitcher performance,” is just reckless. You can say in the past tense that systems X,Y and Z provided no value over the competition in 2010; but in the grand scheme of things, this is a…

@Half Full: Wow, I didn’t realize anyone was looking at that yet. It’s still in development, and I haven’t put all the forecasts up there yet. I’ve finished my hitter forecasts for next year and am working on the pitchers. They should be up next week sometime. At that point, I’ll write something up for Fangraphs and hopefully they’ll give me some press. For those of you who don’t know what I’m talking about, go to http://www.williamlarson.com/projections. As for your Excel issue, go to tools->text to columns, then choose “,” as the delimiter. Can you email me the CAIRO, CBS…

8 years ago

Member

Half Full

@Will No problem, I can put those together for you. I’ll try submitting them through your website. RotoChamp projections are available on the FanGraphs projection page already if you hadn’t noticed.

8 years ago

Guest

Wade8813

Just to clarify – this only means ESPN did the best in 2010, correct? So they could be worse than the others, but in the year of the pitcher happened to have guessed pitchers the best?

@Wade: Yes, you’re right. I’m only looking at 2010. I’ll need to look at other years to see if ESPN consistently outperforms the others or not. That being said, in other areas where people look at these weighted-average forecasts, the weights tend to be correlated from period to period.

8 years ago

Guest

Joel

Will, did you graduate? Wondering what happened to your intriguing 2011 projection comparisons..