One of the keys to a successful fantasy season is not drafting a pitcher who misses an extensive amount of time or performs much worse than the previous year. Anyone who drafted Chris Carpenter or Dontrelle Willis in 2007 or Rich Hill or Aaron Harang in 2008 can attest to this.

There have been several theories posed in traditional publications (SI.com – Tom Verducci) and in the fantasy baseball blogosphere (FantasyPhenoms, Beyond the Box Score, RotoAuthority) that claim a correlation between a pitcher’s innings/pitches from the previous year and their performance the next year.

While these articles mentioned several positive examples, none of them really tested their theories to see if they are true predictors of either injury or decreased performance. We decided to put them to the test along with several theories we’ve been kicking around:

High Pitch Volume – Does a high volume of pitches have a carryover effect the next year? (posed by FantasyPhenoms and RotoAuthority)

Spike In Pitch Volume – What is the effect of a significant increase in pitch volume vs. previous year? (posed by Tom Verducci)

New To The Workload – Is a pitcher who first reaches the 2,700 pitch threshold more likely to fall back the next year vs. a pitcher used to the workload?

High % of Breaking Pitches – Is a pitcher who throws breaking balls more susceptible to fall back vs. a fastball/changeup pitcher?

We also added a Common Sense test, which predicted that weaker pitchers from the year prior (4.00+ FIP) would be more likely to get hurt, moved to the bullpen, or demoted to the minors than stronger pitchers. This ‘common sense’ test also serves as a proxy for a typical fantasy baseball drafter’s judgment, as they would likely have more faith in pitchers who were successful in the previous year.

For the test, we focused on the past three years (2006-2008) and on pitchers who threw at least 2,700 pitches (~ 155-160 IP) in that year. Pitchers who retired/semi-retired the next season for reasons other than injury were removed. This left a total of 247 pitcher seasons. We defined ‘miss significant time’ as throwing 2,000 or fewer pitches (~ 20 GS) in MLB the next year and ‘measurable decrease in performance’ as an increase of 0.50 or more in FIP. (FIP, or Fielding Independent Pitching, is an ERA variation based only on things within a pitcher’s control (K, BB, HR) and is more stable year-to-year than ERA.) If a pitcher threw 2,000 or fewer pitches the next year, we ignored their FIP so as not to add insult to injury (ha!).
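To make the two outcome definitions concrete, here is a minimal sketch of how a follow-up season could be labeled. The thresholds come from the paragraph above; the function name, the labels, and the sample seasons are hypothetical and only there for illustration.

```python
# Thresholds taken from the definitions above; the function and label
# names are hypothetical, chosen only for this sketch.
MAX_PITCHES_MISSED_TIME = 2000   # 2,000 or fewer pitches = missed significant time
MIN_FIP_INCREASE = 0.50          # FIP up 0.50 or more = measurable decline


def classify_followup(next_pitches, prev_fip, next_fip):
    """Label the season that follows a 2,700+ pitch season."""
    if next_pitches <= MAX_PITCHES_MISSED_TIME:
        # FIP is ignored for these pitchers, so the two bad outcomes never overlap.
        return "missed_time"
    # Round to two decimals so a true 0.50 gap is not lost to float error.
    if round(next_fip - prev_fip, 2) >= MIN_FIP_INCREASE:
        return "fip_up"
    return "ok"


# Made-up follow-up seasons, for illustration only.
print(classify_followup(1500, 3.60, 5.40))  # missed_time
print(classify_followup(3100, 3.60, 4.10))  # fip_up
print(classify_followup(3100, 3.60, 3.90))  # ok
```

Checking the pitch count first is what keeps the two bad outcomes from overlapping: a pitcher who fell to 2,000 or fewer pitches is never also counted as a FIP decline.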

It’s worth noting that throwing 2,000 or fewer pitches in MLB does not necessarily mean someone was injured – it could also mean their performance fell to a level that was no longer MLB-worthy and they were demoted. Since these amount to the same thing for a fantasy team owner (bupkus), we did not differentiate. We also did not credit pitches thrown in the minor leagues prior to arriving in MLB or pitches thrown in the postseason. This is partly out of convenience (our data source, FanGraphs, does not include these in its ‘Leaders’ section) and partly out of intent (a pitch in the minor leagues is not as stressful as a pitch in the major leagues).

In total, 59 (24%) of the 247 qualified pitcher seasons were followed by seasons of 2,000 or fewer pitches, and 53 (21%) were followed by FIP increases of 0.50 or more. Because we ignored FIP for pitchers who fell to 2,000 or fewer pitches, no season is counted in both groups, so the two can be added: 112 of 247, or about 45%, of starting pitchers either pitched significantly less or had a measurable decrease in performance in the following year. Ouch!
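For anyone who wants to check the arithmetic, here is a quick sketch of the baseline rates. It leans on the rule stated earlier that FIP was ignored for pitchers who fell below the 2,000-pitch mark, which is why the two counts can be summed; the variable names are just for this example.

```python
TOTAL_SEASONS = 247
missed_time = 59   # followed by a season under the 2,000-pitch mark
fip_up = 53        # followed by a FIP increase of 0.50+ (FIP-qualified pitchers only)

# The groups are disjoint by construction, so a plain sum gives the union.
combined = missed_time + fip_up

for label, count in [("missed time", missed_time), ("FIP up", fip_up), ("combined", combined)]:
    print(f"{label}: {count} of {TOTAL_SEASONS} = {count / TOTAL_SEASONS:.0%}")
```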

Below are the test results:

Test Group (2006-2008, 2700+ Pitches): Pitching Performance vs. Previous Year

| Test # | Previous Year Pitch Volume | Total | <2000 Pitches (#) | <2000 Pitches (%) | FIP Up 0.50+ (#) | FIP Up 0.50+ (%) | Combined (#) | Combined (%) | Index |
|---|---|---|---|---|---|---|---|---|---|
| 1 | 30+% Sliders/Curve Balls | 54 | 18 | 33% | 16 | 30% | 34 | 63% | 137 |
| 2 | 27+% Sliders/Curve Balls | 80 | 27 | 34% | 20 | 25% | 47 | 59% | 128 |
| 3 | +700 Pitches vs. Previous Year | 56 | 19 | 34% | 14 | 25% | 33 | 59% | 128 |
| 4 | 1st Year at 2700+ Pitches | 49 | 17 | 35% | 9 | 18% | 26 | 53% | 115 |
| 5 | FIP 4.00+ | 153 | 44 | 29% | 25 | 16% | 69 | 45% | 98 |
| 6 | 6500+ Pitches Previous 2 Years Combined | 76 | 10 | 13% | 21 | 28% | 31 | 41% | 89 |
| 7 | 9000+ Pitches Previous 3 Years Combined (2007-2008 only) | 73 | 15 | 21% | 14 | 19% | 29 | 40% | 86 |
| 8 | 3500+ Pitches | 23 | 0 | 0% | 9 | 39% | 9 | 39% | 85 |
| 9 | 3400+ Pitches | 43 | 2 | 5% | 14 | 33% | 16 | 37% | 81 |

The best theory proved to be the High % of Breaking Pitches Theory, as the 30+% and 27+% thresholds finished in the top two for index (137 and 128 respectively). The 137 means that a pitcher who threw 30+% Sliders/Curve Balls in the previous year is 37% more likely than the average pitcher to fall below 2,000 pitches or see their FIP increase 0.50+. The Spike in Pitch Volume Theory (go Verducci!) and New to the Workload Theory also performed above average.
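The index is just a group's bad-outcome rate divided by the all-pitcher rate, times 100. Below is a rough sketch using the combined counts from the results table; the function name is made up for this example. Computing from raw counts lands within roughly two points of the published indices, which were presumably derived from rounded percentages (that part is an assumption).

```python
BASE_BAD, BASE_TOTAL = 112, 247  # all qualified pitcher seasons


def risk_index(bad, total, base_bad=BASE_BAD, base_total=BASE_TOTAL):
    """Group bad-outcome rate relative to the overall rate (100 = average)."""
    return 100 * (bad / total) / (base_bad / base_total)


# (label, combined bad outcomes, group size, published index) from the table
rows = [
    ("30+% Sliders/Curve Balls", 34, 54, 137),
    ("27+% Sliders/Curve Balls", 47, 80, 128),
    ("+700 Pitches vs. Previous Year", 33, 56, 128),
    ("1st Year at 2700+ Pitches", 26, 49, 115),
    ("FIP 4.00+", 69, 153, 98),
    ("6500+ Pitches Previous 2 Years", 31, 76, 89),
    ("9000+ Pitches Previous 3 Years", 29, 73, 86),
    ("3500+ Pitches", 9, 23, 85),
    ("3400+ Pitches", 16, 43, 81),
]

for label, bad, total, published in rows:
    print(f"{label}: computed {risk_index(bad, total):.0f}, published {published}")
```

The ordering of the theories is the same either way; only the exact index values shift slightly.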

The ‘Common Sense’ test (FIP 4.00+) finished at about a 100 index, meaning picking a random pitcher would’ve been as effective as choosing one by this criterion. This average performance may be misleading: it did a better-than-average job of predicting significant pitch decreases (29% vs. 24%), and its below-average performance at predicting FIP increases may be because some of the FIPs were so high to begin with.

The High Pitch Volume Theory proved to be a relative failure. We tried four variations – one-year (3400+ and 3500+), two-year (6500+), and three-year (9000+) – and all of them finished below average. Both the 3400+ and 3500+ tests did a good job of predicting 0.50+ FIP increases, but that is tempered by the fact that a higher percentage of their pitchers threw enough pitches the next year to have their FIP counted. For comparison, if you back out the 19 of 56 pitchers who fell below 2000 pitches in the Spike In Pitch Volume test, 38% (14 of 37) of the rest had 0.50+ FIP increases. So there may be a slight correlation between heavy use and less effective pitching the next year, but these pitchers will at least pitch.
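The "back out" adjustment above can be sketched as a conditional rate: FIP increases divided by only those pitchers who threw enough the next year to have a FIP counted. The counts are from the results table; the function name is hypothetical, and the all-pitcher row is included as a comparison that the original write-up does not compute.

```python
def fip_up_rate_among_qualifiers(total, missed_time, fip_up):
    """Share of FIP-qualified pitchers whose FIP rose 0.50+."""
    qualifiers = total - missed_time  # pitchers whose next-year FIP was counted
    return fip_up / qualifiers


# (group size, fell below 2000 pitches, FIP up 0.50+) from the results table
groups = {
    "All pitchers": (247, 59, 53),
    "+700 pitch spike": (56, 19, 14),
    "3400+ pitches": (43, 2, 14),
    "3500+ pitches": (23, 0, 9),
}

for label, (total, missed, fip_up) in groups.items():
    rate = fip_up_rate_among_qualifiers(total, missed, fip_up)
    print(f"{label}: {fip_up} of {total - missed} = {rate:.0%}")
```

On this footing the spike group and the high-volume groups look much closer to each other than their raw FIP percentages suggest.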

I’d theorize that the High Pitch Volume Theory fails to predict significant pitch declines because totaling high pitch counts in a year is more of a skill than an abuse. It is easier for someone who has run a marathon in the past to run another one than for someone who has not. While this group is not immune to missing time, they have built the stamina to at least pitch close to a full season (2700+ pitches). Below are the 16 members of 2007’s “3400+ Pitch Club”.

The collective average went from 3,534 pitches to 2,991 pitches which, though a decline, still represents about 29 starts. Eight of the 16 pitchers on this list finished in the top 30 in 2008 total pitches. The below-100 index in the test and the above chart indicate that 3400+ pitches in the previous year is a sign of durability, not vulnerability (note: given the high injury rate with pitchers, though, it’s not that strong of a sign).

We then tested several combinations of the three above-average performing theories and threw a last bone to the High Pitch Volume theory (which tanked even after adding the High % of Breaking Balls requirement).

While the top two combo theories are very good predictors, they are also very rare – about 1 in 25 pitching seasons. The most impressive predictor of the bunch is test #6 – 27+% Sliders/Curve Balls OR 700+ Pitch Spike. Why? Because that 128 index applies to a huge sample – nearly half of the pitcher seasons in the study. The impact is best seen by comparing this group against its mirror image – pitchers who threw < 27% Sliders/Curve Balls AND had a pitch spike under 700.

Comparison of Pitching Seasons After A 2700+ Pitch Season

| Group | Sample | <2000 Pitches Next Year | Index | FIP Up 0.50+ Next Year | Index | <2000 Pitches Next Year OR FIP Up 0.50+ Next Year | Index |
|---|---|---|---|---|---|---|---|
| All Pitchers | 247 | 59 (24%) | 100 | 53 (21%) | 100 | 112 (45%) | 100 |
| 27+% Sliders/Curve Balls OR 700+ Pitch Spike | 116 | 37 (32%) | 134 | 30 (26%) | 121 | 67 (58%) | 128 |
| <27% Sliders/Curve Balls AND <700 Pitch Spike | 131 | 22 (17%) | 70 | 23 (18%) | 82 | 45 (34%) | 76 |

So a pitcher who qualifies in at least one of these categories is 28% more likely than the average pitcher to throw fewer than 2000 pitches in the next year or to see a FIP increase of 0.50+. Just as important, one who qualifies for neither category is 24% less likely than the average pitcher to meet these respective fates.
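Another way to read the comparison table is to set the two groups directly against each other rather than against the average. This is a sketch of that ratio (a relative risk, which is a framing added here and not one the tables use), built from the combined counts of 67 of 116 and 45 of 131.

```python
def bad_outcome_rate(bad, total):
    return bad / total


flagged = bad_outcome_rate(67, 116)    # 27+% breaking balls OR 700+ pitch spike
unflagged = bad_outcome_rate(45, 131)  # neither criterion

relative_risk = flagged / unflagged
print(f"flagged: {flagged:.0%}, unflagged: {unflagged:.0%}, ratio: {relative_risk:.2f}")
```

Read that way, a flagged pitcher was roughly 1.7 times as likely as an unflagged one to miss significant time or see a 0.50+ FIP increase in this sample.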

That’s it for now but we’ll be revisiting this topic in the near future. In the next post, we’ll go over the top 20 starting pitcher risks for 2008, perhaps uncovering a couple of pitchers that you’ll be surprised to see…

Rudy, my eyes were crosseyed by the time I got thru your article. Just kidding, great stuff. Predicting the unpredictable pitchers is the hardest thing to do in (Fantasy) Baseball. Will your list of pitchers to avoid be based on your findings in this post?

I am putting Lincecum on my “risky” list. The little guy threw a crap ton of pitches over what he threw the year before. Maybe his arm is made of rubber and doesn’t ever bother him, but I won’t be risking my 2nd or 3rd round pick on him.

On Lincecum- I saw this GREAT segment on MLB Network where Tim was pitching in their little studio field, and basically described how his TORQUE-like motion puts less stress on his arm than most pitchers.

I understand the worry, but I think in a few years Lincecum may be that rubber armed anomaly like a Wells or Sabathia.

I try to keep a log of pitchers who miss time with non arm-related ailments, as the reduced workload seems to bode well for the following season. Obviously you have to take the nature of the problem and likelihood of recurrence into consideration. I’m scouting for ankle and calf injuries, mental health issues, anal fissures, venereal diseases, and things of that nature.

This, of course, presumes that the pitcher had decent base skills in the first place.

@IowaCubs: Thanks Iowa! We’ve been getting a good amount of link/hype love of late (linked in Hardball Times yesterday under their Shysterball blog (a great read)). If you want to help the cause, you can hit the Ballhype button right above the comment section :)

@Tony: I’ll cover Lincecum but I don’t see him finishing in the riskiest top 10.

@Kevin: Thanks. Who’s ahead of us? Do we have to be smarter and/or snarkier to get up on the list?

@Baron Von Vulturewins: C’mon, I gotta hedge my bets more than that! I’ll be providing lists of riskiest and safest pitchers. I’ll also add this stuff in as notes for the Official Point Shares rankings.

Rudy, I found this article from the link on rotoauthority and it’s the first time I’ve visited your site. First off, absolutely awesome analysis, very impressive.

I was actually thinking of doing a similar analysis because of all the different theories about projecting pitchers that you mentioned, but you did the topic way more justice than I ever could have.

I look very forward to your list of risky pitchers.

Also I agree with the rubber arm comment on Lincecum. I think he’ll be fine this year, although I generally wait a long time to pick pitchers.

I feel like you can make smart SP picks in later rounds a lot easier than doing the same with batters. Plus if you swing and miss on some late round pitchers there are always some Cliff Lee type guys to grab off waivers if you are quick enough (maybe not quite as good as him, but you know what I mean).

With that said, I like Lincecum for next year, but will let someone else pick him and form my pitching staff later on.

I have a few suggestions for what is a good start on some interesting work. If I understand what you’ve done, your test here is flawed because you change your sample each time. The important question is not “how many times was a 30% breaking ball season followed by a decline in performance” (which is what your index seems to be measuring). The key questions are how many of the total bad outcomes each criterion identified, and how many times it predicted a bad outcome and was wrong. In other words, you might want to think about further detailing the results in terms of sensitivity and specificity. And you really need to account for the fact that your pitcher seasons are correlated across time and among individuals.

@…Like a Ham Sandwich: I have to agree. As a long-time (well, four year) adherent to the safe strategy of building a four-man rotation carefully (e.g. one guy in the 4th, one in the 6th, one in the 9th, one in the 11th, or thereabouts) I am entering this season convinced I need to draft offense in the first eight rounds and worry about pitching in the mid-rounds. (This has nothing to do with the fact that two of my SP picks last year, Smoltz and Gallardo, were injury busts, and I finished last in home runs. CORRECTION: It has everything to do with that.)

There are simply too many valuable pitchers available later to waste high picks on pitchers. I know a Santana or Lincecum delivers high value for their ADP; what I don’t know is who’s going to be Santana or Lincecum in 2009. (My best guess: Chad Billingsley.) Plus, a dozen useful pitchers emerge out of nowhere, as per your reference to Cliff Lee, above.

So, in short, this year in Camp Von Vulturewins it’s going to be Scrubs and Freaks and Hope for Good Pick-Ups in the First Few Weeks.*

@Baron Von Vulturewins: A good value drafted for a pitcher (such as Haren at 60) can provide that offense you’re looking for in a trade. So it might be a good idea to grab one or two with the intention of flipping that SP for a solid top 40 bat.

I’ve also noticed that there is almost no reason to draft a pitcher in the rounds 4-6 (in 12 team), and I follow your principle of no pitching until the 8th. That said, if Johan falls to you in the 3rd, you’re not going to jump on that and then flip it for awesomeness?

@Rudy Gamble:
Me want info on closers and injuries to schmoes like Putz and Cordero and eventually this season, Marmol.

@…Like a Ham Sandwich: Thanks. Welcome to the site. Always great to get new commenters. Drafting pitchers is an area that tends to polarize. I don’t think there’s a right or wrong way to do it. I prefer to balance drafting throughout and think there are pitchers with value in the early rounds. I haven’t crunched through all the data but I’m less bullish this year than last year when I thought Santana, Peavy, Webb, and Haren were very good values.

@Gotowarmissagnes: Finally a criticism! The sample covers multiple years, and that was clearly stated (2006-2008). A single year just nets too low of a sample. This does mean that a pitcher could be counted multiple times under three unique pitching seasons. I suppose league FIP does vary slightly year-to-year which could over/under-estimate the number of 0.50+ FIP pitchers. I could look into that…(but pitches thrown shouldn’t matter per inning).

The data as it’s displayed shows how often the theory was right and wrong. For instance, “30+% Curve Balls/Sliders” led to a bad outcome 63% of the time (which means 37% of the time it turned out okay) and this indexed at 137 (since there’s a high baseline of failure in pitching). When I list the top 20 riskiest pitchers, it’s with the caveat that even the riskiest pitcher could pan out.

Not sure what you mean with the sensitivity and specificity.

I’ll readily admit this isn’t a purely scientific analysis and would love it if someone with a statistical eye built upon this analysis….as long as the bastards reference me when doing it…

@IowaCubs: It would be interesting to see if there’s a predictability for relievers but I think it’ll be a lot tougher. Pitching is such an unnatural activity and relieving takes it to an extreme. I’ve been betting against K-Rod for years b/c his slider is ridiculous but he’s still going….I thought Wagner would’ve folded at 30 since he’s too small to throw that fast….

@Baron Von Vulturewins: For the guys saying there’s always a Cliff Lee, etc. around you can grab, this is true, but there’s also always a McLouth, Quentin, and Ludwick….

My key point is (at least in my league) Johan does not mean you’re going to be able to trade him for say Carlos Lee, get me? So if you’re taking one of these “big name big game pitchers” don’t do it for trading purposes, you better be doing it because you’re planning on riding them. In my H2H league pitching is very valuable, but trade-wise it’s not worth half of a stick…. So I too go with drafting offense for about the first 7-8 rounds, then take a pitcher, and BILLS is looking nice this year?

@Tony Y: Lee is drafted before Johan because Carlos Lee is more predictable. But if you’ve invested in the risk with Johan by drafting him, built your team around him and the season has started, you do not trade Santana away, except for a 1st rounder. He’s more valuable than Carlos Lee once the season starts and I would trade you my Carlos Lee for your Johan any day of the week if it helped my team.

Also, I don’t know your H2H’s rules, but usually H2H is the most easily manipulated for pitching, i.e., grab some relievers and two start pitchers and forget ratios. This makes drafting pitching especially useless.

@Grey: We have a H2H league, 12 teams, 20 cats, 10 pitching, 10 hitting, so it’s fairly even, but a good offense in our league seems to win out. When it came down to it last year I lost out in the championship because I couldn’t get ONE dimwit to throw 6 decent innings, so maybe pitching Efff’d me in the end, but I took the league regular season by storm with my hitting and pitcher pickups and good relief crew. I guess it’s kinda like real baseball, you can get thru the year without studly pitching, but when it comes down to the playoffs (world series) PITCHING WINS OUT…. I guess I need ONE ace LOL

@BriGuy: Yeah, color me negative on breaking ball pitchers at least in general. I’ll be producing rankings using our Point Share methodology in about 3 weeks. You can check out preliminary ones in the 2009 Fantasy Baseball Rankings section up at the top of the page.

Glad to see you take my comments as intended, Rudy. The percentage you compute here is what is called the “positive predictive value” in stats, and it’s one of the most important measures. It’s basically the answer to the question “If I see a positive, how likely is it to be a true positive?” The 30% breaking measure is weaker on sensitivity–how many of the bad seasons did it actually detect–only 34 of 112.

The problem with having pitchers with multiple seasons is that you have to control for that statistically. Nothing you can really do in a simple set up like this. I can understand that you have to use multiple seasons.

In any case, there are multiple tradeoffs here in trying to identify the best predictor (the old Type I and Type II errors from stats), and the base measure you have here does better, at least on this simple measure, than others.
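The measures mentioned in this exchange can be sketched from the post's own counts for the 30+% breaking ball test: 54 flagged seasons, 34 of them bad, against 112 bad outcomes in 247 seasons overall. The cell names below are standard confusion-matrix terms, not something the original tables report.

```python
TOTAL, TOTAL_BAD = 247, 112   # all qualified seasons, all bad outcomes
FLAGGED, FLAGGED_BAD = 54, 34  # 30+% Sliders/Curve Balls test

true_pos = FLAGGED_BAD                                # flagged and bad
false_pos = FLAGGED - FLAGGED_BAD                     # flagged but fine
false_neg = TOTAL_BAD - FLAGGED_BAD                   # bad but not flagged
true_neg = TOTAL - true_pos - false_pos - false_neg   # not flagged and fine

ppv = true_pos / (true_pos + false_pos)           # positive predictive value
sensitivity = true_pos / (true_pos + false_neg)   # share of bad seasons caught
specificity = true_neg / (true_neg + false_pos)   # share of fine seasons left alone

print(f"PPV {ppv:.0%}, sensitivity {sensitivity:.0%}, specificity {specificity:.0%}")
```

The PPV is the 63% shown in the results table, while the sensitivity is the "34 of 112" figure from the comment: the test is usually right when it fires, but it fires on fewer than a third of the bad seasons.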

@IowaCubs: In my league of misers and miscreants, pitching is hard to trade for hitting. Everyone always thinks you’re pulling a fast one — they’re too scared your John Smoltz will blow out his arm on the day you swap him for Adam Dunn.

Oh, wait — that happened to me.

@Tony Y: It’s true about McLouth et al. but more true, I think, about pitchers. Though I seem to recall trying to make a definitive list of undrafted breakouts last year and the pitcher to hitter ratio was about 3:2. This was very unscientific, however.

Also, breakout hitters seem to get snapped up in the first month. And you can ride pitchers’ hot streaks more easily than hitters’, it seems. (Not sure if that works in H2H, which sounds like a supplement and should, similarly, be illegal.)

Was there anyone of value, hitting-wise, to emerge in the second half last year? Bruce? Davis? Salty? I seem to recall them all sitting together sad-faced in a diner, Boulevard of Broken Dreams-style. It all seems so long ago, thankfully…

Also, has anyone here ever tried a pitcher-first drafting strategy? One guy in my league last year drafted Peavy and Webb in the 2nd and 4th, respectively. Didn’t help him much. His offense sucked, partly because he fell madly in love with Josh Hamilton, rather than trading him at midseason as ANY SANE MAN WOULD HAVE DONE.

@Steve: Sorry, I might not have been clear, but once you’ve set your team around having Johan, he’s more important after the draft than before because of how you draft around him. If you drafted Johan then chances are you waited really long for a 2nd starter, so you can’t take Johan out of that team without having someone to step in. Now if you could pull a Billingsley and Carlos Lee trade for Johan, that would be a different story, but also probably a protested trade. Ah… Member those? Protested trades. They’re coming back soon. Can you feel the excitement? Wait, what was I saying? Hmm…

Very interesting stuff. A couple of questions. First, can you explain why you used the cutoffs for “breaking ball” pitchers at 30%+ and 27%+? Because that gives you about half of the sample?

Also, I’d be interested to see if you had similar findings if you used something like QERA or xFIP instead of FIP. FIP is definitely preferable to ERA in a study like this, but a small increase in HRs, possibly due only to random variation, can have a large effect on a single year’s FIP.
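To put a number on the home run point, here is a sketch using the commonly cited FIP formula, (13×HR + 3×BB − 2×K) / IP plus a league constant. The 3.20 constant and the stat line are assumptions for illustration (the constant varies by season, and some versions of the formula also count HBP with walks); neither is data from the study.

```python
FIP_CONSTANT = 3.20  # assumed league constant; it varies by season


def fip(hr, bb, k, ip, constant=FIP_CONSTANT):
    """Commonly cited FIP formula, without the HBP term some versions add."""
    return (13 * hr + 3 * bb - 2 * k) / ip + constant


# Hypothetical 200-inning season, then the same season with five more homers.
baseline = fip(hr=20, bb=50, k=180, ip=200)
five_more_hr = fip(hr=25, bb=50, k=180, ip=200)

print(f"{baseline:.2f} -> {five_more_hr:.2f} (change {five_more_hr - baseline:+.2f})")
```

Under these assumptions, five extra home runs over 200 innings cover about two-thirds of the 0.50 threshold on their own, which is the kind of swing the comment is worried about.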

@Grey: No, you were pretty clear, was just a nuance I’ve not really had to consider to this point.
The one, once drafted, is of course rather harder to replace than the other.
Reminds me of when I once acquired Lee in return for Derek Lowe and Akinori Otsuka (when he was a killer 8th inning guy – yes, I was a cuddle boy even back then).
Ah yes – simpler times.

On Lincecum. Yes his arm doesn’t take as much beating because of the “torque” he generates, but it’s the other parts of his body I would worry about. That’s a lot of twisting and movement… think Tiger Woods golf swing. Something has got to give. I’ll take guys like Greinke much later.

Yeah, FanGraphs only has FIP at the moment. I think I’ve heard that they’re considering tRA (from StatCorner), which would be super-cool, as in my opinion FIP is good at measuring *what* happened, but some of the other DIPS metrics are better at measuring what *should* have happened.

Since you’re looking at sort of predicting performance here, I think either of these might be slightly better metrics to use. Of course, I’ve been known to argue about pitching metrics longer than most…

@Nick: Ha…I think you can outargue me on this too. I think the FIP part is less important, though, than the < 2,000 pitches mark. Not saying the < 2,000 pitches mark is more important for fantasy purposes (I’d suggest that the worst case is last year’s Harang who pitched but just ineffectively enough that you hold onto him vs. just ditch him for a Nolasco) but it just seems to be easier to predict.

There seems to be a long-period trend here that you’re not accounting for in that pitchers are not getting any younger. Wouldn’t you expect, as a function of age, the # of pitches someone throws to be in general less than the previous year? Interesting study either way!

@Andy: Great point. I have been thinking about adding an age component to this although I’m not convinced it’ll be a major factor except on the extremes (e.g., Randy Johnson less likely to repeat 2700 pitches than Chad Billingsley). Will keep it in mind…

I enjoyed the article and appreciate the work. You mention yourself that there is some failing with this in that you didn’t add post season pitch counts as well. It would seem to me that these are very important pitches thrown.

I realize that Sabathia is a horse, but I think he threw enough pitches down the stretch and in the post season that it will have some effect this coming season. Zambrano may have felt similar effects down the stretch this season.

Alternately, it could be effective if you removed pitchers from your analysis that had X amount of post season pitches. Oftentimes these are guys who are simply extraordinary and may help to skew the data anyway.

Great article. This could definitely be the foundation for something a little more econometric.

One correction: “In total, 59 (24%) of the 247 qualified pitcher seasons were followed up by seasons of less than 2,000 pitches and 53 (21%) were followed up with FIP increases above 0.50. That means that about 45% of starting pitchers either pitched significantly less or had a measurable decrease in performance in the following year. ”

You cannot add those two percentages together. This is a common error in statistics. The two events are not “mutually exclusive”, as they like to say. They could have considerable overlap between them, and most likely do, as you would (and pointed out) probably find a pretty decent correlation between increases in FIP and a decrease in innings thrown just as a result of the pitcher getting so much worse that they stop using him.