Many individuals believe that limiting pitch counts and increasing days of rest can improve performance and reduce injuries. Though the belief that overuse can hamper pitchers is widespread, there exists little evidence that adjusting pitch counts and rest has much effect on pitcher performance. In this study, Bradbury and Forman use newly available game-level pitch count data from 1988 to 2009 to evaluate the impact of pitch counts and rest days on future performance. They discuss their employment of linear and non-linear multiple regression analysis techniques to estimate the impact of pitch counts — in recent games and cumulatively over a season — and days of rest on pitcher performances while controlling for the effects of other factors.

In the 1970s, using team revenue and player performance data, Gerald Scully employed the standard marginal revenue product framework frequently used by labor economists to estimate the financial contributions of players. Bradbury’s study employs new information about baseball’s economic structure and sabermetric performance metrics in an updated Scully framework to estimate the dollar value of current major league baseball players. He compares player salaries and estimated worth by service class, presents a method for projecting player worth over the duration of long-term contracts, identifies some of baseball’s best and worst deals, and ranks teams according to their abilities to manage their budgets.

Wouldn’t expected ERA *have* to go up after a high-pitch start if you’re controlling for season ERA? A high-pitch start is probably above average for the season. Therefore, the rest of the starts MUST be *below* average for that season. So a high-pitch start would be followed by a slightly worse start (and vice-versa) even if the number of pitches doesn’t do anything at all to the pitcher’s arm.
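The mechanical effect described above can be checked with a small simulation. This is only a sketch under stated assumptions: game results are independent standard-normal draws (a stand-in for game ERA, not real data), pitch counts play no role at all, and the function name and parameters are hypothetical. If the argument holds, a game's deviation from the full-season average should be negatively related to the next game's deviation, with a theoretical covariance of -σ²/n for n games.

```python
import random

def mean_product_of_adjacent_deviations(n_seasons=5000, n_games=30, seed=0):
    """Average of (game - season mean) * (next game - season mean).

    Games are i.i.d. by construction, so any negative value comes purely
    from measuring both games against a mean that includes them.
    """
    rng = random.Random(seed)
    total, count = 0.0, 0
    for _ in range(n_seasons):
        # Hypothetical standardized game results; no pitch-count effect exists.
        games = [rng.gauss(0.0, 1.0) for _ in range(n_games)]
        season_mean = sum(games) / n_games
        for this_game, next_game in zip(games, games[1:]):
            total += (this_game - season_mean) * (next_game - season_mean)
            count += 1
    return total / count

if __name__ == "__main__":
    estimate = mean_product_of_adjacent_deviations()
    print(f"simulated: {estimate:.4f}, theoretical: {-1 / 30:.4f}")
```

The size of the effect shrinks as the number of starts in a season grows, which is why the open question is how large it is relative to the estimated pitch-count effect, not whether it exists.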

Taking it the other way: suppose a pitcher has an ERA of 3.00 over 99 innings (33 ER). In one start, he throws only 40 pitches and gives up 6 runs in 2 innings. His ERA over the rest of the season would be 27 ER in 97 innings, for an ERA of 2.51. So, you would definitely expect him to be better than his season 3.00 in his next start, even if the low number of pitches didn’t otherwise affect his arm.
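The arithmetic in this example can be reproduced in a few lines. The helper names below are hypothetical; the numbers are the ones given above (33 ER in 99 innings, with one start of 6 ER in 2 innings removed).

```python
def era(earned_runs, innings):
    """Earned run average: earned runs allowed per nine innings."""
    return 9.0 * earned_runs / innings

def era_excluding_game(season_er, season_ip, game_er, game_ip):
    """Season ERA recomputed with one game's runs and innings removed."""
    return era(season_er - game_er, season_ip - game_ip)

if __name__ == "__main__":
    full_season = era(33, 99)
    rest_of_season = era_excluding_game(33, 99, game_er=6, game_ip=2)
    print(f"full season: {full_season:.2f}")        # full season: 3.00
    print(f"excluding game: {rest_of_season:.2f}")  # excluding game: 2.51
```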

If you replace the “ERA this season” term in the regression by “ERA this season in all other games than this one,” do the results still remain?

P.S. Optional but even better, replace “ERA this season” by “Average ERA by game, in all other games than this one.” That should work better since you’re measuring average game ERA, which is higher than regular ERA since it’s weighted by game and not IP (as evidenced by table 2 where the mean game ERA is 5.64).
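To illustrate the difference between the two averages, here is a minimal sketch using three made-up starts (the game lines and function names are hypothetical, not data from the paper). Innings are given as plain decimals and must be greater than zero.

```python
def ip_weighted_era(games):
    """Regular ERA: total earned runs per nine innings across all games."""
    total_er = sum(er for er, ip in games)
    total_ip = sum(ip for er, ip in games)
    return 9.0 * total_er / total_ip

def mean_game_era(games):
    """Unweighted average of each game's own ERA."""
    return sum(9.0 * er / ip for er, ip in games) / len(games)

def mean_game_era_excluding(games, index):
    """Average game ERA over all games except the one at `index`."""
    others = games[:index] + games[index + 1:]
    return mean_game_era(others)

if __name__ == "__main__":
    # (earned runs, innings pitched) for three hypothetical starts
    games = [(0, 9.0), (2, 7.0), (6, 2.0)]
    print(f"IP-weighted ERA: {ip_weighted_era(games):.2f}")
    print(f"mean game ERA:   {mean_game_era(games):.2f}")
    print(f"mean game ERA excluding the short start: "
          f"{mean_game_era_excluding(games, 2):.2f}")
```

Short outings tend to be bad ones, so weighting every game equally gives them more influence than innings weighting does, which pushes the game-level average above the regular ERA in a case like this.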

No, it certainly doesn’t *have* to go up. I think it would be quite a stretch to expect your reasoning to drive the results. I ran lots of different models in lots of different ways, and the estimates were stable.

JC and Sean, I think that Phil is right: if you compare the “next game” to the rest of the season, including the high- or low-pitch game(s), you have a biased sample, assuming that the high- or low-pitch games are good or bad games (which they will be, on average). The “next game” will then regress to a lower ERA (if following a bad game) or a higher ERA (if following a good game).

The correct way, I think, is either to exclude the biased (high- or low-pitch) game or to somehow control for the fact that that game is a high- or low-ERA game as well.

Now, how much of an effect it has, I don’t know. Since you found a relatively small effect from high or low one-game pitch counts (.007 in ERA per “extra” pitch thrown), I would think that it is possible that the bias could account for that entire effect.

In your response to Phil, are you saying that he is incorrect in his assertion or that the effect of the bias is too small to make any difference in your conclusions – IOW, the effect that Phil describes is substantially less than .007 runs per extra pitch?

Estimating the effect using ERA excluding the game of analysis is doable but problematic. The average for the season is included to proxy for the ability of the pitcher in that season. Not holding it constant for a pitcher across all the games in a season introduces unwanted variance to the model: the estimation then has to account for fluctuations in a factor we want to remain constant, which may dilute the estimated effects of other factors. But, putting that aside, when I estimate the regression with a seasonal ERA that excludes the game of analysis, the estimated effect is approximately 0.005. This is less than the model’s non-linear approximation at the 101st pitch, which is 0.007. However, a difference between the two estimated models is that the new model is linear while the old one is not. When I estimate the old model as linear (that is, including seasonal ERA without taking out the game of analysis), the estimate is also 0.005.

One method of avoiding the potential bias in the estimate is to estimate ERA outside of the model using an instrumental variable technique. Unfortunately, a method for instrumental variable multiple fractional polynomial (IV-MFP) estimation hasn’t been developed. An important part of the estimation approach is to avoid imposing a shape on the estimated relationship by using MFP estimation, so that pitches and rest days can be measured non-linearly. However, because the estimate is virtually linear for most models, I estimated an IV-MFP model that proxied ERA in a season from past ERA and age. After doing this, the estimated effect of pitches thrown is 0.009.

We could throw variables in and out all day long, choose different estimators, parse the sample, etc. The point of this project is to generate a rough estimate of the impact of pitches thrown on performance. This impact will surely differ across pitchers and situations, so whether the impact is 0.005 or 0.009 is of little practical relevance. It is more of a general guide indicating that pitching less improves future performance and pitching more diminishes future performance.

JC,
When I look at this from 2005-2009 (with ERA adjusted as Phil suggests), I cannot reject the null hypothesis that pitch counts have no effect on subsequent performance. Is it possible for you to re-run the results with this smaller (but arguably most relevant for today’s game) dataset and see if your results still hold?
Is this material in the new book? If so, are you able to revise that content before publication?

I will reassess what I have done. This is still an ongoing research project. I may not be able to satisfy you, but I want to be sure that the ERA issue is not clouding the estimates. So, for now, consider the above results preliminary. I’ll provide an updated version of the paper when it’s done.

[…] JC did the oral presentation of the data he investigated with Sean Forman of baseball-reference.com. JC’s own site is sabernomics.com. He has put the slides and a draft of the paper up on his site if you’d like to see more detail than I can provide in this liveblogging recap. Here. […]