Monday, September 20, 2010

When it comes to picking the statistics I include in the spreadsheets I post here at the end of each season, my basic philosophy is to include only statistics that I can calculate myself (and that by extension the reader could calculate for himself) using data that is readily available. This necessarily results in using sub-optimal methods, as more advanced data (like that which can be culled from play-by-play accounts) can sharpen the focus of metrics and cut out some noise.

One example of this is the adjustment I make to a pitcher's standard Run Average (RA), which yields a metric called Relief Run Average (RRA). The original RRA is based on Sky Andrecheck's article in the August 1999 By the Numbers, and uses inherited runner data to modify relief pitchers' RA. This year, using the data on bequeathed runners available at Baseball Prospectus, I've expanded RRA to consider both bequeathed and inherited runners for both starting and relief pitchers.

The major weakness of my approach is that I only consider the raw number of runners bequeathed and inherited (and their fates), not the base/out situation in which the pitching change is made. As such, it is not as comprehensive as the metrics available at BP which consider base/out state, and for pitchers with unusual patterns of bequeathed and inherited runners, the adjustments might actually make RRA a less accurate gauge of true performance than standard RA. On average, though, taking even the rudimentary data on runners into account will make for a more accurate evaluation of the pitcher, and so I use it.

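Let me clearly define the statistical categories that go into RRA. IP and R are obvious, but we also have:

BR = bequeathed runners

BRS = bequeathed runners who scored

IR = inherited runners

IRS = inherited runners who scored

i = the league-average share of bequeathed or inherited runners who score

i is an important value that pops up twice in the formula. In the 2009 AL it was .337, while in the NL it was .303.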

First, let's account for bequeathed runners. The number of bequeathed runners that can be expected to score is simply i*BR. We can compare this to BRS in order to get our crude estimate of bullpen support for the pitcher. The only issue is what positive and negative should indicate.

For inherited runners, I have always put a positive sign on "inherited runs saved" when the reliever performs better than expected. Since bequeathed runs saved will be the opposite of inherited runs saved, this suggests that the sign should be reversed. When the bullpen allows fewer bequeathed runners to score than expected, this means that the bequeathing pitcher's RA is lower than it otherwise would be.

A similar requirement is that if we add a pitcher's figures in the bequeathed and inherited categories, the signs should work out. Relievers will both bequeath and inherit runners, so they need to match up. Positive bequeathed runs saved (BRSV) offset positive inherited runs saved (IRSV). I realize that this explanation is confusing; perhaps it will be clearer when I run through an example for a reliever.

Let's start with a pitcher who made all of his appearances as a starter and thus had only bequeathed runners. Barry Zito is such a pitcher (33 appearances, all starts), and he led all exclusive starters with 27 bequeathed runners, of whom 11 scored. We would expect that 27*.303 = 8.18 would have scored, so Zito was charged with an additional 2.82 runs:

BRSV = BRS - BR*i = 11 - 27*.303 = 2.82

Zito was charged with 89 runs in 192 innings for a 4.17 RA. His RRA will be (89 - 2.82)*9/192 = 4.04.
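For readers who would rather check the arithmetic in code, here is a minimal Python sketch of the Zito calculation. The function and variable names are my own for illustration; the inputs are the figures quoted above, and i is the 2009 NL value.

```python
def bequeathed_runs_saved(brs, br, i):
    """BRSV = BRS - BR*i: bequeathed runners who scored minus the expected number."""
    return brs - br * i


def run_average(runs, innings):
    """Runs allowed per nine innings."""
    return runs * 9 / innings


NL_I = 0.303  # 2009 NL value of i, as given in the text

# Barry Zito, 2009: 27 bequeathed runners, 11 scored, 89 runs in 192 innings.
zito_brsv = bequeathed_runs_saved(brs=11, br=27, i=NL_I)
zito_ra = run_average(89, 192)
zito_rra = run_average(89 - zito_brsv, 192)

print(f"BRSV = {zito_brsv:.2f}, RA = {zito_ra:.2f}, RRA = {zito_rra:.2f}")
```

A positive BRSV means the bullpen let more of the pitcher's runners score than expected, so subtracting it lowers his run average.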

Now let's look at a reliever who did not bequeath any runners. There are only four pitchers with 50+ innings in 2009 who did not bequeath a runner: three closers (Jenks, Papelbon, Valverde) and one remarkable starter (Halladay). Of the three relievers, Papelbon inherited the most runners (16, 4 of whom scored). We would have expected 16*.337 = 5.39 to score, so Papelbon saved an additional 1.39 runs:

IRSV = IR*i - IRS = 16*.337 - 4 = 1.39

Papelbon was charged with 15 runs in 68 innings for a 1.99 RA. His RRA is (15 - 1.39)*9/68 = 1.80.

Now let's consider the man who had the highest total of IR + BR, Grant Balfour. He bequeathed 43 runners, of whom 19 scored; we would have expected 14.49 to score, so his BRSV was 4.51, which means we're going to reduce his actual runs allowed by 4.51. On the other hand, he inherited 67 runners and allowed 18 to score; we would have expected him to allow 22.58, so his IRSV is 4.58. That means we reduce his runs allowed by a total (BRSV + IRSV) of 9.09 runs. He was charged with 38 runs in 67.1 IP (67 1/3 innings) for a 5.08 RA, but his RRA is figured using 28.91 runs and is thus 3.86.
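The reliever cases can be folded into one function that handles bequeathed and inherited runners together. This is only a sketch: the names are hypothetical, and the helper that reads "67.1" as 67 1/3 innings assumes the usual box-score notation.

```python
def innings_from_notation(ip):
    """Convert box-score innings such as '67.1' (67 and 1/3) to a number."""
    whole, _, outs = str(ip).partition(".")
    return int(whole) + (int(outs) if outs else 0) / 3


def relief_run_average(runs, ip, i, br=0, brs=0, ir=0, irs=0):
    """RRA = (R - BRSV - IRSV) * 9 / IP, with no park adjustment."""
    brsv = brs - br * i  # bequeathed runners scored beyond expectation
    irsv = ir * i - irs  # inherited runners stranded beyond expectation
    return (runs - brsv - irsv) * 9 / innings_from_notation(ip)


AL_I = 0.337  # 2009 AL value of i, as given in the text

# Papelbon: no bequeathed runners, 16 inherited, 4 scored.
papelbon = relief_run_average(runs=15, ip="68", i=AL_I, ir=16, irs=4)

# Balfour: 43 bequeathed (19 scored), 67 inherited (18 scored).
balfour = relief_run_average(runs=38, ip="67.1", i=AL_I,
                             br=43, brs=19, ir=67, irs=18)

print(f"Papelbon RRA = {papelbon:.2f}")
print(f"Balfour RRA = {balfour:.2f}")
```

Passing innings as a string avoids any ambiguity about whether ".1" means one out or one tenth of an inning.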

These estimates will not always match with BP's Fair RA, which considers the actual base/out state. They have Balfour at 4.24, which is still a far cry from the standard 5.08 but also not as generous as RRA. If a relief pitcher tended to inherit runners in favorable base/out situations, then bequeath them without altering the base/out state too unfavorably, considering both bequeathed and inherited runners could overstate their effectiveness.

For example, suppose Balfour inherited all of his runners at first base with one out (obviously not a realistic example, but it makes the point), and faced only one righty in each case before being relieved for a left-hander. He would be inheriting runners in a favorable base/out state, then bequeathing them in an often more favorable state (if he recorded an out).

If you're going to account for inherited and bequeathed runners, it certainly is best to consider the base/out state as BP does. My contention is simply that it is generally better to make the crude adjustment than no adjustment at all.

There's one important thing I haven't considered: park factors, which I overlooked in my previous applications of RRA as well. My old procedure was to apply IRSV to runs allowed, then park adjust the entire RRA once constructed. However, this approach is flawed because it holds i constant across parks. Suppose a reliever in Coors Field allows 30.5% of his inherited runners to score. Because that is slightly above the NL i of .303, he will get a negative IRSV. The park adjustment applied to his overall RRA will not change the fact that his RRA is higher than his standard RA, when in fact his performance in stranding inherited runners was likely above-average given the park.

So I've changed the way I apply the park factors this year. They are applied separately to each part of the RRA. I've assumed that the "i park factor" is equal to the square root of the standard runs park factor (PF). Square root adjustments are also good approximations for run components, and some very crude tests I ran suggest that it's a decent approach, at least one that is better than using PF with no adjustment or using the old, flawed method described above.
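To see what the square root adjustment does to the Coors example, here is a rough sketch. The park factor of 1.10 and the reliever's runner totals are made-up numbers chosen only for illustration, and the function names are my own; the sketch covers only the adjustment to i, not how the rest of RRA is park adjusted.

```python
import math


def park_adjusted_i(i, pf):
    """Scale i by the square root of the runs park factor."""
    return i * math.sqrt(pf)


def inherited_runs_saved(ir, irs, i, pf=1.0):
    """IRSV = IR * (park-adjusted i) - IRS."""
    return ir * park_adjusted_i(i, pf) - irs


NL_I = 0.303            # 2009 NL value of i, as given in the text
HYPOTHETICAL_PF = 1.10  # assumed runs park factor, not a real Coors figure

# Hypothetical reliever: 200 inherited runners, 61 scored (30.5%).
unadjusted = inherited_runs_saved(200, 61, NL_I)
adjusted = inherited_runs_saved(200, 61, NL_I, HYPOTHETICAL_PF)

print(f"IRSV with league i:        {unadjusted:+.2f}")
print(f"IRSV with park-adjusted i: {adjusted:+.2f}")
```

With the league i the reliever comes out slightly negative; once i is scaled up for the park, the same performance comes out positive, which is the behavior the new method is meant to capture.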

I'll close with a couple of charts showing the largest percentage differences between RRA and RA in 2009. For these charts, no park factors were used and i was set equal to the 2009 major league average of .313 rather than a league-specific value.

First, the pitchers who take the biggest hit by using RRA:

All setup relievers, unsurprisingly. Saito and Howell can still be said to have pitched well when runners are taken into account, but the other three go from having adequate RAs to below-average (at least for a reliever).

The biggest beneficiaries:

Two closers and three setup men, all of whom appeared fairly effective through the lens of RA but were even more impressive when bequeathed and inherited runners are considered.

It is not surprising that RRA has the biggest effect on relief pitchers, who routinely bequeath and inherit runners. Starters never inherit runners, are more often pulled between innings, and appear in far fewer games. So here are the biggest moves for starting pitchers, limited to pitchers with more than 100 innings and no inherited runners for the season:

And the flip side:

AJ Burnett had a 4.30 RA, while his teammate Andy Pettitte allowed 4.67. However, the Yankees' pen allowed just one of Burnett's nineteen bequeathed runners to score, while fifteen of Pettitte's twenty-five scored. Pettitte accordingly bests Burnett in RRA, 4.34 to 4.52. Roy Oswalt had it worse--eleven of his twelve bequeathed runners came around to score.
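The bullpen support behind those three starters can be put in BRSV terms with a few lines of Python. This sketch uses only the bequeathed runner counts quoted above and the major league i of .313 used for the charts; the dictionary layout is my own.

```python
ML_I = 0.313  # 2009 major league average of i, as used for the charts

# (bequeathed runners, bequeathed runners who scored), from the text
starters = {
    "Burnett": (19, 1),
    "Pettitte": (25, 15),
    "Oswalt": (12, 11),
}

# BRSV = BRS - BR*i; negative means the bullpen stranded more than expected.
brsv = {name: brs - br * ML_I for name, (br, brs) in starters.items()}

for name, value in brsv.items():
    print(f"{name}: BRSV = {value:+.2f}")
```

Burnett's figure is negative, so runs are added back to his total, while Pettitte and Oswalt have runs removed from theirs.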