Tuesday, August 30, 2011

There are a number of methods available to evaluate pitchers’ starts on a game-by-game basis rather than the more traditional full season method. There are Game Scores, Support Neutral records, Win Values, and a number of other approaches. There really is no need to add another approach to the mix, and I’m not really going to here--I'm simply going to take a conventional approach for evaluating a full season pitching line and apply it to individual starts.

Of course, I’m not going to claim that this approach is better than the others, because it’s not. It is relatively easy for me to implement, though, and I thought it would be nice to be able to offer a category on my year end stat report for starting pitchers that would consider distribution of performance rather than just aggregate performance as is the case for the rest of the metrics.

The idea is basically to estimate the winning percentage that a team should have over the long haul given the runs allowed and innings pitched of the starting pitcher. I am not using any sort of component RA estimate, and will not bother to explain the implications of this, which I’ll assume you’re well aware of (and thus can also decide for yourself whether you still have any interest in the results). To do this, I use Pythagenpat and assume that the performance of everyone other than the starting pitcher is league average. That is, the offense scores an average number of runs, and the bullpen allows an average number of runs. For the latter, I’m not going to account for the difference between the RA allowed by relievers and the overall league average.

That is not at all an inevitable choice, and if I were trying to construct a perfect metric I wouldn’t do it. But this is so obviously not a perfect metric that the extra effort would be of questionable value. Making that adjustment would also highlight the fact that this approach does not attempt to account for the effect of the number of innings the starter is able to log on the subsequent performance of the bullpen, despite research that suggests there is such an effect. Of course, using the league average doesn’t do anything to address that issue, but it also keeps things simple.

An additional benefit of not making any adjustment for the lower RA of the bullpen is that it allows this metric to be more easily compared to other metrics that measure the performance of starting pitchers directly against the overall league average--of which there are a sizeable number. The overall expected winning percentage for the team of a league average starting pitcher at the end of this road will be sub-.500, which while obviously false does in fact match the results of many full season-type metrics.

One thing that cannot be ignored is park effects; the question is how to apply them. One option is to only apply them to the elements of the team other than the pitcher--the bullpen and the offense. I’ll call that option A; option B is to apply the park adjustment only to the pitcher himself.

Option A is a little harder to implement, since there are two adjustments that need to be made. On the other hand, it has some appeal because it allows us to keep the actual run environment of the game rather than recasting it in an imaginary neutral park. I’ve decided to go with Option B because simplicity is a guiding principle here, and because it is more consistent with the way I apply park adjustments to full season metrics. Again, it’s far from an inevitable choice. I’m also assuming that all games are nine innings.

With the thought process out of the way, this isn’t a particularly hard metric to demonstrate. I’ll start simple, with a pitcher throwing a complete game in a neutral park in which he allows zero runs. His expected winning percentage for the game is 1.000.

Seriously, let’s consider a pitcher in a neutral park working seven innings and allowing two runs. I’ll assume it’s an AL pitcher, so we need to know that the 2010 AL average R/G was 4.45 (this is the constant N later). The pitcher’s team can thus be expected to allow 2 + (9 - 7)*4.45/9 = 2.99 runs and score 4.45 runs. This is a 2.99 + 4.45 = 7.44 RPG environment, which has a Pythagenpat exponent of 7.44^.29 = 1.79, and thus the pitcher’s team has an expected W% of 4.45^1.79/(4.45^1.79 + 2.99^1.79) = .671.

We could go through and count up the wins (.671) and the losses (1 - .671 = .329), but I’d rather keep it in rate terms, so the final result will just be the average expected winning percentage across a pitcher’s starts.

To generalize the formula, let N be the league average R/G, with R and IP as the runs allowed and innings pitched for the starting pitcher in a particular start. Let dPF be the park factor without the dilution adjustment that allows it to be applied to full season statistics combining home and road games--the park factors I publish (which are the ones I’ll use here, naturally) include that adjustment. A 1.03 PF does not mean that the park inflates scoring by 3%--it means that the park inflates scoring by 6%, and is diluted by averaging with 1.00 (a neutral park) so that it can be applied to full season statistics which are, at least in theory, comprised of one-half home games and one-half road games.

There’s really not much to it when you write it in math rather than English.
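Written in code it is only a few lines as well. The sketch below is my rendering of the method exactly as described--Pythagenpat with exponent RPG^.29, a league average offense and bullpen, and the undiluted park factor applied only to the pitcher's runs; the function name is mine:

```python
def game_w_pct(r, ip, n, dpf=1.0):
    """Expected W% of the starter's team for a single start.

    r: runs allowed by the starter, ip: innings pitched,
    n: league average R/G, dpf: park factor with the home/road
    dilution removed (applied to the pitcher only).
    """
    # Park-adjust the starter's runs; the bullpen covers the remaining
    # innings at the league average rate, and the offense scores n.
    team_ra = r / dpf + (9 - ip) * n / 9
    rpg = team_ra + n              # total run environment for the game
    x = rpg ** 0.29                # Pythagenpat exponent
    return n ** x / (n ** x + team_ra ** x)

# The worked example: 7 IP, 2 R in a neutral park with N = 4.45
print(round(game_w_pct(2, 7, 4.45), 3))  # .671, matching the text
```

A pitcher's final figure is then just the average of this quantity across his starts.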

I’m very tempted to cap it off by unscrambling it from a W% back into an estimated run average, but I’d rather not deal with the implications of aggregating multiple Pythagorean exponents. One of the advantages of a game-by-game approach is that you’re able to better match performance with the run environment in which it actually occurred, and thus avoid some of the distortions that are inevitable when performance is aggregated across different run environments.

I intend to implement this fully for 2011 starting pitchers (although I might change my mind on that depending on how I feel about the effort/usefulness tradeoff in October), but for now I ran the top five AL starting pitchers from 2010 (IMHO) through the process. For comparison, I’ve included a column called sW% which is based on a traditional use of a pitcher’s full season line to estimate the theoretical W% of his team (albeit without making any adjustment for innings/start):

X = (RA/PF + N)^.29
sW% = N^X/(N^X + (RA/PF)^X)
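For what it's worth, sW% translates directly to code; this is just the two lines above wrapped in a function:

```python
def season_w_pct(ra, n, pf=1.0):
    """sW%: theoretical W% of a pitcher's team from his full-season
    run average, park factor, and the league average R/G (N)."""
    adj_ra = ra / pf                  # park-adjusted run average
    x = (adj_ra + n) ** 0.29          # Pythagenpat exponent
    return n ** x / (n ** x + adj_ra ** x)
```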

If you use R and IP as the criteria, Felix Hernandez turned in the best performance of any AL starter, whether you aggregate or consider each game separately. Sabathia and Weaver come out about the same either way, but Lee and Price move in opposite directions when you look at the game level. This implies that Lee’s distribution of runs allowed and innings was such that it would figure to produce more wins than the averages would suggest, with Price the opposite.

This is more of a freak show stat than anything else, but it does provide a relatively simple way to compare starters at the game level on their bottom line results, and if a Cy Young race is particularly close, you may want to consider it. Or you may not; I don’t have a lot of conviction about this, and there are more rigorous approaches available, but there you go.

Tuesday, August 16, 2011

During the early part of the season, the perceived high dependence of the Yankees’ offense on home runs was often fodder for discussion in the mainstream baseball media. The implication was that while the Yankees were scoring a lot of runs, the fact that many of the runs were being scored on home runs was a sign that the performance was either unsustainable or would not be duplicable against quality pitching. The statistic most commonly cited in these discussions was the percentage of runs that scored on home runs. This post is not intended to comment on the sustainability or quality of pitching issues, but rather to offer a quick critique of the “percentage of runs scored on home runs” figure.

If one broadly divides approaches to credit runs to various individuals or events into two categories, there are those that only consider the final outcome (whether the run scored or not, and who did the scoring or the driving in) and those that attempt to assign credit at all steps in the process, even incremental steps that don’t directly push a run across the plate. The former class includes runs scored and RBI, of course, while the second class includes methods like linear weights and Base Runs.

The percentage of runs a team scores on homers obviously falls into the first class, and in fact when you think about it you will realize that the statistic is founded on a RBI perspective. The way a run gets tossed into the “resulting from a home run” bucket is to score on a home run--that is, to be driven in by a home run. Of course, one could also look at the question from the runs scored perspective--what percentage of the runners that score reached base on a home run? This is also very easy to compute, as it is simply the ratio between home runs and runs scored, data that is readily available for any team (whereas the number of runs actually scored on homers is harder to come by).

The RBI-based approach is subject to possible distortions in a manner somewhat similar to the issues with earned runs. If a home run is involved at all, the entire run is chalked up to the homer. Often, the home run is the key event enabling a run, but outside of the batter that actually hits the home run, it is never the only contributing event. In some cases, it is even relatively insignificant. If a home run scores a runner from third with no one out, it really didn’t have a large marginal value with respect to scoring that runner--the probability of the runner scoring was already very high, and any number of other events would have allowed him to score. On the flip side, when a runner on first base with two outs is driven home by a home run, the home run is much more vital to scoring the runner.

The discussion in the last paragraph makes the case for transitioning from the outcome perspective to the run expectancy perspective of linear weights. Using play-by-play data, one can calculate the actual linear weight value of the home runs hit by a team. Such an approach will still be subject to sequencing fluctuations and arguably may not be as predictive as a more context-neutral approach.

One obvious context-neutral approach is to use standard linear weight values applied uniformly to all events to estimate the number of runs contributed by home runs. This figure can then be compared to the total number of runs scored. Using fixed linear weight values, though, this approach ends up boiling down to the ratio of home runs to runs scored, times a constant. For example, if the linear weight value of a home run is 1.4 runs, the result of that calculation will just be 1.4 times the simple home run to run ratio.

The next refinement is to not use actual runs scored at all; this post is going to be way too dry as is, so I won’t even bother trying to explain why mixing actual runs scored with estimated run contributions is a bad idea--it should be relatively obvious. Instead, you can compare the estimated run value of the team’s home runs to the run value of all of its offensive events.

There is a complicating factor in using linear weights (or intrinsic weights derived from a dynamic run estimator as I will in a moment) in this manner--the negative run value of the out. Simply taking Number of Event * Coefficient of Event for every event and dividing by the estimate of runs scored will result in percentages that sum to more than 100%, until outs are subtracted (and outs will have a negative percentage). This means that you can’t use the value literally--if the ratio of 1.4*HR/estimated runs scored is 25%, it doesn’t mean that 25% of the runs were scored because of home runs. Alternatively, one could look only at positive events, but then the denominator is no longer runs at all. As long as the number is viewed as a ratio and not a true percentage contribution, the result can still be useful in measuring the contribution of the home run to the offense.
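To make the out problem concrete, here is a toy calculation with generic fixed linear weight values (both the coefficients and the team's event counts below are illustrative numbers of my choosing, not figures from the post):

```python
# Generic, approximate linear weight values (runs per event); the out
# uses an "absolute" value near -.1 so the total estimates runs scored.
weights = {"1B": 0.47, "2B": 0.78, "3B": 1.04, "HR": 1.40,
           "BB": 0.31, "OUT": -0.10}

# Hypothetical team event counts.
events = {"1B": 950, "2B": 290, "3B": 30, "HR": 180,
          "BB": 550, "OUT": 4100}

est_runs = sum(weights[e] * events[e] for e in events)
positive = sum(w * events[e] for e, w in weights.items() if w > 0)

# The positive contributions alone exceed the run estimate, so event
# "shares" of est_runs sum to more than 100% until the out's negative
# share is added back in.
hr_share_of_runs = weights["HR"] * events["HR"] / est_runs
hr_ratio_of_pos = weights["HR"] * events["HR"] / positive
```

The second ratio is the POS-style figure: smaller, and safe to read as a share, since its denominator contains nothing negative; the first is only meaningful as a ratio, not a literal percentage of runs.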

Using a dynamic run estimator like Base Runs has the advantage of attempting to take into account the interaction between the offensive events rather than just assuming a fixed value. However, in the case of the home run, the additional value of considering dynamism is less than it might be for some other events because the value of a home run stays relatively fixed. The intrinsic value of a home run in BsR is:

((B + C)*A*b - A*B*b)/(B + C)^2 + 1

Where A, B, and C are the total A, B, and C factors for the team, and b is the B coefficient for the home run (the home run has no C component, so only b appears; the +1 at the end is its D contribution).
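As a sketch, with hypothetical team totals and a placeholder b (none of the numbers below come from the post):

```python
def hr_intrinsic_weight(A, B, C, b):
    """Intrinsic linear weight of a home run given team BsR factors.

    This is the derivative of BsR = A*B/(B + C) + D with respect to HR,
    where a HR adds b to the B factor and 1 to D; it simplifies
    algebraically to b*A*C/(B + C)**2 + 1.
    """
    return ((B + C) * A * b - A * B * b) / (B + C) ** 2 + 1

# Hypothetical team totals (A ~ baserunners, B ~ advancement, C ~ outs)
# and a placeholder B coefficient for the home run.
w = hr_intrinsic_weight(A=1900, B=2100, C=4100, b=2.0)
```

The simplified form makes the point about dynamism: the weight stays near 1 plus a modest fraction regardless of the offensive context, which is why the HR's value moves so little from team to team.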

I’ve also figured the intrinsic weights for the other events so that I can also show you the percentage of the positive intrinsic linear weight total contributed by home runs (“POS” in the chart below).

With this, we can look at the four different approaches I’ve discussed for 2010. In the chart below, “hr” is the intrinsic LW of the home run, “RonHR” is the number of runs that actually scored on home runs, “%onHR” is the RBI-perspective figure that gets a lot of media play (RonHR/R), “HR/R” is the runs scored-perspective figure, “BsR%” is hr*HR/BsR, and “Pos%” is hr*HR divided by the sum of the products of the positive event counts and their respective intrinsic weights.

I’m not going to add much comment on these figures. This list is sorted by BsR%, which I think is the best measure of how large a share of the offense the home run represented. Toronto was in its own world, of course, with respect to home runs hit and the share of offense contributed by the homer no matter how one estimates it. Also note that the estimated linear weight value of the home run for every major league team falls in the [1.401, 1.444] range except for the Jays, who come in 3.7 standard deviations below the mean at 1.355.

Wednesday, August 10, 2011

In my last post on ERA estimators, I described how my philosophy towards constructing those metrics is predicated on starting with a solid model to estimate runs. By using a solid foundation, you can be confident that, at the very least, your metric will adhere to the fundamental constraints of the run scoring process. The designer retains freedom to experiment and estimate when it comes to selecting the inputs into the model (i.e., what gets filled in for hits, walks, home runs, etc.).

In the article I sort of asserted that this could be done, and while I granted that it might be a more difficult process, I didn’t demonstrate how it could be done. This post will offer an (admittedly simple) estimator using BsR with limited inputs and a lot of estimation. The point is not to develop a metric that anyone will actually use.

I’m going to define plate appearances as AB + W, which can be approximated by IP*2.84 + H + W (it can also of course be calculated from the horribly named BFP column), but I’ll just refer to it as PA in the equations. The BsR equation I’ll be using as a basis is:

We only have direct knowledge of walks. Everything else will have to be filled in using estimation, for which I’ll use the 2010 major league totals. I’m not going to attempt to state any interrelationships between strikeouts, walks, and the events to be estimated--everything will simply be based on a scalar times (PA - W - K), a quantity which I’ll call N (the estimate of N based on IP is IP*2.84 + H - K).

In 2010, the ratio of hits to N was .369; the ratio of homers to N was .04; the ratio of total bases to N was .578; and the ratio of (AB - H - K) to N was .768. Thus:

To convert to RA, multiply by 9 and divide by C/2.84, which is a rough estimate of innings pitched (C itself stands in for total outs; ideally, you would separate strikeouts from outs in play for this estimate). This is equivalent to multiplying by 25.56 and dividing by C:
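Since the exact BsR equation the estimator is built on isn't reproduced above, here is a sketch using one commonly published version (A = H + W - HR, B = (1.4*TB - .6*H - 3*HR + .1*W)*1.02, C = AB - H, D = HR) as a stand-in, with the 2010 ratios plugged in:

```python
def kw_only_ra(ip, h, w, k):
    """Estimated RA from IP, H, W, K via a stripped-down Base Runs.

    H, HR, TB, and outs in play are back-filled from 2010 MLB ratios
    against N = PA - W - K; the BsR coefficients here are one common
    published set, standing in for the post's actual equation.
    """
    n = ip * 2.84 + h - k          # N estimated from IP, H, K
    eh = .369 * n                  # estimated hits
    ehr = .04 * n                  # estimated home runs
    etb = .578 * n                 # estimated total bases
    c = k + .768 * n               # outs: K plus estimated outs in play
    a = eh + w - ehr
    b = (1.4 * etb - .6 * eh - 3 * ehr + .1 * w) * 1.02
    bsr = a * b / (b + c) + ehr
    return bsr * 25.56 / c         # per 9 innings, using IP ~ C/2.84

# A hypothetical pitcher line, not one from the post:
ra = kw_only_ra(ip=220, h=190, w=55, k=210)
```

As you would expect from the construction, holding everything else fixed, adding strikeouts lowers the estimate and adding walks raises it, and the output lands in a plausible RA range for a workhorse starter.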

The range for the estimated RA when applied is not as wide as the range for actual RA, which shouldn’t be a surprise since I intentionally took everything except strikeouts and walks out of the equation and didn’t do anything to amplify their value. For example, the top five starters in the AL in 2010 according to this formula were:

Again, the point is not to offer this as an equation that should be used. It’s simply an illustration of constructing a Base Runs equation while restricting the list of available inputs, yet still estimating each component separately. This same idea can be expanded upon (adding home runs to walks and strikeouts, for instance, would result in a standard DIPS-style estimator, and there are many other possible combinations of inputs) to produce a metric that is grounded in the foundation of the Base Runs model. As I mentioned in the previous post, one need not tie oneself to the “dumb” kind of estimation on display here (i.e. assuming that the allowed variables have no ability to improve the prediction of the missing variables).

Sunday, August 07, 2011

On Friday night, JB Shuck made his major league debut with the Astros. He entered the game as part of a double switch in the top of the fifth, singling in the bottom of the inning and later grounding out to second. On Saturday night, he again entered in a double switch, drawing a walk in the seventh and singling off (literally) John Axford in the ninth. Unfortunately, he also made the last out at third base, trying to advance after Axford’s throw to first ended up in right field.

Shuck is not a top prospect in any respect; he’s a 24 year-old left-handed hitting outfielder who would be stretched in center and doesn’t have any power (career .085 minor league ISO). Were he not in an organization with little talent to begin with that just traded two-thirds of its outfield, he wouldn’t be getting a chance at the majors, and he probably won’t have much of a career. Of course, I would love to be wrong about all that.

Shuck was notable during his OSU career (2006-08) as a two-way player, a type that was always quite rare on Bob Todd coached teams. Shuck was a very good left-handed pitcher for OSU (as you can probably guess, he didn’t have great stuff, but for a Big Ten left-hander he had plenty) and played left or center as well, often batting third.

However Shuck’s career ends up, he has helped to make this a banner year for Buckeyes in the major leagues. Two of his former teammates, Eric Fryer and Matt Angle, also broke in this year, making it the first season with three OSU debuts since 1969 (Steve Arlin, Chuck Brinkman and Fred Scherman). Three Bucks also debuted in 1961 (Galen Cisco, Johnny Edwards and Ron Nischwitz) and 1927 (Arlie Tarbert, Marty Karow and Russ Miller). To put the size of this crop into perspective, during 2000-2009 only three Buckeyes made the majors in total (Nick Swisher, Josh Newman and Scott Lewis).

Along with Nick Swisher (2004 debut) and Cory Luebke (2010), that makes five OSU products who have appeared in the majors in 2011. The last time that many Buckeyes played in the majors was 1974; we have a lot of work left to do to reach the high-water mark of nine in 1969. While the three newbies are not really prospects (Fryer probably has the best outlook, on the basis of being a catcher), Swisher is an established quality contributor and Luebke is on his way to establishing himself as such, and the streak of at least one major league Buckeye (which dates to 1990, re-established after a three-year drought from 1987-89) appears to be safe for some time to come.