Monday, September 16, 2002

Pitching XR: Dividing The Responsibility for Runs

How much is fielding worth?

Statistical analysts have been scratching their heads for years trying to
determine how the responsibility for run prevention should be divided between
pitching and defense. Pete Palmer and John Thorn, in The Hidden Game of Baseball,
suggest that defense is about 12% of run prevention, because about 12% of runs
(at that time) were unearned. Bill James, in developing the Win Shares method,
uses a formula that assigns about 32.5% of the responsibility for run prevention
to the defense. In his Defensive Translations that are published in Baseball
Prospectus, Clay Davenport first removes events that cannot be affected
by the fielders from his analysis, then assigns the defense responsibility for
70% of the remaining runs. Voros
McCracken’s DIPS analysis suggests that differences between pitchers
are small enough so that they can be ignored, and the defense can be assigned
100% of the responsibility for preventing runs on balls in play. In this case,
the pitchers would be responsible only for runs that are scored (or prevented)
as the result of those events that cannot be affected by the fielders - strikeouts,
walks, HBP, and HRs.

We do have methods for estimating the impact of pitcher-specific events on run scoring - for example, Extrapolated Runs (XR). The formula for XR contains explicit weights for strikeouts, walks (including intentional walks), hit batsmen, and home runs. By using the XR formula, we can estimate the pitcher-specific contribution to XR - which I’ve called Pitching XR (PXR) - and use that as a starting point for dividing up the responsibility for run prevention between pitching and fielding.
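As a sketch of the PXR idea, here is the calculation in code. The coefficients below are Jim Furtado's published XR weights as commonly cited; treat the exact values as an assumption and check the original formula before relying on them.

```python
# XR weights as commonly cited from Jim Furtado's Extrapolated Runs
# formula (assumed values -- verify against the original publication).
XR_WEIGHTS = {
    "1B": 0.50, "2B": 0.72, "3B": 1.04, "HR": 1.44,
    "BB_HBP": 0.34,      # non-intentional walks plus hit batsmen
    "IBB": 0.25,
    "SB": 0.18, "CS": -0.32,
    "OUT_BIP": -0.090,   # outs on balls in play: (AB - H - K)
    "K": -0.098,
    "GIDP": -0.37, "SF": 0.37, "SH": 0.04,
}

def pitching_xr(hr, bb, ibb, hbp, k):
    """PXR: sum only the pitcher-specific XR terms (HR, BB, HBP, K),
    ignoring every term driven by balls in play."""
    return (XR_WEIGHTS["HR"] * hr
            + XR_WEIGHTS["BB_HBP"] * (bb - ibb + hbp)
            + XR_WEIGHTS["IBB"] * ibb
            + XR_WEIGHTS["K"] * k)
```

Dividing a pitching staff's PXR by the full XR computed against it gives the percentage shown in Table 1 below.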

Using season statistics from CBS Sportsline and ESPN, I calculated both XR and PXR for each major league team as of 9 September 2002. The results are listed in Table 1, below.

Table 1. XR and Pitching XR, 9 Sep 2002

Team             XR        PXR        %

Cleveland       616.0     334.0    54.2%
Oakland         726.1     371.5    51.2%
San Francisco   727.6     371.0    51.0%
Chicago(N)      672.9     342.5    50.9%
Texas           770.5     391.1    50.8%
Chicago(A)      739.8     374.8    50.7%
New York(A)     808.1     408.5    50.6%
Philadelphia    708.5     330.4    46.6%
Baltimore       617.1     283.7    46.0%
Cincinnati      665.5     304.9    45.8%
Pittsburgh      593.0     270.3    45.6%
Montreal        665.9     302.6    45.4%
St. Louis       681.4     308.5    45.3%
Arizona         700.1     315.2    45.0%
Atlanta         648.3     291.6    45.0%
New York(N)     605.1     271.9    44.9%
Houston         684.7     304.9    44.5%
Toronto         690.9     304.4    44.1%
Boston          745.3     318.5    42.7%
San Diego       618.6     263.5    42.6%
Florida         653.3     276.2    42.3%
Kansas City     644.5     271.3    42.1%
Milwaukee       619.6     260.7    42.1%
Seattle         750.7     314.0    41.8%
Colorado        667.5     269.6    40.4%
Los Angeles     615.7     248.5    40.4%
Minnesota       702.3     274.0    39.0%
Tampa Bay       606.7     232.0    38.2%
Anaheim         731.8     270.6    37.0%
Detroit         559.2     200.7    35.9%

Totals        20236.8    9081.5    44.9%

I’ve looked at this for several other recent seasons, and the range of performance has generally been about the same, with pitcher XR being close to 45% of total XR. That would suggest that, if we follow Voros’s approach and assume that pitcher contributions to run prevention from BIP can be ignored, the defenders are responsible for 55% of run prevention. If we instead use the Davenport model of dividing the responsibility for run prevention from balls in play as 70% fielding and 30% pitching, we then get an estimate that the fielders are responsible for about 38.5% of run prevention, closer to the 32.5% that James uses in Win Shares.
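The arithmetic behind those two fielding estimates is simple enough to lay out explicitly (a trivial sketch, using the roughly 45% pitcher-specific share found above):

```python
# Two ways to split the remaining (balls-in-play) share of run prevention.
pxr_share = 0.45           # pitcher-specific (K, BB, HBP, HR) share of XR
bip_share = 1 - pxr_share  # share driven by balls in play, about 55%

# DIPS-style: fielders get the entire balls-in-play share.
fielding_dips = bip_share            # 0.55

# Davenport-style: fielders get 70% of the balls-in-play share.
fielding_davenport = bip_share * 0.70  # 0.385

print(fielding_dips, fielding_davenport)
```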

Voros’s research demonstrates that pitchers do not have a large impact on the results of balls in play, and therefore that the fielders should assume more importance when most of the run scoring derives from balls in play. An approach based on Pitching XR (or any similar run-estimation method that allows separation of the impact of pitcher-specific events from the impact of fielder-affected events) balances the relative responsibilities of pitchers and fielders far better than a fixed percentage assignment such as James uses, especially for eras like the pre-1920 game in which balls in play dominated run scoring.

Reader Comments and Retorts


Something it looks like you missed -- you need to account for the hit and error prevention capabilities of a strikeout. As such, you need to increase the value of a strikeout to accommodate these, so the total value is about 0.20 runs.

Mike and I (along with others) have already begun a discussion of this topic on SABR-L, so I'll just try to summarize my views. I think this topic is extremely important and I welcome any and all approaches.

Having said that, I really have a tough time understanding how Voros McCracken's DIPS research has any significant bearing on this topic. Mike suggests that McCracken's research indicates that fielders deserve almost all of the credit for converting balls in play into outs. This is allegedly due to the finding that a pitcher's $H average varies wildly from year to year (and is better explained by team stats). I don't want to re-open the DIPS debates, so I'll stipulate to those findings. But I just don't see how those findings imply that pitchers deserve virtually no credit for converting balls in play into outs.

Even if you believe that Greg Maddux and Jose Lima have the same ability to generate routine ground balls or easy pop-ups/fly-outs, I am not sure why this ability has to be ZERO.

Maybe this is an argument over "ability" versus "outcome". Even if Greg Maddux has no ability to get a greater preponderance of routine ground balls than Jose Lima, I think it is clear that both pitchers deserve credit for the routine ground balls (pop-ups/fly-outs) that they do generate. One has little to do with the other.

I like to pose two extreme examples. In one universe, for whatever reason there are a lot of routine ground balls (e.g., easy two hoppers) and easy fly balls. You can think of the dead ball era if you like. Maybe pitchers are able to fool the hitters with a dazzling array of off-speed pitches that generate a lot of poorly-struck balls.

In the other universe, for whatever reason there are a lot more hard hit balls. Think of baseball with a super ball. Maybe pitchers cannot throw dazzling curve balls or fool the batters (Coors Field?). Here most of the outs on balls in play are the result of great fielding plays (or positioning).

In each of these two universes McCracken's thesis could still well apply. There could be wide season-to-season variability in each pitcher's $H. However, that is largely irrelevant to how much credit should be given to the class of pitchers (versus the class of fielders) for converting balls in play into outs.

The class of pitchers in the first universe deserve a great deal more credit for converting balls in play into outs than the class of pitchers in the second universe. In the first universe, fielding is no big deal. Any high school player could convert these balls into outs. In the second universe, though, high school kids would be running for cover. It takes special fielding skills in this universe to convert balls in play into outs.

Once you admit that possibility, the issue of how much of the credit should be allocated to pitchers vs fielders becomes central. But I don't think that McCracken's research sheds much light on that important question. Based solely upon observables throughout the league-season, in my mind the question comes down to what pct of balls in play are easily converted into outs.

"McCracken's research indicates that fielders deserve almost all of the credit for converting balls in play into outs"

No, Voros shows that pitchers have little control. That doesn't mean that FIELDERS deserve the rest. It could be luck, and it probably is. The easiest/safest approach is to split the difference, and assume that on balls in play, the fielders and the pitchers are equally responsible.

Actually, I said that "balls in play" can be split 50-50 between fielding and pitching. The other balls not in play (HR,K,BB) is almost 100% pitching. Overall, you'd end up with defense being roughly 70% pitching, and 30% fielding (depending who the pitcher is).

I agree that a 50/50 split between pitcher and fielders seems reasonable for the allocation of credit for converting balls in play into outs. Using Mike's estimate that pitcher-specific elements (HR, BB, SO) contribute about 45% of run prevention, this would mean that pitchers are responsible for about 72.5% and fielders 27.5%.

I would have no objection if one wanted to shade the percents in favor of fielders to reflect McCracken's findings (maybe to 70/30 overall).

By the way, is there any problem with expressing PXR/XR as a percent since the effect of strikeouts is to decrease runs allowed? For example, suppose no home runs or walks. Suppose XR is 100 and PXR is -10. What does the PXR/XR ratio mean in this case? Maybe I'm misinterpreting what Mike is doing here.

Finally, doesn't anyone else think that we can estimate (guestimate) the proper split by looking at the percent of routine outs?

I would have no objection if one wanted to shade the percents in favor of fielders to reflect McCracken's findings (maybe to 70/30 overall).

That's exactly what Clay Davenport did; see the article on pp. 4-10 of Baseball Prospectus 2002.

Finally, doesn't anyone else think that we can estimate (guestimate) the proper split by looking at the percent of routine outs?

What's a routine out (or routine play, if one wants to split hairs like John Pastier did on SABR-L)? If Barry Bonds is batting, is a groundball to 3B a routine play? If an infield shift is on, in most cases it won't be, because the 3B will either be playing SS or over behind the 2B bag, depending on how his team implements the shift. (I saw Adam Dunn get a double in a minor league game last year on a similar ball.) Is a popup in front of and to the right of the pitcher's mound a routine play? It wasn't in game 2 of the Southern League divisional series between Jacksonville and Carolina, in which four fielders converged on such a popup only to allow it to drop to the ground untouched.

For a play to be what we normally call routine, the fielders have to (a) be positioned in such a way as to reach the ball easily, and (b) handle the ball properly, both in catching it and (where necessary) in throwing it. Those are aspects of fielding skill that need to be part of any measurement of that skill - and if we foreclose ourselves from measuring the impact of positioning and clean handling and throwing by eliminating balls in play that we don't *think* are affected by those skills, we could easily bias our evaluation. I would therefore argue that, for the purpose of evaluating the pitching/fielding split, we cannot assume that *any* ball in play is a routine out.

No, Voros shows that pitchers have little control. That doesn't mean that FIELDERS deserve the rest. It could be luck, and it probably is.

It probably isn't *luck* (I prefer *random variation*, by the way), but some combination of the opposition hitting and fielding with random variation thrown in. The season-to-season relationship in $H for hitters is stronger than that for pitchers, which certainly suggests that the opposition hitters have quite a bit to do with it. The team relationship season-to-season, as ColinM noted, is positive and stronger than the relationship for individual pitchers, indicating that the combination of pitching+fielding has something to do with it.

One thing Voros did not do - which needs to be done - is control for the opposition hitters. If the hitters contribute to the determination as to whether a ball in play becomes a hit, then some of the individual variation we see for pitchers might result from facing a different mix of hitters. With the changes to the schedule in recent years, if your timing is good you might never see Barry Bonds at all if you're in the NL Central or East, and the next season when your timing is lousy you might see him in every series you play the Giants. Weight the opposing hitters by PA and $H, figure a weighted average of what they do when they are facing a generic pitcher, and compare it to how you actually performed; there may be some differences in the end result.

The last time I read Mr. McCracken's update, this is what he said -- this is a cut and paste quote, and not my recapitulation:

1. The amount that MLB pitchers differ with regards to allowing hits on balls in the field of play, is much less than had been previously assumed. Good pitchers are good pitchers due to their ability to prevent walks and homers and get strikeouts in some sort of combination of those three.

2. The differences that do exist between pitchers in this regard are small enough so that if you completely ignore them, you still get a very good picture of the pitcher's overall abilities to prevent runs and contribute to winning baseball games.

3. That said, the small differences do appear to be statistically significant if generally not very relevant.

As I read it, Mr. McCracken was not saying that the pitcher does not have any control over whether the batted ball becomes a hit, but that the difference among Major League pitchers in this regard is of minimal importance compared to differences in walk rates, strikeout rates, etc.

I firmly believe that line drives are more likely to be hits than flies and ground balls. I can't prove it, so if you have facts to prove me wrong, let me have it with both barrels. But if I am right, then some of the unstated logic behind Mr. McCracken's results may be the fact that, in order to make it to the major leagues, the pitchers already have to fit within a certain class -- pitchers whose balls in play do not tend, to too great an extent, to be line drives, otherwise they probably would not make it this far.

So they are among a class of pitchers who, if they are not striking out guys or walking them or giving up HRs, are giving up batted balls within certain parameters of line drives, ground balls, fly balls. The ratio of line drive to ground balls and fly balls has to be within a reasonable range, or they would not make it this far. The actual number of each will vary some each year, and actual number of each category which becomes hits in any given year is a combination of factors including dumb luck and the skills of the fielders.

So I don't think we should say that the pitcher does not get any credit for balls in play. Rather, that the credit has to be shared, particularly for those classes of pitchers (knuckleballers?) who show a greater tendency to a low BIP BA.

Under this approach then, there is no data used. We simply choose a number out of thin air (50/50). I generally try to think of ways we can answer the question before resorting to the thin air approach.

One reason I bring this up is that most people would believe that the proper allocation between pitchers and fielders for converting balls in play into outs should *vary over time*. Maybe significantly higher in the deadball era and lower today.

If we use the thin air approach and say 50/50 seems reasonable, then we have no "flexibility" to pluck some other number out of thin air for other eras (50/50 being the single predominant split).

Plus, of course, there is no guarantee that 50/50 is anywhere near the proper split. In situations like this, I like to ask myself what questions would I ask a clairvoyant, even if they are questions I cannot possibly know the answers to. The questions can then be looked at in different ways to try to infer answers.

In this case, I would ask the clairvoyant what pct of outs on balls in play would have been turned into outs by a team of fielders made up of typical high school players. In the deadball era, this might be around 50% (I'm just guessing), whereas today it might be around 25% (ditto). Maybe others would ask the clairvoyant different questions, but I think we'd all ask some questions.

I firmly believe that line drives are more likely to be hits than flies and ground balls. I can't prove it, so if you have facts to prove me wrong, let me have it with both barrels.

No, this is dead-on correct. I need to rerun this data, but IIRC the ratios were about 15% of fly balls, 25% of ground balls, and 80% of line drives becoming hits (data from the 1998-2000 PBP database that I licensed from Gary Gillette and Pete Palmer).

The ratio of line drive to ground balls and fly balls has to be within a reasonable range, or they would not make it this far.

Whether a ball in play is a ground ball or a fly ball is something that the pitcher *does* control, more so than the hitter. It could be that a fly ball pitcher has an inherent advantage over a ground ball pitcher, and that advantage is countered by the fly ball pitcher also allowing more line drives (perhaps because he's more likely to hang a pitch than the ground ball pitcher who's usually throwing some kind of sinker). Something worth checking...

Rather, that the credit has to be shared, particularly for those classes of pitchers (knuckleballers?) who show a greater tendency to a low BIP BA.

Bill James once suggested that the reason that more knuckleballers weren't successful was that good teams rarely saw a need for a knuckleballer, while bad teams would be more likely to take a chance on one. It's possible that knuckleballers, like "Tommy John" pitchers, need a good defense behind them to succeed, especially since unlike most pitchers, the knuckleballer can't always rely upon the relative certainty that a strikeout will be an out. It's possible, maybe even likely, that the "knuckleballer advantage" comes from the fact that the knuckleballer can't succeed unless the defense behind him is doing a good job.

"Finally, doesn't anyone else think that we can estimate (guestimate) the proper split by looking at the percent of routine outs?" I don't see what that has to do with it, even if we could define routine. Perhaps, Rob, you can clarify?

Rob and others: this split that we try to come up with, I think we should be careful. For example, if you have a .300 hitter facing a .200 pitcher, and you have a .200 hitter facing a .300 pitcher (let's assume fielding is lumped with pitching), what is the expected BA of this outcome (assume a .260 league)? Would you believe that the resultant BA is the *same*?

If on the other hand "hitting" was more responsible for this PA, then the expected BA of these outcomes would shade closer to the batter's actual BA. This does not happen.

So, I go back to Rob's statement about the 60/40 split. Are you saying then that the shading of such outcomes actually is towards the pitcher? That is, if you take a .300/.200 matchup, that the result is different if it was hitters or pitchers?

The reason why pitchers have more control is they throw the damn ball. High pitches go for flyballs, low pitches go for groundballs. Even with Mark McGwire's golf swing, he was more likely to put a high pitch in the seats than a low pitch.

Replying to Tango's points, yes, I was saying that I believed that the pitcher (+fielding) influences the outcome of a plate appearance more than the hitter. Your approach of looking at specific batter-pitcher matchups and comparing the outcomes to the prediction based upon a 50/50 split is what I did earlier this year. I only looked at a few cases since I was working essentially by hand. I am very interested in any research you (or others) have done in this area.

Secondly, I refuse to believe that I am the only one who believes that the proper allocation of credit between pitchers and fielders for converting balls in play into outs depends upon how many balls are easy vs difficult plays for the fielders.

I am not saying that the pitcher deserves ALL the credit for routine outs (supposing for the moment that we can define that term). What I am saying is that the pitcher deserves MORE of the credit for routine outs than for non-routine outs. Maybe not everybody agrees with me.

To make the point clearer, consider playing baseball with mechanical pitchers. In one league, the mechanical pitchers are set to be Double-A quality pitchers, and in another league the mechanical pitchers are set to be top-notch MLB pitchers. In the "AA" league, presumably fewer outs will be routine balls than in the MLB league. I would conclude that the fielders in the MLB league deserve less of the credit for converting balls in play into outs than in the AA league. Do people not agree with this point?

What I am saying is that the pitcher deserves MORE of the credit for routine outs than for non-routine outs.

*** Rob, I totally agree with this, as long as you have variations of hard to not-hard hit balls. In slo-pitch, no-curve, 12-foot arc, no walk, softball leagues, the pitcher is really irrelevant, regardless of the hard-hit, soft-hit aspect. Essentially, there should be no variation pitcher-to-pitcher in the hard/soft categories.

In MLB, this is not necessarily true. I am sure that Maddux gives up more soft hit balls, and this might explain why he has a slight advantage over his teammates when looking at his $H.

I think you and I agree generally about this aspect, and if we were to discuss this further, we are probably in agreement on the specifics as well.

Suppose we think about the slow-pitch softball example. At one stylized extreme every ball is hard hit and there is no variation across pitchers. Here most would say that pitchers deserve little of the credit for converting balls in play into outs. Every out is the result of a good fielding play (including good positioning).

Now slowly introduce a bit of reality. Batters are not perfect, sometimes they mis-hit a ball and hit a routine grounder or weak pop up. But continue to suppose that pitchers have absolutely no influence in generating these routine plays. In this stylized example it would probably be fair to split the credit for these mis-hit outs 50/50 between pitcher and fielder (and still very little of the credit to the pitcher for the hard-hit outs).

Okay, now dial up the pitcher's ability to generate these routine plays (ignore the issue of variation across pitchers for now). Then most would say that pitchers deserve more of the credit than in the second example above. And as the ability to generate these routine plays increases, the allocation of credit to the pitcher increases. (Voros' research may well provide an upper bound for this?)

My question to the assemblage is whether it is possible to distinguish the second and third examples. That is, how can we reasonably decide how much (if any) influence the pitcher has in generating weakly hit balls? Does it require variation across pitchers? (I hope not.)

Thanks to Mike, Tango, and everybody for their patience as I pedantically try to better understand this important issue.

Rob, this is a very important issue, and you bring up an excellent point:

====
Batters are not perfect, sometimes they mis-hit a ball and hit a routine grounder or weak pop up. But continue to suppose that pitchers have absolutely no influence in generating these routine plays. In this stylized example it would probably be fair to split the credit for these mis-hit outs 50/50 between pitcher and fielder
===

Again, why the split? Based on this, you are saying that the reason there was an out is 100% because of the batter. Therefore, why not attribute the mis-hit outs on the defensive side to "luck" / "random variation". That is, both the fielders and the pitchers had nothing to do with getting the out. You could have put a 4-year old out there, and he would have gotten the out only because the batter mis-hit (according to your example).

Therefore, there are THREE components on the defense: pitchers, fielders, AND LUCK. Let's assume that the split in MLB is 20% pitching, 20% fielding, and 60% luck. Since we don't like to give any credit to luck, and since the players are playing, we feel the need to give out the other 60% to the players. And hence the split.

But this is exactly like your example. Simply by virtue of having a 4-year old *exist* on the playing field we attribute him the out. This does not make it right, necessarily, but it gives credit to the players only.

So, while Voros' DIPS may say that the pitchers are "responsible" for 20%, that doesn't mean that the fielders get the other 80%. If you look at it the other way, you may find that the fielders only deserve, say, 30% of the outs. The other 50%, which are converted to outs virtually by random variation, should then be allocated back to the players on defense.

This is why I'm happy to give a 50-50 split on balls in play, though we should try to find the better split.

Yes, I skipped over what I thought everyone already agreed with (my bad). If we absolutely knew that it was just LUCK that determined if a ball in play goes for a hit or not, we would still need to allocate the credit for it among the pitcher and the fielders. Right?

That is, Mike framed the question as basically asking how much of the observed run prevention credit should be allocated to the pitcher. For example, a pitcher with a 3.00 ERA in a league with a 4.50 ERA. Essentially we need to decide how much of the credit for this differential of 1.50 in ERA should go to the pitcher. In this sense, we are concerned with actual outcomes, not "ability" or "skill" issues. Perhaps this is where the confusion comes in with trying to use the insights of Voros' research here?

Anyway, back to the luck story. I thought we would all agree that the allocation between pitcher and fielders for the LUCK element should be 50/50. (Luck is the piece not directly attributable to either the pitcher or the fielders.) Seems reasonable to me.

But my core point is that we may be unable to tell how important "luck" is in converting balls in play into outs. I really believe that the pitcher may have more influence on this than is measurable. Therefore methods based upon this approach may unduly undervalue pitchers.

I am sure that Maddux gives up more soft hit balls, and this might explain why he has a slight advantage over his teammates when looking at his $H.

Maybe not. A soft-hit ball takes longer to get to the fielder, and reduces the amount of time that the fielder has to get rid of the ball in order to make the out. Many infield singles are of the *soft-hit* variety. A soft-hit ball is also (generally) less likely to be hit where the fielder is expecting it to be, and also is (generally) more likely to be judged incorrectly by the fielder because his initial expectation is that the ball is hit harder. Thus the fielder's skill set still plays an important role in determining whether or not that ball becomes an out.

We need to get away from the concept that there are balls that we can a priori define as being routine plays, as I noted earlier. Every ball in play needs to be treated as though it *could be* an out, or *could be* a hit, and that the batter, pitcher, and fielder all play some role in the final result. Only then can we hope to come to an unbiased conclusion that properly weights the individual contributions.

There are some aspects of those contributions that are pretty clear when one looks at actual major league performance data:

1. The pitcher has the greatest role in determining whether a ball in play is a ground ball or a fly ball.
2. The hitter has the greatest role in determining where the ball is hit (yes, even against Randy Johnson).
3. (a corollary of #2) Teams position their fielders based primarily on the hitter's tendencies, not the pitcher's.
4. There are some pitchers - "Tommy John" pitchers - whose hallmark is allowing lots of balls to be put into play. These pitchers do not succeed on poor teams, and do succeed on good teams often beyond the level of their teammates.
5. A pitcher's strikeout rate is known to be the best indicator of his likely level of success, and there are almost no pitchers who succeed for very long without striking out an average number of hitters or more.

All of these factors can be explained by a model that gives the fielders and the hitters (in some combination) most of the credit for what happens to a ball once it IS hit, with the pitcher's primary contribution being the determination as to whether the hitter can hit the ball in the first place, and his secondary contribution being the determination as to whether the ball is hit on the ground or in the air. A model in which pitchers get the lion's share of credit for results from balls in play has difficulty explaining the existence of "Tommy John" pitchers, and also is less satisfactory in explaining why strikeout pitchers are consistently more successful for a longer period of time than pitchers who don't strike out hitters. So does a model that has extensive *random variation* built into it.

Mike's post makes me think we are talking at cross purposes. I thought the issue was the allocation *between pitchers and fielders* for run prevention. That is, hitters have no say in the matter.

Another way of saying the same thing. Suppose God tells you that the allocation of credit for whether or not a ball in play goes for a hit is 40% batter, 30% pitcher, 20% fielders, and 10% luck (these are just made up numbers). Then I think the answer to the question posed in the first paragraph is 55% pitchers and 45% fielders. The contributions of the pitcher and fielders are scaled up to 100% (non-proportionately). Half of the batter's portion and half of the luck portion is given equally to the pitcher and the fielders. This scheme applies whether or not "luck" is explicitly credited.
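That reallocation can be written out as a quick sketch, using the commenter's made-up shares:

```python
# Made-up allocation of credit for whether a ball in play goes for a hit
# (the commenter's illustrative numbers, not measured values).
batter, pitcher, fielders, luck = 0.40, 0.30, 0.20, 0.10

# The batter's and luck's portions are split equally between the two
# defensive classes, scaling the defensive credit up to 100%.
shared = (batter + luck) / 2
pitcher_credit = pitcher + shared    # 0.55
fielders_credit = fielders + shared  # 0.45
```

Note the scaling is non-proportional: pitcher and fielders each receive the same absolute bonus, so the 30/20 gap is preserved rather than stretched.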

The reason we do this is that we are attempting to allocate credit among the defense (pitchers and fielders) for run prevention. To repeat, from a valuation perspective the hitter (as hitter) receives no credit for this. Agreed?

And since the allocation between fielders and pitchers is probably 20/30 one way or the other (with the rest being luck), I usually go with the 50-50 approach to BIP overall. I would be very surprised if it was say 20/50 either way, though I have only my instincts on this subject.

I don't think fielding positioning varies that much between fielders. Bonds and Mo Vaughn are extreme examples, and those have to be PAs that aren't Ks, HRs, or BBs, and they have to be with nobody on base. That really reduces the total number of times a fielder is wildly out of position *and* a ball is put in play. The NL has 100,000 PAs; 69,000 are BIP. That makes the Bonds/Vaughn 800 PAs insignificant. (Hmmm, Sean doesn't have these numbered and referenceable.) I score tons of baseball games, and the players, the overwhelming majority of the time, are standing in a 10 ft circle. The OF circle is a little larger.

I think there are mostly routine plays in defense. Not every play, but largely.

Mike, while you have your book open - what percentage of hits (in play) are GB/FB/LD? I'm betting on a 30/20/50 split. If not higher in favor of the LD.

This is really the crux of much of what Voros' work *and* the pitching/fielding breakdown gets to. It also will point up how Defensive Efficiency, the newest stathead toy, doesn't tell us what we think it tells us.

The reason we do this is that we are attempting to allocate credit among the defense (pitchers and fielders) for run prevention. To repeat, from a valuation perspective the hitter (as hitter) receives no credit for this.

My point was not specifically to try to credit the hitter for some portion of run prevention, but just to point out that any model of run prevention has to explain the evidence. The primary factor that determines whether or not a ball in play becomes an out is the location of the ball relative to where the fielders are positioned. On the available evidence, the pitcher has almost nothing to do with either the location of the ball or the position of the fielders, and thus he can't get a lot of the credit for the fact that a ball in play becomes an out (regardless of whether you include the hitters or not). He does get some credit because one of the factors that helps determine the location of the ball in play is the type of ball that is hit, which the pitcher does control, but that credit has to be significantly less than the credit the fielder gets for being in the right place (based on the characteristics of the hitter at the plate, not the pitcher) and for handling the ball cleanly.

If it is true that fielders have significantly more control over the results from balls in play than do the pitchers, then it would naturally follow that (a) reducing the number of balls in play via the strikeout gives a pitcher a greater chance of being successful over the long term, and (b) pitchers who allow a large number of balls in play would be unusually dependent on the quality of the fielders behind them to be successful. We have a lot of evidence from baseball history that supports the latter two conclusions, and the additional evidence from Voros's research supports the underlying assertion about the relationship between pitchers and fielders on responsibility for outs on BIP from which the other two conclusions come naturally. It seems to me that we have a much more solid foundation for giving the lion's share of the credit for outs on BIP to the fielders than we do to the pitchers, based on the available evidence.

I score tons of baseball games, and the players, the overwhelming majority of the time, are standing in a 10 ft circle. The OF circle is a little larger.

10 feet sounds small, but there's still a lot of room for individual maneuvering within a 10-foot circle. There's not nearly that much difference between an infield playing normally and one that is playing at double-play depth (it's usually a matter of a foot or two in and a foot or two to the right, for a SS), and you will see a fair number of balls that look like routine GB6s when hit go through to the OF just out of the reach of the SS when the infield is at DP depth. A foot or two can also mean the difference between throwing on the run and having time to get set. You don't have to move a lot to change the area of the field that you can routinely cover.

Again, the assertion that a play is routine is based in large part upon the fielder being correctly positioned relative to the location of the ball and handling the ball cleanly. The fact that many plays appear to be routine doesn't necessarily allow you to infer that the pitcher is, or should be, mostly responsible for making them so.

Mike, while you have your book open - what percentage of hits (in play) are GB/FB/LD? I'm betting on a 30/20/50 split. If not higher in favor of the LD.

I don't know for sure, but it sure seems to me that the first category, routine outs and sure-fire hits, constitute the majority of BIP, and that therefore the average pitcher is more than 50% responsible, perhaps _much_ more than 50%, for BIP results.

One more time:

If this were true, then you would expect to see (a) a significant number of pitchers with year-to-year consistency (within the range of normal variation) in $H; (b) a group of pitchers who are able to be dominant year-in and year-out while striking out fewer batters than the norm; and (c) only minor variations in performance for pitchers who allow lots of balls in play when they move from team to team. But you see *none* of those effects. How do you explain the lack of those in the context of a model that grants a significant amount of responsibility to the pitcher for the conversion of balls in play into outs?
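Prediction (a) is easy to examine numerically. The sketch below simulates two seasons for pitchers who all share the *same* true hit rate on balls in play and then checks the year-to-year correlation of their observed $H. The 0.29 hit rate, 550 BIP per season, and 100 pitchers are assumed round numbers, not measured values.

```python
import random

TRUE_H = 0.29   # assumed league hit rate on balls in play
BIP = 550       # assumed balls in play per pitcher-season
PITCHERS = 100

def season_h(rng):
    """One simulated season's $H for a pitcher whose true talent is TRUE_H."""
    hits = sum(1 for _ in range(BIP) if rng.random() < TRUE_H)
    return hits / BIP

def pearson_r(xs, ys):
    """Plain Pearson correlation between paired samples."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5

rng = random.Random(7)
year1 = [season_h(rng) for _ in range(PITCHERS)]  # each pitcher's year-1 $H
year2 = [season_h(rng) for _ in range(PITCHERS)]  # same pitchers, year 2

r = pearson_r(year1, year2)
print(round(r, 3))  # with no skill differences, r should sit near zero
```

If real pitchers showed a year-to-year r well above this noise floor, prediction (a) would hold; the DIPS finding is that, by and large, they don't.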

Well, Mike, we are on the same page wrt positioning. The reason I think static zones do work is that everyone has the same static zone - deciding what the static zone should be is another discussion. Yes, a ball could sneak by just out of reach due to positioning within the small positional differences. (I chose a 10' circle to decrease arguments about that - I think your description is more accurate.) But it could also result in easier reach of a ball in the other direction. If ball direction is largely *not* the pitcher, the fielder has sole responsibility for "guessing" correctly where to adjust his position. He should benefit from guessing correctly and, yes, be penalized for not doing so.

Hmmm, I was off on my GB/FB/LD distribution. ;-) Baseball is not that different from softball wrt FBs. If you hit it in the air, and it doesn't go out of the park - you're out.

As for pitcher credit for OBIP (would that be $O?), didn't your SABR presentation have something to do with "easy" "6" chances? I'm sorry I haven't seen it. Nonetheless, there is a baseline for MLB SS on "6" GBs. That is - 95% are turned into outs (I think you told me this in another thread). Any play that has a 95% conversion rate is routine.

I think we could even come to a consensus on what that % should be for a play to count as "routine". That's why I like the STATS zones a little better. In the "6" BB WS zone, there are 3-4 STATS zones. So only half of the "6" zone should be considered routine (possibly). Certainly for the OF.

I have a lot of ideas rattling around in my head about this, and I don't want to get off on a stream-of-consciousness type post.

David - 80% of LDs are hits, *and* that represents 51% of hits. LDs are usually not gorks. The gork is a small percentage of that 80%. So the primary determinant is how squarely the ball is hit.

And the team that batters hit the fewest LDs off of has the best Defensive Efficiency, by and large. Defensive Efficiency does not describe team defense. It describes pitching luck, er, random variation.

If Mike can break it out by team for GB/FB/LD, he can demonstrate this (or disprove my assertion).

Mike's last post is probably the crux of the matter. Mike is saying that the recent research on $H provides evidence (inferential, circumstantial) that the pitcher deserves little of the credit for outs on balls in play.

This is where I am not so sure. I have tried to put forth arguments and hypotheticals that purport to show that Voros' findings can be consistent with the pitcher deserving a lot of the credit.

Having thought about it some more, I think the real issue is "outcome" versus "ability". Voros' findings suggest that pitchers do not have the ability to generate a preponderance of outs on balls in play (forward looking).

However, that is not the critical question. The question is *after you observe an out* on a ball in play, who deserves the credit? I really think these are two separate (though related) questions.

I love to think in hypotheticals, so please indulge me once more. Suppose God created a game in which conditions are ripe for our analysis. Every hard hit ball goes for a hit and every softly hit ball goes for an out. If a pitch is in a certain zone, the ball will be softly hit (and produce an out). Otherwise it will be hard hit (and go for a hit). In this world, of course, all pitchers try to get the ball into the soft-zone, but suppose none have any special ability to do so. Think of the pitch location as being determined entirely by chance.

In this game, Voros' findings still hold. However, I would argue that pitchers deserve 100% of the credit for producing outs on balls in play. It is entirely due to the pitcher throwing the ball into the special zone that generates a softly hit ball and an out. Fielders have absolutely nothing to do with it.

I realize that this is a stylized hypothetical world. But I think it demonstrates that Voros' findings are not direct evidence on the question of how much credit should go to pitchers versus fielders.

In this game, Voros' findings still hold. However, I would argue that pitchers deserve 100% of the credit for producing outs on balls in play. It is entirely due to the pitcher throwing the ball into the special zone that generates a softly hit ball and an out. Fielders have absolutely nothing to do with it.

This is ONLY true if there is variation from pitcher to pitcher, and little variation for the same pitcher from season to season.

In this particular example, the control can be entirely due to the hitter teeing off on a softball pitcher with 20 fielders. Either it's a hit or an out, regardless of the fielder's skill. The fielder is simply standing there.

If there is no variation from pitcher to pitcher, then in this situation it is 100% luck on the defense side. If there IS variation (say Pedro and I are pitching in this league), then Pedro gives up tons more soft hit balls than I do.

Furthermore, what if there is a lot of variation between pitchers, but next year, the correlation is very low? This is what Voros is about. This might again imply a lot of luck.

We disagree on this point. I would say that the pitcher does deserve 100% of the credit for the OUTCOME of generating an out. He does not have to have the ABILITY in differing amounts than other pitchers.

Suppose God uses dice to see if each pitch is in the soft zone. And God uses the same set of dice for every pitcher in the league. Every pitcher then has the same expected prob of getting an out. But not every game will have the same number of hits, of course, due to random variation. There may even be some no-hitters.

Consider such a no-hitter. An observer would conclude that the pitcher was hot that day, had his best stuff, had pinpoint control, etc. Each batter hit an easy two hopper to the shortstop that every high school kid would have fielded flawlessly. The pitcher would be credited (rightly) for generating these easy two-hoppers.
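The dice hypothetical is easy to check numerically: give every game an identical per-BIP hit probability, and chance alone still produces a wide spread of game outcomes, including the occasional no-hitter. The 0.29 hit rate and 27 BIP per game are assumed round numbers for the sake of the sketch.

```python
import random

HIT_PROB = 0.29       # assumed per-BIP hit probability, identical for every pitcher
BIP_PER_GAME = 27     # assume every batter puts the ball in play
GAMES = 100_000

rng = random.Random(42)
hits_per_game = [
    sum(1 for _ in range(BIP_PER_GAME) if rng.random() < HIT_PROB)
    for _ in range(GAMES)
]

no_hitters = sum(1 for h in hits_per_game if h == 0)
mean_hits = sum(hits_per_game) / GAMES  # expect roughly 27 * 0.29, about 7.8
print(mean_hits, no_hitters)
```

With identical "dice" for every pitcher, a handful of no-hitters still show up over 100,000 games, which is exactly the point: the observer crediting that pitcher with pinpoint control is reading skill into pure variation.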

You don't need pitcher-to-pitcher variation to make this point. Of course, pitcher-to-pitcher variation makes the stronger case, but I'm arguing that it is not necessary. All you need is that the pitcher influences the outcome, be it how hard hit is the ball, flyball-groundball-linedrive, etc. And it doesn't matter if the influence is due to an ability or not. All that matters is that he influences the outcome.

And it doesn't matter if the influence is due to an ability or not. All that matters is that he influences the outcome.

Again, *how* did he influence the outcome? Using the softball, high-arc, no-curve, no-walk scenario, this might as well have been a T-ball game. In essence, the identity, even the existence of the pitcher is irrelevant.

Sometimes, by the throw of the dice, you'll end up with a no-hitter. But again, how do you influence something without having an ability?

The pitcher, in this example, just happens to be the guy who threw the ball. The action of throwing the ball in this softball example does not influence an outcome. The entire influence is based on the batter making contact, and the positioning of the fielders. And if we assumed in this example the hard/soft scenario with the 20 fielders on the field, the entire influence rests with the batter (and I suppose the condition of the park).

You have to show that the pitcher in this case has some influence. If he has no influence, he can't get credit.

I think part of the difficulty here is in viewing the outcomes that a pitcher generates - K, BB, HR, ball-in-play - as discrete and independent skills. The credit that a pitcher gets for putting a ball in play has to do with what the alternative outcome might have been. A pitcher who gets a batter to put the ball in play on a 3-0 count has done a good thing on average since he has converted a likely BB into a good chance of an out. The opposite is true when the count is 0-2. (How the count got to 3-0 or 0-2 is partly the pitcher's fault, but it is also the batter's.)

Wait a minute. We are criss-crossing on our stylized examples. I have previously stipulated to the fact that in T-ball or slow-pitch softball the pitcher deserves very little (none) of the credit for converting balls in play into outs.

However, what about my hypothetical example? A ball in play goes for a hit if and only if it is hard hit. And a batter hits the ball hard if and only if it is not in a special unhittable part of the strike zone. Here I have constructed the example so that only the pitcher has influence over the outcome (kinda a reverse T-ball).

In this game, I claim that the pitcher deserves 100% of the credit. Even though he has no special ability to generate outs. Remember I said that you got this information right from God -- you know that the pitcher has influence. This example shows that the $H evidence does not necessarily say that the pitcher has little influence.

Once people grant that argument, then we can discuss how an observer (who does not have a special pipeline to God) can go about investigating whether or not the pitcher has influence. This was my key question from a few posts ago.

Anyway, I wanted to stop at this intermediate point with this example to put aside the argument that Voros proves that the pitcher has no influence. It is merely a less than perfect inference that may or may not be correct.

F., you are right on it. Fieldable balls that go for hits (FBs and GBs) are the fielders' fault. The LD hits are the pitcher's fault, plus whatever HRs he allows.

As for Rob's "outs only" discussion, the pitcher *does not* control the direction of the ball. If he doesn't control the direction, then he cannot direct the ball to where it will be an out. Thus he cannot be credited with creating the out. As I wrote above, when a pitcher generates a BIP in a zone where outs are converted (say) 90% of the time, he *should* be credited with half of that. He threw the pitch that put it there, and the fielder converted it.

SO defense really breaks down like this:
Fielders are responsible for about 32.5% of *hits*.
Pitchers are responsible for about 67.5% of *hits*(including HRs).

Fielders are 50% responsible for $O in zones with 90%+ conversion rates, and then 50% adjusted by the percentage of $O in each associated zone. I don't know what percentage of $O are on balls hit to where the std position is - where 90%+ conversion occurs.

Pitchers get the rest - somewhere below 50% of outs. I'll guess 37%.

So hits are 28.5% of BIP (I think it was 20,000 hits on 70,000 BIP in the 2001 NL).

Pitchers are responsible for 67.5% of those, or 19% of BIP. For outs, pitchers get 37% of outs (71.5% of BIP) or 26.5% of BIP.

So pitchers are 45.5% of defense and fielders are 54.5%, with some adjustment for however many plays are made outside a well-defined "routine" zone. The split of credit for an out will always be difficult to determine, but I think this is an equitable suggestion.
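The arithmetic above can be reproduced in a few lines. All the inputs are the thread's own estimates (the 67.5% hit responsibility, the guessed 37% out credit, and the rough 2001 NL hit and BIP counts), not measured values.

```python
# Reproducing the pitcher/fielder split sketched in the post.
hits, bip = 20_000, 70_000          # roughly the 2001 NL figures cited
hit_rate = hits / bip               # ~0.286 of BIP become hits
out_rate = 1 - hit_rate             # ~0.714 of BIP become outs

pitcher_hit_credit = 0.675          # pitchers' share of responsibility for hits
pitcher_out_credit = 0.37           # the post's guess at pitcher credit for outs

pitcher_share = pitcher_hit_credit * hit_rate + pitcher_out_credit * out_rate
fielder_share = 1 - pitcher_share
print(round(pitcher_share, 3), round(fielder_share, 3))  # 0.457 0.543
```

Carrying the fractions exactly gives about 45.7/54.3 rather than the post's rounded 45.5/54.5; the small difference comes from rounding the intermediate percentages.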

I think we should guesstimate - 56% fielders and 44% pitching.

Thanks, Mike. I've been waiting on those numbers of yours for about 5 years.

Rob, I agree with you, and will stipulate to that as well (we sound like lawyers don't we?).

How about this hypothetical example: say that MLB only plays once a week, and there are only 10 teams in the league, and there are only 5 innings in a game. Essentially, you only need 10 pitchers in the league. Those pitchers are named: Pedro, Randy, Curt, Dwight, Sandy, Lefty, Walter, Tom, and Roger. They are all top of the line. They all pitch with similar styles. Therefore, we expect them all to produce similar $H, $K, $BB, and $HR over their careers. Let's assume as well we have 20 fielders on the field, so fielding is not a required ability to play this game.

Therefore, do we credit the pitchers (and fielders) with ANY part of the defense, or can we attribute it all to statistical random variation?

After all, we know the impact that Pedro has in real-life, because of how he is so much better than his peers. But if you have 10 (or 100 or 300) pitchers exactly like him, then does he really deserve any of the credit?

Given a universe where the 300 best pitchers can pitch as well as Pedro (and league ERA will now be 2.2), and given a universe where the 300 best pitchers pitch like me (and the league ERA will now be 22.0), is there a difference?

In fact, the teams will pay all these pitchers the league minimum. So how much impact can pitchers in any of these scenarios have?

This is tangentially related, but it suddenly occurred to me how we can see if $H variance is an evolutionary characteristic. I figured those with better data sets and databases would want to try.

If a pitcher's $H is an evolutionary characteristic, I figure the following must be true:

* The variance of $H decreases over baseball history, as the quality of the worst players improves.

* The variance of $H decreases as level of play improves. It probably is not the best group because of preselection already, but we would expect pitchers in Class A to have a higher variance of $H than ML and AA/AAA pitchers do.
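A minimal sketch of the proposed comparison is below. One caveat worth building in: raw $H variance includes binomial noise that shrinks as BIP counts grow, so the samples compared should have similar workloads. The two samples here are hypothetical numbers, not real pitching lines.

```python
def sample_variance(xs):
    """Unbiased sample variance."""
    m = sum(xs) / len(xs)
    return sum((x - m) ** 2 for x in xs) / (len(xs) - 1)

# Hypothetical $H samples; in practice these would come from Class A and
# major-league pitchers with comparable BIP totals.
class_a_h = [0.24, 0.35, 0.27, 0.33, 0.29, 0.38, 0.25, 0.31]
mlb_h     = [0.28, 0.30, 0.29, 0.31, 0.27, 0.30, 0.29, 0.28]

print(sample_variance(class_a_h) > sample_variance(mlb_h))  # True for these numbers
```

If $H were a real, evolving skill, this kind of spread comparison should show the Class A variance exceeding the MLB variance by more than the sampling-noise difference alone predicts.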

These last few posts demonstrate how difficult it is to decouple the fielding credit from the pitching credit. I don't have the magic answer, and I don't think anyone else does either unfortunately.

What I have been attempting to do in this thread, with only limited success, is to decouple issues about what we mean by the proper allocation of credit from issues about how we would estimate the proper allocation of credit. I believe that we have not yet sufficiently established what it is we are trying to capture here.

Replying to Tango's last post. Yes, this is a good example. Let me extend the conversation. Suppose that there is another league that has the same basic setup but its pitchers are the equivalent of batting practice pitchers (same quality batters and fielders). After watching a lot of games in both leagues, we'd quickly come to the conclusion that pitchers in the Pedro-league deserve a lot more credit for outs on balls in play than the batting-practice-pitchers-league. Right? Why would we say that? Presumably because there are fewer hard hit balls in the Pedro-league than in the other league. More routine outs, or rather what appear to be routine outs.

Drum roll please . . . . Thus, I think this is evidence that the percent of hard hit balls is a good measure of how much credit the pitcher deserves in converting balls in play into outs. (This is probably obvious, but it is worth going through the logical argument.)
This shows, again, that you don't need pitcher-to-pitcher variation to have the pitcher deserve a significant amount of the credit.

Similar arguments could be given to demonstrate that the fielders' ranges and the quality of their gloves (both literally and figuratively) are relevant too. And presumably the quality of the batters, the quality of the bats, etc.

Bottom line: I don't think we have sufficiently worked through what is the question we are trying to answer.

Thus, I think this is evidence that the percent of hard hit balls is a good measure of how much credit the pitcher deserves in converting balls in play into outs.

Let me continue with the "no-variation" theme among pitchers. Suppose that instead of a fixed T stick holding up the ball in a T-ball league, we had a *moving* T stick. And a fast moving one at that. Most of the balls in play will be poorly/softly hit.

A league of Pedro-only pitchers is equivalent to a fast-moving T-stick. They both result in tons of soft hits / routine outs.

Should the moving T get more credit than a fixed T? No, it is part of the basic conditions of the playing field. At the same time, a league of all Pedros pitching or all Tangos pitching deserves the same credit.

I'm posting this recognizing that this discussion is deep into the details. With the risk that I'm saying something obvious or repeating an old point, here goes:

In trying to allocate responsibility for BIP, has anyone checked the variance for hitters of BA/SLG on BIP from year to year? What I'm thinking is that if that variance is small, then we could conclude that hitters affect the GB/FB/LD ratio more than pitchers do.

No, no, no, no, no! This is specious reasoning. It's like saying you know the answer ahead of time and then reason to get that answer. This is a manifestation of the problem. You are trying to answer a problem that is not well posed.

Let me say it for the umpteenth time. While variance is indeed the key (of course), it need not be variance among the league's pitchers.

You haven't even commented on the parallel league. Your argument would imply that in both leagues the credit to the pitcher is zero. But this violates the axiom of transitivity, or whatever the heck it is called. We KNOW that there is a difference between leagues. It is obvious. Any approach that does not acknowledge a difference has to be wrong.

Okay, here's another example. Suppose it is back in the 19th century. Suppose it is the year before gloves are allowed. Now compare to the year after gloves are allowed (suppose the newly introduced gloves are of the modern variety to make the point crystal clear). Suppose the variation among fielders has not changed one iota. By your argument the credit to fielders for converting balls in play into outs would not change. But of course fielders now deserve a lot more of the credit. The percentage of balls hit that turn into outs is now much higher, due exclusively to the increased quality of fielding.

The point is that this information does not reside within the confines of any one league. As Tango's all-Pedro league shows. But that is the point. Information must come from outside the specific league to answer this question. That's why this is such a difficult question and smart people like us have not yet come up with the answer.

By the way, the earlier post that estimated the empirical splits by looking at the pct of GB/FB/LD's is probably on the right track. Assign credit between the pitcher and fielders in each one of these categories (this is the information that must come from outside the system), and then weight the categories by their empirical distribution. But note that this method has absolutely nothing to do with variation per se.
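That weighting scheme can be written down directly. All numbers below are placeholders: the per-category credit splits are exactly the "information from outside the system" the post says must be supplied, and the BIP mix is a guess.

```python
# Hypothetical per-category pitcher credit and BIP distribution.
# Neither set of numbers is measured; both are placeholder inputs.
pitcher_credit = {"GB": 0.40, "FB": 0.30, "LD": 0.70}  # pitcher share per category
bip_mix        = {"GB": 0.45, "FB": 0.35, "LD": 0.20}  # fraction of all BIP

assert abs(sum(bip_mix.values()) - 1.0) < 1e-9  # mix must cover all BIP

pitcher_share = sum(pitcher_credit[k] * bip_mix[k] for k in pitcher_credit)
print(round(pitcher_share, 3))  # 0.425 with these placeholder inputs
```

Note, as the post says, that nothing here depends on pitcher-to-pitcher variation: the per-category credits have to be established some other way, and the empirical mix just weights them.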

By your argument the credit to fielders for converting balls in play into outs would not change. But of course fielders now deserve a lot more of the credit. The percentage of balls hit that turn into outs is now much higher, due exclusively to the increased quality of fielding.

The point is that this information does not reside within the confines of any one league.

Up until now, I was pretty much in-line with Rob's thinking. On this specific point, I am 100% against Rob's reasoning.

My position, while I'm not as 100% convinced as the previous poster was, is that the information can only reside within the league.

The introduction of gloves to all fielders doesn't change how much of the BIP should go to the pitchers (as long as the variance among fielders and among pitchers remains constant). You can double the number of fielders. You can allow one-hops to be automatic outs. It doesn't matter. It's about ability to do something, GIVEN the playing conditions.

If the pitcher is playing with conditions where fielders are given these great gloves, and get to double the number of fielders, and get the one-hop out, then that doesn't change the relative impact the pitcher has on the BIP. (Assuming the variance remains the same.)

Rob, I think we both argued our points well, and we probably exhausted our thoughts on this matter. How about you get the last word, and then we take a little break? I'm exhausted!!

Okay. Final post by me. I'll try to summarize my view, which I realize is not winning the day.

There are two difficulties here. A meta-issue of pinning down whatinhell we are trying to capture. And an estimation issue of trying to measure "it" given the limitations of our observations.

Perhaps the key hurdle in the meta-issue is whether we are trying to capture "abilities" or "outcomes". You'd clearly analyze the issue differently depending on your answer here, and of course likely get very different ultimate estimates. As I have stated, I think that "outcomes" is the proper arena, not "abilities". But I can see how reasonable people can differ on this.

The estimation issue is even more difficult than the meta-issue for all the obvious reasons. All we observe is a ball put into play. Some go for hits and some get converted into outs. We now want to separate the effect of the pitcher versus that of the fielders. When faced with similar tasks, we often look to see how the ultimate outcomes vary across pitchers. I have tried to argue that this is a natural inclination but that it is far from a perfect method in this particular case. Again, I readily admit that reasonable people can differ on this too.

In parting, I hope this thread has served to percolate thinking on this important topic. Thanks Mike, Tango, and everyone else for participating. I am sure we have not heard the last of this topic.