Wednesday, September 21, 2011

A freebie from Prospectus: a study on catchers and ranking them by their pitch-framing and ump-gulling ability, replete with animated GIFs and heat maps.

To calculate the catcher performance, I first established a baseline for each pitcher over the period 2007-2011. I used the strike zone definition described here and counted the number of extra strikes and subtracted the number of extra balls tallied by each pitcher. I also applied a small correction to the pitch location data as described here. I divided the net number of extra strikes by the total number of called pitches for that pitcher to arrive at an expected net extra strike rate for each pitcher. (The term “extra strikes” will be used in this article to refer to the net of extra strikes minus extra balls.)

Next, I applied the same procedure for each pitcher-catcher pair and subtracted the pitcher baseline from the result. Then, I summed the results by catcher. I also calculated an approximate run value for the extra strikes saved or lost by each catcher using Dan Turkenkopf’s finding that switching the call from a ball to a strike on a close pitch was worth about 0.13 runs on average.
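
For readers who want to see the bookkeeping, here is a minimal sketch of the two steps described above. It is not the author's actual code: the tuple layout, the function name `catcher_extra_strikes`, and the choice to turn each pitcher's baseline rate into an expected count before summing by catcher are all assumptions made for illustration. Only the 0.13 runs figure comes from the article.

```python
from collections import defaultdict

RUNS_PER_EXTRA_STRIKE = 0.13  # Turkenkopf's estimate quoted in the article


def net_extra(in_zone, called_strike):
    """+1 for a strike called outside the zone, -1 for a ball called inside it."""
    if called_strike and not in_zone:
        return 1
    if in_zone and not called_strike:
        return -1
    return 0


def catcher_extra_strikes(pitches):
    """pitches: iterable of (pitcher, catcher, in_zone, called_strike) tuples,
    one per called pitch. This layout is hypothetical, not the author's."""
    pitcher_net = defaultdict(int)
    pitcher_n = defaultdict(int)
    pair_net = defaultdict(int)
    pair_n = defaultdict(int)
    for pitcher, catcher, in_zone, called_strike in pitches:
        value = net_extra(in_zone, called_strike)
        pitcher_net[pitcher] += value
        pitcher_n[pitcher] += 1
        pair_net[(pitcher, catcher)] += value
        pair_n[(pitcher, catcher)] += 1

    above_expected = defaultdict(float)
    for (pitcher, catcher), n in pair_n.items():
        # Assumption: the pitcher baseline is applied as rate * called pitches.
        baseline_rate = pitcher_net[pitcher] / pitcher_n[pitcher]
        above_expected[catcher] += pair_net[(pitcher, catcher)] - baseline_rate * n
    return dict(above_expected)


def runs_saved(extra_strikes):
    """Convert net extra strikes above expectation into approximate runs."""
    return extra_strikes * RUNS_PER_EXTRA_STRIKE
```

Note that when a pitcher throws to only two catchers, their totals for that pitcher sum to zero, since the baseline is built from the same pitches.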

Here is how catchers have done over the past five seasons, according to this method, at saving runs for their team by getting extra strike calls at the edge of the zone: ...

Reader Comments and Retorts

Isn't this supposed to be Varitek's strength? If he's really this bad at framing pitches, then he has no business on a baseball field, because he cannot throw anymore, is not very good at blocking pitches in the dirt, never has been good at catching foul tips for strikeouts, and cannot hit left-handed.

Fangraphs has Varitek at 0.5 WAR. If you subtract the 6 runs below average he's costing the team according to this article, he is replacement level.

Varitek is supposed to be great at pitch-calling and preparation, knowing all the hitters and their strengths and weaknesses, and knowing his pitchers so he knows how each pitcher should pitch to each hitter. I have no idea if Tek is really so good at this, but that's always the story teammates and managers and media folks tell.

It's not about framing pitches. Tek never seemed particularly good at that.

Jose Molina always seemed incredibly good at framing pitches, so I'm happy to see him up there.

Walt/2, I adjusted for the performance of the pitchers. If you don't do that, the results are much, much bigger (too much to believe). So these numbers are relative to what you would expect given the pool of pitchers that each catcher caught.

There are some caveats to the study which I mention in my article, not least of which is the problem of the extent to which you compare a catcher to himself and his backup catcher(s).

Tom/7, I didn't do home/road splits for this study, but yes, umpires are more likely to call strikes for the home pitcher. Dan Turkenkopf did a study of that a while back:
http://www.beyondtheboxscore.com/2008/5/12/506919/a-nibble-here-a-nibble-the

And J-Doug Mathewson also had a similar finding here:
http://www.beyondtheboxscore.com/2011/1/29/1961942/strike-zone-a-marginal-component-of-home-field-advantage

Varitek is supposed to be great at pitch-calling and preparation, knowing all the hitters and their strengths and weaknesses, and knowing his pitchers so he knows how each pitcher should pitch to each hitter. I have no idea if Tek is really so good at this, but that's always the story teammates and managers and media folks tell.

Yes, and the story goes that he is so good at this that the team gets the same benefits when the backup catcher is playing, too, so our attempts to measure catcher defense tend to miss it, too.

It's sort of depressing that this article has only 14 comments here since it was posted, while in that same span of time a soccer thread has 19 and a college football thread has 27.

This is incredible work. Great job Mike. The illustration of 'Tek vs Lucroy's "called strike patterns" was beautiful. Also, the finding about the head bob is actionable and should be in every MLB manager's inbox _tonight_.

I started coming to this site because of articles like this one many years ago. Pretty cool to see them still pop up from time to time.

The ball/strike maps for Lucroy and Varitek against RHBs are stunning. Walt, I think the huge numbers for Molina, Lucroy, Doumit, Posada, etc. are due in fairly large part to the fact that this is observed performance - since these guys are the outliers, it's much more likely that their true talent is not as bad as their results. The most extreme true talent is probably at most +/- 2.5 wins, with most catchers falling in the +/- 1 win range.

Combine this with the presentation at SABR a few years back about catchers preventing PBs/WPs*, and we're getting quite a picture on defense behind the plate.

* - The one that said Mike Piazza, of all people, was very good at preventing these. (This supported what my eyes told me: With the Dodgers, he caught a variety of styles - Nomo's splitter and Candiotti's knuckler, for example - but was good at blocking errant pitches.)

This does explain why teams will play guys who are horrid with the bat for years and years for their defense at catcher. Molina, in a full time 120 game role, would be worth about $12 mil a year for his pitch tricks based on this. The highest for defensive runs saved is 40 iirc (baseball-reference), so this says what 'accepted wisdom' says: a catcher's defense is by far the most important of any player's defense on the field.

Great article, Mike. I particularly like how you were able to connect the numbers to an observable behavior with catchers. You always read about catchers who are good at framing pitches, but I had no idea it had such a large impact on their value.

There seems to be a pretty good argument in here for computer-called location.

Definitely.

But, I think the magnitude seems too high. Jose Molina's pitch framing alone is worth twice Ozzie Smith's or Willie Mays's entire defensive contribution. This would have to show up in CERA in a big way.

#23 Unfortunately the one thing that's missing (as far as I can see -- if I've missed it I'm sorry) is a sanity double-check. Something like the method that either Keith Woolner or Craig Wright used to verify that the expected results manifest themselves in runs.

Snapper/25, in the article I mentioned Sean Smith's catcher ERA study which shows that this effect does show up in catcher ERA in a big way. Sean's approach to catcher ERA was much better than previous studies (Woolner, etc.).

Ron/26, please read the section of the article that I entitled, "Do These Numbers Make Sense?" That's where I addressed your concern.

Also, snapper, the numbers you are considering for Jose Molina are in part-time duty and are not regressed toward the league average to estimate actual persistent skill. It's unlikely that Molina would put up numbers at the same rate in a much larger sample.

Nonetheless, I agree that I'm making a very significant claim, and it warrants further scrutiny.

One aspect that Mike mentions, but has not been really discussed, is the requisite people skills that some catchers bring to the table.

In my job I'm constantly monitoring a lot of non-verbal human interaction. I also watch a lot of Blue Jay games and you can certainly tell that Jose Molina is having a good time with the plate umps. He's always laughing, smiling and generally cajoling the umpires. You can tell, by their smiles and verbal asides, that the umpires enjoy this tête-à-tête.

On the other hand, I've always thought that Jorge Posada is just about the biggest jerk I've ever seen behind the plate. Shaking his head at close calls, holding the ball that extra second when he doesn't get a call. Turning his head and arguing with umpires from the crouch. You can tell from his sarcastic sneer that he and the umpire are constantly at odds. As a batter he openly shows up the umpire frequently, stepping away on close calls.

Now I admit, this may be partially some anti-Yankee angst that many baseball fans like myself have. Still, Posada just seems to be a guy with a very poor rapport with the umpires, and it shows in the numbers.

There seems to be a pretty good argument in here for computer-called location.

Not surprisingly, I disagree. This seems to be a skill that some catchers possess and others lack. I think areas where players can differentiate themselves from their peers is inherently a good thing for the sport.

Honestly, I oppose any attempts to standardize the strike zone. While I think umpires should be consistent within their own zones, and be reasonable within the rule book zone, the fact that pitchers/batters must make in-game adaptations to the home plate umpire's zone is a feature of the game, not a bug. I know I'm kind of alone in that opinion.

Technically, the strike zone is not static as it relates to the size of the player. So while the rules for calling the strike zone have been standardized the strike zone does flex from batter to batter.

I think what annoys fans is that umpires not only deviate from the rules defined but then are ALSO inconsistent.

Speaking only for myself I have definitely noticed a degradation in umpires' ability to call a consistent zone. I now believe that overall umpire quality is at its lowest point since before the union was broken. It is really disheartening seeing this huge step back in performance when so much money is poured into enhancing the fan experience.

Excellent article.
The animated .gif comparisons are great.
I think it would be even better that if you mention 11 different pitches, that you show all 11 pitches in animated .gif form.
(Or a link to a page that shows them, if you want to keep the main page cleaner.)

I think what annoys fans is that umpires not only deviate from the rules defined but then are ALSO inconsistent.

And, just to be clear, I don't support the latter. It doesn't bother me at all, and I think it's a good thing, if one ump likes the high strike while another will give an inch or so on the outside but won't do the same on the inside of the plate. But I do agree that players should be able to expect consistency within the game - if it's a strike there in the first inning, it's a strike in the seventh.

RTG/32, it's not a small amount of work to make an animated GIF that succinctly shows what happened, i.e., without ballooning the file size.

If you are interested in the list of pitches, I would be happy to share them so that you can go look up the video for yourself on MLB.tv archives. Or if there is one particular pitch from that list that you are interested in, and you don't have MLB.tv, I'd be happy to send you the 20-MB zipped AVI file for it. I won't commit to doing that for all 50+ pitches that I recorded, though, as that's a lot of time/bandwidth on my part.

if it's a strike there in the first inning, it's a strike in the seventh.

If I recall correctly, a fair amount of research has demonstrated that umpires of late are loath to call guys out on strike 3 later in games, among other things that do not happen later that happened earlier in the game.

It wasn't a criticism in any way. I just thought of it like an appendix for the article.
Flooding the page with animated .gifs/images would definitely be distracting.
But a link that says "to see the individual pitches, go here" wouldn't be bad.

Because the players are your competition. The umpires aren't part of the competition - umps don't accumulate wins or losses. Within the context of a baseball game, umpires are more analogous to the baselines or the bases or the outfield fence: they're part of the framework on which the game is played. But the game is played between players.

The umpire's role is not to define the strike zone. MLB's rule book has done that already. The umpire's role is to impartially declare how pitches relate to that pre-defined strike zone.

Has that ever been true in the history of baseball? That the strike zone has been defined by the rule book zone and not the men behind the plate calling the games. I tend to doubt that "personalized zones" are a creation of the aughts. Hell, in the 1980s the consensus was there was a significant variance between the strike zones between the two leagues, and that seemed to be accepted without issue.

If Jose Molina has mastered the ability to gain strikes for his pitcher through framing, and some lazy ass catcher has spent no time working on that part of his game, damn right I don't want that eliminated in the pursuit of homogeneity.

To me, these variances make for a richer game. Just as having variety in the distance and heights of the outfield walls at the league's ballparks, to name one of your examples, makes for a better sport.

Has that ever been true in the history of baseball? That the strike zone has been defined by the rule book zone and not the men behind the plate calling the games.

It's been equally true throughout baseball history that the strike zone is defined in the rule book. I'm sure it's also been equally true throughout baseball history that individual umpires had different effective strike zones. I suspect that umpiring today is better than it was in, say, the 1920s. But as an ideal: there's a definition of the strike zone given in the rule book, just like there's a definition of an out. My ideal is that the rules be objective and consistent. If robot umpires help move toward that ideal, I support them.

But it's just a personal preference. I can see your argument - the outfield wall analogy is a good one. I just don't agree with it.

To me, it's like signing a hockey player because you value that he's especially skilled at diving and drawing penalties.

I was going to make a "if that were true Mike Ribeiro would still be in the NHL" joke...but I see he's actually been doing quite well in Dallas for several years. It's amazing how quickly you can get out of the loop in a sport. But THIS year, this year I'll start paying attention again.

Ribeiro actually used to be one of my favourite players; seeing that he's on the Stars actually hurts him in my eyes more than his on-ice shenanigans. It was Boston after all.

This is a fascinating article. What would be very interesting is to see how the catcher tendencies correlate with umpire tendencies. Are the catchers on the top of the list having more games called by umps with a tighter zone?

And, just to be clear, I don't support the latter. It doesn't bother me at all, and I think it's a good thing, if one ump likes the high strike while another will give an inch or so on the outside but won't do the same on the inside of the plate. But I do agree that players should be able to expect consistency within the game - if it's a strike there in the first inning, it's a strike in the seventh.

It doesn't bother you at all to compare the strike zones against LHH and RHH, like in the plots in the article, and see how frequently left-handed hitters get screwed over by pitches that are as far outside as the RH batters' box showing up as called strikes? Consistency against different hitters and within a game just doesn't exist with how umpires are calling the strike zone. They are influenced by the count, who the pitcher is, who the batter is, and they frequently have no idea where the outside corner is for LHH.

I really don't mean to derail the discussion in the comments on a wonderful article by taking it more in the direction of umpire complaints, but to suggest that umpires not calling a uniform strike zone is okay if they are at least consistent with their own strike zone within a game is completely irrelevant, because they aren't consistent within games AT ALL.

Skill at deceiving the officiators shouldn't be an advantage, it hurts the sport. The computerized strike zone needs to be here yesterday.

Not surprisingly, I disagree. This seems to be a skill that some catchers possess and others lack. I think areas where players can differentiate themselves from their peers is inherently a good thing for the sport.

Yes, against their peers. Not subterfuge that undermines the rulebook by manipulating the fallibility inherent in a human judge.

Rewarding a catcher for framing pitches is a bit like letting Rosie Ruiz keep her Boston Marathon win. When the technology for electronic chip timing became feasible, marathons started implementing them. There were no tears wept for marathon runners who were better at taking shortcuts than their peers.

Skill at deceiving the officiators shouldn't be an advantage, it hurts the sport. The computerized strike zone needs to be here yesterday.

And if it makes the game less interesting to watch, can we go back?

The ideal of the consistent and objective zone Kiko pointed to sounds good. But since we've never had one, it's impossible to know what effect that will have on the game itself. I tend to think it will make the game less interesting, just as I think most moves toward homogeneity have that effect on things I like. You think otherwise. In either case, it is just personal preference and not objective fact.

Rewarding a catcher for framing pitches is a bit like letting Rosie Ruiz keep her Boston Marathon win.

No, it's really not like that at all.

By the way, it's amusing that many people here really liked this article, but would prefer a baseball where this article wouldn't exist.

For those who didn't read the article or just initially glanced at it like I did, you are missing a very good piece. Whether the study stands up might be something different, but I love the little tidbits where he examined video of borderline calls and compared the actions of the catchers, the "video" the article has on each catcher match up with his comments.

I also calculated an approximate run value for the extra strikes saved or lost by each catcher using Dan Turkenkopf’s finding that switching the call from a ball to a strike on a close pitch was worth about 0.13 runs on average.

Let me echo what others have said: this is a fascinating, well-researched, and well-written piece. But, as others have also said, I'm not so sure the results pass the smell test to me. Reading what I quote here: are you saying that a "net strike" is worth +0.13 runs and a "net ball" is worth -0.13 runs, and, if so, is that possibly double-counting? In other words, wouldn't a range from ball to strike of 0.13 imply run values of +/- 0.065 for strikes and balls relative to average (assuming "average" is a 50/50 chance of ball/strike)? It could be, of course, that I've simply misunderstood what you did and the range is just much larger than I (and most people, I think) expected.

to suggest that umpires not calling a uniform strike zone is okay if they are at least consistent with their own strike zone within a game is completely irrelevant, because they aren't consistent within games AT ALL.

This is something that I would really like to figure out how to quantify and understand better, because it's not at all obvious to me that it is true (or untrue), and to exactly what extent that is so.

If the assertion is that every umpire has made a bad strike call in their career, well, sure. But once we get beyond perfection as the ultimate standard, what's good enough consistency, and how do we measure whether umpires achieve it? I have not seen anyone else suggest, nor have I yet been able to come up with a good method or standard for doing that.

I've seen lots of people use the bad strike box that's drawn on TV or Gameday to judge that umpires blew a call, or even worse, to use their distorted perspective from TV or the stands to determine that the umpire blew a call. Even if they were right in the criticism (sometimes they are, but mostly not), that would not address the issue of consistency. One of the most common criticisms occurs when umpires call strikes outside off the plate to LHB. But they do that pretty consistently, so the batters and pitchers expect it, and I don't see how that harms the game. The problem comes when the umpires call pitches differently than the batter and/or pitcher expect based upon their past experience in the game. And that's not an easy thing to quantify, I've found.

Is that number really so hard to believe? In the AL this season, batters hit .277/.384/.450 after a 1-0 count and .227/.268/.348 after a 0-1 count. Batters hit .257/.388/.409 after a 2-1 count and .187/.234/.289 after a 1-2 count. And imagine the difference in value between a full count pitch getting called a strike for a strikeout and called a ball for a walk.

It's an article studying and examining something that is unfair within the game. Wanting that unfairness corrected has nothing to do with appreciating the quality of the study and the article.

It's not unfair. Just as the deep fences at new Shea aren't unfair. These players have a skill at receiving the ball, and they're accessing that skill to improve their team's performance. If Ryan Doumit and Jason Varitek don't want to be a drain on their team's pitching performances, they should get better at catching.

One of the most common criticisms occurs when umpires call strikes outside off the plate to LHB. But they do that pretty consistently, so the batters and pitchers expect it, and I don't see how that harms the game. The problem comes when the umpires call pitches differently than the batter and/or pitcher expect based upon their past experience in the game. And that's not an easy thing to quantify, I've found.

Kiko/53, turning a generic strike (anywhere in the zone) to a ball costs your team about 0.15 runs. Turning a generic ball to a strike gains your team about 0.15 runs. Flipping the state of an average borderline pitch is worth just a little less, around 0.13 runs. That difference is small, so let's ignore that for now.

In linear weights, a walk is worth something like +0.3 runs, and a strikeout something like -0.3 runs. If you flip the state of one ball or strike, you move yourself a third or a fourth of the way closer to a walk or strikeout. A fourth of the 0.6-run difference between a walk and strikeout is about 0.15 runs. That's a crude way to understand the valuation.
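
The crude arithmetic in that last paragraph, spelled out as a small sketch. The +0.3 and -0.3 run values are the rough figures quoted above; the function name is made up for illustration.

```python
WALK_RUNS = 0.3        # rough linear-weights value of a walk, as quoted above
STRIKEOUT_RUNS = -0.3  # rough linear-weights value of a strikeout, as quoted above


def value_per_flipped_call(calls_to_finish):
    """Share of the walk-to-strikeout spread moved by flipping one call.

    calls_to_finish is 4 if you think in balls (four to a walk)
    or 3 if you think in strikes (three to a strikeout).
    """
    spread = WALK_RUNS - STRIKEOUT_RUNS  # about 0.6 runs
    return spread / calls_to_finish


print(round(value_per_flipped_call(4), 3))  # a fourth of the spread
print(round(value_per_flipped_call(3), 3))  # a third of the spread
```

A fourth of the spread gives the 0.15 figure; a third would give 0.2, so 0.15 is on the conservative end of this back-of-envelope range.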

This is something that I would really like to figure out how to quantify and understand better, because it's not at all obvious to me that it is true (or untrue), and to exactly what extent that is so.

Mike,

Are you familiar with Andrew Goldblatt's work on umpire tendencies? If not, you should check out his stuff. There is other stuff out there also, but Goldblatt's published a book with data sets (albeit somewhat primitive) and interesting narratives. Umpire studies are still in their infancy, but they have stumbled out of the cave. I would bet a great deal of money that they would correlate with what you are looking at. I am too old to embark on any serious research.

I've seen lots of people use the bad strike box that's drawn on TV or Gameday to judge that umpires blew a call, or even worse, to use their distorted perspective from TV or the stands to determine that the umpire blew a call. Even if they were right in the criticism (sometimes they are, but mostly not), that would not address the issue of consistency. One of the most common criticisms occurs when umpires call strikes outside off the plate to LHB. But they do that pretty consistently, so the batters and pitchers expect it, and I don't see how that harms the game. The problem comes when the umpires call pitches differently than the batter and/or pitcher expect based upon their past experience in the game. And that's not an easy thing to quantify, I've found.

Isn't it pretty easy to see inconsistency by looking at the overlap on a plot of balls and strikes? Any called ball that's closer to the rulebook zone than other pitches that have been called strikes is a sign of inconsistency (or every called strike that's farther from the rulebook zone than pitches that have been called balls). I'm not sure how you necessarily quantify that systemically, but it's easy to see inconsistency on a ball/strike plot. For instance, here is the game I am currently watching. Nothing about that chart suggests consistency in any way. It isn't consistent overall, it isn't consistent against LHH, it isn't consistent against RHH.
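
One rough way to put a number on the overlap idea in the comment above, offered only as a sketch. The zone dimensions (in feet), the `(px, pz, call)` tuple layout, and both function names are assumptions for illustration, not anyone's published method. It also ignores direction, so a zone that is consistently shifted to one side would still score as overlap here.

```python
import math


def distance_outside_zone(px, pz, half_width=0.83, bottom=1.5, top=3.5):
    """Distance in feet from a pitch to the nearest edge of an assumed
    rectangular zone; 0.0 if the pitch is inside it. Dimensions are illustrative."""
    dx = max(abs(px) - half_width, 0.0)
    dz = max(bottom - pz, pz - top, 0.0)
    return math.hypot(dx, dz)


def overlapping_calls(pitches):
    """pitches: list of (px, pz, call) with call either "ball" or "strike".

    Counts called balls that were closer to the zone than the farthest called
    strike, plus called strikes that were farther out than the nearest called ball.
    """
    strikes = [distance_outside_zone(x, z) for x, z, call in pitches if call == "strike"]
    balls = [distance_outside_zone(x, z) for x, z, call in pitches if call == "ball"]
    if not strikes or not balls:
        return 0
    farthest_strike = max(strikes)
    nearest_ball = min(balls)
    return sum(d < farthest_strike for d in balls) + sum(d > nearest_ball for d in strikes)
```

Running it separately on pitches to LHH and RHH, and on non-normalized locations, would at least partly address the objections raised elsewhere in this thread.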

I think there's also the human aspect of umpiring, which, as long as we have human umpires, we can't just ignore. (As an aside, having computerized umpires brings a whole different set of issues to the table, which anyone who has worked much with the PITCHf/x data, good as it is, can tell you about.)

I'm thinking mainly of two issues.

1. Calling balls and strikes is a very difficult task, in terms of the spatial abilities of the human brain. Anything a catcher can do to assist an umpire with a difficult task, the catcher ought to do. Umpires and catchers ought to work together. At the absurd extreme, one can understand that if the catcher jumped up to receive a pitch or dove suddenly into the dirt, even if the pitch was in the zone, it would be very hard as the umpire to concentrate on or perhaps even to see the flight of the ball across the plate. So things that the catcher does to distract the umpire are not things we want catchers to do. Ideally, good umpires would be able to ignore any distractions. In practice, it may be at or beyond the limits of human perception and ability to do that. And we don't know at this point if some umpires are better at ignoring these distractions than others.

Calling pitches on the outside edge and bottom edge of the zone is more physically difficult because the umpires lack good references and lines of sight to those edges. One area of research that is sorely lacking at this point (because it's difficult to do) is into what reference points umpires use for making calls on those edges.

2. If umpires call exactly to a spatially consistent zone, the catcher can make them look really bad to everyone else in the park by the way he catches the ball. If the pitcher hits the catcher glove and the catcher catches it cleanly, even if it was a few inches off the plate, no one else on the field or in the dugouts who saw the catcher will think that was a ball. Ideally, in fairness to the batter, we'd like the umpire to call that a strike. But if he does, he's going to take a lot of heat from everyone else who cues off the catcher. On the other hand, if Varitek does his exaggerated sweep to catch the ball, no one else on the field or in the dugouts who saw the catcher will think that was a strike. So the umpire is under pressure to make a call that the dugouts won't bark at him for. Why should he go out on a limb and give the catcher a strike when the catcher is the one who screwed up and will make him look bad if he makes the right call? In theory, we'd like the umpire to take one on the chin for the sake of the game and make the correct call. But in practice, I can at least understand that the umpire is a man on an island when he's making that call, and why the catcher behavior might influence what he calls.

For instance, here is the game I am currently watching. Nothing about that chart suggests consistency in any way. It isn't consistent overall, it isn't consistent against LHH, it isn't consistent against RHH.

The RHH plot looks completely consistent to me (at the moment--this is a game in progress, so the ump may do something stupid in the 9th inning to make me look bad later).

The LHH plot has probably two pitches that look inconsistent to me, i.e., the ones at a horizontal location of about 1.0 feet and vertical location around 3.0 feet.

Make sure you are looking at the non-normalized plots at the bottom. The height-normalizing process uses bad data, and I frankly hate those plots. (I've told Dan Brooks that many times, but I haven't yet convinced him to get rid of them off his site.)

Flipping the state of an average borderline pitch is worth just a little less, around 0.13 runs.

Right, but my question is whether the proper starting point should be the opposite call. Let's compare, say, Jose Molina and Ryan Doumit - I think they were at opposite extremes in your article. There's a pitch that's essentially in the same place with both guys catching: Molina's gets called a strike, Doumit's gets called a ball. The run difference between Molina and Doumit in this case should, therefore, be 0.13 runs. In terms of "net strikes", Molina would be +1 and Doumit would be -1, so the difference there would be 2. That would imply a run value for a single net strike of 0.13 / 2 = 0.065. If you valued "net strikes" at +0.13 runs, I think your run values might be twice as big as they should be.

The description of the statistical review does not leave me wanting to read the book, but if there are anecdotes or interviews with umpires that are worthwhile beyond the stats, perhaps it would be worth checking out.

Kiko/63, I think I understand what you are saying now. You're basically saying that for a borderline pitch, the neutral expectation would be 50-50 ball-strike. So if that was the baseline, an extra strike in that region would be worth half of 0.15, or about 0.075 runs. But in practice that's not what's happening. Molina's getting strikes where the probability of getting a strike is more like 15% on average, for example. So he's getting value for recapturing 85% of 0.15, which is 0.13.
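
The two baselines being debated here can be compared with a short sketch. It assumes, as the comment above does, that an extra strike is credited with the generic 0.15-run flip value scaled by how unlikely the strike call was; the function name and the probability argument are hypothetical.

```python
GENERIC_FLIP_RUNS = 0.15  # value of flipping a generic ball/strike call, as quoted above


def extra_strike_value(p_strike, flip_runs=GENERIC_FLIP_RUNS):
    """Runs credited for getting a strike call on a pitch that would
    otherwise be called a strike with probability p_strike."""
    return (1.0 - p_strike) * flip_runs


# Kiko's 50-50 baseline versus the roughly 15% strike probability cited above.
print(round(extra_strike_value(0.50), 4))
print(round(extra_strike_value(0.15), 4))
```

With a 50-50 baseline the credit is half the flip value (0.075 runs), while a 15% baseline gives 0.1275 runs, which rounds to the 0.13 used in the article.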

Certainly, it would be more careful accounting to capture the actual expected probability of a strike on each individual pitch and subtract from that instead of from the overall average expectation. It would also be a lot more work in my method. Maybe it's worth doing. It's not at the top of my list for refinements, but I can see eking out a little more accuracy that way.

Max Marchi did use the actual expected probability of a strike on each individual pitch as input to his regression model, and he found similar results to what I found, so that gives me some additional confidence that I'm not actually off by a factor of two here.

I don't think anyone needs a strike zone box to get peeved when a CB Bucknor is set up behind the catcher who is expecting a pitch inside, the pitch instead goes down the middle/belt high, but CB calls it a ball because he didn't expect that pitch.

I really hope you are not going to tell me that this is a unique event. This is an incredibly common occurrence and one of several examples of where the human element is a real pain.

Umpires, like hitters, are guilty of anticipating (as one of several flaws). When the reality does not jibe with their anticipated output they rarely get the answer correct.

I really hope you are not going to tell me that this is a unique event. This is an incredibly common occurrence and one of several examples of where the human element is a real pain.

I know this is people's perception. Believe me, I know that. I hear it all the time. But that does not help quantify anything. You've not proposed a method for measuring how bad CB Bucknor is.

I have a method that tells me that relative to a given strike zone definition, Bucknor gets 89.0% of his calls correct, whereas the average MLB ump gets 89.7% calls correct. But even that, unfortunately, does not speak to consistency. Bucknor might be perfectly consistent and still miss calls relative to the box I defined for the zone.

I'm not saying Bucknor's perfectly consistent. Don't get me wrong there. But unless we have a method to measure based upon objective data, what are we supposed to do? I don't really have much use for a vote on what people think umpire consistency is, except maybe as a sanity check on objective results.

But if part of your message is that overall things are ok or this is the best we should expect, then I am disappointed.

No.

If I didn't think things could be improved, I wouldn't be putting nearly so much time into this. I really think there is potential for better umpire training, among other potential improvements.

However, I do not think that most fans are very good judges of umpire quality, nor do they typically have a very good understanding of the potential drawbacks or problems involved in a computerized system.

I believe it would be good for the game if ball-strike calls were more consistent. I don't see a magic pill to make that happen. Figuring out how to accurately measure umpire consistency and what sort of factors affect an umpire's calls seems to me to be a better way to make progress, albeit slow progress, than yelling from the rooftops that umpires suck. Moreover, I believe in an approach that does not treat the umpires as the enemy of the game and is honest about the difficulties involved in doing their job well.

I also don't believe that calling balls and strikes relative to a particular spatial box is a worthy end unto itself. The goal is to improve the quality of play, the fairness of the result, and the enjoyment of the fans and players. That's why I think umpire consistency, if I could figure out how to measure it, would be a more helpful measure than the percentage of calls relative to a given box. Percentage of calls relative to a given box is a decent place to start, but I don't like it as the final judgment. I particularly don't like it as the final judgment when the box is poorly drawn.

Chiming in late here; this thread fell off the main page before I remembered to bookmark it.

First...terrific article and study. Any time we find new ways of quantifying behaviors and results, I'm all for it.

As to the umpiring debate...I get where you're coming from when you talk about the potential problems with some kind of electronic system. But I also think -- and you may have alluded to this above -- that getting a "reasonable" amount of accuracy out of a human judge just may not be possible. Given where the umpire stands, the speed at which things happen, etc., I think we're beyond human reflexes.

But, I think the magnitude seems too high. Jose Molina's pitch framing alone is worth twice Ozzie Smith's or Willie Mays's entire defensive contribution. This would have to show up in CERA in a big way.

I can buy this on the face of it. Remember that a catcher's pitch framing works on each and every single pitch. Ozzie and Mays get three to six balls per game.

It's too bad that Mike Piazza retired from catching before the years of this study. It would've been nice to learn how/if the widely differing opinions on his defensive prowess would be affected if we knew how/if he was a good pitch framer. Anecdotally I don't recall any such reputation about him.

Thanks for the article, Mike. klaw retweeted it, which is how I found it.

Kiko, I wanted to mention here that I think my understanding of what Dan's figure of 0.13 runs/pitch meant, as I expressed in #65, was not correct. I need to figure out the ultimate run impact, which I think will be less than what I said in my original article but more than 50% of what I said, due to some of the reasons I explained in #65, even if some of the things I said in that post were wrong. In other words, your objection in #63 has more merit than I originally realized, even if the impact is less than 2x.

There are a number of factors that I need to improve upon to tighten up the valuation in my model, and this will be one of the important ones.

Remember that a catcher's pitch framing works on each and every single pitch.

Not really; it only "works" on pitches that are (a) taken by the hitter and (b) on the border of the zone. Batters swing at about 45% of pitches, about 37% of pitches are called balls (with a small fraction of those being either intentional balls or pitchouts), and the remaining 18% are called strikes. Without looking closely, I'd bet that at least half of the called balls and strikes are clearly one or the other, and that would mean that at most a little more than a quarter of all pitches could be affected by framing. I'd guess that the actual number is quite a bit less than that.
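The back-of-the-envelope bound in this comment can be written out as a short sketch. The three rates are the ones quoted above; the 50% "clearly a ball or clearly a strike" share is the commenter's guess, and the function name is made up for illustration.

```python
def max_frameable_share(called_ball_rate, called_strike_rate, clear_share):
    """Upper bound on the share of all pitches that framing could affect.

    clear_share: assumed fraction of taken pitches that are obviously
    a ball or obviously a strike, so framing cannot change the call.
    """
    taken_rate = called_ball_rate + called_strike_rate
    return taken_rate * (1.0 - clear_share)


swing_rate = 0.45          # quoted above
called_ball_rate = 0.37    # quoted above
called_strike_rate = 0.18  # quoted above

# The three outcomes should account for every pitch.
assert abs(swing_rate + called_ball_rate + called_strike_rate - 1.0) < 1e-9

bound = max_frameable_share(called_ball_rate, called_strike_rate, clear_share=0.5)
print(round(bound, 3))  # 0.275
```

That 27.5% is the "little more than a quarter" ceiling; a larger clear_share pushes the figure lower.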

About a third of taken pitches are within three inches of the strike zone boundary, which seems to be about the rough extent of the strikes that can be gained or lost by framing. That's a ballpark estimate, so don't take the exact number too literally, but that works out to about 25 pitches a game where the catcher can have an effect on the ball/strike call.
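For a rough check on the 25-pitch figure, here is a sketch that combines the one-third edge share above with the roughly 55% take rate implied by the 45% swing rate quoted earlier in the thread. The 140 pitches caught per game is an assumption added here for illustration, not a number from the article or the comments.

```python
def frameable_pitches_per_game(pitches_caught, take_rate=0.55, edge_share=1 / 3):
    """Estimate how many pitches per game sit close enough to the edge to frame.

    pitches_caught: pitches a catcher receives in a game (assumed input).
    take_rate: share of pitches the batter does not swing at.
    edge_share: share of taken pitches within about 3 inches of the boundary.
    """
    return pitches_caught * take_rate * edge_share


estimate = frameable_pitches_per_game(140)  # 140 is a hypothetical workload
print(round(estimate, 1))  # 25.7
```

Under those assumptions the estimate lands close to the ballpark figure of 25, and it scales linearly with the number of pitches caught.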

Bad85, are you talking about this book?...The description of the statistical review does not leave me wanting to read the book, but if there are anecdotes or interviews with umpires that are worthwhile beyond the stats, perhaps it would be worth checking out.

Yes, that is the book, although I didn't pay $40.00 for the thing. Like I said, the stats are somewhat primitive (W/9 IP, K/9 IP, etc., compared to the league norm), but every umpire has a profile (each about a page and a half long), and every profile comes with anecdotes that are well written.