Skewed Left

What We've Learned About Replay


When a totally new system like Major League Baseball’s expanded instant replay—complete with brand-new job descriptions and job openings and technology—is assembled on the eve of the season, you’d imagine its implementation would look more like an evolution than the arrival of a fully formed process.

And by most accounts it has been. Whether it’s the change in the transfer rule that tangentially went along with it, or managers getting used to the silly choreography of how to argue with an umpire while simultaneously looking back at the dugout for a thumbs-up or a thumbs-down, everybody involved in the process seems to be getting better at it.

“We’re getting a little more educated in different ways not to manipulate the system but to use it properly,” Pirates manager Clint Hurdle said. “We went in with a plan, but I think we’ve been able to refine and make it a little bit better along the way. … Every team’s going about this in their own method. We’re keeping records on all the calls—specific calls that are made, leverage situations.”

Major League Baseball is keeping records too, and in the face of all the fears from this spring that teams would seriously game the system and massively disrupt the flow of the game, the records portray a pretty good picture.

According to data obtained from Major League Baseball’s central office on Tuesday, there were 381 traditional replay reviews—15 percent of which were umpire-initiated rather than challenge-initiated—plus four more looks to check counts or other record-keeping procedures in the first 759 games. Those challenges totaled 830 minutes, meaning that the replays averaged 2:09.

(The measurements will tell you that replay review has added 1:06 to each game, but that’s a pretty blunt estimate considering that it measures only the time from the challenge to the decision and not the manager’s stalling with an eye on the dugout. And it assumes no delay that would come with full-length arguments without replay.)
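For anyone who wants to check the arithmetic behind those two figures, here is a minimal sketch. It assumes the 2:09 average divides the 830 minutes by all 385 looks (the 381 traditional reviews plus the four record-keeping checks); the league data as quoted does not spell out that denominator, and the helper name is made up for illustration.

```python
def to_min_sec(minutes: float) -> str:
    """Format a duration given in minutes as M:SS, rounded to the nearest second."""
    total_seconds = round(minutes * 60)
    return f"{total_seconds // 60}:{total_seconds % 60:02d}"


TOTAL_REVIEW_MINUTES = 830
GAMES = 759
TRADITIONAL_REVIEWS = 381
RECORD_KEEPING_CHECKS = 4  # assumption: these four looks are part of the 830 minutes

all_reviews = TRADITIONAL_REVIEWS + RECORD_KEEPING_CHECKS

# Average length of a single review, and review time added per game played.
avg_per_review = to_min_sec(TOTAL_REVIEW_MINUTES / all_reviews)
added_per_game = to_min_sec(TOTAL_REVIEW_MINUTES / GAMES)

print("Average review:", avg_per_review)
print("Added per game:", added_per_game)
```

Neither number includes the manager's stalling time, so both are best read as lower bounds on the actual delay.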

But most notable is just how high a percentage of calls have been overturned, with the frivolous challenge not really materializing. Despite no penalty for getting a challenge wrong, beyond just losing the challenge, the success rate of 47.2 percent overturns is considerably higher than the 40 percent that tends to be close to the yearly rate in the NFL, where lost challenges cost a valuable timeout.

The prevailing thought, at least here, was that managers would challenge anything and everything, because the chance that another close call would come up later wasn’t enough of a deterrent to keep them from challenging even a play with only a few percent chance of being overturned.

For the most part, according to a couple of managers, that’s been true. The possibility of losing a challenge isn’t motivating them to hold back when there’s a play they’re considering challenging.

“Our philosophy is that if we think that we’re right and we think it’s a good challenge for us, we’re going to do it; you never know when it’s going to come again,” Nationals manager Matt Williams said. “That’s our philosophy. Some have been good for us and some have been bad for us, but we’re not going to shy away from doing it because we may not have another opportunity to.”

And what has been the biggest inning for challenges? The first, and by a lot.

Part of that is probably structural as it relates to the first inning. The first is the highest-scoring offensive inning, and the second and third are the lowest of any in the starter’s typical time in the game. And it’s not a surprise that more challenges would come with more runners on base because more unusual things can happen. Still, with so many challenges early, even proportional to offense, teams are clearly not afraid of losing a challenge.

“We’ve committed ourselves to the point with our group where we make a decision and we go with it,” Hurdle said. “If we win, we win. If we lose, we lose and we move on.”

Where my prediction that managers would abuse that freedom might be going wrong is that there just might not be many plays with a 5 percent chance of being overturned. There’s no reason that the distribution has to be uniform or normal or anything else. What we’re learning these first couple of months is that the distribution may be a whole lot of plays with a zero percent chance of being overturned, a whole bunch with a 90-100 percent chance, and then a few in the middle.

There’s been some controversy with bad angles that might make these chances not exactly zero or 100, but there really haven’t been too many surprises when umps come back from their chat session—especially on the side of bizarre overturns. If you need a five percent chance of getting it overturned to make it statistically worthwhile, that might just be an area of the probability curve that doesn’t really exist.

We’ve also started to see some patterns emerge in what types of plays are getting overturned. By far, the plays that result in the greatest percentage of overturns are the ones involving whether or not a ball was caught. That’s not surprising; in those cases, the replay official has to look at only one thing, and managers on the defensive side can lean on their players, who should know whether or not a ball was caught much better than they could gauge a close out/safe play.

Counting plays initially called a catch and those called a no-catch, there have been 12 challenges: 11 overturned, none confirmed, and only one in the distinct category of “standing,” where there was not enough evidence to overturn but no confirmation either. And even that one involved the transfer rule (a Terry Francona challenge of an Elliot Johnson non-catch), which might be ruled a catch now, seven weeks later.

Here are the rest of the plays broken down by success rates of challenges on each type of initial call.

Initial call        Total plays   Overturned   Confirmed   Stands   Overturn %
Catch                         7            7           0        0       100.0%
No catch                      5            4           0        1        80.0%
Safe 1st                     69           41           7       21        59.4%
Safe other bases             83           42          18       23        50.6%
Out other bases              64           31          13       20        48.4%
Out 1st                      82           36          20       26        43.9%
Other**                      19            8           5        6        42.1%
Foul                         12            4           2        6        33.3%
No home run                  16            4           8        4        25.0%
Home run                      9            2           6        1        22.2%
Plate block                  15            1          13        1         6.7%
Total                       381          180          92      109        47.2%

**Other includes things like HBPs and whether a ball hit the wall before being caught.
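The percentages in that table are just overturned calls divided by total reviews for each type of initial call. A short sketch that recomputes them from the counts above (the function and variable names are illustrative, not anything MLB uses):

```python
# (overturned, confirmed, stands) for each initial call, copied from the table.
reviews = {
    "Catch": (7, 0, 0),
    "No catch": (5 - 1, 0, 1),
    "Safe 1st": (41, 7, 21),
    "Safe other bases": (42, 18, 23),
    "Out other bases": (31, 13, 20),
    "Out 1st": (36, 20, 26),
    "Other": (8, 5, 6),
    "Foul": (4, 2, 6),
    "No home run": (4, 8, 4),
    "Home run": (2, 6, 1),
    "Plate block": (1, 13, 1),
}


def overturn_pct(overturned: int, confirmed: int, stands: int) -> float:
    """Share of reviewed calls that were overturned, as a percentage to one decimal."""
    total = overturned + confirmed + stands
    return round(100 * overturned / total, 1)


rates = {call: overturn_pct(*counts) for call, counts in reviews.items()}

# Column sums give the league-wide totals for overturned, confirmed and stands.
totals = tuple(sum(column) for column in zip(*reviews.values()))
overall = overturn_pct(*totals)

for call, pct in sorted(rates.items(), key=lambda item: -item[1]):
    print(f"{call:<18}{pct:6.1f}%")
print(f"{'Total':<18}{overall:6.1f}%")
```

Note that “stands” counts against the challenger here, just as it does in the table: only a true overturn goes in the numerator.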

A few other things jump out here, one being the relatively low rates on the home run vs. non-home run calls, which were the only reviewable plays before this year’s expansion. Managers can’t challenge home run calls, but they can request that umpires review them. With so many of those “confirmed” and not just “standing,” this one might be where the managers are just taking their hundred-to-one shots, given that the stakes are so high.

The more intriguing one is the different rates both on plays at first and at other bases/basepaths when the call is initially “safe” vs. when the call is initially “out.” With the immediate caveats that we’ll know much more when we have triple the data at the end of the season, and that this could be a sample size issue, the rates of overturns have been higher when the initial call is that the runner was safe.
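One rough way to put a number on that sample-size caveat is a two-proportion z-test on the counts from the table. This is my own back-of-the-envelope check, not anything from the league's data, and it assumes the challenged plays behave like independent samples and that a normal approximation is reasonable at these sample sizes.

```python
import math


def two_proportion_z(x1: int, n1: int, x2: int, n2: int) -> tuple[float, float]:
    """Return (z, two-sided p-value) for the null that both groups share one rate."""
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / std_err
    # erfc(|z| / sqrt(2)) equals the two-sided tail area of a standard normal.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Overturned / total reviews, initial call "safe" vs. initial call "out".
z_first, p_first = two_proportion_z(41, 69, 36, 82)  # plays at first base
z_other, p_other = two_proportion_z(42, 83, 31, 64)  # plays at other bases

print(f"First base:  z = {z_first:.2f}, p = {p_first:.3f}")
print(f"Other bases: z = {z_other:.2f}, p = {p_other:.3f}")
```

Under those assumptions, the gap at first base lands right around the conventional borderline (z close to 1.9), while the gap at the other bases is indistinguishable from noise. That fits the caution above: interesting, but not yet settled.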

This could be seen as some initial insight on how umpires are making these calls, erring on the side of “safe” calls. It could be some subconscious bias, or it could be erring—in the age of replay—on the side of keeping plays alive because it’s easier to sort things out that way than to return baserunners to the field.

However, it could be something much simpler, something in the overturning rather than in the call. It’s generally easier to prove the existence of anything than the absence of anything. Hence, it would be easier to overturn a safe call into an out call by proving that a tag happened than by trying to prove that it didn’t happen.

Whether that applies to a tag on a video, though, is unclear. It might actually be easier to verify blank space between glove and back than to verify a tag. It will be interesting to see if these relationships hold, both for plays at first and for plays on the bases with more time, more data, and no transfer rule skewing the numbers at second.

Where the intuition is much more straightforward is in the idea put forth by broadcasters, who in this case are absolutely right in saying that the longer a challenge goes, the worse it is for the challenging team. Sort of.

The shortest replays are the ones that confirm the call, while the longest ones are the ones that get the exact same result in the call standing. After a certain point, it can certainly be said that the longer it goes, the worse the result for the manager who made the challenge.

Result         Total plays   Average time   Std. Dev.
Confirmed               92           1:37        0:45
Overturned             180           2:03        0:53
Call stands            109           2:45        0:49
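As a consistency check, the three averages above can be weighted by their play counts to recover an overall average review time. This is only a sketch: the per-result averages are already rounded to the second, so the combined figure is approximate, and the helper name is invented for illustration.

```python
def to_seconds(mmss: str) -> int:
    """Convert an M:SS string such as '2:03' to a number of seconds."""
    minutes, seconds = mmss.split(":")
    return int(minutes) * 60 + int(seconds)


# (total plays, average review time) for each result, copied from the table.
results = {
    "Confirmed": (92, "1:37"),
    "Overturned": (180, "2:03"),
    "Call stands": (109, "2:45"),
}

total_plays = sum(plays for plays, _ in results.values())
total_seconds = sum(plays * to_seconds(avg) for plays, avg in results.values())

# Weighted mean: each result's average counts in proportion to its number of plays.
mean_seconds = round(total_seconds / total_plays)
overall = f"{mean_seconds // 60}:{mean_seconds % 60:02d}"

print(f"{total_plays} reviews, weighted average {overall}")
```

The weighted figure lines up with the overall average quoted earlier, which suggests the table and the league-wide total describe the same set of reviews.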

Two months into the process without a lot of major hangups, the goal should be to lower those numbers. Hurdle suggested another change, a camera angle aimed directly down the first base line to get a better view of plays at the bag, but most of the room for improvement will be in the flow of the process. And again, that average time counts only from challenge to verdict, not the time that the manager is stalling.

If there’s a major tweak in the offseason, it might not need to be a penalty for teams that make frivolous challenges; it could instead be an enforced time limit that regulates how long after a play a challenge can still be made. But two months in, the system has been working, albeit slowly, with less funny business than anticipated, given how quickly it was all thrown together.

Is there anywhere that keeps track of team success rates of challenges? I know the Giants run out the stat that Bochy has been pretty successful with his challenges so far this year, but is that the Giants keeping track on their own, or are those stats out there?

You know who blathers on constantly about "how long" a replay takes? Beat writers and broadcasters whose own jobs center around deadlines and time slots. Yet they conveniently ignore the fact that replay has virtually eliminated manager/umpire arguments, which took quite a bit of time each game.

You know who doesn't care about how long a replay takes? Everybody else who just wants the calls to be right.