Baseball's Particle Accelerator

The new technology that will change statistical analysis forever.

We live in the golden age of baseball statistics. The tubes of the Internet are nearly clogged with stats-based analysis. Dozens of sites function as a sort of ad hoc peer-review system that churns out answers to what were previously baseball's imponderables. Would Barry Bonds outhomer Ruth and Aaron if they played in the same era, you ask? Well, if they all suited up in the same era against the same competition and they all happened to play every game in Montreal's Stade Olympique, then, no. No sir, he wouldn't. Baseball Prospectus has done the math.

But even with all the analytical innovations that this era has wrought, baseball is still filled with blind spots. Sure, you can tell me that Sandy Koufax allowed fewer runs per nine innings than Pedro Martinez. But whose fastball had more movement? Nobody's yet come up with an algorithm for "nastiness."

That could change soon. This season, Major League Baseball rolled out a new feature, "Pitch f/x," that's like the stathead equivalent of a particle accelerator—a technical marvel that might just yield answers to the fundamental questions of the baseball universe.

Pitch f/x is part of Gameday, an MLB.com widget that lets you follow games when you don't want to pony up for the video feed. Until this season, Gameday was little more than a crudely animated, pop-up information dump. In one corner, you can follow the progress of runners on the base paths; in the other, you've got the score line. A little map of the field plots where the ball lands. As pure entertainment, the Gameday experience leaves a lot to be desired. The scrolling-text play-by-play ("Garrett Atkins doubles (28) on a line drive to right fielder Corey Hart. Kazuo Matsui scores. Todd Helton to 3rd. Two out") is about as far as you can get from "The Giants win the pennant! The Giants win the pennant!" But for the baseball addict, it fills a need.

Enhanced Gameday is a whole new (yet still poorly animated) ballgame. MLB.com has installed cameras in 22 of the league's 30 parks that provide data for Gameday's pitch-location chart. Previously, some guy in the press box had to guess, Hmm, that looked like a curveball, maybe four inches inside and a couple of inches above the knees. Now the computer generates all of this automatically: how high the pitcher's throwing hand was off the ground when he released the ball, how fast the ball was traveling both when it left his hand and when it crossed the plate, to what degree and in what direction the ball diverted from a straight path on its way to the plate, and finally, whether the pitch really was four inches inside and a couple of inches above the knees.

For the casual fan toggling between Gameday and a report she's half-assing at the office, all that new information is pretty useless. But this data has many baseball researchers positively giddy. Perform some simple spreadsheet shenanigans and you can see that Cla Meredith's sinker sinks more than that of any other pitcher. A pitch from Angels closer Francisco Rodriguez left his hand at 104.3 mph but crossed the plate at only 82. And the ball leaves submariner Chad Bradford's hand a full 2 feet closer to the ground than the guy with the next-lowest release point (incidentally, Pitch f/x sinker king Cla Meredith).
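For the curious, those spreadsheet shenanigans boil down to grouping pitches by pitcher and sorting. Here is a minimal Python sketch of the idea. The field names (start_mph, end_mph, drop_in, release_ft), the pitcher labels, and every number in it are invented for illustration; they are not Gameday's actual format or real measurements.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical per-pitch records. Field names and values are made up
# for illustration and do not come from the real Gameday feed.
PITCHES = [
    {"pitcher": "Pitcher A", "pitch_type": "sinker", "start_mph": 88.0, "end_mph": 81.0, "drop_in": 9.0, "release_ft": 4.0},
    {"pitcher": "Pitcher A", "pitch_type": "sinker", "start_mph": 87.0, "end_mph": 80.5, "drop_in": 11.0, "release_ft": 4.2},
    {"pitcher": "Pitcher B", "pitch_type": "sinker", "start_mph": 91.0, "end_mph": 84.0, "drop_in": 6.0, "release_ft": 6.0},
    {"pitcher": "Pitcher B", "pitch_type": "fastball", "start_mph": 95.0, "end_mph": 87.0, "drop_in": 3.0, "release_ft": 6.1},
    {"pitcher": "Pitcher C", "pitch_type": "sinker", "start_mph": 90.0, "end_mph": 83.5, "drop_in": 7.0, "release_ft": 2.0},
    {"pitcher": "Pitcher C", "pitch_type": "fastball", "start_mph": 86.0, "end_mph": 80.0, "drop_in": 4.0, "release_ft": 2.1},
]


def rank_by_mean(pitches, field, pitch_type=None, reverse=True):
    """Return (pitcher, mean of field) pairs, sorted by that mean."""
    grouped = defaultdict(list)
    for pitch in pitches:
        if pitch_type is None or pitch["pitch_type"] == pitch_type:
            grouped[pitch["pitcher"]].append(pitch[field])
    averages = [(name, mean(values)) for name, values in grouped.items()]
    return sorted(averages, key=lambda pair: pair[1], reverse=reverse)


def biggest_speed_loss(pitches):
    """Return the single pitch that lost the most speed on its way to the plate."""
    return max(pitches, key=lambda pitch: pitch["start_mph"] - pitch["end_mph"])


# Whose sinker sinks the most, on average?
sinker_ranking = rank_by_mean(PITCHES, "drop_in", pitch_type="sinker")

# Who releases the ball closest to the ground?
release_ranking = rank_by_mean(PITCHES, "release_ft", reverse=False)
```

With real data, the first entry of sinker_ranking would be the sinker king and the first entry of release_ranking the resident submariner.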

Baseball Prospectus' Dan Fox, who is incidentally one of those soon-to-be-obsolete guys who estimate pitch locations for Gameday, issued a challenge to the vast community of baseball researchers this May:

Now that we have these sorts of tools at our disposal, we can begin to ask and answer a variety of interesting questions. Which hitters tend to get bad calls? Which pitchers get the benefit of the doubt most often? On what counts is it more likely that pitchers or hitters will benefit? Which hitters swing at pitches out of the strike zone? ... How frequently do pitchers target specific zones against certain hitters? The list goes on and on from there—let's get started.

The community has answered the call to arms. Some of the Gameday-based studies are delightfully esoteric. Am I improved by knowing that much of Josh Beckett's success comes from the fact that his changeup moves nearly identically to his fastball? No. Am I entertained? Most definitely! Some of the analysis is strategically significant—a manager would want to know that right-handed hitters should expect the curve from Tim Wakefield if they get behind in the count.

To my mind, the most interesting discussions have been about umpires. Fans have been hurling out glasses-related invectives since the men in blue were men in seersucker suits and monocles. Until instant replay came along, however, the fan didn't have a leg to stand on if he found fault with a close call. This new Gameday system does replay one better. Pitch f/x claims to be accurate to within a half-inch, and it's constantly building a pool of data. In the same way you can now say definitively that Barry Bonds was better at getting on base than Mickey Mantle, someday soon you'll be able to say that certain umpires do indeed suck. And which umpires suck more than others. And if, generally, they suck less if they're taller.

So far, attempts to answer these vital ump questions have been thwarted by small sample sizes and inconsistencies in the system from stadium to stadium. But early reports suggest that umpires are surprisingly accurate, though they do tend to give some pitchers and hitters the benefit of the doubt on balls and strikes. (The data are sketchy, but if Jose Vidro has felt like he's been getting screwed on strike calls this year, he's probably right.) Last week, the Houston Astros tested out displaying the Pitch f/x strike zone on the scoreboard for a few innings during a game. This feels like a bad idea. Imagine how bad things could get for an ump if 52,000 Mets fans know he just rang up Jose Reyes on a called strike that was 3 inches off the plate.
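Grading an umpire comes down to comparing each called pitch with where the ball actually crossed the plate. The Python sketch below shows one simplified way to do that. The record layout, the umpire labels, and the sample pitches are all hypothetical, and the strike-zone test is a rough assumption (a 17-inch plate, widened by an approximate ball radius, with a fixed top and bottom for each batter), not the method any researcher or MLB actually uses.

```python
from collections import defaultdict

PLATE_HALF_WIDTH_IN = 17.0 / 2  # home plate is 17 inches wide
BALL_RADIUS_IN = 1.45           # approximate; lets a pitch that nicks the edge count

# Hypothetical called pitches. x_in is inches from the center of the plate,
# z_in is height in inches. All values are invented for illustration.
CALLED_PITCHES = [
    {"umpire": "Umpire X", "x_in": 0.0, "z_in": 30.0, "zone_bottom_in": 18.0, "zone_top_in": 42.0, "call": "strike"},
    {"umpire": "Umpire X", "x_in": 11.5, "z_in": 30.0, "zone_bottom_in": 18.0, "zone_top_in": 42.0, "call": "strike"},
    {"umpire": "Umpire X", "x_in": -5.0, "z_in": 50.0, "zone_bottom_in": 18.0, "zone_top_in": 42.0, "call": "ball"},
    {"umpire": "Umpire X", "x_in": 4.0, "z_in": 25.0, "zone_bottom_in": 18.0, "zone_top_in": 42.0, "call": "ball"},
    {"umpire": "Umpire Y", "x_in": 2.0, "z_in": 20.0, "zone_bottom_in": 18.0, "zone_top_in": 42.0, "call": "strike"},
    {"umpire": "Umpire Y", "x_in": 12.0, "z_in": 30.0, "zone_bottom_in": 18.0, "zone_top_in": 42.0, "call": "ball"},
]


def in_zone(pitch):
    """Simplified test: does any part of the ball touch the rectangular zone?"""
    half_width = PLATE_HALF_WIDTH_IN + BALL_RADIUS_IN
    low = pitch["zone_bottom_in"] - BALL_RADIUS_IN
    high = pitch["zone_top_in"] + BALL_RADIUS_IN
    return abs(pitch["x_in"]) <= half_width and low <= pitch["z_in"] <= high


def call_accuracy(called_pitches):
    """Return {umpire: share of calls that agree with the measured location}."""
    tallies = defaultdict(lambda: [0, 0])  # umpire -> [correct, total]
    for pitch in called_pitches:
        expected = "strike" if in_zone(pitch) else "ball"
        tally = tallies[pitch["umpire"]]
        tally[0] += pitch["call"] == expected
        tally[1] += 1
    return {umpire: correct / total for umpire, (correct, total) in tallies.items()}


accuracy = call_accuracy(CALLED_PITCHES)
```

The second sample pitch is the nightmare scenario: a called strike 3 inches off the plate, which this check scores as a miss. As the small-sample warnings suggest, a handful of pitches per umpire says very little; the tally only becomes meaningful over thousands of calls.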

The crowning achievement of the Gameday researchers and, I would argue, one of the greatest moments for nerdom since Ric Ocasek married Paulina Porizkova, occurred earlier this summer in Seattle. Dave Cameron, a writer for the U.S.S. Mariner blog, was trying to figure out why hitters were getting to M's ace Felix Hernandez in the early innings. Cameron turned to Gameday's database, which helped him figure out that King Felix's problem is pitch selection: The dude throws too many fastballs. So, Cameron posted his findings in a beautifully argued open letter to the Mariners pitching coach. The coach printed out the post and handed it to Hernandez. Next time out, the pitcher laid off the fastball and threw eight innings of shutout ball. Bloggers around the world should tack up portraits of Cameron in their living rooms.

Of course, Cameron's success story is a statistical outlier. Some of the early enthusiasm for Gameday 2007 has been tempered by shaky data. It seems that not all of the cameras are properly calibrated yet, so a pitcher might appear (incorrectly) to release the ball at different heights off the ground in different stadiums. Those itching to know how much altitude impacts curveballs in Colorado or how much air density affects sinkers in San Diego will have to wait a while. But for now, the stats community will be writing articles like "Pitch Break Angle Versus Length" and "Rangers Rotation Release Points Redux." And they'll be fantasizing about next year's rumored Gameday innovation: video cameras that cover the playing field. These cameras promise to yield important insights about the art and science of fielding. Barring the sudden appearance of a complete pitch-by-pitch record of the Negro Leagues inside a melting ice cap, that is the last great statistical frontier.