The Polio Vaccine, Cold Fusion, and Advanced Pitching Stats

For centuries we used horses to get around, until we unlocked the secrets of internal combustion and started building cars. For 2,000 years, we diagnosed the sick based on surpluses or deficits of their four humors, until modern medicine added decades to our lives and immeasurably improved our standard of living. For as long as baseball has existed, pitching remained a great mystery, until advances in statistics and technology opened our eyes to how pitchers work.

These three advances are nearly identical in their contribution to society. If anything, driving and curing disease are vastly overrated.

Voros McCracken was the man who started us down the road toward better understanding of pitchers. An avid amateur baseball researcher who whipped through thousands of hours on the rec.sport.baseball message board, McCracken spent his days trying to figure out how to separate defense from pitching so that both skills could stand on their own. Finally, he found his answer: Defense Independent Pitching Stats, or DIPS. The central premise of DIPS theory held that pitchers have no ability to prevent hits on balls in play. As McCracken put it, you could be Randy Johnson or you could be Scott Karl; once the batter made contact, a pitcher’s pedigree meant nothing.

McCracken’s theory was instantly branded as, to put it lightly, crizazlebeans. How could one of the greatest pitchers in baseball history, owner of a 98 mph fastball and possibly the best slider ever, have as little sway over balls hit in play as some generic junkballer? You could belt out numbers until your face turned blue; people weren’t buying it.

Still, those numbers, for the most part, backed him up. There were some exceptions. Knuckleball pitchers showed a tendency to suppress batting average on balls hit in play (now known as BABIP) a little better than the general population. The bigger flaw in DIPS theory (and McCracken’s resulting stat DIPS ERA) was its failure to account for a pitcher’s ability to prevent doubles and triples. Perhaps great pitchers couldn’t induce enough weak contact to materially affect BABIP. But weak contact did exist, and the best pitchers still benefited from their ability to create it. There were other minor flaws here and there. But the general idea that what we’d always called “pitching” was really “pitching and defense” started to take hold.
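For readers who want to see what BABIP actually counts, here is a minimal sketch. The formula is the commonly published definition rather than anything spelled out in this piece, and the season line at the bottom is invented purely for illustration.

```python
def babip(hits, home_runs, at_bats, strikeouts, sac_flies=0):
    """Batting average on balls in play (commonly published definition).

    Home runs and strikeouts are removed because no fielder gets a chance
    on them; sacrifice flies are added back because they are balls in play
    that do not count as at-bats.
    """
    balls_in_play = at_bats - strikeouts - home_runs + sac_flies
    if balls_in_play <= 0:
        raise ValueError("no balls in play to evaluate")
    return (hits - home_runs) / balls_in_play


# Hypothetical season line allowed by a pitcher (not a real player).
example = babip(hits=180, home_runs=20, at_bats=750, strikeouts=150, sac_flies=5)
print(f"{example:.3f}")  # 0.274
```

The DIPS claim, in these terms, is that this number says more about defense and luck than about the man on the mound.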

The next generation of baseball analysts began producing more sophisticated stats to express this new discovery. A baseball researcher named Tom Tango developed Fielding Independent Pitching (FIP), an update on DIPS ERA.[1] The idea behind FIP was to strip out the impact of defense, as well as luck, official scorers, and other factors beyond a pitcher’s control. If a pitcher commits a two-out throwing error, then gives up seven runs, none of those runs are earned and none count against his ERA. If Terry Francona loses his mind and puts David Ortiz at shortstop, the countless balls that get by Papi, balls a competent shortstop would normally handle but that wouldn’t be ruled errors, wouldn’t destroy Josh Beckett’s FIP. To make the stat easier to understand, FIP runs along a similar scale to ERA. For instance, Josh Johnson led all qualified starters last year with a 2.41 FIP; Ryan Rowland-Smith finished last with a 6.55 FIP; league average came in at 4.08.

Colin Wyers of Baseball Prospectus recently explained pitcher evaluation thusly: “When evaluating a pitcher’s performance, there are three questions we can ask that can be addressed by statistics: How well he has pitched, how he accomplished what he’s done, and how he will do in the future.”

Won-lost record answers none of these questions. It’s highly dependent on defense, run support, bullpen support, and the whims of a manager. ERA answers none of these questions. It’s subject to all those shady official scorers’ decisions, as well as defense and other pitching-independent factors. Defense-independent stats, collectively, answer all of these questions. Pick any one you want. For our purposes, they’re all pretty similar in their utility and accuracy.[2]

For now, get to know FIP. If general managers had understood the concept back in the day, none of these disastrous signings would have happened.

Mike Hampton, Rockies, 8 years, $121 million/Denny Neagle, 5 years, $51 million:
Extenuating circumstances galore. The 2000-01 offseason was one of the wildest in baseball history, with salaries sent sky-high by owners with pumped-up stock portfolios. You also can’t talk about the Hampton and Neagle debacles without bringing up the Coors Effect, which damaged or ruined several talented pitchers’ careers. We were also years away from embracing defense-independent stats as accepted mainstream ideas. Still, even a cursory glance at Hampton’s strikeout-to-walk rate (1.75 and 1.53 in the two years leading into his megacontract) and Neagle’s acute case of gopheritis (54 homers in the season and a half before his big deal) should have raised huge caution flags. Neagle’s retroactive 1999 and 2000 FIPs: 5.63 and 4.90. If the Rockies wanted to burn $172 mil, why didn’t they just open a nationwide chain of Dante Bichette-themed amusement parks?

Barry Zito, Giants, 7 years, $126 million
Even in Zito’s Cy Young season, his fielding-independent profile pointed to a pitcher more than a run worse than his ERA would suggest, and nowhere near that 23-5 record. In fact, Zito’s ability to miss bats was already waning by the time he won that Cy Young in his third big league season, at age 24. His strikeout rate dropped from an excellent 8.7 per 9 innings in 2001 to a slightly above average 7.1 in 2002 to a poor 5.7 in 2003. With the help of some great teams behind him,[3] Zito’s superficial stats continued to look strong even as he struck out fewer batters, walked more batters, and gave up more home runs. All that time, Zito’s home ballpark propped him up. From 2002 through 2006, Zito posted an infield fly ball percentage of 15.6, third-highest among all qualified starters. Oakland-Alameda County/Network Associates/McAfee/Overstock.com/O.co Coliseum contains enough foul territory to house all of Antonio Cromartie’s kids, and Zito was a soft-tossing fly ball pitcher who squeezed all he could from that environment, including .245, .243, and .239 BABIPs during that five-year stretch.

Giants general manager Brian Sabean and owner Peter Magowan ignored Zito’s peripherals, dreamed of the guy they’d seen a half-decade earlier, and envisioned a new Giants ace whose charisma would turn the Barry Bonds Era into the Barry Zito Era. A past-his-prime Zito tried to throw his 85 mph fastball and diminished curve in a park that favored pitchers but didn’t proffer that huge infield fly edge, and his superficial numbers finally caught up with his lackluster performance. And here we are.

Oliver Perez, Mets, 3 years, $36 million
Perez went 15-10 with a 3.56 ERA and about a strikeout per inning in 2007; he also walked too many batters and posted a 4.35 FIP. The next year, he posted one of the highest line-drive rates in the league (22.1 percent) and yielded a 4.68 FIP, in line with luminaries such as Ian Snell, Greg Smith, and yup, Barry Zito. Granted, everyone in the free world knew giving an injury-prone starter with lousy command $12 million a year was a terrible idea. Everyone except the Mets. The good news is Perez’s numbers became easier to decipher in the following two seasons: Perez’s 6.82 and 6.80 ERAs were in line with his true levels of awfulness.

History is littered with similar examples of horrendous signings that could have been averted by eyeballing FIP. Trouble is, there will always be pitchers with deceptive stat lines who tempt short-sighted GMs into bad decisions. If your team hands big money to pitchers like these, you could form an unruly mob and descend on the stadium with torches and pitchforks, and no one would blame you.

Jeff Karstens, Pirates, 3.05 ERA, 4.46 FIP
The regression’s already started for Karstens, as it has for the entire Pirates pitching staff. When you’re a right-handed pitcher with a high-80s fastball, moderate groundball rate, and Jason Hammelian strikeout ability, you’re asking for trouble.

Joe Saunders, Diamondbacks, 3.67 ERA, 4.72 FIP
There are certain pitchers who consistently post lower ERAs than their peripherals would suggest. There could be any number of reasons for this gap, including strong team defense, favorable home park, greater success with runners on base, or just a prolonged period of luck. Saunders has been one of those guys. That doesn’t mean you should expect him to hang with Tommy Hanson when he’s pitching like Zac Hanson.
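The eyeball test used on these pitchers boils down to subtracting ERA from FIP and seeing who has the most room to fall. A rough sketch, using only ERA and FIP figures quoted in this piece; the function name and the sort-by-gap approach are illustrative assumptions, not an established method:

```python
# (name, ERA, FIP) triples quoted in the article.
pitchers = [
    ("Jeff Karstens", 3.05, 4.46),
    ("Joe Saunders", 3.67, 4.72),
    ("Oliver Perez (2007)", 3.56, 4.35),
]


def era_fip_gaps(rows):
    """Return (name, FIP minus ERA) pairs, largest gap first.

    A large positive gap means the ERA looks better than the
    defense-independent numbers say it should.
    """
    gaps = [(name, round(fip - era, 2)) for name, era, fip in rows]
    return sorted(gaps, key=lambda pair: pair[1], reverse=True)


for name, gap in era_fip_gaps(pitchers):
    print(f"{name}: FIP exceeds ERA by {gap:.2f}")
```

A gap alone proves nothing, since some pitchers beat their peripherals for years, but it is a quick way to decide whose stat line deserves a second look.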

Defense-independent stats aren’t perfect. Soon, we’ll gain even greater accuracy in our analysis. Sportvision developed a system called PITCHf/x, which measures the velocity, break, and slope of every pitch thrown in the major leagues. Follow-up systems called FIELDf/x and HITf/x further isolate the exact positioning of every fielder, and the hit speed and trajectory of batted balls. Scientific advances, be they in baseball or other less awesome fields, ultimately come down to one goal: eliminating guesswork.

That doesn’t mean we’ll know everything about baseball. But it does mean we’re getting better at solving a very hard-to-squash perception problem. Take the Brewers’ offseason trade for Zack Greinke. It looks awful if all you do is glance at his 4.21 ERA. But Greinke’s 2.99 FIP makes him a top-20 starting pitcher, while his strikeout-to-walk rate of nearly 5.5-to-1 slots him behind only Roy Halladay and Dan Haren. That doesn’t mean you should expect Greinke’s ERA to instantly catch up with his FIP. It does mean he’s an elite starter, and that if he maintains these skills in future seasons his ERA and other superficial stats will likely improve.

[1] FIP = (13 × HR + 3 × BB - 2 × K) / IP, plus a number held constant throughout the year (usually around 3.10 to 3.20) that calibrates FIP to the ERA scale. Also, while Tom Tango is an alias, I can vouch for this gentleman. He’s a die-hard Montreal Expos fan, like me, and the founder of the excellent Tim Raines Hall of Fame advocacy site, Raines30.com.
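The FIP arithmetic above is simple enough to sketch in a few lines. The default constant of 3.15 is an assumption, picked only because it sits in the middle of the 3.10 to 3.20 range mentioned here; the real value is set per league and season. The stat line in the example is invented.

```python
def fip(home_runs, walks, strikeouts, innings, constant=3.15):
    """Fielding Independent Pitching, using the formula from the footnote.

    `innings` must be true decimal innings (200 and 1/3 innings is
    200.333..., not the box-score shorthand 200.1).
    `constant` is a placeholder default (assumption); the actual figure
    is recalculated for each league and season.
    """
    if innings <= 0:
        raise ValueError("innings must be positive")
    return (13 * home_runs + 3 * walks - 2 * strikeouts) / innings + constant


# Hypothetical pitcher: 20 HR, 50 BB, 200 K over 200 innings.
print(f"{fip(20, 50, 200, 200.0):.2f}")  # 3.20
```

Notice what is missing from the inputs: hits, runs, and errors. Only the outcomes no fielder can touch make it into the number.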

[2] Baseball analysts have produced several defense-independent stats that run along a similar scale to ERA. There’s Defense-Adjusted ERA (DERA), developed by Clay Davenport of Baseball Prospectus. There’s Fair Run Average (FRA), another Baseball Prospectus stat that strips out pitching-independent events, is park- and batted-ball adjusted, and also considers sequencing of events. There’s Skill-Interactive Earned Run Average (SIERA), a stat developed by Matt Swartz and Eric Seidman (formerly with Baseball Prospectus, now with FanGraphs). There’s FIP. Finally, there’s Expected FIP (xFIP), which takes FIP, regulates aberrant home run-to-fly ball rates, then spits out a result.
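To make the xFIP idea concrete, here is a rough sketch of one way to "regulate" home run-to-fly ball rates: swap a pitcher's actual home runs for the number he would have allowed at a league-average HR/FB rate, then run the same FIP arithmetic. The simplified formula (no hit-by-pitch term), the 10 percent league rate, and the 3.15 constant are all assumptions chosen for illustration, not official values.

```python
def xfip(fly_balls, walks, strikeouts, innings, league_hr_per_fb, constant=3.15):
    """Expected FIP: FIP with home runs replaced by an expected total.

    `league_hr_per_fb` is the league-average share of fly balls that
    leave the park, as a fraction (0.10 means 10 percent).
    `constant` is a placeholder default (assumption), as in FIP.
    """
    if innings <= 0:
        raise ValueError("innings must be positive")
    expected_home_runs = fly_balls * league_hr_per_fb
    return (13 * expected_home_runs + 3 * walks - 2 * strikeouts) / innings + constant


# Hypothetical pitcher: 200 fly balls, 50 BB, 200 K over 200 innings,
# with an assumed league HR/FB rate of 10 percent.
print(f"{xfip(200, 50, 200, 200.0, league_hr_per_fb=0.10):.2f}")  # 3.20
```

A pitcher who gave up 30 homers on those same 200 fly balls would carry a much uglier FIP but the identical xFIP, which is the whole point of the adjustment.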

[3] The A’s made the playoffs five times, won four AL West titles, and never won fewer than 88 games in Zito’s 6½ seasons with the club. Michael Lewis wrote a book about it, and next month we’ll get to hear Brad Pitt talk about sample sizes and regression to the mean in a sexy voice.