Sure, I’ll admit it: I don’t know exactly what to do with FIP – the Fielding Independent Pitching numbers that analyze a pitcher based entirely on strikeouts, walks and home runs allowed. On the one hand, I believe the Voros McCracken discovery that most of a pitcher’s value is tied up in those three things, that a pitcher has much more control over strikeouts, walks and home runs than anything else.

Of course a pitcher doesn’t have complete control of ANYTHING, and no matter who the pitcher may be Ryan Howard is going to strike out a lot, Jose Bautista is going to walk a lot, and Giancarlo Stanton will hit many home runs. But these are the three areas where I think everyone can agree a pitcher has significant say in the matter.

It’s considerably less clear how much control a pitcher has over a ball put in play. The fate of balls in play seems to be more in the realm of defense and luck and the quirks of a particular ballpark. Then again, it’s hard and maybe incorrect to make the leap that a pitcher has NO say in those balls in play. People will argue about this all the time.

The point is – I like FIP a lot. FIP is the Tom Tango and Clay Dreslough invention that essentially crunches strikeouts, walks and home runs into a number that is aligned with ERA.
The basic formula is pretty easy: ((13 × HR) + (3 × (BB + HBP)) – (2 × K)) / IP.

OK, so you can see based on this – Kluber has a very low FIP, Porcello has a pretty decent FIP, Miller’s FIP is pretty much disastrous. Now, what Tango and company did to put the number in context and make it seem more familiar and friendly – they added a constant that is tied directly to the league’s run scoring environment. That way the numbers should correspond, more or less, with ERAs.

You can see how FIPs look like ERAs when you look at the three pitchers above:

So the numbers more or less match up. All three pitchers have FIPs higher than their ERAs, but Kluber’s is barely higher, Porcello’s is a half run higher, Miller’s is almost three quarters of a run higher.

As I say, I really like using FIP. But I can’t quite cut the cord with ERA – can’t quite make the leap that how many runs a pitcher allows (or, to be more precise, how many runs a team allows with a pitcher on the mound) should not be part of the equation. In a way, I suspect this is probably a lack of imagination on my part, tied up in my childhood. But, you can’t relive your childhood. I grew up with ERA as THE thing, and it’s hardly the only childhood thing I can’t let go of. Point is, I tend to look at both FIP and ERA when looking at a pitcher.

Which leads me to this: Clayton Kershaw is having a crazy, historic season.

OK, you probably already knew that. But this morning, I was looking at his ERA (1.70) and his FIP (1.89) and I thought: Wow, that’s probably pretty unusual for a pitcher to have his ERA and his FIP less than 2. I wonder who the last person to do that was. My guess was Kershaw himself last year.

But … no. Kershaw last year had a 1.83 ERA but his FIP was 2.39.

OK, so I thought a bit more. Well, obviously Pedro Martinez did it.

Nope. The two years Pedro’s ERA was less than 2.00 (1997 and 2000) his FIP was a little bit above 2.00. And the crazy years when his FIP was less than 2.00 (1999* and 2001 in limited innings) his ERA was a little bit above 2.00. Now, this was getting interesting.

*In 1999, as you probably remember, Pedro pitched 213.3 innings, struck out 313, walked 37 and allowed nine home runs. That was an utterly absurd raw FIP of minus-1.74 – the lowest raw FIP in baseball history by a long, long shot.

Pedro’s ERA that year was 2.07, which was utterly absurd in the 1999 American League.
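For anyone who wants to check that raw number, here is a rough sketch of the arithmetic in Python. It uses the usual FIP weights (13 for home runs, 3 for walks and hit batsmen, 2 for strikeouts) with no league constant added. The hit-by-pitch total of 9 is an assumption, since the post does not give it; the other inputs are the ones quoted above.

```python
def raw_fip(hr, bb, hbp, k, ip):
    """Raw FIP with no league constant: (13*HR + 3*(BB+HBP) - 2*K) / IP."""
    return (13 * hr + 3 * (bb + hbp) - 2 * k) / ip


# "213.3 innings" is baseball notation for 213 and 1/3 innings.
ip = 213 + 1 / 3

# Using only the numbers quoted in the post (hit batsmen ignored).
without_hbp = raw_fip(hr=9, bb=37, hbp=0, k=313, ip=ip)

# ASSUMPTION: 9 hit batsmen, a figure that is not stated in the post.
with_hbp = raw_fip(hr=9, bb=37, hbp=9, k=313, ip=ip)

print(round(without_hbp, 2))  # -1.87
print(round(with_hbp, 2))     # -1.74
```

With the assumed hit batsmen included, the result lines up with the minus-1.74 quoted above.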

If it wasn’t Pedro, it had to be Greg Maddux, right? Nope. Maddux never had a sub-2.00 FIP. Roger Clemens? Nope. Clemens never had a sub-2.00 FIP. Randy Johnson? Nope.

OK, well, then it had to be Dwight Gooden in the mid-1980s when he was basically unhittable. Except – no. Gooden is fascinating. His legendary season is 1985, when he went 24-4 with a 1.53 ERA and led the league in strikeouts. But, crazily, his best FIP season was actually 1984, when he struck out 276 in 218 innings and allowed just seven home runs all year. His total FIP that year, 1.69, is the second best since Deadball, only behind Pedro’s insane 1999 season.

So now what? Ron Guidry in 1978? Nope. His FIP was 2.19. Steve Carlton in 1972? Nope, just missed, his FIP was 2.01. Gaylord Perry in 1972? Nope. So how far back do you have to go anyway?

Well, you have to go back one more year – Tom Seaver in 1971. He had a 1.76 ERA – and thanks to a league leading 289 Ks, 61 walks and 18 homers allowed, he had a 1.93 FIP.

In total, Kershaw is on pace to become just the fifth pitcher since Deadball to have a sub-2.00 ERA and FIP. The previous four are all-time seasons:

Amazing stuff. Kershaw should become the first pitcher in more than 40 years to tilt ERA and FIP, only the fifth ever, a year up there in its own way with Gibson’s 1968 season. There is no shortage of ways to show just how awesome Clayton Kershaw is these days … but I like this one. Kershaw is dominant in old stats and in new ones. That could be what they mean by timeless.

Joe Posnanski writes about sports for a living, particularly baseball. Here, he writes about sports and also Springsteen, Hamilton, Harry Potter, iPads, infomercials, his idolization of Duane Kuiper, his family and magic. He lives in Charlotte with his wife Margo, two daughters Elizabeth and Katie, and their dog Westley. Joe is currently working on a book about Harry Houdini in today's world.

58 Responses to ERA, FIP and Kershaw

Maddux having never had a FIP below 2.00 (or even close to 2.00) seems to square with the perception that FIP isn’t reliable for certain pitchers, who don’t blow hitters away necessarily but control the ball well and change speeds to, theoretically, induce weak contact. In fact, Maddux’s FIP was well above his ERA for most of his prime.

That’s my problem with FIP: you often hear about “exceptions” to McCracken’s theory (Matt Cain? Many sinker ball pitchers? Every knuckleballer?). That makes it hard to use FIP as the end-all pitching stat, the way ERA has been used.

FIP really isn’t the end-all pitching stat. FIP is merely the number at the opposite end of a pitcher’s performance range from ERA. ERA gives pitchers all the credit/blame for BIP, while FIP gives them none. The truth is somewhere in between, but for most pitchers it’s closer to their FIP.

His career ERA is 3.16 and his FIP is 3.26. That works out to 55 runs (over 5000 innings). I don’t think it is inconceivable that Greg Maddux’s defense saved him 3 runs a year for 20 years relative to that of an average pitcher.
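A quick sketch of that conversion, assuming the gap between FIP and ERA is simply scaled from runs per nine innings to a career total. The function name is made up for illustration; the inputs are the rounded figures from the comment.

```python
def runs_gap(era, fip, innings):
    """Runs implied by the FIP-minus-ERA gap over a given number of innings."""
    return (fip - era) * innings / 9


career_gap = runs_gap(era=3.16, fip=3.26, innings=5000)
per_year = career_gap / 20  # spread over the 20 years used in the comment

print(round(career_gap, 1))  # 55.6
print(round(per_year, 1))    # 2.8
```

So the gap comes to a little under 3 runs a year, which is where the “3 runs a year” figure comes from.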

By my simple calculation Greg Maddux recorded a put out or assist on 11% of the balls in play (as estimated by taking Innings * (3+WHIP) – SO – BB – HR, which works out to 1740/15884).
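Here is a minimal sketch of that estimate. The idea is that innings times (3 + WHIP) approximates batters faced (three outs plus walks and hits per inning), and removing strikeouts, walks and home runs leaves balls in play. The single-season inputs below are hypothetical and only show the shape of the calculation; the 1740 and 15884 totals are the ones given in the comment.

```python
def estimated_balls_in_play(innings, whip, so, bb, hr):
    """Approximate balls in play: innings * (3 + WHIP) - SO - BB - HR."""
    return innings * (3 + whip) - so - bb - hr


# HYPOTHETICAL single-season line, for illustration only.
season_bip = estimated_balls_in_play(innings=200, whip=1.10, so=180, bb=40, hr=15)
print(round(season_bip))  # 585

# Share of balls in play fielded by the pitcher, using the comment's totals.
share = 1740 / 15884
print(round(share, 2))  # 0.11
```

It is a crude estimate (it ignores hit batsmen, errors and double plays), which is presumably why the comment calls it a simple calculation.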

Peers Roger Clemens and Randy Johnson did the same on 5.2% and 6.5%. Over 200 innings that means that Greg Maddux helped himself about 50 times more than Randy Johnson would have and about 40 times more than Roger Clemens. Fair to assume that added up to quite a few runs.

Now Randy Johnson’s ERA is 3.29 but his FIP is 3.19. That would imply that for the defense to explain the difference Greg Maddux would have to save 6 runs a year by fielding those 50 extra balls.

Roger Clemens’s ERA is 3.12 and his FIP is 3.09 which implies that Greg Maddux would have to save 4 runs a year by fielding those extra 40 balls.

PS See the comment in the Grantland article about how Mark Buehrle’s defense is more important relative to an average pitcher than Andrelton Simmons’s is relative to the average SS PER INNING fielded. The reason is not that Mark Buehrle is better than Simmons but that Buehrle’s comp group is other pitchers and Simmons’s is other SS’s.

While Maddux was supremely positioned after throwing the pitch, was really quick off the mound, was great at going up the ladder to snag choppers up the middle, and snagged everything that he could reach, it’s a stretch to say he snagged 40-50 extra balls/year…. In 32-34 starts that would make for almost 1.5 extra balls per game. That’s a lot, in that you’d expect that most fielders, especially pitchers would get enough non routine chances every game to make 1.5 excellent plays/game. That said, it never ceases to amaze me how out of position(often falling to one side and almost turning their back to the batter) and how poor many pitchers are at fielding balls hit up the middle. With some, any ball hit up the middle is an automatic hit. So, if that’s the baseline, maybe 1.5 is feasible. I know Randy Johnson was a terrible fielder and Clemens was much better at fielding broken bats than balls.

Well, then you have to look at the other factors. Let’s take, arbitrarily the 1993 season, a CY year for Maddux. 2.36 ERA, 2.85 FIP. So, FIP is saying essentially Maddux got half a run lower ERA based on factors out of his control. So, stadium. The Braves were still at Fulton County Stadium. Known as the “launching pad”, this was a hitters park. OK, must be defense. Outside of Mark Lemke and maybe Otis Nixon, plus Maddux himself, this was a group of slightly above avg, but not great defenders. Jeff Blauser, Terry Pendleton, Sid Bream, David Justice, Ron Gant and Damon Berryhill do not bring to mind Belanger, Brooks Robinson, Paul Blair type of defense.

So, what factor is it that accounts for that half a run difference? Was the sun in the opposing batters’ eyes? No, most likely, FIP gives you averages that are largely true if you look at most individual players. But there are always variations that averages do not catch…. Which is why statisticians use standard deviations to track variation. Half a run could be within the normal range of expectations, meaning “close enough,” or it could be an outlier which would require more analysis or be dismissed altogether as an anomaly.

My theory would be that Maddux’s great control and late movement could have caused more weakly hit, easily fieldable balls. But someone would really need to analyze every ball in play to see if, in fact, that is true. But you can’t state that extraordinary defense or very favorable park effects account for the difference in ERA and FIP.

So, you’re going to toss it off to luck? That’s kind of a lazy analysis. His WHIP was 1.049, and he only gave up 14 HRs in 267 innings. So there weren’t a lot of base runners to “sequence”.

In 1994, Maddux only gave up 4 HRs, walked 31 and had a .896 WHIP in 202 innings. ERA 1.56 FIP 2.39. That is .83 difference….I guess because he “only” had 156 Ks.

I think Maddux breaks the calculation. Like with WAR, there are some who have numbers that defy common sense.

I think the “weighting” of these “indexed” type numbers are always suspect. Those who created them tweaked them until they made sense, on average. But they are never going to be completely accurate for everyone in every situation. But they are good in the context of all the other numbers that are available.

One problem: if FIP’s constant is based on the run environment, wouldn’t it be easier to post a sub 2 FIP in the current year when scoring’s down than it was in the sillyball era? Doesn’t that make Kershaw’s achievement less impressive than it sounds?

Kershaw is currently 210 so pretty good. Of course nothing compares to Pedro’s 7 year run

219, 163, 243, 291, 188, 202 and 211. The only comparable pitching achievement to Pedro’s run is Mariano Rivera’s post season stat for which there is no ERA+ recorded and I can’t be bothered to calculate all the runs scored in all those postseasons to calculate it myself.

That’s true. FIP-, which is park- and league-adjusted, accounts for that. 100 is league average, below 100 is above-average (and better and better as it gets smaller and smaller), and above 100 is below-average.

As an example, in 2000 Martinez posted a 2.17 FIP and a 46 FIP-. This year, Kershaw is posting a 1.89 FIP, but a 52 FIP-. Relative to the league, Kershaw isn’t quite as dominant. (Martinez’s 1999 1.39 FIP was good for a 30 FIP-, which it may not surprise you to learn is the best season of all time, by far.)

It’s a bit more complicated. Per fangraphs, the FIP constant is equal to:

FIP Constant = lgERA – (((13*lgHR)+(3*(lgBB+lgHBP))-(2*lgK))/lgIP)

It’s designed to make it so the league ERA and league FIP are the same, so it’s just equal to the difference between the league’s ERA and unadjusted FIP.
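To make that concrete, here is a small sketch of the constant, following the FanGraphs formula quoted above. The league totals are hypothetical round numbers, not real data; the point is only that once the constant is added, league FIP lands exactly on league ERA.

```python
def raw_fip(hr, bb, hbp, k, ip):
    """Unadjusted FIP: (13*HR + 3*(BB+HBP) - 2*K) / IP."""
    return (13 * hr + 3 * (bb + hbp) - 2 * k) / ip


def fip_constant(lg_era, lg_hr, lg_bb, lg_hbp, lg_k, lg_ip):
    """League ERA minus the league's unadjusted FIP."""
    return lg_era - raw_fip(lg_hr, lg_bb, lg_hbp, lg_k, lg_ip)


def fip(hr, bb, hbp, k, ip, constant):
    """FIP on the ERA scale: unadjusted FIP plus the league constant."""
    return raw_fip(hr, bb, hbp, k, ip) + constant


# HYPOTHETICAL league totals, chosen only to be plausible-looking.
lg = dict(lg_era=3.75, lg_hr=4200, lg_bb=14000, lg_hbp=1600, lg_k=37000, lg_ip=43500)

constant = fip_constant(**lg)
league_fip = fip(lg["lg_hr"], lg["lg_bb"], lg["lg_hbp"], lg["lg_k"], lg["lg_ip"], constant)

print(round(constant, 2))
print(round(league_fip, 2))  # matches lg_era by construction
```

Because the constant is defined as a difference, it depends on how many runs score on everything FIP ignores, not directly on whether scoring overall is high or low.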

The constant for 2014 is (very slightly) HIGHER than usual, so you could say that in a way it makes Kershaw’s accomplishment (very slightly) MORE impressive – though really, we’d have to take into account the quality of the Dodgers’ defense relative to the quality of the league.

If I’m following correctly, a high FIP constant reflects a larger than average number of runs generated by balls in play, for the league as a whole. I’d be interested in trying to think about the way it’s fluctuated over time. I don’t think its peaks correlate with high run scoring environments — see the chart for cFIP here:

It is not necessarily “control” but rather being solely responsible for the outcome (until they figure out a way to assign blame between the pitcher and the catcher for good and bad calls). Of course these days the data on framing more or less throw out the assumption that a pitcher is solely responsible for BB’s and SO’s, so FIP is not as clear cut as it used to be.

Except not always. How many times a year do you see jam-shot home runs? Home runs where the batter was way out in front or late?

How many times did Vlad Guerrero or Berra hit home runs on pitches they had no business even swinging at?

It’s not super common, but if Johan Santana throws a diving change two inches outside and four inches above the ground and Vlad Guerrero homers on it, that says nothing about his talent or even the quality of that particular pitch.

In Yankee Stadium it’s a weekly occurrence for guys to hit good pitchers’ pitches for home runs.

Occasionally right-handed batters will pop up to the stands over the right field fence.

I like FIP for estimating how a guy will do going forward, but all that really matters is how many runs score.

For judging each individual season ERA+ or better yet RA+ is a more useful stat.

FIP is great for finding guys who underperformed their ability and for appreciating truly great power pitchers.

Of course FIP does ignore balls that are not HRs and that good or bad defense plays no role in. This only occurs in some parks, but it must play havoc with FIP in Fenway Park, for instance, where a number of balls are hit in a place where the fielder cannot possibly catch the ball, but it isn’t a home run.

When a stat is useful in the evaluation of certain types of pitchers but not others, then that diminishes its value. I would much rather have a pitcher who regularly induces weak groundballs than one who has high strikeout totals. The former pitcher throws fewer pitches per plate appearance, and, in today’s environment especially, pitches longer into games. That, in turn, saves wear and tear on bullpens.

Actually it’s been shown that strikeout pitchers ultimately throw fewer pitches than non-strike out pitchers, because they allow fewer men on base. Groundball/non-K pitchers are going to inevitably allow more hits, and thus have to face more batters, which raises the pitch count.

While pitchers don’t have anything approaching complete control over batted balls, they do have some input on the trajectory – the idea of a “groundball pitcher” is borne out in statistical results. Flyball pitchers are more likely to give up home runs, and FIP rightly judges them more harshly for that (although it does not make any underlying correction to the fact that grounders have a better average BABIP than flies).

xFIP attempts to further account for this, by replacing home runs allowed with fly ball rate, multiplied by the league-average HR/FB rate. The intent of that is to limit the effect of home park on the pitcher’s FIP – Kershaw, for example, has a better FIP than xFIP every single year because Dodger Stadium is a bad hitters’ park. On the other hand, guys who play in stadiums with short fences will generally have better xFIP than FIP.
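A rough sketch of the difference between the two, assuming xFIP swaps actual home runs for fly balls allowed times a league-average HR/FB rate and otherwise keeps the FIP formula. The pitcher line, the 10% league rate and the 3.10 constant are all hypothetical values picked for illustration.

```python
def fip(hr, bb, hbp, k, ip, constant):
    """FIP: (13*HR + 3*(BB+HBP) - 2*K) / IP + constant."""
    return (13 * hr + 3 * (bb + hbp) - 2 * k) / ip + constant


def xfip(fly_balls, lg_hr_per_fb, bb, hbp, k, ip, constant):
    """xFIP: same as FIP, but with expected home runs from fly balls allowed."""
    expected_hr = fly_balls * lg_hr_per_fb
    return fip(expected_hr, bb, hbp, k, ip, constant)


# HYPOTHETICAL pitcher who allows fewer homers than his fly balls would suggest.
line = dict(bb=45, hbp=5, k=220, ip=200, constant=3.10)
actual = fip(hr=12, **line)
expected = xfip(fly_balls=200, lg_hr_per_fb=0.10, **line)

print(round(actual, 2))    # 2.43
print(round(expected, 2))  # 2.95
```

In this made-up case the pitcher gives up 12 homers where the league rate would predict 20, so his FIP looks better than his xFIP, the same pattern described for Kershaw at Dodger Stadium.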

And that’s why I have a real problem with xFIP. Why on earth should we expect a pitcher’s HR/FB rate to naturally regress? Guys who are terrible allow far higher percentages of their batted balls to be crushed.

xFIP is a perfect example of a stat that is really only useful for comparing a pitcher to himself. If a guy has a season with a markedly different xFIP, that could possibly mean he over- or underperformed due to luck on fly balls.

It could also mean he just stinks now.

xFIP has very little practical use, and even worse, there are a bunch of people running around using it like it’s a telling stat. The existence of stats like xFIP gives people easy access to numbers to back up specious arguments.

Not the stat’s fault, but xFIP is the most commonly abused pitching stat around, imo.

After all this fapping about FIP, there’s one aspect of Clayton Kershaw’s career that leaves him in the dust behind guys like Gibson and Koufax: he’s been lousy in the post-season, and the bigger the game, the worse he’s performed.

Kershaw has been having a career year thus far, so it will be interesting to see what he does in this post-season, should the Dodgers reach it, which appears likely. There was one game that Kershaw pitched in so far that had a post-season feel to it, a match-up against the rival Los Angeles Angels, coming at a time when both clubs were leading their division and playing great baseball. For whatever reason, Kershaw was tight in that game, struggling early, and he was lucky to escape with a no-decision.

If the Dodgers want to win the World Series, they need Kershaw to channel his inner Koufax or Hershiser and dominate, and not continue in the footsteps of Don Newcombe, another Dodger great with a poor post-season record. In 1956, Newcombe had a career year, winning both the Cy Young and MVP (as Kershaw threatens to do this year). In the World Series, he went 4.2 innings and gave up 11 runs, all earned. I have no clue what his FIP was.

Perhaps, if the Dodgers want to win in the postseason with Clayton Kershaw, they should try something crazy like, say, scoring when he pitches. Crazy idea, I know. But maybe their 2013 NLCS strategy of “Clayton Kershaw can allow negative runs, so our hitters can just take today off” could use some tinkering. Because last I checked, it’s impossible to win a game when you’re shut out.

Kershaw’s actually allowed two or fewer runs in four of his six postseason starts. But hey, since those other two starts represent less than 1% of his career, we should totally focus on those.

In only one of those “quality” starts did he even go 7 innings (6.2, 7, 6 and 6). And in the other 2 of his 6 post-season starts, he allowed 5 runs in 4.2 innings, and 7 runs in 4 innings. He also allowed runs in 2 of his 3 relief appearances. His total post-season ERA thus far is an ugly 4.23. Sandy Koufax he ain’t.

I’m always amused by people who point to the small sample size of the post-season as a reason to dismiss the results. Performing well in these do or die situations is the whole point of the playoffs. When Kershaw had his meltdown in the elimination game against the Cardinals last year, he lost the luxury of coming back to get them next time. Maybe his two disastrous outings in October represent only 1% of his total starts, but they represent 33% of his post-season starts, and for a pitcher of Kershaw’s caliber, that leaves a black mark.

If the best Kershaw can do this post-season is go 6 innings, give up 2 runs and hand it off to the Dodgers’ suspect bullpen, the Dodgers don’t have much of a chance. They need Kershaw to dominate from start to finish. Maybe he can. Maybe he can’t. That’s what makes it interesting.

Post Season:
==========
Kershaw pitched 2 innings in 2008: 1 run in 2 innings. This was before he discovered his slider. His rookie year.

In 2009 he allowed 2 ERs in 6.2 innings in the division series. And then he got mauled by the Phillies.

In 2013 he pitched twice in the division series. The second game on short rest: 3 runs; 1 earned in 13 innings. Low innings because Mattingly was trying not to overwork him. But he did. His pitches against St. Louis were elevated and St Louis mauled him.

So his post season average looks terrible. I hope Mattingly will not let him do that again. But he was good for a series in 2009 and brilliant for a series in 2013.

Yes, although the difference between ERA and RA is almost always so small it doesn’t really matter. So one may as well use the stat that most people are familiar with. The modern game features very few errors, and in fact errors get less and less common every generation. Occasionally a guy will have some significant difference but it’s rare.

Of course it should be RA for relievers because a couple of unearned runs can skew things in tiny samples.

Then again, despite my preference for ERA over FIP for starters, when it comes to relievers FIP seems a much better tool to me.
First off, high strikeout relievers are important.

Second the samples are so small one bad outing can make an amazing year look pedestrian.

Although to argue with myself Mariano Rivera may very well be the all time poster child for pitchers in fact having control over the type of contact produced.

The mountain of broken bats, the good but not great K numbers, and the ridiculous amount of weak grounders over two decades are proof that certain pitchers (ones with late breaking two seamers or cutters or knuckleballs, for instance) can in fact consistently produce weak contact on balls in play.

Certainly Rivera had a major part in all those broken-bat flares and pop ups.

Depends on the pitcher. Over the long haul the difference between ERA and FIP is usually like .02, or on the extreme side .15 or so. So in reality most guys have FIPs that are pretty spot on with their ERA.

The only problem is that there are more than a few guys where FIP is suddenly massively different. Like .50 up or down.

There are always some decent control bad command power pitchers who look like world beaters with FIP but not era. These are always the guys you hear announcers talk about “not living up to their stuff”.

Then you’ve got guys who rely on late movement but aren’t big strikeout guys, who for whatever reason consistently do way better with ERA than FIP. Old stats guys call them “crafty,” new stat guys call them “lucky.” It’s almost certainly a bit of both.

But really the argument around these stats is always about like five guys. Most pitchers are pretty darn close to their FIP

It works because it is simple, and it takes the two parts of McCracken’s accidental discovery that make the most sense – outs without a chance of getting a hit, players on base without a chance of getting an out.

The original formula grossly overvalued the home run rate, and the next generation tried to account for regression and/or park effects to reduce that – essentially saying that the home run rate was less under a pitcher’s “control” than we first thought.

Later more complicated versions tried to account for LD% (and other types of batted ball percentages), and while they are on the right track, there is currently no really accurate measurement of line drives, and the different methods produce drastically different results. Many of these are called “ERA estimators” which is an apt name, as the multipliers used with the different variables mostly have the purpose of getting their stat to match ERA over a large sample size. Results oriented, in other words.

I prefer to look at the simple rates of K’s and walks, rather than a contrived metric whose purpose is to match the very ERA number it is somewhat trying to debunk. I also think extra base hits are a lot closer in luck factor to home runs than they are to singles, especially when adjusted for park effects, and they are ignored by all of the metrics.

To be honest, I even have a caveat for looking at K rates, as I think the pitchers that get their K’s more through power than movement have a tendency to give up more solid contact when they do get hit, and therefore higher BABIPs and slugging against, but quantifying that is a more complex problem, even with the advent of PITCHf/x and hit trajectory and speed data. It is certainly beyond me. There is plenty of evidence there but it will take someone smarter than I am to catalog it.

In the end, I think McCracken stumbled onto something real, but I think we have further muddled it since then by trying to bend our efforts to match ERA (even if we were trying to match, it should be matched to runs allowed per nine, rather than ERA), and done a disservice by taking the leap that the pitcher controls absolutely nothing else. K rate and walk rate are obviously the two most important factors, but sticking our heads in the sand about other factors that are real, though perhaps not as important, has stunted progress in this area.

And that’s why all the defensive metrics and the fielding independent metrics are not perfect. If (hopefully when) hitfx goes public we will finally be able to see who is generating weak contact consistently. We will know who is allowing weak grounders and who is allowing scalding ground balls.

And without hitfx unfortunately we have to take all defensive metrics with a mountain of salt.

Which is why I value guys by taking offensive WAR and half of defensive WAR.

While FIP is probably a better indicator of quality, ERA is a better measure of success. I think BABIP and FIP are great tools when assessing player value going forward. They should absolutely be key tools when mapping your roster, but I don’t think they should be predominant factors when discussing individual awards. Success via luck is still success. If a batter hits a 10-bouncer for a seeing-eye single, it is still a hit. A 410 foot drive caught at the wall is still an out. For batters, we seem to rely on those evening out over the course of a season. If a batter encounters an enormous amount of good luck over the course of a season, it still “counts”. With pitchers, on the other hand, we seem determined to find out who was lucky, as opposed to letting success/failure stand on its own terms.

Over 600-700 at bats, or 200 innings, you get a pretty good sample size. So “luck” would have zero to do with results. Unless you consider luck to mean having Andrelton Simmons at SS or playing home games at Petco. You can point to defense or park effects as factors, and many do that. But just because those are out of the pitcher’s control, that doesn’t equate them with luck.

No. Luck still has a LOT to do with results. Go home, flip a coin 600 times, and tell me that you got ‘heads’ 300 times. I’m willing to bet that won’t happen. Luck, chance, happenstance – whatever you want to call it – has a TREMENDOUS impact on statistics, even over a full season’s sample.
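The coin-flip bet is easy to check. Here is a small sketch, assuming a fair coin and independent flips, that computes the exact binomial probability of landing on 300 heads in 600 flips, along with the chance of finishing within 10 of 300. The function name and the plus-or-minus 10 window are choices made for illustration.

```python
from math import comb


def prob_heads_between(n, low, high):
    """Probability that a fair coin shows between low and high heads (inclusive) in n flips."""
    favorable = sum(comb(n, k) for k in range(low, high + 1))
    return favorable / 2**n


exactly_300 = prob_heads_between(600, 300, 300)
within_10 = prob_heads_between(600, 290, 310)

print(round(exactly_300, 3))  # 0.033
print(round(within_10, 2))
```

Exactly 300 heads comes up only about 3% of the time, so the bet is a safe one. Swings of ten or more in either direction are routine even at that sample size, which is the commenter’s point about a season of at bats.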