Wednesday, September 26, 2012

This is no knock on Cabrera or the Triple Crown. Both are impressive. It’s just that whether Cabrera wins the Triple Crown or not has absolutely no bearing on the MVP. Look at it this way: Josh Hamilton hit his 43rd home run last night, breaking a tie for the home run lead with Cabrera. Does a home run by a Texas Ranger (or, to flip it, blurry vision by that same Ranger that kept him out of the lineup over the weekend) decide whether a Tiger or an Angel wins the MVP? Is Cabrera less worthy because Hamilton hit a home run?

The Triple Crown is cool in the way that typewriters, rotary phones or vinyl records are cool: There’s an undeniable acquired history there, and it’s been so long since most people have seen one that now there is great novelty to it. But the truth in all cases is that we know better now.

The Triple Crown is an interesting relic from baseball cards and Sunday morning newspapers, which used to be the way people found out batting averages of major league players. Actually, it wasn’t until the 1940s that the Triple Crown held much resonance in baseball at all. RBIs did not become an official statistic until 1920. The home run didn’t become a regular staple of the game until Babe Ruth popularized it in the late 1920s. In the days of Cobb, people thought batting average, hits and runs were the most important offensive categories.

...The bottom line is that the AL MVP should be . . . too close to call right now. Trout did enter the month with a huge gap in value, but Cabrera has closed it considerably. If you believe in impacting games in more ways more often than anybody else, Trout is your MVP. If you believe in the greater volume of hitting stats, particularly down the stretch, Cabrera is your man. And if you believe in the Triple Crown defining the MVP, sit down with a big hunk of blueberry pie and hope the idea passes.

Reader Comments and Retorts

It really didn't take that long - for selected players. Rogers Hornsby, for example, got Ruth's idea in 1920, which is a lot of why his early 1920s seasons are so ungodly good. By 1922, Ken Williams and George Sisler had it. There were other players on other teams. But the Yankees still tried to trade Lou Gehrig, and got turned down, which a thoroughly-indoctrinated management (on either side of the trade) would surely never have done. John McGraw over in NY never really bought into the homer thing; that took Bill Terry. Overall, it took heavier and harder in the AL, which is part of why the AL was stronger than the NL in the 20s and 30s. (The other main factor was that the AL was the first to really recruit in the Deep South, and so they got the Ty Cobbs and Eddie Cicottes and Tris Speakers and Joe Jacksons. When the AL started, they were desperate to establish themselves as major league teams ASAP, so they looked for untapped sources of talent, and found an unmined Confederacy. Now, if they had also agreed to recruit those southerners with those really deep tans....) But many teams still relied on the slap single, the hit and run, the sac bunt and the stolen base for offense. That's the big story of the 1927 series. It was a true test of offensive philosophies. The Pirates had a total slap and go offense, with people like Glenn Wright and Pie Traynor hitting fourth and fifth behind the Waners (which is why Traynor has a surprising number of RBI). The Yanks had Earle Combs and a bunch of guys who could hit a baseball 400 feet. The homer won. - Brock Hanke

I do have to say it's surprising that RBIs are SO irrelevant nowadays that the Triple Crown isn't a bigger deal. I don't remember a serious Triple Crown chase in my period of baseball fandom (1998-present), so had expected it to be something everyone was hoping for with bated breath.

The first season I ever followed was 1967, and there had just been a Triple Crown in '66 as well, so I got the initial impression that it was sure to happen again soon. One of those things I'm still waiting for, like that Academy Award Peter O'Toole is fixing to win any year now.

4 - But so are an individual's run scored, and Mike Trout is getting very little traction as the AL leader in that category. Shouldn't he get the same as Miguel Cabrera is for leading the league in RBI? Anybody who values those two stats as pretty much equal is valuing them correctly. Sure, some people here might be undervaluing RBI, but most of the backlash is against people who treat RBI like the ace of spades, and runs scored as the six of hearts.

Definitely, Runs are just as important.

I'm just reacting against the tendency to prefer theoretical value over actual value delivered.

Projecting player performance/establishing "true talent" is not the purpose of baseball.

This. Actually, it's not even a purpose of baseball. It's at most a baseball-related activity or something that might be able to be accomplished from the data generated by baseball -- much as fantasy or rotisserie baseball are.

#7 The problem is that raw rbis are heavily influenced by factors that have zip to do with ability (specifically batting order position and the OBP of the 3 players in front of him).

I'm open to the argument that the difference between predicted rbi and actual rbi has value (though you'd want to use a sophisticated rbi estimator -- Tom Ruane has a good one) but rbi in itself has essentially no value if you know a player's SLG. A player's rbi total is an extremely predictable product of opportunity (which a player has no control of) and power (most simply represented by SLG)

How is an RBI not something that records value provided on the baseball field? There might be better indicators of value, and the RBI may be dependent on existing game states, but how does the RBI not pass at least that important threshold?

NOTE: This isn't being asked with any misunderstanding of the effort to tease out "true talent" and repeatability. It's being asked knowing all that, yet entirely on its merits.

To add to Brock's piece of history - the prevailing wisdom of the time is seen in the oft-told and I presume completely true anecdotes about Pirate fans making bets with Yankee fans about how the Waner brothers would "outhit" Ruth and Gehrig in the '27 Series. Which they did. Of course the fact that the Gehrig-Ruth pair drove in 11 runs in the 4 game sweep to the Waners' 3 calls into question the importance of the esteemed stat of batting average....

I believe RBI Percentage stat (RBIP) will (and should) become the statistic Sabermetric types use to "compromise" with old-school fans, a polite way of bringing them into the fold of their thought process. I'm betting this becomes a primary topic of chatter in 2013.

4 - But so are an individual's run scored, and Mike Trout is getting very little traction as the AL leader in that category. Shouldn't he get the same as Miguel Cabrera is for leading the league in RBI? Anybody who values those two stats as pretty much equal is valuing them correctly. Sure, some people here might be undervaluing RBI, but most of the backlash is against people who treat RBI like the ace of spades, and runs scored as the six of hearts.

Miguel Cabrera is 2nd in runs scored, so it isn't like he is some slouch there either.

How is an RBI not something that records value provided on the baseball field

You are conflating different values. Team RBI record value to the team winning games, but an individual's RBI doesn't really record an individual's value as well as we can possibly do. If you just want to say that Cabrera was in the middle of a lot of run scoring that also involved good hitters like Jackson, Dirks and Fielder, then fine, but you haven't done much to isolate Cabrera's value.

15 - Saying Trout should be getting more credit for his counting stats should not be taken as a denigration of Cabrera in any way. This is a kind of, make that really, annoying thing that has really taken off recently. Saying player A should get a lot of credit for something has nothing to do with player B.

#11 I'm assuming you're responding to me. I'll repeat, raw RBI totals are of no importance if you know a player's SLG.

If you're making some form of clutch argument then what you need to do is account for his opportunities (not merely his at bats with runners on -- though this works reasonably well). My (quick and dirty) rbi estimator predicts about 127 RBI for a player with a BA of .329, a SLG of .609 and 284 AB with runners on base.

The formula for estimated RBI is ABROB*(SLG*1.09-BA*.66), where ABROB is at bats with runners on base. (Wouldn't be surprised if it works less well these days -- it's approaching 2 decades old.)

It also predicts 66 RBI for Trout. He's +12. While Cabrera has significantly more power, he's also got 69% more at bats with runners on and that's a big part of his RBI edge.

EDIT: A reasonable clutch credit is (Actual RBI - Predicted RBI) / 3 (But you'd really want to take a good look at the distribution of baserunners)

Now I know that the primary source of error in the estimate is the distribution of baserunners. Which is why if you want real precision you want to go with Tom's (formula available in his paper at Retrosheet)
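For anyone who wants to check the arithmetic, here is a minimal Python sketch of the quick-and-dirty estimator and the clutch credit as stated in this post. The function and argument names are illustrative assumptions; the coefficients and the inputs are the ones quoted above. It is not Tom Ruane's estimator, which accounts for the distribution of baserunners.

```python
def estimated_rbi(ab_rob, slg, ba):
    """Quick-and-dirty RBI estimate from the post: ABROB * (SLG*1.09 - BA*.66).

    ab_rob is at bats with runners on base (the post's ABROB).
    """
    return ab_rob * (slg * 1.09 - ba * 0.66)


def clutch_credit(actual_rbi, predicted_rbi):
    """Clutch credit suggested in the post: (actual - predicted) / 3."""
    return (actual_rbi - predicted_rbi) / 3


# Cabrera's line as quoted above: .329 BA, .609 SLG, 284 AB with runners on.
cabrera_estimate = estimated_rbi(284, 0.609, 0.329)
print(round(cabrera_estimate))  # 127, matching the figure in the post

# Trout is quoted as +12 against a predicted 66, i.e. 78 actual RBI.
print(clutch_credit(66 + 12, 66))  # 4.0
```

On these figures Trout's +12 works out to about four runs of clutch credit, before any closer look at the distribution of baserunners.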

A player's rbi total is an extremely predictable product of opportunity (which a player has no control of) and power (most simply represented by SLG

RBI should track better to total bases than SLG. Bonds and Mantle had extraordinarily high SLG, but lower RBI totals because of their unusual BB rates. Brainless MSM criticism of especially those two players -- e.g. "he's not as good a run-producer as [player X]" -- is one of the origins of the backlash against RBI among BTF denizens.

You are conflating different values. Team RBI record value to the team winning games, but an individual's RBI doesn't really record an individual's value as well as we can possibly do.

Look, this is pretty simple. RBI are good for players to get. More RBI are better than less RBI. Hence, value.

Telling us that it doesn't record "an individual's value as well as we can possibly do" is both missing the point entirely and expressing a stathead truism that we have all known for decades. And you could make the same comment about any number of counting stats.

If you're making some form of clutch argument then what you need to do is account for his opportunities

My argument is far simpler and more fundamental than this. It's that RBIs measure valuable occurrences in a baseball game.(*) No more, no less.

Our quibble is with the quibbling with this seemingly straightforward observation.

(*) Miguel Cabrera comes up with a runner on 3rd and two outs. Instead of making an out, he gets a single and a run scores. The fact that his single caused a run to score -- as opposed to not causing a run to score -- should be noted and recorded for posterity. That's what the RBI does.

#17 There was a study in rec.sport.baseball (by Arne Olsen IIRC) about what he called rbi vultures. Long and short: guys whose skill sets are optimized for driving in runs (relatively few walks, good ISO) don't produce extra value for a team. They tend to drive in a lot of runs and create relatively few opportunities for others in the lineup.

One example that I've used with some success in showing how strange RBI can be is to look at Hack Wilson and the 1930 Cubs.

Assume Rogers Hornsby was healthy and played as well as he had in 1929. He'd have batted 3rd and Cuyler would have moved to bat 6th (you might think they'd have gone English, Cuyler, Hornsby, Wilson. Not a chance. Cuyler was with the Cubs because of a bitter battle on the Pirates about batting him second. No chance McCarthy goes there -- as you can see from the 1929 batting orders) Now the team would obviously have scored a lot more runs. And yet Wilson would have driven in fewer. Yes, Hornsby had a higher OBP than Cuyler but it's almost perfectly explained by the difference in home run power.

So Hornsby is going to drive in more runs than Cuyler and be on base about the same number of times (Cuyler was the fastest guy in the league and the speed difference might also cost a few RBI). Wilson's no worse. Team scores more runs but most of the difference goes in the #3 and #6 spots in the batting order. (#7 and #8 also pick up some extras)

A run scoring, as the Bear says, is pretty important in a baseball game.

Most of the time, a guy who has lots of RBI will also have lots of TB, and a high SLG, and many other indicators that show he's a good hitter. The interesting cases are the ones that don't align. Jeff Francoeur drives in 103 runs in 2007 despite being a pretty weak hitter by any other measure. Hanley Ramirez drives in 67 runs in 2008 despite being an awesome hitter.

As RonJ2 points out, most of the time there are obvious contexts that account for this; you could switch the two players' contexts even-up and the result would be a hilarious discrepancy, Hanley driving in 135 and Frenchy driving in 35, or something like that. And so the more interesting cases are those where the discrepancy in actual runs crossing the plate on somebody's hits is less a factor of context, and more of clutchness over a single season. Those are the most interesting cases of all, but the effects are relatively small. I do think clutchness over a season, if on a marked scale, should be taken into consideration in MVP voting, but the fact that I can't easily point to a case where it would be a big factor may be telling. Someone else might come up with a good one.

Basically, RBIs are very important, usually tell us something about a player's contribution that we know as well or better from other measures, and occasionally are very misleading. I imagine you'd accept that, Bear.

@ 7: but RBI are not a reflection of clutch hitting. They are more a reflection of opportunity and HR power, not that HR power is a bad thing. What I mean is if you look at almost any good hitter and see how many RBI he picked up, less HR, when he had an opportunity, say with RISP, you will see they all pick up the RBI about the same percentage of the time. Just because one of them hit 4 or 5 in the lineup and had more opportunities does not make him clutch or clutchier because he had more RBI overall.

What I mean is if you look at almost any good hitter and see how many RBI he picked up, less HR, when he had an opportunity, say with RISP, you will see they all pick up the RBI about the same percentage of the time.

Robinson Cano that show's he's been about 18 runs worse than his context neutral production

Yes, that would be the kind of season I was imagining in #23 (but with negative implications for the hitter). Cano is 8th in the AL this year in Runs Created (to grab a handy less-context-laden stat just for argument's sake), but 22nd in RBI. Unlike Trout (2nd in RC, 24th in RBI), Cano hasn't been a leadoff hitter; he's been batting 3rd or 4th for a high-scoring team. Absent other explanatory factors, the discrepancy becomes interesting.

the fact that I can't easily point to a case where it would be a big factor may be telling

Chipper's MVP in 1999 perhaps. I think he deserved it anyway but it would have been a very close vote if he hadn't smacked around the Mets in key games.

As a reminder of how the world has changed, Chipper had a 1074 OPS that year -- good for only 3rd in the NL.

However the fancy stats suggest Chipper wasn't hugely clutch that year* -- just 10th in the league in WPA, etc. -- and that Bagwell was very clutch, so fancy stats might give it to Bagwell. Or Bonds of course.

Man, I don't think I'll ever understand WPA (don't bother explaining it). Bonds had only 3.6 WAR that year, 4 oWAR. But he has nearly 8 WPA. How can you add 8 wins but be only 4 wins above replacement? (No, a replacement player doesn't get 4 WPA; Chipper was 10th in the league with 4 WPA.)

*Which may be true. He got the clutch rep with some big hits in Sept; he might have been un-clutch up until then and nobody would have noticed.

Ron, I don't think there's any question that your concepts are generally correct. It's just that the significance and the frequency of the top-end distortion possible in RBI has been exaggerated by sabe types to an extent roughly equivalent to the overvaluing that the RBI has sometimes gotten in MVP voting. Just as two wrongs rarely (if ever) make a right, two overstatements do not create a balanced perspective. The fact that MSM guys are following in the footsteps of the 1997 model of BP is, needless to say, not particularly encouraging.

However, the vast majority of guys with 100+ RBIs who have a .400+ RBI/TB ratio are guys having valuable seasons (of those in that category from 1961 to the present, 70% have an OPS+ of 120 or higher; only 4% of those guys--with good ol' Joe Carter being in that group twice--had OPS+ below 100).

As for your Cuyler/Hornsby-Wilson example, it's a little bit thick to be dissing the RBI with a guy who drove in 158 runs the year before he set the record! Wilson hit 17 more HRs in 1930 than in 1929; the guy batting #2 for the Cubs (Woody English) had a much better year in '30 than he did in '29, and the Cubs' #2 slot got on base a LOT more in '30 (.425 OBP as opposed to just .354 in '29). I just don't see this being much of an argument for RBI vulturing. Now, Maurice Van Robays (116 on just 230 TB and a 95 OPS+ for the Pirates in '40)--yes, absolutely. If Hornsby had stayed healthy in '30, the Cubs would have scored another 40-50 runs and some of those would have trickled to Hack--blending all of the effects, he might have only driven in.....180.
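The RBI/TB screen used in this comment is easy to reproduce. A minimal Python sketch follows; the helper names and default thresholds are illustrative, taken from the "100+ RBIs" and ".400+ RBI/TB" cut-offs above, and the only data point is the Van Robays line quoted in the comment.

```python
def rbi_per_tb(rbi, total_bases):
    """RBI divided by total bases."""
    return rbi / total_bases


def fits_profile(rbi, total_bases, min_rbi=100, min_ratio=0.400):
    """True for a 100+ RBI season with an RBI/TB ratio of .400 or higher."""
    return rbi >= min_rbi and rbi_per_tb(rbi, total_bases) >= min_ratio


# Maurice Van Robays, 1940, as quoted above: 116 RBI on 230 TB.
print(f"{rbi_per_tb(116, 230):.3f}")  # 0.504
print(fits_profile(116, 230))         # True
```

A ratio above .500 on a 95 OPS+ is the kind of outlier the comment treats as genuine vulturing. The 70% and 4% figures for the wider group would need the full season data, which is not reproduced here.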

Bob, Cano's conundrum is very easily explained by his performance with RISP. In '11, 1.009 OPS, .636 SLG, 91 RBI; in '12, .747 OPS, .373 SLG, 41 RBI--in about the same number of PAs. He's also walking a lot more in these situations, which would drag down RBI as well. The Yanks #2 spot (Granderson/Swisher) has vultured RBIs this year.

In 2001, Ichiro's MVP case was bolstered by his exceptional RISP numbers. His splits that year are remarkable: .445 AVG with RISP, .420 with men on base, and a "paltry" .313 without. I remember arguing the case on BaseballBoards.com (I supported Giambi or ARod or someone else) with some mopes that believed that he was more or less unstoppable in clutch situations.

IIRC through his first three or four years in the majors Ichiro was hitting something like .380 with RISP. (Too lazy to do the math)

[23] Most of the time, a guy who has lots of RBI will also have lots of TB, and a high SLG, and many other indicators that show he's a good hitter. The interesting cases are the ones that don't align. Jeff Francoeur drives in 103 runs in 2007 despite being a pretty weak hitter by any other measure. Hanley Ramirez drives in 67 runs in 2008 despite being an awesome hitter.

As RonJ2 points out, most of the time there are obvious contexts that account for this; you could switch the two players' contexts even-up and the result would be a hilarious discrepancy, Hanley driving in 135 and Frenchy driving in 35, or something like that.

Half of the RISP states include a runner on first base. So, by using RISP, one does not remove the effect of runners on first or the concerns about different mixes of baserunners for different hitters. I would not be surprised if runners on first base comprise nearly 50% of runners on in RISP situations.
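The "half of the RISP states" count can be checked by enumerating the eight base states. A short Python sketch, with variable names of my own choosing:

```python
from itertools import product

# Each state is (first, second, third); True means the base is occupied.
states = list(product([False, True], repeat=3))

# RISP: a runner on second and/or third.
risp_states = [s for s in states if s[1] or s[2]]
risp_with_first = [s for s in risp_states if s[0]]

print(len(risp_states))      # 6
print(len(risp_with_first))  # 3
```

This only counts states. How often each state actually occurs, and so what share of runners in RISP situations are standing on first, is an empirical question the enumeration does not answer.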

Using RISP is going to improve the absolute RBI Pct for hitters. Will it improve the relative ranking of a low overall RBI pct hitter? Probably not.

#7 The problem is that raw rbis are heavily influenced by factors that have zip to do with ability (specifically batting order position and the OBP of the 3 players in front of him).

I'm open to the argument that the difference between predicted rbi and actual rbi has value (though you'd want to use a sophisticated rbi estimator -- Tom Ruane has a good one) but rbi in itself has essentially no value if you know a player's SLG. A player's rbi total is an extremely predictable product of opportunity (which a player has no control of) and power (most simply represented by SLG)

SLG is correlated with raw RBI because each HR is also an RBI. One has to remove HR from a player's raw RBI total to see what pct of baserunners he drives in.

I would bet that if you look at (RBI-HR), its correlations with BA, OBP and SLG would show a far narrower spread among them than the correlations of raw RBI with BA, OBP, and SLG.

Given the spread in RBI Pct (net of HR) across hitters, a player's RBI total is more than just baserunners on and SLG.
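A minimal Python sketch of an RBI percentage net of home runs, as described above. The exact denominator is an assumption: here it is the number of runners on base during the hitter's plate appearances. The sample inputs are hypothetical and not any real player's line.

```python
def rbi_pct_net_of_hr(rbi, hr, baserunners):
    """Share of baserunners driven in, after removing the batter's own HR.

    Every HR credits the batter with an RBI for himself, so RBI - HR is
    the number of other runners he drove in. baserunners is assumed to be
    the count of runners on base during his plate appearances.
    """
    if baserunners <= 0:
        raise ValueError("baserunners must be positive")
    return (rbi - hr) / baserunners


# Hypothetical season: 100 RBI, 30 HR, 400 runners on base.
print(f"{rbi_pct_net_of_hr(100, 30, 400):.3f}")  # 0.175
```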

bob, the correlation would vary according to era. Even as late as the 1930s, there were far more doubles and triples than there are today, so correspondingly more runs were driven in by that means than would be true today.

I believe RBI Percentage stat (RBIP) will (and should) become the statistic Sabermetric types use to "compromise" with old-school fans, a polite way of bringing them into the fold of their thought process. I'm betting this becomes a primary topic of chatter in 2013.

I obviously agree. Most counting stats except RBI already have an associated rate stat that puts the opportunity into context, usually using AB or PA as the denominator:

#7 The problem is that raw rbis are heavily influenced by factors that have zip to do with ability (specifically batting order position and the OBP of the 3 players in front of him).

#10 The problem is that raw runs are heavily influenced by factors that have zip to do with ability (specifically batting order position and the SLG of the 3 players after him).

According to posts in an earlier related thread, "triple crown" as an MLB item dates back to the 1930s, which for huge RBI totals was THE decade. That's probably why the RBI was featured in the TC, where runs scored had been considered the more telling stat during the dead ball era (helped by the fact that RBIs weren't even an official stat then.) IMO, they each have the same value as a record of what actually happened. And that appears to be the sense of most posts in this thread...

But why not use RISP? It's really not expected someone will drive a runner in from 1st.

Well I checked that too when I looked at RBI. RISP is indeed significant in explaining RBI (as in cuts the standard error by two runs), but it's not a primary factor in the rbi process.

First of all, while it's a lot harder to drive in somebody from first, generally speaking you get way more chances to drive somebody in from first.

Second (and probably more important) people tend to assume runner on second, single = rbi. Not close to being true. Unless Alfredo Griffin is the baserunner, you don't generally score a guy from second on an infield single. Not only that, a single to right (or left) only scores the runner ~60% of the time (don't recall the numbers on a single to center but it's not that different). Now it's been a fair number of years since I've checked this, but I'm genuinely doubtful that it's changed a lot.

I would bet that if you look at (RBI-HR), its correlations with BA, OBP and SLG would show a far narrower spread among them than the correlations of raw RBI with BA, OBP, and SLG.

I have looked at this. ISO (SLG-BA) is the single most important factor (under the player's control). Given equal opportunities and a more or less typical distribution of baserunners, if two players have the same SLG the guy with the lower BA will tend to drive in more runs.
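The quick-and-dirty estimator quoted earlier in the thread, ABROB*(SLG*1.09-BA*.66), shows the same effect. A minimal Python sketch with two hypothetical players who share a SLG and the same opportunities but differ in BA; the players and their numbers are invented for illustration.

```python
def estimated_rbi(ab_rob, slg, ba):
    """Estimator quoted earlier in the thread: ABROB * (SLG*1.09 - BA*.66)."""
    return ab_rob * (slg * 1.09 - ba * 0.66)


# Two hypothetical hitters, each with 200 AB with runners on and a .500 SLG.
high_ba = estimated_rbi(200, 0.500, 0.320)  # ISO = .180
low_ba = estimated_rbi(200, 0.500, 0.260)   # ISO = .240

print(round(high_ba, 1), round(low_ba, 1))
print(low_ba > high_ba)  # True: the lower-BA, higher-ISO hitter projects for more RBI
```

Because BA enters the formula with a negative coefficient, holding SLG fixed and lowering BA (that is, raising ISO) always raises the estimate.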

40: Bob, I do remove the runners from 1st. Not saying there's anything special, or even worthwhile, to my method, but that's the way I've calculated it. I also have been looking at career numbers, not single season, so that probably explains why I get less variability than you did.