The Chart That Launched A Thousand Ships

Much has been made of Sean Forman’s article in the New York Times that took Ryan Howard down a peg. Forman deconstructed Howard’s glamorous RBI totals, illustrating that the statistic is more a function of opportunity than of skill. I don’t wish to rehash the arguments about Howard and RBI that I’ve had before, but I made a chart that I found quite interesting.

The chart plots every qualified player’s WAR against his RBI total. As the trend line indicates, there is a positive relationship: the more RBI you have, generally the more valuable you are to your team. The r-squared, or coefficient of determination, is 0.2265. That is to say, roughly 23 percent of the variation in players’ WAR is explained by RBI (and factors relating to RBI).
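For anyone curious how a figure like 0.2265 comes about, here is a minimal Python sketch of the r-squared calculation for a straight-line fit. The (RBI, WAR) pairs in it are invented for illustration and are not the data behind the chart, and the function name `r_squared` is my own.

```python
def r_squared(xs, ys):
    """Coefficient of determination for a least-squares line fit of ys on xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # For a simple one-variable linear fit, r-squared is the squared correlation.
    return (sxy * sxy) / (sxx * syy)

# Hypothetical (RBI, WAR) pairs, made up purely to show the mechanics.
rbi = [40, 55, 60, 72, 80, 95]
war = [1.0, 3.5, 0.8, 4.2, 2.9, 1.4]

print(round(r_squared(rbi, war), 4))
```

A value near 1 would mean the points hug the trend line; a value near 0 would mean RBI tells you almost nothing about WAR.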

To traditionalists, that will seem very low; to Saberists, it will seem high. There are a bunch of caveats with this, of course, such as a limited sample (only one year), but it paints a good enough picture. That red diamond you see is Ryan Howard. He is all by himself, with the most RBI but not nearly as much WAR as other players with similar RBI totals. In fact, a lot less.

That Howard is an outlier is enough to make people take one look and swear off Sabermetrics forever. But to immediately discard a theory because it doesn’t match up with your preconceived notions is a fool’s errand. All the progress you see and have seen came about because people set aside what they thought they knew about their world and opened their minds to new possibilities.

In statistics, we accept that in every sample, there are going to be outliers, pretty much no matter what. The 68-95-99.7 rule tells us that in normally distributed data, approximately 68 percent of the data will be found within +/- one standard deviation of the mean, 95 percent within two standard deviations, and 99.7 percent within three. If you take a look at this table for higher deviations, you’ll see that no range of deviations, however wide, ever contains all of the data.
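The rule is easy to check for yourself. Below is a small Python sketch that uses the standard library’s error function to compute the share of a normal distribution that falls within k standard deviations of the mean. The helper name `fraction_within` is my own.

```python
import math

def fraction_within(k):
    """Share of a normal distribution within k standard deviations of the mean."""
    return math.erf(k / math.sqrt(2))

# Print the share inside and outside each range, from one to six deviations.
for k in range(1, 7):
    inside = fraction_within(k)
    print(f"{k} sd: {inside:.10f} inside, {1 - inside:.2e} outside")
```

The share outside the range shrinks quickly as k grows, but it never reaches zero, which is the point: some outliers are always expected.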

To laypeople, an outlier is a sign of failure, that the stat is doing something wrong. Unless your statistic claims to account for every factor in the universe, how can that make any logical sense? I believe this is the biggest obstacle for laypeople when it comes to accepting Sabermetric principles. They see Howard with an MLB-best 95 RBI and a comparatively low 1.4 WAR and cannot reconcile the two.

Wins Above Replacement is far from a perfect metric, and anyone who tells you otherwise does not understand the statistic. In fact, any self-proclaimed Sabermetrics adherent who tells you that the stats we have now can explain anything and everything is a crazy person. However, Sabermetrics are a cut above traditional stats, such as RBI and won-lost records. Sabermetric stats don’t have to be perfect, or even extremely accurate, for you to discard your older, more familiar but incredibly flawed metrics.

Let’s do some critical analysis of the RBI stat. Runs batted in. What does it tell us? Simply, how many runs (his own included, on a home run) the player in question drove home.

Now, what does RBI not tell us? It doesn’t tell us:

How often the player in question has other runners on base

The base running skill of the runners the player is driving in

The scoring opportunities of the player’s hits (i.e. a player who gets a lot of extra-base hits is more likely to drive in runners than a singles hitter)

The player’s usual spot in the batting order

The quality of opposition

Effects of ballparks on run-scoring

The Phillies’ number one, two, and three hitters in the batting order have on-base percentages of .336, .348, and .343, respectively. If, instead, Howard had hit fourth in the batting order for the Washington Nationals, whose 1-3 hitters have OBPs of .269, .289, and .352, would we still expect him to have 95 RBI?
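As a rough back-of-the-envelope check, the Python sketch below adds up the three OBPs for each team to estimate how many of the hitters ahead of Howard reach base per trip through the top of the order. This is a deliberately crude proxy and an assumption of mine: it ignores runners who are erased on the bases, who have already scored, or who are stranded when the inning ends before the cleanup spot comes up.

```python
# OBPs of the 1-3 hitters, taken from the paragraph above.
phillies = [0.336, 0.348, 0.343]
nationals = [0.269, 0.289, 0.352]

def expected_times_on_base(obps):
    """Expected number of these hitters who reach base, one plate appearance each."""
    return sum(obps)

phi = expected_times_on_base(phillies)
was = expected_times_on_base(nationals)
print(f"Phillies: {phi:.3f}  Nationals: {was:.3f}  ratio: {was / phi:.3f}")
# prints: Phillies: 1.027  Nationals: 0.910  ratio: 0.886
```

By this crude measure, the Nationals’ top three reach base about 11 percent less often than the Phillies’ top three, before Howard does anything differently at the plate.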

In another alternate reality, let’s imagine that the OBP stays constant, but in one lineup Howard has three Jose Reyes clones ahead of him; in the other, three Adam Dunn clones. Each has an OBP of .340. Would we expect Howard to drive in the same number of runs with each team?

Let’s imagine Howard switches over to the AL West. Everything stays constant except the ballparks. Instead of playing at Turner Field, Citi Field, Sun Life Stadium, and Nationals Park, Howard is now hitting in Oakland-Alameda County Coliseum, Safeco Field, Angel Stadium of Anaheim, and Rangers Ballpark in Arlington. Don’t you think that the more pitcher-friendly parks of the AL West would have an impact on Howard’s RBI total?

If any of the examples above make sense (and I should hope that they do), then the flaws in RBI are apparent. Saberists are often accused of holding up particular flawed stats as the be-all and end-all of player evaluation. But when the same people making those accusations fall back on RBI, they are holding Sabermetrics to a double standard. You don’t have to accept every tenet of Sabermetrics, or even Sabermetrics at all, to admit that the RBI stat is extremely flawed. All Saberists ask is that you be consistent when you apply your criticism. I think this is the crux of the emotional debates that pop up every time Howard, WAR, and RBI are mentioned in consecutive sentences.

Be critical of Sabermetrics. It is always good to look at the world from a skeptical point of view; it is a necessary biological trait that has allowed the human species to prosper. But be even-handed when you do so. Don’t hold Sabermetrics to a standard that you are unwilling or unable to live up to yourself.