Wednesday, April 20, 2011

When Bill James showed how runs could be cleanly converted to wins and losses (using the Pythagorean Formula from last week), it opened the door to studying the impact pitchers have on wins and losses. After all, we measure pitchers in runs - earned runs usually, but runs nonetheless. It's fairly trivial to swap one pitcher out for another, compare their ERAs, and estimate how many fewer runs one would give up than the other. Once those runs could be converted to wins, it was easy to estimate the impact that swap would have on the team.

But how do you do the same for batters? Count their RBI? Their runs? Some average?

James studied the problem by backing away from it. Instead of looking at individual players, he looked at teams. Could you guess how many runs a team would produce given some of their other statistics? What he found shaped a generation of analysis. Just like predicting wins, it was a fairly simple formula - so simple that what was striking was what was NOT in it.

If you add up all the total bases for the Twins last year, you get 2347. (By total bases, I mean one base for each single, two for each double, three for each triple, and four for a home run.) Those bases were the result of 1521 hits. There were also 559 walks and 5568 at-bats. What James found is that with that info, he could estimate the total runs the Twins - or any other team - would score. You just do the following:

1. Add up the H and BB (essentially the number of times a team got on base).
2. Multiply that by the total bases that team had (essentially the power the team displayed).
3. Divide by the sum of AB plus BB (essentially the plate appearances the team had).

He called this value Runs Created. Go ahead and figure it out for the Twins. I’ll be over here playing Angry Birds.

The Twins actually scored 781 runs, sixteen fewer than what you just figured out using James’ formula, so it was off by about 2%. If you go through the whole American League last year, every team was within 10% of its Runs Created. Only four teams were not within 5%.
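If you would rather let a computer do the arithmetic, the three steps above can be sketched in a few lines of Python. This is only a minimal sketch of the basic formula as described in this post; the function name `runs_created` and its parameter names are my own, and the inputs are the 2010 Twins totals quoted above.

```python
def runs_created(hits, walks, total_bases, at_bats):
    """Basic Runs Created as described above: (H + BB) * TB / (AB + BB)."""
    times_on_base = hits + walks          # step 1: H + BB
    plate_appearances = at_bats + walks   # step 3 denominator: AB + BB
    return times_on_base * total_bases / plate_appearances


# 2010 Twins team totals quoted in the post
twins_rc = runs_created(hits=1521, walks=559, total_bases=2347, at_bats=5568)
actual_runs = 781

print(round(twins_rc))                                   # 797
print(f"{(twins_rc - actual_runs) / actual_runs:.1%}")   # 2.0%
```

The same function works for an individual player's stat line, since it only needs hits, walks, total bases and at-bats.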

Now you have a way to measure runs for hitters, because players have these stats, too. And if a team of players can produce that many runs with those stats, it seems equitable to credit each player for the stats for which he was responsible. So in 2010, if Joe Mauer tallied 239 total bases, 167 hits, and 65 walks in 510 at-bats, one could figure out how many "runs created" he was worth:

(167 + 65) * 239 / (510 + 65).

(I won’t ruin the surprise, though I suppose someone in the comments could. Figure it out and you’ll see why sabermetric guys tend to like Joe Mauer a lot. ESPECIALLY if you take the time to figure out how many Runs Created Drew Butera is worth.)

The most striking thing about that formula is what it does NOT contain. No stolen bases. No clutch hitting. No bunting, no moving the runners over, no little things. It contains just two things - getting on base and hitting for power. In fact, you could even rewrite the formula to include the two stats that MEASURE getting on base and power:

On-Base Percentage * Slugging Percentage * At-bats
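Here is a quick check that the two forms really are the same thing, again using the 2010 Twins totals from earlier in the post. It is a sketch under one stated assumption: on-base percentage is simplified to (H + BB) / (AB + BB), which matches the formula in this post but leaves out the hit-by-pitch and sacrifice fly terms in the official stat. The helper names are mine.

```python
import math


def on_base_pct(hits, walks, at_bats):
    # Simplified OBP (assumption): ignores hit-by-pitch and sacrifice flies
    return (hits + walks) / (at_bats + walks)


def slugging_pct(total_bases, at_bats):
    # Slugging percentage: total bases per at-bat
    return total_bases / at_bats


# 2010 Twins team totals quoted in the post
h, bb, tb, ab = 1521, 559, 2347, 5568

direct = (h + bb) * tb / (ab + bb)                              # the three-step recipe
via_rates = on_base_pct(h, bb, ab) * slugging_pct(tb, ab) * ab  # OBP * SLG * AB

# The at-bats cancel, so the two agree up to floating-point rounding
assert math.isclose(direct, via_rates)
```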

So where is that other stuff? It has to be somewhere, right? Well, James revised his formula to add stolen bases, and then to add being hit by a pitch. And then others took the same idea and started adding additional factors to it, and this is where a good chunk of the alphabet soup that plagues sabermetrics came from. Each attempt was to get a little bit better at predicting a team’s total runs, and then apply that formula to individual players.

But it didn’t stop with just being more precise with more stats. The next step was comparing the impact of hitters to pitchers. Or hitters of one era (where power might be more plentiful) to another era (where speed or getting on base was more prevalent). Or major leaguers to minor leaguers. Or trying to include defensive ability.

But these formulae almost all have Runs Created and the Pythagorean Formula deep inside them as their engine:

1. On base times power equals runs.
2. Runs can be converted to wins.

For better or worse, these are two of the cornerstones upon which a good chunk of sabermetric study is built.

9 comments:

"When Bill James showed how runs could be cleanly converted to wins and losses"

But he didn't show that. What he appeared to demonstrate is the opposite, that the number of wins a run will produce varies.

Moreover, the relationship between run differential and win/loss record his formula suggests is very, very loose. It can be off by as many as 10 games or more, a range of 20 games. In baseball terms, that is pretty much meaningless.

This is a classic case of people being unwilling to critically evaluate a claim because it provides the foundation for numerous other studies. You could say it's classic junk science, except it's not really science at all. Sabermetrics is largely an interesting simulation game. Entertaining for the people who do it, but not useful for evaluating real baseball.

The problem with Runs Created is different. The fact that having a lineup balanced between getting on base and hitting produces the most runs does not mean that is true for individual players.

It's not hard to see that having a player who gets on base a lot batting in front of a player who hits a lot of home runs is going to produce more runs than two players who are average at both. A player cannot both walk and hit a home run in the same inning.

In fact, this is a fundamental flaw in a lot of player evaluations. Players with one outstanding major skill are more valuable than players with mediocre/average skills across the board. You can see this with Gardy's difficulty in getting Luke Hughes in the lineup. Defensively, Casilla and Tolbert are better. And when the situation calls for offense, Thome is a better option (at least in theory).

Which brings us to the last flaw in a lot of the statistical analysis, including runs created. That is the belief that "sample size" somehow overcomes a lack of randomness in the data. It doesn't. And baseball data is decidedly not random. Managers and players are constantly seeking to optimize the results based on the situation and the skills of the players. As a result, the averages these statistics report are only a very rough approximation of what the likely result would be, on average, if there were no effort to optimize results.

Sabermetrics requires a large dose of suspension of disbelief in order to be entertaining.

@TT - do you not understand the Pythagorean RC? It shows a fairly strong correlation between runs scored, allowed, and winning percentage, ergo, "Bill James shows how runs could be cleanly converted to wins and losses."

Of course there is a correlation between run differential and wins. But you are misunderstanding what that means, because this claim does not logically follow from that correlation:

" ergo, "Bill James shows how runs could be cleanly converted to wins and losses.""

Aside from the flawed reasoning, in fact the Pythagorean formula doesn't "cleanly convert" to wins and losses. Most seasons it is off by more than 10 wins for an individual team and 5 or more for over a third of the teams. So it isn't accurate even within a range of 20 games.

It is certainly true that there should be correlation between run differential and wins. The reason that it isn't as clean as The Geek suggests is that you don't get extra wins for winning 8-2 as compared to 3-2. If you happen to have a rather inconsistent team where you put up big scores sometimes and little or nothing a lot of other times, you can have a rather nice run differential without the corresponding wins. The same could be true with an inconsistent pitching staff.

To some degree this was true of the early 2000s White Sox and A's. Both teams had power and patience offenses with strong starting pitching. Their saber numbers looked great. They often put up big offensive games and their relatively strong starting pitching meant that they had a lot of 8-2 games. They also seemed to lose a lot of closer games, however. Neither team manufactured runs well, with lower batting averages and slower players. This sometimes affected their defenses as well, by not getting to balls that faster teams like the Twins got to. It seemed like every time the Twins played the White Sox in a 3 game series, the White Sox would win one game big, and the Twins would win 2 close games.

I think when you look at saber numbers, you need to remember that they are often just as limited as traditional numbers. Also when they are combined to create other saber numbers like VORP or WAR, you often compound the weaknesses.

I feel that most saber numbers tend to overvalue power, whether it is power hitting or power pitching. If you keep that in mind, you can sometimes get use out of those numbers.

Following up on TT's post about Runs Created, I notice that the formula isn't additive--that is, if you apply it to each player and add up their RCs, the total is not the result you get by using the team's aggregate stats in the RC formula. Don't know how big the differential is in practice, but that kind of logical inconsistency is a strike against the formula in my book.
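For anyone curious, the non-additivity described in the comment above is easy to see with two made-up players. This is only an illustrative sketch: both stat lines are invented, not real players, and the function is just the basic (H + BB) * TB / (AB + BB) formula from the post.

```python
def runs_created(hits, walks, total_bases, at_bats):
    """Basic Runs Created from the post: (H + BB) * TB / (AB + BB)."""
    return (hits + walks) * total_bases / (at_bats + walks)


# Two hypothetical stat lines (invented for illustration only)
on_base_guy = dict(hits=150, walks=100, total_bases=180, at_bats=500)
slugger = dict(hits=130, walks=40, total_bases=280, at_bats=500)

# Apply the formula to each player and add the results
sum_of_players = runs_created(**on_base_guy) + runs_created(**slugger)

# Pool the raw stats first, then apply the formula once
pooled = {stat: on_base_guy[stat] + slugger[stat] for stat in on_base_guy}
combined = runs_created(**pooled)

print(f"sum of individual RC:  {sum_of_players:.1f}")
print(f"RC from pooled totals: {combined:.1f}")
```

With these invented numbers the pooled figure comes out higher than the sum of the two individual figures, because the formula multiplies on-base events by total bases, so one player's times on base get credit for the other player's power. How large the gap is for real rosters is not something this sketch tries to answer.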