Sabermetric Research

Phil Birnbaum

Sunday, July 26, 2009

No comment

My friend John Matthew says he stopped reading after the first sentence. Here's the first sentence:

"The stat geek Bill James, who has made a fortune taking credit for having invented on-base percentage, last week revealed to the waiting world his position paper on steroids in baseball, which, essentially says that there is no harm in steroids and steroids have had no harm on baseball."

Nate Silver offers a sucker bet (updated)

Nate starts by quoting a conservative blogger, John Hinderaker, who scoffs at climate change by claiming his hometown of Minneapolis is abnormally cold this year. So, Nate says, to Hinderaker or anyone else: put your money where your mouth is. He'll bet $25 daily, on a continuous basis, that your hometown will be warmer than trend that day. You bet your $25 that it will be cooler than trend. Bets settle at the end of every month, and are renewed until one side cancels. The canceling party's loss will then be trumpeted on the other's blog.

I'm arguing that Nate is a little bit of a hustler here; the bet is tilted heavily in his favor.

Those skeptical of global warming aren't claiming that temperatures are getting colder; they're assuming that the climate is staying pretty much the same. With both parties assuming cooling has zero probability, that leaves only two possibilities:

1. temperatures being above average;
2. temperatures being about the same.

Which means, according to the rules of the bet: If there's warming, Nate wins. If there isn't, then, on average, he breaks even.

I respectfully submit that "Heads I Win, Tails I Tie" doesn't exactly sound like an equitable wager. Nor does it sound like the kind of offer you'd expect from someone who's confident in his position.

For the bet to be fair, Nate should state his expectations for temperatures. Perhaps he thinks that, because of climate change, the weather should be warmer than average, say, 60% of the time. The skeptic might say it should be 50% of the time. So, in that case, 55% would be a fair bet, equally appealing to both sides. Silver pays $5.50 every cooler day, and the skeptic pays $4.50 every warmer day.
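Worked out as expected values, the 55% split really is equally appealing to both sides. Here's a quick sketch (the dollar figures and the 60%/50% beliefs are from the paragraph above; the function name is mine):

```python
# Expected daily winnings for Nate, who collects from the skeptic on a
# warmer-than-trend day and pays out on a cooler-than-trend day.
def nate_daily_ev(p_warm, skeptic_pays, nate_pays):
    return p_warm * skeptic_pays - (1 - p_warm) * nate_pays

# At the split-the-difference stakes: skeptic pays $4.50, Nate pays $5.50.
nate_edge = nate_daily_ev(0.60, 4.50, 5.50)      # Nate's view: warm 60% of days
skeptic_edge = -nate_daily_ev(0.50, 4.50, 5.50)  # skeptic's view: warm 50% of days

print(nate_edge, skeptic_edge)  # each side expects to win about $0.50 a day
```

So each party, by his own beliefs, expects to come out about fifty cents a day ahead, which is what makes the bet symmetric.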

Now *that* is a bet where both sides have something at risk. And, on those terms, I'm willing to accept if Nate is. Officially, I don't meet Nate's requirement that I be a resident of the United States, but I do have access to US dollars. And the stakes are a little high; I'd be willing to bet at $5.50/$4.50, but warier about $27.50/$22.50. (But if the stakes are the only thing holding Nate back, I'd be willing to reconsider.)

Nate, are you interested in this fairer bet? Are you confident enough that at least 55% of days should be warmer than trend?

And if you're not willing to bet at 55%, because you think the true number is lower, then what do you think it is? What kind of bet are you willing to make that puts you at the risk of actually losing -- instead of just tying -- if you're wrong?

P.S. Please do NOT argue climate change here. I know little to nothing about the science of global warming, and I do not consider myself a skeptic. My point is about Nate's proposed bet, not about climate change itself.

UPDATE/RETRACTION: In the comments, Doc suggests that Nate's bet really is directed at global cooling advocates. I assumed it wasn't. But I think Doc is right. Nate's original post quotes Hinderaker:

"... the current low level of solar activity suggests that the cooling trend could indeed be universal."

I missed that originally. I should have caught it.

Not to excuse my error in not reading carefully enough, but Nate doesn't explicitly mention cooling anywhere. In addition, his post is entitled "A Challenge to Climate Change Skeptics." And I'd argue that's overly broad; in my unscientific exposure to skeptical arguments (mostly from the National Post newspaper, which frequently publishes "denier" views on climate), only a small minority of skeptics argue for cooling. A more reliable count comes from skepticalscience.com. That site, which seems to be devoted to debunking the skeptics, has an empirical (but not scientific) list of the frequency of arguments it's found against the consensus view of global warming. A generous estimate is that cooling theories seem to comprise about 20% of the skeptical arguments.

So I'd agree that those 20% should be willing to accept Nate's wager; the fact that none of them has accepted does, indeed, signal something about the confidence the cooling advocates have in their hypothesis. My confidence in the global-cooling theory is lower than it was before Nate's bet, and I thank him for the evidence.

But it's too bad the post didn't emphasize the cooling issue. I googled "nate silver climate bet," and randomly clicked on a few of the results. None of them explicitly mentioned that Nate was betting only against cooling. And many commenters, on Nate's site and elsewhere, interpreted the lack of interest in the bet as reflecting on skeptics in general. But they shouldn't. An advocate of the "sunspot" theory would have no reason to expect colder temperatures, and therefore no reason to accept Nate's bet. Technically, Nate's bet should cause my estimate of the chance the sunspot theory is correct to rise, not fall; since I am less confident about cooling, I should be proportionally slightly more confident about the remaining hypotheses -- since my intuitive probabilities still have to add up to 100%.

To me, and many others who are trying to make sense of the debate, the willingness or unwillingness to bet on one's views is a very strong signal of one's beliefs. Indeed, it's one of the strongest arguments to those (like me) who don't know much about the science. The climate debate is highly charged politically, and any apparent evidence will get trumpeted far and wide by those on that side of the political debate. That's why it's so important to get things right.

Thursday, July 16, 2009

Why don't reporters consult sabermetricians?

The piece tries to figure out which players are the most overpaid in baseball. To get a measure of production, they compare a player's raw stats to the average at his position. The article doesn't give an actual formula, but it says that

"we compared the major offensive stats--batting average, home runs, runs batted in and OPS ... of each player to the league average for starters at his position. Then we did the same with salaries."

The idea being, if player X hits .300, and the league average is .270, player X's salary should be 11% higher than average.

That's just plain wrong -- a .300 hitter is a lot more than 11% more valuable than a .270 hitter. You have to measure from replacement value. If the best available minor-leaguer would hit .240, then you can see that the .300 hitter is actually twice as valuable as the .270 hitter -- he gains 60 points over replacement, as compared to 30 points. (This doesn't mean his salary should be double; only the part of the salary above replacement level should be double.)
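To put numbers on the difference between the two methods, here's a quick sketch (the .300 and .270 averages are from the text; the .240 replacement level is the text's hypothetical, and the variable names are mine):

```python
# Value over a freely available replacement-level hitter (.240, per the text).
REPLACEMENT = 0.240

def value_over_replacement(avg):
    return avg - REPLACEMENT

# Forbes' implicit method: straight ratio of raw averages.
ratio_naive = 0.300 / 0.270                                      # ~1.11, "11% better"

# Measuring from replacement: 60 points over, versus 30 points over.
ratio_vorp = value_over_replacement(0.300) / value_over_replacement(0.270)

print(round(ratio_naive, 2), round(ratio_vorp, 2))  # 1.11 vs 2.0
```

Same two hitters, but one method says "11% more valuable" and the other says "twice as valuable" -- that's the whole error in a nutshell.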

Of course, batting average isn't a very good proxy for performance; OPS is much better (although there are other measures even more accurate). And if you're going to include OPS, like the Forbes article does, why bother including the other statistics anyway?

And, since we're already shooting fish in a barrel ... why would you evaluate a hitter by his RBI? We've known better than to do that for, what, around 30 years now.

----

This brings up a question, which Tom Tango asks on "The Book" blog: How did this article make it into print? Shouldn't Forbes, which is a pretty sophisticated business magazine, know better than to run a financial analysis that's wrong on so many levels?

The answer, I think, is that writers and editors don't care as much about accuracy when it comes to baseball. I can think of several reasons, off the top of my head. Feel free to suggest more in the comments.

1. People don't think of sabermetrics as a formal field of knowledge, one with subject-matter experts who should be consulted. Forbes would never have dabbled in this kind of layman analysis in the field of, say, engineering. Can you imagine an article on which bridges are most overpriced, based on dividing the price by the total length of the bridge compared to the average length of bridges? That would never happen. Forbes would consult an engineer, who would tell them that there are many, many factors affecting how much it costs to build a bridge, and that their analysis was simplistic, silly, and very wrong.

2. My perception is that among most of the media (the Wall Street Journal among the notable exceptions), sabermetricians are seen as fringe researchers, nerds with the "blogging out of their parents' basement" stereotype. Surely a veteran financial reporter should be more expert at player valuation than an unpaid 24-year-old blogger? Well, no. In this field, it turns out that the experts are among those with the fewest formal credentials. And that might be difficult for the mainstream media to accept.

3. It's only baseball. Remember when Bill James, in the 1991 Baseball Book, reamed out David Halberstam for some awful inaccuracies in "Summer of '49". James wrote,

"It is frightening to think that Halberstam, one of the nation's most respected journalists, is this sloppy in writing about war and politics, yet has still been able to build a reputation simply because nobody has noticed.

"What seems more likely is that Halberstam, writing about baseball, just didn't take the subject seriously. He just didn't figure that it *mattered* whether he got the facts right or not, as long as we was just writing about baseball."

I think that's what's happening here. Baseball is frivolous, Forbes probably thinks, and so it's not worth caring about getting it right. It's easy to forget, when you're dealing with a subject you don't take seriously, that there are other people who do. Freakonomics thought it was the first to talk about baby-naming trends, and was criticized by serious experts whom it would have been good to consult. I've done things like that myself, assuming that the subject is so trivial that I must be just as much an expert as anyone else. It's usually not the case.

4. Sabermetrics is a field composed more of logic than fact; sabermetric analysis sounds like opinion, not science. What treatments work for Hepatitis? I don't know, and Forbes reporters don't know; it's a question of medicine, a question of fact. We can't even guess. You have to ask a doctor; they know.

But when it comes to evaluating a player's performance ... well, if you don't think about it too deeply, it seems like you don't need to know anything. Sure, just divide the salary by the numbers ... it seems reasonable and logical, doesn't it, that if you drive in 20 percent more runs you should make 20 percent more money? It doesn't seem like there's an underlying fact there, something that an expert has to tell you. You can come up with something plausible in your head.

And if someone tells you you're wrong, so what? It doesn't look like it's a question that has a right answer. If you can find two financial analysts, one who says a stock is worth $30, the other who says the same stock is worth $10, and it's OK to stick them on consecutive pages of Forbes ... then who's to say that figuring out the value of a player is any different? It's just one opinion instead of another.

This isn't a problem just in sabermetrics ... a few years ago, I read an economist (I think it was Steven Landsburg) with the same complaint. He said (and I'm paraphrasing) that economics is such that everyone thinks they understand it as well as people who have studied it for years -- people who wouldn't dream of having an opinion on a medical issue will confidently come forth with solutions -- wrong solutions -- to whatever economic problem is on today's agenda.

And, again, I think that's because a lot of it is logic. And logic is hard -- it's harder to reason through an argument than to memorize a fact. There are lots of arguments that sound plausible, but are, in fact, flawed. It takes work to figure out which is which, especially when you don't have enough background to be able to avoid the common fallacies, when you might be too proud of your own flawed logic to accept that you might have left something out, and when you might, ignorantly, be rejecting certain principles that sound silly to you but that the field has empirically found to be correct.

It's funny, too, because reporters will be willing to publish their half-baked theories without a second thought -- but when it comes to a specific mathematical calculation, they'll often cite a source, even when the calculation is simple enough to state as fact. It's as if they know their math *facts* might be shaky, but believe their mathematical *arguments* are beyond error.

Saturday, July 04, 2009

Is Tiger Woods Irrational?

"Loss Aversion" is a kind of irrationality where people care more about avoiding a loss more than about securing a gain of the same amount. For instance, if one person wins $50, while another identical person loses $50, the unlucky gambler will gain more unhappiness than the winner gains happiness. Because of this, people will expend more effort to avoid incurring a loss than they will in pursuit of the identical gain.

This is irrational; if it takes three hours to save a loss of $20, but you can earn $10 an hour, you're worse off if you spend the three hours to save the $20. That's because if you accepted the loss, and instead spent those three hours earning $30, you'd be $10 ahead. But since, as it turns out, humans tend to value losses at about twice the value of an identical gain, there should be a tendency to spend time trying to save the $20 instead of earning the $30.
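The three-hour trade-off above, spelled out as arithmetic (the wage and dollar amounts are the ones in the paragraph; the variable names are mine):

```python
# You can either spend three hours fighting a $20 loss, or spend those
# same three hours earning your $10/hour wage.
WAGE = 10.0   # dollars per hour
HOURS = 3

loss_avoided = 20.0            # what you save by fighting the loss
wages_forgone = WAGE * HOURS   # the $30 you could have earned instead

# Net effect of choosing to fight the $20 loss:
net = loss_avoided - wages_forgone
print(net)  # -10.0: you end up $10 behind
```

The rational move is obvious on paper, which is what makes the observed behavior a bias rather than a preference.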

The irrational fear of losses is a generalization; it's certainly possible that some people don't have this bias, or at least are aware of it and able to compensate. Life is full of decisions and trade-offs, and you'd think the most successful people would have learned how to be more rational in these situations.

Take golf, for instance. In normal PGA tournament play, your score is just the number of strokes you took. A stroke is a stroke, and so it shouldn't matter if it's a stroke to make birdie (which, if missed, will leave your score unchanged), or a stroke to make par (which, if missed, will make your score one stroke worse). Both strokes are of equal importance. However, the stroke for par may count more in the golfer's mind. That's because, if he doesn't make it, it looks like a loss (his score gets worse). But if he misses a birdie putt, it's just a foregone gain (his score stays the same). If losses irrationally count for twice as much as gains in the golfer's mind, then he should try harder to make the par putt than the birdie putt.

And it turns out he does. According to a recent golf study (.pdf, free download) by Devin G. Pope and Maurice E. Schweitzer, PGA golfers are significantly more likely to make a par putt than the identical birdie putt. It's not even close: overall, the difference is about three percentage points. That's huge: the overall conversion rate for putts is 61 percent, and 3 points out of 61 is about five percent. In baseball terms, it's like a 96-66 team suddenly going 102-60 just by changing an irrational strategy.

Pope and Schweitzer did more than just count putt conversion rates, of course. After all, it's likely that birdie putts are different from par putts in many ways. They might be farther from the hole. They might be in less advantageous places on the green. They are less likely to have followed other putts, which means the golfer doesn't have as good a read of the green. Birdie putts might be associated with different kinds of golfers. They might be associated with different difficulties of greens. And so on, and so forth.

The authors took all these things into consideration. Indeed, their database is so extensive (over 1.6 million putts between 2004 and 2008) that the authors were even able to find thousands of "matching" pairs of putts where one was for birdie and one for par, but where both were in almost exactly the same place on the green, in the same round. The results held: par putts were holed significantly more frequently than birdie putts.

It turns out the difference is mostly length: when pros putted for birdie, they were more likely to leave the putt short than when they were putting for par. The idea is that the birdie golfers are playing conservatively: they want to make sure that if they don't sink the putt, they leave it close to the hole for an easy subsequent shot. On the other hand, the par golfers are scared to miss, because that appears to cost them the loss of a stroke, so they make sure they give it enough weight. For putts of 22.5 feet or longer, birdie putters leave the ball about two inches shorter than par putters.

From a strict strategic standpoint, the conservative play makes little sense. The authors find that after a missed (presumably conservative) birdie try, golfers make the subsequent putt only 0.2 percentage points more often than after a missed (presumably aggressive) par try. So the conservatism costs 3 percentage points, but gains only 0.2 percentage points. Moreover, the gain comes only if the putt is missed: assuming a 50% conversion rate on the original putt, the net gain from conservative play is only 0.1 percentage point!
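Putting the study's numbers, as quoted above, side by side makes the imbalance plain. This is just the paragraph's arithmetic, with the assumed 50% first-putt conversion rate made explicit (variable names are mine):

```python
# All figures in percentage points, per the study as quoted in the post.
cost_first_putt = 3.0    # conversion lost on the first putt by playing it safe
gain_second_putt = 0.2   # conversion gained on the second putt -- but only
                         # if the first putt actually misses
p_miss = 0.5             # assumed chance the first putt misses

expected_gain = gain_second_putt * p_miss   # the 0.1-point upside
net = expected_gain - cost_first_putt
print(net)  # about -2.9 points: conservatism is a big net loser
```

A 0.1-point upside against a 3-point cost is a trade no one should make deliberately, which is why loss aversion is the more plausible explanation.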

So why are golfers accepting a much worse first putt for a very, very slight chance at avoiding a three-putt? It must be loss aversion. There are still two possible ways that could manifest itself. It could be that golfers are deliberately trying to hit the ball softer on birdie tries in a conscious effort to be more conservative. Or, it could be that golfers are just trying harder in general on par tries. Either way, that's something that you'd think pros would be able to control, if they realized that what they were doing makes no sense.

Has any golfer figured this out? Apparently not. The authors calculated the effect for each of 188 golfers in their data sample; they got a bell-shaped curve centered around 3.5 percentage points. The "most irrational" golfer's difference was 7 percentage points; the "least irrational" golfer was at half a percentage point. That is: every one of the 188 pro golfers exhibited this bias, to one extent or another. Just as surprising was the finding that the size of the bias was barely correlated with the ranking of the player. Better golfers had less bias, but only very slightly less. Tiger Woods, the best golfer in the world, was almost exactly average in this measure. By eliminating the bias, and shooting birdie putts the same as par putts, Pope and Schweitzer calculate that the average PGA pro will take one stroke off his score for a 72-hole tournament. If a top-20 golfer did this (and none of the other 19 did), his earnings would improve by 22% -- over one million dollars per year.

That's hard to believe, but there it is.

Read the whole study; the authors diced the data many different ways to confirm their thesis, and there are many interesting findings. A New York Times article about the study is here.

P.S. I've been getting lots of spam comments lately, so now comments on older posts (28 days old or more) are moderated. Comments on current posts will continue to appear instantly.