Sabermetric Research

Phil Birnbaum

Thursday, January 31, 2008

Is NFL defense mostly luck?

My last post linked to a study by Brian Burke that showed the Patriots scored a touchdown twice as often as the Giants, but that the Patriots' defense *prevented* a touchdown only slightly more often than the Giants'.

I wondered whether this meant that defense didn't vary between teams as much as offense does, and commenter "w. jason" confirmed that.

Now there's another confirmation, from this study at pro-football-reference.com (by Doug Drinen?). Near the end, Drinen found that the year-to-year correlation of NFL teams' defensive stats is ... *negative*:

-.10 Turnovers forced

-.11 Touchdowns

I assume it's just random chance that these numbers are negative, unless you think there's some reason teams that are above-average this year should be below-average next year. If you had enough data, you'd probably find a positive, but small, correlation.

Does this mean that defense doesn't matter much? Is any player pretty much as good as any other on your defensive line?

That's possible. It's also possible, for instance, that a defense is only as good as its weakest link, and it's your *worst* players that make the difference, not your best. Or, it could be that teams can't tell the good defenders from the bad, and wind up with a random assortment.

In any case, shouldn’t this imply that you shouldn't pay a premium for defenders? I assume teams pay more for their offensive squad than their defensive, but probably not as much as these results say they should. I'll check it out if I have a chance.

That Drinen link, by the way, came from today's "The Numbers Guy" article by Carl Bialik. He notes that the outcomes of sporting events are hard to predict. An organization called "Accuscore" is able to predict only 63% of NFL games, 57% of baseball games, and 68% of basketball games. (Hockey gets screwed again – not even mentioned.) Bialik attributes the difference to the amount of information known about the teams, but I think it's actually that basketball is intrinsically less random than baseball – as I have argued here and elsewhere.

Bialik also debunks one of the naive arguments against sabermetrics: that since statistical analysis thought the Giants should have lost all three games, and was wrong three times, the analysis must be wrong. I've seen that argument a couple of times lately, and it goes beyond silly. Even the best analysis only gives you a probability estimate, and even low-probability events happen sometimes.

And one last interesting tidbit: For NFL games, Accuscore has 54% accuracy against the spread. That seems pretty impressive to me.

Tuesday, January 29, 2008

Patriots offensive success dwarfs Giants

Brian Burke, of the NFL Stats Blog, notes today that the more possessions each team has in a game, the more likely the favorite will beat the underdog. That makes sense, of course, and for the same reason that the Royals may have a 30% chance of beating the Yankees in one game, but roughly a 0% chance of beating them in 162 games.
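To put a rough number on the Royals/Yankees comparison, here is a minimal sketch. It assumes every game is an independent coin flip with the same 30% chance, and it reads "beating them in 162 games" as winning at least 82 of them; the function name `prob_wins_at_least` is made up for the illustration.

```python
from math import comb


def prob_wins_at_least(n_games: int, p_win: float, min_wins: int) -> float:
    """Chance of at least `min_wins` wins in `n_games` independent games."""
    return sum(
        comb(n_games, k) * p_win**k * (1 - p_win) ** (n_games - k)
        for k in range(min_wins, n_games + 1)
    )


# Assumed inputs: a 30% underdog, one game versus a 162-game series.
p_single = prob_wins_at_least(1, 0.30, 1)
p_season = prob_wins_at_least(162, 0.30, 82)

print(f"one game: {p_single:.3f}")
print(f"majority of 162 games: {p_season:.2e}")
```

The more trials there are, the less room luck has to overturn the difference in talent, which is the same logic as more possessions in a football game.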

Burke's simulation shows that, with 9 possessions per team, the Patriots have a 71.4% chance of beating the Giants. But with 13 possessions per team, their winning percentage rises to 76.5%.

That is: the Giants can improve their chances of winning by about 22% (from 23.5% to 28.6%) if they try to play really, really slowly. (Of course, they can improve their chances even more by scoring lots of points, and preventing the Patriots from scoring.)

Anyway, as I wrote, this isn't really all that surprising. But what IS surprising – at least to me – is how much better the Patriots are per possession:

The Patriots scored a touchdown on 42% of their possessions

The NY Giants scored a touchdown on 21% of their possessions

The Patriots were *100%* better at scoring touchdowns than what is presumably the second-best team in the NFL!

On defense, they were only 10% better at *preventing* touchdowns. I wonder what that means ... does it mean the Patriots have a better offense than defense? Is the Giants defense especially good (assuming 90% as good as the Patriots must be exceptional)? Or does it mean that, in general, defenses in the NFL vary a lot less than offenses do?

Geez ... twice as successful on offense. That doesn't include field goals, but still ...

Wednesday, January 23, 2008

Invest in baseball player futures

You can play fantasy baseball for real now: there's a company that will sell you, as an investment, a percentage of a player's future income. If he becomes the superstar you think he will, you get rich. If he languishes in the minors, you lose.

Today, Marginal Revolution points to an investment opportunity on Indians prospect Randy Newsom. For $20, you get a claim to 0.0016% of Newsom's future salary. If Newsom earns more than $1.25 million over his career (discounted to today's dollars), you make money.
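The $1.25 million figure is just the share price divided by the fraction of salary it buys. A quick sketch of that arithmetic, ignoring fees, taxes, and discounting (the variable names are mine, not anything from the offering):

```python
stake = 20.00        # dollars paid for one share
share_pct = 0.0016   # percent of future salary that one share claims

share_fraction = share_pct / 100            # convert percent to a fraction
break_even_salary = stake / share_fraction  # salary at which the share pays back $20

print(f"break-even career earnings: ${break_even_salary:,.0f}")
```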

Actually, Newsom owns the company that you contact to make the deal. He's trying to make this work as a viable business, but right now he's the only player available to invest in. David Laurila of Baseball Prospectus interviewed him a couple of days ago.

This strikes me as a great way for prospects to buy a form of insurance on their careers. It's also a way for fantasy players and baseball fans to put their money where their mouth is.

The interesting thing is how this is legal, but would be illegal if it were anyone else trying to sell a percentage of Newsom's salary as a bet. If MLB and the MLBPA wanted to legalize this kind of gambling (there's probably no reason they would, but IF they did), they would agree to sell 1% of each player into the market. If that wouldn't be enough liquidity to meet fan demand, they could then issue futures contracts that settle in cash.

And, there you go! David Ortiz futures would be as legal as corn futures.

UPDATE: According to this Slate article, your payoff is based on Newsom's *after-tax* income. Plus, there are fees on top of your $20. It looks like Newsom has to make about $2.5 million, not $1.25 million, for you to break even on the deal. As they say, be sure to read the prospectus before investing!

Saturday, January 19, 2008

The clutch hitter bet -- rules

This is a followup to my offer to bet that you can't pick clutch hitters in 2008, in my previous post. Now that people are interested, I suppose I should lay out the rules, to avoid disputes. Here they are:

1. I have improved the odds to 2:3. You now have to lay only $15 to win $10, for instance.

2. Please accept the bet HERE, not at any other site that may have linked here (eg. BTF, SOSH, etc.) I don't always read all those sites. To accept the bet, post a comment here, or e-mail me (a link to my e-mail is at my full name dot com). Please list:

a. The player or players in your "clutch" group;

b. The player or players in your "choke" group;

c. The amount of money you are laying;

d. Your definition of a clutch situation;

e. The charity you are playing for, or whether you want the winnings yourself;

f. Who you are, if I don't know you, so I can find you if you lose. (You can send this by e-mail.)

I would recommend that your definition of clutch be something easy to calculate, maybe even something that ESPN or someone calculates for you. If not, you are responsible for the calculations.

I will confirm that I am accepting your bet, and then we're on!

3. Despite some objections, the metric will be (batting average in clutch – batting average in non-clutch). Why? Because the studies I have seen use batting average. I suspect that the evidence is just as strong against clutch talent using other statistics, but I'm going to play it safe anyway. Maybe I'll switch next year.

4. I won't accept attempts to game the system. For instance, suppose there's a runner on first being held on. Can lefties hit singles through the hole more easily than righties? If you choose that as your situation, and make it lefties vs. righties, I won't take your bet, because you're not really measuring clutch, you're measuring a certain platoon effect.

On a related note, no pitchers allowed unless accompanied in your "clutch" or "non-clutch" group by at least one real hitter.

And if you want to bet way too much money for me, I will politely decline.

5. For multiple players, batting averages should be computed by adding all H and dividing by all AB (not by averaging the individual unweighted batting averages). Regular season only, no playoffs.

6. The minimum AB for the calculation is 1. If either of your groups goes 0-for-0 in clutch or non-clutch, the bet is cancelled.

7. I may not have thought of everything, so I retain the right to refuse any bet for any reasons I choose. Once the season starts, of course, I can't cancel any bets I've accepted. In the unlikely case of dispute at the end of the year, we can try to find someone neutral to arbitrate. (Honestly, if it's just $10 or something, I'd probably just pay you and complain about it.)

8. All bets must eventually be public (except for your real name or e-mail address). I will eventually set up a web page with all the bets listed, as I accept them.

9. Any legal trouble on non-charity bets will convert them to charity bets. I don't want to go to jail or anything.

10. For charity bets, I get to offset my losing bets with my winning bets. That is, suppose I win 50 bets at $15, and lose 100 bets at $10. That's $750 in winnings, and $1000 in losses. The 50 losing bettors send $15 each to charity on behalf of 75 of the winners, and I send the remaining $250.

Otherwise, I couldn't afford to do this, since I'm pretty sure that I will lose half the bets.

Another way of putting it: I am the casino. I get to pay my losing bets out of the money made from winning bets. If there is any leftover, it's profit, and 100% of my profit goes to charity. If there is not enough to pay the winning bets out of the losing bets, I take the loss (by making donations to charity).

11. After the season, I'll go through all the bets and figure out who has to send money where. Losers are responsible for sending the cheque to the charity assigned.

99. I may add or change rules on this page as I think of them. I will always let you know what the changes are.
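For anyone who wants to see how the offsetting in rule 10 works out, here is a rough sketch using the numbers from that example. The function `settle` and its field names are hypothetical, and it assumes every bet is for the same $15-to-win-$10 amount.

```python
def settle(host_wins: int, bettor_stake: int, host_losses: int, payout: int) -> dict:
    """Net the host's winning bets against the host's losing bets."""
    collected = host_wins * bettor_stake   # paid to charity by losing bettors
    owed = host_losses * payout            # owed to charity for winning bettors
    return {
        "collected": collected,
        "owed": owed,
        "winners_covered_by_losers": min(host_losses, collected // payout),
        "host_pays": max(0, owed - collected),   # shortfall the host donates
        "surplus": max(0, collected - owed),     # leftover, also donated
    }


# Rule 10's example: the host wins 50 bets at $15 and loses 100 bets at $10.
result = settle(host_wins=50, bettor_stake=15, host_losses=100, payout=10)
print(result)
```

Either way the money ends up at a charity; the only question is whether it comes out of the losing bettors' pockets or mine.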

Thursday, January 17, 2008

I will bet real money that you can't identify even one clutch hitter

Over the past 30 years, one of the hottest topics in sabermetrics has been the existence of clutch hitting talent. Does it exist? Do some players naturally perform better when the game is on the line? Are certain other players "chokers," who play their best only when it doesn't matter?

Those who believe in the existence of clutch hitters often believe it passionately, and many have been dismissive of evidence against it.

I think the evidence against it is substantial, and so, if you believe in clutch hitters, I am willing to offer you the following bet.

You tell me who you believe will be clutch hitters in 2008. You tell me who you think will be choke hitters in 2008. If your clutch hitters outperform your choke hitters, I will pay you $10. If not, you pay me $20.

You will note that I am asking for 2 to 1 odds. Why? Because I am letting you make all the choices. You decide who the clutch hitters are. You decide who the choke hitters are. You even decide how many there will be in each group. If you want, you can pit exactly one player against one other player, and I'll be OK with that. You can pit one player against the rest of the league. You can have three players in the "choke" column, and an entire team in the "clutch" column. Anything you want.

I'll even let you decide what's a clutch situation. Elias used to have a bunch of candidates: "late inning pressure situations," "late inning pressure situations with runners on," "runners in scoring position with two out." Use one of those, or make up one of your own. You can even weight the situations if you like – count a plate appearance double if the tying run is on third. Whatever you think gives you the best chance of winning.

At the end of 2008, we will see how much your "clutch hitters" improved in the clutch. We will see how much your "choke hitters" improved in the clutch. If the clutchers beat the chokers, by improving more or declining less, you win. The measure of clutchness will be the improvement (or decline) in batting average.

For example: I have the 1989 Elias book in front of me, so let's say that in 1988, you had chosen Tim Raines as your clutcher and Mark McLemore as your choker. Raines improved by 93 points in late inning pressure situations (.347, versus .254 otherwise). McLemore also improved in the clutch, by 12 points (.250 versus .238), which was not as much as Raines. You win your bet, 93 points to 12.

You can see that 1:2 odds are quite reasonable by looking at other statistics. If I offered you 1:2 that Albert Pujols wouldn't hit more home runs than Juan Pierre, you'd jump on it. If I offered you 1:2 that Juan Pierre wouldn't steal more bases than the average National League catcher, you'd take it in a second. If I offered you 1:2 odds that Pedro Martinez wouldn't strike out more players (per nine innings) than the league average, you'd mortgage the farm to take that bet too.

I'm saying that if you want to do the same for clutch hitting improvement, I'll take your bet, and I'll even let you fill in both sides of it. You tell me who's going to outclutch whom. And you only have to be right 67% of the time to make it a fair bet.
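Where does the 67% come from? A bet is fair when what you expect to win equals what you expect to lose. A small sketch of that calculation (the function name is made up for illustration):

```python
def break_even_probability(stake: float, payout: float) -> float:
    """Win probability at which risking `stake` to win `payout` has zero expected value.

    Solves p * payout = (1 - p) * stake for p.
    """
    return stake / (stake + payout)


# Risk $20 to win $10, the 1:2 odds offered here.
p_original = break_even_probability(stake=20, payout=10)
print(f"risk $20 to win $10: {p_original:.1%}")

# Risk $15 to win $10, for comparison.
p_cheaper = break_even_probability(stake=15, payout=10)
print(f"risk $15 to win $10: {p_cheaper:.1%}")
```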

Do we have a bet? If not, why not?

If you really think the 1:2 odds aren't fair, what do you think IS fair? If you say only 1:1 odds are fair, then you must think that even if you pick the very, very best clutch hitter in baseball, and the very, very worst clutch hitter in baseball, it's still a flip of the coin which one does better in 2008. And, so, basically, you're admitting that clutch hitting talent doesn't exist.

Or maybe you think clutch hitting exists, but you don't want to bet because you don't know who the clutch hitters are. In that case, why do you believe in the existence of clutch hitting? I mean, if you've never seen a clutch hitter, how do you know they're out there?

Perhaps you think the odds are too high because the sample sizes are too small. That's why I gave you the option of making the "clutch group" and the "choke group" as large as you want. Put your top/bottom 20 players in each group – that will give you at least 2000 plate appearances total. That should be enough to tell the heroes from the goats, shouldn't it? And if you're thinking that "late inning pressure situations" aren't really what you think of as clutch, just remember that I'm even letting you define what situations we're looking at.

Basically, if you're refusing the bet, you're saying that you can't think of ANY situation, in a full season's worth of baseball, where you can differentiate ANY clutch hitters from ANY non-clutch hitters with 67% probability. If that's the case, clutch hitting can't be that important, can it?

I actually don't expect that anyone will take me up on this bet. But I'm serious. If you want to place the bet, I will cover it. It can be for just $2 against $1, if you want, or $20 against $10, or even $50 against $25. Mitchel Lichtman, of "The Book," says that if you want to bet an amount that's too high for me, he'll take the rest of your action.

Also, if you want to bet donations to charity, that's OK with me too. Winner picks a charity in the loser's home country (I'm Canadian). Loser gets the tax receipt, just to keep it legal. Looking forward to hearing from anyone interested! (Hat Tip: Tom Tango for the inspiration.)

UPDATE: I have set out the rules here. Please place your bets there, and NOT at any other site where you may have heard about the bet. Thanks!

Tuesday, January 15, 2008

The Edmonton Oilers: shooting for shootouts?

Here's an article, a couple of weeks old, from hockey author Jeff Z. Klein on the prevalence of regulation ties in the NHL.

Klein notes that since there is now an incentive to proceed to overtime – the teams split three points between them instead of two – teams are playing for the tie. His prime example is the Edmonton Oilers, who had 13 ties in their first 35 games – a huge number. Klein argues that the Oilers are even playing for the tie in overtime, hoping to proceed to the shootout, where they are 10-2. He says that none of the 13 ties was broken in overtime – the equivalent of 65 minutes of 0-0 hockey. (However, the Oilers did indeed lose in overtime on December 10.)

Since the article was published, the Oilers have three more ties in nine more games. Two ended with a sudden death goal, one in a shootout.

So in 46 games, the Oilers have 16 ties. They are 1-2 in OT, and 10-3 in shootouts.

Klein suggests removing the incentive by awarding three points for a regulation win, instead of two. But he quotes Anaheim's Brian Burke as calling that a "terrible idea."

But it seems to me that when teams are deliberately trying not to score because of incentives the league has offered them, something is definitely broke. And people are starting to talk about the problem. I'd be surprised if the NHL doesn't change the system sometime over the next few years.

Sunday, January 06, 2008

Some evidence of streakiness in tennis

Reisman started by trying to figure out if momentum means anything heading into the fifth game of a best-of-five series in volleyball. He found that it probably didn't. When team A won the first two games, and team B won the next two games, the team with momentum – B – won the fifth game less than half the time. When the first four games went ABBA, the team with the little bit of momentum – this time, A – did win 15 out of 23. But that wasn't statistically significant. Reisman's full results:

AABB: B won 15 out of 34

ABAB: B won 16 out of 33

ABBA: A won 15 out of 23

Total: the team with momentum won 46 out of 90, almost exactly what you'd expect by chance. (Interestingly, a commenter notes that in all three cases, the team that won game 1 actually wins the majority of game 5s.)

However, in tennis, the player with momentum won substantially more than half the fifth sets:

AABB: B won 188 out of 339 (p = .025)

ABBA: A won 186 out of 324 (p = .004)

ABAB: B won 156 out of 291 (p = .120)
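The p-values can be roughly checked with a few lines of code. The sketch below assumes they are one-sided exact binomial tests against a 50/50 null; that is my guess at the method, not something the study states here, and `one_sided_p` is a name invented for the example.

```python
from math import comb


def one_sided_p(wins: int, n: int) -> float:
    """P(X >= wins) when X ~ Binomial(n, 1/2), i.e. fifth sets are coin flips."""
    tail = sum(comb(n, k) for k in range(wins, n + 1))
    return tail / 2**n


# (fifth sets won by the player with momentum, total fifth sets)
tennis = {"AABB": (188, 339), "ABBA": (186, 324), "ABAB": (156, 291)}

for pattern, (wins, n) in tennis.items():
    print(f"{pattern}: {wins}/{n}, one-sided p = {one_sided_p(wins, n):.3f}")
```

Under that assumption the results should land close to the figures listed above; a two-sided test would roughly double them.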

One explanation for these results is that it's not really momentum that matters, but the conditions of the two players. If a player gets tired over the course of the match, that should affect the fourth and fifth sets more than the others, so the fourth set is predictive of what will happen in the fifth set. Or, if one player finally figures out a strategy for beating the other, again, that will affect the fourth and fifth sets more than the others.

These may not be as true for volleyball, because the various players on the team would tire at different rates, so the effect would be smaller. As well, a strategy that works against one player would not be as useful when there are six on the court.

Are NFL players more criminally-inclined than the general public?

There's a lot of talk about how many NFL players seem to get in trouble with the law. Do professional football players really get arrested more than the general population?

A 1999 study from Chance magazine (.pdf) ran the numbers to find out. The authors, Alfred Blumstein and Jeff Benedict, broke down crime statistics by sex, age, and race, and compared NFL players to the rest of their demographic cohort.

There were 264 arrests overall. Of those, 87 arrests were for assault. There were 7 for rape, 4 for kidnapping, 2 for homicide, 31 for property crimes, 35 for DUI, 15 for drugs, 16 for resisting arrest, and 40 for other "public safety offenses." Apparently the 31 arrests for property crimes is a very low number, probably because pro football players are pretty rich.

After some discussion, the authors limited their investigation to assault (which includes domestic violence). The results surprised me.

It turns out that for assault, the arrest rate for players is only about half what it is for the general population. For blacks, it was 46% of the norm; for whites, 47%.

An old "icing the kicker" study

"Icing the kicker" is the tactic of calling a timeout just before the opposition is about to attempt a crucial field goal. The idea is that the interruption upsets the rhythm and mental balance of the kicker, who now has an extra couple of minutes in which to envision screwing it up.

Barry and Wood looked at all "pressure kicks" (which would give the team a lead or tie with less than three minutes to go) in 2002 and 2003, normalizing them for weather, distance, and so forth. They found that icing the kicker did, in fact, work – reducing the chance of a successful kick from 76% down to 66%.

They say "the evidence is not overwhelming, but it is compelling."

The difference of 10 percentage points looks like a big effect, except that there were only 38 "icing" kicks overall. 29 of those kicks "should have" succeeded, but only 25 did. It's hard to believe that difference of only four kicks would be statistically significant.

Actually, the standard deviation for those 38 kicks is about 2.6 (the square root of 76% times (100 – 76)% times 38), so four kicks is only about 1.5 standard deviations away from expected. That's definitely not significant.
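Here is that back-of-the-envelope calculation as a short sketch. It treats each of the 38 kicks as an independent trial with the same 76% success chance, which is a simplification (the study adjusted for distance and weather).

```python
from math import sqrt

n_kicks = 38        # "iced" pressure kicks in the sample
p_success = 0.76    # success rate when the kicker is not iced
made = 25           # iced kicks that actually succeeded

expected = n_kicks * p_success                       # kicks that "should have" succeeded
sd = sqrt(n_kicks * p_success * (1 - p_success))     # binomial standard deviation
z = (made - expected) / sd                           # shortfall in standard deviations

print(f"expected {expected:.1f}, sd {sd:.2f}, z = {z:.2f}")
```

A shortfall of around one and a half standard deviations is the kind of thing that happens by chance fairly often.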

In fairness, the authors, who are statisticians, probably mentioned that in their article.

P.S. After writing this post, I found this, which points to more data from Stats, Inc. So I guess this is old news, but I wrote it, so I'm going to post it anyway.

Wednesday, January 02, 2008

A non-analysis analysis of NFL playoff trends

You know those systems that try to figure out what kinds of teams do well in the baseball playoffs? Bill James had a system that did that a couple of decades ago (details here), and I've seen a couple of others since. Well, now, Mike Sando, of espn.com, tries to do something similar for football. It would be a reasonable effort, except that he doesn't give us the comparisons we need to figure out if he's right.

Sando looks at eight bits of "conventional wisdom." For instance: does defense win championships? Sando looked at 2000-2006, and found the 30 NFL teams that gave up the fewest (regular-season) points per game, to see how they did in the playoffs. They won 38 playoff games.

So? Is that good? Does that mean that defensive teams do better in the playoffs than non-defensive teams? We can't tell. He doesn't tell us how many *losses* went with those 38 wins, and he doesn’t tell us how many games the *other* teams won!

I am boggled. How can you just look at the number of games won and reach any kind of conclusion?

I guess you can probably figure it out, to some extent ... if winning were random, then, of 30 teams, 15 would win their first playoff game. 7.5 of those teams would win a second game, and 3.75 of those would win a third game. Not all those 3.75 teams would have a fourth game, so let's make it just one extra win for the "fourth round" (which is the Super Bowl). The total: 15 plus 7.5 plus 3.75 plus 1 equals about 27.

Yup, 38 wins is better than the expected 27 wins.
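The estimate is easy to reproduce. This sketch uses the same crude assumptions: all 30 teams play a first playoff game, every game is a coin flip, and the "fourth round" is hand-waved as one extra win. The function name and parameters are just for illustration.

```python
def expected_random_wins(n_teams: int = 30, rounds: int = 3,
                         super_bowl_wins: float = 1.0) -> float:
    """Expected playoff wins if every game were a 50/50 coin flip."""
    total = 0.0
    still_alive = float(n_teams)
    for _ in range(rounds):
        still_alive /= 2      # half the remaining teams win each round
        total += still_alive  # each of those winners adds one win
    return total + super_bowl_wins


baseline = expected_random_wins()
print(f"expected wins by chance: {baseline}")
print(f"actual wins minus baseline: {38 - baseline}")
```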

But does that mean that defense wins championships? Maybe if you look at the teams with good *offenses*, they also win about 38 games. In that case, just being good wins championships. But Sando didn't think of checking that.

Here's another test, of whether "home-field advantage means everything." Well, of course it doesn’t mean *everything*, but is it more important in the playoffs? The way to answer that question, of course, would be to show us the HFA for the playoffs, and compare it to the HFA for the regular season. Of course, we don’t get that. What we *do* get is these two facts:

Road teams won three-fourths of wild-card playoff games following the 2004 and 2005 seasons;

Home teams won 20 of 28 divisional-round games since 2000.

So: why are we picking and choosing 2004-2005 in one case, but 2000-2006 for another? Why are they separated? And what's the home winning percentage in the regular season, so that we can compare?

Anyway, I'll stop here; there are actually eight questions, not just these two, but the other six are similarly exasperating. The article sounds like analysis, and it wants to be analysis, but what we get are anecdotes and (deliberate) selective sampling. It must have taken a fair bit of research to compile the data, but, alas, we learn almost nothing from it.