Sabermetric Research

Phil Birnbaum

Wednesday, June 27, 2007

"The Straight Dope" on home field advantage

"The Straight Dope," one of my favorite Q&A newspaper columns, takes on the question of home field advantage. Pseudonymous columnist Cecil Adams can't tell us what causes HFA, of course, but his response to the question is excellent.

Tuesday, June 26, 2007

How important is a hot goalie in the playoffs?

How important is a hot goalie in the NHL playoffs? Conventional wisdom says: very important. A study by Alan Ryder says: not as important as you might think.

Ryder collected shot quality data for the first 46 games of this year's playoffs. He looked at not just the raw shot numbers, but the type of shot and distance from the goal – for instance, a wrist shot from 50 feet out. From that data, Ryder calculated the expected number of goals each team "should have" scored from shots of that quality.

In the 46 games studied, Ryder found that the team that was expected to win actually finished with a record of 38-8. In those eight losses, goaltending likely made the difference. Examining those games more closely, Ryder concludes that four of those games were goalies blowing a win, so that leaves only four goalies getting hot and stealing the win. He writes,

" ... we have seen exactly four games stolen by goaltenders in 46 playoff games to date. Strong goaltending is critical to playoff success. And there has been some tremendous play in the blue paint this spring. But the conclusion here is obvious. With only four steals, goaltending is not running the results board."

Ryder's finding that the better team (in terms of shot quality) usually wins is important, but I'm not so sure that the influence of goalies is as minor as he implies. First, aren't the four games blown just as important as the four games stolen? A "hot" goalie is judged not just by having lots of great games, but also by having few poor ones.

Also, Ryder tells us that 8 games in 46 is about one game per series (although I'm not sure which series he used to add up to 46 games). One game per series is a lot, isn't it?

How much is one game worth, assuming two teams of equal ability?

If a team starts out 1-0, it has a 42/64 chance to go on to win the series. So if it starts out 0-1, its chance is 22/64.

Turning a loss into a win, then, is worth 20/64 of a series, or .3125. That's a lot, almost a third of a series!
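For anyone who wants to check the 42/64 figure, here is a minimal Python sketch. It assumes a best-of-seven between evenly matched teams with independent games. The function name `series_win_prob` is my own, made up for illustration.

```python
from fractions import Fraction
from functools import lru_cache


@lru_cache(maxsize=None)
def series_win_prob(wins_needed, opp_wins_needed, p=Fraction(1, 2)):
    """Chance a team wins the series when it still needs `wins_needed` wins
    and its opponent still needs `opp_wins_needed`, given per-game win
    chance p. Assumes independent games (an illustrative simplification)."""
    if wins_needed == 0:
        return Fraction(1)
    if opp_wins_needed == 0:
        return Fraction(0)
    return (p * series_win_prob(wins_needed - 1, opp_wins_needed, p)
            + (1 - p) * series_win_prob(wins_needed, opp_wins_needed - 1, p))


up_1_0 = series_win_prob(3, 4)    # 21/32, i.e. 42/64
down_0_1 = series_win_prob(4, 3)  # 11/32, i.e. 22/64
swing = up_1_0 - down_0_1         # 5/16 = 0.3125 of a series
print(up_1_0, down_0_1, float(swing))
```

Exact fractions are used so the result can be compared directly with the 42/64 and 22/64 in the text.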

And it's twice the value of an overtime goal (which only turns a tie into a win). Even if a team scores two overtime goals in a series, it's probably different players scoring them (and goals are a team effort anyway). So it does seem that a goalie having a hot game has a disproportionate influence on the series.

Having said that, it's probably not fair to credit (or blame) the goalie any time the "wrong" team wins in terms of shot quality. Not every 30-foot snap shot is the same, and it's very possible that the team with the lower number of expected goals actually had better chances. So perhaps the number of games where the goalie made a difference is overestimated by this method.

And, of course, a large part of goaltending (and batting, and pitching, and quarterbacking, and foul shooting) is random chance. If Dominik Hasek steals a game with some spectacular saves, it's probably mostly luck, rather than a real increase in his usual (high level of) skill. For instance, suppose the Sharks have four point-blank chances, each with a 50% chance of being a goal. By luck, Hasek stops three of them, and thus saves one expected goal. The Wings win 3-2 in overtime. The goalie is the hero, and it looks like Hasek's clutchness was critical, but it wasn't – he was just luckier than usual.
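To put a number on "luckier than usual," here is a rough sketch of the four-chance example. It assumes the four chances are independent 50/50 events, and `prob_at_least` is a hypothetical helper name.

```python
from math import comb


def prob_at_least(k, n, p):
    """Binomial chance of at least k successes in n independent tries."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k, n + 1))


save_chance = 0.5                       # each point-blank chance is 50/50
expected_goals = 4 * (1 - save_chance)  # 2.0 goals expected on four chances
goals_saved = expected_goals - 1        # he allowed one, so one goal saved
lucky_night = prob_at_least(3, 4, save_chance)  # stops 3 or 4 of the 4
print(goals_saved, lucky_night)
```

Under those assumptions, a goalie of unchanged skill stops at least three of the four about 31% of the time, so a "stolen" game of this kind needs no clutch ability at all.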

Unless, of course, you believe in "clutch" performance among goalies. I'm naturally skeptical, given the evidence against clutch performance in baseball.

Finally, here's one more argument.

In baseball analysis, the WPA approach (as calculated by Fangraphs and others) calculates the change in probability of winning after each event. For instance, if player X hits a three-run home run in the top of the ninth in a tie game, he might increase his team's chances of winning from (making these numbers up) 60% to 95%. You might be tempted to say that X made the difference in the game. But if the other team's player Y hits a grand slam in the ninth, he might have increased his team's chances from 40% to 100%.

In an up-and-down game, the sum of the probability changes might be very large – easily well over 100%. And so looking at just one play might give you an overestimate of how important that play was.

Same thing with the goalie. A goalie "steal" might be worth 31% of a series. But if you add up 15% for each overtime goal, and, in fact, all the probability changes for every goal, you might find that the sum of all the changes is 400% of a series for one team, and 300% of a series for the other team. In that light, 31% of a series for the goalie might be huge, but not *that* huge.

So there are several things to consider, and, overall, I don't know what to conclude about all this. My gut says that goaltending is important, but overrated because, with the goalie on the ice the entire game, his luck is exceptionally visible. All the other luck – offensive and defensive – is spread among the other players on the team, and is a combination of their efforts.

I guess I agree with Alan Ryder. Hot goaltending is important, but not as important as a first impression might suggest.

Thursday, June 21, 2007

Pitchers with lucky and unlucky W-L records

Another nice article in SABR's "Baseball Research Journal 35" figures out which were the luckiest and unluckiest starting pitchers in terms of career wins. It's called "Still Searching for Clutch Pitchers," by Bill Deane ("with statistics by Pete Palmer").

Deane and Palmer figured it this way: first, they counted all the runs the pitcher gave up in all his starts. Then, they figured how many runs his teams scored for him in the games he started, and pro-rated that figure to his innings pitched. They subtracted one from the other to come up with a run differential. Finally, they took a .500 record (with the same number of decisions), and added or subtracted wins based on that run differential. The result is the pitcher's expected W-L record.
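As a rough sketch of that procedure (not the authors' actual code), the calculation might look like this in Python. The pitcher's numbers are invented, and the flat 10 runs per win is a simplifying assumption; the article uses Palmer's formula instead.

```python
def expected_record(decisions, runs_allowed, prorated_support, runs_per_win=10.0):
    """Expected W-L: start from .500 over the same number of decisions,
    then add (or subtract) one win per `runs_per_win` runs of differential.
    `runs_per_win=10.0` is an illustrative default, not the article's value."""
    run_diff = prorated_support - runs_allowed
    expected_wins = decisions / 2 + run_diff / runs_per_win
    return expected_wins, decisions - expected_wins


# Hypothetical pitcher: 300 decisions, 1200 runs allowed,
# 1300 runs of support pro-rated to his innings pitched.
wins, losses = expected_record(300, 1200, 1300)
print(wins, losses)  # 160.0 140.0
```

If that pitcher actually went 170-130, the method would call him ten wins "lucky."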

Here are the luckiest pitchers according to Deane and Palmer. Remember, "luck" here does not include any above- or below-normal run support, which has been taken out of the picture.

The authors note that, over all the pitchers, the distribution of the discrepancies between actual and expected, in terms of standard deviations, is exactly as you would expect from the normal distribution. And so they conclude that the difference between expected wins and actual wins is presumably due to luck.

But they don't explicitly say that they're using the standard deviation of the binomial distribution (that is, the SD of coin tosses). From the article, all we can conclude is that the distribution of the discrepancies is normal, not that it exactly matches what would have resulted if the discrepancies were luck. (For instance, you might find that 2.5% of people have SAT scores more than 2 SDs above normal, just as expected, but that doesn't mean that SAT scores are just luck.)
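To make the distinction concrete: a test of "it's all luck" would measure each pitcher's discrepancy in units of the binomial SD, not in units of the observed spread. A minimal sketch, with invented numbers and a hypothetical function name:

```python
from math import sqrt


def luck_z_score(actual_wins, expected_wins, decisions):
    """Discrepancy in units of the binomial (coin-toss) SD, treating each
    decision as an independent trial won with chance expected_wins/decisions."""
    p = expected_wins / decisions
    binomial_sd = sqrt(decisions * p * (1 - p))
    return (actual_wins - expected_wins) / binomial_sd


# Hypothetical pitcher: expected 160-140, actually 170-130.
z = luck_z_score(170, 160, 300)
print(round(z, 2))
```

If the discrepancies are pure luck, these z-scores should have a standard deviation of about 1 across all pitchers. A normal shape alone doesn't tell you that.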

A couple of other minor quibbles:

How many runs does it take to turn a loss into a win? Palmer uses the formula

Runs per win = 10 * sqrt(average runs per inning for both teams)

The article bases "runs per inning for both teams" on the combined scoring of the team and its opponents *when that pitcher starts*. I don't think that's right: you have to base it on the *league* scoring. That's because the runs saved are off the league runs, so you have to start at the average.

Put it this way: suppose Roger Clemens gives up 3.5 runs a start, and the league average is 4.5. The league average of 4.5 means 10 runs per win. Clemens saved 1.0 runs *off the 4.5 level*, so you have to use the 4.5 level (not 3.5) to figure the runs per win.

Actually, the best way might be to integrate the function, or at least sum a few small slices. For instance:

The first 0.1 runs saved are at the rate of 1 win per 10 runs (using 4.5 + 4.5).
The next 0.1 runs saved are at the rate of 1 win per 9.94 runs (using 4.4 + 4.5).
The next 0.1 runs saved are at the rate of 1 win per 9.89 runs (using 4.3 + 4.5).
...
The last 0.1 runs saved are at the rate of 1 win per 9.49 runs (using 3.6 + 4.5).

The average of all the slices is the number you should use. But the article uses 9.49 runs per win for *all* the runs saved, not just the last slice, and therefore it's probably overestimating the "luck" by just a little bit. And, in fact, I've seen arguments that Palmer's formula doesn't actually work all that well anyway.
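Here is a quick sketch of the slice averaging. It assumes the square-root form of Palmer's formula (ten times the square root of combined runs per inning), which is the version that reproduces the 9.94, 9.89 and 9.49 figures above. It also assumes nine-inning games and uses the Clemens numbers from the example.

```python
import math


def runs_per_win(pitcher_runs, league_runs=4.5, innings=9):
    """Palmer-style runs per win, assuming the square-root form:
    10 * sqrt(combined runs per inning for both teams)."""
    return 10 * math.sqrt((pitcher_runs + league_runs) / innings)


# Ten slices of 0.1 runs each, from the 4.5 level down to the 3.6 level.
slices = [runs_per_win(4.5 - 0.1 * i) for i in range(10)]
average_rpw = sum(slices) / len(slices)

print([round(s, 2) for s in slices])
print(round(average_rpw, 2))
```

Under these assumptions the average slice comes out to roughly 9.75 runs per win, about a quarter of a run higher than the last-slice figure.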

But in any case, I don't think it's enough to affect the results much, which is why I call it a quibble.

One more minor point: if ten runs makes up a win for a team, does it necessarily follow that ten runs make up a win for a starter? Maybe due to leverage differences, it takes 12 runs for a starter win, but only 8 runs for a reliever win. Again, I'm not sure that's true, and even if it is true, it's probably minor anyway.

The authors do check the overall results, and find that the average "luck" is one win more than expected. "This could be," they write, "because those who are 'lucky' in the win column are more likely to get 200 decisions [which was the study cutoff]."

Sounds right to me. Overall, I think the results of this study are probably pretty accurate.

Wednesday, June 20, 2007

Bill James' job

Here, from the Wall Street Journal, is another one of those profiles of Bill James, and how he now works for the Red Sox after a quarter-century of writing books. The article's first line:

"After 25 years on the outside, Bill James was invited to take a seat at the center of the baseball universe."

When I read that line, it struck me as somehow wrong. Bill James, it seems to me, was at the "center of the baseball universe" when he was discovering important things about baseball, writing about them brilliantly, and sharing them with hundreds of thousands of rabid fans. Now, he discovers smaller, less-important things, doesn't write about them, and shares them only with the Red Sox front office. He's obviously less influential now than he was then.

Everyone's personality and taste is different, but, speaking for myself, I'd rather have Bill James' old job than his new one. What fun is discovering new things if you can't tell anyone?

Monday, June 11, 2007

Do the Yankees lose money? Not if you include YES

Forbes magazine recently came out with profit/loss numbers for every major league team. The biggest money-losing team -- in fact, the *only* money-losing team -- was the New York Yankees, who, according to Forbes, lost $25.2 million last year.

But now, an article in New York Magazine claims that while the Yankees did lose money -- $28 million, which is pretty similar to Forbes' estimate -- their cable network, YES, was quite profitable. YES broadcasts Yankees' games to a large New York viewership, which would explain why it rakes in the cash. The article says:

"The Yankees—read Steinbrenner—also own more than a third of the YES network … The network's revenues top a quarter billion and its profit margin is 60 percent."

One-third of 60% of a quarter billion is $50 million. So it seems like if you include YES in the calculation, the Yankees are actually quite profitable.

The New York article wants to treat the Yankees and YES as two separate businesses for purposes of the calculation, but that's silly. If the team's contract with YES were an arm's-length transaction, the Yankees would have demanded enough in royalty fees that YES would be making a lot less than $150 million.

By the way, this Wikipedia article implies that the Yankees and YES are each fully owned by the same holding company. If that's true, it follows that the Yankees effectively receive the benefit of all the YES profit, not just one-third, and the team's total earnings are around $125 million. I don't know who's right, if it's 100% or one-third.
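The arithmetic behind both scenarios, as a small sketch. Dollar figures are in millions and come from the numbers quoted above; `combined_earnings` is a hypothetical helper, and I use the Forbes loss figure for the team.

```python
# All dollar figures in millions, taken from the numbers quoted above.
yes_revenue = 250      # "a quarter billion"
yes_margin_pct = 60    # "profit margin is 60 percent"
team_loss = 25.2       # Forbes estimate (New York Magazine says 28)

yes_profit = yes_revenue * yes_margin_pct / 100  # 150.0


def combined_earnings(ownership_share):
    """Owner's share of YES profit, minus the team's operating loss."""
    return yes_profit * ownership_share - team_loss


one_third_case = combined_earnings(1 / 3)  # roughly 24.8
full_case = combined_earnings(1.0)         # roughly 124.8
print(round(one_third_case, 1), round(full_case, 1))
```

Either way the combined operation is in the black; the only question is by how much.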

In either case, this means that the accounting in the Forbes article wasn't smart enough to take into account teams' related businesses. As a result, we know less than we thought about how profitable MLB teams are.

I had previously argued that teams are much less profitable than you'd expect from their market value, and the reason is that owners are willing to sacrifice profit for the glory of owning a team. I still think that's true, but if Forbes really hasn't done a full accounting, the evidence is weaker than it appeared.

Thursday, June 07, 2007

A casino falls for the "hot hand" fallacy

We sabermetricians like to criticize sportswriters for failing to understand luck. A guy goes on a 3-for-25, and the press will be all over him wondering what's wrong with his swing and whether he should be benched. A team wins seven in a row, and we make fun of the announcer who says they're "hot" and they have "momentum" and that makes them harder to beat.

But that's the press. You'd think people who make profits by understanding how randomness works -- casinos, say -- would be immune from this kind of thinking. Which is what makes this story so bizarre.

A high roller at Harrah's got lucky and hit a few jackpots on a video poker machine in Las Vegas. Next thing you know, Harrah's is banning him from all their properties.

Are they kidding? Yes, there is some skill to video poker, but even if played perfectly, the advantage is still with the house. Not only that, but this particular high roller says that he has already lost back 80% of his winnings.

Our innate bias, the one that makes us want to attribute events to something other than random chance, must be pretty strong. Even casinos can't resist the temptation, even when they're shooting themselves in the wallet.

Sunday, June 03, 2007

Winning the world series in X games

As a SABR member, I got my copy of "The Baseball Research Journal 35" in the mail this week. It's got a few sabermetric articles in it, and I've so far only had a chance to take a look at one of them. It's called "Relative Team Strength in the World Series," by Alexander E. Cassuto and Franklin Lowenthal.

(Disclosure: when articles for the book were being selected, I was asked to briefly give the editor my opinion on some of them, and so I am listed as a "peer reviewer and designated reader." Also, one of my own articles appears in the book.)

Cassuto and Lowenthal (C&L) start by figuring the chance that the World Series will go a certain number of games, assuming that the teams are exactly equal in strength:

A 4-game series has probability .125;
A 5-game series has probability .25;
A 6-game series has probability .3125;
A 7-game series has probability .3125.

Note that the 6- and 7-game chances are equal. That's because any series that goes at least six games will necessarily be 3-2 after the first five. Then, there's a 50% chance it'll go six (if the "3" team wins), and a 50% chance it'll go seven (if the "2" team wins).
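These probabilities are easy to reproduce. The sketch below assumes independent games with a fixed per-game win chance p; the function name `series_length_probs` is mine, not C&L's.

```python
from math import comb


def series_length_probs(p):
    """Probability a best-of-seven lasts exactly n games (n = 4..7),
    assuming independent games that one team wins with chance p.
    The eventual winner takes game n plus exactly 3 of the first n-1."""
    q = 1 - p
    return {n: comb(n - 1, 3) * (p**4 * q**(n - 4) + q**4 * p**(n - 4))
            for n in range(4, 8)}


equal_teams = series_length_probs(0.5)
print(equal_teams)  # {4: 0.125, 5: 0.25, 6: 0.3125, 7: 0.3125}
```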

C&L then compare the distribution of real-life World Series to the theoretical one:

4 games: 19 actual, 12.1250 predicted;
5 games: 21 actual, 24.2500 predicted;
6 games: 22 actual, 30.3125 predicted;
7 games: 35 actual, 30.3125 predicted.

It does seem like the theoretical model isn't very accurate, but that's probably because teams aren't really equal in strength. Using a Chi-Squared test, the authors found that the "teams are equal" hypothesis can't be rejected at the .05 level, but *can* be rejected at the .10 level.
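Here is a sketch of what that test might look like. I'm assuming a plain goodness-of-fit test with 3 degrees of freedom (four categories minus one); the article may have set it up differently. The critical values are the standard table entries, hardcoded rather than computed.

```python
observed = [19, 21, 22, 35]                   # Series lasting 4, 5, 6, 7 games
equal_probs = [0.125, 0.25, 0.3125, 0.3125]   # equal-strength model

total = sum(observed)                         # 97 Series in all
expected = [total * pr for pr in equal_probs]
chi_sq = sum((o - e) ** 2 / e for o, e in zip(observed, expected))

# Standard chi-squared critical values for 3 degrees of freedom.
CRITICAL_05 = 7.815
CRITICAL_10 = 6.251

reject_at_05 = chi_sq > CRITICAL_05
reject_at_10 = chi_sq > CRITICAL_10
print(round(chi_sq, 2), reject_at_05, reject_at_10)
```

The statistic lands at about 7.34, between the two critical values, which matches the authors' "reject at .10 but not at .05" result.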

The authors then turned to the question of just how unequal the teams would have to be to lead to the observed results. They find that if, in every World Series, the stronger team overall had a 51.38% chance of winning every individual game, that would account for the results. That's a lot more balanced than I would have expected.

However, even with these results, there are many more game 7s than predicted. Why is that? The authors argue that

"The team that is behind must play to win games at all costs; thus it may change its rotation to start its best pitcher, use its best reliever for more innings than usual, etc., while the team that is ahead will formulate its strategy to win one more game, not necessarily the sixth game."

I wonder if there are more likely explanations, such as home field advantage. The team behind 2-3 is more likely to be at home for game 6. (The authors mention home field advantage, but not in this context.)

C&L also briefly discuss the economics of long and short series. They note that, assuming two equal teams, the average series length is 5.81 games. As the teams become more unequal, the expected length decreases. However, it decreases slowly: if one team is an expected .550 against the other, the expected games becomes 5.78, which is hardly a drop at all. A .600 team vs. a .400 team results in an average 5.7 game series, still a small drop.
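The expected lengths can be checked with a few lines of Python. As before, this sketch assumes independent games with a constant per-game probability, and the function names are my own.

```python
from math import comb


def series_length_probs(p):
    """Probability a best-of-seven lasts exactly n games (n = 4..7)."""
    q = 1 - p
    return {n: comb(n - 1, 3) * (p**4 * q**(n - 4) + q**4 * p**(n - 4))
            for n in range(4, 8)}


def expected_length(p):
    """Average number of games when one team wins each game with chance p."""
    return sum(n * prob for n, prob in series_length_probs(p).items())


for p in (0.50, 0.55, 0.60):
    print(p, round(expected_length(p), 2))
```

The three cases come out at about 5.81, 5.78 and 5.70 games, in line with the figures above.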

In that context, the authors write,

"[Television] contracts that do not consider the probability of less than seven game series are likely to cause marginal economic losses to the networks and windfall gains to MLB."

Which is true, but I doubt that any networks are naive enough to believe that every series runs seven games. And I'm sure they have qualified staff on-hand to do some rough calculations before submitting a bid.

The part of the article I found most interesting was where the authors figure out the level of imbalance that maximizes the chance of the series going X games.

For a 4-game series, you want one team to have a 100% chance of winning each game. For a 7-game series, you want the teams as balanced as possible: 50% each, which minimizes the probability that one of the teams will win earlier.

What about a 5-game series? The authors calculate the optimal level is .789/.211. This makes intuitive sense; in a 5-game series, the winning team will finish with a record of .800, and .789 is close to that. It turns out that if you do have one team being .789, the series will go five games 33.33% of the time.
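A simple grid search confirms the .789 figure. This is a brute-force sketch of my own, not how the authors derived it, and it again assumes independent games.

```python
def prob_five_games(p):
    """Chance a best-of-seven ends in exactly five games: the winner takes
    game 5 and exactly 3 of the first 4 (4 ways), for either team."""
    q = 1 - p
    return 4 * (p**4 * q + q**4 * p)


# Try every stronger-team win probability from .500 to 1.000 in steps of .001.
grid = [i / 1000 for i in range(500, 1001)]
best_p = max(grid, key=prob_five_games)
best_prob = prob_five_games(best_p)
print(best_p, round(best_prob, 4))
```

The search settles on .789, where the five-game chance is about one in three.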

Which leaves a 6-game series. How should you imbalance the teams to get the best chance of a 6-gamer? The answer surprised me – you might want to think about it before reading on.

Ready?

The answer the authors give is 50%. The chance of a 4-2 series is maximized when both teams are of equal strength.

This can be proven mathematically, but is there a good intuitive explanation? The only one I can think of is that a six-game series must start out 3-2. The best chance of 3-2 is when the teams are equal. But then you still only have a 50% chance of the "3" team winning game 6. Suppose you make one team stronger. Then it will be less likely that the series will start 3-2, but more likely that the leading team will win game 6. Off the top of my head, it would seem possible that you could trade a small reduction in 3-2 chance for a larger increase in game-6 wins. But the math says no.
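For what it's worth, here is a numerical check, along with one way to see the algebra (my own rearrangement, not taken from the article). If p and q are the two teams' per-game chances, the six-game probability is 10 * p^2 * q^2 * (p^2 + q^2). Writing x = p*q, that becomes 10 * x^2 * (1 - 2x), which keeps rising as x rises over the possible range, and x = p*q is largest when p = q = .5.

```python
def prob_six_games(p):
    """Chance a best-of-seven ends in exactly six games: the winner takes
    game 6 and exactly 3 of the first 5 (10 ways), for either team."""
    q = 1 - p
    return 10 * (p**4 * q**2 + q**4 * p**2)


# Try every stronger-team win probability from .500 to 1.000 in steps of .001.
grid = [i / 1000 for i in range(500, 1001)]
best_p = max(grid, key=prob_six_games)
print(best_p, prob_six_games(best_p))   # 0.5 0.3125

# Making one team slightly stronger lowers the six-game chance.
print(round(prob_six_games(0.55), 4))
```

That is still algebra rather than intuition, so the question stands.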

Can anyone come up with an easy intuitive explanation of why having two equal teams maximizes the number of 6-game series?