Sabermetric Research

Phil Birnbaum

Friday, January 25, 2013

How much luck in a shortened NHL season?

Last post, I showed examples of the typical amount of luck in an 82-game NHL season. For this year, though, we're interested in a 48-game season instead. We'd expect significantly more luck with the shorter schedule.

It's hard for me to simulate a 48-game season, because I don't have a schedule handy. So I took a shortcut.

For a shorter season, the typical r-squared between standings and talent should be about .4. So, what I did was, I ran the "regular" simulation, and I pulled out all the occurrences where the r-squared was between .35 and .45. That should give us some "typical" 48-game seasons. (How do we get .4? Well, we know, from the previous post, that the SD of talent in the NHL is 8.95 points per 82 games, which is 5.24 points per 48 games. And, we can calculate that the SD of luck in 48 games is 6.45 points.

Therefore, the typical r-squared between standings and talent should be about 5.24 squared divided by (5.24 squared + 6.45 squared). That's around .4. It might really have to be slightly less, taking into account that real-life has no inter-conference games, but I'm going to ignore that.)
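
Here's that arithmetic as a minimal Python sketch. The function and constant names are made up for illustration, and the single-game luck variance of .8680 is borrowed from the luck-vs-talent post; the sketch assumes talent scales with the number of games and luck with its square root.

```python
import math

SD_TALENT_82 = 8.95          # SD of talent, in points per 82 games
VAR_LUCK_PER_GAME = 0.8680   # single-game variance of points for an average team

def talent_share(games):
    """Expected r-squared between standings and talent over `games` games."""
    sd_talent = SD_TALENT_82 * games / 82            # talent scales with games
    sd_luck = math.sqrt(VAR_LUCK_PER_GAME * games)   # luck scales with sqrt(games)
    return sd_talent ** 2 / (sd_talent ** 2 + sd_luck ** 2)

print(round(talent_share(48), 2))  # 0.4
print(round(talent_share(82), 2))  # 0.53
```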

In the West, the third- and fourth-best teams missed the playoffs. A below-average team finished third. And Minnesota, a bad team, squeaked into the last playoff spot. In the East, third-worst New Jersey tied the Rangers for fourth, and Buffalo was the only good team to miss out.

Maybe we can do this in kind of a summary way. Let's create "anomaly points", to count up the number of surprises. It'll be one point for a +10 or more team missing the playoffs, or a -10 team making the playoffs, which is a "small" surprise. Two points for "medium" surprise of +20 and -20. Three points for +30 and -30, and so on.

(Hopefully, when I say "points," you'll be able to tell by context whether it's anomaly points or standings points.)
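
The scoring rule can be sketched in a few lines of Python. This assumes talent is the goal-differential figure shown in parentheses in the standings, and that each full 10 goals is worth one anomaly point; the function name is hypothetical.

```python
def anomaly_points(talent, made_playoffs):
    """Anomaly points for one team.

    talent: assumed talent as an integer goal differential (e.g. +13 or -21).
    made_playoffs: True if the team made the playoffs.
    """
    if talent >= 10 and not made_playoffs:
        return talent // 10        # good team missed: +10 -> 1, +20 -> 2, ...
    if talent <= -10 and made_playoffs:
        return (-talent) // 10     # bad team got in: -10 -> 1, -20 -> 2, ...
    return 0

# Talent figures as quoted in the simulation-results post
print(anomaly_points(-21, True))    # Minnesota makes the playoffs: 2
print(anomaly_points(+13, False))   # Detroit misses: 1
print(anomaly_points(-31, True))    # New Jersey makes it: 3
print(anomaly_points(+61, True))    # a great team makes it: 0
```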

By this scale, we have 2 anomaly points for Minnesota making the playoffs, 2 for each of Nashville and Chicago missing out, and 1 for Detroit's failure. In the East, we have 3 points for New Jersey's unexpected success, and that's all. So, 7 points in the West and 3 points in the East, for a total of 10 points.

In the West, only 2 points (Calgary and Detroit missing the playoffs). In the East, though, 7 points: the Islanders and Atlanta somehow made it in, while Pittsburgh finished well out of the playoff picture. Total: 9 points.

I'll just give you summaries from here on. These are the next few random seasons I found with 55 to 65 percent luck:

-- 15 anomaly points. Edmonton, the worst team in the West, finished eighth, ahead of Detroit and Chicago. New Jersey finished seventh in the East, ahead of Pittsburgh and the Rangers.

-- 5 points. Nashville was 12th, Chicago 15th, and Washington 9th.

-- 13 points. Only Chicago in the West (they finished 12th). But in the East ... Boston, the best team, finished 10th, two points out of the playoffs. The Rangers were 12th. But the Devils and Islanders made it in.

-- 5 points. Atlanta makes it in by a point; the Rangers finish second last. In the West, there were zero points ... Vancouver, by far the best team, finished only fourth, but we're not counting that even though it's very unlucky.

-- 12 points. Columbus swapped with Nashville, and New Jersey swapped with Boston.

-- 5 points. New Jersey (8th) finished three points ahead of Pittsburgh (9th). So, this one almost had zero!

-- 6 points. Below-average Carolina (at -2) finished first in the East, five points ahead of Boston ... maybe we should give that some points, but I won't. San Jose missed, while the Leafs and Islanders were the surprises in the East.

Taking an average of all these ... it comes out to a bit less than 10 points. That's around five "medium" surprises (like Toronto or Minnesota making the playoffs). Or, if you like, one "medium", one "large", and one "extra extra large" -- say, Chicago, Columbus, and Edmonton.

-------

To see how the shorter season compares, I ran the points for a hundred normal, 82-game seasons. The average was only 5.8 points (with an SD of 3).

The more luck, of course, the more anomalies. Here's how the anomaly points vary by the r-squared of the individual seasons:

Since 48-game seasons are around 60% luck, a typical short season will come from the top group, and so we should expect around 10.5 points. That's pretty close to the just-about-10 we got in the ones I described earlier.

So: perhaps we can say, a typical 48-game season has around 70 percent more surprises than a typical 82-game season (roughly 10 divided by roughly 6).

That's almost exactly equal to the difference in games (eighty-two games is 71 percent more than forty-eight games). I think that's just a coincidence, but I'm not certain.

-------

One thing I want to clarify here: the short seasons we looked at are very "average", close to the center of the distribution (which is 60% luck). There will be seasons with a lot more luck, and there will be seasons with a lot less luck, that go very much according to talent. In the seasons we looked at, we found that *typical* seasons have around ten points, with a range of 5 to 16. However: overall, there will be a much wider range.

An 82-game season this extreme should happen around one or two times in a hundred. But, in a 48-game season, it'd be a lot less rare. Remember, the average r-squared in a short season is .4, and this one isn't that much worse, at .34.

So, this year in the NHL, we should expect "10 points" worth of surprise teams, but it could, with a little more luck, be as many as 15 or 20 points.

Wednesday, January 23, 2013

NHL and luck: simulation results

Last post, I calculated that in an 82-game NHL season, luck is almost as important as talent. Talent accounted for 53 percent of team variance in the standings, while luck accounted for the other 47 percent.

That makes it seem like the standings are hugely unpredictable. But, that's not really the case. Fifty-three percent is still enough for the best teams to come to the top, for the most part.

I ran my NHL season simulator a bunch of times, using the 2010-11 season. I'll show you one of the seasons that came out with typical randomness ... in this case, 48 percent luck.

Teams are ranked in order of simulated points. Assumed talent (goal differential, regressed 22 percent to the mean) is in parentheses. ("Assumed talent" is based only on the raw stats. I'm pretty sure the Canucks aren't really a true +61 -- based on their seasons before and after -- but I'm not making any team-specific adjustments for that.)

Generally, the good teams did well and the bad teams did poorly. But, there were a few exceptions that contributed to the 47 percent luck.

In the East, the Islanders and Penguins basically swapped positions. Pittsburgh had a horrible simulated year, finishing next to last despite actually being the fourth best team. Other than that ... the East finished very much as expected: all the plus teams making the playoffs, and all the minus teams missing out.

In the West, the Sharks missed the playoffs despite being the second-best team. Effectively, they switched with the Ducks, who wound up in fifth place despite being worse than average. Also of note: the Blues wound up a few spots north of where they should have been.

Other than those five teams ... nothing really that notable. But you can imagine how the big stories of the year would be the Penguins' collapse and the Islanders' resurgence.

--------

That's pretty typical for what you get with around 50 percent luck. If you want to get a better feel, here's a few more. I won't give you the whole record, but I'll just tell you what happened. (If you want some more sets of random season standings in full, e-mail me and I'll run a bunch off for you.)

1. The simulation showed 45% luck. The Anaheim Ducks (-3) finished fourth, but the Red Wings (+13) finished tenth and out of the playoffs. In the East, Tampa Bay (+3), Montreal (+5), and Buffalo (+9) took the first three spots. The best teams took four of the next five, except that the Leafs (-23) took the eighth spot. Washington (+19) finished ninth.

2. Luck was 44%. The Kings (+12) missed the playoffs in favor of the Wild (-21), who finished seventh in the West. In the East, two bad teams, the Leafs (-23) and Panthers (-25), finished fifth and sixth, respectively, while Pittsburgh (+25) missed the playoffs by two points.

3. Luck was 47%. In the West, nothing too serious ... Anaheim (-3) and Minnesota (-21) took the last two playoff spots from the Flames (+11) and Kings (+12), who were right behind them. But in the East ... the top three were the Sabres (+9), Hurricanes (-2) and Habs (+5). The Islanders (-26) tied with Boston (+46) and Philadelphia (+33), for 5/6/7th. The Capitals (+19) missed the playoffs by a point, and the Penguins (+25) by nine points.

4. This one was 52% luck ... but, strangely, it wasn't that big a deal. In the West all the plus teams made the playoffs except the Sharks (+27). In the East, six of the top seven made the playoffs (Buffalo, +9, was the odd man out); the surprises were the Devils (-31) and Islanders (-26) snatching seventh and eighth. Toronto (-23) also surprised, missing the playoffs by only two points. I guess what happened here was that the randomness was mostly *within* the playoff and non-playoff groups, instead of *between* them.

--------

Those were all typical seasons, with around 45% luck. But some seasons had more, and some seasons had less. Here's a summary of all one hundred. Explanation follows the chart.

Reading across the bottom line of the chart as an example: we assumed the Ottawa Senators had a talent level corresponding to a regulation goal differential of -41. On average, that team wound up with 77 points in the simulated standings. It finished, on average, 12.8th in its conference. It finished first zero times in 100 chances, and made the playoffs six times. The highest position it ever reached was fifth place.

The chart looks pretty reasonable, overall; the outcomes corresponded to the talent. But, for single seasons, we see that strange things happened. The New York Islanders, at -26, finished first twice, just by luck. The Washington Capitals, a good team, should have finished first a lot more than 4 times. And it looks like Colorado had miserable luck; they never finished higher than tenth, while Edmonton, which was actually two goals worse, made the playoffs three times and finished sixth at least once.

Also, it looks like the Canucks, by far the best team in the league, missed the playoffs once just by random chance.

(By the way, I ran the simulation again, just to make sure it wasn't a programming error; the Washington and Colorado outliers disappeared. Also, the Canucks made the playoffs all 100 times.)

You can get a feel for the number of anomalies in the average season, just by adding up some numbers. Of course, it depends how you define "anomaly". Suppose we say an anomaly occurs in a conference when:

-- one of the best three teams misses the playoffs.

-- one of the worst three teams makes the playoffs.

-- the first place team is not in the top five in talent.

Those frequencies, respectively, were 50, 38, and 23. So we have 111 anomalies total, 1.11 per season. That's an average; some seasons have more than one, some seasons have none. But, typically, there will be one big regular-season surprise that's completely due to luck.

---------

There were still a couple of things I wanted to check. One is, in one hundred seasons, what does the weirdest, most luck-filled one look like? Another one is, what's the typical talent difference between, say, sixth place and tenth -- that is, the five teams fighting for the last three playoff spots? My gut says there won't be much ... but I haven't checked yet.

Monday, January 21, 2013

Luck vs. talent in the NHL standings

How much of the NHL standings is luck?

A few years ago, Tom Tango found that 36 NHL games was the point where luck and talent were equally important. But that was based on older data, so I thought I'd revisit the question. I looked at the years 2006-07 to 2011-12 -- a total of 150 team-seasons.

Following the usual method for breaking performance down into talent and luck, we start by figuring out the expected standard deviation due to randomness alone. That's a little harder than usual, because the NHL doesn't just have wins and losses; it also has pity points for overtime or shootout losses.

For each 82 team-games, there were 9.68 overtime [or shootout] wins, and 9.68 overtime losses ("regulation ties"). That means the average team has 41 games where they earn two points, 9.68 games where they earn one point, and 31.32 games where they get nothing.

The variance of single-game points is E(points squared) - [E(points)] squared. That's .8680 for the average team. Multiplying by 82 for a season, then taking the square root, and we find that the season SD of luck is 8.44 points.
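
The calculation can be checked with a short Python sketch. It just restates the numbers above; the variable names are illustrative.

```python
import math

GAMES = 82
OT_LOSSES = 9.68       # games per season worth one point
WINS = GAMES / 2       # 41 games worth two points (the other 31.32 are worth zero)

mean_points = (2 * WINS + 1 * OT_LOSSES) / GAMES   # E(points)
mean_sq = (4 * WINS + 1 * OT_LOSSES) / GAMES       # E(points squared)
variance = mean_sq - mean_points ** 2              # single-game variance
season_sd = math.sqrt(variance * GAMES)            # SD of luck over 82 games

print(round(variance, 4))    # 0.868
print(round(season_sd, 2))   # 8.44
```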

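In the five seasons in question, the observed SD of team points was 12.3. Since the observed variance is the sum of the talent variance and the luck variance, the SD of talent works out to the square root of (12.3 squared minus 8.44 squared), which is 8.95 points.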

So, over an 82-game NHL season, talent is only barely more important than luck -- an SD of 8.95, vs. 8.44.

Where do talent and luck converge? At 73 games. At that point in the season, SD of talent is 8.0 points (per 73 games), and the SD of luck is also 8.0 points.
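
Here's a minimal sketch of both steps, assuming that observed variance is simply talent variance plus luck variance, that the SD of talent grows in proportion to games played, and that the SD of luck grows with the square root of games played.

```python
import math

SD_OBSERVED = 12.3   # observed SD of team points, 82 games
SD_LUCK = 8.44       # SD of luck, 82 games

# Assumes observed variance = talent variance + luck variance
sd_talent = math.sqrt(SD_OBSERVED ** 2 - SD_LUCK ** 2)

# Set sd_talent * (n / 82) equal to SD_LUCK * sqrt(n / 82) and solve for n
break_even = 82 * (SD_LUCK / sd_talent) ** 2

print(round(sd_talent, 2))   # 8.95
print(round(break_even))     # 73
```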

------

Why did I get 73 games, while Tango got 36? Well, Tango's dataset had an observed SD of .2 points per game. My dataset has .15. That probably accounts for most of the effect.

Why the difference? I'm not sure, but I have a couple of ideas.

First, NHL rules have changed since Tango's study. Back then, some games ended in ties, where each team got one point. That doesn't happen any more -- now, one of the two teams wins the shootout and gets that extra point. Perhaps more bonus points tend to reduce the advantage of being talented.

Second, teams may be playing for regulation ties more than they used to. And, third, there might be more overall parity in the league, for all I know.

------

After I did these calculations, I wasn't sure that they really captured everything. I won't tell you what else I thought might be going on, because ... well, I wrote a simulation to check, and it turned out I was wrong. The simulation matched the theory.

I'll tell you about the simulation anyway, since it's already done.

For each of the 150 teams in the sample, I took their regulation goals scored and goals against, and regressed them 30 percent to the mean (so a team that was 30 goals above average became 21 goals above average). Then, I simulated all five NHL seasons, 100 times each, using the actual game schedule.

For each game, I calculated the two teams' expected goals scored by combining their (regressed) figures. So if Boston's offense was 10 percent above average, and their opponent's defense was 10 percent below average, I'd have the Bruins pegged to score 21 percent more goals than the mean.
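
As a sketch of those two steps in Python: the function names are hypothetical, and the multiplicative way of combining offense and defense is an assumption, though it's what the 21 percent example implies (1.10 times 1.10).

```python
def regress(goals_vs_average, amount=0.30):
    """Pull a goal total toward the league mean by `amount` (0.30 = 30 percent)."""
    return goals_vs_average * (1 - amount)

def expected_goals(league_mean, offense_factor, opp_defense_factor):
    """Expected goals for one team in one game.

    offense_factor: 1.10 means the offense scores 10 percent more than average.
    opp_defense_factor: 1.10 means the opposing defense allows 10 percent more.
    """
    return league_mean * offense_factor * opp_defense_factor

print(round(regress(30), 1))                       # 21.0
print(round(expected_goals(1.0, 1.10, 1.10), 2))   # 1.21
```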

Then I went to the random number generator. For each team, I randomly assigned a score, based on a Poisson distribution for its expected number of goals scored. If the two teams wound up equal, I simulated OT. Half the time, I simulated the game being won in OT, the winner more likely to be the team with more goals. The other half, the simulated game was won in a shootout, each team with a 50/50 chance.
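
A single simulated game might look something like the sketch below. It isn't the actual simulator: the Poisson sampler is Knuth's textbook method, and choosing the overtime winner in proportion to expected goals is an assumption standing in for "more likely to be the team with more goals".

```python
import math
import random

def poisson(mean, rng):
    """Draw a Poisson-distributed integer (Knuth's multiplication method)."""
    limit = math.exp(-mean)
    count, product = 0, rng.random()
    while product > limit:
        count += 1
        product *= rng.random()
    return count

def simulate_game(exp_a, exp_b, rng):
    """Return standings points (team A, team B) for one simulated game."""
    goals_a, goals_b = poisson(exp_a, rng), poisson(exp_b, rng)
    if goals_a != goals_b:
        return (2, 0) if goals_a > goals_b else (0, 2)
    # Regulation tie: half the time decided in overtime, half in a shootout.
    if rng.random() < 0.5:
        p_a = exp_a / (exp_a + exp_b)   # assumption: OT favors the stronger offense
    else:
        p_a = 0.5                       # shootout is a coin flip
    return (2, 1) if rng.random() < p_a else (1, 2)

rng = random.Random(2013)
results = [simulate_game(3.0, 2.5, rng) for _ in range(10000)]
print(sum(a for a, _ in results) / len(results))   # average points per game, team A
print(sum(b for _, b in results) / len(results))   # average points per game, team B
```

The loser of a tied game still gets its one point, which is where the extra standings points come from.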

I ran that simulation, and it didn't work.

The standings were too homogeneous. I had put in too much regression to the mean. After some experimenting, I found that 22 percent worked best.

Also, the simulation had too few standings points. That's because real life has more regulation ties than Poisson predicts (as we already knew). So, I converted 20 percent of one-goal games to regulation ties (by randomly adding or subtracting one goal).

As an attempt at having the scores a bit more realistic, I added an empty-net goal to 20 percent of one-goal games. To balance, I subtracted one winning team's goal from half the lopsided games (4+ goal differential). These two changes didn't affect who won: only by how much.

After these changes, the simulation pretty much matched reality. Specifically, for one arbitrarily chosen run of my 100-fold simulation:

-- In real life, the average was 221.0 goals per team. In the simulation, it was 220.6.

-- In real life, the SDs of goals and goals allowed were 22.7 and 24.0, respectively. In the simulation, 23.9 and 24.5.

-- In real life, the average team scored 91.68 points (which means 9.68 overtime losses per season). In the simulation, it was 91.71 (9.71).

-- In real life, the SD of team points -- which is the most important thing for analyzing season luck -- was 12.30. In the simulation, it was 12.33.

------

So, in the simulation, how much of the standings turned out to be luck, and how much was talent? It turned out that the r-squared between talent (goal differential) and standings points was .53. That means there were .53 units of talent squared per .47 units of luck squared, a ratio of 1.13.

In real life, we found 8.95-squared units of talent squared per 8.44-squared units of luck squared. That ratio was 1.12.

Pretty good match.
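
The two ratios are quick to check. In this sketch, 8.95 is the talent SD from the recap below and 8.44 is the luck SD.

```python
r_squared = 0.53
sim_ratio = r_squared / (1 - r_squared)   # talent variance per unit of luck variance
real_ratio = (8.95 / 8.44) ** 2           # same ratio from the real-life SDs

print(round(sim_ratio, 2))    # 1.13
print(round(real_ratio, 2))   # 1.12
```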

------

To recap, here's what I learned from all this:

1. For an 82-game NHL season like the last five, the SD of luck is 8.44 standings points.

2. The overall SD, which you can easily calculate from the official standings, is 12.3 points.

3. Therefore, the SD of team talent is 8.95 points.

4. That means the r-squared of talent vs. results is around .53.

5. From that, it follows that it takes 73 games until talent is as important as luck in predicting the standings.

6. Or, put another way: over an entire season, talent is more important than luck, but not by much.

And, if you trust the simulation is close to reality, we can add:

7. To estimate team talent, you can perhaps take its season goal differential and regress 22 percent to the mean.

------

In future posts, I'll use the simulation to do funner stuff, like figure out the probability that the number one seed is actually better than the number eight seed, and things like that.

Monday, January 14, 2013

Chess and luck

But what about games of pure mental performance, like chess? Is there luck involved in chess? Can you win a chess game because you were lucky?

Yes.

Start by thinking about a college exam. There's definitely luck there. Hardly anybody has perfect mastery. A student is going to be stronger in some parts of the course material, and weaker in other parts.

Perhaps the professor has a list of 200 questions, and he randomly picks 50 of them for the exam. If those happen to be more weighted to the stuff you're weak in, you'll do worse.

Suppose you know 80 percent of the material, in the sense that, on any given question, you have an 80 percent chance of getting the right answer. On average, you'll score 80 percent, or 40 out of 50. But, depending on which questions the professor picks, your grade will vary, possibly by a lot.

The standard deviation of your score is going to be 5.6 percentage points. That means the 95 percent confidence interval for your score is wide, stretching from 69 to 91.

And, if you're comparing two students, 2 SD of the difference in their scores is even higher -- 16 points. So if one student scores 80, and another student scores 65, you cannot conclude, with statistical significance, that the first student is better than the second!
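
Those figures come from the binomial distribution. Here's a minimal sketch, assuming each question is an independent 80 percent shot and that the two students being compared have the same SD. The unrounded SD is about 5.66 percentage points.

```python
import math

p, n = 0.80, 50                                  # chance of a right answer, questions
sd_pct = 100 * math.sqrt(p * (1 - p) / n)        # SD of the score, percentage points
low, high = 80 - 2 * sd_pct, 80 + 2 * sd_pct     # rough 95 percent interval
two_sd_diff = 2 * math.sqrt(2) * sd_pct          # 2 SD of the difference in scores

print(round(sd_pct, 2))          # 5.66
print(round(low), round(high))   # 69 91
print(round(two_sd_diff))        # 16
```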

So, in a sense, exam writing is like coin tossing. You study as hard as you can to learn as much as you can -- that is, to build yourself a coin that lands heads (right answer) as often as possible. Then, you walk in to the exam room, and flip the coin you've built, 50 times.

------

It's similar for chess.

Every game of chess is different. After a few moves, even the most experienced grandmasters are probably looking at board positions they've never seen before. In these situations, there are different mental tasks that become important. Some positions require you to look ahead many moves, while some require you to look ahead fewer. Some require you to exploit or defend an advantage in positioning, and some present you with differences in material. In some, you're attacking, and in others, you're defending.

That's how it's like an exam. If a game is 40 moves each, it's like you're sitting down at an exam where you're going to have 40 questions, one at a time, but you don't know what they are. Except for the first few moves, you're looking at a board position you've literally never seen before. If it works out that the 40 board positions are the kind where you're stronger, you might find them easy, and do well. If the 40 positions are "hard" for you -- that is, if they happen to be types of positions where you're weaker -- you won't do as well.

And, even if they're positions where you're strong, there's luck involved: the move that looks the best might not truly *be* the best. For instance, it might be true that a certain class of move -- for instance, "putting a fork on the opponent's rook and bishop on the far side of the board, when the overall position looks roughly similar to this one" -- might be a good move 98 percent of the time. But, maybe in this case, because a certain pawn is on A5 instead of A4, it actually turns out to be a weaker move. Well, nobody can know the game down to that detail; there are 10 to the power of 43 different board positions.

The best you can do is see that it *seems* to be a good move, that in situations that look similar to you, it would work out more often than not. But you'll never know whether it's 90 percent or 98 percent, and you won't know whether this is one of the exceptions.

------

It's like, suppose I ask you to write down a 14-digit number (that doesn't start with zero), and, if it's prime, I'll give you $20. You have three minutes, and you don't have a calculator, or extra paper. What's your strategy? Well, if you know something about math, you'll know you have to write an odd number. You'll know it can't end in 5. You might know enough to make sure the digits don't add up to a multiple of 3.

After that, you just have to hope your number is prime. It's luck.

But, if you're a master prime finder ... you can do better. You can also do a quick check to make sure it's not divisible by 11. And, if you're a grandmaster, you might have learned to do a test for divisibility by 7, 13, 17, and 19, and even further. In fact, your grandmaster rating might have a lot to do with how many of those extra tests you're able to do in your head in those three minutes.

But, even if you manage to get through a whole bunch of tests, you still have to be lucky enough to have written a prime, instead of a number that turns out to be divisible by, say, 277, which you didn't have time to test for.

A grandmaster has a better chance of outpriming a lesser player, because he's able to eliminate more bad moves. But, there's still substantial luck in whether or not he wins the $20, or even whether he beats an opponent in a prime-guessing tournament.

------

On an old thread over at Tango's blog, someone pointed this out: if you get two chess players of exactly equal skill, it's 100 percent a matter of luck which one wins. That's got to be true, right?

Well, maybe you're not sure about "exactly equal skill." You figure, it's impossible to be *exactly* equal, so the guy who won was probably better! But, then, if you like, assume the players are exact clones of each other. If that still doesn't work, imagine that they're two computers, programmed identically.

Suppose the computers aren't doing anything random inside their CPUs at all -- they have a precise, deterministic algorithm for what move to make. How, then, can you say the result is random?

Well, it's not random in the sense that it's made of the ether of pure, abstract probability, but it's random in the practical sense, the sense that the algorithm is complex enough that humans can't predict the outcome. It's random in the same way the second decimal of tomorrow's Dow Jones average is random. Almost all computer randomization is deterministic -- but not patterned or predictable. The winner of the computer chess game is random in the same way the hands dealt in online poker are random.

In fact, I bet computer chess would make a fine random number generator. Take two computers, give them the same algorithm, which has to include something where the computer "learns" from past games (otherwise, you'll just get the same positions over and over). Have them play a few trillion games, alternating black and white, to learn as much as they can. Then, play a tournament of an even number of games (so both sides can play white an equal number of times). If A wins, your random digit is a "1". If B wins, your random digit is a "0".

It's not a *practical* random number generator, but I bet it would work. And it's "random" in the sense that, no human being could predict the outcome in advance any faster than actually running the same algorithm himself.

Thursday, January 10, 2013

Rewriting NBA scoring rules to help the underdog

There are far fewer upsets in professional basketball and football than there are in baseball and hockey. On Jan. 11 (tomorrow, as I write this), the Nuggets are favored by 12 points over the Cavaliers. That translates into at least an 85% chance of victory for Denver -- and, in fact, on Betfair, you can get 4.2:1 odds on Cleveland right now, which translates to only a 19% chance.
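
Converting odds to a probability is one line. This sketch assumes the quoted 4.2:1 means 4.2 units won per unit staked, and it ignores any commission.

```python
def implied_probability(odds_against):
    """Chance of winning implied by odds of `odds_against`:1 against."""
    return 1 / (1 + odds_against)

print(round(implied_probability(4.2), 2))   # 0.19
```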

That's common for the NBA -- but I doubt you'll ever find an MLB game where one team's odds are that high.

Is the low competitive balance in basketball a good thing or a bad thing? It's bad in the sense that you'd like to see both teams have a fighting chance. It's good, I guess, in that the best teams are pretty much guaranteed to rise to the top of the standings.

In an oft-cited post from 2006, Tom Tango calculated how many games you need, in each sport's major league, for the r-squared correlation between talent and standings to reach 0.5. (UPDATE: Tango says r, but I think he means r-squared.) He found those numbers to be:

12 NFL games (75% of the season)
36 NHL games (44% of the season)
69 MLB games (43% of the season)
14 NBA games (17% of the season)

The better team wins so much more often in the NBA, that after 33 games of the NBA season, you have as much information about which teams are the best as after an entire NHL or MLB schedule. That makes it very hard for NBA teams to make the playoffs by just having a lucky season.

And, of course, it explains why there are so few first round upsets in the NBA playoffs. If a 7-game series in basketball is the equivalent of a 34-game series in MLB ... well, you're pretty sure the better baseball team would prevail in that many games.
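
Here's a sketch of where the 33 and 34 come from. It assumes the amount of information scales with the number of games relative to each league's break-even point; the dictionary and function names are made up for illustration.

```python
# Games needed for r-squared to reach 0.5, per Tango's figures
HALF_POINT = {"NFL": 12, "NHL": 36, "MLB": 69, "NBA": 14}

def equivalent_games(games, from_league, to_league):
    """Games in `to_league` carrying the same information as `games` in `from_league`."""
    return games * HALF_POINT[to_league] / HALF_POINT[from_league]

print(round(equivalent_games(82, "NHL", "NBA")))     # 32
print(round(equivalent_games(162, "MLB", "NBA")))    # 33
print(round(equivalent_games(7, "NBA", "MLB"), 1))   # 34.5
```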

------

So, what I've been thinking is, how can we change the game of basketball to increase competitive balance, to make it easier for the underdog to win?

The easiest way is obvious: just make the game shorter.

Suppose you made an NBA game 24 minutes instead of 48. What would happen? Well, the normal curve of results would stretch by the square root of 2.

Consider a simplified game with 100 possessions per team, and only 2-point shots. In those games, a 14-point favorite is almost exactly 1 standard deviation of results (by binomial luck) above zero. That gives it an 84 percent chance of winning. But in a 50-possession game, a 7-point favorite is only 0.7 standard deviations above zero. So its chance of winning is reduced to just 76 percent. (I hope I've done this right.)
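
The numbers do check out under a normal approximation. This sketch assumes every possession is an independent 2-point attempt that goes in half the time, for both teams; the function name and the 50 percent figure are assumptions for illustration.

```python
import math

def win_probability(point_edge, possessions, p_make=0.5):
    """Chance the favorite wins, by normal approximation to binomial luck."""
    var_per_possession = 4 * p_make * (1 - p_make)   # variance of points, one possession
    sd_margin = math.sqrt(2 * possessions * var_per_possession)   # both teams' luck
    z = point_edge / sd_margin
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))    # standard normal CDF

print(round(win_probability(14, 100), 2))   # 0.84
print(round(win_probability(7, 50), 2))     # 0.76
```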

The problem is, who wants to watch a 24 minute game? It would only last an hour. You could have two games a night, but ... I don't know, every night a doubleheader? Plus, splitting a double-header is kind of unfulfilling.

A few years ago, I tried to come up with some other solutions that would make NBA games less predictable. They're not that great. But, last week, after a commenting thread at Tango's site, I thought of one that's reasonable. It's not perfect, and it might not be as entertaining, and there might be lots of reasons why it wouldn't work in real life ... but it would definitely give the underdog a better shot.

------

Here's what happens. Instead of a single 48-minute game, you play a series of 4-minute games. Whoever wins the most 4-minute games, wins. (If it's a tie, you have one additional tie-breaker game.)

I don't know what to call those mini-games. They have the same thing in tennis, where they're called "games," but that word is taken. We could call them "sets," or "ends" (like in curling), or "innings", or "rounds". Actually, in French, an inning is "une manche", which literally means "sleeve". We could call them "sleeves". OK, maybe not. I'll call them "innings" for now.

So, we play a bunch of four-minute innings.

The problem is: what if you're down four innings at the end of the third quarter? Since there are only three innings per quarter, you'd have no chance to win. To prevent that, we add one more rule: if you go up by 6 points, you win the inning immediately. So, in theory, if you go on a 30-0 run in the fourth quarter, you'd win five innings and have come back from four down.

The "innings" game favors the underdog by adding some luck: even if your opponent scores ten more points overall, they might have bunched those points in one or two innings, so you might still win the game. It's like the 2012 Baltimore Orioles, who scored only 6 more runs than they allowed -- but finished 93-69 on the strength of their 29-9 record in one-run games.

------

That's the theory. To test it out in practice, I ran the results of the 2008-09 NBA season, based on play-by-play data provided by Ryan Parker at "Basketball Geek" (much appreciated!). The database has 1176 out of the 1230 games that year, which I figure is complete enough.

I just rescored every game as if it was being played by "innings" rules. I included real overtime, if there was one, even if it wasn't necessary to break an "innings" tie. If a team went past 6 points in an inning, it still got reset to zero for the next inning. But if a team hit 6 on its first of multiple free throws, the following shots counted towards the new inning. I only counted the unfinished inning at the end of the game if it was necessary to break a tie. If that inning was tied, too, I flipped a coin for OT. (In real life, you'd want to finish the last inning if it could win or tie the game. But I couldn't do that, because I had no data.)
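
A stripped-down version of that rescoring might look like the sketch below. It is not the code used for the results that follow. It assumes a hypothetical event format of (seconds elapsed, team, points), gives each inning a fresh four-minute clock after a big inning, counts a clock inning that ends level for neither team, and leaves out overtime, the free-throw detail, and the tie-breaker.

```python
def rescore_by_innings(events, game_seconds=48 * 60, inning_seconds=240, big_lead=6):
    """Rescore one game under the "innings" rules.

    events: list of (seconds_elapsed, team, points), sorted by time,
            where team is "home" or "away".
    Returns (home_innings, away_innings), ignoring an unfinished last inning.
    """
    innings = {"home": 0, "away": 0}
    score = {"home": 0, "away": 0}
    inning_start = 0

    def close_on_clock():
        # An inning that runs out the clock goes to whoever leads it, if anyone.
        if score["home"] != score["away"]:
            innings["home" if score["home"] > score["away"] else "away"] += 1

    for when, team, points in events:
        # Close out any innings whose four minutes expired before this basket.
        while when >= inning_start + inning_seconds:
            close_on_clock()
            score["home"] = score["away"] = 0
            inning_start += inning_seconds
        score[team] += points
        other = "away" if team == "home" else "home"
        if score[team] - score[other] >= big_lead:
            innings[team] += 1                  # "big inning": ends immediately
            score["home"] = score["away"] = 0   # extra points are not carried over
            inning_start = when                 # assumption: fresh clock starts now
    # Close out innings that finished on the clock after the last basket.
    while game_seconds >= inning_start + inning_seconds:
        close_on_clock()
        score["home"] = score["away"] = 0
        inning_start += inning_seconds
    return innings["home"], innings["away"]

# Toy eight-minute "game": away wins the first inning on the clock,
# home wins the second with a 6-0 run, and the third inning is unfinished.
toy = [(30, "home", 2), (60, "away", 3), (200, "away", 2),
       (250, "home", 3), (300, "home", 3), (400, "away", 2)]
print(rescore_by_innings(toy, game_seconds=480))   # (1, 1)
```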

Bottom line: the bad teams did indeed win more often. Before I give you the results, I should mention that you can't consider them perfectly reliable. Team strategies would be different in the "inning" game than in regular games. Garbage time might not be garbage time, or vice-versa. I might be assuming deliberate fouls in games where the fouling team is actually ahead in innings. Teams wouldn't try 3-point shots when up by 4 in an inning. And so on.

You'll notice that most of the bad teams improved, and most of the good teams declined. The SD of the actual records was 13.1 wins; the SD of the "innings" records was 10.4 wins. That's a big difference. To me, it makes the standings look more "normal", compared to other sports, with no team in the teens in either wins or losses. (Of course, to an NBA fan, they might look too compressed; it's a matter of taste.)

-------

Here's some details of what individual games looked like.

-- Overall, 38 percent of innings were what I'll call "big innings," ended because of a 6-point lead. Probably, with teams changing strategies, that would increase somewhat. To, maybe, 50%? That seems reasonable to me, half ending early, and half ending by the clock.

-- The chance of a major comeback seems to be about the same. In regular basketball, teams down by 8+ points at the start of the fourth quarter, won 8 percent of the time (8 wins in 100 tries). In "innings" basketball, teams down by 2+ innings at the start of the fourth quarter, won 6.7 percent of the time (7 out of 104).

-- There were 213 games out of 1176 where the outcome was different between the two rules. Some of them were surprising. On December 19, 2008, the Nets beat the Mavericks 121-97. But, in the "innings" replay of the game, Dallas won 7-6!

-- The next most extreme reversals were 21, 18, 16, and 14 point games. There were 26 games total where a team that won by 10+ points, would have lost the "innings" game.

-- One thing about these rules is that you have to be willing to put up with a lot of similar-looking scores. There are a LOT of 7-6 and 7-5 victories. Most games have between 12 and 16 innings.

-- One of the things Tango wanted to see was the occasional shutout. With these rules, it never happened. But there was a "oneout." It was November 21, when New Orleans beat Oklahoma City 10-1 (105-80). There were only 11 "twoouts," games where the losing team won only two innings. They were all blowouts by normal standards -- 19 points was the closest.

-- There were three games where the winning team scored 12 or more innings. Those winning teams scored 129, 121, and 140 points, respectively. (The last one was a triple-overtime game in real life: Miami 140, Utah 129 (12-8).)

-- The most "big innings" was 13, on March 28. Denver beat Golden State 10-7 (129-116). There were two games with 12 big innings, and six with 11. Some of them look like (real-life) overtimes.

-- There were seven games with no big innings at all -- where neither team ever took a 6-point lead within an inning.

I'm not really advocating that the NBA change to this new format. It's more an experiment, just to prove that there *is* a way to increase competitive balance, without shortening the game or changing the rules too much.

-----

UPDATE, Jan. 12: One thing I just thought of: "big innings" don't help competitive balance much. Only innings that end by the clock do.

Suppose you had a rule of ONLY "big innings". The first team to get a six-point lead scores the inning. Most innings wins. (And, if you go over six points, you carry the extras over to the next inning.)

That wouldn't do at all. Because, suppose you take six times the winning team's innings, minus six times the losing team's innings. That's just the overall point differential (give or take the fewer-than-six points left over in the unfinished last inning), and it has to be positive. So the team that wins the innings game is also the team that would have won the regular game!

It seems like the thing that gives the underdog a chance is innings that end by the clock. So, maybe you want to have more of those.

Suppose we go to 7 points for a big inning, instead of 6. Hang on ... OK. With 7 points required instead of 6, the SD of wins drops even further, to 9.5. Now, only 24.5% of innings are "big". Nicer!

There sure are a lot of 7-5 and 7-6 and 8-4 scores, though. That might get tiresome.
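The "only big innings" argument in the update can be checked with a small sketch. This is not the code used for the post. It assumes a made-up game of random scoring plays, and the function name `big_innings_only` and the teams "A" and "B" are purely illustrative. It tracks both the innings score and the ordinary point differential, with extras carried over as described.

```python
import random

def big_innings_only(events, target=6):
    """Score a game where an inning ends ONLY when a team leads it by `target`.

    events: list of (team, points) scoring plays, with team "A" or "B".
    Returns (innings_a, innings_b, leftover_lead, point_diff), from A's side.
    """
    innings_a = innings_b = 0
    lead = 0   # A's lead in the current inning, including carried-over extras
    diff = 0   # A's lead in ordinary points
    for team, pts in events:
        delta = pts if team == "A" else -pts
        diff += delta
        lead += delta
        while lead >= target:       # A takes the inning; extras carry over
            innings_a += 1
            lead -= target
        while lead <= -target:      # B takes the inning; extras carry over
            innings_b += 1
            lead += target
    return innings_a, innings_b, lead, diff

# Hypothetical random game: 200 scoring plays worth 1, 2 or 3 points each.
rng = random.Random(0)
events = [(rng.choice("AB"), rng.choice([1, 2, 3])) for _ in range(200)]
ia, ib, leftover, diff = big_innings_only(events)
print("innings:", ia, ib, "leftover:", leftover, "point differential:", diff)
```

By construction, the point differential always equals six times the innings margin plus a leftover of fewer than six points, so whenever the innings are not tied, the innings winner is also the points winner.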

Sunday, January 06, 2013

"Make it, take it" benefits the underdog

In a variation of basketball called "make it, take it", a team scoring a basket gets to keep the ball for another possession, rather than giving it to the other team. So it can ring up 4, 6, 10, 20 ... points in a row, before the other team gets a shot.

You would think that in this variation, a good team could easily run up the score on a worse team, and you'd get lots of blowouts.

And I think that's true. But, unexpectedly, I think this variation also gives the underdog team a *better chance of winning the game* than regular basketball rules.

To verify, I created a simulation of a simplified game. The good team scores on 52 percent of its possessions. The bad team scores on 48 percent. A game goes 200 possessions; if the game is tied, the next score wins. There are no three-point attempts or free throws.

In my simulation of ten million games:

"Regular" game: favorite won .7157682
"Make it" game: favorite won .7146367

Difference: .0011315

Not a huge change in competitive balance, but definitely real (about 8 SDs).
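For anyone who wants to try this at home, here is a minimal sketch of a simulation along the lines described above. It is not the code behind the numbers in this post. It assumes a coin flip for the opening possession, and that sudden death simply continues the possession sequence until somebody scores. The names `play_game` and `favorite_win_rate` are made up for the example.

```python
import random

def play_game(make_it_take_it, rng, p_fav=0.52, p_dog=0.48, possessions=200):
    """Simulate one simplified game; return True if the favorite wins."""
    score = {"F": 0, "U": 0}
    p_score = {"F": p_fav, "U": p_dog}
    other = {"F": "U", "U": "F"}
    ball = rng.choice("FU")   # assumption: coin flip for the first possession
    played = 0
    # Regulation is 200 possessions; if tied after that, the next score wins.
    while played < possessions or score["F"] == score["U"]:
        made = rng.random() < p_score[ball]
        if made:
            score[ball] += 2
        played += 1
        # The ball changes hands unless the scorer keeps it under "make it, take it".
        if not (made and make_it_take_it):
            ball = other[ball]
    return score["F"] > score["U"]

def favorite_win_rate(make_it_take_it, games=2000, seed=1):
    """Fraction of simulated games won by the favorite."""
    rng = random.Random(seed)
    wins = sum(play_game(make_it_take_it, rng) for _ in range(games))
    return wins / games

for label, rule in (("Regular", False), ("Make it", True)):
    print(label, favorite_win_rate(rule))
```

A few thousand games is enough to land in the neighborhood of .71 to .72 for both versions, but nowhere near enough to see a gap of .001. For that you need millions of games, and pure Python will take a while.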

(Note: Geoff Buchan (who writes the RotoValue blog), e-mailed me with his own simulation. The results were similar, although not exact. Geoff noted that the difference is very small, and wonders whether simulating a "traditional" overtime, rather than sudden death, would eliminate it. We're still working on that ... but, for purposes of this post, I'm going with sudden death.

In any case, I'm not advocating for this rule change, nor do I feel like thinking through the practical ramifications (how will teams change their play?). I just think it's an interesting result.)

So why does this happen, that the underdog benefits from a "rich get richer" rule change? I would never have thought it did. I kind of stumbled on a similar situation last year, and wrote up a puzzle asking for a proof. This time, I don't have an actual seamless proof, but I'm going to try explaining. I'm trying to make you feel like you understand what's happening, rather than convince you with rigorous step-by-step logic.

It's a fairly long (but not complicated) explanation. If you can think of a better or shorter one, let me know.

I'm going to do this in two parts.

--------

PART I

1. First, and perhaps obvious: the shorter the game, the better the chance of the underdog winning. If a bad team wins 30% of its games, it may win 40% of its quarters. A bad team might go on a 10-3 run in the short term, but it's not going to go on a 50-15 run in the longer term. The Charlotte Bobcats may have a 20% chance of winning a given game against (say) the Lakers, but much less than a 20% chance of winning the *season* against the Lakers.

2. Therefore, the underdog has a better chance of winning the first half, than it does of winning the whole game. It would love to negotiate with the other team to have the game end after two quarters. As it turns out, that would increase its chance of winning from 28% to 34%.

3. Now, suppose we play a full game. But we make one change: at the end of the first half, we take whichever team is in the lead, and we award it an extra 100 points. Then we play a normal second half.

The first reaction is: well, the overdog is going to be in the lead 2/3 of the time, so, it's going to get most of the benefit of the extra 100 points. You're just making the rich team richer!

But, it's the opposite. If you give the leading team 100 bonus points, then the team that wins the first half *has to win the game*. Right? There's no way to come back from more than a 100 point deficit. So, effectively, the "100 point bonus" rule is exactly like ending the game at halftime! And, as we saw, that favors the underdog.

4. What if the bonus is less than 100 points? It still favors the underdog, because it still makes it less likely for the lead to change. Not impossible, this time, but still less likely than otherwise, which still hurts the favorite.

Remember: the underdog leads 34% of the time after the half, but only 28% after the full game. Therefore, when the lead changes in the second half, it's usually in the overdog's favor. Therefore, fewer lead changes benefit the underdog.

That's true even if it's a small bonus -- say, 4 points. That still favors the underdog, to some extent. The explanation isn't as obvious as when it's a 100 point lead, but the mechanism is the same: reducing lead changes.

5. Another way to look at it is this: when the game is a blowout, 4 points isn't going to matter. The 4 points is going to matter more in a close game.

Now: in our simulation, the better team is favored by 8 points over the game. That means, after the first half, the average point differential is 4 points.

Of course, it's not always 4 points. The reality is something like:

-- Two-thirds of the time, the favorite is leading at the half, by an average of maybe 7 points.

-- One-third of the time, the underdog is leading at the half, by an average of maybe 2 points.

That does, in fact, give an overall average of +4: two-thirds of +7, added to one-third of -2.

Now, in the second half, the better team will add on another 4 points of differential. So you can add 4 points to both of those:

-- Two-thirds of the time, the favorite is leading at the half, and the final differential will average +11 points.

-- One-third of the time, the underdog is leading at the half, but the final differential will average only +2 points (for the favorite).

Now, what if you give a 4-point bonus to the team leading at the half? Then, two thirds of the time, it turns a +11 into a +15. Doesn't matter much; the favorite is overwhelmingly likely to win either way. But, one-third of the time, it turns the worse team from a 2 point underdog, to a 2 point favorite! That's a big deal. And that's why the bonus favors the underdog team.

The bonus makes a much bigger impact when the game is close, and when the underdog has a chance, the game is more likely to be close than a blowout.

6. Now, suppose that instead of giving points to the team that's leading at the half, you give them *extra possessions*. Same thing, right? The extra possessions will just translate to points. Even more so if you give them extra possessions, and simultaneously take extra possessions away from the trailing team! Even though it's the favorite that gets the extra possessions twice as often as the underdog does, the rule change still helps that underdog win more games.

7. In summary: anything that gives a benefit to the team that's leading *favors the underdog*. That's because the underdog is more likely to be leading partway through the game than at the end of the game, and the benefit has the effect of increasing that lead, thus partially cementing it.
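The Part I numbers don't need a simulation; for the regular alternating-possession game they can be worked out exactly from binomial distributions. The sketch below is one way to do it, under the same simplified setup (52 percent versus 48 percent, 100 possessions each, ties settled by sudden death). It assumes a coin flip for the first sudden-death possession, and that no bonus is awarded if the half ends tied. All the function names are made up for illustration.

```python
from math import comb

P_FAV, P_DOG = 0.52, 0.48   # chance of scoring on a possession

def diff_pmf(n):
    """PMF of (favorite baskets - underdog baskets) when each team gets n possessions."""
    fav = [comb(n, k) * P_FAV**k * (1 - P_FAV)**(n - k) for k in range(n + 1)]
    dog = [comb(n, k) * P_DOG**k * (1 - P_DOG)**(n - k) for k in range(n + 1)]
    pmf = {}
    for i, pf in enumerate(fav):
        for j, pd in enumerate(dog):
            pmf[i - j] = pmf.get(i - j, 0.0) + pf * pd
    return pmf

def dog_wins_sudden_death():
    """Chance the underdog scores first, alternating possessions.
    Assumption: a coin flip decides who gets the ball first."""
    both_miss = (1 - P_FAV) * (1 - P_DOG)
    if_dog_first = P_DOG / (1 - both_miss)
    if_fav_first = (1 - P_FAV) * P_DOG / (1 - both_miss)
    return 0.5 * (if_dog_first + if_fav_first)

def dog_win_prob(final_pmf):
    """Underdog wins outright, or ties and then wins sudden death."""
    behind = sum(p for d, p in final_pmf.items() if d < 0)
    return behind + final_pmf.get(0, 0.0) * dog_wins_sudden_death()

def dog_win_with_bonus(bonus_baskets, half_possessions=50):
    """Underdog's win chance when the halftime leader is handed extra baskets."""
    half = diff_pmf(half_possessions)
    final = {}
    for d1, p1 in half.items():
        # Assumption: a tied half earns nobody a bonus.
        shift = bonus_baskets if d1 > 0 else -bonus_baskets if d1 < 0 else 0
        for d2, p2 in half.items():
            d = d1 + shift + d2
            final[d] = final.get(d, 0.0) + p1 * p2
    return dog_win_prob(final)

half_only = dog_win_prob(diff_pmf(50))   # game ends at halftime
full_game = dog_win_with_bonus(0)        # ordinary full game
small_bonus = dog_win_with_bonus(2)      # 4-point bonus to the halftime leader
huge_bonus = dog_win_with_bonus(50)      # 100-point bonus to the halftime leader
print(half_only, full_game, small_bonus, huge_bonus)
```

The first two values should come out close to the 34% and 28% quoted above. The 4-point bonus should land a little above the ordinary full game, and the 100-point bonus close to (but not quite at) the halftime-only figure, because a tied half still has to be played out.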

---------

PART II

Now, what does this have to do with "Make it, take it"? Let me show you.

Here's a possible play-by-play of a "make it, take it" game. I'll use "U" for underdog and "F" for favorite, and follow each possession by the number of points scored (either 2 or 0). I'll put each "run" of possessions on its own line.

U2 U0
F0
U0
F2 F2 F2 F0
U0
F2 F0
U2 U2 U0
F0
...

You'd read the game horizontally. So, "U2 U0 [new line] F0" means, "the underdog scored 2 points (U2). So they kept possession, but this time missed (U0). So the favorite then took possession, and missed (F0)."

The chart will continue on until the game hits 200 possessions (which we assume to correspond to 48 minutes).

How many rows will the chart have? About 100. There will be 100 "runs" on average. (Why? Because the average length of a "run" is around 2 possessions. The teams are both around a 50 percent success rate. Therefore, there's roughly one make for every miss. Since every run corresponds to exactly one miss, the average run has 1 make. That gives a run length of two.)

Now, it doesn't matter what order we consider the possessions in, since they're independent, and since the total score is going to be the same either way. So, turn the chart on its side, so that each run becomes a column instead of a row, and imagine that the game actually unfolded row by row in this new layout. In that case, the top row (the first possession of every run), being about 100 possessions wide, can be considered the first half. The rest of the rows, combined, can represent the second half.

See where we're going with this? The first half has alternating possessions by team. So it's like normal basketball.

Therefore, we can rewrite the rules of "Make it, take it" as follows: Play a normal first half. Then, spend the second half awarding the extra "make it" possessions.

How do we do the second half? We start by awarding the second row. Each team gets one extra possession for every basket they made.

That means that the team leading after half the game gets more extra possessions than the team trailing after half the game. That is -- we're giving a bonus for the team leading. And as we already saw, that favors the underdog.

And the same for the third row: we give an extra possession for every basket made in the second row. Again, the more successful team gets the bonus. In this case, it's not necessarily the leading team -- it's the team that had a better second row. But it's *probably* the team in the lead, so it still *probably* benefits the underdog.

And the same for the fourth, and fifth, and sixth rows, until there are no more bonuses to give. Each row, on average, benefits the underdog a little bit.
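To make the regrouping concrete, here is a small sketch using the eight runs from the sample play-by-play. It assumes the rearrangement means taking the first possession of every run as the first row, the second possession of every run as the second row, and so on. The helper name `regroup` is made up.

```python
# Runs copied from the sample play-by-play: one list per run.
runs = [
    ["U2", "U0"],
    ["F0"],
    ["U0"],
    ["F2", "F2", "F2", "F0"],
    ["U0"],
    ["F2", "F0"],
    ["U2", "U2", "U0"],
    ["F0"],
]

def regroup(runs):
    """Row k of the result holds the k-th possession of every run long enough to have one."""
    depth = max(len(run) for run in runs)
    return [[run[k] for run in runs if len(run) > k] for k in range(depth)]

rows = regroup(runs)
for k, row in enumerate(rows, start=1):
    makes = sum(1 for poss in row if poss.endswith("2"))
    print(f"row {k}: {' '.join(row)}   (baskets made: {makes})")
```

In the first row the teams alternate, just like regular basketball, and each later row has exactly one possession for every basket made in the row above it.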

-------

Get it? The "Make it, take it" rule gives the underdog a better chance to win the game, because it has the effect of helping cement a lead at several points during the game. Since the underdog is more likely to be leading partway through the game than after it, the cementing helps them disproportionately.

-------

There you go! As I said, I'm open to better explanations; this was the simplest I could think of.

UPDATE: Geoff Buchan's results are here.

Tuesday, January 01, 2013

Foul shooting and luck

I always think that performance in sports is a combination of luck and skill. When an 80 percent free-throw shooter steps to the line, I think what happens is almost all luck. To me, the shot is like the player flipping a coin that lands heads 80 percent of the time.

Where's the skill? The skill is in the player having practiced enough to bring his coin up to 80 percent heads. Personally, I'm probably a 25 percent shooter, at best. But if I practice and practice, I'll get better; maybe eventually I'll hit 50 percent. Once I do, my shots are like flips of a fair coin.

Some people disagree. They reject the idea that there can be any luck at all. It's just the player, and the ball, and the basket. It's all under the player's control. Where's the luck?

I think our disagreement is that they're thinking of luck only as "external" luck, something outside the player -- like a bad bounce in the infield, or getting dealt good cards in poker. I think there can also be "internal" luck, even though everything is, in theory, within the player's control.

-------

We can probably all agree that coin flipping is a good example of luck. But in a way, coin flipping and foul shooting are almost the same. If I'm doing the flipping, and I'm trying to land a head ... it's all within my control, the same way as shooting a basket.

The difference, you might argue, is that it's impossible to influence the outcome of a coin flip. I may *want* it to land heads, but I just can't do it.

Except that ... I think I *can* do it. Not all the time, but better than 50 percent. I bet you if I practiced, I could make it land heads more than tails. Suppose that, flipping a coin that starts head up, I try to make the coin rotate exactly 10 times, and land flat in my hand. That's hard, but ... who's to say that I can't do it often enough to get 50.0001 percent heads? That would mean out of one million flips, I manage to convert one tail into a head because of the skill I developed. Doesn't seem unreasonable.

So after I've achieved that amount of skill, you hand me a coin, and I flip it, and it lands heads. Was it luck, or skill?

Everyone should agree that it's mostly luck. My skill is being able to increase my chances from 500,000 out of a million, to 500,001 out of a million. After that, the individual flip is just luck.

It has to be, doesn't it?

-------

In the comments to the previous post, someone wrote that foul shooting *seems* like it's all skill because the success rate is so high. That seems right to me; we have a default assumption in our brains that either we can do something, or we can't. If we can, it seems like we should be able to do it all the time, and when we don't, it's our fault, we made a mistake. At least, to me, that's how it *feels*. When I screw something up, I feel like, geez, I did that before so many times, why couldn't I do it now? It feels like I choked, or did something wrong.

And it might have been a choke, actually. Even though we shoot at 80 percent normally, maybe this time we chose not to concentrate enough. Maybe we should have bent our knees more. Maybe we were too impatient, or too nervous.

The thing is, it *could* have been all those. But it probably wasn't. It's just our nature to think, we could have done it differently! And, yes, we could have, just like we could have flipped the coin harder, or softer. But, so what? There was no way of knowing in advance that *this time* was one of the 20 percent of times when what we did wasn't going to work.

And, in a way, it's a good thing we think it was our fault. Because it's still *possible* that we did something wrong, and it's *possible* that if we figure out what it is, mental or physical, we could get our success rate up from 80 percent to 81 percent. Or, it could be that we've slipped up and forgotten something we knew, so that if we don't fix it, we'll fall to 78 or 79 percent. You don't want to just think, "Hey, I won't worry about it, it was just luck."

But it probably was.

-------

None of this is to say that preparation and concentration and patience aren't important. But, you'd expect, professionals would have learned to do that all the time. That, I think, is one of the underrated skills that professional athletes have: the ability to focus and concentrate much better than we do.

I can probably hit my trash can with a wad of paper most of the time. It's probably 60 percent when I don't care, and 80 percent when I aim.

Now, if I miss, it might certainly be that I didn't care, and you might be reasonable in assuming that it wasn't just luck, that it was mostly within my control -- that if I cared more, I'd have made it. In fact, if half the time I don't care, then, I'm missing 30 percent of the time, rather than just 20 percent if I always tried my hardest.

In a sense, then, you might have the right to say that there's more than luck involved: that maybe my misses are 2/3 luck, and 1/3 bad decisions.

The thing about the pros, though, is: I think they have the ability to concentrate and prepare *almost all the time*. I have no evidence for this, but I bet an 80 percent NBA player is seldom, if ever, even a 75 percent player, for reasons under his conscious control.

In any case, there still has to be a substantial amount of randomness. Because, there's no way an 80 percent shooter is able to be a 100 percent shooter, if he just takes the task more seriously. That's just not plausible.

-------

If you agree with me that my coin flipping is luck (despite my 50.0001 percent success rate), but disagree that foul shooting is luck, I have one more argument for you.

Suppose in my coin flipping, I'm trying to get the coin to turn over exactly 20 times: that is, do ten 360s, for a total rotation of exactly 3,600 degrees. And suppose you measure a large number of my tosses, and you find that my average is indeed 3,600, and my distribution is a bell curve with a large standard deviation. You analyze the distribution and figure out that based on my accuracy, I should indeed wind up with heads 50.0001 percent of the time.

Now, if you agree that my coin flipping is largely luck, you have to agree that my deviations from 3,600 are largely luck.

Now, what if you did the same thing for free throws? You measure all my free throws. You find that my angle of release is normally distributed with a mean of 52 degrees, and a certain standard deviation. And that the ball goes in when my angle is within 1.3 SDs either way, which works out to 80 percent success.

Isn't that the same thing as the coin? If my deviations of coin flipping rotation are luck, then why aren't the deviations of release angle luck?
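Both of those calculations are the same kind of bell-curve arithmetic, and here's a rough sketch of each. It assumes the coin shows heads whenever its total rotation ends up within 90 degrees of a whole number of turns, and the standard deviations passed to `heads_pct` are hypothetical, not measurements. The function names are made up.

```python
from math import erf, sqrt, floor, ceil

def normal_cdf(x, mean=0.0, sd=1.0):
    """Cumulative probability of a normal distribution at x."""
    return 0.5 * (1.0 + erf((x - mean) / (sd * sqrt(2.0))))

def free_throw_pct(tolerance_sds):
    """Chance the release angle lands within +/- tolerance_sds of its mean."""
    return normal_cdf(tolerance_sds) - normal_cdf(-tolerance_sds)

def heads_pct(sd_degrees, turns=10):
    """Chance of heads when total rotation is Normal(360 * turns, sd_degrees).

    Assumption: heads if the coin lands within 90 degrees of a whole turn.
    """
    mean = 360.0 * turns
    first = floor((mean - 10 * sd_degrees) / 360)   # cover 10 SDs either side
    last = ceil((mean + 10 * sd_degrees) / 360)
    total = 0.0
    for k in range(first, last + 1):
        centre = 360.0 * k
        total += (normal_cdf(centre + 90, mean, sd_degrees)
                  - normal_cdf(centre - 90, mean, sd_degrees))
    return total

print("free throw, 1.3 SD window:", free_throw_pct(1.3))
for sd in (100, 200, 300):   # hypothetical SDs of total rotation, in degrees
    print("coin, SD", sd, "degrees:", heads_pct(sd))
```

The free-throw figure comes out at about 80 percent, as in the text. For the coin, the sloppier the flipper (the bigger the SD), the closer the result gets to 50 percent, but it stays a hair above it.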

--------

A summary of the overall argument is: there are natural limitations to how much humans can control their muscles. With practice, you can develop "muscle memory," which increases the amount of control you have. That is, you get the standard deviation to shrink. But, humans just aren't capable of getting so precise that we can be 100% shooters of basketball free throws.

We're all going to have our individual accuracy curves. Good shooters will have a narrower and better-placed curve than poor shooters. But, once you have the curve for your own personal muscle control, where you land on that curve, for any given shot, is luck.

If you don't like using the word "luck," because it's internal to the brain, and it feels like it's under your control ... well, fine. But then you can't call coin flipping luck, either. Because, aside from the probability of success, it's almost exactly the same phenomenon.