Why No. 5 Seeds Are Jinxed

JeQuan Lewis of Virginia Commonwealth University, a No. 5 seed, reacted after his team was defeated by Stephen F. Austin University during the 2014 NCAA men’s basketball tournament.

Donald Miralle / Getty

If there’s one piece of folk wisdom that has emerged over the past decade or so of March Madness, it’s that No. 5 seeds are jinxed. SportsCenter did a whole story on the subject featuring Virginia Commonwealth University. In 2012, VCU was a No. 12 seed that pulled off a “shocking” upset against Wichita State. In 2013, VCU was itself a No. 5 but defied the trend, crushing No. 12 Akron by 46 points to become the only No. 5 seed to win its opening-round (round of 64) game that year. In 2014, VCU’s story came full circle. It again entered the tournament as a No. 5 seed but was upset by unheralded No. 12 seed Stephen F. Austin University. The tournament quirk that was once VCU’s magic was now its curse.

Including those VCU games, No. 12 seeds over the past three years have pulled off upsets in eight of 12 round-of-64 matchups, including six of their last eight. It would be extremely easy to dismiss this as a freak occurrence. (I certainly did at first.) But it’s a real phenomenon. And after looking into it, I think it may be indicative of something larger. The 5-seed jinx may be a sign that March Madness — at least on the men’s side — is even madder than we think.

But I’ll get there. First, let’s look at the phenomenon. If it seems like No. 12 seeds beat No. 5 seeds more than they should, it’s because they have. Going back to 1995, No. 5 seeds have been upset 33 times in 80 games. Their 59 percent win rate compares unfavorably to the 66 percent win rate of No. 6 seeds. Based on the trend across the other seeds, it would appear that No. 5 seeds should be winning more like 72 percent of the time. Take a look at how far No. 5 seeds deviate in the chart below. The gray region is the standard error on the fit between seed and win percentage when not including the No. 5 seed:

So they’re an outlier, but is it significant? Specifically, how unlikely is this to have happened by chance? Let us consult the oracle of binom.dist(), Excel’s handy function that tells you the probability of things happening a certain number of times, given the probability of them happening once. In a fun bit of symmetry, given an expected win rate of 72 percent, the odds of No. 5 seeds losing six of eight, eight of 12, or 33 of 80 are all about the same: Each is a little under 1 percent.[1]
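The same check can be reproduced outside Excel. What follows is a minimal Python sketch, assuming the quantity of interest is the chance of at least that many losses, and that games are independent with a fixed 72 percent win rate. The function name `prob_at_least` is hypothetical and not taken from the original analysis.

```python
from math import comb

def prob_at_least(k, n, p):
    """Probability of k or more successes in n independent trials,
    each succeeding with probability p (upper binomial tail)."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k, n + 1))

# Assumption: a No. 5 seed "should" win 72 percent of the time,
# so it loses any given game with probability 0.28.
p_loss = 1 - 0.72

# (losses, games) for the three stretches mentioned in the text.
for losses, games in [(6, 8), (8, 12), (33, 80)]:
    tail = prob_at_least(losses, games, p_loss)
    print(f"{losses} or more losses in {games} games: {tail:.4f}")
```

In a spreadsheet, the equivalent upper tail would be 1 - BINOM.DIST(k - 1, n, p, TRUE), since the cumulative form of that function returns the lower tail.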

Note that it would be unremarkable for this to have happened by chance: One-in-a-hundred-type things happen every day. But, as a committed Bayesian, I have to consult my priors and determine whether the phenomenon of No. 5 seeds’ underperforming is more likely to be a result of chance or of other plausible factors.

First, let’s look at how strong each seed’s teams have been since 1995. As you go from the 1 vs. 16 matchups down to the 8 vs. 9 ones, the better-seeded teams get worse and the worse-seeded teams get better, making the contests much closer. To see how much so, we can plot each team’s SRS (Simple Rating System, a metric that measures margin of victory adjusted for strength of schedule) prior to the game.[2]

The shading shows you the general range of strengths for each seed.
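For readers unfamiliar with SRS, here is a rough sketch of how a rating of this kind can be computed. It assumes the common formulation in which a team’s rating is its average scoring margin plus the average rating of its opponents, solved by repeated substitution. The teams, scores, and function name are hypothetical, and this is not necessarily how the ratings in this article were derived.

```python
def simple_rating_system(games, iterations=100):
    """Estimate ratings as average margin plus average opponent rating.

    games: list of (team, opponent, margin), where margin is the first
    team's points minus the opponent's points.
    """
    margins, opponents = {}, {}
    for team, opponent, margin in games:
        margins.setdefault(team, []).append(margin)
        margins.setdefault(opponent, []).append(-margin)
        opponents.setdefault(team, []).append(opponent)
        opponents.setdefault(opponent, []).append(team)

    avg_margin = {t: sum(m) / len(m) for t, m in margins.items()}
    ratings = dict(avg_margin)  # start from the raw average margin
    for _ in range(iterations):
        # Re-estimate every rating from the previous round's opponent ratings.
        ratings = {
            t: avg_margin[t] + sum(ratings[o] for o in opponents[t]) / len(opponents[t])
            for t in ratings
        }
    return ratings

# Hypothetical three-team round robin (illustrative numbers only).
toy_games = [("A", "B", 10), ("A", "C", 2), ("B", "C", 6)]
toy_ratings = simple_rating_system(toy_games)
for team, rating in sorted(toy_ratings.items(), key=lambda kv: -kv[1]):
    print(team, round(rating, 2))
```

The difference between two teams’ ratings can then be read as an expected point margin, which is how the SRS gaps between seeds are used in the rest of this piece.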

Although it’s not a huge effect, the gap between the No. 5 and 6 seeds and their competition has been narrowing over time. The average SRS difference between No. 5 and 12 seeds from 2000 to 2002 was about 7.6 points, but it has been about 5.8 from 2012 to 2014.

More importantly, the 5 vs. 12 matchup looks a lot more like the 6 vs. 11 one than it does the 4 vs. 13. The No. 5 seeds have been considerably weaker than No. 4 seeds, and No. 12 seeds have been considerably stronger than No. 13 seeds. The average No. 5 seed had a 6.6-point expected advantage going into a game against its No. 12 seed opponent. That’s only 2.2 points higher than the average advantage that No. 6 seeds held against No. 11 seeds (4.4 points), but it’s 5.1 points lower than the average advantage that No. 4 seeds held against No. 13 seeds (11.7 points).

It seems like the 5 vs. 12 seed matchup is the threshold where the games should start being much more competitive. Combine that with the psychological effect of thinking five is a number that has more in common with four than six (blame our five fingers), and you have a recipe for “shocking” upsets.

That is, there are a number of upsets, but we shouldn’t really be shocked. Even just looking at recent history, No. 5 seeds have been favored by more than 10 SRS points in only eight round-of-64 games since 2005, and they won seven of them.[3] The No. 5 seed has been an SRS underdog three times (and lost twice). Still, No. 5 seeds have performed below what one would expect based on the SRS difference between them and their opponents. But so have most seeds. Here’s a chart comparing the average expected outcomes based on SRS difference and average actual outcomes for each seed over the past 12 years:

From this angle, the No. 5 seed “outlier” doesn’t look as impressive. Seeds No. 1 through 6 all underperformed expectations by a smallish — but somewhat consistent — amount. The main difference with the No. 5 seed is that it didn’t have a big enough advantage to underperform this much without losing a lot more games.

In other words, if there’s something that has systematically led tournament favorites to underperform their expectations by a few points or so across the board,[4] No. 5 seeds would be disproportionately hard-hit. Thus the 5-seed jinx may be more like the proverbial “canary in a coal mine,” indicating that something bigger is going on.
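To see why a uniform shortfall hurts some seeds more than others, here is a rough illustration in Python. It rests on assumptions that are not part of the original analysis: game margins are treated as normally distributed around the SRS advantage with a standard deviation of 11 points, and the across-the-board shortfall is set to 2 points. The function name `win_probability` is hypothetical.

```python
from math import erf, sqrt

def win_probability(expected_margin, sd=11.0):
    """Chance the favorite wins if its margin is normally distributed
    around expected_margin. The sd of 11 points is an assumption."""
    return 0.5 * (1 + erf(expected_margin / (sd * sqrt(2))))

# Average SRS advantages quoted earlier in the article.
advantages = {"4 vs. 13": 11.7, "5 vs. 12": 6.6, "6 vs. 11": 4.4}
SHIFT = 2.0  # assumed across-the-board underperformance, in points

drops = {}
for matchup, advantage in advantages.items():
    before = win_probability(advantage)
    after = win_probability(advantage - SHIFT)
    drops[matchup] = before - after
    print(f"{matchup}: {before:.3f} -> {after:.3f} (drop {drops[matchup]:.3f})")
```

Under these assumptions, the same shortfall in points costs the favorite in a 5 vs. 12 game more win probability than it costs the favorite in a 4 vs. 13 game, because a smaller cushion leaves less room to underperform and still win.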

We know the Big Dance is exciting, but could there really be something about the tournament that makes favorites underperform and gives underdogs better-than-normal chances?

It’s tricky. For example, the selection committee may systematically overvalue particular types/classes of teams, but that doesn’t necessarily explain why teams would underperform relative to SRS. Some of it could be that SRS is poorly calibrated for the types of matchups we see in the tournament (e.g., between larger and smaller conferences that rarely play each other). It could be that favorites are more likely to regress to the mean.[5]

Or it could just be that this is March Madness, and anything can happen.

Footnotes

[2] I backed these out myself, so there may be very small differences from what was actually recorded at the time. They’re as of just before each team’s round-of-64 game in each year (since 1995).

[3] The loser was Illinois against Western Kentucky in 2009.

[4] As a strictly mathy thing, having a somewhat constant deviation isn’t as weird as it may seem because the standard deviation for a team’s actual SRS is similarly stable. So in this case, it’s a bit like the stronger teams are all running one standard deviation below the expected mean.

[5] This is always a good candidate, but, interestingly, there is no such effect in the women’s tournament.

Benjamin Morris researches and writes about sports and other topics for FiveThirtyEight. @skepticalsports