When Picking a Bracket, It’s Easier to Be Accurate Than Skillful

Arizona forward Aaron Gordon reacts after scoring a basket while playing Gonzaga in the 2014 NCAA basketball tournament.

Denis Poroy / AP

So, how is your bracket after the first weekend of the NCAA men’s basketball tournament? Did you pick Mercer to win in the round of 64? Most didn’t, of course, including President Obama and anyone who based picks off the favorites in the FiveThirtyEight bracket. Only 3.3 percent of brackets in the ESPN Bracket Challenge had Mercer beating Duke, and only 0.3 percent correctly picked all four of the double-digit seeds that advanced Friday.

Incorrect NCAA predictions come from two places: One is the failure to anticipate upsets, and the other is the prediction of upsets that don't occur. Because favorites win most games, the upsets you get wrong are far more important than those you get right.1

In fact, after the first weekend of play, it turns out that 83.4 percent of entrants in the ESPN Bracket Challenge would have been better off picking all the favorites.

In last year’s tournament, this simple strategy would have resulted in 43 (of 63 total) games picked correctly, a 68.3 percent accuracy rate. For comparison, consider that a bracket based on FiveThirtyEight’s 2013 odds would have also resulted in 43 of 63 winners picked correctly. Even the wisdom of the crowds couldn’t exceed the 43-game mark by much. No one in the CBS or Yahoo NCAA 2013 bracket competitions correctly picked more than 50 winners. So far in 2014, only 1.83 million of the 11.01 million participants in the ESPN Bracket Challenge are doing better than a naive bracket going into the Sweet 16.

In 1884, a scientist named John Park Finley set the standard for being accurate but not skillful in his predictions. Over three months, Finley predicted whether atmospheric conditions in the U.S. were favorable or unfavorable for tornadoes over the next eight hours, and then checked whether each prediction proved accurate. By the end, Finley had made 2,806 predictions and 2,708 of them proved accurate, for a success rate of 96.5 percent. Not bad. But two months later, another scientist pointed out that if Finley had simply predicted "no tornado" for every eight-hour period, he would have been right 98.1 percent of the time.
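The arithmetic behind Finley's lesson is easy to check. A minimal sketch, using only the counts given above (the skill-score formula is a standard way to measure improvement over a baseline, not something from the article itself):

```python
# Finley's 1884 tornado forecasts: accuracy vs. a naive "no tornado" baseline.
total = 2806      # eight-hour forecasts issued
correct = 2708    # forecasts that proved accurate

finley_accuracy = correct / total
print(f"Finley's accuracy: {finley_accuracy:.1%}")   # ~96.5%

# Per the critique, always predicting "no tornado" would have been
# right 98.1 percent of the time.
naive_accuracy = 0.981
print(f"Naive baseline:    {naive_accuracy:.1%}")

# A common skill score: fractional improvement over the naive baseline.
# Zero means no better than the baseline; negative means worse.
skill = (finley_accuracy - naive_accuracy) / (1 - naive_accuracy)
print(f"Skill vs. baseline: {skill:.2f}")
```

The skill score comes out negative: by this measure, Finley's impressive-sounding 96.5 percent accuracy was actually worse than doing nothing at all.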

In forecasting, accuracy isn't enough. Being a good forecaster means anticipating the future better than if you had just relied on a naive prediction. Beating a naive baseline is difficult in settings far more consequential than the NCAA tournament, too. For example, investing in a mutual fund with a portfolio that mirrors the Standard & Poor's 500 index over the past decade would have generated better returns than 65 percent of managed funds overseen by investment professionals. Think about that: 65 percent of mutual funds charge fees to underperform an investment strategy that requires no thinking and lower fees.

Chance alone means that some brackets and funds will beat a naive baseline. So, how do we know whether a good prediction was the result of savvy prognostication or just blind luck?

Having a lot of data on predictions and real-world outcomes helps. But there are only a few disciplines for which we can clearly identify good predictions using evidence. One of those is weather forecasting. Consider that the National Weather Service issues more than 10 million forecasts each year. Such a large amount of data allows for rigorous evaluation of whether, and by how much, forecasters are able to improve over a naive baseline. It turns out that weather forecasters are highly skilled. As a few colleagues and I wrote in the book "Prediction: Science, Decision Making and the Future of Nature," weather forecasting was one of the scientific triumphs of the 20th century and it continues to improve.

But weather forecasting is also unique; you’ll never get 10 million prediction data points while choosing NCAA tournament picks or investing. That’s what makes predictive skill so hard (or even impossible) to identify.

The only way to figure out whether our predictions have skill is to track them in real time, not retrospectively, against a naive baseline. So, think of your busted bracket when you make decisions about your 401(k). The biggest reward you might get from trying to predict the tournament could be a lesson in just how difficult it is to be a skilled forecaster.

Footnotes

Think of it this way:

A. The naive baseline always picks the higher seed (this will almost always be the favorite; there may be rare occasions when the lower seed is favored).
B. That means that if I pick the favorite, there is a 0 percent chance of losing ground to the naive baseline.
C. The only way that I might lose ground to the naive baseline is by picking an underdog. But if I do that, my expected gain turns negative. For instance:

Team A = 60 percent chance of winning

Team B = 40 percent chance of winning

Assume that you get 20 points for picking the winner, as occurs during the round of 32 in the ESPN Bracket Challenge. If you pick the underdog, you have (a) a 40 percent chance of gaining 20 points on the naive baseline, or an expected return of 8 points; or (b) a 60 percent chance of losing 20 points to the naive baseline, or an expected return of -12 points.

Your net expected return with respect to the naive baseline in this case is -4 points. But when picking the favorite, the expected return is 0 points. Even in a big upset, you won't lose ground to the naive baseline.

Thus the only way that forecasters lose ground on a naive baseline is by picking against the naive baseline and losing.
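The expected-return calculation in the footnote can be sketched directly, using the numbers given above (a 60/40 game worth 20 points):

```python
# Expected return, relative to the naive baseline, of picking the underdog
# in a round-of-32 game worth 20 points.
points = 20
p_favorite = 0.60   # Team A's chance of winning
p_underdog = 0.40   # Team B's chance of winning

# Picking the underdog: gain 20 points on the baseline if the upset
# happens, lose 20 points to it if the favorite wins.
expected_gain = p_underdog * points    # 0.40 * 20 = +8
expected_loss = p_favorite * points    # 0.60 * 20 = -12 (as a loss)
net = expected_gain - expected_loss

print(f"Net expected return vs. baseline: {net:+.0f} points")  # -4

# Picking the favorite matches the baseline pick exactly, so the
# expected return relative to the baseline is always 0.
```

Any probability for the underdog below 50 percent makes this net value negative, which is the footnote's point: picking against the baseline has a negative expected return unless the "underdog" is actually the favorite.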

Roger Pielke Jr. is a professor of environmental studies at the University of Colorado Boulder. @RogerPielkeJr