Saturday, April 30, 2011

Recently posted to SSRN: FantasySCOTUS: Crowdsourcing a Prediction Market for the Supreme Court, a draft paper by Josh Blackman, Adam Aft, & Corey Carpenter assessing the accuracy of the Harlan Institute's U.S. Supreme Court prediction market, FantasySCOTUS.org. The paper compares and contrasts the accuracy of FantasySCOTUS, which relied on a "wisdom of the crowd" approach, with the Supreme Court Forecasting Project, which relied on a computer model of Supreme Court decision making. From the paper's abstract:

During the October 2009 Supreme Court term, the 5,000 members made over 11,000 predictions for all 81 cases decided. Based on this data, FantasySCOTUS accurately predicted a majority of the cases, and the top-ranked experts predicted over 75% of the cases correctly. With this combined knowledge, we can now have a method to determine with a degree of certainty how the Justices will decide cases before they do. . . . During the October 2002 Term, the [Forecasting] Project’s model predicted 75% of the cases correctly, which was more accurate than the [Supreme Court] Forecasting Project’s experts, who only predicted 59.1% of the cases correctly. The FantasySCOTUS experts predicted 64.7% of the cases correctly, surpassing the Forecasting Project’s Experts, though the difference was not statistically significant. The Gold, Silver, and Bronze medalists in FantasySCOTUS scored staggering accuracy rates of 80%, 75% and 72% respectively (an average of 75.7%). The FantasySCOTUS top three experts not only outperformed the Forecasting Project’s experts, but they also slightly outperformed the Project’s model - 75.7% compared with 75%.

3 comments:

"The Gold, Silver, and Bronze medalists in FantasySCOTUS scored staggering accuracy rates of 80%, 75% and 72% respectively (an average of 75.7%). The FantasySCOTUS top three experts not only outperformed the Forecasting Project’s experts, but they also slightly outperformed the Project’s model - 75.7% compared with 75%."

is a bit odd. With thirty-seven "experts", having low levels of what the authors call "reliability" (that is, the experts did not all tend to agree), I think it would be surprising if there weren't any experts with such high accuracy rates. What I would have liked to see is some sort of comparison across data-sets — say, look at the experts who did best on even-numbered cases, and the experts who did best on odd-numbered cases, and see if they're the same experts. That would help determine the extent to which the performance of the top-performing experts was due to chance.

I think I understand your concern: A large set of players will tend to generate a few "experts" by dint of randomness. We want to know if their expertise arises by dint of superior insight or simple luck. You propose a mechanism for testing that--a clever one, too.
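For readers who want to try the proposed test themselves, here is a minimal sketch of the split-half idea in Python. Everything in it is hypothetical: the expert "skills" and case outcomes are simulated (not the actual FantasySCOTUS data), and the rank correlation ignores tie handling. A correlation near zero between even-case and odd-case accuracy rankings would suggest the top performers were lucky; a strongly positive one would suggest genuine skill.

```python
import random

random.seed(0)

N_EXPERTS = 37  # the paper's expert pool
N_CASES = 80    # close to the 81 decided cases; even for a clean split

# Hypothetical simulation: each expert has a latent probability of calling
# a case correctly, and each case is an independent draw at that rate.
skills = [random.uniform(0.5, 0.8) for _ in range(N_EXPERTS)]
correct = [[random.random() < s for _ in range(N_CASES)] for s in skills]

def accuracy(row, cases):
    """Fraction of the given cases this expert predicted correctly."""
    return sum(row[c] for c in cases) / len(cases)

even_cases = range(0, N_CASES, 2)
odd_cases = range(1, N_CASES, 2)

acc_even = [accuracy(row, even_cases) for row in correct]
acc_odd = [accuracy(row, odd_cases) for row in correct]

def ranks(xs):
    """Ascending ranks; ties broken arbitrarily (a simplification)."""
    order = sorted(range(len(xs)), key=lambda i: xs[i])
    r = [0.0] * len(xs)
    for rank, i in enumerate(order):
        r[i] = float(rank)
    return r

def spearman(x, y):
    """Spearman correlation: Pearson correlation of the rank vectors."""
    rx, ry = ranks(x), ranks(y)
    n = len(x)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sum((a - mx) ** 2 for a in rx) ** 0.5
    sy = sum((b - my) ** 2 for b in ry) ** 0.5
    return cov / (sx * sy)

rho = spearman(acc_even, acc_odd)
print(f"split-half rank correlation: {rho:.2f}")
```

Running the same check on the real per-case prediction data, once the authors release it, would replace the simulated `correct` matrix with each member's actual hits and misses.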

The authors plan to make all their data available and will encourage others to take a crack at it. They are very open to suggestions. Drop 'em a line!

Like the Constitutional requirement for the decennial census, the mandates for SCOTUS are at once demanding and inconvenient. Statistical analysis might be more accurate than door-to-door checks, but that's not the law. So, too, does the game here point to algorithmic models of decisions. They, too, may one day be deemed better than the present system.