In Defense of Nate Silver, Election Pollsters, and Statistical Predictions

Nate Silver. Photo: Brian Finke

Nate Silver analyzes poll data on the influential FiveThirtyEight blog at The New York Times. He crunches polls and other data in an electoral statistical model, and he claims that his work is guided by math, not left or right politics. Yet he's become a whipping boy as Election Day approaches. His crime? Publishing the results of statistical models that predict President Obama has a 73.6 percent chance of defeating the Republican challenger, Mitt Romney.

“The pollsters tell us what’s happening now," conservative columnist David Brooks told Politico, trashing Silver. "When they start projecting, they’re getting into silly land.” In the same article, MSNBC's Joe Scarborough added, "And anybody that thinks that this race is anything but a tossup right now is such an ideologue, they should be kept away from typewriters, computers, laptops, and microphones for the next 10 days – because they’re jokes.”

David Brooks is mistaken and Joe Scarborough is wrong. Because while pollsters can’t project, statistical models can, and do ... and they do some predictions very well.

We rely on statistical models for many decisions every single day, including, crucially, in weather forecasting, medicine, and pretty much any complex system in which there's an element of uncertainty to the outcome. In fact, these are the same methods by which scientists could tell Hurricane Sandy was about to hit the United States many days in advance.

A fellow at the Center for Information Technology Policy at Princeton University and an assistant professor at the University of North Carolina, Chapel Hill, [Zeynep Tufekci](http://twitter.com/techsoc) explores the interactions between technology and society. Tufekci was previously a fellow at Harvard's Berkman Center for Internet and Society and an assistant professor of sociology at UMBC.

Dismissing predictive methods is not only incorrect; in the case of electoral politics, it's politically harmful.

__It perpetuates the faux "horse race" coverage that takes election discussions away from substantive issues.__ Unfortunately, many of these discussions have become a silly, often unfounded, time-wasting exercise in fake punditry about who is 0.1 percent ahead. There may well be reasons to consider Ohio a toss-up state, but “absolute necessity for Romney to win the state if he wants to be president” (as Chris Cillizza argues) is not one of them.

It confuses polls with statistical models, which do not predict the same thing. The election can indeed be won by 50.1 percent of the national vote, as Scarborough notes in his comment that "Nobody in that campaign thinks they have a 73 percent chance – they think they have a 50.1 percent chance of winning." More precisely, it is won with 270 electoral votes, which can be secured with even less than that share of the national vote. But the chances of getting past that 270-electoral-vote threshold can be 80 percent. Heck, the odds of Obama passing 270 votes can be 90 percent and the election could still be close in terms of winning margins.

That is because the vote tally (the share of the vote, or the number of electoral votes, that Obama or Romney wins) is the outcome of the election, while the odds are the probability of a particular outcome happening.
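To make the difference between a margin and a probability concrete, here is a minimal Python sketch. It rests on assumptions that are not from Silver's model: the final vote share is treated as normally distributed around the expected share, the uncertainty of 1.5 points is a hypothetical figure, and the function name `win_probability` is purely illustrative.

```python
from statistics import NormalDist


def win_probability(expected_share: float, uncertainty_sd: float) -> float:
    """Chance that the final vote share lands above 50 percent.

    Assumption: the outcome is normally distributed around expected_share
    with standard deviation uncertainty_sd (both in percentage points).
    """
    outcome = NormalDist(mu=expected_share, sigma=uncertainty_sd)
    return 1.0 - outcome.cdf(50.0)


# Hypothetical expected shares, all of which describe a "close" race.
for share in (50.1, 50.5, 51.0, 52.0):
    odds = win_probability(share, uncertainty_sd=1.5)
    print(f"expected share {share:.1f}% -> win probability {odds:.0%}")
```

Under these assumed numbers, an expected share of 51 percent already corresponds to roughly a three-in-four chance of winning, even though the margin itself is tiny.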

The Pride and Prejudice of Pundits
----------------------------------

“If there’s one thing we know, it’s that even experts with fancy computer models are terrible at predicting human behavior.” So said David Brooks in his recent New York Times column, sharing examples of stock market predictions by corporate financial officers. He makes certain points I agree with; for example, CFOs are not very good at predictions.

And yes, there's no point in checking individual polls every few hours. But experts with fancy computer models are good at predicting many things in the *aggregate*. This includes the results of elections, which are not about predicting a single person’s behavior (yes, great variance there) but lend themselves well to statistical analysis (the same methods by which we predicted the hurricane coming).

This isn’t wizardry, this is the sound science of complex systems. Uncertainty is an integral part of it. But that uncertainty shouldn't suggest that we don’t know anything, that we're completely in the dark, that everything's a toss-up.

Polls tell you the likely outcome with some uncertainty and some sources of (both known and unknown) error. Statistical models take a bunch of factors and run lots of simulations of elections by varying those outcomes according to what we know (such as other polls, structural factors like the economy, what we know about turnout, demographics, etc.) and what we can reasonably infer about the range of uncertainty (given historical precedents and our logical models). These models then produce probability distributions. So, Nate Silver:

1. takes all the polls we have;
2. adds in factors to his model shown to have impacted election outcomes in the past;
3. runs lots and lots and lots of elections; and
4. looks at the probability distribution of the results.

What his model says is that currently, given what we know, if we run a gabazillion modeled elections, Obama wins 80 percent of the time. Note that this isn't saying that if we held all those elections on the same day we'd get different results (we wouldn't); rather, we are running many simulated elections reflecting the range of uncertainty in our data. The election itself will "collapse" this probability distribution and there will be a single result. [Thanks to Nathan Jurgenson for suggesting and helping with this clarification.]
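The simulation steps described above can be sketched in a few lines of Python. This is a toy illustration under loudly stated assumptions, not Silver's actual model: the states, electoral vote counts, polled leads, and error sizes below are all invented, and the names `simulate_once` and `estimate_win_probability` are hypothetical.

```python
import random

# Hypothetical swing states: (electoral votes, polled lead for candidate A in points).
# These numbers are invented for illustration; they are not real polling data.
SWING_STATES = {
    "State 1": (18, 2.0),
    "State 2": (29, -1.0),
    "State 3": (13, 1.0),
    "State 4": (10, 3.0),
    "State 5": (15, -2.0),
}
SAFE_VOTES_A = 237  # hypothetical electoral votes assumed safe for candidate A
VOTES_TO_WIN = 270


def simulate_once(rng, national_sd=2.0, state_sd=3.0):
    """Run one simulated election and return candidate A's electoral votes."""
    # A shared error moves every state together (for example, a systematic polling miss).
    national_error = rng.gauss(0.0, national_sd)
    total = SAFE_VOTES_A
    for votes, lead in SWING_STATES.values():
        # Each state also gets its own independent error.
        simulated_margin = lead + national_error + rng.gauss(0.0, state_sd)
        if simulated_margin > 0:
            total += votes  # winner-take-all: any positive margin wins every vote
    return total


def estimate_win_probability(n_sims=20_000, seed=0):
    """Share of simulated elections in which candidate A reaches 270."""
    rng = random.Random(seed)
    wins = sum(simulate_once(rng) >= VOTES_TO_WIN for _ in range(n_sims))
    return wins / n_sims


print(f"candidate A wins {estimate_win_probability():.0%} of simulated elections")
```

The printed figure is a probability over simulated elections, not a vote share. The real election supplies exactly one draw from that distribution.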

Since we'll only have one election on Nov. 6, it's possible that Obama will lose. But Nate Silver’s (and others’) statistical models remain robust and worth keeping and expanding, regardless of the outcome this Tuesday.

Odds and Outcomes
-----------------

Refusing to run statistical models simply because they produce probability distributions rather than absolute certainty is irresponsible. For many important issues (climate change!), statistical models are all we have and all we can have. We still need to take them seriously and act on them (well, if you care about life on Earth as we know it, blah, blah, blah).

A one-in-five chance is pretty close odds. When Nate Silver’s model gives Obama an 80 percent chance of passing 270 electoral votes, this is not a prediction of a landslide; it's not even overwhelming odds. A one-in-five chance of getting hit by a bus today would not make me very happy to step outside the house, nor would I stop treatment for an illness if I were told I had a one-in-five chance of survival. And if I were Romney’s campaign manager, I’d still continue to believe I had a small but reasonable chance of winning and realize that get-out-the-vote (GOTV) efforts can swing this close an election.

The U.S. electoral system's “winner-takes-all” approach is one reason for the discrepancy between the odds of a win by Obama and the closeness of the vote percentages: 50.1 percent of the vote in a state earns 100 percent of that state's Electoral College votes. And there are many states in which the polls suggest the candidates are only a few percentage points apart. It remains a very close election, given that:

- polls have known sources of error (even if you poll perfectly, you will get results outside the margin of error approximately one in twenty times for a 95 percent confidence interval);
- there are unknown sources of error (cellphones? likely voter screens?); and
- polls do not measure factors such as GOTV efforts, which can make a huge difference in close elections in winner-take-all systems.

It also remains hugely, and significantly, tilted towards an Obama win.

So the election remains pretty close but the odds that Obama will win remain pretty high, and those statements are not in conflict.
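The one-in-twenty figure for perfectly conducted polls, mentioned in the list above, can be checked with a small simulation. This is a sketch under assumptions: true support is set to a hypothetical 50 percent, each poll samples 1,000 voters at random, and the margin of error uses the standard normal approximation (1.96 standard errors). The function name `share_outside_margin` is illustrative.

```python
import math
import random


def share_outside_margin(true_support=0.5, sample_size=1000, n_polls=2000, seed=1):
    """Fraction of ideal random-sample polls that miss by more than the 95% margin."""
    rng = random.Random(seed)
    # Normal-approximation margin of error for a 95 percent confidence level,
    # computed from the (assumed known) true support for simplicity.
    margin = 1.96 * math.sqrt(true_support * (1 - true_support) / sample_size)
    misses = 0
    for _ in range(n_polls):
        # Each respondent independently backs the candidate with probability true_support.
        yes = sum(rng.random() < true_support for _ in range(sample_size))
        if abs(yes / sample_size - true_support) > margin:
            misses += 1
    return misses / n_polls


print(f"{share_outside_margin():.1%} of simulated polls fell outside the margin of error")
```

Even with no bias at all, a few percent of these idealized polls land outside their own stated margin, which is why a single outlying poll says little on its own.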

Statistical models are scientifically and methodologically sound, well-established tools in many sciences, and they are key to analyzing reasonable risks of complex events. Nate Silver may be the face of the electoral statistical model, but there are others, too: here’s just one example, a site run by researchers at Princeton. While Silver gives a lot of information about his model and it all sounds reasonable, frankly, it would be great if it became more open source at some point for more peer-review. Because this kind of modeling isn't some dark science of wizards: It's important work that requires expertise and care.

I share a wish with Sam Wang of Princeton that sound statistical models replace the horse-race coverage of polls, which is drowning out the important policy conversations we should be having. As Wang explains, he started doing statistical modeling thinking his results could "be a useful tool to get rid of media noise about individual polls" and "provide a common set of facts... opened up for discussion of what really mattered in the campaign."

If Brooks wants to move away from checking polls all the time, he should support more statistical models. And we should hope for more people like Nate Silver and Sam Wang to produce models that can be tested and improved over time.

We should defend statistical models because confusing uncertainty and variance with “oh, we don’t know anything, it could go any which way” does a disservice to important discussions we should be having on many topics, not just on politics.

*Editor's Note: An earlier, unedited version of this article appeared on the author's blog.*