Tuesday, March 20, 2012

Consider the complications

One of the many things I like about Nate Silver's approach to interpreting numbers is that he takes the time to explain his methodology clearly, supports his decisions, and sets out the advantages and disadvantages of his choices. One example is last week's column titled "Why is Santorum Overperforming his Poll Numbers?"

As Silver puts it:

The FiveThirtyEight forecast model is based on statewide polls and statewide polls alone in presidential primaries and caucuses. It doesn’t do anything especially fancy. This approach has advantages and disadvantages.

The disadvantage is that if the polls are wrong, the model will be wrong too. . .

The advantage is that the model gives us a relatively clean and objective benchmark to evaluate the candidates’ actual performance against the polls.

He then compares his model against Santorum's performance, both in the states with robust polling and in states with limited polling, and concludes that Santorum has outperformed the polls by a small amount: 2.3 percentage points in the smaller set and 2.1 percentage points in the larger set, enough to be statistically significant. Silver offers four hypotheses (higher Santorum turnout, neglect of cellphone-only voters, unwillingness to reveal choices to poll takers, and tactical voting) that might explain the difference and considers each of them in turn.
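For readers who want to see the arithmetic behind a claim like this, here is a minimal sketch of one way to measure average overperformance and check whether it differs from zero. The numbers are made up for illustration and are not Silver's data. The one-sample t statistic is an assumption on my part, since the summary above does not say which test he used, and the function name `mean_overperformance` is hypothetical.

```python
from math import sqrt
from statistics import mean, stdev


def mean_overperformance(polled, actual):
    """Return (mean difference, t statistic) for actual minus polled vote shares."""
    diffs = [a - p for p, a in zip(polled, actual)]
    avg = mean(diffs)
    # One-sample t statistic against a null hypothesis of zero average difference.
    t_stat = avg / (stdev(diffs) / sqrt(len(diffs)))
    return avg, t_stat


# Hypothetical vote shares in percentage points, one entry per state.
# These are illustrative placeholders, NOT figures from Silver's column.
polled = [30.0, 25.0, 35.0, 28.0, 32.0]
actual = [32.0, 28.0, 36.0, 31.0, 33.0]

avg, t_stat = mean_overperformance(polled, actual)
print(f"mean overperformance: {avg:.1f} points, t = {t_stat:.2f}")
```

A large t statistic relative to the number of states suggests the gap is unlikely to be noise alone, but it says nothing about why the gap exists, which is the question the four hypotheses try to answer.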

It's a good piece, clearly written. It enlarges our understanding of what's happening. (And in case you're wondering, Silver thinks Santorum is unlikely to get the nomination.) Because it continues the conversation as events develop, this kind of iterative approach is a good model to follow for anyone who has to explain numbers for a living.