Polling conspiracy theories

September 1st, 2012, 8:00am by Sam Wang

(Update: The comments are excellent. Thank you all. I’ve been a bit of an empty chair because of travel. I gave my own thoughts a few times, most recently Sept. 1, 4:36pm.)

Based on evidence from 2004-2010, I take the position that polls aggregated using robust, median-based statistics give the most accurate picture of Presidential, Senate, and House races. Furthermore, the most appropriate measurement is whatever is closest to the final mechanism, e.g. state margins for the Electoral College. For confirmation, see the results documented in the left sidebar.
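
As a toy illustration of why median-based aggregation is robust (the numbers below are invented, not actual polls): a single outlier pollster drags the mean noticeably, but barely moves the median.

```python
# Five hypothetical polls of one state's margin (in points), one of them
# from an outlier house. The median resists the outlier; the mean does not.
polls = [2.0, 3.0, 2.5, 3.5, -4.0]

def median(xs):
    s = sorted(xs)
    n = len(s)
    return s[n // 2] if n % 2 else (s[n // 2 - 1] + s[n // 2]) / 2

mean = sum(polls) / len(polls)   # pulled toward the outlier: 1.4
med = median(polls)              # robust to the outlier: 2.5
```

This is the heart of the "robust statistics" argument: no single pollster, however idiosyncratic, can move the aggregate very far.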

Today I ask for help in cataloguing ways in which these assumptions might be wrong in 2012. I will list a few just to get you started.

2004: Kerry voters in the woodwork. In 2004, when I originated the Meta-analysis, I assumed that polls did not reflect the true state of play between Kerry and Bush. At the time, it was discussed by Mark Blumenthal (Pollster.com) and many others that undecided voters typically broke against the incumbent. I also thought that Democratic energy was high, and that turnout might be underestimated.

These possibilities found their way into the Meta-analysis using a variable called, appropriately, bias. If crowds of pollsters leaned towards the Democrats, then bias > 0; towards Republicans, bias < 0. As it turns out, bias = 0.0%. Although using the bias in 2004 gave me a wrong result, this Wisdom-of-Crowds-of-Pollsters approach is invaluable in getting a snapshot of the 2012 race.
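
To make the bias variable concrete, here is a toy sketch (not the actual Meta-analysis code): shift every state margin by a common bias before converting margins to win probabilities. The 3-point effective sigma, the margins, and the bias values are all invented for illustration.

```python
# Toy sketch of a uniform "bias" shift applied to state poll margins,
# under a simple normal model for each state's outcome.
from math import erf, sqrt

def win_prob(margin, sigma=3.0):
    """P(candidate wins a state) given a polled margin and an assumed
    effective uncertainty sigma (in points), under a normal model."""
    return 0.5 * (1.0 + erf(margin / (sigma * sqrt(2.0))))

state_margins = [1.0, -0.5, 4.0]   # hypothetical polled margins

snapshots = {}
for bias in (0.0, 2.0):            # bias > 0: polls lean toward the candidate
    adjusted = [m - bias for m in state_margins]
    snapshots[bias] = [win_prob(m) for m in adjusted]
```

Sliding bias through a range of values is essentially what powers the Obama+2% and Romney+2% features mentioned below.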

The bias variable has turned out to have many other uses. Now it powers many features here: the Meta-margin (equivalent to a national margin, but calculated from state polls), The Power Of Your Vote (tells you where your efforts are most effective), and the Obama+2% and Romney+2% features.

Rasmussen, (dis)honest broker? In 2012, I have heard various speculations about what’s wrong with polls. A common theme is that most pollsters share a systematic lean. Republicans seem to think Rasmussen is the only unbiased pollster in town. I think everyone agrees that his value of bias is typically about 4% relative to other pollsters. As stated above, I’m pretty sure he’s the one who is off.

The Trouble With Nate? Matt McIrvin wrote me regarding a speculation that because Nate Silver made an error in predicting the 2010 House elections, pollsters are on average pro-Democratic by several points. As I showed yesterday, this error was well within the election-to-election variation of linkage between popular vote and House seats. Also, as I wrote in 2008, voter enthusiasm in nonswing states seems to be hard for pollsters to measure. So I think the case is weak.

However, there may be other evidence. This year’s race is close enough that if bias>0, it might have significant effects for the Presidency, the Senate, and the House. If you can cite reasons for why this might be true, I’d be interested in knowing. I’ll help you – here are two popular right-leaning aggregators, Race42012 and ElectionProjection.com.

We have discussed this before, Dr. Wang, and I believe that what Nate calls the “Republican house effect” is closely tied to the failure to poll cell-phone demographics.
Cell-onlies are 31.6% of American households now and 47% of Americans are smartphone users.
Those demos go Obama by 20 points.
Rasmussen is a wholly robopoll house. He only livepolls where robopolling is illegal.

Several polls of Hispanic voters are showing that voting enthusiasm among Hispanics this cycle is off the charts. There is exactly one reason for that: the Deferment Act. Various Mexican consulates in the US are being flooded with passport and other paperwork applications by young people. Half of Hispanic households are cell-phone only.

While Hispanics are already the largest minority in the US, fewer than half of them can vote. Of the half who cannot, roughly half are undocumented and the other half are under 18. And among the half who are citizens, about 30 percent are not registered.

Hi Prof. Wang, I was pointed to your website by a prominent string theorist. He lamented that you were a good one that we (physics) let get away.

To answer your question above: one of the biggest problems I can see is in the input data: the “registered voter”/ “likely voter” screen. There are likely voters who will not vote and registered/unlikely voters who will (in experimental high energy physics we might call these “inefficiencies” and “fakes”, respectively). I suppose the correct way would be to take all the answers to the screening questions and assign a weight. Or if a continuous variable is too much trouble then do a light/medium/tight likely voter screen.

Of course, this sort of weight calculation would have to be done by the data collectors. They seem quite unwilling to share information.
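
The weight-based likely-voter screen suggested above can be sketched in a few lines. Everything here is hypothetical: the respondents, their preferences, and the turnout probabilities one might derive from screening questions.

```python
# Sketch: weight each respondent by an estimated probability of voting,
# instead of a hard likely-voter cutoff. All values are invented.
respondents = [
    # (stated preference, estimated turnout probability)
    ("Obama",  0.9),
    ("Obama",  0.4),   # would be dropped entirely by a hard screen
    ("Romney", 0.8),
    ("Romney", 0.7),
]

def margin(resps):
    """Obama-minus-Romney margin in points, weighted by turnout probability."""
    w_o = sum(w for name, w in resps if name == "Obama")
    w_r = sum(w for name, w in resps if name == "Romney")
    return 100.0 * (w_o - w_r) / (w_o + w_r)

soft = margin(respondents)                                      # continuous weights
hard = margin([(n, 1.0) for n, w in respondents if w >= 0.5])   # binary screen
```

In this made-up sample, the hard screen discards the marginal Obama voter entirely and produces a much more Republican margin than the soft weighting, which is the "inefficiencies and fakes" point in miniature.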

On a different topic, I had a question about an assertion you made about Nate Silver’s analysis. You said that by including economic data he was adding more noise than signal. But if you have the correct sensitivity for all your inputs, adding new data should always make your estimate better, no?

In experimental high energy physics one is always debating this when compilations are done for things like the top quark mass or the recent higgs discovery. Do you add a low sensitivity channel that does nothing much to move the central value or the uncertainty bands, but will add a paragraph to the journal article…

The raw data for the meta-analysis are poll results. A poll result is only useful if it is based on a random sample of some form, and the validity of a random sample depends on the sampling frame being inclusive of the population of interest. Thus, it seems to me the biggest threats to the validity of the meta-analysis will come from poor-quality input data. Three possible problems immediately come to mind: an incorrect sampling frame (some persons in the frame are not members of the population of interest), an incomplete sampling frame (so not all members of the population have an equal probability of being in the sample), and persons who are contacted refusing to participate (a missing-data problem that turns a random sample into a non-random sample). To the degree one or more of these problems holds for a particular poll, the results will be at risk of being misleading, and the errors will propagate through the meta-analysis.

So the issue then becomes identifying the extent to which these problems (and other possible problems I have not mentioned) are methodological problems with a particular poll, and then trying to correct the problems. Here is where various weighting schemes might be used. The problem as I see it is that any weighting schemes must be based on some model of what the methodological problems are, and without appropriate data to build, and test, these models, the weighting schemes adopted are nothing more than SWAGS (Scientific Wild Assed Guesses). Any analyst trying to make corrections for these problems is in the same boat as the pollsters themselves trying to “correct” their results for the methodological problems in their polling process.

In short, it seems to me that attempts to correct for methodological problems, in the absence of good data characterizing those problems, are always going to be based in some sense on pure speculation. If I am correct, then corrections are going to introduce bias, some of which will be due to use of an incorrect model of the methodological flaws in the sampling and polling process, and some of which will be due to perhaps unconscious expectations and beliefs of the person doing the adjusting. (If you have not read The Mismeasure of Man, by Stephen Jay Gould, you should. It details numerous examples of such errors; it is one of my favorite books and I have many of my graduate students read it.)

In short, it seems to me to be a bit of a quixotic endeavor. I am certainly interested in others’ take on this.

@Olav, no I meant what I said. Rasmussen always robopolls except in states where robopolling is illegal.

@Amitabh
Dr. Wang’s assessment of Nate’s economic indicators as added noise is correct, because Bayesian assumptions about the economy are simply not valid in this election cycle. The proof is the Romney campaign’s recent switch from focusing on the economy (which is not flipping the curves) to an embrace of the Sailer Strategy.
In this election, there is no indication that the economy is predictive, because no matter how bad the economy gets, there are many demographics that will still never vote for Romney.

@Bill N
I find Gould rather fluffy, but chacun a son gout.
The phenomenon you are describing is better explained with neuropolitics, evo theory of culture and game theory.

Thanks wheelers cat, but I confess I do not understand what a Bayesian prior has to do with this.

Here is my understanding from digging into Prof. Wang’s writings on this site:

The economic data is subsumed 100% in the polling data; therefore it brings NO extra information. The polling data is a superset of the economic data.

Is this correct? I do not know. But it makes sense, after all, a lot of the economic indicators are themselves derived from polls, like consumer sentiment. But others, like durable goods orders, are not. I suppose given the uncertainty of the correlation, Sam Wang’s leaving it out is the correct thing to do.

Amitabh – generally, that is correct. I like your comparison to the low-sensitivity channel. That is an excellent analogy.

One additional point: I imagine there is a possibility that economic indicators can add information to poll-based ground truth. But in my view this remains to be demonstrated. I do not think terms like Bayesian are fully descriptive, since they imply a level of rigor that doesn’t quite exist yet. I am a Bayesian more or less.

More later on the other points made here. I am traveling.

One note: I am also interested in points of view from the right. The Nate-error hypothesis is plausible in an off-year election and/or House races. Seems wishful to me this year, but I’m interested in any further evidence.

Oh, I meant the 538 model. Nate uses 7 economic indicators in his model.
Nate is a Bayesian. He believes past economic performance is indicative of current stasis.
I’m a non-parametrician. I believe in asymmetry and New Events and demographic evolution.

You and Dr. Wang are correct that how voters feel about the economy is baked into the poll data.

Dr. Wang….are there informed intellectual points of view on the right?
I haven’t seen any.
I think you are wishing for the moon.

I used to be a Bayesian. I think past history has less and less predictive power in the age of the internet and social media.
Culture is evolutionary, and memetic evolution and transmission are speeding up. Look at twitter memes and how fast they spread.
Probably one reason Romney is not getting a bounce is that the existence of @InvisibleObama and #eastwooding have damped the traditional convention effect.

1) At what point does the influence of state poll aggregators, such as PEC, become important enough that agents release polls solely to distort the results? This was not a concern in 2004. Can we detect any patterns in the polls being released to suggest this?

2) The iPhone only came out in 2007. So our interaction with telephones has changed wildly over your 2004-2010 experience window. Is this forcing pollsters to increasingly abandon randomness?

Earlier Bill N said a poll was only useful if it was random – but the real condition is that the poll be representative. While a purely random poll approaches representative in a quantifiable way, pollsters are finding that their random approaches aren’t working – they get too many stay-at-home righties, too few iPhone-toting liberals – so to seek something representative they increasingly insert a priori beliefs about the electorate into their data processing. Are they getting this right?

Many pollsters reveal these assumptions and at least some of them are silly.

3) Over your experience window, political spending has exploded. Obama abandoned the spending limits last cycle and crushed McCain – going from a tie polling average to up 8 points in the last six weeks. This cycle both sides have abandoned the limits and super-PAC’s have jumped in as well. Could this mean volatility will be higher? Is volatility this year any different than last cycle?

Wheeler’s Cat is clearly a fellow Frenchman, and probably feels terribly embarrassed by your pointing out his unfamiliarity with the English language. For future reference, Monsieur, “to each his own” and “whatever floats your boat” are widely used idiomatic expressions which have the same meaning as “a chacun son gout”

On the issue of Nate’s inclusion of economic factors being “noise” or not, it should be noted that he only includes them in the “Nov 6th forecast” portion of his analysis. My (perhaps over-simplified) understanding of his intent is that he is taking current polling, and then attempting to extrapolate how these current polls might shift between now and Nov 6th (which in the final analysis is the only date that matters). To perform such an extrapolation, you can use endogenous factors, most obviously how the polling has trended over time so far this election cycle, either on a per-state or national level (i.e., a pure polling data extrapolation). Or you can use one or more exogenous factors, including macro-economic measures, that you believe might indicate the direction and magnitude of likely shifts in the current polling moving forward, based on your analysis of historical polls and those same factors. Most likely, you use a combination of all of these things.

To me this all seems entirely reasonable. But if it’s not so to you, all you need do is click on Nate’s “NowCast” button, and bingo! You have a projection based on pure polling data, with no economic factors whatsoever baked in. You can see it whichever way pleases you – with or without the economy included – so everyone should be happy.

ThatSeattleGuy – That’s not quite right. The Now-cast also includes “state fundamentals,” which I would categorize as another example of introducing unnecessary noise. It would be useful mainly as a means of filling in missing data. I have said before that such an approach can add value in sparsely polled regions. In the Presidential race, there are no such regions.

In a key test for whether such added “fundamentals” are needed, Election Eve polls alone did extremely well in the 2008 and 2004 Presidential races on a state-by-state basis. Given this, the Now-cast contains unnecessary noise, which blurs the result.

LondonYoung –
1) In regard to the influence of aggregators on pollster behavior, I believe you have hit the nail on the head. As a recent example, the EV estimator reached a very low trough about 10 days after the Ryan VP announcement, followed by a bounce-back 1-2 days later. Unless Rep. Akin’s powers of offensiveness were so effective that they spilled into the regular news, it makes more sense to imagine that immediately post-Ryan polls were exiting the Meta-analysis just about then. Those include Rasmussen and Purple Strategies.

This gets back to a point by Wheeler’s Cat about robopolling. I don’t think there’s an intrinsic problem with this method, since so much can be fixed by re-weighting the sample. Re-weighting is a universal method among pollsters. Instead, the issue is that robopolls are less expensive to conduct. Therefore a robopolling organization like Rasmussen can “flood the zone” with data, achieving the effect you mention.
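
The re-weighting mentioned here can be sketched in a few lines. The demographic shares and support numbers below are invented, not from any actual survey; the point is only the mechanics.

```python
# Minimal post-stratification sketch: reweight a sample so its group
# shares match assumed population shares. All numbers are hypothetical.
population_share = {"cell_only": 0.3, "landline": 0.7}    # assumed targets
sample_share     = {"cell_only": 0.1, "landline": 0.9}    # what the poll reached
support          = {"cell_only": 0.60, "landline": 0.48}  # candidate support

raw = sum(sample_share[g] * support[g] for g in support)
weights = {g: population_share[g] / sample_share[g] for g in support}
reweighted = sum(sample_share[g] * weights[g] * support[g] for g in support)
```

Reweighting fixes the composition of the sample, but note the catch raised elsewhere in this thread: if the few cell-only respondents a pollster does reach are unrepresentative of cell-only voters, upweighting them amplifies that error.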

2) Interactions with telephones: you are correct that these devices have invaded our lives. However, as recently as 2010, existing polling methods did very well at the Senate level, with the exception of NV (Reid v. Angle), a state with an exceptionally large mobile (and mobile phone-using) population. Also, keep in mind that pollsters are very aware of this problem. If anyone’s interested, perhaps I will solicit a guest post on this topic.

In regard to inserting assumptions…that’s called stratification, and is unavoidable. I don’t have too much more to say, except that aggregation using medians seems to address this issue satisfactorily.

3) Volatility – actually, I think volatility is down compared with 2008. Look at the graphs (oops, 2008 is hard to see…will fix that in a bit). The movement in 2008 during the six weeks you mention was probably because of the economic collapse. Lehman Brothers doesn’t go bankrupt every day. I suspect that plus the debate were returning the race to a more natural set point.

@Seattle
mais, ma petit choux, that IS the problem!
If the 538 Nowcast and PEC’s forecast are based on the exact same poll data, why aren’t their results close? Sam has Obama’s probability to win at 88%, and Nate has Obama’s probability to win at 72.7%.
One difference is that the two poll aggregators handle Rasmussen differently, Sam uses robust statistics and Nate uses the mean.

I just remembered another reason I’m not a Bayesian anymore.
Probability has no memory.

I had a look around Nate Silver’s 538 blog and as SeattleGuy points out, there are two money plots, the “NowCast” and “Nov6”. The former has no economic data. The purpose of the economic data seems to be to extrapolate the current snapshot to Nov 6. I fail to see why economic data has these powers of projection (maybe some, like 4th quarter estimates by panels of economists are of some help).

But never mind. Silver must have figured out how
polls = now, while economic data + polls = future.

The main difference between this site and Nate Silver’s is that Silver runs 100k pseudoexperiments as a way to marginalize his nuisance parameters. Note that he predicts an integer quantity (electoral votes) to one-tenth of a vote. But he does not quote any uncertainty on his estimate.

Silver’s analysis is also far more baroque than Wang’s. He makes corrections for each pollster’s “house effect” and polling bumps from conventions.
Maybe correct, but nervous-making all the same.

Wang has a closed form solution that allows him to propagate uncertainties. Note the uncertainty bands on his estimate. Frankly, I am more impressed by these error bands than by his central value.

I will say that those error bands conceal an issue: each poll can be thought of as approximating the true margin M with the result M + s + d, where s is sampling error and d is a constant pollster-specific offset. Estimating d would reduce the band. A loose end on this site…
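
A small numerical sketch of this M + s + d picture (the true margin, offsets, and noise values below are all invented) shows why a median-based aggregate resists a single house's large offset d better than the mean does:

```python
# Each poll reports M + d + s: true margin, house offset, sampling error.
M = 2.0                                   # "true" margin, hypothetical
house_d = [0.0, 0.5, -0.5, 0.3, 4.0]      # pollster offsets d; one large outlier
noise = [0.2, -0.3, 0.1, -0.1, 0.0]       # fixed sampling errors s for the sketch

polls = [M + d + s for d, s in zip(house_d, noise)]
med = sorted(polls)[len(polls) // 2]      # median stays near M
mean = sum(polls) / len(polls)            # mean is pulled toward the outlier
```

The loose end remains real, though: if many houses share a common nonzero d, no aggregation scheme can see it from the polls alone, which is exactly the bias question posed at the top of the post.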

@Amitabh
Nate’s 50k sims per day are a way of injecting randomness or natural uncertainty into the forecast.
Monte Carlo method.
My complaint is that inappropriate use of Rasmussen data and economic indicators are ways of adding artificial uncertainty.
I do not know why Nate would do that, but as Dr. Wang once said, it seems very like spitting in the soup.
The goal of forecasting is to remove uncertainty, right?

And yes, Rasmussen is a problem. In what alternative metaverse can both claims of “Dem oversampling” and “GOP voters at an all-time high” be true?

That is useful, Sam. But on the stratification in point two: as a scientist, I’d look at the result of my approach to random polling and, when I found a failure to converge to representative knowns, I would go back and try to fix what was wrong with my attempt to reach random voters. But stratifying pollsters don’t do this.

When they reach too few young people, it is dangerous to over-weight the few they found because the fact that they are not finding enough tells us something is wrong and makes me doubt that the few they found are representative. E.g. Ohio has a lot of college kids who vote in Ohio but have probably been out of state all summer … do their youthful fellows who spend the summer in Ohio vote like the kids with summer jobs on the coasts?

Given your track record up to 2010, PEC’s approach only has a problem if this effect is growing enough to make a few more 2012 states into 2010 Nevadas. But how people and phones interact has changed a lot in the last two years …

I do agree with Froggy that “Efforts to disenfranchise voters (adding requirements, limiting early voting, and purging voter rolls) could induce (and in fact are designed to induce) a bias relative to previous elections.” I do not think that pollsters can possibly know how to correct for voter suppression, as it’s pretty new, is using new techniques in many states, and is in a continually shifting legal condition–regulations get struck down or upheld every few days, and appeals are filed, and neither we nor the pollsters know which suppression measures will be in effect on election day, or how voters will respond to the possibility of being turned back/forced to file a provisional ballot.

This all makes the “registered voter” category unclear, as many voters don’t know whether they’ve been purged or not, and makes the “registered voter” to “likely voter” move something that isn’t just a matter of which voters decide to go to the polls, request an absentee ballot, or fill it out according to regulation.

We don’t know what the pollsters are doing about this, or whether they will turn out to be right.

@Rachel: Actually, Nate has a good post on that here.
Measuring the Effects of Voter Suppression Laws.
No, I think the big surprise on Nov 6 will come from the cell phone demos.
Conventional wisdom is that “cell-onlies and smartphone users don’t vote,” but how can we say that with certainty if they never get polled?

And yes please could you solicit a guest poster on the cell phone issue?
Eric Foster of Foster McCollum White Baydoun polling house basically told me at 538 his survey ignored the 31.6% of American households that are cell-onlies.
Which could be why Romney was 15 points up.

wheelers cat: You’ll notice that Prof. Wang and Nate Silver get pretty similar results for how many electoral votes Obama and Romney have today. Where they differ is that they’ve got completely different methods for calculating win probability.

Silver seems to be adding in some uncertainty essentially arbitrarily. As I think has been noted here, it’s odd that his “Now-cast” is currently giving very similar results to his November forecast; given how presidential races typically move after the conventions, the uncertainties in those numbers shouldn’t be so similar.

What is new and potentially game changing for the PEC’s methodology? I’m with London Young, point 3:

Campaign spending is up by orders of magnitude from what it was as recently as 2000.

The Obama campaign outspent the Romney campaign through the summer. In my view, they did that with the — achieved — goal of damaging/defining Romney’s image (Bain, tax returns).

But through those same months, the Romney campaign outraised the Obama campaign substantially. The Romney camp now has a lot more “dry powder” to fire off in the closing months. Can they move voters from where polling says they are now with an avalanche of (certain to be) negative media buys in key states?

(I include the new super PAC’s on each side in with the campaigns. So far, we haven’t seen a super PAC take a tack different from its parent campaign that ends up damaging the candidate’s efforts. I assume that in some cycle down the road that will happen.)

I don’t know. There are certainly examples of big failures to try to buy an election (see Meg Whitman in California). But what is going on this year, post-Citizens United, seems to me something truly new in a national cycle.

These are excellent comments. I am impressed. Also, thank you for the kind words.

Yesterday my colleague Paul Starr brought up several relevant points: (1) The fraction of voters who are undecided appears to be extremely low this year. The Meta-analysis indicates perhaps as low as 2% undecided, for a 4% swing. (2) A May 2012 Pew Research Center study showed a response rate of just 9% in phone surveys, down dramatically from even 2009. Holy moly.

In regard to identifying who is a likely voter (LV), I believe that Gallup already does what Amitabh is suggesting using “light turnout,” “medium turnout,” and “heavy turnout” models. That really matters in off-year elections. I recommend an extensive analysis in 2010 by former pollster Mark Blumenthal at Pollster.com.

In regard to registered voters (RVs) vs. LVs: in principle this ought to be a larger problem before Labor Day, after which most pollsters switch to reporting LVs. In past years I have not noticed a shift in the Presidential estimator in September. But it might be hard to see because of transients like convention bounces.

The question of cell-phone sampling comes up repeatedly. To my thinking it is not a real problem, in the sense that in principle it can be fixed by adjustments in survey techniques. However, the low response rate problem does give me pause.

Voter ID/voter suppression. For the long term this is a serious issue. It matters most in marginal situations. For the 2012 Presidential race it may not come up; Pennsylvania is not in play. But for state/district races, and as a fundamental matter of our democracy, it deserves close attention.

Technical issues with poll aggregation. Estimating d: Amitabh Lath, we are thinking alike. Yes, somehow one must anchor d. One possibility would be to assume that the median of d was zero. The justification is that on Election Day in 2004 and 2008, the Meta-analysis landed right on the EV result. If one wanted to get fancy, one could create a tool to allow the reader to anchor d using his/her favorite pollster.

There is nothing wrong with numerical simulations, if one wants to assess a model with lots of “nuisance parameters” (that is an excellent name for them). They are not a means of introducing randomness. But in my view, when one can, it is better to understand a model well enough to calculate a closed-form distribution. For example, I see no reason for the “Now-cast” to be calculated by simulation. All 2.3 quadrillion electoral possibilities can be explored (as they are in my calculation) using this bit of MATLAB haiku: dist=[1] / for i=1:51 / dist=conv(dist, [p(i) zeros(1,ev(i)-1) 1-p(i)]) / end. For the compulsive, add a few lines for Maine and Nebraska.
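
For readers without MATLAB, here is a sketch of the same convolution in Python (it assumes lists p and ev of state win probabilities and electoral votes, and, like the haiku, omits the Maine/Nebraska district splits):

```python
# Exact electoral-vote distribution by polynomial convolution: each state
# contributes a two-term polynomial, losing (0 EV) with probability 1-p
# and winning (ev EV) with probability p. The product over all states
# covers every one of the 2^51 outcomes in polynomial time.
def ev_distribution(p, ev):
    dist = [1.0]                              # distribution over EV won so far
    for pi, vi in zip(p, ev):
        kernel = [1.0 - pi] + [0.0] * (vi - 1) + [pi]
        new = [0.0] * (len(dist) + len(kernel) - 1)
        for j, dj in enumerate(dist):         # plain convolution
            for k, ck in enumerate(kernel):
                new[j + k] += dj * ck
        dist = new
    return dist                               # dist[k] = P(exactly k EV)

# Tiny example: a toss-up 3-EV state plus a safe 2-EV state gives
# equal mass at 2 EV and 5 EV.
d = ev_distribution([0.5, 1.0], [3, 2])
```

The payoff of the closed form is that the entire distribution, not just its mean, comes out exactly, with no simulation noise.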

Campaign spending. This does not affect poll accuracy per se, since its effects ought to be measurable. However, I do think it is the single most important new issue to watch in 2012. This is the first year in which the effects of Citizens United will be felt strongly. I believe it will make the largest difference in Senate, House, and state-level races. There is a reason why Karl Rove’s Crossroads GPS is working at these lower levels – leverage.

so Matt….what is your hypoth as to why the Nowcast is so different from PEC’s?
Both are entirely driven by poll inputs.
That is my understanding.

@pechmerle
PACs may have an effect. We don’t know yet. But Team Romney doesn’t think they will.
That is why Team Romney recently switched to the Sailer Strategy. The economy isn’t having the effect the GOP believed it would, either.
I think that is due to the calcification of the electorate, and demographic evolution.
Like Dr. Wang points out, the election is already over, barring a black swan event. If the Dems turn out, they win; if they don’t, they lose.
Simplesauce.

Stop picking on our learned colleagues. I am sure their statistical analysis is better than their French. Their occasional lack of self-awareness might make a Bayesian blush. I think you are onto something in raising the issue of expert pollsters having incorrect models of voter participation. That’s where a smart aggregation scheme of individual polls would bear fruit. Perhaps one should use prediction markets in the aggregation as well? How did the Intrade odds differ day by day from the 2008 Wang aggregate for Obama?

Mon cher, InTrade prices understate confidence. Even a 5-point Election Eve lead does not satisfy them. Follow the link for more. Type InTrade into the Search tool to see other essays on this subject. A bientot, Sam

I think you are making an excellent point about the Intrade markets reflecting too much residual uncertainty. Perhaps you can use the mis-pricing as a way of backing out how concentrated the representative trader’s priors were on the probability of an outcome. The natural conjugate family for the binomial is the beta distribution.
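
One simple way to carry out the moment-matching suggested here, as a sketch: treat the market price as the mean of a Beta(a, b) prior, and treat k = a + b as an "effective sample size" measuring how concentrated the trader's beliefs are. The 0.57 price and the k values below are invented.

```python
# Back out a Beta prior from a market-implied probability by moment
# matching: mean = a/(a+b) = price, with a+b = k chosen to reflect
# how concentrated the beliefs are. All numbers are hypothetical.
def beta_from_price(price, k):
    a = price * k
    b = (1.0 - price) * k
    return a, b

def beta_variance(a, b):
    return a * b / ((a + b) ** 2 * (a + b + 1.0))

diffuse = beta_from_price(0.57, 20.0)     # weakly held beliefs: high variance
tight = beta_from_price(0.57, 2000.0)     # strongly held beliefs: low variance
```

A trader whose price understates a true lead would correspond to a small k: the mean is right but the mass is spread wide.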

There is an article in the American Economic Review claiming that the Iowa Electronic Markets predicted better than the polls in past presidential elections.

Since I am an economist, I tend to believe in markets. Your example about missing the Chief Justice’s ruling cost me some money, but then I was pleasantly surprised at how little early insider trading there was. (In fact, I think the market did begin to move a few hours before the announcement, so there was some insider trading on the last night, I am sure.)

I also have a perspective as an economist on biased polls from organizations that label themselves Democratic or Republican. Why would any candidate pay good money for that garbage? I think the answer is that they report obviously flawed polls (which you include in your meta-analysis) simply to confuse the opposition. It’s a little like the Allies setting up fake military camps in the south of England before D-Day.

These purposely biased polls may not move your median, but they certainly make my prior beliefs about the true probability of success more diffuse.

By the way, as a former philosophy student at Old Nassau, I find your candid claim on the main page that you are not trying to predict outcomes a bit disingenuous. But it is a good disclaimer, and a chacun son gout.

Please forgive my economist’s hubris, but could I buy $1000 worth of Romney from you for $120 + commission? You see, I could buy the offsetting (allegedly mis-priced) Obama contract for $570, and that would be a tidy almost sure profit.

The only way I could lose would be if one candidate died before the election and his replacement won. That’s the only residual uncertainty.
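
The arithmetic of the proposed trade, spelled out as a sketch (ignoring commissions and the replacement scenario just mentioned):

```python
# Buying both sides for less than the guaranteed payout locks in the
# same profit whichever candidate wins. Prices are from the comment above.
payout = 1000.0        # the winning contract pays $1000
cost_romney = 120.0    # price asked for the Romney side
cost_obama = 570.0     # offsetting Obama contract

total_cost = cost_romney + cost_obama   # $690 total outlay
profit = payout - total_cost            # $310 in either outcome
```

Exactly one of the two contracts pays out, so the $310 is riskless as long as the quoted prices are real and both legs can actually be filled.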

While you’re fixing typos, your formula for the binomial expansion is opaque at best.

Eric – Glad to see economists are still learning basic probability. There is also the value of the reduced uncertainty. Go figure out your own arbitrage on InTrade. There are so many opportunities there.