Vast liberal conspiracies?

September 26th, 2012, 12:48pm by Sam Wang

(Welcome, readers of IEEE Spectrum! A small group, but enriched for quantitative people. Do not be fooled by the picture. I usually do not glower.)

The Popular Vote Meta-Margin just hit 5.0%, its highest value of the campaign. I will briefly address the wishful thinking of Romney supporters, despite its faint whiff of hysteria. Some fantasies involve the assertion that party weighting is off. Others suggest that undecided voters will all go in one direction. However, Jay Cost and Unskewed-Man should probably drop the silliness, for the following reasons.

Party self-identification is more fluid than is commonly realized. Not only does voter enthusiasm change on either side, but some people seem to change their self-report. The party-weighting hypothesis would also require a simultaneous, coordinated effort by all the pollsters (except for brave Scott Rasmussen) to stop doing what they did so well in 2004 and 2008 (see left sidebar). Bottom line: the polls are fine.

The undecided voter break is also highly unlikely, as Charles Franklin reported in 2008. As evidenced by recent polls, undecideds are currently 5.0 +/- 1.7 % (median +/- estimated SEM, n=6). Based on how last-minute undecideds have split in the past, they’ll only be good for a net shift of 0.3 +/- 1.0 % towards Romney.

Finally, there is the question of whether the debates matter. Contrary to what has been said lately, in fact they might, especially the first one. In 2004, Kerry gained about 30 EV, equivalent to about 1.5% of Meta-margin.

Let’s say Romney overpowered Obama in debate (1.5%) and captured an unusually high fraction of undecideds (0.3+1.0 = 1.3%). That’s 2.8% total. He would still have to make up an additional 2.2% from somewhere, almost the size of the post-Democratic-convention bounce to date. I do not see that happening.
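The arithmetic in this scenario can be checked with a short sketch. All figures are the ones quoted in the post, in percentage points of Meta-Margin; the variable names are illustrative only.

```python
# Figures quoted in the post, in percentage points of Meta-Margin.
meta_margin = 5.0        # current Popular Vote Meta-Margin
debate_gain = 1.5        # size of Kerry's 2004 post-debate gain
undecided_mean = 0.3     # expected net shift of undecideds toward Romney
undecided_sd = 1.0       # one-sigma uncertainty on that shift

# Optimistic case for Romney: a Kerry-sized debate gain plus a
# one-sigma-high break among undecided voters.
best_case_undecided = undecided_mean + undecided_sd
best_case_total = debate_gain + best_case_undecided
remaining_gap = meta_margin - best_case_total

print(f"best-case gain: {best_case_total:.1f} points")
print(f"still needed:   {remaining_gap:.1f} points")
```

Even with both effects at the optimistic end, the sketch leaves a gap that has to come from somewhere else.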

I encourage you to focus away from the Presidential race. Of more practical interest are questions where the outcome is uncertain: a few Senate races (ND, IN, MT, CT, and MA), and House control. Karl Rove’s Crossroads GPS is probably making adjustments to reflect this polling reality. It’s hard to imagine them pouring much more money into the Presidential race. I have adjusted ActBlue as well, by de-emphasizing WI-Baldwin and VA-Kaine because they are out of the knife-edge zone.

(Update: As pointed out in comments, Senate races in Nevada and Arizona might be entering knife-edge territory. I’m watching that.)

Looks like the Arizona Senate contest is tightening. Carmona last night released an internal poll that shows him only one point behind Flake. Rasmussen’s poll released today has Flake leading by 6 points, which also puts Carmona within striking distance, especially considering that it’s Rasmussen.

Sam, I was wondering if you’ve looked into ballot initiatives and the effect of generic congressional polls on those? I’m assuming that this is much more complicated, since you don’t see such a clean partisan divide on those issues. This comes to mind, because it certainly looks like Washington State’s I-502 is heading for passage, which completely surprises me. It’s certainly radical by modern political standards…

Tons of ballot initiatives here in CO. Now that I think about it, it might foreshadow the election, since the ‘Right to Life’ initiative failed to make the ballot (1st time in quite a few cycles) but the ‘legalize marijuana’ initiative is on there.

Saw my 1st anti-Miklosi ad this morning (CO-6). They truly seem to want people to believe that Obamacare is bringing Canada-style health care to the US. Strangely enough, they also say he’s cutting Medicare (which is much more similar to Canadian health care than Obamacare is) in the same ad!
But so far, on House seats, the positive ads have outnumbered the negative ones. And I’m seeing slightly more Dem ones than GOP.

Hi Mr Wang.
I, too, was indulging my non-partisan side and poked into some GOP rabbit holes. Came across these references to “D + 11” as the explanation as to why most all the polls are looking grim for the GOPers. It was on “Breitbart”, so I guess it’s no surprise they are dismissive.
Could you take a minute and shine your bright light on this arcane straw they seem to be clinging to?

The legislature should pass a constitutional amendment prohibiting federal law enforcement from overriding state laws not mandated by the constitution of the US. I think several states with medical marijuana statutes would jump on that bandwagon, forcing the feds to back down or face a constitutional convention.

But, but, but, the Romney campaign has internal polls that show it closer than the public polls. Yes, just like the McCain people had internal polls a few days before the 2008 election showing a tight race in PA. Right…

If the Romney people knew it was close, his Bain pals would be scarfing up cheap shares on Intrade and his price would be higher than $2.50. And Pawlenty wouldn’t have quit 6 weeks before the election.

Of course, affiliations are fluid and they are self-reported. No pollster can or would check that someone calling himself a Democrat is actually registered as such.

Sam, is it wise to de-emphasize Warren–Brown just yet? The two latest polls show a tie (Rasmussen 09/25) and a four-point lead for Scott (UMass/Boston Herald 09/20). Granted, before that Warren was leading in half-a-dozen polls, but still…

Any fresh news on Heitkamp–Berg in North Dakota? I know of nothing since the July poll by Rasmussen.

Are ND, MT, and IN more important due to the down ticket effect? For instance, Warren is polling even, but POTUS’ performance in the state should help her carry the seat? The aforementioned traditional reds will not get this ‘boost’, hence the need for more $$$?

Also, why is the race in Nevada not included, it appears to be neck and neck?

Basically. All races appear to have been pulled toward Democrats, but more so in strongly Democratic states, as I’ve written.

Arizona and Nevada – those are new. Thank you, Patrick and Olav, for pointing these out. That puts a 57-43 split within the range of possibilities. It is starting to look like the optimal path for either side would be to put all resources into House races.

E.L.: I fear you may be spot on.
What is truly frightening is that so-called “voter fraud” (which is virtually non-existent!) is being used as a means to disenfranchise hundreds of thousands, possibly millions, of voters. Now that is voter fraud.

I have a completely stupid question. Please forgive my ignorance. When a pollster calls 1000 people, who say they are likely voters, why does he even bother to ask their party identification? This makes no sense if he wants simply to report a poll of 1000 random people.

Am I right to guess that he is doing a stratified sample? Thus he wants to make sure that these 1000 people match some demographic prediction for the American electorate.

Here comes my stupid question. Where does the demographic prediction for the electorate come from? Do the different pollsters use different demographic models, or do they largely key off the same one?

First, it wasn’t always clear that party weighting was a bad idea. In fact, from about 1960 up through the 1980s, it was poli-sci consensus that party was a demographic, and not an attitude. Asking for party ID, even if you don’t intend to use it, is a bit of a holdover in that sense.

(Zogby Interactive’s truly awful polling, which made heavy use of party ID, helped discredit the notion, as did the research of Fiorina et al.)

And the goal for trial heats is not to conduct a survey of random adults, because adults aren’t the people who elect candidates. Neither are registered voters. Your target population (nationally, at least) is that 50-60% of the voting age population that will actually show up. And realistically, there is no easy way to conduct a truly random sample of those people – or of registered voters, or of adults in general. Pollsters start with a random sample of adults with phones, and go from there.

Here’s one common way to conduct a poll: Design a questionnaire that is not likely to produce wording effects. Random-digit dial your population, including both landline and cellphone users, and speak to a randomized adult within the household. Conduct the interview, rotating question order so that you’re not suffering unduly from primacy or order effects. Check the registered-voter sample respondents against the demographics of registered voters within the population (if you properly randomized the original sample, this should not pose a problem, as 99% or more of registered voters have phone access). Weight that sample once more by easily predictable demographic statistics, using prior results and census data. (Sex, age, and race are common, as are education and income.) Report the results.
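The demographic weighting step described above can be sketched as simple post-stratification on one variable. This is a toy illustration, not any pollster's actual procedure: the sample, the candidate labels "A" and "B", and the 47/53 target split are all made-up assumptions, and real polls weight on several variables at once.

```python
from collections import Counter

def poststratify(respondents, targets, key):
    """Weight each respondent by (target share) / (sample share) of their group."""
    counts = Counter(r[key] for r in respondents)
    n = len(respondents)
    return [targets[r[key]] / (counts[r[key]] / n) for r in respondents]

def weighted_share(respondents, weights, candidate):
    """Weighted fraction of respondents choosing `candidate`."""
    total = sum(weights)
    chosen = sum(w for r, w in zip(respondents, weights) if r["choice"] == candidate)
    return chosen / total

# Hypothetical raw sample: 60 men and 40 women, so men are over-represented.
sample = (
    [{"sex": "M", "choice": "A"}] * 24 + [{"sex": "M", "choice": "B"}] * 36
    + [{"sex": "F", "choice": "A"}] * 24 + [{"sex": "F", "choice": "B"}] * 16
)
targets = {"M": 0.47, "F": 0.53}  # assumed electorate shares, for illustration

weights = poststratify(sample, targets, "sex")
raw = sum(r["choice"] == "A" for r in sample) / len(sample)
adjusted = weighted_share(sample, weights, "A")

print(f"raw share for A:      {raw:.3f}")
print(f"weighted share for A: {adjusted:.3f}")
```

Because women favor candidate A in this toy sample and were under-represented, weighting to the assumed targets raises A's share.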

Some pollsters prefer to devise likely-voter samples using attitude, but I believe that this is prone to response error: More people will tell you they intend to vote than will actually do so. A few, like Rasmussen, even terminate the interview if they don’t get a strong response to that question, which I believe is methodologically unsound. Some use different demo weights, or none at all.

I like demographics because they tend to vary little each year, and in predictable ways. (You can set your watch to a 46-48% male electorate, for example. The white vote share drops by a few points every presidential cycle, while the black, Hispanic, and other categories increase accordingly. Census and registration data are used to decide the weights precisely.)

Party has two disadvantages compared to demos. It’s an attitude and not a characteristic, so while it usually doesn’t swing heavily from cycle to cycle, it tends to do so in either direction unpredictably. And second, it has a very high correlation with voter choice – Democratic and Republican candidates routinely capture 90% or more of self-described Democrats and Republicans, respectively. Out of the commonly used demographics, only the black vote has such a high correlation to vote choice. Each point of error in your party breakdown is more damaging to your forecast than a corresponding point of error in sex, age, or race.
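To see why a point of error in party breakdown costs more than a point of error in a demographic, here is a rough sketch. The 90% and 10% support levels follow the "90% or more" figure above; the 55% and 45% split by sex is a hypothetical assumption chosen only for contrast.

```python
def topline_shift(support_in_gaining_group, support_in_losing_group, share_error=0.01):
    """Change in a candidate's overall share when `share_error` of the sample
    is wrongly moved from one group to another."""
    return share_error * (support_in_gaining_group - support_in_losing_group)

# One point of the sample misclassified from Republican to Democrat,
# for a Democratic candidate who wins 90% of Democrats and 10% of Republicans.
party_shift = topline_shift(0.90, 0.10)

# One point misclassified from men to women, assuming (hypothetically)
# the candidate wins 55% of women and 45% of men.
sex_shift = topline_shift(0.55, 0.45)

print(f"party error moves the topline by {100 * party_shift:.2f} points")
print(f"sex error moves the topline by   {100 * sex_shift:.2f} points")
```

The bigger the gap in candidate support between two groups, the more a misjudged group share distorts the result.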

Party ID does have one use: If your samples are demographically consistent across multiple surveys, but party ID fluctuates heavily, it’s a warning sign that you’ve fallen prey to random sampling error. (If party ID doesn’t swing back in the next poll, however, it may represent a genuine shift in the electorate.)

I have done a lot of digging and a quick back-of-the-envelope calculation. The weighting schemes are based on the CPS from May 2010 that reported voter participation by demographic group in November 2008. If voter participation by demographic group does not change across quadrennial cycles, then weighting the raw sample might be appropriate. Still, pollsters, like the proverbial generals, are always fighting the last war. And Sam Wang is a ballistics expert using the best ammunition he is given.

But the 2008 election did have unprecedented turnout among certain groups because of the enthusiasm behind Obama. Pollsters then used the 2004 model, and indeed they underpredicted Obama’s margins in OH, FL, VA, and NC by about two points. (Check RealClearPolitics 2008 data.) These biases did not mess up Sam Wang because the final polls showed Obama ahead in all those states.

Here’s what I still do not understand about what political scientists do. I know they want a true picture of likely voters. This is the same problem we economists face if we want to predict who will buy green Rolls-Royces instead of blue ones. We could sample the population of car owners randomly, but we would need a huge sample to get enough people who actually are rich enough to be our target audience. So if we sampled 1000 people, we might get 10 in the right income bracket, and we would give their answers huge weight. This would create high variance in our estimates that we would mask by N=1000.

But the idea of random sampling is of course stupid. We should oversample rich people. So why don’t political scientists just throw out anyone who says she is Republican or Democrat and just oversample people likely to be pivotal in the swing states?
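The point about a few heavily weighted respondents hiding behind N=1000 can be illustrated with Kish's approximate effective sample size, n_eff = (sum of weights)^2 / (sum of squared weights). The weights below are hypothetical, chosen only to mirror the 10-in-1000 example.

```python
def effective_sample_size(weights):
    """Kish's approximation: (sum of w)^2 / (sum of w^2)."""
    total = sum(weights)
    total_sq = sum(w * w for w in weights)
    return total * total / total_sq

# Hypothetical: 990 respondents at weight 1 and 10 respondents at weight 50.
weights = [1.0] * 990 + [50.0] * 10
n_eff = effective_sample_size(weights)

print(f"nominal N = {len(weights)}, effective N = {n_eff:.1f}")
```

With equal weights the effective size equals the nominal size; the more unequal the weights, the further it falls below N.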

I’ve been wondering the same thing, especially when it comes to “Likely Voters.” Maybe party affiliation is relevant when the population is “Registered Voters” and you are trying to adjust for who is actually going to show up and vote. But when you are sampling “Likely Voters” why would you care about party affiliation? Even more so when you aren’t talking about one poll of 1,000 voters but an aggregation of numerous polls. Doesn’t the Law of Large Numbers come into play?

As for Jay Cost, the term is “weighting” not “weighing”; the “t” makes a difference. Weighing polls would mean putting them on a scale to find out how much they weigh. Obviously possible when the polls are printed on paper, not so obvious how you do this when they are digital.

I feel compelled to point out that the “Polls are Biased Crowd” are the same folks who believe all of the media is biased, except for FOX. And they are the same folks who hold science in contempt and worship ignorance. Of course they are going to have trouble understanding the math and aren’t going to believe it when the results don’t conform to their preconceptions. These people can’t be allowed to run the country.

Bill, I am one of those ignorant people who think conservative ideas may have much merit. Please forgive me if my understanding of statistics is not what it might be.

The Weak Law of Large Numbers will not come into play if the different polls are not independent. The correlation arises because the pollsters are probably using roughly the same demographic models, perhaps based on the 2008 exit polls. Those polls were bad predictors for the 2010 election, and that’s why the data Sam fed his model in that year made his predictions fare badly. The likely voter polls need not be biased, but they are not independent. So any form of aggregation, including one based upon medians, will have problematic statistical properties.
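The independence concern can be made concrete with a simplified sketch. It assumes every poll has the same standard deviation sigma and every pair of polls shares the same correlation rho, in which case the variance of the average of n polls is sigma^2 * (1 + (n - 1) * rho) / n. It covers the mean rather than the median, and the values of sigma and rho are hypothetical.

```python
import math

def sd_of_average(sigma, n, rho):
    """Standard deviation of the mean of n equally correlated polls."""
    return sigma * math.sqrt((1 + (n - 1) * rho) / n)

sigma = 3.0  # assumed per-poll error, in percentage points
for n in (1, 9, 100, 10000):
    independent = sd_of_average(sigma, n, 0.0)
    correlated = sd_of_average(sigma, n, 0.25)
    print(f"n={n:5d}  independent: {independent:.3f}  correlated: {correlated:.3f}")

# With rho > 0 the error never falls below sigma * sqrt(rho), however many
# polls are averaged.
floor = sigma * math.sqrt(0.25)
print(f"floor for rho=0.25: {floor:.3f}")
```

Under these assumptions, adding polls shrinks the independent part of the error, but the shared part remains.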

The likely pattern is that on-year, with the Presidency in the spotlight, pollsters on average can identify likely voters. In off-years, partisan intensity becomes more important – and harder to quantify by current prevailing methods.

Regarding partisan statements: everyone, think twice before posting a pure statement of opinion. Discussion here is better if it is informed and/or technical. For example, Craig’s answer is amazing.

That “at this early stage” caught me, too. Does Romney not know that early voting has started? And that, while there are still 40 days to go until 11/6, he’s already been defined to a large number of people … and they don’t like him?

They are following a panel of 3500 people through the entire election cycle, questioning each of them weekly (500/day). Romney and Obama were tied at 46% as recently as 3 weeks ago and now Obama is over 50% and Romney is at 42%. The “Shifts Between Candidates” shows individual voters actually switching from Romney to Obama.

In the light of this, those who pretend Romney is ahead are simply ludicrous.

I recall that at least one of the “debates don’t matter” papers analyzed elections up through 2000. But it seems to me that both 2004 and 2008 were possible counterexamples. 2004 you covered above. In 2008, while the Lehman Brothers collapse probably killed McCain’s convention/Palin bounce, it was after the first debate that Obama’s lead really blew wide open.

If I recall correctly, most of that analysis focused on the binary direction – i.e., who’s in the lead. I suspect a more quantitative approach would be consistent with 2004 and 2008. That’s as opposed to debates suddenly becoming relevant at the exact moment I got into this activity.

It’s okay to have faith in the numbers, but are you still dealing with a huge group of people who can do ANYTHING once they get inside the booth? This election might end up being closer than these recent polls indicate. Plus there’s a lot of time left.

Obviously individual voters can vote whichever way they choose once they enter the voting booth, and that doesn’t have to reflect the answers they gave when asked by pollsters. But modern electoral history suggests that is very unlikely, and why would huge blocks of voters change their minds at the last minute? There doesn’t seem to be any data from the current election to support that hypothesis. If the lead were routinely swinging from Obama to Romney and back again, that might suggest a public that was likely to change its opinion at the last minute, but we don’t see any of that in the data. Obama has held the lead pretty much from the beginning.

OK so, when the polls were showing a consistent 1% Obama lead (July to August), were the Dems being oversampled back then? If so, then the polls would be consistent in showing a true shift toward Obama, even if their numbers were not quite where the electorate truly was. If not, then something else is going on: the pollsters all decided to start polling more Dems all at once! Liberal conspiracy! Or perhaps the polls are showing a different shift in the race: Independents who lean Dem, and Dems who would normally call themselves independent (like myself), start embracing the label the closer we get to the election.

John Cole has a counter-worry, that the ridiculous “Unskewed Polls” narrative is actually laying the groundwork for an impending Republican theft of the election, at which point the manufactured vote totals will reveal it to have been apparently right all along.

I think I’d lend this more credence if that site were actually getting any traction outside of the conservative echo chamber. I suspect it’s more laying the groundwork for a myth of *Obama* stealing the election after he wins.