NPR’s Science Friday – this week at 2:15pm ET

October 17th, 2012, 4:00pm by Sam Wang

This Friday at 2:15pm Eastern, I’ll be on National Public Radio’s “Science Friday” with host Flora Lichtman. Nate Silver and I will talk about poll aggregation and the state of the national race. We’ll take calls too. Tune in – or listen here.

Update: one topic that may arise is what’s been surprising about this year’s race. My nominees are (1) the stability/polarization of the race, (2) the lack of an RNC bounce, and (3) the giant debate bounce (and the apparent effect of the VP debate). What do you think?

I’ve been telling everyone to stop getting worked up about one poll (whether good or bad) and to just stop by 538 and Sam Wang’s election site instead.

I really think people end up way less informed by trying to keep track of all of these polls themselves rather than to just let you guys take it all in and make sense of it.

…

Speaking of uninformed. Did anyone else notice that the CBS host said something like this following the debate: “the first debate moved the race from a tie to a small Romney lead.” (I’m paraphrasing, but that’s the gist of what he said) … WTF?! that’s so far off the mark of what’s actually happening… it’s just amazing. And this guy is a lead anchor for CBS News… could he possibly be that misinformed or is it something more nefarious?

I cast my vote for b) something more nefarious. Because he’s only slightly off. The first debate started with a small O lead and after that debate there came a small R lead. I’m assuming he may have been a neophyte as regards polling which would indicate he was speaking of the national polls. The nefarious part is your interpretation. Gallup has a rolling average that is today up to Romney 51% and Obama 45%. I can’t reconcile that one.

I agree that Obama has a slight advantage at the moment, but it needs to be remembered that Romney is likely to win VA, NC and FL. CO also seems like a good bet for Romney. So, if Romney can flip Ohio (and this may not be as hard as Mr. Wang thinks), he will almost certainly win the election.

“it needs to be remembered that Romney is likely to win VA, NC and FL.”

PPP & YouGov have Obama down by 1 point in Florida, and Obama has also been very close in many recent NC polls (even Rasmussen only has Obama down by 3 points in NC).

And Virginia is a toss-up at worst. There’s no solid evidence that Romney is “likely” to win it. NYT/Q had Obama up by 5% there even after the first debate.

As far as Ohio is concerned, even during Romney’s peak (and I believe that’s what the past two weeks have been) he’s never been able to take a lead in a reputable poll, and most Ohio polls have him down by 4 to 6 points. When you add in the huge advantage Obama has in early voting there, I don’t think you have much of an argument there either.

And all of that was before President Obama’s magnificent ass kicking of that smarmy liar Romney last night.

Whenever I was down about the President’s chances, the one thing that has kept me confident is those polls in Ohio. If Obama can take Ohio, a Romney presidency is highly unlikely. I expect my blood pressure to spike in the hours after the Ohio polls close….

Romney wins without Ohio if he takes New Hampshire, Colorado and Wisconsin, on top of all the states where he’s currently leading. Some of the Romney-victory tail in Sam’s probability distribution is probably that.

I suspect NH is going to flip back into Obama’s column, and that Obama will hold Wisconsin. But Ohio isn’t a leakproof firewall if Obama ends up having to defend it at the expense of the rest of the country.

Hello world. I have a question on statistics; I cannot quite figure out the answer despite significant research here on the Internet. It regards margin of error. As I understand it, if we have a poll of Obama 50 and Romney 45 with “a 3% margin of error” that means that if that poll were done 100 times, 95 of those times Obama would get between 47 and 53 and Romney would get between 42 and 48. Two questions:

(1) Imagine a standalone poll–in a vacuum–with O 50 and R 45 with a MOE of 3%. Is it correct to say that, based on this poll, there are no grounds to believe that O is ahead?

And (2) Imagine that now instead of just one poll there is a series of polls, spread over time. In each poll O is leading R (perhaps by different amounts in each poll) but ALWAYS within the MOE of that individual poll. Can we not ascribe some objective advantage to O even though this is not implied by any individual poll? Thanks.

(1) No. In that example, O is ahead. Reported margin of error is usually the 95% confidence band for a 50-50 race – for one candidate’s vote share. The candidate-candidate margin doubles the uncertainty. In this case the principle is: if A leads B by the margin of error, he/she really is ahead 84% of the time – in the sample that was surveyed.
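The 84% figure above can be checked with a short calculation. This is only a minimal sketch, assuming a normal approximation, a reported margin of error equal to 1.96 standard errors on one candidate's share, and a two-way race in which the two shares move in opposite directions (so the uncertainty on the lead is twice the uncertainty on a single share). The function name `prob_ahead` is made up for illustration.

```python
import math

def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def prob_ahead(lead, moe):
    """Probability that the leader is truly ahead in the sampled population.

    lead: candidate-candidate margin in points (e.g. 5 for 50-45)
    moe:  reported 95% margin of error on one candidate's share, in points
    Assumes the two shares are perfectly anti-correlated (two-way race).
    """
    sigma_share = moe / 1.96          # one standard error on a single share
    sigma_lead = 2.0 * sigma_share    # the margin doubles the uncertainty
    return normal_cdf(lead / sigma_lead)

if __name__ == "__main__":
    print(round(prob_ahead(3, 3), 3))   # lead equal to the margin of error
    print(round(prob_ahead(5, 3), 3))   # the O 50, R 45 example
```

Under these assumptions, a lead equal to the margin of error comes out at roughly 84%, and the 50-45 example at roughly 95%.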

Correct me if I’m wrong, but the margin of error most polls quote is simply the statistical uncertainty, 1/sqrt(N). The uncertainty on subsamples (by ethnicity, gender, age) is larger, but not often quoted.

Also, they should quote uncertainties on likely voter screens, subsample weights, etc. and add those in quadrature, but don’t.
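To make the 1/sqrt(N) shortcut and the "add in quadrature" idea concrete, here is a rough sketch. It assumes simple random sampling and independent error sources. The function names are invented for this example, and the 2-point likely-voter uncertainty is a placeholder, not a measured value.

```python
import math

def moe_95(n, p=0.5):
    """95% sampling margin of error on one candidate's share (as a fraction)."""
    return 1.96 * math.sqrt(p * (1.0 - p) / n)

def rule_of_thumb(n):
    """The 1/sqrt(N) shortcut mentioned above."""
    return 1.0 / math.sqrt(n)

def add_in_quadrature(*errors):
    """Combine independent error sources: square, sum, take the square root."""
    return math.sqrt(sum(e * e for e in errors))

if __name__ == "__main__":
    n = 1000
    print(moe_95(n), rule_of_thumb(n))
    # A subsample a quarter the size of the full sample has twice the margin of error.
    print(moe_95(n // 4))
    # Placeholder value: a made-up 2-point uncertainty from the likely voter screen.
    print(add_in_quadrature(moe_95(n), 0.02))
```

For a sample of 1,000 the two formulas agree to within a tenth of a point, which is why the shortcut is usually good enough.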

In the second case, the value of multiple polls depends on the type of series it is. In a RAND-style survey, where you re-ask the same sample of people over and over, asking them many times does nothing to reduce the possible error from randomly getting an unrepresentative sample of the population.

But if you’re re-sampling the whole population every time, a consistent signal you see ought to gain statistical significance over time. Of course, it could be real or it could still be some systematic error in your procedure.

On the other hand, if the results are all within your MOE but Obama is never behind, at some point that’s a sign that something is wrong with your math. If you’ve calculated the sampling error correctly, occasionally you should see some results that are outside the 95% confidence interval, and if the distribution is roughly symmetric, about 2.5% of the time the outlier should be in Romney’s direction.
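The point about fresh samples gaining significance can be sketched numerically. The example below assumes independent polls of the whole population, a normal approximation, and a simple unweighted average of the leads. The five O+2 polls are hypothetical, and `prob_ahead_pooled` is a name chosen for this illustration.

```python
import math

def normal_cdf(z):
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def sigma_lead(moe):
    """Standard error on the lead, from a reported 95% margin of error on one share."""
    return 2.0 * moe / 1.96

def prob_ahead_pooled(polls):
    """polls: list of (lead, moe) pairs from independent fresh samples.

    Uses the simple average of the leads; its standard error is
    sqrt(sum of squared standard errors) / number of polls.
    """
    k = len(polls)
    mean_lead = sum(lead for lead, _ in polls) / k
    se_mean = math.sqrt(sum(sigma_lead(moe) ** 2 for _, moe in polls)) / k
    return normal_cdf(mean_lead / se_mean)

if __name__ == "__main__":
    one_poll = [(2, 3)]
    five_polls = [(2, 3)] * 5   # hypothetical: five polls, each O+2 with a 3-point MOE
    print(round(prob_ahead_pooled(one_poll), 3))
    print(round(prob_ahead_pooled(five_polls), 3))
```

Each poll on its own is well inside its margin of error, yet the pooled probability is noticeably higher than for a single poll. This covers sampling error only; a systematic error shared by all the polls would not average out.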

1. Obama won the debate due to the Libya fact check issue. It would have been a tie otherwise.
2. Most people don’t care about Libya enough to change their vote, so very few minds were changed.
3. As long as the YouGov polls stay in, we’re looking at O+1 to O+1.5 in the MM.
4. Once YouGov rolls off (it’s a re-ask, so people are less likely to change their minds than never-asked, see RAND), we’re looking at an O+.5 to O+1.5 range.
5. Either candidate could move the MM a little over the next 20 days, but we can’t predict that.
6. If the MM is below the Kerry-Bush margin, Romney could win, as Kerry came close. He does have a potential Hispanic-undercount issue, so I wouldn’t put it at 50-50 until the MM is at R+.5.
7. If we’re at Kerry-Bush or above on election day, Obama will probably win. R+1.5<MM<O+.9= long election night.

I think the “fact check” was bigger than the topic itself — people may not care about Libya, let alone on what day Obama said “terror” — but not having your facts straight when you make an attack that your advisors had been telling the world you planned to make just looks bad.

Alas, it’s good and it’s bad. The more popular Dr. Wang gets, the more we are beset here with low-information conservative trolls citing singleton outlier polls and whining about Dem oversampling.
And I’m afraid some ginormous media conglomerate will buy him like HuffPo bought Blumenthal and the NYT bought Nate.
/sadface

Holy kurtosis a fractal!
The Debate1 around 10/3 behavior is isomorphic and opposite to the Debate2 behavior around 10/16.
Is that an artifact of the methodology?
Respondents can turn in their surveys up to a week late….I wonder if some responders held onto their surveys until after the debate.
I think RAND would tend to educate respondents about politics…maybe make them more aware.

If you look at the other charts on the RAND site, you can see that the jump wasn’t from anybody switching to Obama (the net motion is still toward Romney there). It was a jump in “likelihood of voting” among Obama supporters and a decline for Romney, which would be consistent with the theory that the main effect of the second debate will be on base turnout.

It sounds as if PPP doesn’t do a lot of likely-voter screening in national polls besides telling people to hang up if they’re not going to vote. So they might miss some of that.

Before anyone asks, it looks like this morning’s Meta Margin jump is mostly due to a Survey USA poll in Ohio and JZ Analytics/Newsmax in Florida, both showing Obama leads. Both were conducted prior to the Hofstra debate.

I have GOT to hear this! I’ll donate 10 bucks to the candidate of your choice if you offer to arm-wrestle Nate Silver on the air! (it’s hard to convey in text how much I love both sites so don’t take that too seriously).

Dr Wang, I have several questions that I would be delighted to hear addressed in your discussion with Nate Silver.

1) What assumptions lie at the core of your prediction model/method?

2) Looking back at your work as an election predictor, what old assumptions have you found it necessary to correct / fine-tune, and why?

3) Are there, in your opinion, substantive efforts to undermine the integrity of the American election and ballot counting process — and if so, how can you as an election predictor take this into account?

4) How do you deal with the new phenomenon that some pollsters seem more concerned with driving the narrative in the news media, than in accurate polling? Or is this never the case…

Regarding 1, for instance: You have been extraordinarily open about your methods and assumptions. As far as I can see, one of your core assumptions is “the wisdom of the crowd” — and that the sum-total of polls in each state is unbiased. Is there any reason to question this?

Nate Silver, however, often refers to “the Model” almost as though it is a third person, and he seems rather opaque as to his core assumptions. He seems to assume, though, that economic indicators are not already reflected in the polls and must thus be taken into account — whereas you’ve pointed out that this risks counting these factors twice.

Regarding 2: You have quite openly discussed your mistaken assumption in past elections about how the undecided votes would break. But I’m sure you and Nate have other interesting examples…

Regarding 3: You have, for example, recently analyzed the impact of re-districting on House races. And I believe you were quite astounded at what you found… (As I understand it, you have incorporated your findings into your House predictions.)

@wheelers cat: In 2008, McCain would have needed 62% of the white vote to reach 50% of the total vote, if his proportions among all other groups held steady. (He won 55% in the end.) That’s your starting point – if you assume Obama’s 2008 proportions among nonwhite voters hold (probably not a good assumption) and that the white vote declines slightly (probably a good assumption), then Romney would need about two-thirds of the white vote to win.
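The arithmetic behind a "share of the white vote needed" figure is a one-line weighted average solved for the unknown. The sketch below uses placeholder inputs chosen only to show the mechanics. They are not exit-poll numbers, and `required_share` is a hypothetical helper.

```python
def required_share(group_fraction, share_elsewhere, target=0.5):
    """Share of one group a candidate needs to reach `target` overall.

    group_fraction:  that group's fraction of the electorate
    share_elsewhere: candidate's share among everyone outside the group
    """
    outside_votes = (1.0 - group_fraction) * share_elsewhere
    return (target - outside_votes) / group_fraction

if __name__ == "__main__":
    # Placeholder inputs for illustration only, not exit-poll figures.
    print(round(required_share(0.72, 0.20), 3))
    # A slightly smaller group raises the share required.
    print(round(required_share(0.70, 0.20), 3))
```

The second call shows the direction of the effect described above: if the group shrinks as a fraction of the electorate while support elsewhere holds steady, the share needed within the group goes up.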

Thanks for the nice link to the NPR show. I’ll use it and tell others! Is it true I can listen after just as easily? If so, what time will it end/post?

I like the questions posed. Why not hand these or others in to the producer to look over? Better yet, say something quotable to affect voter enthusiasm, and get people out to the polls, or to inquire about absentee/mail voting right away.

Does anyone think publicizing voting by mail can help fight voter intimidation? If so, please join me in spreading the blue link below the FAQ, upper left on this page.

In Berkeley CA, our mail in ballot is 4 oversize sheets of cardstock with 50+ candidates and 30+ measures covering all 8 sides. The included info says use $1.50 postage, but the Post Office advised $1.05, and of course there are drop off methods….

Say, can you or anyone tell us anything about the new Meta-Margin dip? Is the VP debate not yet in the mix? Or is the MM trying to better resemble a fish hook? Sometimes these results just stick in my craw.

I could be wrong, but I think the MM is very sensitive to new polls from very close states, e.g., OH, which had an O+3 poll last night and an O+1 poll late this morning.

Btw, I read a few weeks ago that something like 400K absentee ballots were thrown out in OH in 2008 for various defects. If one party tends to use absentee ballots more than another, that might swing a very close election.

“…400K absentee ballots were thrown out in OH in 2008 for various defects.”

This should be a cause for great concern. Who processes absentee ballots? If it’s Ohio’s Secretary-of-State Jon Husted, then I would be very worried, indeed. That is the same guy who fought tooth-and-nail to prevent early voting from being made available on weekends for all Ohioans. He shamelessly pursued this all the way to the US Supreme Court, hoping to suppress the vote of Democratic-leaning demographics.

I expect a lot of Ohio absentee ballots to be “lost in the mail”.
And if Husted & Co has a finger in the “processing” or “counting” process, it should not be trusted.

Seriously does anyone really believe that what Mitt did in Debate #1 in “exposing” Obama changed the voter thinking process? Are the candidates’ positions, lack of details, weak points etc. so obscure that they were hidden till then?

One of the issues both you (SW) and Nate Silver have expressed concern over is the horribly low response rate for most pollsters nowadays (10% -ish).

Was it always thus?

If you were running an experiment where you accepted only 10% of the possible signal, you would worry that you had an extremely biased sample. And yes, there are weights etc. that pollsters apply to fix that problem to first order.

But perhaps there is something else hinky with this sample. Maybe they are people who are more prone to suggestion than the general voting public? Maybe the probability they will pick up the phone changes with what they see on TV? This could explain the large sudden jumps in the polling, which seem counter-intuitive in a season when there are very few undecideds.

Just to clarify: In the hypothesis I am forming, this 10% (who still pick up the phone when caller-id shows someone unknown, in this day and age!) mirror the general voting public in their opinions and statistical makeup in all ways (after weighting).

The only difference is that R-lean ones are less likely to answer in the aftermath of something like the 47% tape (and O-leans more likely); and O-lean ones less likely (R-lean more) after debate#1.

So in the long run (integrated over ~month) you get the right answer. But the sudden peaks and troughs are artifacts due to the inconstancy of this sample of people.
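A back-of-the-envelope version of this hypothesis: if the two camps hold steady but answer the phone at slightly different rates, the mix of respondents shifts even though nobody changed their mind. The response rates below are hypothetical, and `observed_share` is a name made up for this sketch.

```python
def observed_share(true_share, response_a, response_b):
    """Share of respondents backing candidate A when supporters respond at different rates."""
    a = true_share * response_a
    b = (1.0 - true_share) * response_b
    return a / (a + b)

if __name__ == "__main__":
    # Hypothetical rates: a tied race, A's supporters answer 10% of calls, B's answer 9%.
    print(round(observed_share(0.5, 0.10, 0.09), 4))
    # Equal response rates recover the true share.
    print(round(observed_share(0.5, 0.10, 0.10), 4))
```

In this toy case a one-point difference in response rates turns a tie into a lead of about five points among respondents, unless weighting happens to correct for it.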

well my hypoth, Amitabh cher, is that enthusiasm correlates with likelihood of voting/picking-up-the-phone at R^2 = 1 for republicans and R^2 < 1 for same correlation for democrats.
Because of asymmetrical political behavior.
I just don’t think carbon-based systems behave as neatly Gaussian as classical physics systems.

I can tell you that when I got into the business in the early part of the last decade, response rates were much, much higher, about 25% on average. However, I will note that the two best presidential cycles for pollsters have been 2004 and 2008, which came after four straight cycles of poor predictions – despite much higher response rates in those eras.

Now it’s true that methodology has probably improved overall during that time, and ideological cohesion has increased, which makes the job easier. So it’s impossible to say whether the decline in response rates has negatively affected polling and just been overwhelmed by changes in other variables, or whether it has had no effect at all.

wheeler’s cat: So if enthusiasm gooses R phone-picker-uppers more than D (and why this asymmetry?) then once we are two weeks or so past the event, things should get back to normal (ie, in agreement with the general voting public).

Craigo: It’s interesting that you have firsthand experience of steep drops in response rates. Do you have any ideas why it happened (is happening)? And is there any un-correctable difference between the people who still respond to pollsters and the rest of the population?

It’s a combination of factors: the rise of caller ID in the 1990s, followed closely by the rise of cell phones in the next decade, quickly followed by the rise of IVR polling (which has slightly lower response rates compared to live-interview).

The biggest difference in response rate lies in the 18-29 (and now somewhat older) age range, who are much less likely to answer their phone or participate if they do.

The second difference is those households where Spanish is the primary spoken language – lower response rates even when Spanish-language options are included.

The good news is that A) these groups comprise a lower proportion of voters than they do of the population, so the potential error is relatively small, and B) the error is correctable using demographic weighting.

Some people worry that different partisan/ideological groups may have lower response rates, but I’ve seen no evidence for this claim. It was particularly widespread just after the 2004 election by pundits who confused the early exit polling (which predicted a Kerry victory) with pre-election opinion polling (which was exceptionally accurate that year).

@Craigo
Ed Freeland was here last month and described fairly robust polling methods to capture the cell phone demographics; stratification adjustment and dual frame polling. The problem appears to be (for example the gap between traditional pollsters and robopollers) that Gallup and Rasmussen either don’t use or misuse those methods. I suspect this is because of asymmetrical political behavior in red polling houses– of course I can’t prove it because their methodology is opaque.
Which leads into Amitabh’s question. Red/blue genetics, neuropolitics, and asymmetrical ideology are all rather new domains of research.
But the basic premise is that red phenotypes and blue phenotypes differ in significant and measurable ways, both in morphology and function, and these differences could lead to asymmetrical effects in behavior.

I’m sorry I missed that! I’m really happy that Sam was able to get an expert to do a guest-post, I was hoping for that. I favor dual-frame until further research indicates that stratification adjustment is as effective (that’s not to say that it’s flawed, however – it just has more uncertainty at this point). I admit that dual-frame is often cost-prohibitive.

I have no doubt that there are neuropolitical differences (though it’s far, far from an area of expertise of mine). What I’m unsure of is whether neuropolitics is producing different response rates among red and blue, and whether that potential difference is significant. I do find it quite interesting that Republican self-identification has dropped among conservatives this cycle.

One paper suggests that voter preferences normally should not undergo big swings in short periods of time. High volatility is due to the changing composition of the “likely” voter pool (as deemed by the pollster), which is affected by changes in enthusiasm.

I just looked over Gallup. So, let me get this straight. Obama now is behind only 1 point in registered voters. Day before, he was down 2. And his approval went up to 50%. Yet, he is down one among likely voters? This makes absolutely no sense whatsoever. So Gallup finds that more people approve of him. More people registered are voting for him, yet more are less likely to vote for him? This is why Gallup has gone downhill. I used to be addicted to them. But recent history shows they are horrible. Look at their polling for 2008. Even worse than 2010. I am glad I follow Dr Wang now. He is right. The popular vote is meaningless in this democratic republic. It is the EC. My only concern was Dr Wang moving NH to the Romney column.

This is meant as an addition to Olav’s words to Loader; surely there’s naught I can say to Olav he doesn’t know all about!
I glance at other things, but if Sam’s work is right, it would mean it adds noise to go elsewhere, especially when PEC’s map is adjustable for this [I admit I had a friend travel quite a ways to get Java working just for PEC]
Beneath the daily snapshot map [is _that_ the SINGLE snapshot, Sam? Or are all three models of the single snapshot in the ideal mind?] choose current map.

Click on an unsafe colored state and your map will firm up. That may tell you all you need to know. Read the “safe” EVs now. Interesting, eh?

With white states, use the swingstate list in the right margin under the header POWER OF YOUR VOTE [Raised in Jersey I love “jerseyvotes”]. That will suggest the logical color to turn your white states if any.

Click the states and they strobe through the choices, the EV’s self tabulating deliciously.

If a state or two remain/s white you may as well play with it yourself. Thanks to Sam, you know as much as anyone.

Even more confusing? A new Newsmax/Zogby poll, which is a conservative Republican tracker. It gives O a +4 in Florida. That is +1 over the pre-debate 2 poll. Wow. This is either good or bad depending on how you look at it. This tracking poll skews Republican usually. I would love to ask Dr Wang and NS tomorrow what their take on it is.

Like he said, don’t bother looking at any one poll too closely – unless it has serious, obvious methodological problems.

But that said – Random error plus likely voter screens can do funny things, and tracking polls are not meant to be a snapshot of the day they are published, but the mean of public opinion over their entire sample, which at this point lies almost entirely between the two debates when Romney hit his peak.

Dr Wang, of course not! I have a ton of questions! But I would take up the entire time you and Nate are on the air. Heh. I was mainly replying to the one person’s freak out over Gallup, and pointing out the inconsistencies. I gave the Zogby poll as an example of why, as you state on your site, we need to track the EC, not the PV, which is what outfits like Gallup do. My concern is that the general public is a low information voter and they pay attention to name brands like Gallup etc, and not polls that actually track data in regards to the EC, which is really all that matters. 2000 showed us that. I cannot wait to hear the show tomorrow. Good luck!

One explanation for how Gallup can have Romney increase his lead at the same time that Obama’s approval rating is going up is that their approval polling is on a three-day cycle, not 7 days like the horse race. So the post-debate polling just won’t be felt as quickly.

Rasmussen also had Romney expanding his lead, from 1 to 2 points. Likely because they just had a good Obama day roll off the cycle.
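The difference between a 3-day and a 7-day cycle is easy to see with a toy series. The sketch below assumes each tracker reports a plain trailing average of daily results. The daily margins are invented (a sudden shift on day 7), and real trackers may weight their days differently.

```python
def rolling_mean(values, window):
    """Trailing mean over up to `window` most recent daily values."""
    out = []
    for i in range(len(values)):
        chunk = values[max(0, i - window + 1): i + 1]
        out.append(sum(chunk) / len(chunk))
    return out

if __name__ == "__main__":
    # Hypothetical daily margins: R+2 for a week, then O+2 after an event on day 7.
    daily = [-2.0] * 7 + [2.0] * 7
    three_day = rolling_mean(daily, 3)
    seven_day = rolling_mean(daily, 7)
    for day, (a, b) in enumerate(zip(three_day, seven_day)):
        print(day, round(a, 2), round(b, 2))
```

In this made-up series the 3-day average fully reflects the shift three days after it happens, while the 7-day average still shows the old leader at that point and needs the full week to catch up.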

I am training myself to ignore the national polling. It really is not as useful. A post-debate poll from Ohio/Nevada/Colorado/Iowa/Florida etc would be way better.

I believe Gallup’s approval figures are drawn from U.S. adults, not registered voters, as well as being 3-day averages.

About 5 days ago Gallup announced it was changing its methodology to make cellphone calls 50% of their daily sample (rather than 30%, I think). Since the change, Romney has steadily climbed among LVs. That seems counterintuitive given younger, and presumably more Obama-friendly, voters are being called. Although I assume they still adjust their sample’s responses to match underlying demographic assumptions, I wonder what influence, if any, the change has had on their poll results. (Caveat: I know very little about the inner workings of polling.)

I wouldn’t call it probable, but it could happen. It sounds as if opposition to Obama in the South is getting more intense than we’ve seen in a very long time. If Obama can hang onto the swing states elsewhere while losing the South as badly as Gallup says he will, we could have an EV/PV split. And a lot of drama, probably.

There are no approval charts by individual states, but I keep seeing this pop-up, a gap of several points between approval and horserace for Obama and it strikes me as strange. Obama has frequently had a favorability gap because he is personally liked by people who don’t think he is doing a good job (enough of them anyway), but I don’t recall seeing this approval gap before.

On the Pollster chart, at the height of Obama’s peak he was at 48.4% and basically an identical 48.8% approval. Since then he’s dropped on the horse-race chart (at 47% at the moment) but actually improved on the approval chart, to where he is at 49.4% today.

I don’t have the time or ability to comb back through the polls to see if there is a true statistically significant difference between the correlation of approval rating and horse race number before and after the “fall” but if it’s there, I don’t see a rational explanation. According to the polls right now there is a 2%+ gap (nationally) of people saying they approve of the job Obama is doing but do not plan to vote for him for president. I just don’t get it.

I have a question regarding how you calculate your median for the state polls. How many polls do you use to calculate the median? Is it a fixed number, or do you have a chronological cutoff (i.e., using all the polls published in the last ten days)?

I’ve been trying to figure this out on my own, but I haven’t succeeded yet. Specifically, with regard to today’s data, I’m not sure how you arrive at R +2 in Florida. If I’m not mistaken, Romney was listed +1 there yesterday, and the new poll published in the last 24 hours has Obama +1. In this case, how did the median shift to Romney by a point?
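One way a median can move against the newest poll is through older polls aging out of the window. The sketch below is not the site's documented rule. It assumes a hypothetical 7-day cutoff and uses invented Florida margins, purely to show the mechanism.

```python
from datetime import date, timedelta
from statistics import median

# Hypothetical polls: (last day in the field, Obama-minus-Romney margin in points).
POLLS = [
    (date(2012, 10, 11), 3), (date(2012, 10, 11), 2), (date(2012, 10, 11), 1),
    (date(2012, 10, 13), -1), (date(2012, 10, 14), -2),
    (date(2012, 10, 15), -3), (date(2012, 10, 16), -4),
    (date(2012, 10, 18), 1),   # the new poll
]

def median_margin(polls, as_of, window_days=7):
    """Median margin of polls that ended within `window_days` before `as_of`."""
    cutoff = as_of - timedelta(days=window_days)
    recent = [m for end, m in polls if cutoff < end <= as_of]
    return median(recent)

if __name__ == "__main__":
    print(median_margin(POLLS, date(2012, 10, 17)))  # before the new poll
    print(median_margin(POLLS, date(2012, 10, 18)))  # after it arrives
```

With these made-up numbers the median goes from R+1 to R+2 on the day an O+1 poll arrives, because three older Obama-leaning polls drop out of the window at the same time.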

I think the question of this election is how big of a lead does Romney need nationally to win Ohio. There seems to be a lot of data that point to a tied national popular vote. If you use Sam’s median method on the national polls you get an exact tie. With this tie, you have polls that show Obama +2 in Ohio with the same method. For all intents and purposes this is as close to a make or break state for either candidate. You could come up with plausible combinations for either that don’t include Ohio, but the most likely winning combinations for either do include it.

There is a precedent for a candidate winning the popular vote by 3%, getting over 50% and still losing the electoral college – Tilden in 1876. So even if Romney won the popular vote by up to 2% and lost Ohio and the election, it wouldn’t be the biggest miss between the popular vote and electoral college.

You’ve probably thought all this, plus noticed I said metamargin when I meant EV estimator, which explains the sharpness, now, doesn’t it? But not the downturn. When did/does the VP debate & this last debate show up in charts? Over the course of a week, after a week, or??

I don’t figure to surprise you, but reading these comments, I think you and Silver could get a bit of useful attention from collusion: are there procedures, questions or data you agree you need to see from polltakers?
Is there some soundbitable narrative etc you can agree on that will help get everyone out voting? Was the debate downturn an implosion caused by gish gallop and imploded superman syndrome?

Seen as competitors, if you made surprising statement/s together, wouldn’t it be hard for the mainstream media to resist? You may have real power here. Hide my comment if that helps. As IF!
To this end [teaming up a moment for power! :D], you might also want in advance to skim the book Nate’s flogging or otherwise prepare a positive remark short of,
If you only read one book this year, I think we found your problem.

My nominee for what’s most surprising would be your analysis of redistricting on the House. That seems to be one area where you surprised yourself with the size of the advantage conferred to the Republicans by the 2010 redistricting exercise. I don’t think that item has gotten nearly enough mention in the national press.

So…I watched Silver on the Daily Show last night.
And I noticed something that is pretty prevalent on this blog too– DSP analogy.
What if DSP doesn’t really work well for carbon-based systems? What if pattern recognition works better?
Is the brain a machine, or something else?
Is the underlying structure of reality Gaussian or Mandelbrotian?
What if we are doing this all wrong?