Demystifying the Science and Art of Political Polling - By Mark Blumenthal

May 14, 2005

AAPOR: Day Two

A few quick notes from some of the sessions I attended yesterday at the AAPOR conference:

Keep in mind that the "working papers" presented at AAPOR (actually most are currently PowerPoint presentations) are just that - works in progress. Also, I can only sit in on one of the eight presentations that typically occur at any given time, so what follows is just a tiny fraction of the amazing variety of research findings presented today. Finally, I am sharing my own highly subjective view of what's "interesting." I'm sure others here might have a different impression.

Party ID - I have written previously about the idea that party identification is an attitude that can theoretically show minor change in the short term. This morning, Trevor Tompson and Mike Mokrzycki of the Associated Press presented results from an experiment showing that the way survey respondents answer the party identification question can change during the course of a single interview.

Throughout 2004, the IPSOS survey organization randomly divided each of their poll samples, asking the party ID question for half of the respondents at the end of the survey (where almost all public pollsters ask it) and for the other half at the very beginning of the survey. When they asked the party question at the end of the questionnaire, they consistently found more respondents identifying themselves as "independents" or (when using a follow-up question to identify "leaners") as "moderate Republicans." They also found that the effect was far stronger in surveys that asked many questions about the campaign or about President Bush than in surveys on mostly non-political subjects. Finally, they found that asking party identification first had an effect on other questions in the survey. For example, when they asked party identification first, Bush's job rating was slightly lower (48% vs. 50%) and Kerry's vote slightly higher (47% vs. 43%).

A few cautions: First, these results may be unique to the politics of 2004, the content of the IPSOS studies or both. The effect may have been different for, say, a Democratic president at a time of peace and prosperity. Second, I am told that another similar paper coming tomorrow will present findings with a different conclusion. Finally -- and perhaps most importantly -- while the small differences were statistically significant, it is not at all clear which placement gets the most accurate read on party identification.
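For readers curious about what "statistically significant" means for a split-sample experiment like this one, here is a minimal Python sketch of a standard two-proportion z-test applied to the 48% vs. 50% job rating figures. The sample sizes are hypothetical (the presentation's actual sample sizes are not reported here), and this is not necessarily the test the AP/IPSOS authors used.

```python
import math


def two_proportion_z(p1, n1, p2, n2):
    """Return (z, two-sided p-value) for the difference p1 - p2."""
    # Pooled proportion under the null hypothesis of no difference.
    pooled = (p1 * n1 + p2 * n2) / (n1 + n2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Hypothetical half-sample sizes: one poll vs. many polls pooled together.
for n in (1000, 5000):
    z, p = two_proportion_z(0.48, n, 0.50, n)
    print(f"n per half = {n}: z = {z:.2f}, p = {p:.3f}")
```

With these assumed sizes, a two-point gap does not reach the conventional 5% threshold at 1,000 interviews per half-sample but does at 5,000, which is one reason pooling a full year of split samples matters for detecting small effects.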

Response Bias and Urban Counties - Michael Dimock of the Pew Research Center presented some intriguing findings from an examination of whether non-response might result in an underrepresentation of urban areas. The basic issue is that response rates tend to be lower in densely populated urban areas and higher in sparsely populated areas. Although the Pew Center uses a methodology that is far more rigorous than most public polls (they will, for example, "call back" numbers at least 10 times to catch those typically away from home), and even though they weight their samples to match demographic characteristics estimated by the US Census, they still found that they under-represented voters from urban areas. Thus, in 2004, they also adjusted their samples to eliminate this geographic bias.

The result, according to Dimock, was typically a one point increase in Kerry's vote and a one point drop in Bush's vote (for a two-point reduction of what was usually a Bush lead). Thus, Pew's final survey had Bush ahead by a three-point margin (48% to 45%) that more or less nailed the President's ultimate 2.4% margin in the popular vote. Were it not for the geographic correction, their final survey would have shown a five-point Bush lead.
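To show the mechanics of this kind of geographic adjustment, here is a minimal Python sketch of post-stratification weighting by a single urban/non-urban variable. It is not Pew's actual procedure, and every number in it is hypothetical, chosen only so the arithmetic is easy to follow.

```python
def poststratification_weights(sample_counts, population_shares):
    """Weight for each stratum = population share / sample share."""
    total = sum(sample_counts.values())
    return {
        stratum: population_shares[stratum] / (count / total)
        for stratum, count in sample_counts.items()
    }


def weighted_share(support, sample_counts, weights):
    """Weighted share of respondents supporting a candidate."""
    numerator = sum(support[s] * sample_counts[s] * weights[s] for s in sample_counts)
    denominator = sum(sample_counts[s] * weights[s] for s in sample_counts)
    return numerator / denominator


# All numbers below are hypothetical, chosen only to show the mechanics.
sample_counts = {"urban": 200, "non_urban": 800}        # completed interviews
population_shares = {"urban": 0.30, "non_urban": 0.70}  # assumed true shares
kerry_support = {"urban": 0.60, "non_urban": 0.40}      # assumed support by stratum

weights = poststratification_weights(sample_counts, population_shares)
equal_weights = {s: 1.0 for s in sample_counts}
unweighted = weighted_share(kerry_support, sample_counts, equal_weights)
weighted = weighted_share(kerry_support, sample_counts, weights)
print(f"Kerry unweighted: {unweighted:.1%}, weighted: {weighted:.1%}")
```

In this made-up example, urban respondents are 20% of the sample but 30% of the population, so each urban interview counts 1.5 times and the Kerry share moves from 44% to 46%. The direction matches what Dimock described, though the size of the shift here is purely an artifact of the assumed inputs.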

After the presentation, a representative of CBS News pointed out that their survey also made a very similar weighting correction to adjust for geographic bias. While all of this may sound a bit arcane, it reflects an important change for these public pollsters who rarely weight geographically.

The User's Perspective - For MP, one very heartening development at this conference was discussion of the idea that in considering the overall quality of a survey, pollsters need to consider the perspective of the consumers of their data. Two very prominent figures within the survey research community, Robert Groves of the University of Michigan and Frank Newport of Gallup, both endorsed the view that one important measure of the quality of a survey is its "credibility and relevance" to its audience. Put another way, these two leaders of the field argued this week that pollsters "need to get more involved" in how users of survey data perceive their work.

For my money, there are no "users" of political survey data more devoted than political bloggers. As Chris Bowers wrote on MyDD yesterday,

Without any doubt in my mind, I can say that political bloggers are by far the biggest fans of political polling in America today. We are absolutely obsessed with you and what you do. Many of us subscribe to all of your websites. We read your press releases with relish, and write for audiences that are filled with hard-core consumers and devotees of your work. In Malcolm Gladwell's terminology, political bloggers and the many people who visit and participate in political blogs are public opinion mavens who can almost never consume too much information about the daily fluctuations of the national political landscape.

Chris is absolutely right. If political pollsters want to understand more about how their most devoted consumers feel, there is no better place to go than the blogosphere.

PS: Actually, the Chris Bowers quotation above is from the speech he will present by videotape tomorrow at a session at the AAPOR conference on blogging's impact on polling in which I am also a participant. Chris put the text of his presentation online, and those interested can view his readers' reactions to it here. Also, Ruy Teixeira has posted a summary of the various polling issues he discussed on his blog during the campaign.

Comments

I endorse Chris's points. As an end user during the campaign, I was a constant reader of the various free and subscription polls, especially the competing tracking polls summarized online in various places, and was intrigued by the mathematical models posted to try to synthesize them all into a poll of polls.

I warned against such 'poll of polls' models in the sense that collating apples and oranges (different questions, different samples, etc) into a meta-analysis without the details from the individual polls was likely not accurate and potentially misleading (how do you weight TIME and Newsweek vs Pew?).
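To make the weighting question concrete, here is a minimal Python sketch of one naive scheme, weighting each poll by its sample size. The polls listed are hypothetical, not real results, and the scheme is only one of many possible choices.

```python
def pooled_share(polls):
    """Average a candidate's share across polls, weighting each by sample size."""
    total_n = sum(n for _, n in polls)
    return sum(share * n for share, n in polls) / total_n


# (candidate share, sample size) pairs; hypothetical values, not real polls.
polls = [(0.48, 1000), (0.50, 500), (0.46, 1500)]

simple_mean = sum(share for share, _ in polls) / len(polls)
print(f"simple mean:   {simple_mean:.3f}")
print(f"size-weighted: {pooled_share(polls):.3f}")
```

Note that sample size is the only thing this scheme accounts for. It says nothing about differences in question wording, sampling frame, or likely voter screens, which is exactly the apples-and-oranges problem.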

I wonder if you've run into any discussion of that aspect of pre-election punditry at AAPOR - meta-polls and their worth? CNN loved 'em - I hated 'em.

Posted by: DemFromCT | May 14, 2005 11:45:37 AM

Mark,

Did you mean underrepresented urban areas?

You wrote:
"Michael Dimock of the Pew Research Center presented some intriguing findings on an examination they did on whether non-response rates might result in an overrepresentation of urban areas. "

Great posts on the conference so far.

You might want to consider a more verbatim type post from the Mitofsky presentation. I think us consumers eat that exit poll stuff up as well. :)

Posted by: Alex in Los Angeles | May 14, 2005 4:42:07 PM

Alex... This is all coming from memory, so I don't want to be accused of misrepresenting anyone. If I do, it's not on purpose. Consider this my eyewitness account of events that seem like a blur to me because AAPOR is very new to me and because events today moved very quickly.

The Mitofsky presentation and the "debate" with Ron of USCV afterwards happened so fast, it was hard enough to absorb, let alone get enough for anything close to a verbatim account. It would be nice to get Warren's slides.

Warren wasn't the only one who presented on the 2004 exit polls. Kathy Frankovich of CBS and Fritz Shueren of NORC (and President of the American Statistical Association) presented. Kathy's presentation had some great insight, that I think I'll let Mark share if he dares ;-).

Fritz's presentation was excellent as well as he obtained the raw Ohio exit poll data and did some thorough analysis. He concluded there was nothing strange about OH in the poll data, but that when you analyze the count data in OH, there are some irregularities worth looking into (he and others are doing that now). His bottom line seemed to be that the exit polls are a distraction from investigations into what appear to be irregularities in the count data in Ohio.

While I'm on this subject, there was another presentation on exit polls that was fascinating. Argghhh I don't have the info in front of me, but can provide more if Mark doesn't decide to take it up in a post on the main page. Basically, a group did an exit poll in New Mexico, but not with the same objectives as the NEP. They were trying to survey voter experience in voting. They asked questions like how hard it was to find the polling place, etc.

The most fascinating part was all the problems they encountered with being forced to stand alongside partisan activist organizations like MoveOn, which they think depressed response rates and introduced bias (Bush voters were less apt to respond, or more apt to avoid the pollster, when partisan activists were standing next to the pollster). Basically, the information they presented about their on-the-ground operations screamed differential response. (Fritz Shueren took a great photo of their pollster standing with someone from MoveOn and another "protect the vote" liberal advocacy group - Mark, I've requested it from the study author, perhaps you could get it from Shueren).

Hopefully Mark gets into a lot more detail. He took better notes and wasn't handicapped by the "rookie" factor, like I was. Also, getting papers, powerpoints, and other notes from presenters is very useful. I asked presenters for more stuff than I could possibly read before next year's conference, but the response very often was "I'll e-mail it to you." So, Mark, like me, is probably waiting for some e-mails and Mark probably doesn't want to post a bunch of anecdotes from memory on his blog, like I just did in his comments (heh).

"Finally -- and perhaps most importantly -- while the small differences were statistically significant, it is not at all clear which placement gets the most accurate read on party identification."

How about another thought: it should not be used as a tool to weight the entire survey. We have long said that party ID is an interesting statistic but it cannot and should not be used as a weighting variable.

Posted by: Eric Nielsen | May 16, 2005 9:42:14 AM

Eric, the more I learn about this, the more I tend to agree with you. Party ID seems to behave a lot like an attitude.

I think the AP/IPSOS study was interesting in that there might not be *one* good placement of the party ID question; that is, if that question is placed in the same place in every questionnaire, then it might not be reliable. If it is not reliable, it is a less "interesting statistic."

Will Gallup be experimenting with placement of their Party ID question? For those of us who find Party ID an interesting statistic, I'd sure like to know if there is an ordering effect.

The problem is, I think there's going to be an ordering effect both ways.

Meaning, if you ask, let's say, support/oppose war in Iraq and then ask party affiliation, you might get x% supporting, y% opposing, and a, b, and c% republican, democrat, and independent.

And if you switch the order of the questions, you might well get d, e, and f% identifying with each party and v and w% supporting and opposing the war.

Lots and LOTS of room to test this out.

Posted by: Ken Alper | May 16, 2005 2:26:56 PM

Ken, so true!

If there are ordering effects either way, and it is agreed that weighting to this characteristic is not really the best idea in the first place (some may argue that point), then why include the question? If it's degrading the reliability of the survey, then perhaps it shouldn't be asked at all.

Did we meet Saturday Ken? I recall meeting Jay and someone else from Survey USA, but am not sure if it was you.

I would urge you to review the most recent study at: www.USCountvotes.org in detail - especially the Appendices.

Our basic point has been, and continues to be, that the E-M data as reported are not consistent with constant mean uniform "bias" across partisan precincts.

In our latest study we have demonstrated this in three different ways. We show that:

a) overall exit response rates, that would be required in “representative” high Bush and high Kerry precincts to generate the within precinct errors, are mathematically infeasible.
b) output simulation of individual precincts shows that Bush voter exit poll reluctance would have to change by at least 40% across partisan precinct categories to generate the outcomes reported by EM.
c) input simulation of partisan exit poll response rates randomized around mean values of .56 for Kerry voters and .5 for Bush voters – which E-M claims can explain all of the within precinct error in the exit poll samples (see p. 31 of their report) – shows in multiple runs of over 10,000 simulations that mean and median WPE and overall response rates, especially in high Bush and Kerry precincts, cannot be generated.
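For readers who want to see the basic arithmetic behind the 56%/50% hypothesis referred to in (c), here is a minimal Python sketch. It is not the USCV simulator and not anything taken from the E-M report. It only computes expected values for a single precinct, assuming the two completion rates are constant everywhere and ignoring sampling noise. The function name and the sign convention for WPE (negative means the poll overstates Kerry) are assumptions of the sketch.

```python
def expected_precinct(kerry_share, kerry_rate=0.56, bush_rate=0.50):
    """Expected exit-poll outcome for one precinct.

    Returns (wpe, response_rate). WPE is the Bush-minus-Kerry margin in the
    poll minus the same margin in the count, so a negative value means the
    poll overstates Kerry. This sign convention is an assumption.
    """
    bush_share = 1.0 - kerry_share
    kerry_resp = kerry_share * kerry_rate  # Kerry voters who complete the poll
    bush_resp = bush_share * bush_rate     # Bush voters who complete the poll
    response_rate = kerry_resp + bush_resp
    poll_margin = (bush_resp - kerry_resp) / response_rate
    count_margin = bush_share - kerry_share
    return poll_margin - count_margin, response_rate


for kerry_share in (0.1, 0.3, 0.5, 0.7, 0.9):
    wpe, rate = expected_precinct(kerry_share)
    print(f"Kerry {kerry_share:.0%}: WPE {wpe * 100:+.1f} pts, response {rate:.1%}")
```

Under these simplified assumptions, the expected WPE is largest in magnitude in evenly split precincts and shrinks toward zero in lopsided ones, while the expected overall response rate rises with Kerry's share of the precinct. Real precinct data add sampling variability and other error sources that this sketch leaves out.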

On the other hand, our analysis shows that EM’s data IS consistent with a non-uniform response bias that would require further explanation, a uniform bias and vote shifting, or vote shifting. The patterns of mean and median WPE and overall response in high Bush and high Kerry precincts in particular cry out for further investigation.

My comments and "debate" with Mitofsky (in my view) centered on:

a) Clarifying a number of misrepresentations made by the (Mitofsky, Frankovich, and Shueren) panel, and the chair of the panel whose name escapes me, i.e. that our analysis was based on "leaked" afternoon data that had not been properly weighted. Frankly I was surprised that these claims were still being made. As readers of Mystery Pollster, I think, know, our earlier analysis was based on the "unadjusted" exit poll data downloaded from CNN that was stamped as having been updated after 12:00 AM on Nov. 3. Appendix C to our previous (April 12 updated) report shows that this data is almost identical to the Call-3 composite data presented in the E-M report (p. 21-22). This was evidently the "best guess" by E-M of the election outcome at the time it was released. Warren jumped on my phrase "unadjusted" to imply that I thought this data was somehow pure unadjusted exit poll data. What I meant is that it had not yet been adjusted to match the reported election outcomes. My point: any claim that we have been working with bad data implies that E-M has also been working with bad data, as we're using the same data - though of course we're restricted to what they release.

b) After reporting on our results (above - in much less detail given my time limit), I challenged Warren (and later Joe Lenski of Edison Media Research) to prove his "hypothesis" that the exit poll discrepancy was caused by a pervasive reluctant Republican response bias, specifically the statement on p. 31 of the E-M report that:

"While we cannot measure the completion rate by Democratic and Republican voters, hypothetical completion rates of 56% among Kerry voters and 50% among Bush voters overall would account for the entire Within Precinct Error that we have observed in 2004."

This is, after all what has been picked up as THE explanation of the exit poll discrepancy.

My point was simple. If this is the explanation, prove it (or at least provide credible statistical support for it) by running multifactor regressions that link the numerous factors that your tabulations show affect WPE (number of precincts, rate of response, distance from poll, etc.) to WPE, and then run another regression linking precinct level WPE to state level exit poll discrepancy. In particular, show that these factors result in the 56/50 (or something close to this) partisan response rates cited above, uncorrelated with precinct partisanship.

This is statistics ABC, the first thing one would do to support such a hypothesis. For some reason no one has asked them to do this obvious analysis to support their hypothesis. I noted that as an editor of an academic journal (which I am) I would have rejected the E-M report as presenting an unsubstantiated hypothesis without such an analysis. I said something like "This is the 20th century. We can do better than tabulations." Mitofsky responded by saying that he had done the regressions?! Why then have they not been released?!

c) Finally, I asked that E-M release, at least, precinct level reported election results and final composite call-3 unadjusted (to the reported election results) weights, so that we can do our own WPE analysis. Interestingly, Shueren's analysis, by far the most interesting on the panel - which I haven't yet obtained a copy of to review - was based on raw precinct data and precinct level reported election results for Ohio. If the data on Ohio can be released to Shueren, I see no reason why this entire data set can't be released to everyone else. Ultimately, if this analysis supports a more complete non-statistical investigation (as we are concerned that it will) - precinct identifiers of the unexplainable precincts should be released as a matter of overriding national interest in a fair and credible election.

Other, more general points:

d) Our report also includes, I think, a nice list of recommendations for reforms needed to restore confidence in our electoral systems including: routine independent exit polls, voter verified paper BALLOTs that can be checked by citizen election judges (not just paper trails accessible only to voting equipment vendors), and administration of elections by non-partisan civil servants. These may begin to put us in line with international election standards!

e) One thing that struck me about the conference is the gap between the pollsters and the election reform movement. We (UScountvotes and the Ohio lawyers) have been collecting (from under oath testimonials) and receiving reams of other evidence and reports of election corruption (some on a truly massive scale) from all over the country. My feeling was that those who question our analysis do so largely because they simply cannot believe that the level of corruption that we posit as a "possible hypothesis" could occur. Perhaps if they were more aware of what's going on on the ground - they would be less skeptical?

f) After all, I think our analysis has been more intensive and more credible than that of E-M (especially given our data restrictions).

It was interesting to me that the only time this is appreciated is when a former member of our group (Elizabeth Liddle) comes to the conclusion that uniform bias MIGHT be consistent with the reported E-M data. Her analysis, which came out of interaction with our 'passionate but shoddy work' (reported characterization by Mitofsky) - which I believe we have developed further than she has (see our latest report) - is now suddenly hailed by Mitofsky as a great insight. The value of one's analysis, as judged by the mainstream, seems very much to depend on what kind of conclusions you reach from it!

g) Mystery blog readers, I hope, know that the central effect of Elizabeth's (who still, as far as I know, characterizes herself as a "fraudster") valuable insight regarding the difference between partisan response rates and WPE is to make the uniform bias hypothesis LESS credible (see Appendix B of our last two papers). Uniform bias means the WPE by partisanship relationship should be "U" shaped, not flat, let alone inverted "U" shaped. The asymmetry part of her analysis is a mathematical "nit" (having to do with relating a ratio "alpha" to an absolute difference "WPE") that cannot possibly explain the WPE asymmetry in the E-M data (see Appendix E).

I think I've gone on long enough. Again, I urge readers to study our most recent report at www.uscountvotes.org in detail. My own (unbiased!) opinion is that it's an excellent report that far outstrips anything that has been done to date on the E-M hypothesis. What's more, it's completely transparent and verifiable by any reader who cares to download our publicly available (and transparent - one is on a spreadsheet) "exit poll" simulators and check the E-M hypothesis (and a range of "vote shift" hypotheses) out for themselves!

Best Regards to Mystery Blog Readers,

Ron Baiman

P.S. - Mark Blumenthal and I had some exchanges during the conference, which by the way I (and Peter Pekarsky) were only able to go to because of the outpouring of local support from the Andersonville Neighbors for Peace, Oak Park Committee for Peace and Justice, and the Chicago Democratic Socialists of America, after I casually mentioned at some local forums that we would not be able to send anyone to the AAPOR. A lot of people are livid about what's going on - my sense is that credibility in our electoral system has never been lower - at least in modern times. In any case, Mark and I have agreed for now to disagree (until I can convince him otherwise!) on the plausibility that the reluctant Republican responder hypothesis can fully explain the exit poll discrepancy.

Before I forget, let me thank those members of AAPOR who prevailed on the conference organizers to waive our reg fees, provide us with a free lunch, allow us "first response" to the panel, and the opportunity to distribute literature and debate with the panel, other AAPOR members, and the press, after the panel. USCV plans to join AAPOR and submit a formal proposal to next year's AAPOR convention.

Posted by: Ron Baiman | May 16, 2005 2:43:28 PM

Ron, funny thing but you put this in comments to the wrong post. You should refer to Mark's detailed post regarding the exit poll discussion above.

Why simulate any longer when we have the results of Lizzie's index applied to the real data? The correlation is not significant. Here's a statistics ABC lesson for you: The line is not different from zero. Bias is not concentrated in High Bush precincts, not even close.

Funny, you told me, Warren, even David Moore of Gallup that the scatters were "just a blob" and didn't say anything. How right they are - they don't say anything about fraud. They are consistent with randomly distributed bias that resulted in an overall bias towards Kerry. No fraud fingerprints left in the pattern.

Those scatters thwart your entire case. The fact that you call them a "blob" is very enlightening.

Hi, Rick! That was me. Very nice to meet you; sorry we got sidetracked at the end there.

I think the party affiliation question gets asked mainly as a crosstab, so that users of the poll can quickly see who's gaining and losing ground among which groups of voters. Leaving it out and replacing it with a conservative/moderate/liberal crosstab might be interesting, though then observers wouldn't be able to look and see how many republicans/democrats/independents were in the poll.

An interesting experiment, actually, to try to see just how much party affiliation changes, would be to take daily or weekly tracking over the course of a year, asking nothing but party identification, and displaying the numbers on a graph that points out news events that might logically cause identification to shift.

Posted by: Ken Alper | May 16, 2005 3:27:05 PM

Ken, glad to hear! Now I have a face to match the name.

If there are ordering effects with the Party ID question and they do reduce the reliability of responses to the instrument's other questions, then perhaps the party ID question should only be asked as a stand alone question? It seems to be a problem definitely worthy of more research.

Maybe SurveyUSA could test this question and present next year? It would sure be worth knowing if responses to the single question differed from responses to the Party ID question imbedded in other surveys over time.

Yes, Rick. DU forum has been taken over by the elementary-school-level "mathematician" and his mob of sycophants. He discovered a couple of functions in Excel and thinks he is an "expert" now :) It's really funny how he cannot answer even the most elementary questions about polling statistics and methods - but hey, he is urged to write books (!) by his "true believer" followers.

Posted by: Sawyer | May 17, 2005 1:46:36 AM

Of course, the complete and utter un-professionalism of Kathy Dopp can be observed on DU board where she started a witch hunt against Elizabeth. How do you feel about that, Ron? How does it feel when your co-author's "academic" argument on a public forum has been reduced to "Are you now or have you ever been...?"

Posted by: Sawyer | May 17, 2005 1:51:42 AM

But TIA has three mathematics degrees! I have to give Elizabeth some credit for trying at DU, although she's tried pretty hard to explain what's going on to USCV.

Do you know who Ribofunk is? S/he is sort of on the right track, but not quite there. Same with mgr. These folks should come over here and discuss some things and then take what they learn back to DU.

If TIA has any math degrees at all, then I am a ballet dancer. In one of the arguments on the board he insisted that 40*10=4000. Seriously.

ribofunk and mgr are walking a very thin line over there. One step over it and they will be banned as well. Censorship is alive and well in the DU land. The eradication of all "thoughtcrime" is in full swing.

Posted by: Sawyer | May 17, 2005 2:23:51 AM

Yes, welcome, Sawyer. I'm not at liberty to answer your questions to Ron, but I can say that the truth is out there. And I do mean, out there. ;)

"Professional pollster Mark Blumenthal started Mystery Pollster to provide better interpretation of polling results and methodology... offers much needed help to Political Wire readers" - Political Wire