Demystifying the Science and Art of Political Polling - By Mark Blumenthal

May 16, 2005

AAPOR: Exit Poll Presentation

Unfortunately, the sleep deprivation experiment that was my AAPOR conference experience finally caught up with me Saturday night. So this may be a bit belated, but after a day of travel and rest, I want to provide those not at the AAPOR conference with an update on some of the new information about the exit polls presented on Saturday. Our lunch session included presentations by Warren Mitofsky, who conducted the exit polls for the National Election Pool (NEP), Kathy Frankovic of CBS News, and Fritz Scheuren of the National Opinion Research Center (NORC) at the University of Chicago.

Mitofsky spoke first and explicitly recognized the contribution of Elizabeth Liddle (that I described at length a few weeks ago). He described "within precinct error" (WPE), the basic measure that Mitofsky had used to measure the discrepancy between the exit polls and the count within the sampled precincts: "There is a problem with it," he said, explaining that Liddle, "a woman a lot smarter than we are," had shown that the measure breaks down when used to look at how error varied by the "partisanship" of the precinct. The tabulation of error across types of precincts - heavily Republican to heavily Democratic - has been at the heart of an ongoing debate over the reasons for the discrepancy between the exit poll results and the vote count.

Mitofsky then presented the results of Liddle's computational model (including two charts) and her proposed "within precinct Error Index" (all explained in detail here). He then presented two "scatter plot" charts. The first showed the values of the original within precinct error (WPE) measure by the partisanship of the precinct. Mitofsky gave MP permission to share that plot with you, and I have reproduced it below.

The scatter plot provides a far better "picture" of the error data than the table presented in the original Edison-Mitofsky report (p. 36), because it shows the wide, mostly random dispersion of values. Mitofsky noted that the plot of WPE tends to show an overstatement mostly in the middle precincts, as Liddle's model predicted. A regression line drawn through the data shows a modest upward slope.

Mitofsky then presented a similar plot of Liddle's Error Index by precinct partisanship. The pattern is flatter and more uniform and the slope of the regression line is flat. It is important to remember that this chart, unlike all of Liddle's prior work, is based not on randomly generated "Monte Carlo" simulations, but on the actual exit poll data.

Thus, Mitofsky presented evidence showing, as Liddle predicted, that the apparent pattern in the error by partisanship -- a pattern that showed less error in heavily Democratic precincts and more error in heavily Republican precincts -- was mostly an artifact of the tabulation.
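The artifact described above can be illustrated with a minimal sketch. It assumes a two-candidate precinct in which Kerry voters respond alpha times as often as Bush voters, and it assumes a sign convention in which WPE is the counted Kerry-minus-Bush margin minus the polled margin. The function names, the sign convention and the 0.56 and 0.50 response rates (which echo figures quoted in the comment thread) are illustrative assumptions, not the Edison-Mitofsky or Liddle code.

```python
import math

def poll_kerry_share(kerry_share, alpha):
    """Kerry's share among respondents when Kerry voters respond
    alpha times as often as Bush voters (two-candidate precinct)."""
    return alpha * kerry_share / (alpha * kerry_share + (1.0 - kerry_share))

def wpe(kerry_share, alpha):
    """Assumed sign convention: counted Kerry-minus-Bush margin minus the
    polled margin, so a negative value means the poll overstated Kerry."""
    count_margin = 2.0 * kerry_share - 1.0
    poll_margin = 2.0 * poll_kerry_share(kerry_share, alpha) - 1.0
    return count_margin - poll_margin

def ln_alpha(kerry_share, polled_share):
    """Recover the log of the response-rate ratio from the counted
    and polled Kerry shares of a single precinct."""
    odds_poll = polled_share / (1.0 - polled_share)
    odds_count = kerry_share / (1.0 - kerry_share)
    return math.log(odds_poll / odds_count)

ALPHA = 0.56 / 0.50  # hypothetical constant bias, identical in every precinct

for k in (0.1, 0.3, 0.5, 0.7, 0.9):
    p = poll_kerry_share(k, ALPHA)
    print(f"Kerry share {k:.1f}: WPE {wpe(k, ALPHA):+.4f}, "
          f"ln(alpha) {ln_alpha(k, p):.4f}")
```

Under these assumptions the same bias yields a larger WPE in evenly split precincts than in lopsided ones, while the log of the response-rate ratio stays constant across partisanship.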

Kathy Frankovic, the polling director at CBS, followed Mitofsky with another presentation that focused more directly on explaining the likely root causes of the exit poll discrepancy. She talked in part about the history of past internal research on the interactions between interviewers and respondents in exit polls. Some of this has been published, much has not. She cited two specific studies that were new to me:

A fascinating pilot test in 1991 looked for ways to boost response rates. The exit pollsters offered potential respondents a free pen as an incentive to complete the interview. The pen bore the logos of the major television networks. The pen-incentive boosted response rates, but it also increased within-precinct-error (creating a bias that favored the Democratic candidate), because as Frankovic put it, "Democrats took the free pens, Republicans didn't." [Correction (5/17): The study was done in 1997 on VNS exit polls conducted for the New York and New Jersey general elections. The experiment involved both pens and a color folder displayed to respondents that bore the network logos and the words "short" and "confidential." It was the folder condition, not the pens, that appeared to increase response rates and introduce error toward the Democrat. More on this below]

Studies between 1992 and 1996 showed that "partisanship of interviewers was related to absolute and signed WPE in presidential" elections, but not in off-year statewide elections. That means that in those years, interviews conducted by Democratic interviewers showed a higher rate of error favoring the Democratic candidate for president than Republican interviewers.

These two findings tend to support two distinct yet complementary explanations for the root causes of the exit poll problems. The pen experiment suggests that an emphasis on CBS, NBC, ABC, FOX, CNN and AP (whose logos appear on the questionnaire, the exit poll "ballot box" and the ID badge the interviewer wears and which the interviewers mention in their "ask") helps induce cooperation from Democrats, "reluctance" from Republicans.

Second, the "reluctance" may also be an indirect result of the physical characteristics of the interviewers that, as Frankovic put it, "can be interpreted by voters as partisan." She presented much material on interviewer age (the following text comes from her slides which she graciously shared):

In 2004 Younger Interviewers...

* Had a lower response rate overall
  - 53% for interviewers under 25
  - 61% for interviewers 60 and older
* Admitted to having a harder time with voters
  - 27% of interviewers under 25 described respondents as very cooperative
  - 69% of interviewers over 55 did
* Had a greater within precinct error

Frankovic also showed two charts showing that since 1996, younger exit poll interviewers have consistently had a tougher time winning cooperation from older voters. The response rates for voters age 60+ were 14 to 15 points lower for younger interviewers than older interviewers in 1996, 2000 and 2004. She concluded:

IT'S NOT THAT YOUNGER INTERVIEWERS AREN'T GOOD - IT'S THAT DIFFERENT KINDS OF VOTERS MAY PERCEIVE THEM DIFFERENTLY

Partisanship isn't visible - interviewers don't wear buttons -- but they do have physical characteristics that can be interpreted by voters as partisan.

And when the interviewer has a hard time, they may be tempted to gravitate to people like them.

Frankovic did not note the age composition of the interviewers in her presentation, but the Edison-Mitofsky report from January makes clear that the interviewer pool was considerably younger than the voters they polled. Interviewers between the ages of 18 and 24 covered more than a third of the precincts (36% - page 44), while only 9% of the voters in the national exit poll were 18-24 (tabulated from data available here). These results imply that more interviewers "looked" like Democrats than Republicans, and this imbalance introduced a Democratic bias into the response patterns.

Finally, Dr. Fritz Scheuren presented findings from an independent assessment of the exit polls and precinct vote data in Ohio commissioned by the Election Science Institute. His presentation addressed the theories of vote fraud directly.

Scheuren is the current President of the American Statistical Association, and Vice President for Statistics at NORC. He was given access to the exit poll data and matched that independently to vote return data. [Correction 5-17: Scheuren had access to a precinct level data file from NEP that included a close approximation of the actual Kerry vote in each of the sample precincts, but did not identify those precincts. Scheuren did not independently confirm the vote totals.]

The more detailed information allowed us to see that voting patterns were consistent with past results and consistent with exit poll results across precincts. It looks more like Bush voters were refusing to participate and less like systematic fraud.

Scheuren's complete presentation is now available online and MP highly recommends reading it in full.

[ESI also presented a paper at AAPOR on their pilot exit poll study in New Mexico designed to monitor problems with voting. It is worth downloading just for the picture of the exit poll interviewer forced to stand next to a MoveOn.org volunteer, which speaks volumes about another source of problem].

The most interesting chart in Scheuren's presentation compared support for George Bush in 2000 and 2004 in the 49 precincts sampled in the exit poll. If the exit poll had measured fraud in 2004, and had fraud occurred in these precincts in 2004 and not 2000, one would expect to see a consistent pattern in which the precincts overstating Kerry fell on a separate parallel line, indicating higher values in 2004 than 2000. That was not the case. A subsequent chart showed virtually no correlation between the exit poll discrepancy and the difference between Bush's 2000 and 2004 votes.

The experimental research cited above was part of the VNS exit poll of the New Jersey and New York City General Elections in November, 1997. While my original description reflects the substance of the experiment, the reality was a bit more complicated.

The experiment was described in a paper presented at the 1998 AAPOR Conference authored by Daniel Merkle, Murray Edelman, Kathy Dykeman and Chris Brogan. It involved two experimental conditions: In one test, interviewers used a colorful folder over their pad of questionnaires that featured "color logos of the national media organizations" and "the words 'survey of voters,' 'short' and 'confidential.'" On the back of the folder were more instructions on how to handle either those who hesitated or refused. The idea was to "better standardize the interviewer's approach and to stress a few key factors" to both the interviewer and the respondent intended to lead to better compliance.

In a second test, interviewers used the folder and offered a pen featuring logos of the sponsoring news organizations. A third "control" condition used the traditional VNS interviewing technique without any use of a special folder or pen.

There was no difference between the folder and folder/pen conditions so the two groups were combined in the analysis. The results showed that both the folder and folder/pen conditions slightly increased response rates but also introduced more error toward the Democratic candidate as compared to the control group. Since there was no difference between the folder/pen and folder conditions, it was the folder condition, not the pen, that appeared to influence response rates and error.

The authors concluded in their paper:

The reason for the overstatement of the Democratic voters in the Folder Conditions is not entirely clear and needs to be investigated further. Clearly some message was communicated in the Folder Conditions that led to proportionately fewer Republicans filling it out. One hypothesis is that the highlighted color logos of the national news organizations on the folder were perceived negatively by Republicans and positively by Democrats, leading to differential nonresponse between the groups.

Murray Edelman, one of the authors, emailed me with the following comment:

The reference to this study at the 2004 AAPOR conference by both Bob Groves and Kathy Frankovic in their respective plenaries has inspired us to revise our write up of this study for possible publication in POQ and to consider other factors that could explain some of the differences between the two conditions, such as the effort to standardize the interviewing technique and persuade reluctant respondents and the emphasis on the questionnaire being "short" and "confidential." However, we agree that the main conclusion, that efforts to increase response rates may also increase survey error, is not in question.

1. The University of Chicago's NORC = National Opinion Research Center, which can be verified by going to the website you linked to and checking under "About NORC." Interestingly, though, toward the bottom of the essay on the NORC page, it does say, "NORC is a national organization for research and computing..." (which appears to just be an unfortunate coincidence of acronyms).

2. I'm anxious to hear about the session you organized on the role of the blogosphere in 2004 election polling. I don't think you've written about that yet, have you?

Posted by: Alan R. | May 16, 2005 11:18:52 PM

For reasons that I hope will be made clear soon, I am retracting my pen/pencil comment above. It turns out that Democrats like folders with media logos on them more than Republicans. The pens didn't make much of a difference.

a) Constant "bias" ("Liddle's Index" = K/B in USCV notation, or Percent Kerry voter response/percent Bush voter response) should generate a "U" shaped curve not a flat curve, as shown on p. 5-7 and derived in the Appendices to our latest study at: http://www.uscountvotes.org. If Mitofsky's data show no "U" shape but a flat line, he's just proved our point - these data cannot be the result of a constant mean uniform response "bias".

b) However, many different non-linear patterns may produce no correlation, so one cannot tell from a "blob", or a linear correlation analysis, what's really going on with these data. E-M produced aggregate mean and median WPE's (see Table 1 p. 17 of our report) which clearly show an inverted "u" shape to mean and median WPE across partisanship categories.

c) Regarding the "Liddle Bias Analysis" chart, a similar point. The fact that a linear correlation gives almost no, or slight positive correlation, does not prove mean uniform bias. It may just prove that you can run a more or less flat line through an "inverted u" shaped "Alpha" curve. E-M aggregate data clearly show that Alpha for "representative" mean and median precincts in each partisanship category has a strong inverted "U" shape (see Alpha columns on p. 17 of our report).
Inverted "U" Alpha is NOT constant mean Alpha but rather has to be explained. Scatter plots and linear correlations have to be consistent with the E-M aggregate tabulations presented in their report.
You can't change the basic pattern of the data by running a line through it in a scatter plot!

d) My basic point again, and this is statistics ABC. E-M have the data to do a serious multi-factor regression analysis that would support or disqualify their uniform response bias hypothesis. They have not done this, or at least not released such an analysis, or released the data to the public so that others can do it (multiple "independent" analyses would obviously be better) except, in a slightly aggregated form, for Ohio to Scheuren. This suggests that they could, at the very least immediately, release ALL of their data in this form to everyone.

e) The data that they have released indicates that constant mean response bias cannot explain the exit poll discrepancy. This is the point of the simulations which show either what the data would have to look like if it was produced by a constant mean response bias ("U" shaped WPE and declining overall response rates), or whether such a bias can produce the aggregate results reported by E-M (it can't). Tabulations, and blobs with linear correlations do not disprove these mathematical results.

f) E-M need to stop playing blob games and engage in a serious statistical analysis to find the causes of the discrepancy and let others do so as well. It is highly irresponsible to withhold data of this importance and on top of this not to do a serious analysis (or release it to the public) of information that may be critical to reestablishing trust in, and/or radically reforming our electoral system.

g) The media (including Mystery Pollster) need to stop going along with this charade and start demanding real analysis and open access to the data. They could start by widely publicizing the UScountvotes report and all of the other evidence that has been accumulating of election "irregularities" and demanding a thorough investigation.
Uncritical support for an unsubstantiated “reluctant responder hypothesis” undermines our ability to get to the bottom of this.

Best,

Ron

Posted by: Ron Baiman | May 17, 2005 1:26:02 PM

Ron:

I think you forgot to take the log. Your range is from .5 to 2. Take a natural log, (or any log) and the U curve should vanish (an alpha of .5 means two Republicans for every Democrat; an alpha of 2 means two Democrats for every Republican - that's why you have to take the log of alpha).

Interestingly, in that plot, you also model the increased variance at the extremes, also seen in the E-M scatter. I think this is due to the fact that when the numbers of any one group of voters is small, "bias" fluctuates much more wildly. At the extreme, if there is only 1 Democrat, you will either get 100% response rate or 0%.

So your plot (yellow curve) of constant mean bias, once the log function has flattened it, will look remarkably like the E-M data set. It would look even more like it if you increased the variance.

Oh, and I agree about the regressions. It is a point I have been making for a while.

Posted by: Febble | May 17, 2005 1:49:04 PM
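The two points in the comment above, the symmetry of the log and the larger fluctuations where one group of voters is small, can be sketched numerically. This is a minimal sketch under stated assumptions: 100 sampled voters per precinct is hypothetical, the 0.56 and 0.50 response rates are the figures quoted elsewhere in this thread for the E-M hypothesis, and the function names are invented for illustration.

```python
import math
import random
import statistics

# Symmetry: alpha = 0.5 and alpha = 2 are equal and opposite biases,
# yet their plain average is not the neutral value 1.
alphas = [0.5, 2.0]
mean_alpha = statistics.mean(alphas)
mean_ln_alpha = statistics.mean([math.log(a) for a in alphas])
print(mean_alpha, mean_ln_alpha)

def simulated_ln_alpha(kerry_share, rng, n_voters=100, rate_k=0.56, rate_b=0.50):
    """One simulated precinct: each sampled voter responds independently
    with the response rate assumed for that voter's candidate."""
    n_k = round(kerry_share * n_voters)
    n_b = n_voters - n_k
    resp_k = sum(rng.random() < rate_k for _ in range(n_k))
    resp_b = sum(rng.random() < rate_b for _ in range(n_b))
    if resp_k == 0 or resp_b == 0:
        return None  # the log ratio is undefined when a group has no respondents
    return math.log((resp_k / n_k) / (resp_b / n_b))

def spread(kerry_share, trials=2000, seed=0):
    """Standard deviation of ln(alpha) over many simulated precincts."""
    rng = random.Random(seed)
    values = [simulated_ln_alpha(kerry_share, rng) for _ in range(trials)]
    return statistics.stdev([v for v in values if v is not None])

print("spread at 50% Kerry:", round(spread(0.50), 3))
print("spread at  5% Kerry:", round(spread(0.05), 3))
```

With the same mean response rates everywhere, the simulated ln(alpha) scatters much more widely in the lopsided precinct, where only a handful of Kerry voters are sampled.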

Febble, your maturity in dealing with Ron is remarkable and I can only hope to emulate you when I grow up. ;-)

Ron, to your point: "This suggests that they could, at the very least immediately, release ALL of their data in this form to everyone."

Why don't you ask Warren or Fritz how much effort went into producing that dataset for ESI? Ohio represents only a fraction of the precincts in the dataset. Are you proposing to pay for E-M to do this for the entire dataset?

Also, how do you expect to get the NEP partners to go along with your plans? They paid big bucks for that data. If the exit polls didn't show patterns of fraud in Ohio, where there is other evidence of serious irregularities, what exactly are you expecting to find? Sounds like a post-hoc fishing expedition to me and they probably see right through you.

I've been right there with you asking for the multiple regressions. In fact, I told Warren Saturday night that I thought that many AAPOR members want to see that regression model and probably agreed with your point on the floor following his session.

A very influential member of AAPOR (former past president) was standing there when I raised this point and nodded his head in agreement. Warren told us to just be patient. I take that to mean that he understands the desire for more analysis and that he'll try to get more released; but, in the end, it's not his call.

Remember, Ron, the Merkle and Edelman study Kathy Frankovic mentioned in her talk was done in or around 2000. That analysis was based on the 1992 and 1996 elections. Not exactly quick as lightning...

Ron, you may not have seen the questions I posted in the other forum, so I will ask here:

1. Are there members of USCV that signed the first paper that you put out that did not sign the second paper because they didn't agree with it?

2. Are there members about to come out with papers of their own rebutting yours?

3. What is your opinion of the witch hunt that your colleague Kathy Dopp has started against Elizabeth on the DU forum? Do you consider that professional behavior on Dopp's part?

Posted by: Sawyer | May 17, 2005 2:18:55 PM

Dear Febble,

Taking a Log will convert "U" shaped Alpha into upward sloped Alpha. The point would be the same, Alpha (however calculated) is not flat. The Log helps make the index more symmetrical but does nothing to support linearity.

Based on Table 1 p. 18 of our report, natural logs of Alpha from means would be (going from high Kerry to high Bush precincts):

-0.0166
0.1448
0.1704
0.1414
0.4626

From Medians:

0.019
0.137
0.168
0.141
0.438

As you can see this is a strong upward slope (a change of over 2894% for means and over 236% for medians!) that just reinforces the point that constant Alpha cannot produce these data.

Best,

Ron

Posted by: Ron Baiman | May 17, 2005 2:29:25 PM

Ron, are you assuming anything about the accuracy of the response rates in the January report? That is, have you considered that for some precincts, where interviewers deviated from the prescribed sampling interval, or did not follow the instructions for recording refusals and misses, the response rates could be artificially high while resulting in additional bias?

If you read my paper, Ron, you will note that applying the function to the aggregated category means (or even medians if they are derived from a range of vote-count margins) does not get rid of the artefact. It needs to be applied at precinct level.

It is your U shaped plot from the simulator I am referring to (page 4 of the version of your paper downloaded 17th May 2005). If you plot log(alpha) instead of alpha for your simulated data points for constant mean bias, I think you will find that the yellow curve is flattened.

Posted by: Febble | May 17, 2005 2:39:11 PM
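The claim in the comment above, that the function has to be applied at precinct level rather than to category means, can be checked with a small sketch. The three precinct vote shares and the constant alpha of 1.12 are hypothetical values chosen for illustration, and the helper names are invented; nothing here comes from the E-M data.

```python
import math
import statistics

def polled_share(kerry_share, alpha):
    """Kerry's share among respondents under a response-rate ratio alpha."""
    return alpha * kerry_share / (alpha * kerry_share + (1.0 - kerry_share))

def ln_alpha(kerry_share, poll_share):
    """Log response-rate ratio implied by counted and polled Kerry shares."""
    odds_poll = poll_share / (1.0 - poll_share)
    odds_count = kerry_share / (1.0 - kerry_share)
    return math.log(odds_poll / odds_count)

ALPHA = 1.12                       # hypothetical constant bias
precincts = [0.80, 0.90, 0.99]     # hypothetical "high Kerry" category
polls = [polled_share(k, ALPHA) for k in precincts]

# Precinct level: apply the function to each precinct, then average.
precinct_level = statistics.mean(
    [ln_alpha(k, p) for k, p in zip(precincts, polls)]
)

# Aggregate level: average the shares first, then apply the function once.
aggregate_level = ln_alpha(statistics.mean(precincts), statistics.mean(polls))

print("true ln(alpha):     ", round(math.log(ALPHA), 4))
print("precinct-level mean:", round(precinct_level, 4))
print("from category means:", round(aggregate_level, 4))
```

In this sketch the precinct-level average recovers the constant bias, while the value computed from the category means does not, even though every precinct was generated with the same alpha.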

Ron, way back when (April 6th I believe), Febble agreed with you. She applied her function to the aggregate means and medians and said it made your case stronger. Then I e-mailed her for her formula and spreadsheets and we began corresponding.

MP and I suggested that her function would behave differently when applied to the precinct level data. She modeled it at the precinct level and - voila - it did behave differently.

Now we have it applied to the precinct level data (thanks Warren!). Whereas there was a slope before, there is no slope no longer. The "blob" tells us everything.

Rick, Mark, and most of all Febble -- I'm immensely pleased to see how this body of work has matured, and greatly regret that I haven't had time to keep up and catch up on the play-by-play.

Posted by: RonK, Seattle | May 17, 2005 3:03:05 PM

Mark, as you know, I registered my gripe with the AAPOR community in an e-mail to you that the 1998 Merkle et al study suggested further research on this issue, but 7 years later, there hasn't been any follow up.

I wonder if they'll revisit the other Merkle and Edelman paper Kathy mentioned in her talk? Perhaps they could include analysis of 2000 and 2004, that is, if they asked the right questions in the post-election survey of their interviewers.

As we start with randomized Kerry voter and randomized Bush voter response rates and not with Alpha, there is no need to take Logs as there would be if we started with K/B=Alpha as you do. (Thanks for letting me know who you are!) We are treating Kerry and Bush voter response rates symmetrically as normally distributed around mean .56 and .5 response rates (in the E-M hypothesis simulation reported on in Appendix H in our study).

Alpha, w, E=WPE, R, etc. are then simulated from the E-M response rate hypothesis. The maximum and minimum calculations show that no precinct level distribution can produce the E-M results.

Ron

Posted by: Ron Baiman | May 17, 2005 3:50:18 PM

Ron:

If you are going to compare the plot in Mitofsky's talk with your plot of "constant mean bias" you need to take the log of alpha first, because that is what Mitofsky plotted. It doesn't make any sense to plot alpha. Alpha is the numerator of ratio. You need to take a log of it to get a measure that is symmetrical around zero (alpha = 1). Otherwise the distribution will be skewed and you will get a U, as you did. It needs ironing.

Cheers,

Lizzie (Febble for blogs, as in Febble's Fancy Function, as christened by RonK)

Posted by: Febble | May 17, 2005 4:53:32 PM

I may be dense, but why are we talking about simulations anymore? We have the actual plots of the precinct level data with a regression line, slope, and F. The line has been ironed flat, thanks to febble's fancy function.

Well, the nice thing, Rick, about the simulation on page 4 (at present) of Ron's paper is that it does simulate "constant mean bias". If Ron or Kathy would apply a log to their alphas, and iron the curve, we'd have a neat demonstration (with slightly limited variance) of how the simulation matches the E-M data.

The legend does say ln(alpha) but it doesn't look as though that's what they've plotted. I'm not sure why Ron thinks they didn't have to.

Posted by: Febble | May 17, 2005 5:21:27 PM

(Apologies to all ordinary humans for the obscurity of the following text...)

The first page of the simulator does have a yellow plot of LN(alpha), which may appear to be slightly curved when a quadratic is force-fitted. Actually, at least in my runs, that distribution is flat. No significant quadratic term, no significant linear term. So, in the Dopp simulation of constant bias, the ln alpha transformation flattens out the big U that is visible in the mean WPEs (and removes the overall positive slope that is clear if one adds a linear trendline). Q.E.D.

As for the rest, Ron, does it trouble you that actually applying the ln alpha transform to the actual precinct data yields a flat ln alpha with variance? or are you content to insist that an Excel spreadsheet and a crude five-row table prove that this must not be so? What can it possibly mean to say that "no precinct level distribution can produce the E-M results"? (I'm restating Rick's question -- it took a while for me to check my numbers above.)

Posted by: Mark Lindeman | May 17, 2005 5:27:52 PM

"Flattening" the curve by taking a Log doesn't change the hypothesis or the data in any principled way. The "real" response rates are the unlogged ones and they are demonstrably "U" shaped under a constant mean bias hypothesis.

Mark, if you looked at the Dopp simulator you would realize that it calculates over 10,000 precincts at a time and is based on 1% percentile groupings of precincts.

Finally, all this is beside the point. Look at minimum and maximum calculations. The E-M hypothesis cannot produce the E-M results - this is the bottom line. If you want to Log everything to make the curves look flatter - fine. This won't change the outcome.

Best,

Ron

Posted by: Ron Baiman | May 17, 2005 5:41:22 PM

"Actually, at least in my runs, that distribution is flat."

Mark L., are you telling us that you have a simulation that reproduces the E-M data? Aren't you one of the USCV subscribers? If you can do it, why can't Ron and Kathy? If you did it, why didn't they at least acknowledge that you have reproduced the E-M data? If they don't agree with your modeling, why don't they demonstrate why it is flawed?

Lizzie, if you recall, many moons ago, I suggested that you model Ron's w at the precinct level because I thought he was holding the answer to the question, but didn't yet know it. Little did I know that he would eventually model it at the precinct level, but not do it right, so he has the answer to his question, but still doesn't know it yet.

Maybe we're getting closer to reaching him.

Mark L., you have a PhD, you gonna submit a formal critique of the USCV work? Maybe they'll put it on their website. Oh, wait. Maybe not. Lizzie's critique doesn't seem to be there yet, why would they acknowledge yours?
http://www.uscountvotes.org/index.php?option=com_content&task=category&sectionid=4&id=98&Itemid=43

No decency to acknowledge legitimate criticism; especially when coming from former and current USCV contributors.

Ron - my eyes have been deceiving me. I had missed the decimal point. Apologies. Perhaps that really is a plot of log alpha. However, if you are really getting a curve from log alpha (and now I'm confused, because you said it wasn't necessary....) then it's not the same as my simulation, because both the algebra and the simulator tell you that the line for a constant mean bias will be flat. Because alpha IS bias. If you are getting a curve, either something hasn't been logged or the distributions aren't Gaussian.

Lizzie

Posted by: Febble | May 17, 2005 6:06:12 PM

Rick, I try not to mention that... credential in polite company such as this. Something about ad hominem argumentation. ;)

I think anyone can download USCV's simulator, unless they have some sort of IP filtering in place. Go to
http://uscountvotes.org/ucvAnalysis/US/exit-polls/
where the latest version of the working paper should be; drill down in simulators/dopp/ to get the latest spreadsheet. This is a work in progress (not my work!), so what you see may differ from what I've described.

(I have no idea why Ron thinks I fail to realize that the Dopp simulator models over 10,000 precincts at a time, or why this matters.)

Lizzie, I can think of two possible interpretations of Ron's curve from log alpha. One is the slight curve often visible in that yellow line -- after all, it _is_ a quadratic by construction. Second, Ron may be referring to logging the mean and median WPEs from the E/M table, and we've been down that road before.

Posted by: Mark Lindeman | May 17, 2005 7:22:27 PM

Mark, I only mentioned the credential because Lizzie's lack of a credential is the only reason I can come up with as to why they have not put a link to her work on their website. Seems more like argumentum ad verecundiam than argumentum ad hominem to me. ;-)
