Demystifying the Science and Art of Political Polling - By Mark Blumenthal

February 16, 2006

Belated Update: Gallup on Weighting by Party

After yesterday's post on Gallup's decision to start reporting two different types of rolling averages regarding the president's job approval rating, the folks at Gallup emailed to point out two things I had overlooked. First, they have been reporting the "smoothed" averages I discussed yesterday in their (subscriber only) "In Depth" presidential approval page since January. Second, back on January 16, Editor-in-Chief Frank Newport posted some similarly in-depth comments (free to all) about Gallup's policies on weighting by party identification. For those who follow the debate on party weighting - and I know you're out there - Newport's post is a must read. Here is a quick summary:

As Newport notes, his article largely summarizes a presentation he made at last year's AAPOR conference and elsewhere. MP saw much of this presentation about a year ago in Washington when Newport said he was in the midst of "zero-basing everything we know about party [identification] in an election year." His presentation raised many questions. Although he reiterates that Gallup continues a process of "reviewing, researching, and discussing our policies on this issue," this article includes the conclusions that Newport and Gallup reached in their deep dig on party ID. It also provides some context for their decision to report "smoothed" averages for presidential approval alongside the regular non-averaged results.

Most of the issues that Newport discusses should be familiar to those who have followed the party ID debate on MP. However, his summary is a good place to start for those who have not. The bottom line:

Gallup is not convinced that variation in party identification from poll to poll predominantly results from sampling error. In other words, Gallup is not convinced that party identification varies from sample to sample because the wrong percentages of "real" Republicans, independents, or Democrats are selected into the sample.

Instead, it seems at least equally likely that survey-to-survey variation in self-reported party identification is caused by two other factors: 1) measurement error and 2) real change in the population.

Newport argues that some small percentage of Americans may shift their answers on the party identification question in response to events, or what Newport calls "short-term environmental stimuli." These can be events in the news or questions asked in the middle of a survey. As such, Newport endorses the theory that asking the party ID question near the end of each interview increases the potential for either short-term change or "measurement error." He cites the AP/IPSOS study presented at AAPOR (and discussed by MP) last year. The Gallup organization thus concludes that adjusting or weighting a sample of voters based on their answers to the party ID question introduces the possibility of making those samples less representative, not more so.

Newport's statement also responds indirectly to the calls to weight by party using a "smoothed" or rolling average of recent results. The most prominent advocate of this approach is Professor Alan Abramowitz. He spelled out his proposal for "Dynamic Weighting" in a paper posted back in September by the Cook Political Report (see also Ruy Teixeira's commentary). Newport's response also provides the context for Gallup's decision to report smoothed averages of presidential approval:

Attempting to weight an entire sample based on a smoothed estimate of PID involves the impossible challenge of trying to isolate some proportion of the change in PID that results from sampling error and not "real" population change or simple measurement error. While weighting to a smoothed estimate could, in theory, help eliminate sampling error for a particular sample, it is not possible to know to what degree this is being done. Weighting to a smoothed estimate can also create more bias in a sample by changing that sample's overall composition in a way that a) incorrectly alters what is an estimate of a real change in the population or b) incorrectly alters an entire sample based on measurement error involved in one variable at the end of the survey questionnaire -- error that did not affect the measurement of variables included nearer the beginning of the questionnaire.

This is not to say that it is inappropriate to smooth the reporting of individual variables. Analysts may want to report a rolling average or other smoothed procedure in order to provide a longer-term perspective on the trends of a specific measure of interest. In other words, even if one assumes that survey-to-survey variation reflects real-world population change, one may want to look at data trends from a broader perspective. This effort to produce a smoothed estimate can be done for any given variable, including party identification.

This procedure, however, is best conducted on variable-to-variable basis. The rationale for smoothing one variable (for example, party identification) and then weighting all other variables in a dataset to that smoothed average is less defensible for the reasons enumerated.
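For readers who want to see the mechanics behind this objection, here is a minimal Python sketch of weighting a sample to a party ID target. Every number in it is invented for illustration: the sample counts, the approval rates within each party, and the "smoothed" target shares are assumptions, not Gallup data, and the function names are hypothetical. It is not a description of how Gallup or anyone else actually weights.

```python
from collections import Counter

def build_sample(cells):
    """Expand {party: (n, n_approve)} into a list of (party, approves) pairs."""
    respondents = []
    for party, (n, n_approve) in cells.items():
        respondents += [(party, True)] * n_approve
        respondents += [(party, False)] * (n - n_approve)
    return respondents

def party_weights(respondents, target_shares):
    """One weight per party: target share divided by the share in the sample."""
    n = len(respondents)
    counts = Counter(party for party, _ in respondents)
    return {party: target_shares[party] / (counts[party] / n) for party in counts}

def approval(respondents, weights=None):
    """Approval share, optionally weighted by each respondent's party weight."""
    if weights is None:
        weights = {party: 1.0 for party, _ in respondents}
    total = sum(weights[party] for party, _ in respondents)
    approving = sum(weights[party] for party, approves in respondents if approves)
    return approving / total

# Hypothetical sample of 1,000: the counts and approval rates are made up.
sample = build_sample({"R": (380, 323), "D": (320, 32), "I": (300, 105)})

# Hypothetical "smoothed" party ID target, e.g. a rolling average of past polls.
target = {"R": 0.33, "D": 0.35, "I": 0.32}

unweighted = approval(sample)
weighted = approval(sample, party_weights(sample, target))
print(f"unweighted approval: {unweighted:.4f}")
print(f"weighted approval:   {weighted:.4f}")
```

In this made-up example, pulling the sample from 38% Republican down to the 33% target moves approval from 46% to about 43%. If the extra Republicans were a sampling fluke, that is a correction. If they reflect real change or a quirk of the party ID question, the same adjustment distorts every other number in the survey, which is the risk Newport describes.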

Newport's response will not end this debate, and MP will continue to follow and comment on it. The decision of whether to weight on party ID or not is not a simple one, not easily resolved by a one-size-fits-all rule.

However, there is one point that nearly everyone agrees on: Every survey includes some "component" of random error which complicates our ability to see real changes in any one survey. While smoothing and rolling averages help reduce random error, they cannot eliminate it entirely. Some argue that routines like "Samplemiser" sometimes smooth out real change - ditto for weighting by party. Either way, there is no perfect, fool-proof way to remove the uncertainty that comes with sampling error. The best rule is to look at as much data as we can -- including "smoothed" averages -- and try to avoid reading too much into minor fluctuations between two surveys.
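To make the point about random error concrete, here is a rough Python simulation. It assumes a true approval rating that never changes (40%, an arbitrary choice), simple random samples of 1,000, and a three-poll rolling average; the function names and all parameters are hypothetical and are not drawn from Gallup's procedures or from Samplemiser.

```python
import math
import random
import statistics

def standard_error(share, sample_size):
    """Textbook standard error of a proportion from a simple random sample."""
    return math.sqrt(share * (1 - share) / sample_size)

def simulate_polls(true_share, sample_size, n_polls, seed=0):
    """Simulate poll results when the true share is constant across polls."""
    rng = random.Random(seed)
    return [
        sum(rng.random() < true_share for _ in range(sample_size)) / sample_size
        for _ in range(n_polls)
    ]

def rolling_average(values, window):
    """Average each value with the (window - 1) values that precede it."""
    return [
        sum(values[i - window + 1 : i + 1]) / window
        for i in range(window - 1, len(values))
    ]

# Assumed setup: opinion is fixed at 40%, so every wiggle is sampling noise.
polls = simulate_polls(true_share=0.40, sample_size=1000, n_polls=60)
smoothed = rolling_average(polls, window=3)

raw_spread = statistics.pstdev(polls)
smoothed_spread = statistics.pstdev(smoothed)
print(f"theoretical standard error: {standard_error(0.40, 1000):.4f}")
print(f"spread of single polls:     {raw_spread:.4f}")
print(f"spread of 3-poll averages:  {smoothed_spread:.4f}")
```

The averaged series bounces around less than the individual polls, but it still bounces even though nothing real changed in this simulated world. The same averaging would also blur and delay a genuine shift in opinion, which is the trade-off behind the complaint that smoothing can wash out real change.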

Comments

"...Gallup is not convinced that party identification varies from sample to sample because the wrong percentages of 'real' Republicans, independents, or Democrats are selected into the sample."

"Instead, it seems at least equally likely that survey-to-survey variation in self-reported party identification is caused by [factors including] ...real change in the population."

As some of you will recall, Gallup put out a poll leading up to the 2004 election with a sample that contained 12% more Republicans than Democrats. This, despite the fact that for the last several years, estimates of partisan composition of the electorate have consistently shown either equal proportions Democrat and Republican, or slightly more Democrats.

In 2000, the CNN/USA Today/Gallup daily tracking poll exhibited its legendary roller-coaster pattern of wild swings, including one installment that had Bush up 13 over Gore in late October, which was totally at variance with other outfits' concurrent polls.

If Newport and Gallup want to continue their denial, so be it.

I invite people to look at an essay I wrote in 2004 on the issue of weighting by party ID, including a postscript on the 2004 election (which appears above the original essay).

http://www.hs.ttu.edu/hdfs3390/weighting.htm

Posted by: Alan Reifman | Feb 17, 2006 12:22:07 AM

This still leaves out the biggest problem with party ID, namely we don't have a way to verify what we are polling. Census reports give us a way to compare what we are polling to various demographics like gender, race, location, economics, etc.

The closest thing we get to party ID verification is in the presidential popular vote percentages. Comparing the polled party ID to the presidential popular vote over the last 14 elections shows something strikingly at odds with the polled data. Polling gives the democrats an advantage in every year, averaging 7% greater than republican party identification. Looking at the presidential popular vote, you see 9 out of those 14 elections going to the republicans with an average republican advantage of 3%. So either the polls are measuring people who call themselves democrats but consistently vote republican or they aren't polling republicans who are voting.

If you look at the presidential popular vote percentages, you see that republicans always pull a higher vote percentage than the polled party ID, with the average vote percentage being 1.9x the polled party ID percentage. The democrats' vote percentage was only 1.15x their polled party percentage. The republican vote percentages were higher than the Strong, Weak and Independent Leaning Republican polled percentages in all 14 election years. In fact, in 9 out of the 14 elections, the republican popular vote percentage was greater than Strong, Weak, Independent Leaning republican and independent independents.

So debating about how you smooth what you are measuring or if the measurement is being influenced by real world or poll introduced variations is putting the cart before the horse. It doesn't look like the polls are even coming close to identifying the real party identification, at least as far as comparing it to one of the few real world data points we have, namely the presidential popular vote percentages.

Maybe a better question would be to ask who they voted for in the last election and weight based on that.

Posted by: yatanotherjohn | Feb 17, 2006 1:24:43 PM

I would say two things in response to the previous post:

1. I think it's well-accepted that self-identified Democrats (particularly in the South) vote Republican in presidential elections, more than the reverse. Thus, the discrepancy between party ID and presidential vote, alluded to above, is not surprising.

2. Zogby, who has pioneered the practice of weighting on party ID, has done extremely well in forecasting the national numbers in the last three presidential elections:

The ANES polls that I referred to went back to 1952, well before the 1968 election that started the swing of the south to the republicans. But if that is true, that there is a sizable minority of democrats who identify with the democrats on party ID, but vote republican for president, then wouldn’t you want to be able to differentiate between those two kinds of democrats? If you look at 2004, 6% of republicans and 11% of democrats supported the opposing party candidate. The independents were evenly split. So it doesn’t look like it’s a big crossover vote. The ANES for the 2004 year shows a 5 point advantage for the democrats among those who initially identified with a party and a 10 point democrat advantage when you count in leaners. But the exit polls on CNN show an even 37% each for the republicans and dems. http://www.cnn.com/ELECTION/2004/pages/results/states/US/P/00/epolls.0.html The ANES doesn’t match up on the independents either, 39% ANES vs the exit poll of 26%. The data doesn’t seem to support your contention that the difference between the polling and election is DINOs.

As for Zogby, I suspect that his heart gets in the way of his head when it’s close. His May 2004 and October 2004 predictions of Kerry winning are prime examples. He wants the democrat to win and when it’s close he errs towards the democrats in his polling. I don’t know if 1996 was a fluke or if, because it wasn’t close, he didn’t let emotions get in the way. After 1996, he always seems to play Maxwell Smart and miss it by that much.

Posted by: yetanotherjohn | Feb 18, 2006 12:41:57 PM

In an ideal polling world, the Party identification polling would be done separately from the substantive survey.

Then we could see if weighting is appropriate.

It seems to me that some research on Party ID responses can be easily done.

Say you take 1,000 people in a (ha!) random sample and just ask them their Party ID over time, say once every two months for a year.

How this can be done without the respondents knowing what your real focus is I am not sure, because obviously they will be prone to be more consistent in their responses, but I think there are ways around that problem.

Posted by: Armando | Feb 19, 2006 4:21:20 PM

It's pretty late and I don't have the energy for a long post on this, but I have just never believed in the "real world change" explanation, especially considering the wild swings that often appear in Party ID from one Gallup poll to the next. Long term data from Harris, Pew and NES confirm that party ID is a slowly changing characteristic, not a fast one. According to these sources it is rare for there to be more than a 2% shift in the D-R gap in party ID from one year to the next. However, Gallup has consistently tried to convince us that shifts of eight points or more are common from week to week. That just doesn't jibe with the long term findings from omnibus NES, Harris and Pew surveys. If party ID really did swing that quickly and wildly, there should have been more variation in the long-term data from those organizations. But there just isn't, really, ever.
