In virtually every newspaper, current events TV show, or political web page, we see them: polls, all purporting to give us the inside scoop about who and what is going on, and how we should think about it. It helps, I think, to have an idea of just what we are working with: its nature, its value, and of course, its balance or bias.

According to Houghton Mifflin (which publishes school textbooks), one of the earliest “soundings of opinion on record” comes from the Harrisburg Pennsylvanian and the Raleigh Star, which reported favorites in certain regions during the 1824 Presidential race, also the first race in which the general public was allowed to vote.

By the turn of the century many newspapers were conducting polls to determine political preferences. Later polls were conducted by magazines; the first among them were the Farm Journal (1912) and the Literary Digest (1916). Early polls were anything but scientific, dependent on canvassers asking people at random, or on “straw ballots” printed in newspapers, which readers mailed in. Sampling, a method which takes a small percentage of a group and then projects the whole group’s opinion from it, began in the mid-1930s, especially with the 1936 campaign between President Franklin Roosevelt and Republican Alf Landon. Literary Digest magazine conducted a poll with over 10 million ballots, indicating that Landon would win easily. History and the actual election decided otherwise, and interest in polls dropped off after the embarrassing blunder.

However, a new method of polling was introduced for the same election by the Gallup Organization, which accurately predicted FDR’s re-election, and with a respectably close margin (Gallup predicted FDR would take 56% of the vote; he actually took 62.5%). But it took a while for Gallup’s success to be recognized as an example of a new wave of methodology; Gallup had to compete with the Fortune survey (begun in 1936) conducted by Elmo Roper, and the Crossley Poll (also begun in 1936). To make matters murkier, pretty much everyone, Gallup included, predicted President Truman would lose to challenger Dewey in 1948. There is a famous photograph of Truman holding up a copy of the Chicago Daily Tribune with the headline, “Dewey Defeats Truman”. The lesson to prognosticators is as valid now as it was then but, sadly, is just as ignored.

Sampling is the key to polling, and it comes in many varieties, which is one reason why polls often disagree with one another. Sampling techniques may be random, stratified, or purposive, or a combination of any of these. The information may be elicited by personal interview, telephone interview, or mail questionnaire, and the polling is completed only after the data have been tabulated and evaluated. Once these responses have been assembled, the pollster then puts together his model. Which brings us to demographics, often the least-discussed element of the pollster’s method.

Before I go into that, I should like to observe that pollsters generally fall into three types:
Polling Agencies (like Gallup, Harris, Zogby, Pew Research, and Rasmussen Reports);
Media Groups (like ABC, CBS, NBC, CNN, The New York Times, The Wall Street Journal, The Washington Post, The LA Times, Fox News, Time magazine, and Newsweek magazine); and
Universities (such as Quinnipiac, University of Connecticut, Marist, Arizona State University, the University of Michigan and the University of Massachusetts).
What makes that messier is the cooperation between many of these groups. For instance, a March 2004 poll of Iraqis about the War on Terror, the US-led occupation, and conditions now that Saddam has been removed, was released simultaneously by ABC News, the German network ARD, the BBC, and NHK in Japan, with identical responses. This happened because the actual polling was done by Oxford Research International of Oxford, England, which licensed the media to use its results under their own names. Cooperation between media, who have money and name recognition, and universities, who can provide the fieldwork quickly and within designed parameters, is not only common, but has proven a useful way to meet the growing public appetite for new polls.

Going back, then, to demographics. There are, actually, rules for this sort of thing, though they are hardly observed by all pollsters. The two most respected groups for determining a valid methodology for opinion polling in the United States are the American Association for Public Opinion Research (AAPOR), and the National Council on Public Polls (NCPP). Both organizations claim to set standards for polling agencies, but it is noteworthy that they tend only to comment on whether the conclusions from polls are comparable to each other, not whether the agencies are accurate, biased, or whether the underlying methodology may be flawed. Caveat Emptor, indeed!

The AAPOR tends to showcase a number of polls (their homepage advertises the latest Pew Research poll, for example), while the NCPP appears to act much more as a watchdog. The NCPP, for example, includes a section entitled “20 Questions A Journalist Should Ask About A Poll”. These include such important questions as
“1. Who did the poll?
2. Who paid for the poll and why was it done?
7. Who should have been interviewed and was not?
18. So I’ve asked all the questions. The answers sound good. The poll is correct, right?
20. Is this poll worth reporting?”

What makes these questions particularly salient is that, on the website, they are links to discussions about the issues behind the questions. Sadly, from reading through these questions, it’s rather apparent that most journalists have never read, let alone asked, these important questions. But at least it’s a start.

I thought about going into detail about how the polls reach their sample. But besides the fact that many of the polling agencies now refuse to divulge demographic details in their polls, it can be distracting to get too deep in the numbers, and so miss the actual works of the machine.

To begin with, a pollster wants to know whether he can use the information the respondent might provide. So, he asks whether the respondent is registered to vote, and whether the respondent is likely to vote. The “likely voter” is the prized respondent, but no pollster can truly be sure that the “likely voter” is what he claims. Gallup has discussed how they determine if a respondent is a “likely voter”, but in general the only virtue a “likely voter” shows is that he has claimed he plans to be one. Just look at the number of respondents in a poll, and compare those who claim to be registered voters or likely voters to the known demographics. The 2000 Election actually was pretty close to the traditional turnout; around 70% of the adult population in the U.S. is registered to vote, and something like 60% of the registered voters showed up to actually vote. So, when a poll claims to be representative of the actual population, if around a thousand people are polled (the usual number), around 700 should be registered to vote, and only 420 or so should say they are likely to vote. Just keep that in mind the next time you see the few details the pollster tosses out at the bottom of the poll.
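
For anyone who wants to run that check against a poll's fine print, here is a rough sketch of the arithmetic in Python. The function name and its defaults are my own invention for illustration; the 70% and 60% figures are the approximate rates quoted above, not official numbers.

```python
def expected_counts(sample_size, registered_rate=0.70, turnout_rate=0.60):
    """Return (registered, likely) head counts we would expect in a
    representative sample. The default rates are the rough figures
    quoted in the text, used here as assumptions."""
    # Share of all adults in the sample who should be registered.
    registered = round(sample_size * registered_rate)
    # Share of those registered respondents who should actually vote.
    likely = round(registered * turnout_rate)
    return registered, likely


registered, likely = expected_counts(1000)
print(f"Of 1000 adults: about {registered} registered, about {likely} likely voters")
```

If a poll of a thousand adults reports far more likely voters than this, either the screen for likely voters is loose or the sample is not the cross-section it claims to be.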

The next point is to consider where the sampling results lead. It’s not hard to say what 500 people think, if all 500 are cited in the poll. But what happens if you interview 500 people, and only 200 of them are women? Most polls want a rough 50-50 balance between men and women. So, they “weight” the results to match what they consider the appropriate demographics, which in this example would mean adding an additional 25% value to the response of any woman polled, and discounting the male responses by 17%. Statistically, this sounds great, but it means that the pollster has in effect eliminated 50 valid responses because they were not what he wanted, and created 50 artificial responses in order to achieve a desired result. Does this sound reasonable?
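
The weighting arithmetic in that example can be sketched in a few lines of Python. This is only an illustration of the general idea, not any particular pollster's procedure; the function name is hypothetical, and the support figures (60% of women and 40% of men favoring a candidate) are made up purely to show how weighting moves the headline number.

```python
def demographic_weights(sample_counts, target_shares):
    """Weight for each group = target head count / actual head count."""
    total = sum(sample_counts.values())
    return {group: target_shares[group] * total / count
            for group, count in sample_counts.items()}


# The example from the text: 500 interviews, only 200 of them women.
sample_counts = {"women": 200, "men": 300}
target_shares = {"women": 0.5, "men": 0.5}
weights = demographic_weights(sample_counts, target_shares)
# women: 250 / 200 = 1.25 (an extra 25%); men: 250 / 300 = 0.83 (about 17% off)

# Hypothetical responses: how many in each group favor some candidate.
supporters = {"women": 120, "men": 120}  # 60% of women, 40% of men (made up)

total = sum(sample_counts.values())
raw_support = sum(supporters.values()) / total
weighted_support = sum(supporters[g] * weights[g] for g in supporters) / total

print(f"weights: women {weights['women']:.2f}, men {weights['men']:.2f}")
print(f"raw support {raw_support:.1%}, weighted support {weighted_support:.1%}")
```

With these made-up numbers, the same 500 interviews yield 48% support before weighting and 50% after, which is the kind of shift worth remembering when two polls disagree by a couple of points.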

The AAPOR and NCPP both rely on the last U.S. Census for their demographics base, and they instruct their members to weight the responses to match those numbers. But 2000 was four years ago; there is no consensus for dealing with the fact that people can and do change their habits and statistics in nearly half a decade. This, too, must be considered when looking at the polls. They may have been taken just hours ago, but could still be years out of date.