Friday, November 07, 2008

++Addition2++The vote totals continue to grow slowly in some states, presumably due to write-ins and recounts. The table in the first addition has been updated to reflect the most current counts.

++Addition++Wikipedia's election page has some errors in voter totals (as reported three days after the election; it may have been subsequently edited yet again by the time of this reading). Kentucky's total was overstated by more than 500,000 votes, and a few states were presented as final results even though the totals displayed did not represent 100% of precincts reporting. As Kentucky is a white state, the table presented in the body of this post slightly inflates the white national total. I've kept my original table in the body (I'm only into revisionism when it serves a constructive purpose other than covering my behind), but also present this more accurate one constructed from state voter totals provided by CNN:

Race        Votes (millions)   State    Media
White            96.33         75.0%     74%
Black            16.42         12.8%     13%
Hispanic         10.24          8.0%      9%
Asian             2.58          2.0%      2%
Other             2.82          2.2%      3%

The shifts are very minor. The white percentage drops by 0.2 points and the black percentage by 0.2 points; the Hispanic percentage rises by 0.2 points, the Asian by 0.1, and the other by 0.1.

Hysterical pundits will announce that the Hispanic tidal wave accounted for 8 or 9 or even 10 percent of the vote!

Then, a year from now, the Census Bureau will quietly announce the results of its huge post-election survey of voting, the gold standard of ethnic voting shares. It will show that the Hispanic share of the vote, which was 5.4 percent in 2000 and 6.0 percent in 2004, actually was only 6.9 percent in 2008, or whatever.

It's not even necessary to wait that long to undercut the 9% figure now deemed the official national total by Edison Media Research and the major media outlets that partner with it.

An easy (albeit tedious) way to verify the national figure is simply to look at the exit polling data from each state. Multiplying each state's voting percentages by race by the total number of votes cast in that state yields the number of actual votes cast by race. The state numbers are rounded to the nearest percentage point, but there is no reason the rounding should systematically differ across states (a 7.54% reported as 8% here should be balanced out by a 12.46% reported as 12% there) to an extent that misrepresents the national total--and the national exit poll rounds to the nearest whole percentage point as well. Presumably EMR builds the national exit poll by aggregating the individual state exit polls, so rounding does not explain the abrupt post-election editing.
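The multiply-and-sum step can be sketched in a few lines. The three states and their figures below are made up purely for illustration; the actual computation runs over all fifty states plus D.C.:

```python
# Aggregate state-level exit poll shares into national vote totals by race.
# State names, turnouts, and shares here are hypothetical illustrations,
# NOT the actual 2008 figures.
states = {
    # state: (total votes cast, {race: exit-poll share})
    "State A": (5_000_000, {"White": 0.80, "Black": 0.10, "Hispanic": 0.06, "Asian": 0.02, "Other": 0.02}),
    "State B": (3_000_000, {"White": 0.60, "Black": 0.25, "Hispanic": 0.10, "Asian": 0.03, "Other": 0.02}),
    "State C": (2_000_000, {"White": 0.70, "Black": 0.05, "Hispanic": 0.18, "Asian": 0.05, "Other": 0.02}),
}

national_votes = {}
for total, shares in states.values():
    for race, share in shares.items():
        # votes cast by each race = state exit-poll share * state turnout
        national_votes[race] = national_votes.get(race, 0) + share * total

grand_total = sum(national_votes.values())
for race, votes in sorted(national_votes.items(), key=lambda kv: -kv[1]):
    print(f"{race:9s} {votes / 1e6:6.2f}M  {100 * votes / grand_total:5.1f}%")
```

Repeating this with each state's real turnout and exit poll percentages produces the "Votes" and "State" columns in the table below.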

What do the exit polls actually say when aggregated together to form a national exit poll? The table shows the total number of voters by race (in millions), what the states actually reveal in percentages of the total by race, and what the major media are now running with:

Race        Votes (millions)   State    Media
White            92.43         75.2%     74%
Black            15.92         13.0%     13%
Hispanic          9.53          7.8%      9%
Asian             2.34          1.9%      2%
Other             2.63          2.1%      3%

That's not carelessness on my part--the media total really does come to 101%, even though the exit polls at the state level come to exactly 100% (and match what CNN originally reported, with the exception of the "Other" category, which varied by one point). Funny how, to 'balance' things out, EMR seems to have arbitrarily taken 1.2% of the total respondent base that was white and converted it to Hispanic, er, "Latino", in the national exit poll.
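A quick check of the arithmetic, using the percentages from the table above:

```python
# Published national exit poll shares (Edison Media Research / major media)
media = {"White": 74, "Black": 13, "Hispanic": 9, "Asian": 2, "Other": 3}
# Shares obtained by aggregating the individual state exit polls
state = {"White": 75.2, "Black": 13.0, "Hispanic": 7.8, "Asian": 1.9, "Other": 2.1}

print(sum(media.values()))            # 101 -- the media figures overshoot
print(round(sum(state.values()), 1))  # 100.0 -- the state aggregate balances
```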

Even that 7.8% will likely prove to be an overestimate: smaller groups tend to be oversampled as a way of ensuring they're not undercounted, since it is more problematic for determining voter tendencies to undercount Asians or Hispanics than it is to undercount whites.

5 comments:

So, they boosted the Hispanic % from 8% to 9% right before our very eyes? I hadn't checked back.

Thanks for adding up the state numbers. It's a much bigger sample size, so it tends to be more accurate. I believe the bigger state sample is based on a shorter questionnaire, while the national sample comes from those who answer the longer questionnaire.

In general, exit polls aren't very good at figuring out turnout shares. They're not like an election, where the voters come to you. They have many of the much-discussed problems phone surveys face in getting a representative sample of voters, but their problems are even more severe, because only a tiny fraction of the country's polling places are covered. That means the exit polling company has to decide ahead of time where to send its pollsters, which in turn means it has to have an expectation of what turnout will be: "We expect Hispanics to cast 9% of the vote, so 9% of our pollsters need to be in Hispanic neighborhoods." Measuring turnout is basically a self-fulfilling prophecy.

The fundamental problem with exit polls is one that's common in the marketing research industry, where I worked for many years: monopoly. There's not enough room in these submarkets for two competitors to make a profit, so the industry generally settles out at one firm. The joke is that the market is big enough to support 1.5 competitors. (You'll notice that Nielsen doesn't have any competition for TV ratings and Arbitron doesn't have any competition for radio ratings. In the supermarket sales data business, there are two competitors, but they fought a decade-long price war that almost bankrupted both of them.)

There isn't enough money to be made in Presidential exit polls to support two competing exit poll companies, so there's usually a monopolist these days, and, without competition to spur it on, it usually does a bad job.

They misplaced all their data in the 2002 midterms and nobody published it until I did a year later. They predicted Kerry would win the 2004 election. According to Karl Rove in the WSJ, the raw data from the 2008 exit poll showed Obama winning by an absurd 18 points, which they arbitrarily dialed back to 10 points, and it ended up being about 6.

I don't know whether this wildly off exit poll was due to methodological problems or to a post-vote Bradley effect, with voters lying to pollsters about whom they had voted for.

That appears to be the case. The total number of respondents from the state samples comes to a ballpark of 50k, while the national poll is based on responses from 17k people. But why would EMR edit the national poll after it had put forward one number with all results in, especially when the original results (but not the subsequent change) agreed with what the broader state sample showed?

Thanks for sharing the insider information on marketing research. Since the Census takes a detailed look at voter participation, it'd be easy to see what exit polling operations were the most accurate, if there were competing firms in action instead of just EMR.