Friday, August 15, 2008

I mentioned in yesterday's electoral college post that I thought Nevada was a state that had been polled less often than it should have been, given how closely contested the Silver State appears to be. Of course, I was called on to elaborate on that assessment.* Jack may have been asking for a simple gut reaction as to what I considered to be underpolled. However, I cannot help but over-analyze even the simplest of questions. Why provide a feeling when we can put the data we have to good use?

With that in mind, what is underpolled?

We can go about answering that question in a couple of ways. The simplest way is to take an average. With the release of Rasmussen's poll in North Carolina this morning, the grand total of polls in our data set (from Super Tuesday to now) is 553 polls. That's an average of just over 11 polls per state. States coming in under that line, then, are underpolled. Sure, that certainly isn't false, but that is rather a low bar to set in defining what "underpolled" means.

Another couple of layers can also be added to this. We would expect that the number of polls conducted in a state would vary based upon how close and how large the state was. We'll get to a state's size in a moment, but let's focus initially on the "how close" question. An easy way to extend the simple approach is to split the states into groups according to how close they are. Well, that's already been done for us. We can take an average of the toss up states, the lean states and the strong states with the expectation that toss up states would have more polls conducted in them on average than lean or strong states. (Likewise, lean states would have more polling than strong states.)

And that is what we see in the table. So instead of saying the overall average of polls across all 50 states is 11 and there have only been 10 polls in Nevada, we can say that among toss up states, the average number of polls is 15.5 and Nevada has had only 10 polls conducted since February. That gives us a better definition of underpolled.
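For anyone who wants to try the category comparison themselves, here is a minimal sketch of the idea: average the poll counts within each category, then flag any state that falls below the average for its own category. Only Nevada's count of 10 comes from this post. The other rows and the placeholder state names are made up (the second toss up row was chosen so the toss up average lands on the 15.5 mentioned above), and the function names are my own.

```python
from collections import defaultdict

# (state, category, number of polls). Only Nevada's 10 is from the post;
# every other row is a made-up placeholder for illustration.
polls = [
    ("Nevada", "toss up", 10),
    ("State A", "toss up", 21),
    ("State B", "lean", 12),
    ("State C", "lean", 8),
    ("State D", "strong", 5),
    ("State E", "strong", 3),
]


def category_averages(rows):
    """Return the mean poll count for each category."""
    totals = defaultdict(lambda: [0, 0])  # category -> [sum of polls, states]
    for _, category, count in rows:
        totals[category][0] += count
        totals[category][1] += 1
    return {cat: total / n for cat, (total, n) in totals.items()}


def underpolled(rows):
    """List states polled less often than their category's average."""
    averages = category_averages(rows)
    return [state for state, cat, count in rows if count < averages[cat]]


averages = category_averages(polls)
print(averages)
print(underpolled(polls))
```

Swapping in the real poll counts and the categories from the electoral college map would reproduce the comparison described above.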

It gives us a better definition, but perhaps not a very efficient one. What about state size? We'll get to that in a minute. First, we can take a page out of FiveThirtyEight's book and run a regression with the number of polls conducted so far in a state as the dependent variable and the competitiveness of that state (as measured by our weighted average) as our explanatory variable. In other words, we would expect that as the spread between the two candidates increases, the number of polls in that state decreases. That's exactly what the graph below depicts.

Predicted Polling Frequency

And with that handy regression line, we can predict where a state's frequency of polling should be given its level of competitiveness. So Nevada, with ten polls thus far, is about six polls under what we would expect in light of how close the race appears in the Silver State. But right there in that lower left quadrant of the graph are several toss up states clustered together. Alaska, Indiana, Montana, New Hampshire and North Dakota all come in under that prediction line. Even lean states like New Mexico, South Carolina and South Dakota are underpolled.
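The logic of reading "underpolled" off a regression line can be sketched in a few lines: fit polls against margin by ordinary least squares, then compare each state's actual count to the line's prediction. This is an illustration under assumptions, not the exact model behind the graph. The margins and poll counts below are invented placeholders, and `fit_line` is a hypothetical helper name.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


# state -> (margin between the candidates in points, polls so far).
# All values are made-up placeholders, not the site's data.
states = {
    "State A": (1.0, 10),
    "State B": (2.0, 22),
    "State C": (3.0, 18),
    "State D": (5.0, 14),
    "State E": (8.0, 12),
    "State F": (12.0, 6),
    "State G": (20.0, 3),
}

margins = [margin for margin, _ in states.values()]
counts = [count for _, count in states.values()]
intercept, slope = fit_line(margins, counts)

# Residual = actual polls minus predicted polls; negative means underpolled.
residuals = {
    name: count - (intercept + slope * margin)
    for name, (margin, count) in states.items()
}
under = [name for name, gap in residuals.items() if gap < 0]

for name, gap in sorted(residuals.items(), key=lambda item: item[1]):
    print(f"{name}: {gap:+.1f} polls relative to the line")
```

A state's residual plays the role of Nevada's "about six polls under" figure above: the more negative the number, the further below the prediction line the state sits.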

And what about a state's size? The number of electoral votes at stake in a state -- a reasonable proxy for size in this context -- could affect the frequency of polling in a state as well. When we add that into the regression, how are the things we see above affected? Adding it improves our understanding of what causes polling frequency to vary across states and ultimately increases the efficiency of our prediction. Competitiveness alone explains about a quarter of the variation in polling frequency; competitiveness and state size together bump that up to just over half. If we focus our attention on the 14 toss up states -- six of which were underpolled when compared to the original prediction -- only four were significantly underpolled: Indiana, Montana, Nevada and North Dakota. Alaska, Michigan and New Hampshire were about on par with where they would be predicted to be, with 10, 17 and 13 polls, respectively. The remaining seven states could be considered "overpolled" based on competitiveness and state size. You cannot overpoll, in my opinion, but in a world of finite resources, and comparatively speaking, that's the reality.
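To see why adding electoral votes can only help the fit, here is a rough sketch that compares the share of variation explained (R-squared) by margin alone against margin plus electoral votes. It solves the least squares normal equations directly using nothing but the standard library. The three data lists are invented placeholders rather than the site's numbers, and `solve`, `ols` and `r_squared` are hypothetical helper names, so the printed values will not match the "quarter" and "just over half" figures above.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gauss-Jordan elimination."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        # Partial pivoting: bring the largest remaining entry onto the diagonal.
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [v - factor * p for v, p in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def ols(rows, ys):
    """Least squares coefficients [intercept, b1, ...] via the normal equations."""
    design = [[1.0] + list(row) for row in rows]
    k = len(design[0])
    xtx = [[sum(x[i] * x[j] for x in design) for j in range(k)] for i in range(k)]
    xty = [sum(x[i] * y for x, y in zip(design, ys)) for i in range(k)]
    return solve(xtx, xty)


def r_squared(rows, ys, coefs):
    """Share of the variation in ys explained by the fitted model."""
    mean_y = sum(ys) / len(ys)
    preds = [coefs[0] + sum(c * v for c, v in zip(coefs[1:], row)) for row in rows]
    ss_res = sum((y - p) ** 2 for y, p in zip(ys, preds))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1 - ss_res / ss_tot


# Made-up placeholder data, one entry per hypothetical state.
margins = [1.0, 2.0, 3.0, 5.0, 8.0, 12.0, 20.0]
electoral_votes = [5, 20, 15, 10, 27, 4, 3]
poll_counts = [10, 22, 18, 14, 12, 6, 3]

margin_only = [[m] for m in margins]
margin_and_size = [[m, ev] for m, ev in zip(margins, electoral_votes)]

r2_margin_only = r_squared(margin_only, poll_counts, ols(margin_only, poll_counts))
r2_with_size = r_squared(margin_and_size, poll_counts, ols(margin_and_size, poll_counts))

print(f"R-squared, margin only: {r2_margin_only:.2f}")
print(f"R-squared, margin plus electoral votes: {r2_with_size:.2f}")
```

With an intercept in both models, the second R-squared can never come in below the first. The interesting question is how much it rises, which is the jump from a quarter to just over half described above.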

So, long story short, it is that small group of toss up (and some lean) states that are underpolled at the moment.

*Our readers and commenters here are great. I certainly have my own ideas of what to post here, but it is my conversations, both here in the comments section and with colleagues at UGA, that spur some of the great ideas that ultimately appear in this space. I don't say it often enough, but thank you all for your support of the site and for your contributions.

5 comments:

Thank you for the thank you. I should have known you'd do a whole post on this, complete with a regression and all.

The states that stuck out in my mind as having been underpolled were Indiana and Delaware (I know Delaware is safe Dem, but there haven't been any polls there in quite a long time, and it went to Kerry by only seven points, I believe). Montana didn't seem underpolled to me, probably because every poll result out of there has been so interesting. Of the states well below the line, probably the Dakotas and SC (I know it's pretty safe McCain but I want more proof), as well as IN, NV, and MT should be polled a bit more. I really don't care if Utah is underpolled.

You know, I wasn't planning on going the full post route. It just kind of materialized.

I don't get the Delaware situation. Here's a state that goes for the Democrats 9 times out of 10, but is closer than you might expect. The GOP should really make an effort in Delaware. The two polls and the final result from 2004 and the one poll this year have/had the race(s) in the upper single digits there. It isn't an ideal pick up opportunity, but putting in some effort there could work over several cycles (What?!? A long term solution.). In the end I think Delaware gets lumped in with New Jersey: it always teases the GOP by being closer than you'd think, but more often than not goes for the Democrats when the votes are cast.

This "proof" point is a good one, Jack. This is more of a feeling than anything, but anytime a polling result leaves you scratching your head or at least wanting more proof of that result, the state, in my mind, is underpolled. A lot of those Western states fit in here. You mentioned the Dakotas and Montana, but Idaho is in there as well. Obama will not win Idaho, but he is doing around 20 points better in the state than Kerry did. That's a big difference. And though it isn't consequential to the electoral college tally, it is an interesting result that I'd like to see tracked more closely.

According to the RCP polls it looks like Phelps is not the only one taking a dip during the Olympics. McCain has lost all of the gains he made as a result of the Obama trip abroad and Obama is well below 46%, the lowest he has been since early April.

And that's consistent with what we would expect given the negative campaigning literature. It ends up being a drag on both candidates. McCain may be down nationally, but he is gaining in two notable western states: Colorado and Nevada.

And that's all I'm saying. I'll continue that thought in tomorrow's update.

In the meantime, the Faith Forum tonight should give us something to talk about tomorrow. That gets underway with Obama up first in less than ten minutes. McCain may be fighting Phelps for attention during the tail end of his Q&A with Rick Warren.