The War of the Senate Models

As the midterm election season heats up, politically inclined quantitative nerds like me have been trying to predict which party will end up in charge of the Senate. For us, it is the most suspenseful question of the year. Control could easily go either way.

Today, there’s a glut of forecasts out there, each one promising to be more accurate than the last. Their authors range from veteran handicappers like Charlie Cook and Larry Sabato to relative newcomers like the Monkey Cage and the Upshot. How to make sense of the free-for-all? You just need to keep a few basic principles in mind.


The Data

Polls tend to be scarce before Memorial Day, so early predictions of the November election outcome must rely on indirect indicators of how voters are likely to behave—what we call “fundamentals.” To make a sports analogy, these predictions are like a team’s initial seeding in a tournament. They just tell us who’s looking good at the outset of the campaign.

Once polls become available, they can capture the same ballpark range of November performance that fundamentals do—and with much less uncertainty. Years of polling have shown that what voters say they want “right now” is a strong starting point for predicting, give or take a few points, how they will vote in the fall. Because of that—no matter the race—the most accurate predictions are made using polling data, when enough of it is available.

The bottom line: Even at this early stage, polls are our best way to predict November outcomes. In the 2012 election, for instance, polling data available in July and knowledge of how far presidential polls tend to move in the months leading up to the election were enough to give President Obama’s reelection a probability of 91 percent. That crept up to nearly 100 percent as the election approached. However, predicting the partisan control of the Senate in 2015 is a far harder problem.
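The arithmetic behind a figure like that 91 percent is straightforward: treat the candidate's polling margin as the center of a normal distribution whose width reflects how far polls historically drift before Election Day, then ask how much of that distribution stays on the winning side. A minimal sketch (the margin and drift numbers below are hypothetical illustrations, not Wang's actual 2012 inputs):

```python
from math import erf, sqrt

def win_probability(poll_margin, drift_sd):
    """Probability the leader's margin stays positive on Election Day,
    modeling the future margin as Normal(poll_margin, drift_sd)."""
    # P(margin > 0) via the normal CDF evaluated at the poll margin
    return 0.5 * (1 + erf(poll_margin / (drift_sd * sqrt(2))))

# Hypothetical: a 3-point July lead with 2 points of historical
# drift between midsummer and November.
print(round(win_probability(3.0, 2.0), 2))  # about 0.93
```

As the election nears, the remaining drift shrinks, so the same lead translates into a probability closer to certainty, which is exactly the creep toward 100 percent described above.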

The Model Type

In the election-prediction business, most models fall somewhere between two extremes: Type 1, which is purely fundamentals-based, and Type 2, which is purely poll-based.

Here are a few different quantitative models and how they answer the question: Will Republicans take over the Senate? The GOP needs to pick up six seats in the fall, on top of the 45 the party currently controls.

At first glance, these three predictions seem very much at odds with one another. But in my experience, any contest with probabilities between 20 percent and 80 percent should be regarded as a toss-up, with no solid favorite. All three models above therefore suggest a knife-edge situation. Although I love to point out that all predictions in this range are basically hedged bets, I recognize that it is natural to ask why a given probability falls above or below the magic 50 percent threshold. To answer that, it helps to take a closer look at the models themselves.

Type 1: Fundamentals only. Type 1 models, which rely on no polling data at all, have the advantage that they can be created before the campaign even starts. The Monkey Cage model is currently pure Type 1, relying on a large number of fundamentals, from candidate “quality” to economic growth. This year, the most important fundamental is that, in midterm elections, national public opinion tends to go against the president’s party. That gives us some idea of the range of possible outcomes: Basically, Democrats are going to lose seats.

Interestingly, because of this reliance on national public opinion, as a general rule, with a Democratic president in power, the more a model relies on non-poll-based assumptions, the more it will favor the Republicans. Note that the probability of a GOP takeover is higher in the Monkey Cage model than it is in the others.


When it comes to extremely close races, though, Type 1 models are of limited use. Modelers put in lots of factors that have been shown to affect election outcomes (“signals,” in engineering parlance), such as the economy and incumbency. But each factor you add also contributes “noise”—accumulating uncertainties that, once added, cannot be taken out. For example, during a midterm election year, the generic congressional poll (Would you rather vote for a Democrat or a Republican?) tends to move against the president’s party—but the actual outcome on Election Day has ranged from an 11-percentage-point loss to a 4-percentage-point gain in the national popular vote margin.
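The noise-accumulation point can be made concrete: if each fundamental carries its own independent error, the combined uncertainty grows with the square root of the summed variances, so piling on predictors widens the prediction interval even as it adds signal. A sketch with hypothetical per-factor error sizes:

```python
from math import sqrt

# Hypothetical standard errors (in percentage points) contributed by
# three fundamentals, e.g. generic ballot, economy, incumbency.
factor_sds = [2.0, 1.5, 1.0]

# For independent errors, variances add; the combined standard
# deviation is the square root of the sum of squared errors.
combined_sd = sqrt(sum(sd**2 for sd in factor_sds))
print(round(combined_sd, 2))  # about 2.69
```

Note that the combined uncertainty (about 2.7 points here) exceeds any single factor's: each new fundamental narrows the bias but broadens the spread, which is why a three-factor model cannot resolve a race decided by a point or two.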

Fundamentals can be national factors, such as the generic congressional ballot, which captures a general national mood. Or they can be local, such as whether an incumbent is in the race, a factor that attempts to capture how well known a candidate is. But these are simplifications. From a reader’s standpoint, probabilities in Type 1 models should never be read with more certainty than, say, the National Weather Service’s numbers. Rain forecast probabilities are good enough to help us plan our weekend outings—and even they are uncertain enough that they are always rounded to the nearest 10 percent.

Rather, Type 1 models are hypotheses about where a campaign is naturally headed. You can think of them as asking, “Do our assumptions about how politics works give the correct prediction?” They tend to be of most use after the results are in. In 2012, Type 1 presidential models ranged from predicting a Romney win to an Obama landslide—and everything in between.

If past history is any guide, when FiveThirtyEight comes up with a more exact model, it will have a strong Type 1 component but will also include some polling data. That probably explains why FiveThirtyEight’s state-by-state win probabilities seem to give Democrats a better shot than the Monkey Cage does.


Sam Wang is a data scientist, a co-founder of the Princeton Election Consortium and an associate professor of neuroscience and molecular biology at Princeton University. Follow him on Twitter @SamWangPhD.