Which Economic Indicators Best Predict Presidential Elections?

I’ve written quite a bit in the past about the use and abuse of economic indicators in forecasting presidential elections.

In general, my view has been that there’s a lot of “physics envy” in thinking that something as complex as a presidential election can be boiled down to just one or two “fundamental” variables. Many models that attempt to do so suffer from a variety of technical and theoretical problems and their predictions have been mediocre when applied in the real world.

But this obscures a question that is interesting on its own merits: which economic variables have done the best job in explaining the results of past presidential elections?

One complication in evaluating these models is that they usually include other data in addition to the economic variables themselves: variables to denote wartime or peacetime, or various ways to measure incumbency, or polling data, and so forth. There is not necessarily any problem in including these things — I’m certainly a believer in making forecasts based on polls, for instance. But they tend to obscure the question of how much predictive power the economic variables have had on their own.

A related problem is that all of the models handle the economic data in their own idiosyncratic ways. Some take the variable on a per-capita basis while others do not. Some use revised data while others use the variable as originally reported. Some models are applied to elections all the way back to the 19th century, while some look at as few as the eight or ten most recent ones. Some look at the variable for just one quarter of the election year, while others consider longer time periods.

Which economic indicators do best if they are put on a level playing field? Let’s establish a set of ground rules so that we can make some apples-to-apples comparisons.

First, our sample will consist of all elections since World War II — that is, the 16 presidential elections from 1948 through 2008. The reason for this choice is that many commonly used economic data series were first published shortly after the war, particularly in 1947 or 1948. I admire models that go further back than this — if you’re claiming that a variable is truly fundamental, it ought to hold up under a variety of different economic and social milieus. But other than one or two variables like G.D.P. and the inflation rate, there aren’t very many variables available prior to 1947, and our goal here is to see how quite a few different economic indicators perform.

Second, we’ll be looking at how the variable performed from January through September of the election year. The reason we’ll use September as the cut-off date rather than October or November is that many of these variables are available only on a quarterly basis, so data from the fourth quarter will include results from after the election took place. I don’t personally think this is a huge deal, but using the first nine months of the election year is as close as anything to an industry standard. (There’s reasonably clear evidence, on the other hand, that performance earlier in a president’s term may not matter very much. Some models even contend that poor economic performance during the first two years of a president’s term can be helpful to him.)
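For readers who want to try this at home, here is one rough way to turn a quarterly series into a January-through-September figure. It is a sketch under an assumption of mine, not a formula taken from any particular model: it compares the level at the end of the prior year’s fourth quarter with the level at the end of the election year’s third quarter and annualizes the change. The function name and arguments are hypothetical.

```python
def nine_month_annualized_growth(start_level: float, end_level: float) -> float:
    """Annualized percent change across three quarters.

    Assumption: start_level is the value at the end of Q4 of the prior
    year and end_level is the value at the end of Q3 of the election year.
    """
    if start_level <= 0 or end_level <= 0:
        raise ValueError("levels must be positive")
    growth_factor = end_level / start_level
    # Three quarters elapsed, so raise to the 4/3 power to get a yearly rate.
    return (growth_factor ** (4 / 3) - 1) * 100


# Illustrative numbers only: a series that rises from 100 to 103.
print(round(nine_month_annualized_growth(100.0, 103.0), 2))
```

A monthly series could be handled the same way by comparing December with September and using an exponent of 12/9.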

Third, although this is a fairly technical point, we will be using the most recent data revisions for all variables, as they were available through the Federal Reserve as of the week of Nov. 13, 2011.

Fourth, we’ll be looking at how well the variables predicted the incumbent party’s margin of victory or defeat in the popular vote. This is also a fairly minor issue as most models use close variants of this approach (although a few take more exotic attitudes like counting all third-party votes against the incumbent party).
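To make the difference between these definitions concrete, here is a small sketch. The function names are hypothetical and the vote shares in the example are made up for illustration; they are not results from any actual election.

```python
def incumbent_margin(incumbent_pct: float, challenger_pct: float) -> float:
    """Incumbent party's popular-vote margin over the main challenger, in points."""
    return incumbent_pct - challenger_pct


def incumbent_margin_all_others_against(incumbent_pct: float) -> float:
    """Variant that counts every non-incumbent vote, third parties included,
    against the incumbent party."""
    return incumbent_pct - (100.0 - incumbent_pct)


# Hypothetical race: incumbent party 48, main challenger 47, third parties 5.
print(incumbent_margin(48.0, 47.0))               # 1.0
print(incumbent_margin_all_others_against(48.0))  # -4.0
```

In a race with a meaningful third-party vote, the two definitions can disagree even about who "won," which is why it helps to fix one definition before comparing variables.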

As to which economic variables we’ll be looking at, the answer is … all of them. Well, not literally all of them. But I have retrieved data for 43 such variables, including almost all of the ones that economists and investors use most commonly. What these variables have in common is that they are available dating back to 1948, that they are easily accessible to the public (with a few exceptions, they are available at the Federal Reserve’s Web site), and that they are relatively easy to calculate. I have adjusted the variables for inflation in some cases when the publicly available versions did not, but that is about as complicated as it gets. If you don’t see a variable listed here, it’s because it did not begin to be tracked until well after 1948, or because it has an ambiguous economic interpretation.

What you’ll see next is a chart evaluating the performance of the 43 variables on the basis of their coefficient of determination or r-squared. The interpretation of r-squared is that it describes how much of the variance in a dependent variable (in this case, election results since 1948) can be explained by an independent variable. An r-squared of .30 or 30 percent, for instance, implies that the variable explains about 30 percent of the variation in election results (leaving the other 70 percent unexplained). Normally, r-squared scores are expressed as positive numbers, but I have listed them as negative instead in a handful of cases where the election results have been related to the variable in the opposite of the expected direction. For instance, the incumbent party has actually done better (although not to any statistically significant degree) when the personal savings rate has been lower, perhaps because this variable is a bit counter-cyclical and people tend to save more of their incomes when the economy is struggling.
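Here is a minimal sketch of how such a signed score could be computed. With a single independent variable, r-squared is simply the square of the Pearson correlation, so the sketch computes the correlation and then flips the sign of its square when the relationship runs against the expected direction. The `expected_sign` argument is my own hypothetical convention (+1 if a higher value should help the incumbent party, -1 if it should hurt), and the toy numbers are not real data.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("a constant series has no defined correlation")
    return sxy / sqrt(sxx * syy)


def signed_r_squared(indicator, margins, expected_sign=1):
    """r-squared, reported as negative when the observed relationship
    runs opposite to the expected direction."""
    r = pearson_r(indicator, margins)
    r2 = r * r
    return r2 if r * expected_sign >= 0 else -r2


# Toy data: the indicator rises in lockstep with the incumbent margin.
toy_indicator = [1.0, 2.0, 3.0, 4.0]
toy_margins = [2.0, 4.0, 6.0, 8.0]
print(signed_r_squared(toy_indicator, toy_margins, expected_sign=1))
print(signed_r_squared(toy_indicator, toy_margins, expected_sign=-1))
```

The second call shows the sign flip: the fit is just as tight, but it is reported as negative because the relationship runs the "wrong" way.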

So let’s see how the 43 variables rank:

The first thing to notice is that no individual economic variable has an r-squared higher than .46, meaning that none can explain more than half of election results in the post-war period. If you see models that claim to have more explanatory power than this, it is because they are using additional data (like polling) besides economic variables, or because the model has been jury-rigged to maximize its “fit” on past data in ways that will contribute little to its true predictive accuracy. The economy is hugely important to presidential elections, but there are no magic bullets.

Perhaps the best starting point is to see how the most commonly-used economic variable — G.D.P. — compares to election results:

There is definitely a relationship between G.D.P. and election results. But it isn’t a perfect one. While the election results were pretty much exactly in line with what you’d expect from G.D.P. in many years like 1960, 1980, 2004, 2008 and so forth, G.D.P. would not have given you a good prediction in 1952, 1956, 1976, or 1992. Overall, the r-squared for G.D.P. is .33 in elections since 1948. That is, about 33 percent of the variation in election results is explained by G.D.P., leaving about two-thirds unexplained. For what it’s worth, while this might sound underwhelming, the relationship has actually grown stronger over time. If you run the numbers back to 1880 (using Ray Fair’s economic data for years prior to World War II), the r-squared is somewhat weaker at .23.

One popular alternative to using G.D.P. is the variable real disposable income per capita. I made some fun of this variable in my recent New York Times Magazine piece, not because it is inherently ridiculous (it seeks to measure something useful, take-home income) but because its popularity in a number of political science forecasting models probably stems more from the fact that it happened to “fit” the data quite well for a number of elections from about 1960 through 1988. Outside of that window, however, its performance has been mediocre. It did badly, for instance, in 1948, 1952, 1956 and 2000, and to a lesser extent in 1992. Overall, its r-squared in post-war elections is .30, essentially the same as for plain-vanilla G.D.P.

Even though G.D.P. and disposable income only explain about a third of election results, they do perform better than some other commonly-used variables. The inflation rate during the first nine months of the election year, for instance, has had only a loose relationship to election outcomes.

This variable is a little tricky since there was really only one post-war election, 1980, in which inflation was especially high. Jimmy Carter did do extremely poorly that year, although there were a lot of other problems with the economy as well. Overall, however, its r-squared is just .10. (In fact, there are several different ways to calculate the inflation rate, none of which has done much better or worse than this.)

Another poorly performing variable is the unemployment rate. It has had essentially no relationship to election results at all.

However, while the unemployment rate has told us very little, the rate of change in the jobs market has been fairly meaningful. Here, for instance, is a comparison of election results to the rate of payroll jobs growth, the variable you often see highlighted when the government releases its jobs report on the first Friday of each month.

This variable has had an r-squared of .44 — a fair amount better than G.D.P. or disposable income. Thus, while the raw rate of unemployment has been one of the least useful variables for forecasting election results, the rate of job growth during the election year (whether measured by this variable or by closely-related ones like the net change in the unemployment rate or the employment-to-population ratio) has held up pretty well. So pay attention to those employment reports, as common sense would dictate. I apologize if I’d given you a contrary impression in the past, but I hadn’t dug into the data this deeply.

The broader point, and one thing on which this evidence is fairly definitive, is that the rate of change is what counts. Americans will give a fair amount of credit to a president in an economy that is still below its full productive capacity provided that it seems to be getting better. This can also be seen in the poor performance of the variable actual-to-potential G.D.P. (which tracks output against its long-term trend-line) as compared to the comparatively strong performance of the rate of G.D.P. growth during the election year.

Another class of variables that might be avoided are those related to government activity. For instance, the government expenditures component of G.D.P. has had almost no relationship to election results; it ranks 42nd out of 43 variables and the relationship has actually been slightly negative historically. Likewise, the ratio of net government savings to G.D.P. — essentially, a measure of deficits — has had no predictive power one way or the other.

Meanwhile, the best-performing variable has been the ISM manufacturing index, which is a measure of how much manufacturing businesses are ramping up or ramping down their activity. It has had an r-squared of .46.

The manufacturing index has several things going for it. It is considered a fairly good leading indicator of economic activity in general and of jobs growth in particular, and it represents a composite score from five different component indicators, including things like the employment environment and new factory orders.

So does this mean you should all go out and use the ISM manufacturing index in your forecasting models? Actually, maybe not. It is certainly worth looking at. But when you’re testing 43 different economic indicators over a sample of just 16 elections, the best-performing ones are likely to have been a little lucky.
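To get a feel for how much luck can be involved, here is a rough simulation sketch. It generates 43 indicators of pure random noise, scores each against 16 equally random “election results,” and records the best r-squared among them, averaged over many trials. Everything here is an assumption for illustration: the function names are hypothetical, the noise is normally distributed, and the fake indicators are independent of one another, whereas real economic indicators are correlated, so the luck in the real data is probably smaller than in this simulation.

```python
import random


def r_squared(xs, ys):
    """Squared Pearson correlation between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        return 0.0  # a constant series explains nothing
    return (sxy * sxy) / (sxx * syy)


def average_best_noise_fit(n_indicators=43, n_elections=16, n_trials=200, seed=0):
    """Average, over n_trials, of the best r-squared achieved by any of
    n_indicators pure-noise series against a pure-noise outcome."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_trials):
        outcome = [rng.gauss(0, 1) for _ in range(n_elections)]
        best = max(
            r_squared([rng.gauss(0, 1) for _ in range(n_elections)], outcome)
            for _ in range(n_indicators)
        )
        total += best
    return total / n_trials


print("best of 43 noise indicators:", round(average_best_noise_fit(43), 2))
print("a single noise indicator:   ", round(average_best_noise_fit(1), 2))
```

The gap between the two printed numbers is the point: picking the winner from a large field inflates its apparent fit even when none of the candidates has any real relationship to the outcome.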

In fact, the relative rank of the economic indicators has historically been very inconsistent: those that perform best over one set of elections do not do much better over the long term. We will discuss this problem at greater length in a follow-up to this article.

Nate Silver is the founder and editor in chief of FiveThirtyEight. @natesilver538