How We’re Tracking Donald Trump’s Approval Ratings

President Trump’s first real test on the ballot will come 20 months from now, when Republicans face voters at the midterm elections. But those Republicans have decisions to make today about whether to support Trump’s latest policy proposal or criticize his latest tweet. And their fate will be tied to his: Historically, the president’s approval ratings have been one of the best indicators of how his party will fare in congressional elections.

As with our election forecasts, we use almost all polls but weight them based on their methodological standards and historical accuracy.

We adjust polls for house effects if they consistently show different results from the polling consensus.

And we account for uncertainty, estimating the fairly wide range within which Trump’s approval ratings could vary over the next 100 days. (This is what’s indicated by the green and orange bands on the chart.)

Here’s a more detailed — and at times, technical — rundown of how we crunch the numbers.

Finding and weighting polls

Our philosophy is to use all the polls we can find — provided that we think they’re real, scientific surveys.[1] However, we use a formula that weights polls according to our pollster ratings, which are based on pollsters’ historical accuracy in forecasting elections since 1998[2] and a pair of easily measurable methodological tests:

Whether the pollster usually conducts live-caller surveys that call cellphones as well as landlines.

Whether the pollster is a member of the National Council on Public Polls, a supporter of the American Association for Public Opinion Research transparency initiative, or a contributor to the Roper Center data archive.

Polls are also weighted based on their sample size, although there are diminishing returns to bigger samples. Surveying 2,000 voters substantially reduces error compared with surveying 400 of them, but surveying 10,000 voters will produce only marginal improvements in accuracy compared with the 2,000-person survey.[3] The worse the pollster’s rating, the quicker it encounters diminishing returns in our formula; a Zogby Interactive survey still wouldn’t get much weight in our model, for example, even if it polled 100,000 people.

The weights also account for how often a pollster measures Trump’s approval ratings. If it does so more often than about once per 20 days, each instance of the poll is discounted so that the pollster doesn’t dominate the average just because it’s so prolific. Daily tracking polls also receive special handling from the formula so that interviews are not double-counted.
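We haven’t published the exact weighting formula, but the ideas above (diminishing returns to sample size, a quality-dependent cap, and a discount for prolific pollsters) can be sketched roughly as follows. All constants here are made up for illustration and are not the model’s actual values:

```python
import math

def poll_weight(sample_size, pollster_quality, polls_per_20_days=1):
    """Toy poll weight, not FiveThirtyEight's actual formula.

    pollster_quality is a made-up rating in (0, 1].
    """
    # Low-rated pollsters hit diminishing returns sooner: cap the
    # effective sample size in proportion to the rating.
    effective_n = min(sample_size, 5000 * pollster_quality)
    # Square root, so going from 2,000 to 10,000 interviews helps far
    # less than going from 400 to 2,000.
    weight = pollster_quality * math.sqrt(effective_n / 600)
    # Pollsters releasing more than ~1 poll per 20 days are discounted
    # so that volume alone can't dominate the average.
    return weight / max(1, polls_per_20_days)
```

In this sketch, a hypothetical low-rated pollster interviewing 100,000 people (`poll_weight(100000, 0.2)`) still ends up with less weight than a well-rated pollster interviewing 600 (`poll_weight(600, 0.9)`).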

“Adults” versus “voters”

Comparisons between different approval-ratings polls aren’t always apples to apples. Some polls are surveying adults regardless of their voting status; others are polling registered voters, and a few are even polling “likely voters.”[4] So far, Trump’s approval ratings are higher among the voter population than among the adult population.

Our default version of the approval ratings reflects a combination of all polls, whether they’re of adults, registered voters or likely voters. If a pollster releases multiple versions of the same survey, however, we use the all-adult version of the poll before the registered-voter version (and the registered-voter version before the likely-voter version). Approval ratings have traditionally been taken among all adults, so this provides for better continuity between Trump’s ratings and those of past presidents.

However, we have another version of the approval ratings that includes only registered or likely voter polls[5] and discards all-adult polls; this may be the most useful version for forecasting Trump’s impact on the 2018 midterm elections. (We also have a version that uses only the all-adult polls and discards any polls that restrict the sample to registered or likely voters, which might be useful for historical comparisons.)
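In code, the version-selection rule for the default average might look like this minimal sketch (field names are hypothetical):

```python
# Lower rank = preferred in the default (all-poll) average; the
# voters-only average described above would invert this ordering.
PREFERENCE = {"adults": 0, "registered_voters": 1, "likely_voters": 2}

def pick_default_version(releases):
    """From multiple versions of the same survey, keep the all-adult
    version if it exists, then registered voters, then likely voters."""
    return min(releases, key=lambda r: PREFERENCE[r["population"]])
```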

Calculating a trend line (local polynomial regression)

Because individual polls can be noisy, we estimate how Trump’s approval rating has changed over time using local polynomial regression. Basically, this consists of drawing a smooth curve over the data; this method is similar to those used on Huffington Post Pollster and other sites. In the regression, polls are weighted on the basis that I described earlier, so higher-quality polls with larger sample sizes have more say in the estimate.
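A bare-bones version of such a weighted local quadratic fit, using a tricube kernel as in standard LOESS (the real model’s parameters and details differ; this is only a sketch):

```python
import numpy as np

def local_quadratic_estimate(target_day, days, values, weights, bandwidth=20):
    """Fit a quadratic to polls near target_day; illustrative only.

    Each poll's influence combines its quality weight with a tricube
    kernel that fades with distance from the target day.
    """
    days = np.asarray(days, dtype=float)
    values = np.asarray(values, dtype=float)
    weights = np.asarray(weights, dtype=float)

    dist = np.abs(days - target_day) / bandwidth
    kernel = np.where(dist < 1, (1 - dist**3) ** 3, 0.0)  # tricube
    w = weights * kernel
    mask = w > 0
    # np.polyfit multiplies the unsquared residuals by its `w` argument,
    # so pass the square root of the desired weight.
    coeffs = np.polyfit(days[mask], values[mask], deg=2, w=np.sqrt(w[mask]))
    return np.polyval(coeffs, target_day)
```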

While local polynomial regression is a flexible and fairly intuitive method, it’s a bit trickier to work with than it might seem. That’s because people don’t always take the time to determine the correct degree of smoothing, which is governed by several parameters, including the bandwidth and the degree of the polynomial. Too little smoothing can make the curve jut up and down unnecessarily and will result in overfitting of the data. If you smooth too much, however, the curve may be aesthetically pleasing but won’t do all that good a job of describing the data and may be slow to catch up to new trends. While there are usually a wide range of “reasonable” settings when choosing trend-line parameters, our experience has been that people often over-smooth the data when applying these techniques.[6]

For our election forecasts, we choose the degree of smoothing based on what will maximize predictive power. Generally, this results in a fairly aggressive setting, especially in the days and weeks just before an election. This was one of the reasons our model came closer to the mark than most others in last year’s presidential election; it was aggressive about detecting the substantial tightening in the race that came after FBI Director James Comey’s letter to Congress in late October.

In the case of approval ratings, there’s no election to predict — so we instead choose the settings based on how well they would have predicted a president’s future approval ratings. The analysis asks, for instance, what settings would best have predicted Bill Clinton’s approval ratings in March 1998 based on data through February 1998; it draws on approval-ratings polls since 1945.

This also turns out to produce a relatively aggressive model. A week or two is usually enough to detect a meaningful change in approval ratings, or even less if several high-quality polls tell a consistent story. See the footnotes for more detail about which settings we use.[7]

One more detail: When you see our estimates of Trump’s approval rating for a given date, they reflect only polls that were available as of that date. For instance, our estimate of Trump’s approval rating for March 15 will reflect only polls that have been released to the public by March 15. If on March 18, a new poll comes out that was conducted on March 15, we don’t go back and re-run the numbers for March 15.
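That “frozen as of the date” rule is simple to express in code (a sketch with a hypothetical field name; ISO-format date strings compare correctly as plain strings):

```python
def polls_available_on(polls, as_of_date):
    """Return only the polls publicly released by `as_of_date`.

    The estimate for a date is never revised when a poll conducted
    earlier is released later.
    """
    return [p for p in polls if p["release_date"] <= as_of_date]
```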

Adjusting for house effects

Polls are adjusted for house effects, which are persistent differences between the poll and the trend line. Rasmussen Reports, for example, has consistently shown much better approval ratings for Trump than other pollsters have, while Gallup’s have been slightly worse. The house effects adjustment counteracts these tendencies. So, a recent Rasmussen Reports poll that showed Trump at 50 percent approval and 50 percent disapproval was adjusted by the model to 45 percent approval and 51 percent disapproval.[8] Meanwhile, a recent Gallup poll that had him at 43 percent approval and 52 percent disapproval was adjusted to 44 percent approval and 50 percent disapproval. After adjusting for house effects, therefore, these polls — which had seemed to be in considerable disagreement with each other — are actually telling a fairly consistent story.

The house effects adjustment is more conservative when a pollster hasn’t released very much data. For instance, if a new firm called PDQ Polling releases a survey showing Trump at a 50 percent approval rating when the trend line has him at 44 percent, the model doesn’t assume that PDQ has a 6-point pro-Trump house effect, because its result could have reflected sampling error rather than methodological differences. Therefore, PDQ’s house effect is discounted: The model might adjust its numbers down by 1 or 2 percentage points, but not by a full 6 points. As a pollster releases more data, however, a larger fraction of its house effect is adjusted for. If, over many months, PDQ’s polls have consistently been 6 percentage points better for Trump than the consensus, the model will eventually deduct 5 or 6 points from Trump’s approval rating in a PDQ poll.
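One standard way to get this behavior is multiplicative shrinkage toward zero; the shrinkage constant below is invented for illustration, not the model’s actual value:

```python
def shrunken_house_effect(observed_gap, n_polls, k=4):
    """Shrink a pollster's average gap versus the trend line.

    With few polls, most of the gap is attributed to sampling error;
    as n_polls grows, nearly the full gap is treated as a house
    effect. k is a made-up shrinkage constant.
    """
    return observed_gap * n_polls / (n_polls + k)
```

With these toy numbers, a single 6-point-friendly poll from the hypothetical PDQ Polling yields an adjustment of about 1.2 points, while hundreds of such polls yield nearly the full 6 points.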

As a technical note, house effects and the trend line are calculated on an iterative basis. First the model calculates a trend line using the unadjusted version of the polls. Then it estimates house effects based on how polls compare to that trend line. Then it goes back and re-calculates the trend line, with polls adjusted for house effects. Then it recalculates the house effects adjustment using the recalculated trend line. It loops through this process several times. This helps the model determine whether an apparent shift in the data reflects a real change in Trump’s trajectory or is an artifact of house effects.
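The iteration can be sketched with a deliberately simplified “trend” (an unweighted mean rather than a local polynomial, and no shrinkage), just to show the alternating structure:

```python
import statistics

def fit_trend_and_house_effects(polls, n_iterations=5):
    """Alternate between refitting the trend on house-effect-adjusted
    polls and re-estimating house effects against the new trend.

    Illustrative: the real model fits a local polynomial trend and
    shrinks house effects; here the trend is just an unweighted mean.
    """
    house = {p["pollster"]: 0.0 for p in polls}
    trend = 0.0
    for _ in range(n_iterations):
        # 1. Adjust each poll by its current house effect; refit trend.
        adjusted = [p["approve"] - house[p["pollster"]] for p in polls]
        trend = statistics.mean(adjusted)
        # 2. Re-estimate each pollster's house effect vs. the trend.
        for name in house:
            gaps = [p["approve"] - trend for p in polls if p["pollster"] == name]
            house[name] = statistics.mean(gaps)
    return trend, house
```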

Estimating uncertainty

In our election forecasts, we estimate uncertainty by comparing how close past forecasts would have come to past election outcomes.[9] If the polling average in a certain type of U.S. Senate race historically missed the outcome by an average of 4 percentage points a month before the election, for instance, that would be reflected in our forecast. Thus, our calculation of confidence intervals and probability estimates is empirical, rather than being based on idealized (and possibly overconfident) assumptions about how accurate polls “should” be.

For approval ratings, we don’t have that luxury. That is to say, there’s no national plebiscite in which all Americans go to the polls and vote up or down on whether they approve of Trump’s job performance. Furthermore, our approval ratings estimate blends different types of populations together (adults, registered voters, likely voters), so it’s not clear what such a plebiscite would look like even in theory. Therefore, there’s not any good way to determine how Trump’s “true” approval rating compares to our estimates.

What we can do, however, is measure how well our approval rating estimate on a given date predicts future approval-ratings polls. These estimates are empirically driven, based on an analysis of approval-ratings data from 1945 through 2017 (for Presidents Truman through Obama).

The shaded area (as in the example below) reflects where we project 90 percent of new approval ratings polls to fall. As of March 2, for example, Trump’s approval rating is about 44 percent, but with a range of roughly plus or minus 5 percentage points. Thus, we’d expect Trump’s approval rating to be between 39 percent and 49 percent in about 90 percent of new polls, with the remaining 10 percent of polls falling outside this range. The width of the bands is determined by the volume of recent polling (more polls make it easier to home in on the average), the degree of disagreement in the polls and the amount of long-term volatility. So far, polls have disagreed more on Trump’s approval rating than they did for Obama, so his range is slightly wider than Obama’s would have been.
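Those bands are empirical rather than theoretical: collect the historical errors (new poll result minus the estimate at the time) and read off percentiles. A minimal sketch:

```python
def empirical_band(estimate, past_errors, coverage=0.90):
    """Band around `estimate` from the empirical error distribution.

    past_errors are historical (new poll minus contemporaneous
    estimate) differences; no normality assumption is made.
    """
    errors = sorted(past_errors)
    tail = (1 - coverage) / 2
    lo = errors[int(tail * (len(errors) - 1))]
    hi = errors[int((1 - tail) * (len(errors) - 1))]
    return estimate + lo, estimate + hi
```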

We’re projecting a range for Trump’s approval and disapproval ratings over the next 100 days. (Or over the next 250 days if you look at the “four years” tab of our interactive.) As you can see, uncertainty increases as you advance further into the future. Also, presidential approval ratings tend to be more volatile early in a president’s term, so you should keep a wide range of possibilities in mind for how Trump’s presidency might progress.

In the long run, presidential approval ratings tend to be somewhat mean-reverting: Good ratings tend to get worse, and bad ones tend to get better. They also tend to worsen, slightly, over the course of a president’s tenure in office. It would be easy to overstate these effects, which are relatively minor over the near-to-medium term. Nonetheless, you’d expect a president’s approval rating to decline more often than not when it’s above 50 percent and to rise more often than not when it’s below 40 percent.[10]
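As a toy model of that mean reversion (the 40 and 50 percent thresholds come from the paragraph above; the pull rate is invented):

```python
def mean_reversion_projection(approval, days_ahead, pull_per_day=0.002):
    """Project an approval rating with mild mean reversion.

    Ratings above 50 drift down toward 50, ratings below 40 drift up
    toward 40, and mid-range ratings are left alone (historically
    about equally likely to rise or fall).
    """
    if approval > 50:
        target = 50.0
    elif approval < 40:
        target = 40.0
    else:
        return approval
    fraction = min(1.0, pull_per_day * days_ahead)
    return approval + (target - approval) * fraction
```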

The dashed line in the chart indicates Trump’s projected approval rating over the next 100 days, accounting for this mean reversion. Because Trump’s approval rating is within the 40 percent to 50 percent range now, we wouldn’t expect much effect from mean reversion. That is to say, on the basis of how presidential approval ratings have behaved historically — not considering any circumstances particular to Trump — ratings like the ones Trump has now are about equally likely to rise and to fall. It’s true that most presidents begin with higher approval ratings (and much lower disapproval ratings) than the ones Trump now has and then see them deteriorate. But because Trump’s numbers are already middling, he may avoid the slump that past presidents experienced when they exited their “honeymoon period.”

Footnotes

1. There are extremely few exceptions. Polls are put on our banned list only if we think they may be fabricating data or if they’re “push polls” or convenience samples that aren’t even attempting to collect an unbiased sample of the public. Again, our aim is to be inclusive, and we usually wind up including a more comprehensive set of polls than other polling averages do.

2. Note that we last updated our pollster ratings in the summer of 2016 after the completion of the Democratic and Republican primaries and intend to revisit them again soon to account for the results of the 2016 general election.

3. At least for the topline results; the larger sample size would be extremely helpful if you wanted to break the results down by demographic subgroups.

4. The scare quotes are purposely throwing a bit of shade. Likely voter polls are fine and good — indeed, often preferable — once you get close to an election. But they’re slightly weird at this stage because voters in most states won’t go to the voting booth for nearly two years. It’s hard to project what turnout will look like this far out from an election.

5. This version uses the likely voter numbers if a pollster releases both likely-voter and registered-voter data.

6. This is especially the case if polls are adjusted for house effects, which can substantially reduce the amount of noise in the data.

7. The model uses three second-degree (quadratic) smoothers, with bandwidths of 10, 20 and 30 days, and averages these three estimates together. The bandwidth reflects the number of days used to calculate the polynomial: For instance, a 20-day bandwidth applied on Feb. 27 would mean that polls from Feb. 7 to Feb. 27 are used in the calculation. The model also requires that a minimum of five polls be used in the calculation, so it will lengthen the time intervals under consideration in the event of sparse data.
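Sketched in code, the blending and the sparse-data rule from this footnote look roughly like this (assuming, hypothetically, that each poll is a dict with a `day` field counting days since inauguration):

```python
def blended_estimate(estimates_by_bandwidth):
    """Average the trend estimates from the three bandwidths."""
    return sum(estimates_by_bandwidth.values()) / len(estimates_by_bandwidth)

def polls_for_fit(polls, target_day, bandwidth, minimum=5):
    """Polls within `bandwidth` days looking back from the target;
    if fewer than `minimum` qualify, stretch the window to the
    nearest `minimum` polls."""
    in_window = [p for p in polls if 0 <= target_day - p["day"] <= bandwidth]
    if len(in_window) >= minimum:
        return in_window
    by_distance = sorted(polls, key=lambda p: abs(p["day"] - target_day))
    return by_distance[:minimum]
```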

8. House effects are calculated on a president’s approval rating and disapproval rating separately. For instance, a poll might show both a lower approval rating and a lower disapproval rating for Trump than the consensus if it has a lot of undecided voters.

9. It’s slightly more complicated than that — some of our election forecasts revise uncertainty estimates upward to compensate for overfitting and sparse data — but that’s the basic idea.

10. In between 40 percent and 50 percent, the different types of long-term factors our model considers can have ambiguous effects. For example, a president’s approval rating might be expected to rise slightly before slightly declining again.

Nate Silver is the founder and editor in chief of FiveThirtyEight. @natesilver538