Last night I decided to pull some polling data. Huffington Post makes its polling data available in JSON format through a very easy-to-use API (GitHub code here).

This first plot uses the national polls of Trump vs Clinton. All polls of “likely” or “registered” voters were included. Next I computed the weighted moving average of these polls using moving average windows of 1, 2, 3, …, 21 days. I then plotted all of these curves on top of one another, with the width, transparency, and color of each line tied to how many days went into its moving average. The more days included in the moving average, the wider, more opaque, and redder/bluer the line is. I then plotted three different confidence bands using the 7, 14, and 21 day moving averages.
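To make the windowing concrete, here is a minimal Python sketch of a weighted moving average over several window lengths. It is not the code linked above. I am assuming each poll is weighted by its sample size and dated by a single day, and the function name, the tuple layout, and the three toy polls are made up purely for illustration.

```python
from datetime import date, timedelta

def weighted_moving_average(polls, day, window):
    """Sample-size-weighted mean of the polls dated within the `window`
    days ending on `day` (inclusive). Returns None if no poll qualifies.

    polls: list of (poll_date, percent, sample_size) tuples.
    Weighting by sample size is an assumption of this sketch.
    """
    start = day - timedelta(days=window - 1)
    in_window = [(pct, n) for (d, pct, n) in polls if start <= d <= day]
    total_n = sum(n for _, n in in_window)
    if total_n == 0:
        return None
    return sum(pct * n for pct, n in in_window) / total_n

# Toy polls, invented for illustration only (not real polling data).
polls = [
    (date(2016, 9, 1), 44.0, 1000),
    (date(2016, 9, 3), 46.0, 500),
    (date(2016, 9, 8), 42.0, 1500),
]

# One value per window length, 1 through 21 days, all ending on the same day.
day = date(2016, 9, 8)
curves = {w: weighted_moving_average(polls, day, w) for w in range(1, 22)}
```

Repeating this for every day in the date range gives one curve per window length. Short windows react quickly to each new poll, while long windows smooth over more of them, which is why the 21 day curve is the steadiest.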

I then pulled out all the state polls that were available and computed the weighted average across all polls of “likely” or “registered” voters (I did not consider the timing of the polls). I then computed a mean and standard error for each of these estimates, randomly sampled from the resulting distributions for Trump and Clinton, and plotted these random samples on the plot. The wider the spread of the plotted points for a state, the fewer people have been polled in that state. So for instance, Utah has had more polling than Idaho. The color is related to what percentage each candidate is receiving in the poll (redder for Trump and bluer for Clinton). I’ve also added lines with negative slope to show what share of support third party candidates are receiving. If you follow the line y=x, the states closer to the origin are more receptive to third party candidates. So for instance, Utah and Idaho are giving a lot of support to third party candidates, whereas Georgia and Florida are mainly voting for the two major candidates.
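The pooling and sampling step can be sketched roughly like this in Python. Again, this is an illustration rather than the linked code. I am assuming the weights are sample sizes, the standard error is the usual binomial one for the pooled share, and the draws come from independent normal distributions for each candidate (which ignores the fact that the two shares are negatively correlated). The function names and the numbers in the usage lines are hypothetical.

```python
import math
import random

def pooled_estimate(polls):
    """Pool one candidate's polls for a state.

    polls: list of (share, sample_size) with share between 0 and 1.
    Returns (mean, standard_error), where the mean is weighted by sample
    size and the standard error is sqrt(p * (1 - p) / total_n).
    Both choices are assumptions of this sketch.
    """
    total_n = sum(n for _, n in polls)
    mean = sum(p * n for p, n in polls) / total_n
    se = math.sqrt(mean * (1 - mean) / total_n)
    return mean, se

def sample_points(trump_polls, clinton_polls, k=200, seed=0):
    """Draw k (trump_share, clinton_share) points to scatter for one state."""
    rng = random.Random(seed)
    t_mean, t_se = pooled_estimate(trump_polls)
    c_mean, c_se = pooled_estimate(clinton_polls)
    return [(rng.gauss(t_mean, t_se), rng.gauss(c_mean, c_se)) for _ in range(k)]

# Hypothetical numbers: a lightly polled state gets a wider cloud of points.
mean_small, se_small = pooled_estimate([(0.40, 500), (0.50, 500)])
mean_large, se_large = pooled_estimate([(0.40, 5000), (0.50, 5000)])
points = sample_points([(0.40, 500), (0.50, 500)], [(0.30, 500), (0.34, 500)])
```

Because the standard error shrinks as the total number of people polled grows, heavily polled states collapse to a tight cluster. The negative-slope lines correspond to x + y = 1 - s, where s is the share left over for third party candidates, so points nearer the origin sit on lines with a larger s.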