For me the big story is the growth in the other party vote. In the US system of first-past-the-post voting, these are largely wasted votes. My contention is that these votes were largely lost from Democrat voters, and they cost Clinton the election.

The first chart following is the percentage point change in the other parties' vote share (without Utah, because it was a special case). The second chart is the percentage change in the number of votes cast for other parties (without Oklahoma, because there were no other party votes in 2012). The final chart is the vote share for other parties.

The Democrat count is clearly down in the industrial mid-west. The first chart following is the vote share for the Democrats in each state. The second chart is the change in vote share (in percentage points) from 2012 to 2016. The final chart is the change in raw vote count from 2012 to 2016 (expressed as a percent of the 2012 vote count).
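The two kinds of change used in these charts are easy to confuse, so here is a minimal Python sketch of the difference. The function names and the sample numbers are mine, made up for illustration; they are not taken from the election data.

```python
def share_change_pp(votes_then, total_then, votes_now, total_now):
    """Change in vote share, in percentage points."""
    share_then = 100.0 * votes_then / total_then
    share_now = 100.0 * votes_now / total_now
    return share_now - share_then


def count_change_pct(votes_then, votes_now):
    """Change in raw vote count, as a percent of the earlier count.

    Returns None when the earlier count is zero, because the percentage
    change is then undefined (the Oklahoma other-party case in 2012).
    """
    if votes_then == 0:
        return None
    return 100.0 * (votes_now - votes_then) / votes_then


# Made-up example: 500 of 1,000 votes in 2012, then 495 of 1,100 votes in 2016.
pp = share_change_pp(500, 1000, 495, 1100)  # share falls from 50% to 45%
pct = count_change_pct(500, 495)            # raw count falls by 1%
print(f"{pp:+.1f} percentage points, {pct:+.1f} per cent")
```

In this made-up case the party loses only 1 per cent of its raw votes but 5 percentage points of vote share, because the total number of votes cast grew.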

For the Republicans, the story is a mixed bag: votes were up in the mid-west (though the result there was close), but down in Texas and the west of the country.

While the Michigan count is still not declared (and is now subject to a recount), the indicative results give Michigan to the Republicans.

Saturday, November 19, 2016

Last week some 128.5 million votes had been counted. The count is now at 132.3 million votes (and we are still counting). It continues to look like six states will flip from their 2012 party preference.

The two-party swings are now posting as follows (in percentage points) ...

Update

In addition to the change in the raw vote count (above), we can look at the change in vote share for Democrats, Others and Republicans in percentage points in 2016 compared with 2012. In these charts I have not reported Utah (which was a special case). We can see that the Democrat vote share was mostly down across the country, the Other vote share was up and it's an up-and-down mixed-bag for the Republicans.

Sunday, November 13, 2016

I have read a lot of garbage about what the 2016 US election means and signifies. In this post I want to look at what the raw numbers suggest.

While the counting has not finished, it looks like five and possibly six states will flip from Democratic in the 2012 presidential election to Republican in 2016: Florida, Pennsylvania, Ohio, Michigan, Wisconsin and Iowa. The states where the count is still in doubt are New Hampshire and Michigan.

In this first chart, I have plotted the likely state winners according to American sensibilities: the states which Trump/Republicans won are red and the states Clinton/Democrats won are blue. The states that have flipped from one side of politics to the other are in a darker hue.

The biggest swings to the Republicans (on a two-party basis) were in the industrial upper mid-west. This suggests the economy (and the challenges of managing economic change in those states previously heavily dependent on the industrial/manufacturing sector) may have driven the Trump win. There is an irony here: Bill Clinton won the 1992 election with the catch-phrase: it's the economy, stupid.

The bigger swings by state follow. In this table, a positive swing is to the Republicans, a negative swing is to the Democrats. In Utah, the measurement of a two-party swing was complicated by Evan McMullin, an independent and former Republican, who took votes from Trump.
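For readers unfamiliar with the two-party swing, here is a minimal Python sketch. It assumes the usual definition: the Republican share of the combined Republican and Democrat vote in 2016, minus the same share in 2012. The function names and vote counts are made up for illustration.

```python
def two_party_share(rep_votes, dem_votes):
    """Republican share of the two-party vote, in per cent."""
    return 100.0 * rep_votes / (rep_votes + dem_votes)


def two_party_swing(rep_2012, dem_2012, rep_2016, dem_2016):
    """Swing in percentage points; positive is to the Republicans."""
    return (two_party_share(rep_2016, dem_2016)
            - two_party_share(rep_2012, dem_2012))


# Made-up counts: 450 v 550 in 2012, then 480 v 480 in 2016.
swing = two_party_swing(rep_2012=450, dem_2012=550, rep_2016=480, dem_2016=480)
print(f"{swing:+.1f} percentage points")
```

Because votes for other candidates are left out of the calculation altogether, a strong third candidate (as in Utah) can make the two-party swing a poor guide to what actually happened.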

If the economy was the distal cause, the immediate factor I find most compelling in explaining the election outcome was the decline in raw Democratic votes in 2016. Put simply, Clinton was not as attractive to voters in 2016 as Obama was in 2008 or 2012. This outcome is driven less by a decline in turn-out and more by an increase in votes for the Other parties. In the US first-past-the-post voting system, these third party votes are effectively wasted. Of note: Maine voted on a referendum to introduce ranked-choice voting for the next Presidential election. (Ranked-choice voting is what we have in Australia).

Before looking at voting patterns by party, let's look at the change in overall turnout between 2012 and 2016. At this point in the count some 128.5 million votes have been counted. In 2012, there were 129.2 million votes counted in total. I expect the final 2016 vote count will exceed the 2012 count.

In the next three charts, we will look at percentage changes in the raw vote numbers by state for Republicans, Democrats and Others. The most significant thing to notice here is the dramatic increase in votes for other parties. But also of note, the decline of the Democratic vote in many states, compared with the neutral or slight growth in Republican votes.

The table of percentage changes in vote count, by state, as set out in the above charts follows.

This table tells a fairly consistent story of votes leaking from Democrats to the Other parties. This raises the interesting question of whether Bernie Sanders would have done a better job at holding the flow of votes. My suspicion (without supporting evidence) is that he would have held more votes lost to others on the left, but may have lost more votes to Trump on the right.

Some have contended that Trump doesn't have a real mandate because he did not get more than 50 per cent of the vote. Arguments can be made about the fairness of the US voting system: particularly as it looks like Clinton won more of the popular vote but not the electoral college vote. However, these arguments are not resolvable. Fairness, like beauty, is in the eye of the beholder. The American founders decided to weight their voting system to those who are engaged (through voluntary voting) and to those who live in the less populous states (by giving all states two electoral college votes, and then one or more votes weighted to the population of the state). They also decided on a first-past-the-post system for counting votes. While compelling arguments can be made for and against each of these design elements, ultimately the Presidential election was conducted under the rules accepted by the American people.

Wednesday, November 9, 2016

We have had a few polling fails recently in the Anglo-sphere. Two United Kingdom examples quickly come to mind: the General Election in 2015, where the polls predicted a hung parliament, and Brexit in 2016, in which remaining in the EU was the predicted winner. Closer to home we had the Queensland state election in 2015, in which the polls foreshadowed a narrow Liberal National Party win.

The New York Times had the average of the polls with Clinton on 45.9 per cent to Trump's 42.8 per cent (+3.1 percentage points). The NYT gave Clinton an 84 per cent chance of winning the Electoral College vote.

FiveThirtyEight.com had the average of the polls with Clinton on 48.5 to Trump's 44.9 per cent (+3.6). FiveThirtyEight gave Clinton a 71.4 per cent chance of winning the Electoral College vote.

The Princeton Election Consortium had Clinton ahead of Trump with +4.0 ± 0.6 percentage points. PEC gave Clinton a 93 per cent chance of winning the Electoral College vote.

While the count is not over, the current tally has Clinton ahead in the national popular vote by +1.1 percentage points, but losing the Electoral College vote. The most likely Electoral College tally looks like Trump with 305 Electoral College votes to Clinton's 233.
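To put the size of the miss in one place, here is a small Python sketch that subtracts the provisional popular-vote lead from each aggregator's final polling lead. The figures are the ones quoted above; the variable and function names are mine.

```python
# Clinton's lead over Trump in each polling average (percentage points).
poll_leads = {
    "New York Times": 3.1,
    "FiveThirtyEight": 3.6,
    "Princeton Election Consortium": 4.0,
}

# Clinton's provisional lead in the national popular vote (count not final).
actual_lead = 1.1


def polling_miss(poll_lead, actual_lead):
    """How far the polls overstated Clinton's lead, in percentage points."""
    return round(poll_lead - actual_lead, 1)


misses = {name: polling_miss(lead, actual_lead)
          for name, lead in poll_leads.items()}

for name, miss in misses.items():
    print(f"{name}: overstated Clinton's lead by {miss:.1f} points")
```

On these provisional numbers the national averages overstated Clinton's lead by between 2.0 and 2.9 percentage points. This says nothing about the state-level polls, which are what decide the Electoral College.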

So today's big question: Why such a massive polling fail?

It will take some time to answer this question with certainty. However, I have a couple of guesses.

My first guess would be the social desirability bias. This is sometimes referred to as the "shy voter problem" or the Bradley effect. At the core of this polling problem, some voters will not admit their actual voting preference to the pollster because they fear the pollster will negatively judge that preference. It is not surprising that such a controversial figure as Donald Trump would prompt issues of social desirability in polling. Elite opinion was against Trump. Clinton labeled Trump supporters as "deplorable". No-one wants to be in that basket. Pollsters might also look at Latino voters in Florida, who appear to have voted for Trump in larger numbers than expected.

The second area where I suspect pollsters will look is their voter turn-out models: who actually voted, compared with who told pollsters they would vote. This was a very different election to the previous two Presidential elections. Turn-out models based on previous elections may have misdirected the polling results (particularly on the basis of race and particularly in the industrial mid-west).

A final thing that might be worth looking at is herding. The final polls were close, perhaps remarkably close. This may have been natural, or it may have resulted from pollsters modulating their final outputs to be similar to each other.

Sunday, November 6, 2016

If we take all nine Federal elections since 1993 (that is the Federal elections in 1993, 1996, 1998, 2001, 2004, 2007, 2010, 2013 and 2016) we can look at the distribution of individual seat swings around the national swing for each election. This information is useful for Monte Carlo simulations of future elections. The summary statistics for this analysis follow.

Let's start with a normal distribution with a mean of -0.014 and a standard deviation of 3.22 percentage points (as suggested by the raw data). It yields a plot as follows (where the normal curve is fitted to the data in red).

However, there is a potential problem with this normal distribution. There are too many outliers for a normal distribution. In the area between -3 and +3 standard deviations from the mean we would expect to see 99.73 per cent of all observations if the data were normally distributed. We would only expect to see 0.27 per cent of the observations outside of this range. We actually have 0.82 per cent of our observations outside of this range.

We also have more observations bunched in the middle of the distribution. For a normal distribution, you would expect to find 68.27 per cent of the observations between -1 and +1 standard deviations from the mean. We have 70.19 per cent of our observations in the middle of the distribution.
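The benchmark percentages quoted above come straight from the normal curve, and the same check can be run on any list of swings. Here is a minimal Python sketch using only the standard library. The function names are mine, and the use of the population standard deviation is an assumption, not necessarily what the original analysis used.

```python
from statistics import NormalDist, fmean, pstdev


def expected_share_within(k):
    """Per cent of a normal distribution within k standard deviations."""
    std_normal = NormalDist()  # mean 0, standard deviation 1
    return 100.0 * (std_normal.cdf(k) - std_normal.cdf(-k))


def observed_share_within(values, k):
    """Per cent of the observations within k standard deviations of their mean."""
    mu = fmean(values)
    sigma = pstdev(values)  # population standard deviation (an assumption)
    inside = sum(1 for v in values if abs(v - mu) <= k * sigma)
    return 100.0 * inside / len(values)


print(f"within 1 sd: {expected_share_within(1):.2f} per cent")
print(f"within 3 sd: {expected_share_within(3):.2f} per cent")
```

Running `observed_share_within` over the adjusted seat swings with `k=1` and `k=3` would give the figures to set against these benchmarks.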

Not surprisingly therefore, the excess kurtosis statistic for the observations is positive: 1.15 (using Fisher’s definition of kurtosis, where the normal curve has a kurtosis of zero). From our sample, it would appear that the distribution of seat swings around the national swing is a touch leptokurtic. More observations than normal are clustered around the mean, but there is also a higher probability than normal of substantial outliers occurring from time to time (which is sometimes referred to as a fat tail risk distortion).
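For anyone wanting to reproduce the statistic, here is a minimal Python sketch of excess kurtosis under Fisher's definition. It is the simple moment-based version with no small-sample bias correction; I am assuming that choice here, and the function name is mine.

```python
from statistics import fmean


def excess_kurtosis(values):
    """Fourth central moment over squared variance, minus 3 (Fisher)."""
    mu = fmean(values)
    m2 = fmean([(v - mu) ** 2 for v in values])  # second central moment
    m4 = fmean([(v - mu) ** 4 for v in values])  # fourth central moment
    return m4 / (m2 ** 2) - 3.0


# A flat run of numbers has thinner tails than a normal curve,
# so its excess kurtosis is negative.
print(excess_kurtosis([1, 2, 3, 4, 5]))
```

A positive result, such as the one reported for the seat swings, indicates fatter tails than the normal curve.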

Given the fat tails, I tried a couple of other distributions to see if they would provide a better fit than the Gaussian normal distribution. My attempt to fit a Cauchy distribution - see next chart - yielded a poorer fit (as measured by the sum of squared errors).
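As a rough illustration of a sum-of-squared-errors comparison, the Python sketch below builds a histogram, evaluates each candidate density at the bin centres, and adds up the squared gaps. Everything in it is an assumption for illustration: the data are synthetic (drawn from a normal curve, not the AEC data), the parameter estimates are crude, and the function names are mine. It is not the fitting routine used for the charts.

```python
import math
import random
import statistics


def histogram_density(data, bins, lo, hi):
    """Return bin centres and density heights for data on [lo, hi)."""
    width = (hi - lo) / bins
    counts = [0] * bins
    for x in data:
        if lo <= x < hi:
            counts[min(int((x - lo) / width), bins - 1)] += 1
    centres = [lo + (i + 0.5) * width for i in range(bins)]
    heights = [c / (len(data) * width) for c in counts]
    return centres, heights


def cauchy_pdf(x, loc, scale):
    """Density of a Cauchy distribution with the given location and scale."""
    z = (x - loc) / scale
    return 1.0 / (math.pi * scale * (1.0 + z * z))


def sum_squared_errors(centres, heights, pdf):
    """Sum of squared gaps between the histogram and a candidate density."""
    return sum((h - pdf(c)) ** 2 for c, h in zip(centres, heights))


# Synthetic stand-in for the seat swings (NOT the real data).
rng = random.Random(1)
swings = [rng.gauss(0.0, 3.22) for _ in range(5000)]
centres, heights = histogram_density(swings, bins=40, lo=-15.0, hi=15.0)

# Crude estimates: moments for the normal, quartiles for the Cauchy
# (a Cauchy's quartiles sit one scale unit either side of its location).
normal = statistics.NormalDist(statistics.fmean(swings),
                               statistics.pstdev(swings))
q1, median, q3 = statistics.quantiles(swings, n=4)
sse_normal = sum_squared_errors(centres, heights, normal.pdf)
sse_cauchy = sum_squared_errors(
    centres, heights, lambda x: cauchy_pdf(x, median, (q3 - q1) / 2.0))
print(f"normal SSE {sse_normal:.5f}, Cauchy SSE {sse_cauchy:.5f}")
```

Because the synthetic data here are normal by construction, the normal curve wins easily. The point is only to show the mechanics of the comparison.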

However, Student's t-distribution (next chart) was an improvement on the normal distribution in terms of fit. When it comes to a Monte Carlo simulation of election outcomes for individual seats around a national swing, the t-distribution looks the most promising. The parameters for this t-distribution are: df=10.5318772642; loc= -0.0531408340653; and scale=2.89669817523. I suspect it is simply an artifact of the raw data that the location parameter is not zero.
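A Monte Carlo draw from this fitted t-distribution might look like the Python sketch below. It uses the parameters quoted above (rounded) and only the standard library, building each t draw from a normal draw and a chi-squared draw. The function names, the 2.0 point national swing and the 3.0 point margin are hypothetical values chosen for illustration.

```python
import math
import random

# Fitted parameters quoted in the post (rounded here).
DF, LOC, SCALE = 10.5319, -0.0531, 2.8967


def draw_seat_deviation(rng):
    """One draw of a seat's deviation from the national swing."""
    z = rng.gauss(0.0, 1.0)
    chi2 = rng.gammavariate(DF / 2.0, 2.0)  # chi-squared with DF degrees of freedom
    return LOC + SCALE * z / math.sqrt(chi2 / DF)


def simulate_seat_swings(national_swing, n_draws, rng):
    """Simulated seat swings scattered around a given national swing."""
    return [national_swing + draw_seat_deviation(rng) for _ in range(n_draws)]


def chance_seat_falls(margin, national_swing, n_draws, rng):
    """Share of simulations in which the swing exceeds the seat's margin."""
    swings = simulate_seat_swings(national_swing, n_draws, rng)
    return sum(1 for s in swings if s > margin) / n_draws


rng = random.Random(2016)
# Hypothetical seat on a 3.0 point margin facing a 2.0 point national swing.
p = chance_seat_falls(margin=3.0, national_swing=2.0, n_draws=10000, rng=rng)
print(f"estimated chance the seat falls: {p:.2f}")
```

A full election simulation would repeat this for every seat with its own margin and a national swing drawn from the polls. This sketch treats one seat in isolation.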

The relationship between the t and normal distributions can be seen in the next chart.

It is worth considering the outliers that lie beyond three standard deviations from the mean. The complete list follows. The critical column is the Adjusted Swing column, which is the swing for the seats minus the national swing (to Labor). There are interesting stories with many of these unusually large swings.

The Python code for this analysis is pretty rough. A fair bit of effort went into munging the raw data from the Australian Electoral Commission (AEC) into something that I could work with. The data format changed a number of times over the years. Swings prior to the 2004 election were always reported from the perspective of the Labor party. From the 2004 election, they are reported from the perspective of the government of the day (going into the election). State abbreviations were not used in the earlier data files from the AEC. There is also quite a bit of arcane code dedicated to statistical tests.
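To give a flavour of the munging, here is a minimal Python sketch of one step: putting every swing on a common "to Labor" basis and then subtracting the national swing. This is not the original code. The names are hypothetical, and it assumes the swings are two-party-preferred, so a swing to a Coalition government is simply the negative of a swing to Labor.

```python
# Which side was in government going into each election from 2004 onwards.
GOVERNMENT_GOING_IN = {
    2004: "Coalition",
    2007: "Coalition",
    2010: "Labor",
    2013: "Labor",
    2016: "Coalition",
}


def swing_to_labor(year, reported_swing):
    """Convert a reported seat swing to a swing to Labor."""
    if year < 2004:
        return reported_swing  # already reported from Labor's perspective
    if GOVERNMENT_GOING_IN[year] == "Labor":
        return reported_swing
    return -reported_swing  # a swing to the Coalition is a swing from Labor


def adjusted_swing(year, reported_seat_swing, national_swing_to_labor):
    """Seat swing minus the national swing, both measured to Labor."""
    return swing_to_labor(year, reported_seat_swing) - national_swing_to_labor


# Made-up example: a seat reported as +2.5 to the government in 2007,
# in a year with a hypothetical +5.0 national swing to Labor.
print(adjusted_swing(2007, 2.5, 5.0))
```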

Saturday, November 5, 2016

For the first time at the 2016 election, the Australian Electoral Commission has provided a two party preferred (TPP) count for each seat by vote type. This affords another window on the contention that the final count is more favourable to the Coalition when compared with the count of ordinary votes, which is completed on election night.

You can see the earlier analysis here and here, which came to the same conclusion over the past five Federal elections, based on the two candidate preferred (TCP) counts for each seat by vote type.

If we look at the Coalition's TPP percentages (summed across all seats, for each vote type) it received:

In the 2016 election, the Coalition lost the TPP count on election night (ordinary votes in the above list). But by the final count it had improved its position by 0.404 percentage points to win the final TPP count. The distribution of the Coalition's bias (compared with ordinary votes) across seats can be seen in the following charts.
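The calculation behind these percentages can be sketched in a few lines of Python: sum the Coalition and Labor TPP votes across seats for each vote type, convert to a percentage, and compare each vote type with ordinary votes. The record layout, field names and numbers below are made up for illustration; they are not the AEC's file format or results.

```python
def coalition_tpp_by_vote_type(rows):
    """Coalition TPP per cent for each vote type, summed across seats."""
    totals = {}
    for row in rows:
        tally = totals.setdefault(row["vote_type"], [0, 0])
        tally[0] += row["coalition"]
        tally[1] += row["labor"]
    return {vote_type: 100.0 * coalition / (coalition + labor)
            for vote_type, (coalition, labor) in totals.items()}


def bias_vs_ordinary(tpp):
    """Percentage point gap between each vote type and ordinary votes."""
    return {vote_type: share - tpp["ordinary"]
            for vote_type, share in tpp.items()}


# Made-up counts for two hypothetical seats.
rows = [
    {"seat": "A", "vote_type": "ordinary", "coalition": 500, "labor": 500},
    {"seat": "B", "vote_type": "ordinary", "coalition": 400, "labor": 600},
    {"seat": "A", "vote_type": "postal", "coalition": 60, "labor": 40},
    {"seat": "B", "vote_type": "postal", "coalition": 55, "labor": 45},
]

tpp = coalition_tpp_by_vote_type(rows)
print(tpp)
print(bias_vs_ordinary(tpp))
```

Applying `bias_vs_ordinary` seat by seat, rather than to the national totals, would give the per-seat distribution of the bias.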