Based on Actual Math

Monday, November 7, 2016

We're here. 18 months, thousands of man hours, hundreds of maps, and thanks to Donald Trump, more than the usual dose of bigotry and ignorance. But we got through it, and the election is tomorrow.

Before we get to the Official Prediction, I want to talk about how specific state outcomes would move the odds. As it stands now, Clinton is 90% to win and Trump is 9% (remainder is a tie). Here's how some individual state outcomes would change those odds:

I haven't been shy in sharing my contempt for Trump this year, but in general I try to stick to my wheelhouse: data and analytics. However, on the eve of Election Day, I feel compelled to write about why Clinton deserves your vote.

But this election is too historic, important, and horrifying for me to not at least share a little of why I'm feeling good about voting both for Hillary Clinton and against Donald Trump; I'll keep it short. This isn't a policy discussion; it's a discussion about character, because that's what this election has become about. It's what's important to us as Americans, and what it means to be an American.

For Hillary Clinton
When I watch Hillary Clinton, I'm in awe. No matter the issue, she can discuss it. I've watched her be asked questions on every possible policy, and she always has an answer that shows thoughtfulness, preparation, and empathy. Hillary Clinton has spent 30 years in and around Washington, grinding it out, trying to help people. She doesn't necessarily score touchdowns or make flashy plays, but she's in the trenches blocking and tackling, then obsessively studying game film so next time out she can be just a little better.

Against Donald Trump
Donald Trump is the worst major party nominee in this country's history. By a lot. The following is a list of things that are a) off the top of my head, b) true about Donald Trump and c) would each be disqualifying for the presidency.

He bragged about committing sexual assault, and in response many women came forward to say he'd actually sexually assaulted them

He has encouraged proliferation of nuclear weapons, and didn't know what the nuclear triad was

He proposed banning Muslims (including Americans) from entering the USA

He lies, literally all the time, about everything

He's already whining about how if he loses, the election will have been rigged

He's indicated he might not concede if he loses

He launches hideous and false attacks at private citizens who say mean things about him

He wants to "open up libel laws" to allow politicians to sue journalists who write mean things about him

Imagine every presidential nominee as an NFL player. Some are stand-out stars (Obama), some barely get on the field (GWB), but all are elite. All NFL players are among the best in the world at football.

If nominees are NFL players, Donald Trump is the guy watching games on his couch.

In the last few days there has been a lot of ink spilled over 538's projection which (ludicrously) gives Trump a 30% chance to win tomorrow. My model gives him markedly lower odds. I wanted to take a moment to highlight the key differences between my model and the 538 model, and explain why I think my methodological choices were better.

How national polls are handled

To quote Jed Bartlet from The West Wing: "There are times where we're 50 states and times when we're one country." National polls can be thought of as 50 individual state polls, and modeled accordingly. I wrote a very thorough post on this technique here.

FiveThirtyEight uses trends in the national polling to adjust state polls. The following represents my best understanding of its methodology:

If, from last week to this week, the national poll average has moved 2 points in Trump's favor, FiveThirtyEight adjusts all individual state polls from last week to be 2 points more favorable to Trump. For example, a poll from a week ago in Florida showing Clinton +1 would be adjusted to Trump +1.
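To make the arithmetic concrete, here is a minimal sketch of that kind of trend adjustment. It only encodes the description above; the function name trend_adjust and the sign convention (margins are Clinton minus Trump) are my own assumptions, not FiveThirtyEight's actual code.

```python
def trend_adjust(margin, national_shift):
    """Shift an older state poll by the change in the national average.

    Margins are Clinton minus Trump, in points, so a negative
    national_shift means the national average moved toward Trump.
    """
    return margin + national_shift


# The Florida example: a week-old Clinton +1 poll, with the national
# average having moved 2 points toward Trump since then.
adjusted = trend_adjust(1.0, -2.0)
print(adjusted)  # -1.0, i.e. Trump +1
```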

This over-amplifies the voices in those latest national polls. At any given time those "latest national polls" which drive the trend have a few dozen or few hundred voters from any given state. FiveThirtyEight uses those voters to adjust the voice of the thousands of voters aggregated in the individual state polls.

How polls are weighted

Polls are voters. I weight polls by how many voters they include and how recently they were taken. This, plus my national poll technique, has caused a big national poll taken by NBC and SurveyMonkey to be weighted fairly heavily on my model. I've written about that more extensively here.
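As a rough illustration of weighting by sample size and recency, the sketch below gives each poll a weight proportional to its number of voters, discounted by age. The exponential decay, the 14-day half-life, and the second poll in the example are assumptions made up for illustration; they are not the values my model uses.

```python
from dataclasses import dataclass


@dataclass
class Poll:
    margin: float      # Clinton minus Trump, in points
    sample_size: int   # number of voters surveyed
    age_days: float    # days since the poll was taken


def poll_weight(poll, half_life_days=14.0):
    # Weight is proportional to sample size and halves every half_life_days.
    return poll.sample_size * 0.5 ** (poll.age_days / half_life_days)


def weighted_margin(polls, half_life_days=14.0):
    # Weighted average of the poll margins.
    weights = [poll_weight(p, half_life_days) for p in polls]
    return sum(w * p.margin for w, p in zip(weights, polls)) / sum(weights)


polls = [
    Poll(margin=4.0, sample_size=1142, age_days=0),   # large, fresh poll
    Poll(margin=-1.0, sample_size=600, age_days=14),  # smaller, two weeks old
]
print(round(weighted_margin(polls), 2))
```

The fresh, larger poll dominates the average, which is the same effect that makes a very large recent national poll carry so much weight.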

FiveThirtyEight weights their polls based on age and (I think) their rating of the pollster.

Level of uncertainty

I've calibrated the level of national uncertainty based on prior elections, and I've calibrated the individual state uncertainty based on an unbiased poll taken on Election Day.

I'm not certain how 538 has calibrated for uncertainty, but I think they are overestimating it. Here are a few examples of states from FiveThirtyEight's model that don't pass the sniff test:

Michigan: Trump has led in 3 out of 86 polls, yet 538 gives him a 21% chance to win

Wisconsin: Trump has led in 3 out of 80 polls (and not in a single poll since September 8th) yet 538 gives him a 16% chance to win

Minnesota: Trump hasn't led in a single poll in 538's database, and Clinton has led many polls by double digits, yet 538 gives him a 15% chance to win

Two-way vs. Four-way polling

I use two-way polls (Clinton/Trump), while 538 uses four-way polls (Clinton/Trump/Johnson/Stein). I've written here on why I think two-way polls are the appropriate choice. In short, it's functionally a two-person election.

The new baby I mentioned in my previous post was born with some health issues and has needed a lot more of my time and mental energy than I could have possibly predicted. I have been updating election data on the left-hand side of the page, but I haven't had the energy to do much else, including catch the bug I'm about to discuss.

At some point in the last few weeks, a bug was introduced into the model that essentially made state outcomes independent. This isn't how the model was designed and is incorrect (state outcomes are dependent - if Clinton is doing well in one state, it's likely she's doing well in another). I've rectified that, and now the published output shows the true output of the model, rather than the result of the bug.
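To show why independence matters, here is a toy simulation comparing independent state errors with errors that share a national component. Everything in it is hypothetical: the five-state map, the margins, and the error sizes are invented for illustration and are not taken from my model. Both runs give each state the same total error (standard deviation of 5 points); only the correlation differs.

```python
import random

# Hypothetical toy map: (electoral votes, expected Clinton margin in points).
STATES = [(20, 3.0), (18, 2.0), (29, 1.0), (15, 2.5), (10, 4.0)]
VOTES_TO_WIN = 47  # majority of the 92 toy electoral votes


def win_probability(shared_sd, state_sd, trials=20000, seed=0):
    """Estimate Clinton's win probability by simulation.

    shared_sd is the size of a polling error common to every state;
    state_sd is the size of each state's own independent error.
    """
    rng = random.Random(seed)
    wins = 0
    for _ in range(trials):
        national_error = rng.gauss(0.0, shared_sd)
        votes = 0
        for electoral_votes, margin in STATES:
            if margin + national_error + rng.gauss(0.0, state_sd) > 0:
                votes += electoral_votes
        if votes >= VOTES_TO_WIN:
            wins += 1
    return wins / trials


independent = win_probability(shared_sd=0.0, state_sd=5.0)
correlated = win_probability(shared_sd=4.0, state_sd=3.0)
print(independent, correlated)
```

With these made-up numbers the independent version gives the favorite the higher win probability, because independent state errors partly cancel each other out, while a shared error can sink every state at once.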

I've also reconstructed what the model's true output would have been during those two weeks, and put the results on the graph below. Clinton's odds to win peak around 99% before dropping to their current levels.

Saturday, October 15, 2016

My wife and I just had a baby. I'll keep updating the election model because that's quick and easy, but college football is much more labor intensive and won't be updated this week. Hoping to be back next week.

Sunday, October 9, 2016

Remember 3 days ago when I wrote that Clinton had quite a good week and was now an overwhelming favorite? That was roughly 18 hours before the now infamous Trump tapes were released, where Trump brags about committing sexual assault, so I imagine he may look back on the days when he was just calling Alicia Machado fat as the good old days.

The Vox string I linked to above contains many insightful takes. The two that most resonated with me were:

I don't need to invoke my wife or my daughters to feel horrified about his comments; they should horrify all of us

There's no chance Trump drops out, and when attacked he attacks back tenfold (he's nothing if not predictable)

But my own take is for those saying: "Why didn't you abandon Trump before? Why was this the last straw after he did 55 other putrid things?"

I agree with you, and there should be a reckoning for politicians who are jumping ship now, probably because it's politically convenient. They gave us Trump and, I hope, there will be consequences for that. But this news is very bad for Trump. It makes him even less likely to become president, and that is very good. So instead of lecturing people on their hypocrisy, let's stop and ask what it is we're trying to achieve, then take actions consistent with that.

Thursday, October 6, 2016

It's been quite the good week for Hillary Clinton. Not only did she crush the debate, but Trump has had several awful news cycles since (some self-inflicted, some not). The model has moved sharply in Clinton's favor this week, and now considers her the overwhelming favorite. If you're curious as to why, skip to the NBC/SM National Poll below, but the short version is:

A very large national poll came out that's favorable to Clinton

The state polls are also very favorable to Clinton

Time is running out

My model incorporates information from national polls into its poll aggregation in a way other models don't. This gives it more information to work with, and makes it more confident.

Polls

Nearly all of the key post debate polls have either been neutral or good for Clinton. Here's a sampling of them:

Good for Trump

Gravis National poll shows Clinton and Trump tied

Rasmussen national poll shows Trump +2

FL poll showing Trump +1

Good for Clinton

NBC/SM National poll showing Clinton +6

Reuters National poll showing Clinton +7

Economist/YouGov national poll showing Clinton +5

PA polls showing Clinton +5, +12

CO poll showing Clinton +11

FL Polls showing Clinton +7, Clinton +5

OH poll showing Clinton +2

Neutral

LAT/USC tracking poll holding at Trump +4

Lots of polls in states that won't impact the election

The NBC/SM National Poll

If you're not familiar with how state and national polls interact in my model, you can read about it here. The short version is that national polls are, at their core, individual polls of 50 states, and they can and should be treated as such.

NBC and SurveyMonkey release a large online-only national poll once per week. The sample size is huge (YUGE). The 12 NBC/SM polls released since July have had an average sample size of 17,000 voters, much larger than the typical national poll of 1,100 voters. Parsing that large sample out to 50 states ended up having a material impact on many of those states. For example, the model is treating the most recent NBC/SM poll the same as if each of the following state polls were added to the model:

660 voter poll in OH showing Clinton +5

390 voter poll in AZ showing Trump +1

180 voter poll in IA showing Clinton +7

581 voter poll in NC showing Clinton +3

163 voter poll in NV showing Clinton +8

1142 voter poll in FL showing Clinton +4

For reference, the average sample size in all of my state polls is 900.

Adding each of those polls to the model, all at once, causes significant movement in the model, especially when they tell a different story than the one we're working with (in the case of the most recent poll, one more favorable to Clinton). Pair that with the influx of excellent state polls for Clinton, and the model feels very good about her chances.
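The sample-size half of that parsing can be sketched as follows. This is a simplified illustration, not the model's actual code: it assumes respondents are allocated in proportion to each state's share of the electorate, the shares below are made-up round numbers rather than real turnout data, and it does not attempt to derive the state-level margins.

```python
def split_national_sample(national_sample, state_weights):
    """Allocate a national poll's respondents to states.

    state_weights maps each state to its relative share of the
    electorate; the shares are normalized before allocating.
    """
    total = sum(state_weights.values())
    return {
        state: round(national_sample * weight / total)
        for state, weight in state_weights.items()
    }


# Hypothetical shares in percent (illustrative only).
weights = {"FL": 7, "OH": 4, "NV": 1, "rest": 88}
implied = split_national_sample(17000, weights)
print(implied)
```

Even a state with a 1% share ends up with a few hundred respondents from a 17,000-voter poll, which is why a sample this large can move individual states.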