Life through nerd-colored glasses

Nate Silver made a lot of testable predictions about the election on his 538 blog. In particular, he predicted the winner of each state (and DC), and placed a confidence percentage on each prediction. He did the same for senate races. In total, that’s 84 predictions with confidence estimates.

As in 2008, his predictions were phenomenal. Some of the races are not yet decided, but it looks like all of his presidential predictions were correct (Florida is not yet called as I write this), as well as all but perhaps 2 of his senate predictions (Democratic candidates are unexpectedly leading the Montana and North Dakota races). There are plenty of pundits who were predicting very different results.

Granted, while 82-84 correct predictions sounds (and is) amazing, many of those were no-brainers. Romney was always going to win Texas, just as Obama was a sure bet in New York. A slightly harder test is whether his uncertainty model is consistent with the election outcome.

Let’s simulate 1000 elections. For each race, we assume (for the moment) Silver’s uncertainties are correct. That is, if he called a race at a confidence of x%, then we assume the prediction should be wrong (100 - x)% of the time.
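Here is a minimal sketch of that simulation in Python. The confidence values are made up for illustration (they are not Silver’s actual numbers), the function name is hypothetical, and each race is treated as independent of the others, which is a simplifying assumption.

```python
import random

def simulate_errors(confidences, n_sims=1000, seed=0):
    """Return the number of wrong calls in each simulated election.

    confidences: per-race probabilities (0..1) that the call is right.
    Races are drawn independently (an assumption of this sketch).
    """
    rng = random.Random(seed)
    results = []
    for _ in range(n_sims):
        # A call made at confidence c is wrong with probability 1 - c.
        errors = sum(1 for c in confidences if rng.random() > c)
        results.append(errors)
    return results

# Hypothetical confidences for 84 races, NOT Silver's published values.
example_confidences = [0.99] * 60 + [0.9] * 14 + [0.75] * 6 + [0.6] * 4
error_counts = simulate_errors(example_confidences)
mean_errors = sum(error_counts) / len(error_counts)
```

Each entry of `error_counts` is one simulated election, so the spread of that list is what the model itself says about how many misses to expect.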

Number of errors in 1000 simulated elections (red, shown with jitter) as a function of prediction confidence level

This plot shows, for each simulated election (in red), the total number of mis-predicted races with prediction confidences greater than the threshold on the x axis. The left edge of the plot gives the total number of mis-predictions (at any confidence). The half-way point shows the number of errors for predictions with confidences greater than 75%.

The lines all go down with increasing confidence — that is, there are fewer expected errors for high-confidence predictions (like Texas or New York). I’ve added some random jitter to each line, so they don’t overlap so heavily. The grey bands trace the central 40% and 80% of the simulations. The thick black line is the average outcome.
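The curves and bands described above could be computed roughly as follows. This is a sketch under my own assumptions: the function names are hypothetical, races are independent, and the bands use a simple nearest-rank percentile (10th to 90th for the central 80%, 30th to 70th for the central 40%).

```python
import random

def errors_above_thresholds(confidences, thresholds, rng):
    """One simulated election: errors among races called above each threshold."""
    # Keep the confidence of every race that was called wrong in this draw.
    wrong = [c for c in confidences if rng.random() > c]
    return [sum(1 for c in wrong if c > t) for t in thresholds]

def percentile(sorted_values, q):
    """Nearest-rank percentile (q in 0..100) of an already sorted list."""
    idx = round(q / 100 * (len(sorted_values) - 1))
    return sorted_values[idx]

def summarize(confidences, thresholds, n_sims=1000, seed=0):
    """Mean and central 40% / 80% bands of the error count at each threshold."""
    rng = random.Random(seed)
    sims = [errors_above_thresholds(confidences, thresholds, rng)
            for _ in range(n_sims)]
    summary = []
    for j, t in enumerate(thresholds):
        col = sorted(s[j] for s in sims)
        summary.append({
            "threshold": t,
            "mean": sum(col) / n_sims,
            "band80": (percentile(col, 10), percentile(col, 90)),
            "band40": (percentile(col, 30), percentile(col, 70)),
        })
    return summary

# Hypothetical confidences, for illustration only.
toy_confidences = [0.99] * 60 + [0.9] * 14 + [0.75] * 6 + [0.6] * 4
toy_summary = summarize(toy_confidences, [0.5, 0.7, 0.8, 0.95])
```

Because a race above a high threshold is also above every lower one, each simulated curve can only go down as the threshold rises, which matches the shape of the lines in the plot.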

This plot summarizes the number of mis-classifications you would expect from Nate Silver’s 538 blog, given his uncertainty estimates. A result that falls substantially above the grey bands would indicate too many mistakes, and too optimistic a confidence model. Lines below the bands indicate not enough mistakes, or too pessimistic a model.

If we assume that the North Dakota and Montana senate races end up as upsets, here is Nate Silver’s performance:

He did, in fact, do slightly better than expected (I doubt he’ll lose sleep over that). This result is broadly consistent with what we should expect if Silver’s model is correct. On the other hand, consider what happens if he ends up correctly predicting these two senate races. It’s unlikely that Nate Silver should have predicted every race correctly, given his uncertainty estimates (this happens in about 2% of simulated elections). It’s possible that Silver will actually tighten up his uncertainty estimates next election.
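As a cross-check on the simulated figure, the chance of a perfect record can also be written down directly: if the races are treated as independent (an assumption, since real races are correlated), it is the product of the individual confidences. A small sketch, with a hypothetical function name:

```python
import math

def prob_all_correct(confidences):
    """Probability that every call is right, assuming independent races."""
    return math.prod(confidences)

# Two coin-flip races: a perfect record happens one time in four.
two_tossups = prob_all_correct([0.5, 0.5])
```

Even a long list of very confident calls multiplies down quickly, which is why a clean sweep is surprising under the model’s own numbers.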

In any event, I think he knows what he’s talking about. I’m reminded of this clip (Nate Silver is Jeff Goldblum. The rest of the world is Will Smith)

The fact-checking site Politifact was quick to verify his assertion, and also provided a few caveats about the figure — namely, that Presidents probably don’t deserve as much credit or blame for this number as they are given. Nevertheless, I wanted to see the breakdown for myself.

The Bureau of Labor Statistics is great for data like this, and I appreciate that our government collects and distributes such data. I took a look at the non-government employment figures that Clinton’s claim is based on (this is the relevant table). First, the raw employment figure from 1961 until today:

US Employment (Non-Government, Seasonally-Adjusted)

Next, color-coded by the sitting president’s party

US Employment by president

Next, the difference in employment from the day each president took office

Jobs added or lost under Presidents since 1961
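For anyone who wants to reproduce this kind of curve, here is a minimal sketch of the baseline subtraction. The function name and the numbers are made up for illustration; they are not BLS data.

```python
def change_since_start(series, start_month):
    """Employment change relative to the first reading at or after start_month.

    series: list of (month, jobs) tuples sorted by month, month as 'YYYY-MM'.
    """
    in_term = [(m, jobs) for m, jobs in series if m >= start_month]
    if not in_term:
        return []
    baseline = in_term[0][1]  # employment level when the term begins
    return [(m, jobs - baseline) for m, jobs in in_term]

# Made-up numbers (thousands of jobs), for illustration only.
toy_series = [("2000-12", 100), ("2001-01", 102), ("2001-02", 101), ("2001-03", 105)]
toy_change = change_since_start(toy_series, "2001-01")
```

Every president’s curve then starts at zero, so the lines can be compared on the same axes.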

And shown on top of each other, with the net change:

Jobs added or lost by presidents since 1961

Given the current rhetoric, I was a little surprised at how similar President Obama’s line (lowest blue one) is to President Reagan’s (the highest red one). The turnaround under Obama’s presidency has been slower, but employment now seems to be improving at a rate comparable to that under Reagan and Clinton (highest line).

A few months ago (at the height of the fighting and deadlock over the national debt ceiling), I decided to investigate how partisan the United States Congress really is: that is, how often congresspeople vote along strict party lines. Perhaps I’m naive, but I find it frustrating that politicians seem so much less willing to compromise than other groups of people. I would prefer a government where concern for the common good takes priority over the culture wars.

With the 2012 elections approaching, I wanted to get a better handle on how willing different congresspeople are to compromise. Fortunately, the congressional voting history is public, and govtrack.us provides convenient access to both browse and download these data.

After downloading these data into a local SQL database, I decided to take the following approach:

For each vote since 2008, count up the total yes/no counts from the Democratic and Republican parties

Calculate the ‘net’ party vote for each of these. A value of 1 indicates a party is unanimously in favor of a motion, bill, etc. A value of -1 indicates unanimous disfavor. A value of 0 indicates a deadlock, with half of the party voting yes and half voting no.

Compare the net party votes for each party.
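The net party vote described in these steps works out to (yes - no) / (yes + no). A minimal sketch follows; the function names and the example tally are hypothetical, and abstentions are simply ignored, which is my assumption rather than something stated above.

```python
def net_party_vote(yes, no):
    """Net vote in [-1, 1]: 1 = unanimous yes, -1 = unanimous no, 0 = even split."""
    total = yes + no
    if total == 0:
        return None  # nobody from this party voted yes or no
    return (yes - no) / total

def net_votes(tally):
    """tally: {"D": (yes, no), "R": (yes, no)} for a single roll call."""
    return {party: net_party_vote(y, n) for party, (y, n) in tally.items()}

# Hypothetical roll call, for illustration only.
example = net_votes({"D": (180, 10), "R": (20, 200)})
```

Plotting the "D" value against the "R" value for every roll call gives one point per vote.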

Here’s one way to visualize this data, for the House of Representatives:

Each colored circle represents a single motion, bill, etc. The x/y locations give the net party vote for democrats/republicans. The color of the circle indicates, in cases where the majority opinion for each party differs, which party won (red = republican, blue = democrat, black = agreement among both parties). So, for example, all points in the upper right and lower left corners are black, since both parties agreed on a yes or no vote. The upper left and lower right corners are regions of disagreement between the parties. The histograms show the overall distribution of democrat (top horizontal graph) and republican (right vertical graph) votes, again color-coded by which party won the vote.
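The coloring rule can be sketched like this. The function name and the `passed` flag are hypothetical, and treating an exact 0 net vote as "not in favor" is an arbitrary choice of this sketch.

```python
def point_color(dem_net, rep_net, passed):
    """Color for one roll call, given each party's net vote and the outcome.

    dem_net, rep_net: net party votes in [-1, 1].
    passed: True if the motion carried, False otherwise.
    """
    dem_yes = dem_net > 0  # a net vote of exactly 0 counts as "not in favor"
    rep_yes = rep_net > 0
    if dem_yes == rep_yes:
        return "black"  # both party majorities agreed
    # The parties disagreed: the winner is the one whose side matches the outcome.
    return "blue" if dem_yes == passed else "red"
```

For example, a motion that Democrats mostly backed, Republicans mostly opposed, and that went on to fail would be drawn in red.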

I noticed a few interesting features here:

Very many votes are (nearly) unanimously accepted by both parties. I found this to be refreshing, as it suggests that Washington is not as deadlocked as the news may imply. Of course, many of these motions are hardly controversial. Take the title of one such resolution, passed in the House by a vote of 421-2: “On Passage – House – H.R. 2715 To provide the Consumer Product Safety Commission with greater authority and discretion in enforcing the consumer product safety laws, and for other purposes – Under Suspension of the Rules.” A good next step would be to filter out the most benign / procedural votes, to better see ideological divides.

The votes for both parties are clustered around unanimous support or rejection, with few points near the center of the graph. I wish there were more disagreement within each party; dissenting opinions generally seem like a good idea.

The Republican party indeed has been more of a “Party of No” lately, with a greater concentration of “No” votes. Interestingly, however, Republicans have had more success against Democrats when voting yes for something: they have lost most of the votes where they voted nearly unanimously “No” against a Democratic “Yes.”

I haven’t calculated how partisan individual members of congress are; that’s coming soon. In the meantime, here’s the voting record for the Senate (many more blue points, due to the steady Democratic majority since 2008).