Comparisons among aggregators and modelers

November 4th, 2012, 6:08pm by Sam Wang

I get asked a lot about why the various aggregators differ from one another. After all, we all start with the same polling information. Today I will give a general sketch of how and why we differ – and what I view as the strengths and weaknesses. I’ll restrict myself to organizations that I am more familiar with. It’s Sunday, so not that much math.

(2) Do we take a snapshot of current conditions only, or do we attempt a future prediction? I’ll review three organizations that have been making predictions all season: FiveThirtyEight, Votamatic, and the Princeton Election Consortium.

I’m leaving out political scientists who use predictors only, such as Alan Abramowitz, Ray Fair, and the University of Colorado people. As I’ve written, I categorize their models as tools to test ideas about how voting preferences are shaped. All of them do well at “post-dicting” past events. They might get the next election right, but if they don’t…so what? Make another model. This activity is research, with emphasis on the “search.” There’s nothing at all wrong with it. But it’s most useful before the election season starts. In the storm, you want the person with the instruments, not the person with the almanac.

I’ll go through the various models, going gradually farther away from polling data.

Polling data: Electoral-vote.com, RealClearPolitics, Pollster.com. All of these sites present polls with relatively little additional processing. Electoral-vote.com, by Andrew Tanenbaum, exemplifies the first wave of aggregators. He gives simple tabulations of state polls, with the electoral vote total determined by the most recent poll. To reduce poll-to-poll fluctuation, RealClearPolitics adds simple averaging. They also leave out partisan pollsters. Pollster.com uses more sophisticated smoothing methods, and has remarkably good user tools to allow the construction of customized graphs. In all cases, the electoral vote count is a simple total, assigning each state one possible outcome. This is the mode of the distribution.

Pro: Gives a quick look at the race with a minimum of filtering. Averaging gives a sharper picture of any individual state.

Con 1: The total electoral estimate still fluctuates because all states are reduced to a single combination of outcomes, which usually corresponds to the single highest point on the histograms here at PEC or at FiveThirtyEight.

Con 2: The use of averaging allows an extreme outlier poll to pull the average disproportionately in one direction. This can be an issue where polls are sparse. Smoothing is not optimal for revealing sudden shifts such as the effect of Debate #1.

Snapshot plus state-polls-only based prediction: Princeton Election Consortium. Like the sites listed above, we also offer a snapshot, shown in the topline of this website. The history of the snapshot is plotted in the right column. However, our methods wring considerably more information from the data. Our fundamental output is two core numbers: the EV Estimator and the Popular-Vote Meta-Margin. They are very high-resolution measurements at a single point in time. Think of them as an electoral snapshot or an electoral “thermometer.”

How we get these numbers requires a little explanation. Briefly, in each state we do more than ask “who’s ahead?” Instead we calculate a win probability from the median of recent polls, an outlier-rejecting approach. Using a simple math trick, we then take these 51 probabilities to calculate the exact distribution of all outcomes (2.3 quadrillion). The middle of the distribution is the EV Estimator.
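The two steps above can be sketched in a few lines of Python. This is only an illustration, not PEC's actual code: the normal approximation, the `min_sem` floor, and the function names are all assumptions made for the example, and the "math trick" is taken here to be the standard one of folding in one state at a time (a convolution), which gives the exact distribution without enumerating every combination.

```python
import math
from statistics import median

def win_probability(margins, min_sem=1.0):
    """Win probability for one state from recent poll margins (in points).

    Uses the median margin and a median-based spread, so one outlier poll
    has little effect. The normal approximation and min_sem floor are
    assumptions made for this sketch.
    """
    med = median(margins)
    # Median absolute deviation, scaled to approximate a standard deviation.
    mad = median(abs(m - med) for m in margins)
    sem = max(1.4826 * mad / math.sqrt(len(margins)), min_sem)
    # Normal CDF evaluated at med / sem.
    return 0.5 * (1.0 + math.erf(med / (sem * math.sqrt(2.0))))

def ev_distribution(states):
    """Exact distribution of electoral votes.

    states: list of (electoral_votes, win_probability) pairs.
    Returns a list where entry k is P(candidate wins exactly k EV).
    """
    dist = [1.0]
    for ev, p in states:
        new = [0.0] * (len(dist) + ev)
        for k, mass in enumerate(dist):
            new[k] += mass * (1.0 - p)   # candidate loses this state
            new[k + ev] += mass * p      # candidate wins this state
        dist = new
    return dist

def median_ev(dist):
    """Middle of the distribution: smallest k with cumulative mass >= 0.5."""
    total = 0.0
    for k, mass in enumerate(dist):
        total += mass
        if total >= 0.5:
            return k

# Hypothetical poll margins for three made-up states (not real data).
polls = {"A": (18, [2.0, 3.0, 1.0, 9.0]), "B": (29, [0.0, -1.0, 1.0]), "C": (6, [4.0, 5.0, 6.0])}
states = [(ev, win_probability(m)) for ev, m in polls.values()]
print(median_ev(ev_distribution(states)))
```

The loop touches each state once, so 51 contests cost a few thousand multiplications rather than 2.3 quadrillion cases.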

The Meta-Margin takes advantage of the fact that the EV Estimator calculation is very speedy (takes much less than 1 second to run). It relies on a core tool, the bias variable b. It is easy to shift all polls over by a fixed amount b. There are several reasons to care about this: (i) polls in different states tend to move together – correlated variation; and (ii) polls may all be biased by some amount, which can also be simulated by varying b. The Meta-Margin is defined as the size of the shift b that would make the Electoral College a perfect tossup. It’s just like a margin, which is why it’s in units like Obama +2.6%.
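As a rough illustration of the idea, here is a minimal Python sketch that searches for the tossup shift by bisection. The fixed per-state uncertainty `sem`, the function names, and the choice to define "tossup" as a 50% chance of reaching the winning threshold are assumptions for this example, not a description of PEC's code.

```python
import math

def state_win_prob(margin, sem=3.0):
    """Normal-approximation win probability; sem is an assumed uncertainty."""
    return 0.5 * (1.0 + math.erf(margin / (sem * math.sqrt(2.0))))

def prob_win(states, b, threshold):
    """P(candidate reaches `threshold` EV) after shifting every margin by b.

    states: list of (electoral_votes, median_margin_in_points) pairs.
    """
    dist = [1.0]
    for ev, margin in states:
        p = state_win_prob(margin + b)
        new = [0.0] * (len(dist) + ev)
        for k, mass in enumerate(dist):
            new[k] += mass * (1.0 - p)
            new[k + ev] += mass * p
        dist = new
    return sum(dist[threshold:])

def meta_margin(states, threshold, lo=-20.0, hi=20.0, tol=1e-6):
    """Bisect for the shift b that gives a 50% win probability.

    The win probability rises as b rises, so bisection applies. The result
    is reported as -b: how far polls would have to move against the
    candidate to produce a tossup.
    """
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if prob_win(states, mid, threshold) > 0.5:
            hi = mid
        else:
            lo = mid
    return -(lo + hi) / 2.0

# Hypothetical medians for four made-up states with 20 EV in total.
toy = [(8, 4.0), (6, 1.5), (4, -2.0), (2, 7.0)]
print(round(meta_margin(toy, threshold=11), 2))
```

Because each evaluation of `prob_win` is cheap, a few dozen bisection steps finish almost instantly, which is why a fast snapshot calculation makes this kind of search practical.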

Finally, we also use b as a way to game out future scenarios, and make a prediction. If we think polls can move by up to 1% in the future, then we can add up all the possibilities from b=-1% to b=+1%. The red and yellow strike zones (which are almost gone as of today) are calculated this way. Based on past elections, we can estimate what b might be.
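The "add up all the possibilities" step can be sketched as an average over a grid of shifts. In this Python example the snapshot is replaced by a toy stand-in (`snapshot_win_prob`, with an assumed `sigma`), and every shift in the range is weighted equally. Both are simplifying assumptions; a weighting based on how far polls moved in past elections would be closer to what the text describes.

```python
import math

def snapshot_win_prob(meta_margin, b, sigma=1.0):
    """Toy stand-in for the full snapshot: win probability after a shift b.

    sigma is an assumed residual uncertainty, in percentage points.
    """
    return 0.5 * (1.0 + math.erf((meta_margin + b) / (sigma * math.sqrt(2.0))))

def predicted_win_prob(meta_margin, max_shift=1.0, steps=201):
    """Average the shifted snapshot over b in [-max_shift, +max_shift].

    Equal weights on an evenly spaced grid; this uniform weighting is an
    assumption made for the sketch.
    """
    shifts = [-max_shift + 2.0 * max_shift * i / (steps - 1) for i in range(steps)]
    return sum(snapshot_win_prob(meta_margin, b) for b in shifts) / steps

# Example: a Meta-Margin of +2.6 with polls allowed to move by up to 1 point.
print(round(predicted_win_prob(2.6, max_shift=1.0), 3))
```

Widening `max_shift` pulls the prediction toward 50%, which is the same effect as assuming that b has a wider range.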

The reason I am going off about b is that it is my way of thinking of contrasts between the Princeton Election Consortium and FiveThirtyEight. In some sense, my assumption that b has a narrow range accounts for why the two sites give different re-elect probabilities for President Obama.

Pros: Makes near-maximal use of existing state polls, whose track record using PEC’s methods is excellent. Uses medians to reject outliers. Converts Electoral College mechanisms to a Popular Vote margin, an intuitive quantity. The low noise allows accurate identification of swings in the race.

Cons: Doesn’t use national polls. Doesn’t correct for house effects. Assumes that state polls are, as a group, unbiased (though this does have support from 2004-2008).

Hybrid model: FiveThirtyEight. In addition to state polls, Nate Silver uses other variables – national polls and econometric indicators – to infer a likely election result. He used this approach to predict winners in the 2008 Democratic primaries, in that case including demographics and more. He was able to fill in some missing-data problems.

For his Presidential model, he takes several approaches. One type of variable is econometric indicators, which informed his calculation earlier in the season. His current calculation uses national and state polls, with fuzz factors to account for the possibility that these polls could contain systematic errors.

I’ll be brief without getting too far into the weeds. He takes a very conservative approach to estimating win probabilities, in the sense that he builds in ways that effectively reduce the certainty of any particular outcome. In addition to being conservative about single-state probabilities, it appears that he puts a lot of credence into the possibility that national and state polls could be off by a substantial amount. Recently he said that this cautious approach accounted for much of the 16% probability of Romney winning the election.

Let me express this idea in terms of my bias variable b. The 16-percent idea is approximately equivalent to saying that there might be overall (i.e. all pollsters combined) systematic errors in national and state polls that could drive b as high as 5% in either direction (a 95% confidence interval), given today’s Meta-Margin. However, as I pointed out the other day, b doesn’t affect state outcomes very much in most cases, since these races, even in swing states, are usually determined by a larger margin. Also, based on my analysis in 2004-2008 (where data are abundant), b for state polls is smaller on average, 1-2%. It’s larger for national polls.

Eventually, I believe that a suitable way to measure b in past elections is to perform aggregation separately on state polls and national polls, then compare actual national popular vote and EV with national-poll margin and the Meta-Margin that I have defined. This might be hard to do for earlier elections, where polling was sparser.

Pros: Takes into account national polling data; corrects for individual pollster biases; takes a conservative approach to the uncertainties.

Cons: Likely to overcount uncertainties (look at the error bars). The use of national polls may reduce accuracy of state-level Electoral College outcome. Uses econometric variables even after direct measurements (polls) are available. Time resolution not as good as a pure-state-polls approach.

Polls with a predictive prior: Votamatic. Drew Linzer’s project is a fresh approach to the problem of combining econometric variables. In his case, he uses an econometric model for long-term prediction to set up a “prior” expectation of how the race will unfold, then uses this to guide the interpretation of polling data.

As you can see, the model fluctuates hardly at all. It seems to really have an affinity for Obama 332 EV, Romney 206. At some level this is a good feature: if a prediction is accurate, it shouldn’t vary much. However, I am a bit concerned because this suggests that the prior is drawn very restrictively. In other words, it is set to ignore or shape incoming polling data. The predictive value of such a model depends a lot on the validity of the prior.

My own inclination for a model like this is to use it to fill in “missing data” problems. Many states are underpolled, such as Texas or Vermont. A strong prior can give us expectations for what would happen there. Although those outcomes are not in doubt, the vote-share is not known. This would be a good test. Another example of a missing-data problem is Senate or House races, the latter being a significant prediction prize.

Pros: Very stable prediction; keeps prior and polling data separate; high level of analytical rigor.

Cons: Dependent on the validity of the prior; doesn’t reveal much about the dynamics of the race.

>>>

As you can see, these models each have their own uses. To my own taste, I’d use them as follows.

I should also note that all of these sites have their own flavor of commentary. Drew Linzer has done fascinating recent work examining individual pollsters and looking for “skew” or “bias.” Electoral-Vote.com gives a very good daily survey of the scene at all levels, and highlights polls of particular interest. And of course FiveThirtyEight’s Nate Silver made his bones in part by the data-driven play-by-play commentary that made him famous in 2008.

It is certainly possible that I have not put my finger on key differences between these approaches. I imagine many of you are big fans of the other sites, and can offer alternative interpretations or corrections in comments.

245 Comments so far ↓

Any insights as to why it has been Nate who has borne the majority of the right wing hostility? Is it because he’s gay? Works for the times? Shows up on tv more? Or a combination of all three?

It seems like Dr. Wang pokes them with a sharp stick more and where Nate is careful to say that there’s still hope for Romney, Dr. Wang categorically denies just about any hope at all. And yet they’re going after Nate and leaving Dr. Wang relatively unscathed.

Though as much as I can tell personality through writing, I get the impression that Dr. Wang wouldn’t mind it and Nate does. So maybe that’s why they pick on him.

He’s the most famous one by far, because of the following he attracted during the 2008 cycle. Prof. Wang was doing this earlier but never achieved the same level of celebrity, though it seems as if his star is rising now.

It is also because Nate pandered to the Right at the start of the season. Or his editors did. He bent over backwards to seem fair and unbiased, if on one hand then the other, both sides do it, etc.
They thought Nate was their friend (snicker).
Jay Cost had a total meltdown on twitter.
Then the right turned on Nate with a vengeance.

Dr. Wang has always been upfront about his affiliation. No potential for misunderstanding there.

I just wish that the stupid, frothy-mouthed partisans on both sides would STFU about the quants so we could have a better discussion about which models are better and how to combine their wisdom into a single prediction. Then again, I guess no one would Tweet that “boring” news (which fascinates the bejeezus out of me).

I think the p***ing match is good for business, for now anyway. It’s a step forward in getting attention for aggregation as part of the political conversation. Basically, the qualitative types use Nate Silver as their lightning rod. I assume there will be another wave of attention on Wednesday.

Nate is visible, and associated with the “communist owned” New York Times. Dr. Wang is associated with Princeton University, which prior to 1970 was the preferred university of rich, right wing Republicans…so he gets the benefit of the doubt.

As of 11 p.m., Republican owned RealClearPolitics, “communist affiliated” FiveThirtyEight, and the Princeton Election Consortium ALL agree the President will get at least 303 electoral votes.

When was the last time RealClearPolitics and FiveThirtyEight reached precisely the same conclusion about the outcome of an election?

Nate works for the NYTimes now, so has the highest profile for being at the biggest daily in the country, which also happens to have a liberal bias (the paper, not FiveThirtyEight). By skewering Nate, the wing-nuts believe they are killing two birds with one stone.

Sam would be a good fit for Maddow, but they can do a lot without guests. I’d think her people and other people’s people were aware of him… Now that there’s TV footage, they can see how well he comes across. The 99 percent probability is an exciting lede. CNN did well with it; Sam looked and spoke excellently. I expect to see him on TV in similar stories.

Caution: If Sam gets his own PBS series, [ “LIKELY STORIES”?], eventually the producer will want him to eat a bug. Nice clip in the opening credits montage. Feh.

skmind, may I remind you CNN hosts Erick “Redstate” Erickson?
They are just as susceptible to regulatory capture by market forces as anyone in media.
I’ve followed Nate since 2008. He has not been open about his affiliation this season. It’s my hypothesis that the Grey Lady even made him remove a post on oversampling that messed with the horse race narrative.
This is because of what Julian Assange calls the Fox News Effect. You cannot alienate half the country with the truth they don’t want to hear and expect to sell them product.

Matt, Nate just has the most *influence*. Social Network Theory 101. This depends on relative “power” of his node and raw number of connections (links).

Was there a new NC poll that showed it going back to R? I am getting whiplash. Also, Sam has to come back next year and do an MM for my state of Va for Gov as well as his home state of NJ. I demand it!

You can throw out any poll regarding Florida with the suppression. 6 to 8 hour waits for early voting. And the Governor says there is no problem. What is happening down there is absolutely criminal and blood boilingly infuriating.

This isn’t a partisan comment. If you are a Republican in Florida, you are responsible for the downfall of Democracy in your state and by extension the nation.

It’s an application of the rule Josh Marshall recently stated that the moment people start talking about you having the momentum, your numbers are probably about to drop, just from regression to the mean.

I love the incredible GOTV efforts in Florida and NC, but I highly doubt they’ll flip those states. It’s still important just to make sure Romney has to fight there.

Thank you. Finally, someone who’s tied to reality! All these people claiming a 332 O victory are just being strung along by Jim Messina’s jazz talk and false BS bravado! Ain’t gonna happen, folks. He’s lost too much early vote ground & indies by ceding the economic argument to R in both FL and NC. It would take a grand miracle for him to pull those off! People ought to stop deluding themselves!

Actually, I think the change was Florida. Most aggregators had Florida as a tie, or as going to Romney. As of today, the “Big Three” aggregators are giving Florida to Romney, and all three (PEC, 538, and RCP) have the President winning with 303 electoral votes.

Sam,
Excellent analysis of the “top” sites this year. PEC is the “top of the top.” I also look at predictwise.com and oddschecker.com to incorporate the “wisdom of crowds” approach. Looking forward to your post-election commentary and analysis of these same sites. Thanks.

Nate Silver makes perhaps the best argument I’ve seen yet about the accuracy of state polling. Using states where he has at least 3 final week polls in his database going back to 1988, he found only 3 cases where the candidate with the average lead in state polls didn’t win the state. That graphic is a thing of beauty for the numerati…

Money quote:

“Of the 77 states with at least three late polls, the winner was called correctly in 74 cases.”

Wouldn’t incorporating the poll underestimate (slope not 1:1) effectively repel the Meta-Margin away from 0%? In other words, taking it into account would amplify even small leads (even though historically that may be correct). Wondering if this goes into your Bayesian prior-based estimate.

I like the votamatic approach because it best reflects the reality of modern presidential elections. The range of outcomes is very, very small. There’s really only been four states that weren’t locks for one candidate or the other this year (NC, VA, FL, CO). Professor Wang is worried about votamatic missing the dynamics of the race. There are no dynamics. The only question is who will show up at the polls.

Yes, there are dynamics of the race. All you need to do is look at poll average fluctuations to see that. Sampled vote intent fluctuates over time, and variation among polls doesn’t overwhelm variation over time. The Votamatic approach doesn’t “reflect the reality of modern presidential elections.” It uses an econometric model to come up with a prior probability distribution, and updates that distribution with the evidence from the polls. That’s why it doesn’t fluctuate that much. Not because its prior probability distribution thinks there is a very small range of outcomes.

(1) Because PEC doesn’t account for house effect, it’s more prone to jitter. A lot of the day-to-day fluctuation in PEC is more about which polling houses have released new polls lately than about actual changes in the race. I think 538 is the better place to look for meaningful trends on a scale of 1-3 days.

(2) The upside of PEC not accounting for house effect & throwing in other metrics is that it’s far less open to methodological suspicion. When people go after 538 for supposed bias, point them to PEC and say, “Well, this is what the unfiltered polls are saying if you take them all at face value.” It’s *very* useful to have that perspective.

yes, the PEC plots are quite affected by the order in which polling organizations release their polls. in fact, if you re-calculate the historical EV estimator using all of the polls we know about today that occurred during days [A -> B], then you will sometimes calculate a different number than we calculated just after day B, simply because even more polls have been released! alas, I don’t have a plot to share at the moment, but I’d like to make one after the election.

Please do, I’ve been curious about the effect on the Meta-Margin of ‘late’ polls, particularly around ‘big dates’. Would like to see the impact on Dr. Wang’s 10-12 day rule of thumb for state polls to catch up.

All of the models you discussed seem to have Obama winning. So can you tell me whether articles like this (http://www.politico.com/news/stories/1112/83288.html?hp=t2_3) are just campaigns/media trying to spin or a legitimate concern? Silver addressed the possibility that all the state polls could be off, but do you think pollsters are inaccurately predicting the 2012 turnout demographic vs. that of 2008?

Well, yes. That is because they use the same empirical data. I am unaware of any legitimate calculation that shows otherwise.

The alternative: by some odd coincidence, pollsters as a group, who were within <1% in the last two elections, suddenly all went off beam by an average of 3%. This was noticed right at the moment when it would push the race to Romney, despite an overall Obama +3% lead. All the rarely-polling organizations happen to side with the nonpartisan pollsters, as Drew Linzer has shown. And the three lone holdouts are Rasmussen, Gravis, and ARG.

This hypothesis does not seem parsimonious. It seems to be a transparent ploy. On the other hand, Romney's chief pollster says I am misled.

2008 demographics keep getting thrown in and it keeps getting mentioned that the Obama camp is relying on those numbers being the same. I can see this as a jumping off point but wouldn’t a well crafted poll be looking to identify the current trends? Why poll if you aren’t?

Even so. Cost of buying and selling STILL does not explain the difference. (That cost amounts to about 2 percent.) It may or may not be legal to do this kind of trading in the US, but it is legal in the Caymans. Come to think of it, maybe Bain Capital *is* making money out of it and Romney will laugh all the way to the bank on November 6.

48 hours left — Barack Obama is likely to win, but I am still interested in what happens in FL and NC. Florida is by far the most uncertain state right now, and the early vote numbers for NC show a massive Dem advantage. In 2008 they swung D by ~10k votes. In 2012 early vote is down by a little, but the extent in which the decreased enthusiasm on both sides will affect the final result.

Rats — I meant to say “In 2008 they swung D by ~10k votes. In 2012 early vote is down by a little, but the extent in which the decreased enthusiasm on both sides will affect the final result is unknown.”

After the election you (plural, including Drew, Nate, et al) should call out and name the hacks and shills. Shame them, repeatedly. Hopefully they go away and the discourse moves to a more intelligent plane.

I see that George Will (the climate change denier) has predicted a 321 EV landslide for Romney.

George Will is incapable of shame. Henry Fairlie had his number even back in 1986:

“The writer glides like a skater, and the reader can too easily glide with him. Will in his bow tie is an elegant Victorian skater on the pond, and the maiden on his arm feels blessed. “Ah!” she sighs, “a Tory temperament–you do like to sound oldfashioned, Mr. Will.” Mr. Will pats her muff and skates on: “ … and sentenced to live in this stimulating era.” The maiden begins to flutter, “Oh, to be sentenced … ,” but realizes too late that they have been skating on not even thin ice, and she goes under, as the reader will many times, with no hand held out to rescue her.”

Sorry, I messed up that cut and paste. Here is Henry Fairlie on George Will:

“[George] Will begins with a mildly amusing quotation from Stephen Leacock about the writer’s craft: “Just get paper and pencil, sit down, and write as it occurs to you. The writing is easy–it’s the occurring that’s hard.” But not for Will. “Actually,” he at once says, the ‘occurring’ is not hard for someone blessed with a Tory temperament.”

The writer glides like a skater, and the reader can too easily glide with him. Will in his bow tie is an elegant Victorian skater on the pond, and the maiden on his arm feels blessed. “Ah!” she sighs, “a Tory temperament–you do like to sound oldfashioned, Mr. Will.” Mr. Will pats her muff and skates on: “ … and sentenced to live in this stimulating era.” The maiden begins to flutter, “Oh, to be sentenced … ,” but realizes too late that they have been skating on not even thin ice, and she goes under, as the reader will many times, with no hand held out to rescue her.”

–November 10, 1986

It’s from an entire essay that takes Will down nicely. There’s a nice book of Fairlie essays that was published not long ago. Recommended.

When all his supporters *actually* believed Romney had momentum, they were predicting modest victories and merely hoping he could get near or just over 300 EV.

Now that the momentum that hardly was is clearly gone, their predictions have gone into fantasy land mode showing strong wins and bandying about words like landslide. I guess if you’re going to live in a fantasy, it might as well be a good one. Instead of that 4% chance of Romney winning with 271 or such.

Does your model take into account the voter suppression tactics or the drop in early voting numbers for Obama? Are these things that can be statistically accounted for or not? It almost seems like Obama needs to be leading in the state polls by more than 2 or 3 to make up for these things in order to win.

Literally a frequently asked question. :D I think the answer in the past has been that these effects will be small. 2 to 3 points would be HUGE suppression. It may make a difference down ticket more than the presidential race at this point.

There is/was an interesting article in Politico about Romney’s pollster, Neil Newhouse. Neil repeatedly refers to “flawed public polling” as to the reason why he still believes Romney will win. His internal polling tells a different story, he says. Is he just spinning or could there be any truth to what he is saying?

My take is that the Romney camp (and to a large extent the mainstream media) just have a gut feeling that Dem enthusiasm must be down, and therefore any poll that shows D’s outnumbering R’s must be skewed by pollster demographic errors.

I believe that the Romney camp actually believes its own propaganda so the crash they face when science bears out will be devastating for them. Final prediction is Obama with about 50.5% of the popular vote and either 294 or 303 EV depending on which way Colorado goes. In any case Ohio, Wisconsin, and Nevada are solid Obama and that as they say, There she goes!

There’s an overwhelming consensus among nearly all modelers and aggregators that Obama will probably get around 300 or so EVs. If (and at this late stage it appears to me it’s a very big “if”) Romney gets >270 EVs, I’d be interested in any thoughts on the sort of reasons the modelers and aggregators might come up with after the election. For example, the Republican pollster appears to suggest that the public polls are assuming an incorrect demographic turnout, but I’m not sure whether there are other assumptions being made in arriving at the EV counts that have the potential, on this occasion, to produce an ‘unlikely’ result.

I kind of rhetorically asked above “Doesn’t a good pollster use their data to help arrive at the demographic profile they think will turn out?” It’s all in the questions they ask. The Republican pollsters keep hanging their hat on that one assumption, over and over and over. Until the polls and Sam’s model are proven wrong, they are the best we have. Sure, there is always uncertainty until the ballots are counted, but it is the trailing side’s job to create that uncertainty.

Thanks for another very informative article. The three aggregators that seem most reliable to me are PEC, 538, and RCP. Interestingly, as of 11 p.m. on Sunday night, all are predicting the President will get 303 electoral votes.

The FiveThirtyEight blog provides precise predictions on his electoral map (303 electoral votes) but posts somewhat silly numbers on his charts where he splits the difference between winning and losing Florida by boosting his prediction from 303 up to 306…with 306 not being a possible outcome.

Because all three of you are predicting 303 electoral votes, you each can brag “No other aggregator got closer to the final result than we did”.

I’ve seen this confused a lot on here, but Silver is obviously posting the EV mean, not the median, which is much more stable. The EV estimator graph on here jumped a lot today because of exactly this.

It’s embarrassing to watch some of the traditional pundits on TV talk about how they think polls tell us less about likely election outcomes, than their personal “professional” analysis of how Americans feel about this or that, who just said what, who wants what real bad, which candidate appeared where, etc. They practically pump a fist into the air and declare their preferred candidate as the winner, based on something that happened that day or that week, or some kind of “visceral reasoning” that tells them “the voters just gotta be frustrated by X so they’re gonna vote for Y”. They merely pretend to know what they’re talking about in order to try to earn their keep, while being schooled by people actually analyzing what people are really saying. Ed Rollins was on Fox News today doing just this, eventually admitting at one point, “I don’t really know, but my gut tells me that Romney’s gonna win it big.” Chris Cillizza of course made one of the more high-profile numberless, unsourced predictions earlier this week, when he said that, regardless of the numbers, Ohio is a tossup because Romney wants to win it so bad. It’s like having a panel of medieval religious scholars appear on TV today, examining entrails and tea leaves to try to sagely predict the progress and effects of Hurricane Sandy. Pundit-based analysis of how trends, past and present, will translate to future votes, without actually asking the voters, is really showing itself for being a dated dinosaur, and I think people like Rollins are nervously seeing the writing on the wall.

As a Canadian Geezer who sincerely believes that the fate of the common man/woman in the USA is as much a concern for me as it is for the rest of my species around this globe I come to this website and others that are based on statistical science for some succour … I have found some here … in the statistics as presented … and in the commentary of the posters … My only nervous anxiety is that votes will be suppressed or that voters polled will be lethargic. We wish the citizens of America the best … and hope for an Obama victory … yet still despite good analysis … “Half a league, half a league,”
… and until 48 hours hence “all the world wonders”.

I would add that the commentary at Pollster is very good and in depth, well worth checking out. Also I would add TPM’s PollTracker to your list of aggregators. I think it’s a pure aggregator like electoral-vote, but I’m not sure.

ummm….no. Huffpo hosts the execrable Lombardo in a slavish, and it’s impossible to discuss anything on Blumenthal’s threads in the midst of a multi-thousand comment rant.
Froggy and I usta comment at 538.
Nate has the same problem. His comment section is whelmed with low information conservative trolls ranting about their “gut-feelings”.
Not conducive to mathematical rigor.

Is it possible that the fundamental differences in how data is collected for sports statistics and election polls mean that analysis in one domain is not particularly relevant to successful analysis in the other?

Personally I enjoy looking at the statistical probability of particular professional sports teams winning and then watch the season with anticipation of how the various dynamics affect the outcome — players get injured, managers get fired, trades are made. The odds are readjusted before every game and the eventual championship odds are rarely lopsided. At the end of the season.

Mr. Silver would like to believe that he has a sophisticated “meta-model” that has predictive value with greater reliability than any individual poll whose data is aggregated. I believe that is a mathematical impossibility.

I would posit that the situation is not dissimilar to the continued efforts of financial “chartists” that offer increasingly sophisticated meta-analysis of the movement of stock prices, futures and derivatives. Unfortunately, though the time lag between using the essential backward-looking explanatory model continues to decrease as computers become ever more powerful, the fact remains that forward-looking prediction is still not completely accurate. Many learned academics have failed rather spectacularly when they (and others) try to put theory into practice.

The fact is the actual “proof of simulation” for any sports match-up happens with each game played; analysts can tweak their formula multiple times per week.

The same sort of “on the fly” adjustments can happen for financial models with every execution of a trade. Wider ranges of “black swan” type events are actually a boon to those who trade in the less probable price shifts. A well known risk vs. reward gradient exists for things like interest rate moves over time and other such relevant (and hedgeable) indicators.

Mr. Silver is on the second “proof of simulation” under economic, foreign policy, and incumbency circumstances that are wildly different than those of four years ago. If he is successful in predicting even 46 of the states I would strongly encourage him to immediately set up shop with a prominent hedge fund, make a huge amount of money very quickly and then retire. If he is not accurate for more than 45 states he will likely be able to learn much from the “second run” of his prediction and can reasonably be expected to garner enough interest from political scientists, econometrics researchers and the huge base of online and print news / opinion sites to profitably pursue both a quasi-academic and pundit-based career for at least another four years…

I am curious about how Silver’s built-in “conservatism” in his projections will play out on election day. Perhaps someone more proficient in the statistics can help me out?

Suppose Silver (or anyone) ended up with a large number of state (and/or Senate) probabilities in the 60-85% range . . .
— If these probabilities actually reflect reality, shouldn’t we expect some of these projected outcomes to go against the favorite?
— Conversely, if almost all of the “conservative” projections go toward the favorite, wouldn’t this be an indication that the probabilities were too conservative?

Might we end up with a situation where Silver (or anyone) could call every race correctly, yet undermine the validity of his own model by so doing? (Surely it wouldn’t be reported this way, though.)
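The arithmetic behind this question can be sketched quickly. The Python example below assumes the races are independent, which is a strong simplification (a shared polling error would move many races together and make a clean sweep more likely), and the ten probabilities are made up for illustration.

```python
import math

def upset_summary(probs):
    """Given each favorite's win probability, return
    (expected number of upsets, probability of no upsets at all),
    assuming the races are independent.
    """
    expected_upsets = sum(1.0 - p for p in probs)
    prob_no_upsets = math.prod(probs)
    return expected_upsets, prob_no_upsets

# Ten hypothetical races, each with the favorite at 75%.
expected, clean_sweep = upset_summary([0.75] * 10)
print(expected, round(clean_sweep, 4))
```

Under these assumptions, well-calibrated 75% calls should miss about one time in four, so a forecaster who gets every such race right has some evidence that the stated probabilities were too cautious, though one election is a small sample.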