Wednesday, October 22, 2008

How likely is the big upset?

Last time I wrote, I declared that Obama had the election in the bag, according to Intraders participating in the state markets. Since then, Obama's strength has only solidified. For the last two weeks, Intraders have been saying Obama is the more likely candidate to win in states totalling 364 Electoral Votes.

Today, even if we give McCain every state where Intraders have Obama's chances below 80%, Obama still gets 273 votes and wins. In other words, let McCain keep every red state even if it is 51-49. Further, take FL (where Dems are currently trading at 63), MO (at 67), NV (73), NC (58), OH (61), and VA (78) away from Obama and give them to McCain and Obama still wins.
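The arithmetic in that paragraph can be checked in a few lines. This is only a sanity-check sketch: the per-state electoral vote counts are not given in the post and are filled in here from the 2008 allocation, while the 364 total is the figure quoted above.

```python
# Electoral votes per state under the 2008 allocation (not stated in the post).
conceded_states = {"FL": 27, "MO": 11, "NV": 5, "NC": 15, "OH": 20, "VA": 13}

obama_favored_total = 364  # states where Intraders favor Obama, per the post
votes_to_win = 270

# Hand every one of the six states to McCain and see what Obama has left.
conceded = sum(conceded_states.values())
obama_remaining = obama_favored_total - conceded

print(conceded)                          # 91
print(obama_remaining)                   # 273
print(obama_remaining >= votes_to_win)   # True
```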

So according to Intrade's state markets there is basically no chance McCain can put together the Electoral Votes he needs to win. Indeed, my simulation has consistently been putting McCain's chances at well under 1%.

At the time of my last column two weeks ago, the price on the Obama national contract was about 65, and it has since moved to about 85. From a certain perspective, 99%+ seems too high for the chance of almost anything that is two weeks in the future, and you suspect something must be inconsistent. From another perspective, the best a predictor can do is give you an answer with near certainty well in advance of the event. Intraders in the state markets have basically no doubt.

The action, then, is in the question of exactly how many electoral votes each candidate will get. Intrade has a collection of markets for that as well, and through the simulation, we can get a good picture of what the state markets have to say on the question. A trader might look at the national price of 85 on 2008.PRES.OBAMA and figure the price of the "Obama wins > 270 Electoral Votes" contract ought to be close to the same. Further, a trader might see that Obama is “expected” to win 364 Electoral Votes and suspect that the price of Obama winning > 360 Electoral Votes (or McCain winning > 170) ought to be around 50. Indeed, that is about where the Intrade markets on Electoral Vote counts are trading.

But the state markets have more to say than that. I simulate the joint dynamics of the state markets out to election day and produce a distribution of how many electoral votes each candidate ultimately may win. We can compare that distribution (in pink) vs. the Intrade Electoral Vote Count Markets (in blue), for the Democrat:

[Chart: Obama electoral vote distribution, simulation from state markets (pink) vs. Intrade Electoral Vote count markets (blue)]

And for McCain, again my results from state markets in pink and Intrade's Electoral College vote count markets for the Republican in blue:

[Chart: McCain electoral vote distribution, simulation from state markets (pink) vs. Intrade Electoral Vote count markets (blue)]

Fundamentally, the state markets are saying something different than the national markets and the electoral count markets. The state markets are actually implying a quite well-settled race at this point (not much can happen that will actually swing a lot of electoral votes). In the price of the national contracts and in the distribution implied by the electoral count markets, we see that Intraders still demand a significant risk premium for the suspicion that something big may yet happen.

31 comments:

The problem with this calculation isn't in the distribution of states based on the odds that are presented, but in the willingness to assume that people are accurately predicting how the states will swing. I haven't looked at the volume of trades within each state, but I feel that it's safe to assume that each individual state's trading volume is much lower than the national market's. Moreover, there's less information and more wiggle room within each state, and this has to scare investors. If this were a popular vote election, Obama would have this in the bag. Unfortunately, there's too much skepticism over a Pennsylvania, Iowa, Wisconsin, or Minnesota taking an unexpected flip for people to readily commit to the Obama nation. I for one believe that Obama is going to win, but there's still too much unknown within the states for people to assume that the percentages presented are a certainty.

The obvious question that arises, though, is why the win probabilities implied by the electoral college total vote markets differ so significantly from those implied by the state markets, when each of those has about the same liquidity.

The Stumpo model assumes the future is just like the past. That is an assumption that doesn't admit the possibility of a major shift in the race. But the individual to win markets should factor in the possibility of such an event.

My guess is the following: The correlations calculated historically between states are an underestimate of the true correlations. The true correlations are higher than historical correlations because if such an event were to occur, and it hasn't yet, we would see all/most state races shift at the same time. Since it hasn't happened yet, no such collective move has informed the historical correlations.

In other words, the assumption that the present will be like the past implicitly discounts the possibility of a cataclysmic shift. The individual markets incorporate this possibility. Thus the discrepancy.

Dr. Stumpo? Any thoughts? You could probably build this in by making each of the correlations a random variable, with mean equal to the historical value.

A potential reconciliation is indeed that future covariance of states is expected to be higher than past covariance. The recent past does already include some 'cataclysmic shifts' though. The initial Palin effect (which highlighted that there may even be negative correlations between states), and of course the economic crisis.

A credible argument could be made that the next two weeks will actually be less volatile than the past few months. Both because the past month was so historically volatile and because polls report the number of undecideds as quite small now.

National contracts have a volume of about 30,000/day. State contracts have a volume of about 3,000/day. So yes, there is less. But 3,000 is not an insignificant number. Generally in markets, more volume leads to better information, up to a point; then it levels off pretty quickly and extra volume provides little incremental quality of information.

I can think of a rather simple way to reconcile the state and national figures: consider the national figure as a confidence rating for the data of the state contests. If you accept, as I do, that the state markets represent good data, and you further accept Cody Stumpo's simulation number of under 1% for McCain (which I also do), then what explains the (vast) spread between odds of more than 100-1 (state-by-state) and the rather conservative spread of approx 7-1 (national contracts)? Simple: the national markets are expressing a confidence level of approximately 87% in the state markets. Those investing in the national contract are also (instinctively) recognizing that the number of electoral college votes won is part of a chaotic system that could shift radically with even a modest ‘October Surprise’. I will be putting my money where my mouth is by now buying Obama.pres at 87-something.
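For readers who want to check the odds figures in that comment, here is a small conversion sketch. The helper name `price_to_odds` is made up for illustration, and the state-implied figure is taken as exactly 99 even though the post only says "99%+".

```python
def price_to_odds(price):
    """Convert a 0-100 contract price into 'X to 1' odds in favor."""
    return price / (100.0 - price)

national_price = 87.0   # national contract price quoted in the comment
state_implied = 99.0    # assumption: the post's "99%+" taken as exactly 99

print(round(price_to_odds(national_price), 1))  # 6.7, i.e. roughly 7 to 1
print(round(price_to_odds(state_implied), 1))   # 99.0, i.e. roughly 100 to 1

# Reading the national price as confidence in the state-market answer:
implied_confidence = national_price / state_implied
print(round(implied_confidence, 2))  # 0.88
```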

I still think Obama will win but by less than 300 electoral college votes. This race will tighten as they always do...and McCain still has an outside chance to win. The trading range of McCain I think is at its all-time low (about 13.5 on Oct 23) between now and about Nov 1. As the race tightens...the McCain to win could rise to about 30-35 before petering out as Obama Nation, barring some earth shattering event, begins to become more of a reality as we get close to Nov 4.

Maybe people are simply adjusting for risk on the national polls, while not doing the same for risk on the state polls. This might be because the national poll is much easier to adjust for risk, or it might be because people just forget to adjust for risk on the states, and merely look at the polls and trends. Either way, you're probably right, people are underestimating Obama for president, but overestimating Obama for each individual state. If enough can occur for Penn to go Republican (which is what people are theoretically concerned about), then Wisconsin, Minnesota, Iowa, and a handful of other states might very well also flip in a move this monumental. It's just difficult to understand how much to hedge your bets in these individual states.

It seems that Cody's simulation assumes that the states act independently. That basic fallacy explains the difference between Cody's simulation of the state markets and the national market.

Ken, go back and read the earlier posts in this series. Stumpo is decidedly NOT assuming independence. The states are all correlated in his simulation, and the correlations are calculated from historic state market prices. In other words, the only assumption is that the future is like the past, and that the market prices will predict correctly on election day.

Futures markets DO allow for future events to be different. If it were not the case then Obama would be at 100% as no one has ever come back from a deficit this large this late in the race. The market still has to factor in a terrorist attack or a film of Obama having relations with a barnyard animal or something. Nothing is 100% till it is history.... if then.

AP presidential poll: McCain gains, drawing even with Obama with two weeks until Election Day

Democratic presidential candidate Sen. Barack Obama, D-Ill., shakes hands with a supporter on the tarmac at the airport in Richmond, Va. Wednesday, Oct. 22, 2008. (AP Photo/Alex Brandon)

10-22-2008 3:06 PM
By LIZ SIDOTI, Associated Press Writer

WASHINGTON (Associated Press) -- The presidential race tightened after the final debate, with John McCain gaining among whites and people earning less than $50,000, according to an Associated Press-GfK poll that shows McCain and Barack Obama essentially running even among likely voters in the election homestretch.

The poll, which found Obama at 44 percent and McCain at 43 percent, supports what some Republicans and Democrats privately have said in recent days: that the race narrowed after the third debate as GOP-leaning voters drifted home to their party and McCain's "Joe the plumber" analogy struck a chord.

Three weeks ago, an AP-GfK survey found that Obama had surged to a seven-point lead over McCain, lifted by voters who thought the Democrat was better suited to lead the nation through its sudden economic crisis.

The contest is still volatile, and the split among voters is apparent less than two weeks before Election Day.

"I trust McCain more, and I do feel that he has more experience in government than Obama. I don't think Obama has been around long enough," said Angela Decker, 44, of La Porte, Ind.

But Karen Judd, 58, of Middleton, Wis., said, "Obama certainly has sufficient qualifications." She said any positive feelings about McCain evaporated with "the outright lying" in TV ads and his choice of running mate Sarah Palin, who "doesn't have the correct skills."

The new AP-GfK head-to-head result is a departure from some, but not all, recent national polls.

Obama and McCain were essentially tied among likely voters in the latest George Washington University Battleground Poll, conducted by Republican strategist Ed Goeas and Democratic pollster Celinda Lake. In other surveys focusing on likely voters, a Washington Post-ABC News poll and a Wall Street Journal-NBC News survey have Obama up by 11 points, and a poll by the nonpartisan Pew Research Center has him leading by 14.

Polls are snapshots of highly fluid campaigns. In this case, there is a margin of error of plus or minus 3.5 percentage points; that means Obama could be ahead by as many as 8 points or down by as many as 6. There are many reasons why polls differ, including methods of estimating likely voters and the wording of questions.

Charles Franklin, a University of Wisconsin political science professor and polling authority, said variation between polls occurs, in part, because pollsters interview random samples of people.

"If they all agree, somebody would be doing something terribly wrong," he said of polls. But he also said that surveys generally fall within a few points of each other, adding, "When you get much beyond that, there's something to explain."

The AP-GfK survey included interviews with a nationally representative random sample totaling 1,101 adults, including 800 deemed likely to vote. For the entire sample, the survey showed Obama ahead 47 percent to 37 percent. He was up by five points among all registered voters, including the likely voters.

A significant number of the interviews were conducted by dialing a randomly selected sample of cell phone numbers, and thus this poll had a chance to reach voters who were excluded from some other polls.

It was taken over five days from Thursday through Monday, starting the night after the candidates' final debate and ending the day after former Secretary of State Colin Powell broke with the Republican Party to endorse Obama.

McCain's strong showing is partly attributable to his strong debate performance; Thursday was his best night of the survey. Obama's best night was Sunday, hours after the Powell announcement, and the full impact of that endorsement may not have been captured in any surveys yet. Future polling could show whether either of those was merely a support "bounce" or something more lasting.

During their final debate, a feisty McCain repeatedly forced Obama to defend his record, comments and associations. He also used the story of a voter whom the Democrat had met in Ohio, "Joe the plumber," to argue that Obama's tax plan would be bad for working class voters.

"I think when you spread the wealth around, it's good for everybody," Obama told the man with the last name of Wurzelbacher, who had asked Obama whether his plan to increase taxes on those earning more than $250,000 a year would impede his ability to buy the plumbing company where he works.

On Wednesday, McCain's campaign unveiled a new TV ad that features that Obama quote, and shows different people saying: "I'm Joe the plumber." A man asks: "Obama wants my sweat to pay for his trillion dollars in new spending?"

Since McCain has seized on that line of argument, he has picked up support among white married people and non-college educated whites, the poll shows, while widening his advantage among white men. Black voters still overwhelmingly support Obama.

The Republican also has improved his rating for handling the economy and the financial crisis. Nearly half of likely voters think their taxes will rise under an Obama administration compared with a third who say McCain would raise their taxes.

Since the last AP-GfK survey in late September, McCain also has:

- Posted big gains among likely voters earning under $50,000 a year; he now trails Obama by just 4 percentage points compared with 26 earlier.
- Surged among rural voters; he has an 18-point advantage, up from 4.
- Doubled his advantage among whites who haven't finished college and now leads by 20 points. McCain and Obama are running about even among white college graduates, no change from earlier.
- Made modest gains among whites of both genders, now leading by 22 points among white men and by 7 among white women.
- Improved slightly among whites who are married, now with a 24-point lead.
- Narrowed a gap among unmarried whites, though he still trails by 8 points.

McCain has cut into Obama's advantage on the questions of whom voters trust to handle the economy and the financial crisis. On both, the Democrat now leads by just 6 points, compared with 15 in the previous survey.

Obama still has a larger advantage on other economic measures, with 44 percent saying they think the economy will have improved a year from now if he is elected compared with 34 percent for McCain.

Intensity has increased among McCain's supporters.

A month ago, Obama had more strong supporters than McCain did. Now, the number of excited supporters is about even.

Eight of 10 Democrats are supporting Obama, while nine in 10 Republicans are backing McCain. Independents are about evenly split.

Some 24 percent of likely voters were deemed still persuadable, meaning they were either undecided or said they might switch candidates. Those up-for-grabs voters came about equally from the three categories: undecideds, McCain supporters and Obama backers.

Said John Ormesher, 67, of Dandridge, Tenn.: "I've got respect for them but that's the extent of it. I don't have a whole lot of affinity toward either one of them. They're both part of the same political mess."

___

AP Director of Surveys Trevor Tompson, AP News Survey Specialist

that poll was clearly biased... they interviewed 44% who consider themselves evangelicals. last election 20-22% fell in that category... this year it could be 18-20% with increased youth/minority vote...

The thesis of this article is wrong and ignores basic mathematical principles. What the author views as a disparity between state-by-state odds and national odds is simply evidence that intrade users believe the state-by-state outcomes will be highly CORRELATED. Consider the extreme case of total correlation... Obama could have a 90% chance of winning in each individual state, yet 10% of the time he would lose ALL 50 STATES (due to fully correlated outcomes) and hence lose the national election. He still loses each state only 10% of the time, but all at the same time.
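The extreme case in that comment is easy to reproduce. The sketch below is an illustration only, not anyone's actual model: it assumes 50 equally weighted states, a 90% win chance in each, and a simple majority of states as the winning condition (no electoral vote weights). The function name and parameters are made up for the example.

```python
import random

def loss_probability(n_states, p_win, fully_correlated, trials, seed=0):
    """Estimate how often the favorite fails to win a majority of states."""
    rng = random.Random(seed)
    losses = 0
    for _ in range(trials):
        if fully_correlated:
            # One shared draw decides every state at once.
            wins = n_states if rng.random() < p_win else 0
        else:
            # Each state gets its own independent draw.
            wins = sum(rng.random() < p_win for _ in range(n_states))
        if wins <= n_states / 2:
            losses += 1
    return losses / trials

independent_loss = loss_probability(50, 0.9, fully_correlated=False, trials=20000)
correlated_loss = loss_probability(50, 0.9, fully_correlated=True, trials=20000)
print(f"independent states: favorite loses {independent_loss:.3f} of the time")
print(f"fully correlated:   favorite loses {correlated_loss:.3f} of the time")
```

With independent states the favorite essentially never loses, while with total correlation the loss rate sits near 10%, even though the per-state win chance is 90% in both runs.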

Regarding my comment below, if the author IS assuming correlation based on historical data, then intrade users are implicitly predicting higher correlation this year than in the past. They may or may not be right, but that is what they are predicting.

The point about "investors in the state markets must expect the volatility / correlation to increase in the future" could have been made each day for most of the last 3 months. The expected increase has yet to be seen.

Something that your simulations are probably missing is that the probabilities in each state are not independent of one another. If you treat them as independent and run a Monte Carlo simulation, you will indeed find that the chances of McCain winning are less than one percent.

But the probabilities are not independent. If McCain were to win Florida, it would be as a result of becoming much more popular. If he were to become popular enough to win in Florida, he would likely also win in Missouri, Indiana and North Carolina at the very least. If the probabilities in those states were independent, his chances of winning all four would only be about 2%. But because there is a high correlation between them, his chances of winning all four are probably more along the lines of 20-25%.
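To see how much correlation matters for a four-state sweep, here is a rough sketch using a one-factor Gaussian model. That model is a technique swapped in for illustration, not something the post or the commenter specifies. The McCain probabilities for FL, MO and NC are 100 minus the Democratic prices quoted in the post; the Indiana figure and the correlation `rho=0.8` are placeholders chosen for the example, not market data.

```python
import math
import random
from statistics import NormalDist

def joint_win_probability(marginals, rho, trials=50000, seed=1):
    """Estimate P(win every state) when all states share one common factor."""
    std_normal = NormalDist()
    # A state is won when its latent score falls below this threshold.
    thresholds = [std_normal.inv_cdf(p) for p in marginals]
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        shared = rng.gauss(0.0, 1.0)  # national mood, common to all states
        won_all = True
        for t in thresholds:
            local = rng.gauss(0.0, 1.0)  # state-specific noise
            score = math.sqrt(rho) * shared + math.sqrt(1.0 - rho) * local
            if score >= t:
                won_all = False
                break
        hits += won_all
    return hits / trials

# McCain win chances: FL, MO, NC from the post; Indiana is a placeholder.
mccain = [0.37, 0.33, 0.42, 0.50]

independent = joint_win_probability(mccain, rho=0.0)
correlated = joint_win_probability(mccain, rho=0.8)  # rho is an assumption
print(f"independent states: {independent:.3f}")
print(f"correlated states:  {correlated:.3f}")
```

With `rho=0.0` the estimate should land close to the plain product of the four probabilities (about 0.026 with these inputs); raising `rho` pushes the sweep probability up toward the smallest single-state probability.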

No offense or anything, but erroneously assuming independence here is a very sophomoric mistake. Your stats street cred is pretty well shot to bits. The national probability is the best measure of McCain's overall chances; it doesn't matter what the states say.

Hughes, read back through the other posts and make sure you understand them. Stumpo's model is very sophisticated. It uses the historical state prices to get the state correlations. So no, it obviously doesn't assume independence, and yes that would be stupid. The model is very good IMO, and I have a PhD in this stuff. What's interesting is why the projections from this model disagree with the individual markets for Obama and McCain.

About the correlation and volatility, Cody said "the expected increase has yet to be seen." Isn't that because a large part of the expected increase is on election night itself?

Aside from a last-minute mega-gaffe or terrorist attack, a number of factors can only come into play when the final results start coming in on election day: a Bradley effect, systematic voter suppression, electronic voting machine fraud or security vulnerabilities, or any systematic misanalysis that it turns out we've been doing on all these polls, regarding non-landline voters, first-time voters, or whatever.

Any such event would correspond to a monster correlation on Tuesday. This makes me wonder what kind of historical correlation data is being used in this model. Does it include previous election days?

Incidentally, I'd also be curious to know more about the mathematics of the model. I'm sure it's not just a bunch of Brownian motion set to a historical covariance matrix, because the model is presumably constrained so that all the probabilities are 0 or 1 by the end of election day. I'm curious how a constraint like this gets worked into the model. Any recommendation for a good reference about models like this? Thanks!

Dan, interesting question about how to handle the bounded prices. I tried doing a bounded random walk for a while, but that requires some kind of parametrization that I felt would be arbitrary. I settled on letting the D & R prices run free each simulated day, but then normalizing them to add to 100 each day.
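A minimal sketch of the "run free, then normalize" idea described in that reply, for a single state. The daily volatility, the 13-day horizon, the Gaussian steps and the small positive floor are all assumptions added for the example; the real simulation also correlates the states, and this sketch does not force prices to 0 or 100 on election day.

```python
import random

def simulate_state(dem_price, rep_price, days, daily_sd, rng, floor=0.01):
    """Random-walk both contract prices, renormalizing to 100 each day."""
    for _ in range(days):
        # Let each price move freely for one simulated day.
        dem_price += rng.gauss(0.0, daily_sd)
        rep_price += rng.gauss(0.0, daily_sd)
        # Assumption: keep prices strictly positive before normalizing.
        dem_price = max(dem_price, floor)
        rep_price = max(rep_price, floor)
        # Rescale so the two prices again add to 100.
        total = dem_price + rep_price
        dem_price = 100.0 * dem_price / total
        rep_price = 100.0 * rep_price / total
    return dem_price, rep_price

rng = random.Random(42)
# Start from the Florida prices quoted in the post (Dems at 63).
finals = [simulate_state(63.0, 37.0, days=13, daily_sd=3.0, rng=rng)[0]
          for _ in range(10000)]
dem_win_share = sum(price > 50.0 for price in finals) / len(finals)
print(f"Share of paths ending with the Democrat above 50: {dem_win_share:.2f}")
```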

I'm curious how this system accounts for systematic election fraud. Many of us do expect significant election fraud, especially on electronic machines that "magically" end up with 100% Republican tallies (happened in 2004 in Ohio), or with more votes than voters in the district (2004 in Indiana).

29, election fraud should be baked into the prices and price movements already. Investors should be making their buy-sell decisions and forming prices knowing that election fraud is a possibility, so the simulation need take no additional steps to account for it again.

The problem with handling the voter fraud issue solely by assuming that the possibility is built into the state prices is that if the public is wrong on this issue, it will most likely be wrong in virtually every state, and in the same direction. Voter fraud is an amorphous concept, and it's not as if the individual state Intraders are especially privy to information on this; they're just making educated guesses. Presumably the guessers will be similarly biased in their guesses, regardless of the state they are guessing about. So if voter fraud is higher than generally anticipated, the entire simulation would be affected because EVERY state's probability distribution would be skewed in the same direction. As a result, the final electoral vote result could be far more volatile than it seems at first glance.