Statistical Report Purporting to Show Rigged Iranian Election Is Flawed

Like most Americans, I would like few things more than to see Mahmoud Ahmadinejad, Iran’s hateful President, voted out of office. Elections in thuggish, authoritarian states like Iran need to be treated with the utmost skepticism and scrutiny. I can’t say I have any real degree of confidence in the official results, which showed Ahmadinejad winning with some 62 percent of the vote.

There is a statistical analysis making the rounds, however, which purports to show overwhelmingly persuasive evidence that the Iranian election was rigged. I do not find this evidence compelling.

Iran’s election results were reported by its Interior Ministry in six waves. The first wave covered about one-third of the total vote; there were then two relatively large waves that reported about 20 percent of the vote each, and then three smaller waves that reported the remainder of the vote. What other observers have found is that, over the course of the six waves, there is an extremely strong, linear relationship between the number of votes reported for Ahmadinejad and the number reported for his principal opponent, Mir Hussein Moussavi (who had declared victory before any results were officially announced):

Superficially, this is very impressive: an R-squared of .998, which suggests a nearly perfect linear fit.

Just how remarkable is it really, however? Rather than deal in abstractions, let’s try a more concrete sort of experiment. Suppose that results from last November’s election between Barack Obama and John McCain were revealed in this fashion, in six large waves. Suppose moreover that these waves were determined based on the alphabetical ordering of the states:

Wave 1: Results from Alabama-Illinois are reported; this represents about 33% of the total vote.
Wave 2: Results from Indiana-Mississippi (17% of the total vote) are added to the above totals.
Wave 3: Results from Missouri-North Carolina (19%) are added.
Wave 4: Results from North Dakota-Pennsylvania (12%) are added.
Wave 5: Results from Rhode Island-Texas (10%) are added.
Wave 6: Lastly, results from Utah-Wyoming (9%) are added and the counting is complete.

If results were released in this fashion, here is what we would get for the total number of votes for Obama and McCain at each stage:

Now, let’s plot these on a graph:

Wow! The correlation is extremely high (an R-squared of .9959), almost as high as the one we saw for Iran. Does that mean the U.S. election was rigged too?

Of course not. The apparently extremely strong relationship is mostly an artifact of the exceptionally simple fact that as you count more votes, both candidates’ totals will tend to increase. In our example, Wave 5 happens to be a very good one for McCain: it contains the results from South Carolina, South Dakota, Tennessee and Texas — four red states — plus Rhode Island, which went for Obama but contains a tiny number of votes. And yet, the impact of Wave 5 is barely visible when the results are presented in this fashion.
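
To see how little the cumulative presentation reveals, here is a minimal Python sketch. The wave sizes are the ones from the alphabetical experiment above; the per-wave shares for "candidate A" are made-up illustrative numbers, not actual results, and the helper functions are written just for this example.

```python
def r_squared(xs, ys):
    """Squared Pearson correlation between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return (sxy * sxy) / (sxx * syy)


def running_totals(values):
    """Cumulative sums, i.e. the totals reported after each wave."""
    totals, total = [], 0.0
    for value in values:
        total += value
        totals.append(total)
    return totals


# Wave sizes as a percentage of the total vote (from the experiment above).
wave_sizes = [33, 17, 19, 12, 10, 9]

# ASSUMPTION: hypothetical share of each wave's new votes won by candidate A.
# These are illustrative values only, not real election data.
share_a = [0.53, 0.50, 0.56, 0.45, 0.44, 0.55]

# Votes newly added in each wave for candidates A and B.
new_a = [size * share for size, share in zip(wave_sizes, share_a)]
new_b = [size * (1 - share) for size, share in zip(wave_sizes, share_a)]

# Totals as they would be announced after each wave.
cum_a = running_totals(new_a)
cum_b = running_totals(new_b)

r2_cumulative = r_squared(cum_a, cum_b)
share_spread = max(share_a) - min(share_a)

print(f"R-squared of cumulative totals: {r2_cumulative:.4f}")
print(f"Spread in candidate A's per-wave share: {share_spread * 100:.1f} points")
```

Even though candidate A's share swings by 12 points from wave to wave in this made-up example, the cumulative totals still fall almost exactly on a straight line, because each running total mostly tracks how many votes have been counted so far.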

Likewise, there was more wave-to-wave variation in the Ahmadinejad-Moussavi results than the statistical analysis I cited above seems to imply. Ignoring votes for minor candidates, Ahmadinejad won a high of 70.4 percent of the votes in Wave 1, and a low of 62.3 percent of the votes newly added in Wave 6, a spread of 8.1 points. By comparison, Obama’s share of the newly added votes in our experiment ranged from 56.4 percent in Wave 3 to 44.7 percent in Wave 4, a spread of 11.7 points. That’s somewhat more variation than we saw in the Iranian results, but not much.

To be clear, these results certainly do not prove that Iran’s election was clean. I have no particular reason to believe the results reported by the Interior Ministry. But I also don’t have any particular reason to disbelieve them, at least based on the statistical evidence. If Moussavi truly did have the support of a majority of Iran’s citizenry, the best evidence we will have of that is what happens in the streets of Tehran over the next days and weeks.

EDIT: In case this isn’t clear, I am not suggesting that any and all statistical analysis purporting to show tampering in Iran’s election results will turn out to be fruitless. I am merely suggesting that this particular analysis is dubious; it is not a smoking gun.

Properly analyzing Iran’s election results is probably best left to Middle East experts, rather than experts on U.S. electoral politics. Juan Cole, for instance, who certainly does know a thing or two about foreign policy, sees plenty of things that smell fishy to him.

Still, though, would it really be all that hard to rig an election in a way that would be hard for statistical analysis to detect? Suppose that you’re Ahmadinejad, and that you become convinced based on the actual vote totals that you’re on track to lose by several points. Could you not simply take every tenth vote, or every fifth vote, that came in for Moussavi, and count it for yourself? This would preserve an element of randomness and would make the province-by-province results look reasonably correct relative to one another.
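
A rough sketch of that thought experiment, in Python. The province names and tallies are entirely hypothetical, and `shift_votes` is a helper invented for this illustration; nothing here describes how any actual count was conducted.

```python
# ASSUMPTION: hypothetical "true" tallies per province as
# (votes for candidate A, votes for candidate B). Not real data.
true_counts = {
    "Province 1": (400_000, 600_000),
    "Province 2": (550_000, 450_000),
    "Province 3": (300_000, 200_000),
    "Province 4": (120_000, 280_000),
}


def shift_votes(votes_a, votes_b, every_nth):
    """Move every nth vote cast for B into A's column."""
    moved = votes_b // every_nth
    return votes_a + moved, votes_b - moved


def share(votes_a, votes_b):
    """Candidate A's share of the two-candidate vote."""
    return votes_a / (votes_a + votes_b)


# Reported tallies after taking every fifth vote from B.
reported = {
    name: shift_votes(a, b, every_nth=5)
    for name, (a, b) in true_counts.items()
}

# Rank provinces from A's weakest to A's strongest, before and after.
true_rank = sorted(true_counts, key=lambda name: share(*true_counts[name]))
reported_rank = sorted(reported, key=lambda name: share(*reported[name]))

# Overall share for A, before and after the shift.
true_total_share = share(
    sum(a for a, _ in true_counts.values()),
    sum(b for _, b in true_counts.values()),
)
reported_total_share = share(
    sum(a for a, _ in reported.values()),
    sum(b for _, b in reported.values()),
)

print("Same province ordering:", true_rank == reported_rank)
print(f"A's overall share: {true_total_share:.3f} true, "
      f"{reported_total_share:.3f} reported")
```

In this toy example, candidate A goes from losing overall to winning comfortably, yet the provinces keep the same ordering from weakest to strongest. That is because taking a fixed fraction of the opponent's votes everywhere lifts A's share in every province without reshuffling them.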

My point, I suppose, is this. Out of all the things you’d need to do to rig an election, coming up with a set of results that managed to avoid easy statistical detection would probably be one of the easier ones. So I’m skeptical that statistical analysis alone is going to turn up evidence of fraud. But I’ll be keeping an eye out for other approaches, particularly from those who have a deeper understanding of the Iranian state than I do.

Nate Silver is the founder and editor in chief of FiveThirtyEight. @natesilver538