October 21, 2008

I am currently updating a tool I wrote which can help to make projections on election night. I used it during the primaries and it simply takes the current vote tallies per county, along with the “% precincts counted” and then extrapolates each county individually to arrive at the totals for the whole state. This can be super helpful in cases where a big city may only have 10% of the votes counted whereas the rest of the state might already be 90% done – in which case the overall numbers will have a big skew away from the city voting patterns.

Anyway, subscribe to my feed and I’ll post a link here when it is ready. And, obviously it will only be useful for a couple hours on Nov 4th – less than 20 days away!
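For the curious, here is a minimal sketch of the per-county extrapolation idea described above. It is not the actual tool: the `CountyTally` class, the function names and the two example counties are all made up for illustration, and it assumes the uncounted precincts in a county will vote like the counted ones.

```python
from dataclasses import dataclass


@dataclass
class CountyTally:
    name: str
    votes: dict[str, int]   # candidate -> votes counted so far
    pct_counted: float      # "% precincts counted", from 0 to 100


def project_state(counties: list[CountyTally]) -> dict[str, float]:
    """Scale each county up to 100% counted, then sum across counties."""
    totals: dict[str, float] = {}
    for county in counties:
        if county.pct_counted <= 0:
            continue  # nothing reported yet, so nothing to extrapolate from
        scale = 100.0 / county.pct_counted
        for candidate, votes in county.votes.items():
            totals[candidate] = totals.get(candidate, 0.0) + votes * scale
    return totals


def shares(totals: dict[str, float]) -> dict[str, float]:
    """Convert vote totals to percentages of the combined vote."""
    all_votes = sum(totals.values())
    return {c: 100.0 * v / all_votes for c, v in totals.items()}


# Made-up numbers: a big city at 10% counted, the rest of the state at 90%.
counties = [
    CountyTally("Big City", {"A": 7_000, "B": 3_000}, pct_counted=10),
    CountyTally("Rest of State", {"A": 40_000, "B": 50_000}, pct_counted=90),
]
raw = {"A": 47_000, "B": 53_000}          # simple sum of what is counted so far
print(shares(raw))                         # raw tally: B is ahead
print(shares(project_state(counties)))     # projection: A is ahead
```

In this toy case the raw tally and the projection disagree about who is ahead, which is exactly the big-city skew the tool is meant to correct for.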

October 18, 2008

I’m sure that all of my tiny handful of readers have already found these sites…. but these are some great sites which really carefully track the many state-by-state and overall polls. They have a pro-Obama bias in their writing/blog posts, but they do seem to be open-minded and fair when it comes to interpreting the polling data. The problem right now is that they seem to be arguing about how to get the most accurate numbers from all these polls, based on how well each polling method/company did during the primaries or previous elections… BUT, we are still 3 weeks away, and even if the polls were 100% accurate they still won’t predict what will happen on election day, since who knows what could happen between now and then. Stock market jumps to 14,000? Terror? Major gaffe?

February 20, 2008

About 60-80% of the ballots in the WA State Democratic primary have been counted, and using those numbers, along with the already counted results in each county, I have extrapolated the final vote counts in all the counties for Clinton and Obama. Right now, Obama is winning with 51.6% vs. 48.4% for Clinton (ignoring all other candidates on the ballot). After the final votes are counted, I am predicting that Obama will slightly widen his lead (mostly because King County still has more remaining ballots than most other counties) with a final split of 51.8% vs. 48.2%.

Another way to look at this is that of the ~500,000 Democratic ballots counted, Obama got 16,000 more than Clinton. After counting the remaining ~200,000 ballots, Obama should widen his lead to 25,000.
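As a quick sanity check, the percentages and the vote leads quoted above are consistent with each other. The short sketch below redoes the arithmetic with the rounded ballot counts from the post; the `lead` helper is just an illustrative name.

```python
def lead(total_ballots: int, pct_a: float, pct_b: float) -> float:
    """Vote lead of A over B, given their two-candidate percentage shares."""
    return total_ballots * (pct_a - pct_b) / 100.0


counted, remaining = 500_000, 200_000   # approximate figures from the post

lead_now = lead(counted, 51.6, 48.4)                 # current lead
lead_final = lead(counted + remaining, 51.8, 48.2)   # predicted final lead

# Margin Obama would need among the remaining ballots to get from one to the other.
margin_remaining = 100.0 * (lead_final - lead_now) / remaining

print(round(lead_now), round(lead_final), round(margin_remaining, 1))
```

So the prediction amounts to Obama winning the uncounted ballots by a somewhat bigger margin than the counted ones, which fits with King County having more ballots left to count.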

This is not a trivial or expected result. For example, it could have been that the strong Obama counties had finished counting their ballots while most of the remaining pro-Hillary counties still had a lot of counting to do, and the close results could have been swapped by the time the counting was finished. That’s why the media folks have not predicted the results yet. Another possible bias is that this election had many absentees (postmarked, not received, by Tues, Feb 19), and the later absentees could have a different preference than the early absentees.

January 30, 2008

I’ve assumed that Edwards and Obama are drawing from the same pool of voters, and that Edwards has been drawing votes away from Obama. But, maybe Edwards and Clinton are both fighting over the traditional Dems and Obama is getting a lot of Independent support? Or, in South Carolina, even though Obama had a great day, one observation from the exit polling was that if you only looked at white voters, then Edwards would have won, with an estimated 40% of the white vote (Clinton got ~36% and Obama got ~24%). So, maybe Edwards’ voters are uncomfortable voting for Obama? Will they all vote for Hillary?

Well, we can take a look at Exit Polls to get some clues. Unfortunately, they never ask: “If you didn’t vote for your #1 candidate, whom would you vote for?” However, in New Hampshire, South Carolina and Florida, there are some tangential questions in the exit polls which relate to how they feel about the other candidates.

But, it is actually kind of hard to read the exit poll results. You have to know which rows and columns add up to 100% (or close) and which ones are a breakdown of the other. For example, here is one table from the South Carolina exit poll (which combines two questions), asking about the gender and race of the voter:

| | % Total | Clinton | Edwards | Kucinich | Obama |
|---|---|---|---|---|---|
| Black male | 20 | 17 | 3 | 0 | 80 |
| Black female | 35 | 20 | 2 | 0 | 78 |
| White male | 19 | 29 | 44 | 0 | 27 |
| White female | 27 | 42 | 35 | 0 | 22 |

The proper way to convert that data table into a sentence or paragraph format is like this: Of the voters in the SC primary, 20% were black males, 35% were black females, 19% were white males and 27% were white females. Of the white males (for example), 44% voted for Edwards, 29% voted for Clinton and 27% voted for Obama; on the other hand, of the white females, 42% voted for Clinton and only 35% and 22% voted for Edwards and Obama, respectively.

Easy, right? The rows (excluding the first column) each add up to 100%. That is why we can write the sentences we wrote above: each row is a breakdown of that particular category.

However, sometimes we don’t want to phrase our explanation the same way that we did above. Sometimes we want to know which answers were chosen by supporters of a particular candidate. But, none of the candidate columns add up to anything useful! To demonstrate this issue, it is useful to look at a table where the rows (answers) are not evenly distributed.

Here is another exit poll question from SC: “Do you think this country is ready to elect a black president?”

| | % Total | Clinton | Edwards | Kucinich | Obama |
|---|---|---|---|---|---|
| Ready | 77 | 21 | 16 | 0 | 63 |
| Not ready | 22 | 48 | 29 | 0 | 23 |

So, 77% of voters think the country is ready to elect a black president and 22% of voters think we’re not ready. Looking at the table, you might be tempted to conclude that more of Edwards’ voters think the country is not ready (29 is higher than 16, right?). Wrong. The 29 means that 29% of the people who answered “not ready” voted for Edwards. Since only 22% of all voters gave that answer, the voters who both chose Edwards and think we are not ready make up 29% of 22%, or only about 6% of all the voters. And, we know from the overall results that 18% of all the voters chose Edwards, so only about a third of the Edwards voters (6 out of 18) think we are not ready, which means that MOST of the Edwards voters actually think we ARE ready for a Black President.
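The arithmetic in that paragraph can be written out as a tiny sketch. The function name is made up for illustration; the three inputs are the numbers quoted above (22% answered “not ready”, 29% of those chose Edwards, and Edwards got 18% overall).

```python
def share_of_candidate_voters(answer_pct: float,
                              candidate_pct_within_answer: float,
                              candidate_overall_pct: float) -> tuple[float, float]:
    """Return (% of ALL voters with this candidate+answer combination,
               % of the CANDIDATE'S voters who gave this answer)."""
    both = answer_pct * candidate_pct_within_answer / 100.0
    within_candidate = 100.0 * both / candidate_overall_pct
    return both, within_candidate


both, within_edwards = share_of_candidate_voters(22, 29, 18)
print(f"{both:.1f}% of all voters chose Edwards and said 'not ready'")
print(f"{within_edwards:.0f}% of Edwards voters said 'not ready'")
```

The 29 in the table and the share of Edwards voters saying “not ready” are two different quantities, which is why reading down a candidate column is misleading.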

To try and make these exit poll tables more intuitive, I like to convert them to a different format. Imagine that there are exactly 100 voters, and that each number in the table shows the total number of voters with that particular candidate and answer combination (to convert the table, just multiply the percentage in the candidate column by the percentage of people choosing a particular answer). So, after you convert the normal exit poll table to a 100-person (or 100%) table, it ends up like this:

| | Clinton | Edwards | Kucinich | Obama |
|---|---|---|---|---|
| USA Ready for a Black President | 16 | 12 | 0 | 49 |
| USA Not ready for a Black President | 11 | 6 | 0 | 5 |

Now this can be understood fairly easily. The 100 voters (or 100%) are distributed among all the boxes in the table. The rows add up to the total percentage choosing that answer in the poll and the columns add up to the total percentage that each candidate received. This table would be read as follows: Of all the voters in the primary, 12% voted for Edwards and also thought the US is ready for a Black President, and 6% of all voters chose Edwards but thought the US is NOT ready for a Black President (or 2/3 of the Edwards voters thought we are ready for a Black President).

(There is another way to convert these tables, which is to have each COLUMN add up to 100%, but then the rows are hard to interpret. I like this format shown here, because you can intuitively understand the rows or the columns – even though you might have to do some extra calculations in your head to get some summary percentages.)
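Here is a minimal sketch of both conversions, using the South Carolina “ready to elect a black president” numbers from the published exit poll. The data layout and function names are my own choices for illustration, not part of any exit poll format.

```python
# Exit poll as published: answer -> (% of all voters, {candidate: % within that answer})
exit_poll = {
    "Ready": (77, {"Clinton": 21, "Edwards": 16, "Kucinich": 0, "Obama": 63}),
    "Not ready": (22, {"Clinton": 48, "Edwards": 29, "Kucinich": 0, "Obama": 23}),
}


def to_100_person(poll):
    """Multiply each candidate percentage by the share of voters giving that answer."""
    return {
        answer: {cand: total * pct / 100 for cand, pct in row.items()}
        for answer, (total, row) in poll.items()
    }


def column_totals(table):
    """Sum each candidate's column of the 100-person table."""
    totals = {}
    for row in table.values():
        for cand, cell in row.items():
            totals[cand] = totals.get(cand, 0.0) + cell
    return totals


def to_column_percent(table):
    """The alternative format: rescale so each candidate's COLUMN adds up to 100%."""
    totals = column_totals(table)
    return {
        answer: {c: (100 * cell / totals[c] if totals[c] else 0.0)
                 for c, cell in row.items()}
        for answer, row in table.items()
    }


table = to_100_person(exit_poll)
for answer, row in table.items():
    print(answer, {c: round(v) for c, v in row.items()})
print(to_column_percent(table)["Ready"])
```

Candidates with no measurable support (Kucinich here) have a zero column total, so the column-percent version just reports 0 for them rather than dividing by zero.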

Here are some other exit poll results, shown in the 100-person converted format, which relate to how the voters feel about Clinton and Obama:

SOUTH CAROLINA:
Do you think this country is ready to elect a woman president?

| | Clinton | Edwards | Kucinich | Obama |
|---|---|---|---|---|
| Ready | 25 | 13 | 0 | 38 |
| Not ready | 2 | 6 | 0 | 15 |

Summary from South Carolina: A large majority (more than 2/3) of both Edwards and Obama voters think that the country is ready for a woman president. And, a majority of Edwards and Clinton voters also think the US is ready for a Black President (see above).

FLORIDA:
Do you think this country is ready to elect a black president?

| | Clinton | Edwards | Kucinich | Obama |
|---|---|---|---|---|
| Ready | 32 | 9 | 1 | 30 |
| Not ready | 16 | 6 | 0 | 3 |

Do you think this country is ready to elect a female president?

| | Clinton | Edwards | Kucinich | Obama |
|---|---|---|---|---|
| Ready | 46 | 8 | 0 | 26 |
| Not ready | 2 | 6 | 0 | 7 |

No matter how you voted today, how would you feel if Hillary Clinton wins the nomination:

| | Clinton | Edwards | Kucinich | Obama |
|---|---|---|---|---|
| Satisfied | 48.8 | 6.4 | 0.8 | 24.0 |
| Dissatisfied | 0.6 | 7.6 | 0.2 | 9.8 |

No matter how you voted today, how would you feel if Barack Obama wins the nomination:

| | Clinton | Edwards | Kucinich | Obama |
|---|---|---|---|---|
| Satisfied | 29.4 | 7.0 | 0.7 | 32.9 |
| Dissatisfied | 18.9 | 7.5 | 0.3 | 0.9 |

Summary of Florida: Edwards voters in Florida are not as willing to believe that the US is ready for a black or woman President as they were in SC, and they are about evenly split between satisfied and dissatisfied, whether it is Clinton or Obama who wins the nomination.

NEW HAMPSHIRE:

Is your opinion of Hillary Clinton:

| | Biden | Clinton | Dodd | Edwards | Gravel | Kucinich | Obama | Richardson |
|---|---|---|---|---|---|---|---|---|
| Favorable | 0.0 | 37.7 | 0.0 | 11.1 | 0.0 | 0.7 | 19.2 | 3.0 |
| Unfavorable | 0.3 | 0.3 | 0.0 | 6.5 | 0.0 | 0.5 | 16.0 | 1.5 |

Is your opinion of Barack Obama:

| | Biden | Clinton | Dodd | Edwards | Gravel | Kucinich | Obama | Richardson |
|---|---|---|---|---|---|---|---|---|
| Favorable | 0.0 | 26.0 | 0.0 | 15.1 | 0.0 | 1.7 | 35.3 | 4.2 |
| Unfavorable | 0.3 | 11.4 | 0.0 | 2.4 | 0.0 | 0.3 | 0.2 | 0.6 |

Summary of New Hampshire: 11% of NH voters voted for Edwards and have a favorable opinion of Hillary, while 6.5% voted for Edwards and have an unfavorable opinion of her (i.e. 37% of Edwards voters have an unfavorable opinion of Clinton). On the other hand, only 14% of Edwards voters have an unfavorable opinion of Obama.

Overall conclusion? I think that Edwards’ supporters are somewhat evenly split between Clinton and Obama, but I think Obama will gain slightly more from Edwards dropping out of the race than Clinton will, mainly because the New Hampshire Edwards voters had a noticeably more unfavorable view of Clinton than of Obama.
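To see the Edwards voters side by side, here is a small sketch that takes the Edwards column from each of the 100-person tables in this post and works out what fraction of his voters gave the positive answer. The list layout and the function name are just for illustration.

```python
# (question, positive answer, negative answer) for the Edwards column,
# copied from the 100-person tables in this post (percent of all voters).
EDWARDS_CELLS = [
    ("SC: ready for a black president", 12, 6),
    ("SC: ready for a woman president", 13, 6),
    ("FL: ready for a black president", 9, 6),
    ("FL: ready for a female president", 8, 6),
    ("FL: satisfied if Clinton wins", 6.4, 7.6),
    ("FL: satisfied if Obama wins", 7.0, 7.5),
    ("NH: favorable opinion of Clinton", 11.1, 6.5),
    ("NH: favorable opinion of Obama", 15.1, 2.4),
]


def positive_share(positive: float, negative: float) -> float:
    """Percent of a candidate's voters (among those answering) giving the positive answer."""
    return 100 * positive / (positive + negative)


for question, pos, neg in EDWARDS_CELLS:
    print(f"{question}: {positive_share(pos, neg):.0f}%")
```

This ignores anyone who skipped the question, so the shares are only approximate, but the New Hampshire gap between the Clinton and Obama rows is the largest difference in the list.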

January 29, 2008

Wow! The Florida SOS has set up a really great website to report the results for today’s primary. Interestingly, of their 67 counties, 14 use touch-screen voting machines. Also, in addition to absentee and polling place voting, they have early-voting where you can go to a nearby govt office and cast your vote up to 2 weeks before today.

They even have a way to just download the data in a tab-delimited file (see link on left hand side of their page)! Luckily, they don’t care if you fill out the form with a GET or POST, so here is a direct link (you can use wget, too):

I made a script to calculate the predicted totals, based on differing voting patterns in different counties.

One thing I always like to do here at home in Washington State is get the early returns and use the differences in county voting patterns to extrapolate the totals for the whole state (based on estimates of turnout). This is usually interesting in WA because we have a large chunk of people who vote absentee, but the law here only requires the ballots to be MAILED by election day, so we end up waiting a few days to see how the election ends up. In a close race with lots of absentees, we have a lot of elections dragging out for days.

In Florida, there are also a lot of absentee ballots, but they are required to have them ARRIVE by election day (I think). So, perhaps the first reports on the Florida website will have all the absentees in one batch. This makes it easier to see if there are differences in voting patterns between the absentees and the poll voters.

January 27, 2008

Super Tuesday is coming up. Given the differing support that the candidates received depending on the voters’ race (in South Carolina in particular), I thought it would be interesting to see the racial makeup of the states in Super Tuesday. For states where exit polls exist for the 2004 Democratic primary, I reported those values. I also went to the Census to find the fraction of the population which is African American in all the upcoming Super Tuesday states.

On another note, it looks like Latino voters are not too fond of Obama. In Nevada (the only state so far with appreciable numbers of Latinos), he got his lowest support from them. This could be bad for him in those states with high numbers of Latinos (CA, AZ, NY). An interesting article from the Washington Post goes into more details about the importance of the Latino vote in California.

This leads to an estimate of 529,771 total voters (for the top 3), of which 291,374 were cast by African Americans and 238,397 were cast by whites (and others).

If we assume the total number of white voters is constant and reduce the black turnout (but keep the same candidate distribution), then we can estimate what the results would have been had the black voter turnout been much lower.

In fact, the turnout could have been as low as 18%, and Obama would have still won!

It would be sad to see the Clintons or the media spin Obama’s good showing in South Carolina as solely due to the very strong African American turnout or the unique demographics in SC. Even in states with an average number of African Americans (and the same candidate breakdown*), Obama would still have won. Given that a vast majority of African Americans vote for Democrats, about half of US voters are Democrats, and overall about 12% of the population is African American, 18%-20% is a decent estimate for the fraction of voters in a typical Democratic primary who are African American.

*(Note: this assumption that there will be the same candidate breakdown in future states is of course not true, since Edwards’ support will not likely get any better than it was in his home state of South Carolina. The big question is: does Edwards take away votes from Clinton or Obama?)

Interestingly, Clinton was the most race-neutral candidate. She had steady support (20% of black voters and 36% of white voters), while Edwards had very strong white support (40% of white voters) and almost no black support (2% of black voters). Thus, if the black turnout was only 18%, then it would have been a virtual dead heat with all candidates getting about 33% (but Obama getting slightly more). Also, Clinton would not have won at any of the black-turnout levels.
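The turnout scenario can be sketched in a few lines, using the within-group percentages quoted in these posts. One number is an assumption: Obama’s 78% among black voters is simply what is left after Clinton’s 20% and Edwards’ 2%. The function names are made up for illustration.

```python
# Candidate support within each group (percent), as quoted in the posts.
# ASSUMPTION: Obama's 78% among black voters is the remainder after
# Clinton's 20% and Edwards' 2%.
BLACK = {"Clinton": 20, "Edwards": 2, "Obama": 78}
WHITE = {"Clinton": 36, "Edwards": 40, "Obama": 24}
WHITE_VOTES = 238_397   # estimated white (and other) voters, held constant


def result_at_black_share(black_share_pct: float) -> dict[str, float]:
    """Statewide percentages if black voters were `black_share_pct` of the electorate."""
    b = black_share_pct / 100
    return {c: b * BLACK[c] + (1 - b) * WHITE[c] for c in BLACK}


def black_votes_for_share(black_share_pct: float, white_votes: int = WHITE_VOTES) -> float:
    """Number of black voters that gives this share, with white turnout fixed."""
    b = black_share_pct / 100
    return white_votes * b / (1 - b)


for share in (55, 47, 18):
    result = result_at_black_share(share)
    print(share, {c: round(v, 1) for c, v in result.items()},
          round(black_votes_for_share(share)))
```

With these inputs, the 55% case reproduces the estimate of roughly 291,374 black voters, and the 18% case comes out as a near three-way tie with Obama slightly ahead.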

In the 2004 Democratic primary in South Carolina, 47% of the voters were black, and in this primary, 55% of the voters were black. So, even with average black turnout in South Carolina, the vote would have still favored Obama by a very large margin.

January 20, 2008

Some professional statisticians have taken a look at this data and determined that there are no significant correlations between the counting method and Clinton winning, when taking other variables into account. These other variables include location, past voting, affluence, and others.

January 14, 2008

The observation that the Diebold optical vote scanners may have a bias against Obama (and/or in favor of Clinton) is disturbing. But, before claiming fraud, we need to take a more careful look at the data. Perhaps the townships which use the scanners are generally larger – and the larger townships tend to like Clinton better? Or maybe the towns with Diebold machines are more conservative/liberal and vote differently for Clinton? Or perhaps there are other socio-economic factors which may correlate with the use of the Diebold machines? Many of these reasons have been proposed, but none seem to negate the vote counting effect.

One fairly obvious variable that has not been checked is location. The experts seem to think that the results make sense based on what they know about the geography and locations of the towns within New Hampshire. But, while I agree that the experts and pundits know that different parts of the state vote for different candidates, no one seems to care about the distribution of the actual vote counting methods within the state (which is the main issue).

Figure legend: On the left is a map showing where hand counting and machine counting are used, and on the right is a map showing where the small, medium and big townships are located, along with the locations of hand and machine counting for medium-sized towns (500-800 Democratic votes). (I updated this map on Jan 17th with better data.)

My first observation from looking at the left map is: “That explains it! All the towns which use Diebold machines are in the southeast of the state! If it was a patchwork of man vs. machine, then fraud would be more likely, but now I think that this is all just a location effect.”

But, then I made the map on the right and noted that the larger towns are also in the SE of the state, which agrees with the previously observed strong correlation between the size of the township and the vote counting method.

This means that when we look at just the medium-sized towns (for which the Diebold pro-Clinton bias still exists), the map is now a patchwork! Thus, these mid-sized towns do not seem to be grouped geographically by their vote counting method, and the Diebold bias still exists in that set of towns. So, maybe there really is some fraud there?
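The kind of comparison described here (look only at towns in one size band, then compare Clinton’s share by counting method) could be sketched as follows. This is not the script used for the analysis, the field names are hypothetical, and the four example towns are MADE-UP numbers that only show the mechanics; they are not New Hampshire results.

```python
from collections import defaultdict


def clinton_share_by_method(towns, min_votes=500, max_votes=800):
    """Pooled Clinton share of the Clinton+Obama vote, per counting method,
    using only towns whose Democratic vote total falls in the size band."""
    sums = defaultdict(lambda: [0, 0])   # method -> [clinton votes, obama votes]
    for town in towns:
        if min_votes <= town["dem_votes"] <= max_votes:
            sums[town["method"]][0] += town["clinton"]
            sums[town["method"]][1] += town["obama"]
    return {m: 100 * c / (c + o) for m, (c, o) in sums.items() if c + o > 0}


# MADE-UP example rows (hypothetical towns, not real data).
toy_towns = [
    {"name": "Town A", "dem_votes": 600, "method": "hand", "clinton": 200, "obama": 250},
    {"name": "Town B", "dem_votes": 700, "method": "machine", "clinton": 280, "obama": 240},
    {"name": "Town C", "dem_votes": 3000, "method": "machine", "clinton": 1200, "obama": 1000},
    {"name": "Town D", "dem_votes": 550, "method": "hand", "clinton": 180, "obama": 220},
]
print(clinton_share_by_method(toy_towns))   # Town C is outside the 500-800 band
```

Restricting to one size band is a crude way of holding town size fixed; a difference that survives it still could come from other variables, which is why a recount of just those towns would be informative.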

As you have probably heard, the NH Sec. of State will be doing a manual recount if Kucinich and Albert Howard can pay for the estimated cost. I don’t know if this is allowed by the NH-SOS, but perhaps it would be more affordable if only the mid-sized towns were counted – since that is really the only place where this potential bias is reliably detectable. The small towns and big towns are already biased in terms of their favorite candidate and biased in terms of the vote counting technique, but the medium towns seem to be missing those confounding variables.

(it has been updated a few times, the earliest version had errors based upon my source. I wrote a script to parse the data from http://checkthevotes.com/. A few days ago, when that data was hosted at http://ronrox.com, this page was much simpler to parse. Now it is a bit of a mess. The reason I chose this site is because it had all the results and counting methods on one page. He grabbed his data from politico.com).

Because of discrepancies between the checkthevotes and the NH SOS website (nearly all the towns have +/- 5 vote count differences and there are/were 19 incorrect hand/Diebold misassignments), I wrote a script to parse the ugly HTML files containing the official results. I added a new worksheet to my Google spreadsheet above, and also made the output available on my box.net account (so far, just the Dems). I will post the script at some point if you want to check it.