Recent blog entries by JoeNotCharles

We have a set of ballots containing the year end Best Albums of 2010 lists from 9 publications. Yesterday I listed 12 possible ways to count these ballots to come up with a unified list.

Today I’ll go through the ways that don’t work for this data.

First Past the Post

I’ve said that First Past the Post is a terrible voting system. Now, let me demonstrate:

First we count up the top vote from every ballot. (Easy to do by hand, since there are only 9 ballots.) The winner is the #1 album of 2010. Then we drop that winner and count up the top votes that remain. The winner is the #2 album of 2010. Repeat until we’ve ranked all the albums.
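For anyone who would rather not count by hand, the procedure above can be sketched in a few lines of Python. This is an illustration only, not the code used for these results: the function name is made up, each ballot is assumed to be a plain list ordered from best to worst, and ties are assumed to be broken alphabetically.

```python
from collections import Counter

def fptp_ranking(ballots):
    """Rank candidates by repeated first-past-the-post rounds.

    ballots: list of lists, each ordered from most to least preferred.
    Assumption: ties are broken alphabetically by candidate name.
    """
    remaining = {c for ballot in ballots for c in ballot}
    ranking = []
    while remaining:
        tops = Counter()
        for ballot in ballots:
            # The ballot's vote goes to its highest-ranked surviving candidate.
            top = next((c for c in ballot if c in remaining), None)
            if top is not None:
                tops[top] += 1
        # Most top votes wins the round.
        winner = min(tops, key=lambda c: (-tops[c], c))
        ranking.append(winner)
        remaining.discard(winner)
    return ranking
```

With only 9 ballots, most rounds are decided by one or two votes, so the tiebreak rule ends up doing a lot of the work.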

So Kanye West has the #1 album of 2010, by this count. That doesn’t make any sense – 4 publications ranked him #1, but 3 others didn’t like his album enough to rank it at all! Compare Arcade Fire, whose album was liked enough to put in the top 25 by everyone: on the 4 lists where Kanye is #1, Arcade Fire is #2, #3, #4 and #11, but on the 5 other lists, Arcade Fire beats Kanye West hands down. So over half of our voters greatly prefer Arcade Fire to Kanye West, and most of the remaining voters only prefer Kanye by a tiny amount. Only Pitchfork (Kanye West at #1, Arcade Fire at #11) would be greatly dissatisfied by putting Arcade Fire ahead of Kanye West, while Mojo, Q, and NME would be extremely dissatisfied to put Kanye ahead of Arcade Fire. (And NPR, with Arcade Fire at #1 and Kanye West at #10, is a mirror of Pitchfork – call that greatly dissatisfied. And while Rough Trade wouldn’t be happy with either result, having ranked Arcade Fire way down at #21, they’d surely be even more pissed off if the win went to Kanye West, who they didn’t rank at all.)

More concisely: Arcade Fire is the Condorcet Winner. Kanye West should not win. So we can discard this result already.

Just for fun, let’s remove Kanye and see how the top 3 ends up. Votes for #2 are:

#3 is The Black Keys, by 1 vote. But we were perilously close to a 9-way tie for 3rd. Which illustrates the weirdness of our data set: in most elections, there are a handful of candidates and hundreds of voters. We have hundreds of candidates and only a handful of voters. Some voting methods will produce a lot of ties, just because there are so few votes to go around that everyone will get one. These might be perfectly good voting methods for most elections; they just fall down on this edge case.

It looks like First Past the Post will be vulnerable to ties – if one list had ranked The Black Keys a little lower, we’d have one here. But it doesn’t matter, since we’ve already rejected it for failing to elect the Condorcet Winner. FAIL.

Approval Voting

Here’s one of those ties now. In approval voting, every album which appears on a list at all gets 1 point, and the album with the most points is #1. (Second most points is #2, etc.)

So we start off with a 2-way tie for first, followed by a 3 way tie for third:

This is because we defined “approval” as “anywhere in the year end list”. In an actual approval vote, the voters would know beforehand how the votes would be counted and probably be more selective in who they vote for. These lists are not really saying “any of these 25 or 50 albums would be ok by us as Album of the Year”. Really, they would probably vote for their top 3 or so (or some would vote for their top 3, some for their top 5, some would vote only for their favourite, etc.) If everyone approved only their top 3, we’d get:

So now we have Arcade Fire at #1, Kanye West at #2 (both seem reasonable) and a 4-way tie for #3. And a 9-way tie for #7. And no way to rank anything after that.
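Both versions of the count (approve everything on the list, or approve only the top few) fit in one small Python sketch. The function names and the `top_n` cutoff parameter are invented for illustration, and each ballot is assumed to be a list ordered from best to worst.

```python
from collections import Counter

def approval_counts(ballots, top_n=None):
    """Give one point per ballot to each approved candidate.

    top_n=None approves everything on the list; top_n=3 approves
    only each ballot's first three entries.
    """
    counts = Counter()
    for ballot in ballots:
        approved = ballot if top_n is None else ballot[:top_n]
        counts.update(set(approved))
    return counts

def group_ties(counts):
    """Group candidates by score, highest first, so ties are easy to see."""
    by_score = {}
    for cand, score in counts.items():
        by_score.setdefault(score, []).append(cand)
    return [(score, sorted(cands))
            for score, cands in sorted(by_score.items(), reverse=True)]
```

Each entry returned by `group_ties` is one rank; any entry holding more than one name is a tie.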

In a real election, Approval Voting is better than First Past the Post because it has less need for tactical voting – for instance, take Rough Trade. The top of its Best Of list looks nothing like anyone else’s. If Rough Trade were a voter trying to actually influence an election, they would know (based on polls and publicity) that voting for Caribou, or Gil Scott-Heron, or These New Puritans, was useless – they have no hope of winning. They might even have picked up enough from the media to know that it’s shaping into a contest between Arcade Fire (who they rank 21) and Kanye West (who they hate – they didn’t even rank him). So they might be tempted to hold their nose and vote for Arcade Fire just to make sure Kanye West doesn’t win. It would be a tough choice, though, because what if their preferred candidates have a lot of underground support that isn’t getting media attention? With Approval Voting, Rough Trade could vote for their top 3 or 5 (or however many they wish) to show their support, plus throw in a vote for Arcade Fire just to make sure they have at least one vote that isn’t wasted. (You could say that this is still voting tactically, but Approval Voting at least gives more and better options for tactical voting.)

However, Approval Voting isn’t guaranteed to elect the Condorcet Winner – it depends entirely on how the voters choose to define “approval”. The various preferential ballot methods are clearly better at selecting the correct winner, because they let each voter give more information about their preferences. To balance this, Approval Voting is much easier to explain and count – you don’t even need a computer to count the ballots! So for a general election it may be a fair choice.

Regardless, it doesn’t work for our purposes, due to the number of ties we get when there are so many more candidates than there are ballots (which wouldn’t be a problem in a real election). FAIL.

Smith/Minmax

This one does need a computer to count. Using the ballots.txt file we generated in Part 3, we generate results with:

voteengine.py -m s//minmax sminmax-data.txt

This will think for a minute or two and then spit out “sminmax-data.txt”, a file containing a bunch of data about how it counted the votes, ending with the final results, in a line in the same format as the ballot:

Those are the candidate numbers of each album. To get a human-readable list out of that, we need to look up the name of each candidate. Remember that when we generated ballots.txt, we also saved the candidate names to candidates.txt – the name of candidate 1 is on line 1, candidate 2 is on line 2, etc. So we can write another simple Python script that reads candidates.txt and stores a map of candidate number to candidate name, and then reads the last line of sminmax-data.txt and looks up each candidate name.
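A minimal sketch of such a lookup script follows. It assumes the result line uses the same “number > number” layout as the ballots, which is how the output is described above; the function names are hypothetical.

```python
def load_candidate_names(lines):
    """Map candidate number (the 1-based line number) to candidate name."""
    return {number: line.strip() for number, line in enumerate(lines, start=1)}

def names_from_result(result_line, names):
    """Translate a result line such as '3 > 1 > 2' into a list of names."""
    # Treat '=' like '>' so tied candidates are still listed in file order.
    tokens = result_line.replace("=", ">").split(">")
    return [names[int(tok)] for tok in tokens if tok.strip()]
```

To use it on real files, pass the open candidates.txt file to `load_candidate_names` and the last line of the results file to `names_from_result`.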

Broken Bells was ranked 5th by NPR and 11th by Rough Trade. And that’s it. There’s no way they should be anywhere near the top 3.

John Grant was ranked 1st by Mojo – so he’s got that going for him – and 6th by Q. And that’s it. Again, no way he should be ahead of Arcade Fire and Kanye West.

After that it starts spitting out albums in alphabetical order. Remember that in ballots.txt we specified that we’d break ties alphabetically. So this indicates that all the remaining albums are tied for 3rd – or tied for last, depending how you look at it. That’s not useful at all.

This looks to me like VoteEngine’s s//minmax algorithm is buggy, because these results are just too weird to explain any other way. But life’s too short to debug it when there are 9 other algorithms to test out. FAIL.

Tomorrow, I’ll start going through algorithms that work fairly well, and start looking for the best.

Part 3. (Ha ha. That post starts with the correction that I’m using VoteEngine and not pyvote – but I didn’t actually fix the post title! And nobody wrote to correct me. Shows how much attention you’re paying.)

Whew. Going back to work did a number on the amount of thinking and typing I feel like doing in my off hours. Sorry for the delay.

In the first three posts, I showed how to turn a list of data into a ballot file for VoteEngine. Now I’ll run that ballot file through a bunch of voting systems and see what comes out. Here are the twelve voting methods I’ll be using. Ten of them just happen to be the methods supported by VoteEngine, and two of them are popular methods that I can count (and dismiss) just by looking at the list.

Before I list the methods, though, I need to explain an important concept in voting: the Condorcet Criterion. The Condorcet Criterion was invented by the Marquis de Condorcet in the 18th century. It’s an attempt to answer the question, “Does this voting system always give a reasonable winner?”

The Condorcet Winner is the one candidate that would beat all other candidates in head-to-head races. That is, take a pair of candidates (for example, Arcade Fire and LCD Soundsystem). Look at all the ballots, comparing only those two candidates. (Looking back at the original lists, Mojo has Arcade Fire in 2nd and LCD Soundsystem in 36th, so Arcade Fire wins on this ballot – the fact that there was somebody else in 1st is irrelevant at the moment, because we’re only looking at a contest between those two candidates. Q has Arcade Fire in 1st and LCD Soundsystem in 36th, so again Arcade Fire wins. In total, 8 of our 9 ballots have Arcade Fire ahead of LCD Soundsystem, and only 1 – Pitchfork – has LCD Soundsystem ahead. So, overall, Arcade Fire beats LCD Soundsystem.) Now repeat that for every possible pair of candidates. (Arcade Fire vs. Beach House, Arcade Fire vs. Vampire Weekend, Beach House vs. Vampire Weekend, etc.) When you’re finished you’ll have a list of exactly which candidates win, lose and tie against which other candidates in head-to-head (or “pairwise”) elections.
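The pairwise count is easy to sketch in Python. The helper names below are made up, each ballot is assumed to be a list ordered from best to worst, and anything a ballot leaves off is treated as tied for last place.

```python
from itertools import combinations

def prefers(ballot, a, b):
    """Return whichever of a and b this ballot ranks higher, or None for a tie.

    Unlisted candidates are treated as tied for last place.
    """
    position = {cand: i for i, cand in enumerate(ballot)}
    rank_a = position.get(a, len(ballot))
    rank_b = position.get(b, len(ballot))
    if rank_a == rank_b:
        return None
    return a if rank_a < rank_b else b

def pairwise_tally(ballots):
    """Map each pair (a, b) to (ballots preferring a, ballots preferring b)."""
    candidates = sorted({c for ballot in ballots for c in ballot})
    tally = {}
    for a, b in combinations(candidates, 2):
        wins_a = sum(prefers(ballot, a, b) == a for ballot in ballots)
        wins_b = sum(prefers(ballot, a, b) == b for ballot in ballots)
        tally[(a, b)] = (wins_a, wins_b)
    return tally
```

With a couple of hundred candidates that is tens of thousands of pairs, which is why nobody does this step by hand.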

Now, if there’s one candidate that beats all other candidates, that’s the Condorcet Winner. It seems reasonable that this should be the overall winner – after all, if a voting system says that LCD Soundsystem is the overall winner, you have to justify why it beats Arcade Fire overall when 8 out of 9 voters preferred Arcade Fire! So the Condorcet Criterion says, “If a Condorcet Winner exists, does this voting method guarantee that they will be chosen?” (There isn’t always a Condorcet Winner. There could easily be a bunch of candidates all tied for first – they each beat every candidate outside the bunch, but none of them beats all the others in it. The Smith Set is the smallest group of candidates that are tied in this way – again, it seems reasonable that one of them should be the overall winner, but it’s not always clear which. You could also look at the Condorcet Winner as being a Smith Set with only one candidate in it.) The Condorcet Winner and Smith Set are good ways to precisely define the common sense notion of “the candidates whose winning wouldn’t be ridiculous”.
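Here is a small Python sketch of the Condorcet Winner check. It is illustrative only: the function names are invented, ballots are lists ordered from best to worst, and unlisted candidates count as tied for last.

```python
def beats(ballots, a, b):
    """True if more ballots rank a above b than rank b above a."""
    a_over_b = b_over_a = 0
    for ballot in ballots:
        position = {cand: i for i, cand in enumerate(ballot)}
        rank_a = position.get(a, len(ballot))
        rank_b = position.get(b, len(ballot))
        if rank_a < rank_b:
            a_over_b += 1
        elif rank_b < rank_a:
            b_over_a += 1
    return a_over_b > b_over_a

def condorcet_winner(ballots):
    """Return the candidate who beats every other candidate, or None."""
    candidates = sorted({c for ballot in ballots for c in ballot})
    for a in candidates:
        if all(beats(ballots, a, b) for b in candidates if b != a):
            return a
    return None
```

A `None` result is the case where there is no Condorcet Winner and a method has to fall back on the Smith Set or some other tiebreak.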

Of course, the Condorcet Criterion isn’t the last word in evaluating a voting system. Even if a method passes the Condorcet Criterion, there are a lot of other things it could do poorly.

Officially, a Condorcet Method is a method of counting votes that passes the Condorcet Criterion. A method that doesn’t pass it is not a very good method, because it will sometimes elect the wrong person – a person who doesn’t make any sense as the winner. (Some people disagree that this is all that important – maybe it could theoretically happen, but doesn’t happen often in practice, and the system has some other property, like simplicity, that makes it attractive. I’ve even seen people argue that the Condorcet Winner tends to be a middle-of-the-road candidate who is many people’s second choice but few people’s first choice, and they would prefer to elect a candidate that is loved by a faction even if they’re hated by another faction. I – disagree – with this opinion.)

I will be using Condorcet Method more specifically, though: there are a lot of voting methods that are defined as, “Count up all the head-to-head elections between every possible pair of candidates, as defined above, to find the Condorcet Winner and, if it doesn’t exist, the Smith Set. If a Condorcet Winner exists, that’s the winner. Otherwise, use some tie breaking procedure to pick a member of the Smith Set to be the winner.” You don’t have to actually count the votes this way to get a method that satisfies the Condorcet Criterion, you just need a method that returns the same winner as you would get if you counted the votes this way. By the true definition, any method that satisfies the Condorcet Criterion is a Condorcet Method, but I’ll use the term only to describe methods that start out by finding the Condorcet Winner explicitly as described above. The only difference between all the different Condorcet Methods, by my definition, is what procedure they use to break ties if there’s no Condorcet Winner.

By the way, according to the 9 best-of lists we’re looking at, Arcade Fire is the Condorcet Winner. It’s the only band whose album was ranked above every other album on the majority of the lists. So we’d better hope that every voting system says it’s #1! It’s obvious just by looking at the lists that Arcade Fire had the most popular album of 2010 – it’s near the top of almost every list, and even on the Rough Trade and Pitchfork lists, it’s near the middle. The other obvious contender, Kanye West, is #1 on the lists that love him but doesn’t even appear on some lists. So the more interesting question is who else gets ranked at the top according to each system (and specifically, where does Kanye end up?)

With that out of the way, the ten voting systems supported by VoteEngine are (in alphabetical order):

The Borda count (borda) – give each candidate points according to their position on the ballot, from 0 for a last place finish, 1 for second last, etc. up to N-1 for a first place finish among N candidates. The winner has the most points.

Borda Elimination (aka Baldwin’s Method) (borda-elim) – Like the Borda Count, except that instead of returning the candidate with the most points immediately, you keep eliminating the candidate with the least points (and then recounting as if that candidate had never been on the ballot) until you’re left with one winner.

Copeland’s Method (copeland) – A Condorcet Method where ties are broken by the number of head-to-head victories minus the number of head-to-head defeats.

Instant Runoff Voting (irv) – Look at only the first place votes on the ballots. If one candidate has a majority of votes (not just “the highest number of votes”; they need over 50%) then they’re the winner. Otherwise, eliminate the candidate with the fewest first place votes, and then recount as if that candidate had never been on the ballot. Repeat until one candidate has a majority.

Minimax (minmax) – A Condorcet Method where ties are broken by choosing the candidate with the smallest margin of defeat. Actually, the candidate with the “minimum maximum” margin of defeat – hence minimax. (For example, Beach House loses to Arcade Fire 2 to 7. It also loses to LCD Soundsystem 4 to 5. Look at how a candidate compares with all other candidates and find its “maximum” margin of defeat – Beach House’s is probably that 2 to 7, but I haven’t looked at all of its contests by hand. Now the overall winner is the one whose maximum margin of defeat is smallest – the “minimum maximum”. Whew. And some people call this the simplest Condorcet method!)

Nanson’s Method (nanson) – Like the Borda Count, except that instead of returning the candidate with the most points immediately, you keep eliminating all the candidates with less than the average number of points (and then recounting as if those candidates had never been on the ballot) until you’re left with one winner.

Pairwise Elimination (pw_elim) – Like Minimax, except that instead of returning the candidate with the lowest maximum immediately, you keep eliminating the candidate with the highest maximum (and then recounting as if that candidate had never been on the ballot) until you’re left with one winner.

Ranked Pairs (aka Tideman’s Method) (rp) – A Condorcet Method where ties are broken by ranking all pairs of candidates by margin of victory, largest first, and then adding them each to a graph (in order), skipping any pair that would create a cycle in the graph. The final graph has no cycles, so there is a candidate at the top of it that no other candidate beats, and that candidate is the winner.

Schulze’s Method (schulze) – A Condorcet Method where ties are broken by – um – something to do with graphs again. Really, it’s complicated, which is unfortunate as it seems to give the best results. I’ll describe it when I discuss the results in detail.

Smith/Minimax (s//minmax) – Minimax has a problem, which is that if there’s no Condorcet winner, then the winner it returns isn’t guaranteed to be in the Smith Set. So, first get rid of all the candidates outside the Smith Set, then use Minimax to count the remainder.
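To make two of these definitions concrete, here is a rough Python sketch of the Borda count and of Minimax. It is not VoteEngine’s code. The function names are invented, ties are assumed to be broken alphabetically, and every ballot is assumed to rank every candidate with no ties (which is not true of the best-of lists, where most albums are missing from most ballots).

```python
def borda_scores(ballots):
    """Borda count: last place scores 0 and each place above it scores one more."""
    scores = {}
    for ballot in ballots:
        for place, cand in enumerate(ballot):
            scores[cand] = scores.get(cand, 0) + (len(ballot) - 1 - place)
    return scores

def minimax_winner(ballots):
    """Pick the candidate whose worst head-to-head margin of defeat is smallest."""
    candidates = sorted(set(ballots[0]))

    def votes_for(a, b):
        # Number of ballots that rank a above b.
        return sum(ballot.index(a) < ballot.index(b) for ballot in ballots)

    def worst_defeat(cand):
        # Largest margin by which any opponent beats cand (negative if nobody does).
        return max(votes_for(opp, cand) - votes_for(cand, opp)
                   for opp in candidates if opp != cand)

    return min(candidates, key=lambda c: (worst_defeat(c), c))
```

A Condorcet Winner never loses a pairing, so its worst “defeat” is negative and Minimax picks it whenever one exists (as long as no pairing is tied).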

The other two methods are:

First-Past-the-Post (aka Winner-Take-All) – Each voter can vote for exactly one candidate. The winner is the candidate who gets the most votes, even if that’s not a majority. (Even though we have ballots with complete preferences on them, we can count them using first-past-the-post by only looking at the #1 preference.) Pretty much a terrible voting system; the only thing it has to recommend it is that it’s simple to explain.

Approval Voting – Each voter can vote for as many candidates as they want, but all their votes have the same score. (That is, each voter either “approves” or “disapproves” of each candidate.) The winner is the candidate approved by the most voters. This has the advantage that it’s less restrictive than first-past-the-post, but at the same time it’s easier to explain and fill in a ballot than systems needing full preferential ballots. (Ballot instructions are basically, “Mark an X next to any candidate you find acceptable. You may choose as few or as many as you wish.”) For this sample, we can assume that any album that appears on a list would be “approved” by that list’s compilers.

Note that all these voting methods return a single winner. To get a complete ranked list, just take that winner out and count the votes again as if that candidate had never been on the ballot – whoever wins this time is in 2nd place. Then do it again to get the 3rd place winner, etc. (For the methods that already say, “Eliminate the candidate with the least number of votes and then recount,” the complete list is just the candidates in reverse order of elimination.) VoteEngine already does this for all the methods it supports above.
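The “take the winner out and count again” trick works with any single-winner method, so it can be written once as a wrapper. The sketch below is illustrative: `full_ranking` and `borda_winner` are made-up names, and the Borda winner is only a stand-in for whichever method is being wrapped.

```python
def borda_winner(ballots):
    """Stand-in single-winner method: highest Borda score, ties alphabetical."""
    scores = {}
    for ballot in ballots:
        for place, cand in enumerate(ballot):
            scores[cand] = scores.get(cand, 0) + (len(ballot) - 1 - place)
    return min(scores, key=lambda c: (-scores[c], c))

def full_ranking(ballots, pick_winner):
    """Build a ranked list by picking a winner, removing them, and recounting."""
    ballots = [list(ballot) for ballot in ballots]
    ranking = []
    while any(ballots):
        # Ballots that have run out of candidates no longer take part.
        winner = pick_winner([ballot for ballot in ballots if ballot])
        ranking.append(winner)
        ballots = [[c for c in ballot if c != winner] for ballot in ballots]
    return ranking
```

Any function that takes a list of ballots and returns one winner can be passed in as `pick_winner`.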

There are four other methods supported by VoteEngine, which I didn’t include, mostly because they only return one winner rather than a list. I could have written a wrapper to update the ballot file and run them again to find the list, but it was too much work. The other four methods are:

Bucklin’s Method (bucklin) – Count only 1st place votes. If one candidate has a majority, that’s the winner. Otherwise, add the second place votes. Repeat, adding lower placed votes each time, until one candidate with a majority is found. (In this test, Arcade Fire gets the majority after adding second place votes, and then VoteEngine stops counting.)

Condorcet/IRV (c//irv) – Return the Condorcet Winner if one exists, otherwise use IRV. In other words, a Condorcet Method using IRV to break ties. (In this test, it returns Arcade Fire, the Condorcet Winner, and then stops counting.) In theory this could return a winner outside the Smith Set, because if there’s no Condorcet Winner it throws away all the pairwise data it just counted up.

Smith/IRV (s//irv) – Get rid of all the candidates outside the Smith Set, then use IRV to find the winner. In other words, a Condorcet Method using IRV on the Smith Set only to break ties. (In this test it works the same as c//irv since there is a Condorcet Winner.)

UK Usenet (ukvt) – “apply the rules used by the uk.* usenet hierarchy”. I just ignored this one, because it’s not a standard voting method, so honestly who cares?
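Of the four methods just listed, Bucklin’s is the quickest to sketch. The Python below follows the description given above rather than VoteEngine’s implementation. The function name is invented, and if several candidates pass a majority in the same round it assumes the one with the most votes wins, with ties broken alphabetically.

```python
from collections import Counter

def bucklin_winner(ballots):
    """Add one more rank of votes per round until someone has a majority."""
    majority = len(ballots) / 2
    counts = Counter()
    depth = max(len(ballot) for ballot in ballots)
    for rank in range(depth):
        for ballot in ballots:
            if rank < len(ballot):
                counts[ballot[rank]] += 1
        leaders = [c for c, votes in counts.items() if votes > majority]
        if leaders:
            # Assumption: most votes wins the round, ties go alphabetically.
            return min(leaders, key=lambda c: (-counts[c], c))
    return None
```

Like VoteEngine’s version, this stops as soon as one winner is found, so it does not produce a full ranked list by itself.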

Tomorrow I’ll take a closer look at each of the twelve methods and talk about the results in detail.

First, a correction: for the last two posts, I’ve been linking to pyvote as the program to automatically count votes. Except I actually used VoteEngine. Natural mistake – they’re both Python programs used to count votes, and “pywhatever” is a common naming scheme for Python.

Short one this time. Last time, we turned 9 end-of-year best album lists into preferential ballots in a standard format. In order to count these ballots with VoteEngine, though, we need one more thing: as well as the file of ballots, VoteEngine needs a complete list of all candidates, which can either be passed on the command line with the “-cands” parameter or added to the ballot file itself. Our candidates are named with numbers counting up from 1, so this is easy for us to generate.

Another parameter that’s helpful is “-tie”, which takes a list of candidates in an order to use as tiebreakers. Whenever a voting system returns a tie between two candidates, the one that appears first in the -tie list is counted as the winner. I’m not actually sure what order is used if a candidate doesn’t appear in the tiebreaker list, but since we’re autogenerating the candidate list anyway it’s easy to always fill in a complete -tie list. We’ll break ties in alphabetical order based on song name.

Since we plan to count the same list of ballots over and over again with different voting methods, it will make things much easier to add these two parameters to the ballot file itself. This is a simple edit to the script we wrote last time. First, since “-cands” needs to come at the start of the file, we delay actually writing lines to the ballot file until after all ballots have been read. Then, after reading all input files and filling in the candidate map (which records candidate numbers mapped to song names), we write all the ballot lines plus two last lines: “-cands ” and “-tie ”.

Yesterday I linked to 9 lists of the Best Albums of 2010, from magazines, web sites, and one radio show, and promised to explain how we distilled those lists into one canonical Top 25. The first step is to turn each list into a preferential election ballot.

The standard First Past the Post voting system used in pretty much every political election in North America is… not very good. Its one virtue is that it’s simple to explain: you can vote for one, and only one, candidate, and the candidate with the most votes wins. The problem is that this often makes it really tough to decide who to vote for. To get your preferred result, you need to consider a whole host of things other than “how good this candidate is”. You need to vote tactically. The obvious example is when you don’t think your favourite candidate can win – do you vote for them anyway, or switch your vote to your second choice? (It’s a tough decision if your candidate has ALMOST enough support to be viable.) Another example is if you mainly want one candidate to LOSE – you still need to pick one of their opponents to vote for. There are more serious problems in an election like the Canadian Parliament or US Electoral College, where a bunch of individual elections each elect one winner, and then the team with the most winners gets the grand prize, but let’s consider only elections where everyone votes directly for a single winner, like a city mayor or state governor.

Every voter can, in theory, rank all the candidates in order of preference (although if they don’t have strong opinions or aren’t very informed, that ranking may just have their favored candidate in first and everyone else tied for last…) Judging whether a voting system is any good basically involves measuring how much a voter’s “true preferences” contribute to the election’s outcome. First past the post is a poor system because it forces voters to leave out most of the information about their preferences, and in fact encourages them to “lie” by listing a candidate who isn’t actually their first choice. A better system would give each voter a more complicated ballot in which they could list all their preferences. Say, by putting a 1 next to their favoured candidate, a 2 next to their second candidate, etc. Or with a computer touch screen which removes each candidate’s name as the voter touches it, and keeps track of the order the voter chose them in. Or, although this has obvious practical problems in a large election, by writing down each candidate’s name in order on a sheet of paper (which is exactly what we have with our 9 best-of-2010 lists!) There are many physical ballots we can imagine that could record a voter’s complete preferences. But in order to study or simulate a voting system, it’s helpful to have a standard notation. Whoever counts the votes – human or computer – can start by translating each physical ballot into this standard notation.

The notation commonly used is to give each candidate a symbol (such as the first letter of their name or party), and for each ballot, list the symbols for each candidate on one line, from the most liked on the left to the least liked on the right. The symbols are separated by “>” to show that the candidate to the left is preferred to the candidate on the right, or “=” to show that they’re tied. (And, if not all candidates are listed, all the unlisted candidates are assumed to all be tied for last place.)

So, with the Canadian political parties Conservative (C), Liberal (L), New Democrat (N), and Green (G) – assume we are not in Quebec so the Bloc Quebecois is unavailable – we have examples like the following:

The extreme right-winger would like the Conservatives to win, and at all costs wants the NDP and Green Party to lose – they’d vote “C > L > N = G”.

Or you may have an NDP supporter who thinks the Green Party is also a good left wing choice, but that the Liberals are no better than the Conservatives – “N > G > C = L”.

Or an NDP supporter who thinks that the Green Party are a bunch of upstarts who can’t be trusted – “N > C = L > G”.

Or any weird and wonderful combination of these.
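This notation is simple enough to parse in a couple of lines of Python. A minimal sketch, with made-up function names, that turns a ballot string into a list of “tiers” (groups of tied candidates, best first):

```python
def parse_ballot(text):
    """Parse 'C > L > N = G' into tiers: [['C'], ['L'], ['N', 'G']]."""
    return [[symbol.strip() for symbol in tier.split("=")]
            for tier in text.split(">")]

def complete_ballot(tiers, all_candidates):
    """Add any unlisted candidates as one final tier, tied for last place."""
    listed = {cand for tier in tiers for cand in tier}
    unlisted = sorted(set(all_candidates) - listed)
    return tiers + [unlisted] if unlisted else tiers
```

For example, `complete_ballot(parse_ballot("N > G"), "CLNG")` gives `[['N'], ['G'], ['C', 'L']]`: the two unlisted parties end up tied for last.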

So, we have 9 lists of songs, with over 200 songs between them. The first thing we need to do is get a symbol for each song, and make sure we can turn that symbol back into a name when it’s time to output the results. Save each list into a text file (with just the songs, one per line – no rank numbers). This may take some cutting and pasting. Then go through each list and make sure that each song is spelled EXACTLY the same each time it appears (the Unix commands sort and uniq may help with this).

Now we want to read all the text files and turn them into two data structures: a map of Candidate Number to song name, and a set of ballots in the above format. Since I’m using pyvote, a Python program to count votes, the natural way to do this conversion is to write another Python script.

The script will read each file given on the command line and write out two files. “candidates.txt” is the complete list of all candidate songs, in order of Candidate Number – given a number, the song name is on that line number of candidates.txt. “ballots.txt” is the list of ballots – so for our best-of lists, it will be 9 lines long. Since WordPress removes indentation in code blocks, which is fatal for Python code, I’ve put the script on pastebin:

It’s pretty simple – read each line of each file, generate a number for the song, and save the name and number in a map called “candidates”. For each file, write the numbers of each song as a string joined by “ > ” to “ballots.txt” – since none of these lists have ties, we don’t need to worry about “=”. Finally, sort the candidate map by number, and write each name in “candidates.txt”.
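In case the pastebin link ever goes away, here is a minimal sketch of the same idea. It is a reconstruction from the description above, not the original script. The function name is made up, and it assumes candidate numbers are handed out in order of first appearance.

```python
def lists_to_ballots(lists):
    """Convert ranked lists of names into ballot lines plus a candidate list.

    lists: one list of names per publication, best first.
    Returns (ballot_lines, candidate_names); candidate number k is
    candidate_names[k - 1].
    """
    candidates = {}  # name -> candidate number (assumed: order of first appearance)
    ballot_lines = []
    for names in lists:
        numbers = []
        for name in names:
            name = name.strip()
            if not name:
                continue  # skip blank lines
            if name not in candidates:
                candidates[name] = len(candidates) + 1
            numbers.append(str(candidates[name]))
        ballot_lines.append(" > ".join(numbers))
    ordered_names = sorted(candidates, key=candidates.get)
    return ballot_lines, ordered_names
```

Reading each text file with `readlines()` gives the input lists, and writing the two return values out one item per line gives ballots.txt and candidates.txt.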

Save the Python file as “convertLists2Ballots.py” and run it with “convertLists2Ballots.py <list of text files>”.

The candidates.txt file this spits out isn’t very interesting, but ballots.txt looks like this:

Incomprehensible to a human, but pyvote will be able to read this and then try lots of different voting systems on it. Tomorrow I’ll show how to make this happen.

(As an aside, instead of a bunch of text files, it’s pretty common to get data you want to turn into votes as a spreadsheet. To process this with Python, save your spreadsheet in CSV format – that’s “comma separated value”, a simple text format that can’t handle formatting or formulas – and then read it into Python using the csv module.)
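A short sketch of the CSV route, using only the standard csv module. The layout is an assumption made up for this example: one column per publication, with the publication’s name in the header row and its picks running down the column, best first. The function name is hypothetical too.

```python
import csv
import io

def rankings_from_csv(csv_text):
    """Read a one-column-per-voter CSV into {voter: [entries, best first]}.

    Blank cells are skipped, so shorter lists can share a sheet with longer ones.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    rankings = {voter: [] for voter in reader.fieldnames}
    for row in reader:
        for voter in reader.fieldnames:
            entry = row.get(voter)
            if entry and entry.strip():
                rankings[voter].append(entry.strip())
    return rankings
```

For a file on disk, pass the open file object to `csv.DictReader` directly instead of wrapping a string in `io.StringIO`.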

These lists actually have 25 to 100 entries on them, and they don’t all have the same entries or in the same order. The Rough Trade list doesn’t have a single one of our Top 10 in their top 10! (And our #3 song isn’t on their list at all!) So where’d we come up with these numbers?

Voting theory!

Treat each end-of-year list as a ballot in an election. In a standard election, several thousand or several million people all choose from a handful of candidates. Here we have just 9 ballots cast to choose between several hundred songs. So it’s the reverse of the elections that are commonly studied, but there’s no reason standard election tools wouldn’t work. And it’s an edge case that might reveal interesting properties of the methods used to count the votes.

I counted the votes using 10 separate methods (using vote-counting software – I’m not insane!) and then we picked the result that seemed to make the most sense. Not scientific at all, but nobody said this was a scientific study – it’s just for curiosity. We ended up choosing the results returned by the Schulze Method, the same method used by elections in a lot of open source software groups (including Debian and Gentoo).

Tomorrow I’ll explain how to turn a list into a ballot (and what that even means), and how to use pyvote to process the ballots. Then I’ll go through each of the 10 voting systems, describe them, and discuss the results. That’ll take a while… (A lot longer than it did to actually generate them!)