Saturday, April 30, 2011

One of the joys of writing code in Scala is how expressive it looks. Instead of dealing with the minutia of handling data, I can concentrate on what the code is actually doing. With experienced Scala programmers, that goes without saying. Newcomers have a harder time, because they are still a bit confused by grammar and vocabulary to pay proper attention to what is being said, so to speak.

The most striking example of that comes from the use of Scala's collection. There are many powerful collection libraries out there, but you are usually made very aware that you are handling a collection -- code to be hidden inside a class, to avoid contaminating business logic with it. Scala's collections, on the other hand, can easily fit in the highest abstraction levels of the code.

Let me take an example from the upcoming British referendum about the adoption (or not) of Alternative Vote. In this voting system, each voter ranks the candidates in order of preference. Depending on the specifics of its implementation, it may be possible or not to leave candidates unranked. The winner is the candidate that manages to get 50% of the votes, with the the candidate with the least votes being removed and his or her votes being reassigned according to the voter's preferences until some candidate gets elected.

So, let's consider how one could implement the algorithm that decides who the winner is. Let's say we have a set of candidates, identified by their names, and a list of votes. Each vote is a list of candidates ranked by preference. From that, we have to produce the name of the winner. In other words:

So, where do we begin? There are three essential tasks: we need to tally the votes for every candidate, we need to see if the candidate with most votes has at least 50% of all votes, and we need to discover who's the candidate the the least amount of votes.

Let's say we did these tasks, and now we know who the candidate with least and most votes are, and we have a tally of votes for all candidates. In this case, we can return the winner very easily, if there's one:

Which still leave us the problem of how to handle the (expected) usual case of no candidate reaching 50% on first tally. There's an easy solution for that, though: just remove the candidate with least votes from the pool of candidates, and try to get a winner out of that. It's easy, because we already have a function that does that:

else elect(candidates - leastVotesCandidate, votes)

Of the three tasks posed earlier, two are pretty simple as well: deciding who's the candidate with least and most votes. We could sort all candidates by vote and take the first and last candidate, but we don't actually need to sort anything: just knowing top and bottom is enough. We can do that like this:

Now all that's left is finding how many votes each candidate has. We could pick the first preference in each vote and do a tally on that, but some candidates may have been removed from the pool. Instead let's say the valid candidates of a vote are the ranked list of candidates for a vote that are still in the running. We can compute that for a vote by doing:

This doesn't read right, actually. A minor complain some people have about collection methods is that filter seems to do exactly the opposite of what's wanted: if you say filter X (those for which X is true), then it will keep X instead of discarding it. So, when we say "filter candidates", it will keep these candidates in the vote, and discard the rest.

The other non-obvious thing about this line is "candidates" itself. What does it mean to say "filter candidates"? Well, "filter" takes a function which, given a value of the collection, will return true or false depending on whether that value must be kept or discarded. That means "candidates" must be a function which, given the name of a candidate, returns true or false.

However, "candidates" is a set! We declared it so in the very first line of code presented, didn't we? Well, in Scala a set is also a function that tests whether a value is present in the set or not, returning true or false accordingly. In fact, sequences and maps are also functions in Scala, the former from indices to values, and the latter from keys to values.

Well, enough about that. We can now just take the first candidate in the list of valid candidates and tally the votes... as long as all candidates are ranked in each vote. If that is not the case, then a vote may well not contain any candidates still in the running, in which case the list of valid candidates will be empty.

Since this example comes from the British referendum, and the AV proposed in that referendum does not require votes to rank all candidates, we'll deal with that. Let's say the first valid candidate of a vote may be some candidate or no one. That is,

Finally, we have to do something about the possibility of no candidate reaching 50%, which is possible in a system where not all candidates are ranked. I don't know how the proposed system will do in that case, but I'll just choose the most voted candidate if there aren't more than two.

About Me

I'm an IT guy, a gamer and someone curious about my profession, politics, philosophy and anything that engages the mind. I like movies, books and cats. I'm a former FreeBSD committer, and a present (small) contributor to Scala.