This is a practical problem that arose in real life, and I believe it raises interesting mathematical questions.

There is a festival of small plays lasting 8 weeks. Each week 10 short plays are staged (as one show), and the audience votes for its favorite play among those 10. The next week we have a set of 10 new plays, the audience (different from last week's audience) votes again, and so on. Thus each week gives us a most popular play: 8 most popular plays in total. We want to select the 2 "most popular" of these 8 plays to advance to the Final. The quotes around "most popular" are there to show that it is inherently unfair to compare popular plays from different weeks, since each week has a different set of plays. But let's say we cannot do anything about this and still need a "fair" way to announce the 2 most popular of the weekly winners. One obvious thing we do is use percentages of votes instead of absolute vote numbers, since the audience size (hence the number of votes) varies from week to week. The two plays with the highest percentages among the 8 winners are named "most popular" and advance to the Final.

There is a complication, though, that makes the whole problem mathematically interesting. Not all weeks have exactly 10 plays. Some might have 8 or 9, some might have 11. It is unfair to compare percentages of popular votes from sets of different cardinality. If you do not see the unfairness immediately, consider an extreme case: week A has just two plays and week B has 10 plays. The winner of week A gets 60%, the winner of week B gets 20%. Is A's winner really that much more popular? Of course not. We need a way to adjust the percentages to a normative week of 10 plays.

The organisers of the festival have recognised this and apply the following formula, where $p$ is the percentage of received votes of the most popular play of one week, $\hat{p}$ the adjusted percentage, and $n$ the actual number of plays in that week.

$$\hat{p} = p \cdot \frac{n}{10}$$
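For concreteness, the official adjustment can be sketched in Python (the function name and the `normative` default of 10 are my own):

```python
def adjust(p, n, normative=10):
    """Officially adjusted percentage: scale the observed percentage p
    by the actual number of plays n relative to a normative week size."""
    return p * n / normative

# A winner with 20% in a 9-play week is adjusted down to 18%.
print(adjust(20.0, 9))   # 18.0

# The extreme case from above: a 60% winner of a 2-play week is
# adjusted to 12%, well below a 20% winner of a 10-play week.
print(adjust(60.0, 2))   # 12.0
```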

This formula might seem intuitive at some level, but on closer inspection it is really arbitrary. Let's take $n$ to be 9 and assume one play got 20%; the adjusted percentage is 18%. This can be interpreted in two equivalent ways:

1) The total number of votes stays the same, and 1/10 of each play's votes is taken away and given to an imaginary extra play, or

2) The votes of each play stay the same, and the extra imaginary play adds 1/9 of the total votes cast, so we end up with 10/9 of the actual total. In other words, the extra imaginary play got the average number of votes among the 9 plays.
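A quick numeric check (my own sketch, with made-up vote counts) that the two interpretations give the same adjusted percentage:

```python
# A 9-play week whose winner got 200 of 1000 votes, i.e. 20%.
total = 1000.0
winner = 200.0

# Interpretation 1: the total stays fixed; each play gives 1/10
# of its votes to the imaginary 10th play.
p1 = (winner - winner / 10) / total * 100

# Interpretation 2: each play keeps its votes; the imaginary play
# adds the average, total/9, so the new total is 10/9 of the old.
p2 = winner / (total + total / 9) * 100

print(p1, p2)  # both 18, up to floating-point rounding
```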

Makes some sense, but let's look closer.

For interpretation 1: why should we take the same percentage of votes from each play? This way we take more votes from the popular plays and fewer from the unpopular ones. Why not take the same number of votes from each play (not the same percentage)? Or even take more votes from the unpopular plays than from the popular ones? After all, isn't it probable that a popular play would retain its votes?

For interpretation 2: why should the added imaginary play contribute the average number of votes per play to the total? Why not the median? Why not some other estimate that gives the most probable number of votes given the distribution of votes we have seen so far?

Do the different choices matter, and is there a fairer way (maybe in the sense of maximum likelihood)? I give a partial answer below in the answers section.

I would also appreciate your help in finding a better title for the problem.

Another way to compare: use the ratio of the actual percentage to the average percentage a play can get. For example, both of these cases get a "grade" of 2: 1) a play among 5 plays getting 40%, 2) a play among 10 plays getting 20%. This is exactly equivalent to the "official" formula: your grade is the adjusted percentage multiplied by the number of normative plays (10 in the official formula). What is unfair about this? The grade's upper limit depends on the number of plays in a week. E.g. compare a 2-play week (95%, 5%) with a 10-play week (20%, 15%, 15%, 10%, ...): the play in the second case seems more popular by grade. Is it?
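A small sketch (mine) of the grade and of the upper-limit problem:

```python
def grade(p, n):
    """Ratio of a play's percentage p to the average percentage
    (100/n) a play can get in a week of n plays."""
    return p / (100 / n)

# Both of these cases get a grade of 2:
print(grade(40, 5), grade(20, 10))   # 2.0 2.0

# The unfairness: the grade is capped at n, so even a 95% runaway
# winner of a 2-play week scores below a 20% winner of a 10-play week.
print(grade(95, 2))                  # 1.9
```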
– Thanassis Jan 20 '12 at 23:15

3 Answers

Say you had eight batches of exactly ten plays. For the first five weeks, they had evenly split ballots - roughly 10% for each play, with one play winning by one vote.

Then in week 6, the winner has 12%, while the other 9 have around 9.77% apiece. In week 7, the same thing happens, except at 12.01%. In week 8, the winner has 13%, 2nd place has just under 12%, and the remaining plays have just over 9.375% apiece.

Who should the top two winners be, intuitively? I ask because that would give you clues on how exactly you want your rule to work.

If I label each of those four winners A through D, then different cases could be made for:

1) A and B should win since they both had the highest margins of weekly winners.

2) B and C should win since they had the highest percentages.

3) C and D should win since they had the highest margin over the plays they scored better than.

At any rate, I think you could basically argue that it is a poorly formed question (so far) just because it depends on a nebulous definition of fairness. If you specify more requirements of what a fair outcome would be in an ideal structure (ten plays every week) then it might be easier for you to come up with a consistent approach.

I'm assuming the festival's organizers would pick #2), B and C, even though it strikes me as unfair because a winner could be penalized simply for having been part of a good night at the theater.

If that's the case, then you need to do another round of soul-searching. Imagine eight rounds where you have clear results: a 12% winner and nine 2nd-place plays at 9.77% each. Take one of those 9.77% plays and move it to another night. What does that mean for the 11-play night? Does it mean that the people who didn't vote for the first-place winner would just split their votes ten ways instead of nine, thereby creating more margin between 1st and 2nd place? Or does it mean that the same margin is retained, with scores of 11.12% for the winner vs 8.89% for the other ten finishers? Or perhaps you want to retain the winner's ratio to the average percentage, in which case the winner would have 10.91% while the others would have 8.91%. At any rate, I think you have to specify the ideal behavior first before you can come up with the math on how to approximate that behavior afterwards.
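The three renormalizations can be made concrete with a small Python sketch (mine). Computing exactly, the retained-margin option gives about 11.11%/8.89% (the 11.12% quoted above comes from rounding the runners-up to 9.77%):

```python
# A 12% winner moves from a 10-play night to an 11-play night.
w10, n10, n11 = 12.0, 10, 11
rest10 = (100 - w10) / (n10 - 1)            # runners-up, about 9.78% apiece

# (a) the losers' votes split ten ways instead of nine:
winner_a, rest_a = w10, (100 - w10) / (n11 - 1)

# (b) the winner keeps the same absolute margin over the rest:
margin = w10 - rest10
winner_b = (100 + (n11 - 1) * margin) / n11
rest_b = winner_b - margin

# (c) the winner keeps the same ratio to the average percentage:
winner_c = (w10 / (100 / n10)) * (100 / n11)
rest_c = (100 - winner_c) / (n11 - 1)

print(round(winner_b, 2), round(rest_b, 2))  # 11.11 8.89
print(round(winner_c, 2), round(rest_c, 2))  # 10.91 8.91
```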

Thanks @tunesmit for pointing out even more unfairness issues with the general voting scheme, but this is a given, as I state in my initial question. The pinnacle of unfairness is that the plays are never compared to each other directly (only within limited batches of 10). I was interested in a specific twist of the voting scheme after we accept it. Once we accept the scheme, the answer to your example is 2). For my question (on the twist) I was pointing out the problems with averages and asking what a metric of fairness could be, offering some suggestions and insights.
– Thanassis Dec 6 '13 at 1:31

The way I see it, you have $N$ plays total, and if they were evaluated all together, each would get a fraction $p_i$ of the votes. Now, you have them randomly put together in groups, and assume the group we are looking at has $n$ members (let's call the collection of plays in that group $A$). All we can observe are the proportions within the group, let's call them $q_i$. So for play $i$, we have:
$$q_i = \frac{p_i}{\sum_{k \in A} p_k}$$
I'm not sure how to define fairness exactly, but it seems like finding the best estimate of $p_i$ given $q_i$ would be the thing to do. I don't know how to actually do this, though.

A simple thing would be to approximate $\sum_{k \in A} p_k$ as $n/N$ (a group of $n$ plays whose average share is $1/N$). This gives $p_i = q_i \cdot n/N$, which is equivalent to the "official" formula. As long as the $n$ don't differ too much between groups, it could be a reasonable approximation.

Or alternatively, $\sum_{k \in A} p_k = p_i + \sum_{k \in A \setminus \{i\}} p_k \approx p_i + (1-p_i)\frac{n-1}{N-1}$. This results in $p_i = \frac{q_i (n-1)}{N - 1 - q_i N + q_i n}$. I realize none of this is too mathematical or exact.
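Both estimators are easy to sketch in Python (the group size, total, and $q_i$ below are made-up numbers; $q$ is expressed as a fraction):

```python
def estimate_simple(q, n, N):
    """First approximation: the group's total share is about n/N,
    so the global share is p = q * n / N."""
    return q * n / N

def estimate_refined(q, n, N):
    """Second approximation: the group's total share is about
    p + (1 - p) * (n - 1) / (N - 1); solve q = p / that sum for p."""
    return q * (n - 1) / (N - 1 - q * N + q * n)

# A 20% winner of a 9-play group, out of N = 80 plays total:
print(estimate_simple(0.20, 9, 80))
print(estimate_refined(0.20, 9, 80))
```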

I tested the effect of adjustment using the actual vote distributions from various weeks. So in a sense I took an experimental approach. Here's what I did:

I have actual voting data of several weeks with 10 plays each. Let's just look at one week here. I assumed that one play was not there. I calculated the adjusted percentage for the remaining plays using the "official" formula and compared it with the actual percentage. I did this for every possible excluded play. Adjusted percentages were above or below the actual percentage depending on which play was excluded, but the averages of these adjusted percentages (over all possible excluded plays) followed some clear trends.

More formally: let $p_i$ be the vote percentage for play $i$ in a week of $m$ plays. Let $p_i^k$ be the vote percentage of play $i$ when we exclude play $k$ from the set ($i\neq k$); that is, $p_i^k$ is the observed percentage in a set of $m-1$ plays ($m$ plays, minus play $k$). Let $\hat{p_i^k}$ be the adjusted percentage based on the official formula ($\hat{p_i^k} = p_i^k \cdot \frac{m-1}{m}$). The average adjusted percentage $\hat{p_i}$ for play $i$ is given as:

$$\hat{p_i} = \frac{1}{m-1} \sum_{k \neq i} \hat{p_i^k}$$

Is $p_i \approx \hat{p_i}$? It turns out, not really. The popular plays are disadvantaged and the unpopular ones advantaged by the adjustment. For practical purposes the difference is not very big (at most 0.2% in my data), since the distribution of votes is not very uneven, but if a play gets, say, 70% of the votes, then its adjusted percentage is about 5% lower.
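The leave-one-out experiment is easy to reproduce. Here is a sketch in Python on made-up vote percentages (I don't have the festival's data), which shows the same direction of bias: the 70% play's average adjusted percentage lands several points below 70%, while the least popular plays come out slightly above their actual share.

```python
# Hypothetical vote percentages for one 10-play week; they sum to 100.
votes = [70, 8, 6, 5, 4, 3, 2, 1, 0.5, 0.5]

def leave_one_out_avg(votes, i):
    """Average, over every excluded play k != i, of play i's
    officially adjusted percentage in the reduced (m-1)-play set."""
    total = sum(votes)
    m = len(votes)
    adjusted = []
    for k in range(m):
        if k == i:
            continue
        p_ik = votes[i] / (total - votes[k]) * 100   # observed % in reduced set
        adjusted.append(p_ik * (m - 1) / m)          # official adjustment
    return sum(adjusted) / (m - 1)

for i, v in enumerate(votes):
    print(v, round(leave_one_out_avg(votes, i), 2))
```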

I also tried a different adjustment method to replace the "official" formula: I add an extra play that brings extra votes, with the extra votes distributed uniformly from the votes of the $m-1$ plays in the set. With this new method, the adjusted percentage is closer to the actual observed one, but popular plays still get slightly penalised. For example, the 70% runaway winner got about 2.5% less in the adjusted percentage.

So different methods do matter, and they matter most when the distribution of votes is most uneven. Is there a fair way? I guess the question can be transformed into this: if we add another sample to our set (in our case: votes for an imaginary play), what is a fair sample? Maybe the answer depends on which property of the set's distribution you want to preserve. Maybe not. Does anything fair come to mind when talking about votes/preferences?