Generalizing Fairness (B)

Fairness (B) has been defined as the quality of a tournament design that answers the desire to give everyone an equal chance. As discussed elsewhere, it is sometimes in conflict with other forms of fairness.

The only fairness (B) metric defined so far is a rather crude measure, suitable only for assessing the equality of opportunity in the first round of a tournament, and then only in a form that was greatly affected by the structure of the prize fund. In this post, I’ll propose a new fairness (B) metric which can be applied to any round, and is normalized so as to be unaffected by the absolute value of the prize fund.

As will be made clear in future posts, this new fairness (B) metric will be a powerful tool for analyzing various aspects of a tournament.

The new fairness (B) measure has the same basic structure as the old one:

fairness (B) = 1 / (SD)

where SD is a standard deviation.

(An earlier version used (SD + 0.01) in the denominator, with the idea that it was necessary to have something to prevent the measure from becoming infinite where the standard deviation was very close to zero. But I found that this distorted the statistic too much. The little bit of residual randomness even with runs in the hundreds of thousands was causing dramatic-looking variations in places where there was really no variation at all. And so I’ve decided to omit the 0.01, but to report the statistic as 100 whenever the deviation is so small as to push the reciprocal above 100.)
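To make the difference between the two conventions concrete, here is a small sketch. The function names and the sample deviations are mine, chosen only for illustration; they are not taken from any actual run. Treating a deviation of exactly zero as hitting the cap is also my assumption.

```python
def fairness_with_offset(sd, offset=0.01):
    """Earlier convention: add a small constant so the reciprocal stays finite."""
    return 1.0 / (sd + offset)


def fairness_capped(sd, cap=100.0):
    """Current convention: plain reciprocal, reported as at most `cap`."""
    if sd <= 0.0:
        # Assumption: a deviation of exactly zero is reported at the cap too.
        return cap
    return min(1.0 / sd, cap)


# Hypothetical deviations that differ only by simulation noise.
noisy_sds = [0.0003, 0.001]

offset_scores = [fairness_with_offset(sd) for sd in noisy_sds]
capped_scores = [fairness_capped(sd) for sd in noisy_sds]

print(offset_scores)  # roughly 97 and 91
print(capped_scores)  # both reported as 100.0
```

With the offset, two deviations that are both effectively zero come out about six points apart, which looks like a real difference. With the cap, both are simply reported as 100.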

The first innovation has to do with where the measure is applied. Initially, it was applied only to the starting positions of the players, and it makes sense to treat this as a special case of the statistic, one that is more descriptive of the overall fairness of a format than the measure as applied to later rounds. But there’s also a role for a measure that will look at the equity of individual rounds other than the first.

The second innovation has to do with deciding what numbers the standard deviation should be taken from. In the oldest version, it was applied only to the expected number of individual match wins from each line – I chose individual match wins rather than tournament wins with the thought that this would make the measure more suitable for judging early rounds of the tourney, which were so remote from the overall result that the absolute values of the numbers were very small.

But a better way to approach this problem is to normalize the inputs, and to use the payout structure of the tourney (rather than the number of wins). Thus:

normalized expectation = (IE * n) / (∑ IE)

where IE is the individual expectation for each line, n is the number of lines, and ∑ IE is the sum of the n individual expectations.
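A short sketch of the normalization step may help show why it removes the effect of the prize fund's size. The function name `normalize` and the sample expectations are hypothetical, made up for this illustration.

```python
import math


def normalize(expectations):
    """Scale the individual expectations so that they average to 1."""
    n = len(expectations)
    total = sum(expectations)
    return [ie * n / total for ie in expectations]


# Hypothetical expectations for a four-line round.
small_fund = [10.0, 20.0, 30.0, 40.0]
# The same round with every prize multiplied by 50.
large_fund = [ie * 50 for ie in small_fund]

norm_small = normalize(small_fund)
norm_large = normalize(large_fund)

print(norm_small)  # [0.4, 0.8, 1.2, 1.6]

# Scaling the prize fund leaves the normalized expectations unchanged.
same = all(math.isclose(a, b) for a, b in zip(norm_small, norm_large))
print(same)  # True
```

Because every normalized expectation is a ratio to the average, multiplying the whole payout structure by a constant cancels out, and the standard deviation taken afterward depends only on how unevenly the expectations are spread.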

Here’s an example. In one of the grouped-byes formats considered in an earlier post, the four lines in the D round had expectations of 17.214, 17.190, 23.619, and 23.604. Normalizing, these expectations are 0.8435, 0.8424, 1.1574, and 1.1567. This yields a fairness (B) statistic of 6.37. By comparison, with byes ungrouped, the expectations in round D were 20.509, 20.512, 20.524, and 20.510, which normalize to 0.9998, 0.9999, 1.0005, and 0.9998, so that fairness (B) is 100 (or, actually, 3410, but I’m applying the convention of reporting a maximum of 100).
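The figures above can be checked with a few lines of code. This is a minimal sketch, not the code behind the simulations: the function name `fairness_b` and its `cap` parameter are my own, and I am assuming the population form of the standard deviation, which is the form that reproduces the 6.37 and 3410 figures.

```python
import statistics


def fairness_b(expectations, cap=100.0):
    """Reciprocal of the SD of the normalized expectations.

    Pass cap=None to see the unreported raw value.
    """
    n = len(expectations)
    total = sum(expectations)
    normalized = [ie * n / total for ie in expectations]
    # Assumption: population standard deviation (divide by n, not n - 1).
    sd = statistics.pstdev(normalized)
    raw = 1.0 / sd if sd > 0 else float("inf")
    return raw if cap is None else min(raw, cap)


# Round D expectations quoted in the example above.
grouped_d = [17.214, 17.190, 23.619, 23.604]
ungrouped_d = [20.509, 20.512, 20.524, 20.510]

print(round(fairness_b(grouped_d), 2))           # 6.37
print(fairness_b(ungrouped_d))                   # 100.0 (capped)
print(round(fairness_b(ungrouped_d, cap=None)))  # about 3410
```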

This is not to say that the ungrouped-byes tourney is fairer than the grouped-byes tourney in a ratio of 100 to 6.37. But it does show, I hope, something notable about the way these two tourneys are structured: the fairness (B) issues associated with the placement of the byes linger in the grouped-byes tourney into round D, but they’ve been washed out by round D when the byes are spread.

Applying the new statistic to round A, the grouped-byes format comes in at 8.22, compared to 21.60 for the spread-byes format. These numbers are, I think, more reflective of the fairness of the formats as a whole.

This raises an issue of notation. When applying the measure to the entry round, which might be considered to show the fairness of the format as a whole, I’ll continue to call it fairness (B). But when applying it to an individual round, I’ll lower-case the “b”, insert a colon, and then state the round to which the measure is applied. Thus, for the grouped-byes example above, fairness (B) = 8.22, and fairness (b:D) = 6.37.

Sort of. Fairness (B) is just the fairness statistic applied to round A, or, if you have some entrants who enter in other rounds, to the entry lines wherever they are.

The reason it is a plausible measure for the tourney as a whole is that, because it’s calculated on the first round, it’s affected by all subsequent rounds. Those rounds may be uniformly fair or unfair, but they can also be unfair in ways that tend to cancel each other out. For example, in an A.B.C.D.|.| format, half of the lines get C drops, and the other half get D drops. So, with some level of round progression, the half of the lower-bracket lines that get hit with C drops is unbalanced for a round; but when the other half of the lines get hit with D drops in the next round, the two imbalances will likely cancel each other out to some extent, so that the first consolidation round looks pretty good.

Thus, a bad fairness (B) score is a negative indicator for the tourney as a whole, while the fairness (b:G) and (b:H) scores tell you more about what’s happening in particular parts of the bracket. But a good fairness (B) score does not necessarily mean that there aren’t some troubling features of individual rounds later on.