The Ugly Bottom

Hold everything! I’ve made another boneheaded error, and need to revisit pretty much all of my recent results to correct it.

Here’s the story of how I recognized the problem.

In a recent comment, one reader pointed out that I hadn’t really exhausted the possibilities for bracket shifts – there were still shifts like an 8DE: {AB.C.|.X.R} available. The first element means that the A and B drops go to the same round, so the design is very similar to Joe’s 8DE, except that there is no contingent F1 match – you simply play out the lower bracket with a consolidation round after the C drop. It looks like this: ugly8. I’m dubbing this the “ugly 8” because I think dropping the A and B rounds together into the D round is an ugly move – it violates the first maxim.

Given that I find this idea so silly that I’d overlooked it entirely, I had low expectations for the design. But I was also low on ideas about what to test next, and sometimes it’s interesting to show an idea being a bad idea. So I built an “ugly 8” structure file for the simulator.

Now, I’ll confess that I often sneak a peek at simulation results for short preliminary runs. A full million trials of a new 8DE takes about an hour to run, so it makes sense to do a shorter run first, just to see if I’ve built the structure file correctly. But I should know better than to look at the fairness (C) results. Fairness (C) is a really volatile statistic, and it takes lots of iterations for the average value to settle down.

But in the case of the ugly 8, I did sneak a peek at fairness (C) after just 1000 trials. And, to my surprise, it was better than the number for either Joe’s 8DE or the standard 8DE. But not to worry – I looked at the confidence interval on the fairness (C) calculation, and it was huge, including lots of room for the ugly 8 to be, in fact, worse than either of the others, which is what I expected. So I ran another 10,000 trials. And the results were the same!
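For anyone who wants to see why a 1000-trial peek deserves so little trust, here is a minimal sketch of how the confidence interval on an averaged statistic narrows as trials pile up. It is not the simulator's fairness (C) calculation: noisy_trial is a hypothetical stand-in that just draws a volatile number with a mean near 15, and mean_with_ci is an illustrative helper using the usual normal approximation.

```python
import math
import random
import statistics


def mean_with_ci(samples, z=1.96):
    """Return (mean, half-width of an approximate 95% confidence interval)."""
    n = len(samples)
    mean = statistics.fmean(samples)
    # Standard error of the mean shrinks like 1 / sqrt(n).
    half_width = z * statistics.stdev(samples) / math.sqrt(n)
    return mean, half_width


def noisy_trial(rng):
    """Hypothetical per-trial score: an exponential draw with mean 15.

    This is an assumption made for illustration only; the real simulator
    computes its statistic from actual tournament outcomes.
    """
    return rng.expovariate(1 / 15.0)


if __name__ == "__main__":
    rng = random.Random(8)
    for n in (1_000, 10_000, 100_000):
        samples = [noisy_trial(rng) for _ in range(n)]
        mean, half_width = mean_with_ci(samples)
        print(f"{n:>7} trials: {mean:6.2f} +/- {half_width:.2f}")
```

Since the half-width falls off with the square root of the trial count, going from 1000 to 10,000 trials only narrows the interval by a factor of about three, so two short runs agreeing with each other is weaker evidence than it feels like.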

So, I started the full million-trial run, but I also started speculating about what caused the unexpectedly good result from the short runs. And I came up with an explanation! And started to plan a whole new line of inquiry based on that theory.

Well, the result of the long run came in an hour later, and it was just as I originally expected – the ugly 8 was slightly worse than Joe’s 8 or the standard 8 at luck = 1. So I didn’t really need an explanation for an anomalous result – there was none.

The trouble is, by this time I’d decided that the theory I’d come up with to explain the (non-existent) anomalous result was entirely plausible. Even if it wasn’t strictly needed, it seemed to be something I really needed to test. Perhaps it was affecting the results of my earlier experiments in ways I hadn’t yet understood.

Here’s the theory. All of my recent runs assumed a winner-takes-all payout: 100% of the prize fund to the champion, nothing for anyone else. Now, this is not my usual assumption when testing double eliminations – I tend to favor paying three places, 50/30/20. But some of the designs I was looking at didn’t identify a unique third place, so I couldn’t really pay three places. And if you’re interested in the need for a recharge round, it makes sense to look only at the top prize, as that’s the only one that’s affected. So I paid just one place.

Now, how does this potentially explain the success of the ugly bracket? Well, perhaps what was going on was that, at least at low luck levels, the A, B, and C rounds at the top of the bracket were usually finding the best player. And then in the last two rounds, the survivor of the lower bracket had a chance to knock this worthy upper-bracket winner out of its deserved first place. So, I theorized, what was happening was that by mucking up the bottom bracket, the system was tending to produce a less-worthy challenger for the last two rounds, and that was reducing the frequency with which fairness (C)-diminishing upsets were happening in those two rounds. The figure looked better because the single-elimination top bracket had already found the proper winner, and by mucking up the lower bracket, the design was tending to protect the best player, sort of the way seeding increases fairness (C) by protecting the high seeds. I’ll call this the ugly-bottom effect.

It’s a nutso theory, and as I explain it now I can only wonder how I managed to believe it for as long as an hour.

But still. Doesn’t it make sense that strange things happening in the lower bracket might have strange effects on fairness (C) if all you’re looking at is the top place? Could there be an ugly-bottom effect after all? Surely it can’t be good to ignore the system’s skill at producing a reasonable second place.

Now, it’s true that I’d have problems with a 50/30/20 scheme, but there is no reason I couldn’t pay two places – all designs have a unique second place. So I need to see what happens when I replace the winner-takes-all payout with something like a 65/35 payout.
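To make the difference between the payout schemes concrete, here is a small sketch of how the same two tournaments look under each scheme. The function pay_out, the constant names, and the player labels are all made up for illustration; this is not how the simulator represents payouts, and it says nothing about how fairness (C) is computed from them.

```python
# Shares of the prize fund by finishing place (first place listed first).
WINNER_TAKES_ALL = (1.00,)
TWO_PLACES = (0.65, 0.35)
THREE_PLACES = (0.50, 0.30, 0.20)


def pay_out(finish_order, shares):
    """Map each paid finisher to its share; unpaid finishers are omitted."""
    return dict(zip(finish_order, shares))


if __name__ == "__main__":
    # Two hypothetical outcomes: same champion, different runner-up.
    outcome_a = ["best", "second_best", "weak"]
    outcome_b = ["best", "weak", "second_best"]

    for name, shares in (("winner-takes-all", WINNER_TAKES_ALL),
                         ("65/35", TWO_PLACES)):
        print(name, pay_out(outcome_a, shares), pay_out(outcome_b, shares))
```

Under winner-takes-all the two outcomes pay out identically, so whatever the lower bracket does to the identity of the runner-up is invisible. Under 65/35 the two outcomes differ, which is exactly the information a one-place payout throws away.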

And here’s what happens:

65/35       luck 1    luck 3
standard     14.13     54.78
ugly8        14.81     55.07
Joe8         15.75     57.75

Order is restored to the universe. The Ugly 8 bracket is shown more decisively to be, well, ugly, as compared to the standard bracket. But wait. The Ugly 8 is still better than Joe’s 8! Perhaps there is an ugly-bottom effect of the sort I’d imagined, but the Ugly 8 wasn’t ugly enough to show it. Joe’s 8 is even uglier, and in that case the ugly-bottom effect actually did manage to invert the rank order of the designs.

I still have much work to do before I understand the full depredations of the ugly-bottom effect. For the time being, however, much of what I thought I’d shown in the past several posts is now in doubt. Stay tuned.