The virtue of high variance approaches

Suppose you’re trying to do something. You can adopt a variety of strategies – some are likely to produce better results on average, while others are more variable and could produce either great or terrible results.

Let’s adopt an overly simplistic mathematical model and say that how “good” the result of a strategy is follows a normal distribution \(N(\mu, \sigma^2)\).

Let’s further simplify and say that all we care about when judging whether a strategy is good is the expected value of this goodness.

So how should we choose the best strategy?

Well, we pick the one with the highest value of \(\mu\), since the \(\mu\) parameter is exactly the expected value.

Now, suppose there are multiple people all working independently at the same task. What strategy should they adopt? Should they just all adopt the individually best strategy?

Well… it depends.

Specifically it depends on how these values combine.

If they just add together (e.g. say you’re doing something whose output is measured in quality-adjusted life years, and there are so many people you could be helping that you’re not meaningfully going to start interfering with each other) then yes, you should all adopt the same strategy. The sum of independent normal random variables is another normal random variable with expectation equal to the sum of the expectations of its components, so the total is maximised by maximising each \(\mu\).

But not everything is additive. For example, suppose you’re all trying to come up with a new type of invention, or to write the best essay, or anything where you can simply take the best result and replicate it as many times as you want.

What strategy should you adopt then?

Suppose that each of \(N\) people independently adopts a normally distributed strategy as above, all with the same \(\mu\) and \(\sigma\). Say the result of person \(n\) is \(X_n\). Then the overall result is \(R = \max_n X_n\).

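By scaling, we can write \(X_n = \mu + \sigma Z_n\), where the \(Z_n\) are independent standard normal random variables. Since \(\sigma > 0\), taking the maximum commutes with this transformation, so the expected value of the result is \(\mathbb{E}(R) = \mu + \sigma \mathbb{E}(\max_n Z_n)\).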

And it turns out that for \(N \geq 4\), \(\mathbb{E}(\max_n Z_n) > 1\) (it is about 0.85 for \(N = 3\) and about 1.03 for \(N = 4\)). So for groups of at least 4 people, increasing the standard deviation actually improves the expected best result more than increasing the mean by the same amount does. This shouldn’t be terribly surprising – at 4 people there’s almost exactly a 50% chance that at least one person will land more than one standard deviation above the mean.
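If you’d like to check the threshold yourself, here is a minimal sketch in Python using only the standard library. It assumes the standard fact that the maximum of \(N\) independent standard normals has density \(N \varphi(x) \Phi(x)^{N-1}\), and integrates \(x\) against that density with a plain trapezoidal rule. The function names and the integration range are my own choices for illustration.

```python
import math


def std_normal_pdf(x):
    """Density of the standard normal distribution."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


def std_normal_cdf(x):
    """Cumulative distribution function of the standard normal."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def expected_max(n, lo=-10.0, hi=10.0, steps=20000):
    """Approximate E[max(Z_1, ..., Z_n)] for independent standard normals.

    The maximum has density n * pdf(x) * cdf(x)**(n - 1); we integrate
    x times that density over [lo, hi] with the trapezoidal rule.
    """
    h = (hi - lo) / steps
    total = 0.0
    for i in range(steps + 1):
        x = lo + i * h
        weight = 0.5 if i in (0, steps) else 1.0
        density = n * std_normal_pdf(x) * std_normal_cdf(x) ** (n - 1)
        total += weight * x * density
    return total * h


def prob_someone_beats_one_sigma(n):
    """Probability that at least one of n independent draws exceeds mu + sigma."""
    return 1.0 - std_normal_cdf(1.0) ** n


if __name__ == "__main__":
    for n in range(1, 7):
        print(n, round(expected_max(n), 4), round(prob_someone_beats_one_sigma(n), 4))
```

Running it prints one row per group size, and the second column is where you can see the expected maximum cross 1 between \(N = 3\) and \(N = 4\).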

So for groups of more than a couple people who are working independently and whose results can simply be replicated, the optimal strategy is not in fact to optimise your local strategy but to pursue higher variance ones. You should still try to produce good strategies of course, but by pursuing higher variance at some cost of optimality you may significantly improve the results.
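To make the trade-off concrete, here is a rough Monte Carlo sketch comparing two made-up strategies: a “safe” one with \(\mu = 1, \sigma = 1\) and a “risky” one with \(\mu = 0.5, \sigma = 2\). Those parameters, the group size of 5, and the function name are purely illustrative assumptions, not anything derived from real data.

```python
import random


def simulate(mu, sigma, n_people, trials=20000, seed=0):
    """Estimate (E[one person's result], E[best result in the group]).

    Each trial draws n_people independent N(mu, sigma^2) results.
    """
    rng = random.Random(seed)
    single_total = 0.0
    best_total = 0.0
    for _ in range(trials):
        draws = [rng.gauss(mu, sigma) for _ in range(n_people)]
        single_total += draws[0]  # what one individual gets
        best_total += max(draws)  # what the group gets if it keeps the best
    return single_total / trials, best_total / trials


if __name__ == "__main__":
    # Hypothetical strategies, chosen only to illustrate the effect.
    safe_single, safe_best = simulate(mu=1.0, sigma=1.0, n_people=5)
    risky_single, risky_best = simulate(mu=0.5, sigma=2.0, n_people=5)
    print("safe: ", round(safe_single, 2), round(safe_best, 2))
    print("risky:", round(risky_single, 2), round(risky_best, 2))
```

Under these assumptions the safe strategy wins for any one individual, while the risky strategy wins once you only keep the best of the five results, which is the pattern the formula \(\mu + \sigma \mathbb{E}(\max_n Z_n)\) predicts.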

Obviously this model is overly simplistic – quality isn’t scalar, it’s probably not normally distributed, and independence can be hard to achieve in practice, but I hope it is at least suggestive.