A fair random-number generator a la Laplace

Browsing through Keynes’ “A Treatise on Probability,” I came across a pretty nugget which Keynes credits to Laplace. Suppose you want to make a fair decision via coin flip, but are afraid the coin is slightly biased. Flip two coins (or the same coin twice), and call it “heads” if the flips match, “tails” otherwise. This procedure is practically guaranteed to have very small bias. In fact, if we call the bias of flip i b_i = p_i - q_i (the probability of heads minus the probability of tails), a quick calculation shows that the bias of the double flip is b_1 b_2, so that a 1% bias would become a near-negligible 0.01%.

I noticed that we can extend this; consider using n flips, and calling the outcome “heads” if the number of tails is even, “tails” if it is odd. An easy induction shows that the bias of this procedure is the product of the n individual biases, which of course goes to 0 very quickly even if each coin is quite biased. Here also is a nice direct calculation: writing p_i and q_i = 1 - p_i for the probabilities of heads and tails on flip i, consider expanding the product

(p_1 - q_1)(p_2 - q_2) \cdots (p_n - q_n).

The magnitude of each of the terms is the probability of a certain sequence of flips; the sign is positive or negative according to whether the number of tails is even or odd. Done.
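
The direct calculation can be checked numerically. Below is a minimal sketch, assuming bias means P(heads) - P(tails) and that the flips are independent; the function names and the sample probabilities are made up for illustration.

```python
from itertools import product
from math import prod, isclose

def parity_bias_exact(p_heads):
    """Bias of the parity rule, by brute-force enumeration of all flip sequences.

    p_heads[i] is the probability that flip i lands heads.
    The outcome is "heads" when the number of tails is even.
    """
    bias = 0.0
    for seq in product("HT", repeat=len(p_heads)):
        # Probability of this exact sequence, assuming independent flips.
        prob = prod(p if s == "H" else 1 - p for p, s in zip(p_heads, seq))
        # An even number of tails counts toward heads (+), odd toward tails (-).
        sign = 1 if seq.count("T") % 2 == 0 else -1
        bias += sign * prob
    return bias

def parity_bias_product(p_heads):
    """Closed form: the product of the individual biases p_i - q_i."""
    return prod(p - (1 - p) for p in p_heads)

coins = [0.6, 0.55, 0.7, 0.45]  # arbitrary example coins
assert isclose(parity_bias_exact(coins), parity_bias_product(coins))
```

For two coins that each land heads 50.5% of the time (a 1% bias), both functions give approximately 0.0001, the 0.01% figure from the double flip.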

I can hardly believe it of such a simple observation, but this actually feels novel to me personally (not to intellectual history, obviously.) Not surprising exactly, but novel. I suppose examples such as the following are very intuitive: The last (binary) digit of a large integer such as “number of voters for candidate X in a national election” is uniformly random, even if we know nothing about the underlying processes determining each vote, other than independence (or just independence of a large subset of the votes.)

18 comments

In fact, there is a simple procedure to simulate an (exactly) unbiased random coin from a biased one.
Flip your coin twice (and repeat the procedure if you obtain the same outcome).
Call “Heads” if you first got heads then tails, and “Tails” otherwise.
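
A minimal sketch of this procedure, assuming independent flips with a fixed but unknown bias; `von_neumann_flip` and the 80% heads coin are illustrative names and numbers, not anything from the post.

```python
import random

def von_neumann_flip(flip):
    """Return 'H' or 'T' with probability 1/2 each, given a biased flip().

    flip is any zero-argument function returning 'H' or 'T'; flips are
    assumed independent with a fixed (unknown) bias.
    """
    while True:
        first, second = flip(), flip()
        if first != second:
            # HT and TH are equally likely (each has probability p*q),
            # so the first flip of a mismatched pair is unbiased.
            return first

rng = random.Random(0)
biased = lambda: "H" if rng.random() < 0.8 else "T"  # hypothetical 80% heads coin
sample = [von_neumann_flip(biased) for _ in range(10_000)]
print(sample.count("H") / len(sample))  # should be close to 0.5
```

The price is a random number of flips per output: with an 80/20 coin, a pair mismatches only 32% of the time.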

Every ultimate Frisbee player knows this procedure, because Frisbees are biased (they’re more likely to land top-down). To determine sides of the field and offense/defense, we flip two disks, which is close enough to unbiased for our purposes.

Interesting. My brother was in an Ultimate club, and I remember they used to play either rock-paper-scissors or odds-evens, the basic game-theory way to get a fair random outcome. I think he said the team had fairly serious discussions about their RPS strategy, but I don’t know if it helped much :-).

Brother chiming in. Rock-Paper-Scissors was used sometimes to choose sides in ultimate when I played in college, but the 2-disc-flip Bryce describes tended to be used more often, especially in more competitive settings.

Frustratingly, we had longer discussions about whether to call “same” or “different” with the 2-disc-flip than we ever did re:RPS… possibly because there actually is a correct answer that’s more satisfying than a mixed-strategy Nash eq, so those on the side of rationality keep trying to convince others that calling “different” is at best a 50-50 proposition, and worse if there’s any bias. (If the two discs have the same bias, U^2 + (1-U)^2 > 2*U*(1-U). And of course all you need is the two discs to be biased in the same direction, not identical.) These “debates” always felt similar to trying to convince someone skeptical of the correct Monty Hall strategy… one side argues pseudo-philosophy while the other sticks to probability.

I assume each team flips one disc? If there is any way to manipulate the flip to make it come out on the other side of 50-50, you are sort of in the odds/evens case again. You could then select randomly when to do this to generate 50-50.

The procedure Anthony mentions came from von Neumann, and does produce a perfectly unbiased result. However, it throws away at least 3/4 of the entropy available in the coin tosses. By iterating the procedure, you can recover as much of the entropy as you like.
(http://www.stat.berkeley.edu/~peres/mine/vn.pdf)

For example, looking at a sequence of coin flips in chunks of two, extract results as
HT -> H
TH -> T
HHTT -> H
TTHH -> T
HHHHTTTT -> H, etc.
After removing that segment from the sequence, continue the procedure on what remains.
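
A rough sketch of the pattern in these examples, assuming the input is a string of 'H' and 'T'. It processes the whole sequence in one batch rather than as a stream, and it covers only the recursion on matched pairs shown above; the linked paper describes the full iteration. The name `extract` is made up.

```python
def extract(flips):
    """Extract symbols from a string of 'H'/'T' flips.

    Pairs that differ emit their first flip (von Neumann's rule). Pairs that
    match are collapsed to a single symbol and fed back through the same
    procedure, so that HHTT -> H and HHHHTTTT -> H as in the examples.
    """
    if len(flips) < 2:
        return ""
    out = []
    collapsed = []
    # Walk the sequence in non-overlapping chunks of two; a trailing odd
    # flip is ignored.
    for i in range(0, len(flips) - 1, 2):
        a, b = flips[i], flips[i + 1]
        if a != b:
            out.append(a)
        else:
            collapsed.append(a)
    # Output from the mismatched pairs first, then from the collapsed pairs.
    return "".join(out) + extract("".join(collapsed))

assert extract("HT") == "H"
assert extract("TTHH") == "T"
assert extract("HHHHTTTT") == "H"
```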

Let G be the group Z/2Z and X_1,…,X_n be independent G-valued random variables with distributions P_1,…,P_n. We can view each P_i as a function from G to the complex numbers (which happens to take real nonnegative values that sum to 1). Then the distribution of the sum X=X_1+…+X_n (sum in G) is the convolution P_1*…*P_n. What the equation in Jonathan’s post says is that the Fourier transform of the convolution is the product of the Fourier transforms of the P_i.

The extension for arbitrary finite cyclic group G=Z/nZ would therefore be
E\zeta^X = \prod_i E\zeta^{X_i} for every n-th root of unity \zeta

When n=2 the only non-trivial root is \zeta=-1, and then E \zeta^{X_i} is the bias of the coin X_i and E\zeta^X is the bias of the sum X=X_1 + … + X_n modulo 2.

Similarly, in any finite abelian group G we have that
E\chi(X) = \prod_i E\chi(X_i) for every character \chi of G
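
The identity is easy to check numerically in the cyclic case. Here is a small sketch on Z/5Z, where the characters are x -> exp(2*pi*i*k*x/5) for k = 0,…,4; the three distributions and the helper names are arbitrary choices for illustration.

```python
import cmath
import math

def convolve_mod(p, q):
    """Distribution of X + Y mod m for independent X ~ p, Y ~ q on Z/mZ."""
    m = len(p)
    out = [0.0] * m
    for a in range(m):
        for b in range(m):
            out[(a + b) % m] += p[a] * q[b]
    return out

def char_expectation(p, k):
    """E[zeta^X] for zeta = exp(2*pi*i*k/m), where X ~ p on Z/mZ."""
    m = len(p)
    return sum(p[x] * cmath.exp(2j * math.pi * k * x / m) for x in range(m))

# Three arbitrary (made-up) distributions on Z/5Z.
dists = [
    [0.5, 0.2, 0.1, 0.1, 0.1],
    [0.1, 0.6, 0.1, 0.1, 0.1],
    [0.3, 0.3, 0.2, 0.1, 0.1],
]

# Distribution of the sum X_1 + X_2 + X_3 in Z/5Z.
total = dists[0]
for d in dists[1:]:
    total = convolve_mod(total, d)

for k in range(5):
    lhs = char_expectation(total, k)
    rhs = 1
    for d in dists:
        rhs *= char_expectation(d, k)
    assert abs(lhs - rhs) < 1e-12
```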

This is useful, for example, to show that the last digit in the decimal expansion of the number of votes a candidate gets is also uniform (take G=Z/10Z). Another implication: if you are having a stochastic trip with independent increments on a group of prime order, which means that on every day t you start at some S_t in G, randomize your step X_t independently of your previous steps, and move to S_{t+1}=S_t + X_t, then after many days your location will be close to uniformly distributed, regardless of the distribution of the steps (as long as they are not too close to being deterministic).
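
As an illustration of the vote-count example, the sketch below tracks the exact distribution of the last decimal digit as votes are added one at a time. It assumes, purely for simplicity, that every voter independently votes for the candidate with the same probability 0.3; the argument above does not need the voters to be identical.

```python
def step_distribution(dist, step):
    """One day of the walk on Z/mZ: distribution of S + X mod m."""
    m = len(dist)
    out = [0.0] * m
    for s in range(m):
        for x in range(m):
            out[(s + x) % m] += dist[s] * step[x]
    return out

m = 10
# Hypothetical voter: adds 1 to the count with probability 0.3, else adds 0.
step = [0.7, 0.3] + [0.0] * (m - 2)
dist = [1.0] + [0.0] * (m - 1)  # the count starts at 0

for _ in range(1000):
    dist = step_distribution(dist, step)

# Largest deviation of any last-digit probability from 1/10.
print(max(abs(p - 1 / m) for p in dist))
```

After 1000 voters the deviation is already at the level of floating-point rounding.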

Interesting. So your point is that (for prime moduli, where all nontrivial characters have trivial kernel) E\chi(X_i) has modulus less than 1 for each non-trivial character whenever X_i is not deterministic, thus the modulus of E\chi(X) goes to 0 exponentially, just as in the binary case. Now, any given character could have zero expectation without X being uniform…but I guess E\chi(X) being zero for *all* non-trivial characters implies X is uniform?…this certainly sounds right, but I’m not coming up with a simple proof right this minute. I’ll sleep on it.

Ok, I see it. Let a_i = P(X=i). Then all non-trivial characters having zero expectation means p(z) = a_0 + a_1 z + a_2 z^2 + … + a_{p-1} z^{p-1} has a root at each of the (p-1) non-trivial roots of unity, which along with p(1)=1 determines it uniquely, forcing all a_i = 1/p.

I guess this part doesn’t use primality; the part that does is that every non-point-mass distribution has modulus less than 1 under every non-trivial character. For general abelian groups I guess you need to stipulate that each distribution X_i is not concentrated on a coset of a proper subgroup to guarantee the modulus is less than 1 for each character. In Z/nZ this must mean that the differences between possible values cannot all be multiples of a proper divisor d|n.
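
A small sketch of what goes wrong when the condition fails, using a made-up step distribution on Z/10Z that puts probability 1/2 on each of 1 and 3. The two values differ by a multiple of 2, so the steps sit on a coset of the even residues and the sum never becomes uniform.

```python
def step_distribution(dist, step):
    """Distribution of S + X mod m for independent S ~ dist, X ~ step."""
    m = len(dist)
    out = [0.0] * m
    for s in range(m):
        for x in range(m):
            out[(s + x) % m] += dist[s] * step[x]
    return out

m = 10
# Hypothetical steps: +1 or +3, each with probability 1/2.
step = [0.0, 0.5, 0.0, 0.5] + [0.0] * (m - 4)
dist = [1.0] + [0.0] * (m - 1)  # start at 0

for _ in range(1000):
    dist = step_distribution(dist, step)

# After an even number of steps the sum is even, so odd residues are impossible.
assert all(dist[r] == 0.0 for r in range(1, m, 2))
print([round(p, 6) for p in dist])
```

The mass spreads evenly over the five even residues, but the parity of the sum stays pinned to the parity of the number of steps.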

Right. I think this is the exact condition. Maybe also for non-abelian groups, though I guess for such groups one needs to go beyond the characters to general group representations if one wants to prove the assertion with similar methods. But that’s over my head.

I should have been more careful with my last parenthetical in the original post (saying independence of a large subset of voters suffices.) This is false. You could have N-1 independent voters, but the last voter could just vote according to whether the total of those N-1 is even or odd. This is implausible of course, but it does say if we can weaken independence at all we’d have to be very careful how we do it.
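
The counterexample is easy to simulate. In this sketch the first N-1 voters are modeled as independent fair coin flips, which is an arbitrary choice; the function name is made up.

```python
import random

def total_with_parity_voter(n_independent, p, rng):
    """Total 'yes' votes when the final voter copies the parity of the rest.

    The first n_independent voters vote yes independently with probability p
    (a made-up model); the last voter votes yes exactly when their total is odd.
    """
    others = sum(rng.random() < p for _ in range(n_independent))
    last = others % 2
    return others + last

rng = random.Random(1)
totals = [total_with_parity_voter(999, 0.5, rng) for _ in range(200)]
assert all(t % 2 == 0 for t in totals)  # the last binary digit is always 0
```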

I guess if you interpret my parenthetical to mean that the “large subset” are independent not just from each other but from everyone else as well, it’s true again. I claim only partly tongue-in-cheek that this is what I had in mind, but the lesson is that independence assumptions need to be carefully stated.

Continuing this public conversation with myself: I guess independence *conditional* on some global parameter \theta would be good enough, provided the bias is bounded away from 1 for each value of \theta. This is relevant to the voting example; having a common shock is OK provided you have conditional independence and the shock doesn’t determine the votes too precisely.
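
One way to spell out the conditional-independence claim, writing X for the total number of votes and b_i(\theta) for the bias of voter i given \theta, and assuming |b_i(\theta)| \le c < 1 for every i and \theta (the symbols b_i(\theta) and c are notation introduced here for illustration):

```latex
\left| E\,(-1)^{X} \right|
  = \left| E_\theta \Big[ \prod_{i=1}^{n} b_i(\theta) \Big] \right|
  \le E_\theta \Big[ \prod_{i=1}^{n} \left| b_i(\theta) \right| \Big]
  \le c^{n} \longrightarrow 0 .
```

The first equality applies the product formula conditionally on \theta and then averages over \theta.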

What if instead of a coin, one had a six-faced die… biased differently on each face. How could one assign a face value to each sequence of rolls so that the result is unbiased?

Since this is now multinomial… I wonder if a solution even exists…
Basically, of all the multinomial coefficients, I need to partition them into six bins of equal value to ensure fairness… Is this predictably possible?
This partitioning process needs to also carry on ad infinitum, such that the persistent bias approaches zero…