Perhaps I should add that an example of a "practical" question would be: if a random number in [0, 2^1024-1] is required, how should one best go about generating it as a bit sequence by throwing normal dice?
–
Mok-Kong Shen, Jan 30 '13 at 22:07

You could always discard die rolls corresponding to 4 and 5. Then each die roll produces two bits of entropy.
–
Stephen Touset, Jan 30 '13 at 23:20

4 Answers

Thomas' first procedure produces 2 bits per roll with probability $\frac23$, i.e. it produces $\frac43 \doteq 1.333$ bits on average. This can be improved as he describes, but it quickly gets quite complicated.

Taking the result mod two instead yields exactly $1$ bit per roll, which is not much worse. Combining the two simple methods like

1 -> 00
2 -> 01
3 -> 10
4 -> 11
5 -> 0
6 -> 1

is quite simple and produces $1+\frac23 = \frac53 \doteq 1.667$ bits per roll. AFAIK, this is the maximum you can get without aggregating the rolls.
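As a quick sanity check, this mixed mapping is easy to put in code; here is a minimal Python sketch (the names are mine):

```python
# Mixed extraction: rolls 1-4 each yield two bits, rolls 5-6 one bit.
MIXED_MAP = {1: "00", 2: "01", 3: "10", 4: "11", 5: "0", 6: "1"}

def encode_rolls(rolls):
    """Concatenate the output bits for a sequence of die rolls (1..6)."""
    return "".join(MIXED_MAP[r] for r in rolls)

# Average output length over the six equally likely faces: 10/6 = 5/3.
average_bits = sum(len(v) for v in MIXED_MAP.values()) / 6
```

By enumeration, `average_bits` comes out to $\frac53 \doteq 1.667$ bits per roll.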

You can apply this idea to two rolls and obtain $5$ bits with probability $\frac{32}{36}$ and $2$ bits otherwise, thus getting

$$\frac{32}{36}\cdot 5 + \frac{4}{36}\cdot 2 = \frac{168}{36} = 4\tfrac23$$

bits for two rolls, i.e., $2\frac13 \doteq 2.333$ bits per roll, which is quite close to the theoretical maximum of $\log_2 6 \doteq 2.585$.

For a simple "implementation without computer", note that you can extract 1 bit directly from each roll and need to process only the "rest" (e.g. you can get the bit via $n \bmod 2$ and the rest via $\lfloor \frac{n-1}{2} \rfloor$). This makes even combining 3 or 4 rolls using a small table easy.
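The decomposition can be sketched in Python (the function names are mine); each roll splits into a parity bit and a uniform "rest" in {0, 1, 2}, and the roll is recoverable from the pair, so no entropy is lost:

```python
def split_roll(n):
    """Split a die roll n in 1..6 into (bit, rest), where bit = n mod 2
    is uniform in {0, 1} and rest is uniform in {0, 1, 2}."""
    return n % 2, (n - 1) // 2

def join_roll(bit, rest):
    """Inverse of split_roll: recover the original roll."""
    return 2 * rest + 2 - bit

# All six rolls map to six distinct (bit, rest) pairs.
pairs = [split_roll(n) for n in range(1, 7)]
```

The "rest" values of 3 rolls then combine into a uniform number in [0, 27), from which a small table can extract further bits.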

This family of procedures is optimal by the criterion of requiring the fewest dice throws to generate a given number of bits in the worst case.
–
fgrieu, Jan 31 '13 at 20:07

The methods you described for single and two rolls are, I believe, very practical. One just has to look the bits up in a table. For three rolls, however, there is a drop in efficiency, if my computation is not in error. (For the method described by Thomas, the efficiency of using 3 rolls is also less than that of using 2 rolls.)
–
Mok-Kong Shen, Jan 31 '13 at 22:35

I have computed the efficiencies for increasing numbers of rolls, up to 100 rolls (in parentheses, the value for the method described by Thomas): 1: 1.66667 (1.33333), 2: 2.33333 (2.22222), 4: 2.38272, 7: 2.53178 (2.40800), 12: 2.57454 (2.54856), 24: 2.57598, 36: 2.57708, 48: 2.57712, 53: 2.58454 (2.57951)
–
Mok-Kong Shen, Feb 1 '13 at 11:02

The obvious approach is to consider the following bit extraction algorithm:

1. Roll the die, producing an integer $n$ uniform in $\mathbb{Z}_6$.
2. If $n < 4$, return $n$ as two bits. Otherwise, go to 1.

This will return two uniform bits. The algorithm terminates with probability $1$, since the probability of recursion at each step is only $\frac{2}{6}$. A proof of correctness can be found in Dennis' post on this question.

How many dice rolls will the algorithm require? At least one, and possibly arbitrarily many, but on average:

$$\sum_{k=1}^{\infty} k \cdot \frac{2}{3} \left( \frac{1}{3} \right)^{k-1} = \frac{3}{2}$$

That is, on average, the algorithm will require one and a half dice rolls. This is, of course, in accordance with information theory, which states that the Shannon entropy of an unbiased dice roll is equal to:

$$\log_2 6 \approx 2.585 ~ \text{bits}$$
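A quick Monte-Carlo check of this average in Python (the simulation setup and names are mine):

```python
import random

def draw_two_bits(rng):
    """Run the rejection algorithm once: roll until the result is < 4,
    then return (two_bit_value, rolls_used)."""
    rolls = 0
    while True:
        n = rng.randrange(6)   # one die roll, uniform in Z_6
        rolls += 1
        if n < 4:
            return n, rolls

rng = random.Random(42)
trials = 100_000
average_rolls = sum(draw_two_bits(rng)[1] for _ in range(trials)) / trials
# average_rolls should come out close to 1.5
```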

However, this is not optimal: a lot of entropy is wasted in compressing each roll into two uniform bits. Can we do better? Yes. For instance, consider rolling two dice together, drawing a uniform $n$ out of $\mathbb{Z}_{36}$. Then we can apply the same algorithm, but returning five bits instead of four, since $32 < 36$. The probability of recursion is also lower, at $\frac{4}{36}$, which means we are wasting less time and entropy.

We could also roll three dice, drawing $n$ out of $\mathbb{Z}_{216}$, but this is not optimal: even though we can extract seven bits, the probability of recursion is $\frac{88}{216}$, which is much higher than with even a single die.

So, clearly, the "trick" is to select a number of dice $k$ such that $6^k$ is just above a power of two, to minimize the probability of recursion and maximize our use of entropy. A computer program could be written to find the optimal number of dice to roll according to some efficiency metric, depending on your application's needs (or perhaps a general one exists).
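Such a search is only a few lines of Python; here is a sketch using expected output bits per roll as the efficiency metric (one reasonable choice among several; the names are mine):

```python
from fractions import Fraction

def bits_per_roll(k):
    """Expected output bits per die roll when k dice are thrown together:
    the k-roll outcome (uniform in [0, 6^k)) is accepted iff it is below
    2^b, the largest power of two not exceeding 6^k, yielding b bits."""
    b = (6**k).bit_length() - 1   # largest b with 2**b <= 6**k
    p = Fraction(2**b, 6**k)      # acceptance probability
    return b * p / k              # exact rational

best_k = max(range(1, 101), key=bits_per_roll)
```

For $k=1$ this gives $\frac43$ bits per roll, matching the basic algorithm, and the search over 1 to 100 dice reproduces the value 53 mentioned in the comments below.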

The "intuition" behind this is that you can't use arbitrary amounts of entropy, since the bit is the absolute smallest amount of information possible. Any non-integral amount of entropy remaining must be "left behind" (here we use a recursive procedure to eliminate it, but there are other alternatives*).

However, if you can combine multiple such non-integral amounts of entropy through a non-destructive process (here, throwing multiple dice at the same time), you naturally get closer to integral amounts of entropy ($0.5 + 0.5 = 1$), which lets you make use of what you would otherwise have had to throw away.

*The answers to this question also suggest recursion is not an optimal approach in terms of efficiency.

A quick calculation on my computer shows that throwing the dice 53 times would be best (from 1 to 100 dice throws) ... ah, 12 throws is a bit more practical and quite good too :)
–
Maarten Bodewes, Jan 31 '13 at 0:21

12 throws gives you 31 bits, nice for a positive signed integer, but 32 would have been more practical :(
–
Maarten Bodewes, Jan 31 '13 at 0:27

Since the number and the rotation should both be random variables, the joint distribution is uniform and unbiased. The good thing is that the rotational distinctions between the upper and lower half, and between all four quadrants, are almost as easy to tell apart with the naked eye as the numbers themselves. Near the boundaries, though, there will be some error and bias, introducing a small amount of additional noise into the signal.

Effectively this procedure amounts to the following: roll the die, measuring its number and quadrant of rotation, and discard 1 bit of the quadrant information for the numbers {2, 3, 4, 5}.

Here is a specialization of this answer that is efficient in terms of average entropy loss, and easily implemented.

Let $n$ be the number of faces on the die ($n=6$ for a common die). Let $b$ be the minimum value on the die ($b=1$ on a common die). Let $m$ be some large integer with $m≥n^2$. No value used will exceed $m$.

1. Set $r←0$, $s←1$.
2. If $s$ is even then proceed to 8.
3. If $s>m/n$ then proceed to 7.
4. Throw the die, getting $x$.
5. Set $r←(x-b)+(n⋅r)$ and $s←n⋅s$.
6. Proceed to 2 (optionally: or directly to 8 for even $n$).
7. If $r=s-1$ then proceed to 1.
8. Output $r\bmod 2$.
9. Set $r←⌊r/2⌋$ and $s←⌊s/2⌋$.
10. Proceed to 2.

The algorithm maintains $r$ uniformly random with $0≤r<s$, and its correctness follows from that.

As $m$ grows, the average proportion of lost entropy vanishes. However, the algorithm will from time to time (including initially) experience a longer period where it accumulates entropy and temporarily outputs at a low rate (or not at all for several consecutive dice throws, for odd $n$).
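A Python transcription of this procedure may help; the function name, the `roll` callable interface, and the choice $m=6^4$ are my own (the step comments follow the numbering of the steps above):

```python
import random

def dice_to_bits(roll, num_bits, n=6, b=1, m=6**4):
    """Extract num_bits uniform bits from a die-roll source roll(),
    which returns values in [b, b+n-1].  Requires m >= n*n."""
    bits = []
    r, s = 0, 1                      # step 1: invariant, r uniform in [0, s)
    while len(bits) < num_bits:
        if s % 2 == 0:               # step 2: s even -> go to step 8
            bits.append(r % 2)       # step 8: output one bit
            r //= 2                  # step 9: halve r and s
            s //= 2
            continue                 # step 10: back to step 2
        if s * n > m:                # step 3: no room for another roll
            if r == s - 1:           # step 7: reject, restart at step 1
                r, s = 0, 1
            else:                    # otherwise r is uniform on [0, s-1),
                bits.append(r % 2)   # an even-sized range, so steps 8-9
                r //= 2              # still yield an unbiased bit
                s //= 2
            continue
        x = roll()                   # step 4: throw the die
        r = (x - b) + n * r          # step 5: accumulate base-n digits
        s = n * s
    return bits

rng = random.Random(1)
out = dice_to_bits(lambda: rng.randint(1, 6), 10_000)
```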