You are here

Home » Frank Morgan's Math Chat - Rolling a Die Until Every Face Has Appeared

Frank Morgan's Math Chat - Rolling a Die Until Every Face Has Appeared

January 7, 1999

OLD CHALLENGE (Steve Jabloner). How many times do you think you need to roll a normal die to be 90% sure that each of the six faces has appeared at least once? Why?

ANSWER. It takes 23 rolls, as Al Zimmerman discovered in a computer experiment rolling 5000 dice until 90% of them had shown each face. In a less accurate experiment with just 300 dice, Eric Brahinsky wrongly concluded that 22 rolls provide a 90.7% probability. Actually, the probability for 22 rolls is just 89.33%, while for 23 rolls it is 91.08%. John Robertson and Jean-Pierre Carmichael found a general formula for the probability for n rolls (see Feller's "An Introduction to Probability Theory and Its Applications" (2nd ed., vol. 1, p. 106)):

(*) 1x6n - 6x5n + 15x4n - 20x3n + 15x2n - 6 all divided by 6n.

Derek Smith of the Math Chat staff explains how to get this formula by so-called "inclusion-exclusion." Let's call the sides 1, 2, ..., 6, so that one sequence of, say, n = 10 rolls we can write as 1635241533. The total number of possible sequences of length n is 6n, because at each roll there are 6 possibilities for the side that's up.

We're interested in those sequences that use ALL of the sides at least once, so let's throw away the ones that don't. There are 5n sequences of length n that don't mention side 1, since at each roll we need the side up to be 2, 3, 4, 5, or 6. Similarly, there are 5n sequences of length n that don't mention side 2, 5n that don't mention side 3, and so on.

So it looks like the number we want is 6n - 6x5n, but that's not quite it, because we threw away some sequences more than once. For example, if n = 10, 2233223322 is a sequence which doesn't include 1, so we throw it away on that basis; but it also doesn't include 4, so we throw it away for this reason as well. We need to make up for this overlap, and then we need to make up for any overlap our new correction causes, etc.

Fortunately, this process stops after a while, and we get the formula

f(n) = 1x6n - 6x5n + 15x4n - 20x3n + 15x2n - 6x1n + 1x0n

for the number of sequences of length n that contain each of the numbers 1, 2, 3, 4, 5, and 6 at least once. (The coefficients {1, 6, 15, 20, 15, 6, 1} in front of the powers come from Pascal's triangle.)

Since we said the total number of sequences of length n is 6n, in order that 90% of the sequences of length n mention each of the sides, we need to have f(n)/6n be at least .90, and this first happens when n = 23.

Joe Shipman starts his analysis by ignoring the smaller possibility that four or fewer distinct faces appear. Then the number of ways only five distinct faces appear is 6x5n, which we want to be at least 10% of 6n. So we set 60x5n = 6n, divide log 60 by log(6/5), and get 22.456. This suggests we look at 22 and 23. When n = 23 there are 623 = 789,730,223,053,602,816 possibilities, and at most 6x523 = 71,525,573,730,468,750 omit a face. This latter number multiple-counts scenarios where more than one face is omitted. So for n = 23 there is a greater than 90% chance that all six faces will appear. A similar, but slightly more complicated argument shows that for n = 22 the probability is less than 90%.

NEW CHALLENGE. In his Contract Bridge syndicated column on December 31, Philip Alder says that the odds against getting a "Yarborough--a hand with no card higher than a nine" are 1827 to 1, and that the odds against two players getting a Yarborough on the same deal are about 182 million to 1. Is he right? (A bridge deck consists of 52 cards, AKQJT98765432 in each of four suits. Each of four players is randomly dealt 13 cards.)