How random is Python’s random?

Jerry Stratton, March 20, 2011

Computers have a poor reputation for randomness, and with good reason. In the old days, randomness wasn’t taken seriously. To get a random number, the system might poll some non-random aspect of the hardware. Systems whose programmers understood some mathematics might use an algorithm, but they would reset the algorithm every time the computer started up or, worse, every time a program started up, so that a program produced the same series of “random” numbers every time it ran.

Nowadays you shouldn’t have to worry about that. The reason is security: modern security relies heavily on getting truly random numbers out of the computer’s random algorithms. This has resulted in a lot of research into computer-friendly random algorithms. Python’s random functions are able to leverage this research.1 You should have no problem using your computer’s random number generators for gaming purposes nowadays.

The default random number generators in most programming languages are not cryptographically-appropriate random, however.2 The reason for this is that computer programs are bound by predictability: programmers want predictability, even when they say they want randomness.
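For contrast, Python does provide generators backed by the operating system's entropy source, for cases where predictability is a liability rather than a feature. Here is a minimal sketch, assuming Python 3.6 or later (where the `secrets` module exists); the `roll_d6_secure` name is made up for illustration.

```python
import random
import secrets


def roll_d6_secure():
    """Roll one d6 from the OS entropy source; it cannot be seeded or replayed."""
    return secrets.randbelow(6) + 1


# random.SystemRandom exposes the same OS source through the familiar random API.
system_rng = random.SystemRandom()

print(roll_d6_secure(), system_rng.randrange(6) + 1)
```

For a game this is overkill, and it gives up the repeatability that makes seeding useful in the first place.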

Imagine, for example, that you’re animating a film clip in a 3D renderer. You carefully place all of the major components of the scene, but you also have a flock of birds flying in the background. You outline the path that the flock will take, but after you render the video, the birds are far too uniform. They look more like a smooth amusement park ride than a flock of individual creatures. You might say they look like a railroaded adventure.

So you add some randomness to each individual bird’s flight path; each bird follows the basic flight path of the flock, but each moves left, right, up and down by some random amount each frame of the video. You don’t want to individually plot every single member of the flock, you just want them to act like birds. A little random variation is fine.3

But when you look at the film, you see that some of the birds are flying through a telephone pole. So you make some minor adjustments to the flight path and try again, and this time it’s great.

So you make your film and you send it off to the producers, and they say, awesome! Let’s make a high-definition version for the theaters. So you re-render the film… and the birds are moving in different directions this time, because it’s random! Every time you re-render the film, the birds move differently.

The solution, and this is often enforced by the renderer, is to require you to “seed” a random number generator whose output passes statistical tests for randomness but which actually comes from a fixed series of artificially (that is, pseudo) random numbers. Python uses the Mersenne twister. This algorithm produces 2**19937-1 “random” results before repeating itself. That’s the number “4” followed by 6,001 digits.4 When you ask for a random number, or a random choice within a list, Python “seeds” this long list by choosing a starting point for the algorithm, and then takes each successive number from the algorithm’s “list”.
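The size of that period, two to the 19,937th power minus one, is easy to check with logarithms instead of printing the whole number. This is only a sketch, written for Python 3; the variable names are mine.

```python
import math

EXPONENT = 19937  # the Mersenne twister's period is 2**19937 - 1

# log10 of 2**19937; subtracting 1 changes neither the digit count nor the leading digit
log10_period = EXPONENT * math.log10(2)

digit_count = math.floor(log10_period) + 1
leading_digit = int(10 ** (log10_period - math.floor(log10_period)))

print(digit_count, leading_digit)  # 6002 4
```

A 6,002-digit number that starts with 4 is a “4” followed by 6,001 more digits.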

By default, you let Python choose your seed for you. But if you give the random number generator a seed, it will produce the same random numbers each time you give it that seed. Thus, you can use randomness to make a better flock without fear of the birds acting differently each time you render the scene. Persistence of Vision uses this method to give you, as a 3D artist, non-random pseudorandomness.
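The flock example can be sketched in a few lines. This is not how any particular renderer does it; the `flock_offsets` function, the bird and frame counts, and the seed values are all hypothetical, and the code is written for Python 3.

```python
import random


def flock_offsets(seed, birds=3, frames=4):
    """Return per-frame (dx, dy) jitter for each bird, reproducible for a given seed."""
    rng = random.Random(seed)  # private generator; leaves the global one untouched
    return [
        [(rng.uniform(-1, 1), rng.uniform(-1, 1)) for _ in range(frames)]
        for _ in range(birds)
    ]


first_render = flock_offsets(seed=2011)
hd_render = flock_offsets(seed=2011)
other_seed = flock_offsets(seed=2012)

print(first_render == hd_render)   # same seed, same flight paths
print(first_render == other_seed)  # different seed, different flight paths
```

Using a separate `random.Random` instance rather than the module-level functions means that nothing else in the program can disturb the flock's sequence.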

In the examples I’ve provided so far, we have not specified a seed, so the computer chooses a seed randomly. This makes our computer-generated rolls, for all practical purposes, fully random. But you can specify a seed if you want to see this phenomenon in action, and thus get the same random numbers every time.

#!/usr/bin/python
import optparse
import random

parser = optparse.OptionParser()
(options, args) = parser.parse_args()

# with no arguments, roll a single die
if not args:
    args = [1]

# each argument is the number of d6 to roll and total
for dieCount in args:
    dieCount = int(dieCount)
    total = 0
    dice = []
    for counter in range(dieCount):
        die = random.randrange(6)+1
        total = total + die
        dice.append(die)

    # show the total, then the individual dice if more than one was rolled
    print total,
    if dieCount > 1:
        print ':', dice
    else:
        print

Save this as die6, make it executable, and you can run it as ./die6. You’ll get a random sequence of d6 rolls every time. You can roll up some stats using:

$ ./die6 3 3 3 3 3 3

This predictability is what makes this particular algorithm unsafe for cryptography: if an attacker can determine where in the sequence you were when you generated your random numbers, they can predict what those numbers are. For gaming purposes, however, this is a non-issue. Unless your players are mathematical savants of unearthly power, they are not going to be able to predict where in that series you are, nor will it matter, because by default you get a different seed each time you run the program to “roll the die”.6
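To see what “knowing where in the sequence you are” buys, here is a simplified sketch for Python 3. It cheats by copying the generator's internal state with `getstate()`, which a real attacker would instead have to reconstruct from observed output; the variable names are only illustrative.

```python
import random

rng = random.Random()      # seeded by Python, as in the default case
rng.randrange(6)           # some rolls have already happened

snapshot = rng.getstate()  # the "where in the sequence" an attacker would need

actual_rolls = [rng.randrange(6) + 1 for _ in range(5)]

# Anyone holding the snapshot can replay the generator and know the rolls in advance.
clone = random.Random()
clone.setstate(snapshot)
predicted_rolls = [clone.randrange(6) + 1 for _ in range(5)]

print(actual_rolls == predicted_rolls)  # True
```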

That said, if you let them roll their own dice programs, you might consider looking at the code. Here’s a variation that lets you specify your own seed:

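#!/usr/bin/python
import optparse
import random

parser = optparse.OptionParser()
parser.add_option('-s', '--seed', type='int')
(options, args) = parser.parse_args()

# with no arguments, roll a single die
if not args:
    args = [1]

# only reseed when a seed was given; "is not None" lets a seed of 0 work too
if options.seed is not None:
    random.seed(options.seed)

# each argument is the number of d6 to roll and total
for dieCount in args:
    dieCount = int(dieCount)
    total = 0
    dice = []
    for counter in range(dieCount):
        die = random.randrange(6)+1
        total = total + die
        dice.append(die)

    # show the total, then the individual dice if more than one was rolled
    print total,
    if dieCount > 1:
        print ':', dice
    else:
        print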

Seed 631 isn’t a bad set of stats. 635 is also believable with a couple of seventeens and only one dump stat. Seed 3555, on the other hand, is probably not believable. But take a look at 48191: two eighteens, two eights. You could get away with that. On the other hand, seed 122075, probably not.

$ ./die6 3 3 3 3 3 3 --seed 631
10 : [5, 3, 2]
14 : [4, 6, 4]
18 : [6, 6, 6]
12 : [5, 6, 1]
15 : [6, 5, 4]
14 : [4, 6, 4]

$ ./die6 3 3 3 3 3 3 --seed 635
12 : [3, 5, 4]
13 : [2, 5, 6]
8 : [1, 2, 5]
12 : [4, 6, 2]
17 : [6, 6, 5]
17 : [6, 6, 5]

$ ./die6 3 3 3 3 3 3 --seed 3555
12 : [6, 4, 2]
15 : [6, 4, 5]
16 : [6, 4, 6]
17 : [5, 6, 6]
18 : [6, 6, 6]
12 : [4, 3, 5]

$ ./die6 3 3 3 3 3 3 --seed 48191
18 : [6, 6, 6]
8 : [1, 3, 4]
12 : [3, 5, 4]
18 : [6, 6, 6]
8 : [1, 2, 5]
15 : [6, 6, 3]

$ ./die6 3 3 3 3 3 3 --seed 122075
11 : [4, 5, 2]
12 : [3, 6, 3]
18 : [6, 6, 6]
12 : [6, 5, 1]
18 : [6, 6, 6]
18 : [6, 6, 6]
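Finding seeds like these doesn't take unearthly power, just a loop. The sketch below is for Python 3, and `roll_stats`, `find_generous_seeds`, and the threshold of 85 are all made up for illustration. The seeds it turns up may not match the ones above, because Python 3 does not necessarily draw integers the same way the Python 2 script does.

```python
import random


def roll_stats(seed, stat_count=6, dice_per_stat=3):
    """Roll a set of 3d6 stats from a generator started at the given seed."""
    rng = random.Random(seed)
    return [
        sum(rng.randrange(6) + 1 for _ in range(dice_per_stat))
        for _ in range(stat_count)
    ]


def find_generous_seeds(minimum_total, seeds=range(1, 5000)):
    """Return the seeds whose six stats add up to at least minimum_total."""
    return [seed for seed in seeds if sum(roll_stats(seed)) >= minimum_total]


generous = find_generous_seeds(minimum_total=85)
print(len(generous), "of 4999 seeds give stats totaling 85 or more")
for seed in generous[:3]:
    print(seed, roll_stats(seed))
```

A player who hard-codes one of those seeds into their own dice program will “roll” the same generous stats every time, which is the reason to look at the code.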

But as long as you don’t use the same seed every time, this will be random enough.

“If your list is such that there are 2**19937 or more permutations, some of these will never be obtained by shuffling the list. You’d (again, conceptually) generate all permutations of the list, then generate a random number x, and pick the xth permutation. Next time, you generate another random number y, and pick the yth permutation. And so on. But, since there are more permutations than you’ll get random numbers (because, at most after 2**19937-1 generated numbers, you’ll start getting the same ones again), you’ll start picking the same permutations again.”
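How long does a list have to be before its permutations outnumber the generator's period? A short sketch for Python 3 can count it out with exact integers; the function name is mine.

```python
PERIOD = 2**19937 - 1  # how many numbers the Mersenne twister yields before repeating


def first_length_exceeding_period():
    """Return the smallest list length n for which n! is larger than the period."""
    n = 1
    permutations = 1  # 1! == 1
    while permutations <= PERIOD:
        n += 1
        permutations *= n  # now holds n!
    return n


print(first_length_exceeding_period())  # 2081
```

So a list of 2,080 items is the longest whose orderings could all, in principle, be reached. A deck of cards or a handful of dice is nowhere near that.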

“The Mersenne twister is a pseudorandom number generator developed in 1997 by Makoto Matsumoto and Takuji Nishimura. It provides for fast generation of very high-quality pseudorandom numbers, having been designed specifically to rectify many of the flaws found in older algorithms.”

“The Birthday Paradox applies where given value can occur more than once during the period of the generator—and therefore duplicates can happen within a sample of values. The effect of the Birthday Paradox is that the real likelihood of getting such duplicates is quite significant and the average period between them is smaller than one might otherwise have thought. This dissonance between the perceived and actual probabilities makes the Birthday Paradox a good example of a cognitive bias, where a naive intuitive estimate is likely to be wildly wrong.”
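The arithmetic behind the Birthday Paradox is short enough to sketch in Python 3. The `duplicate_probability` function is my own illustration and assumes every value is equally likely.

```python
def duplicate_probability(draws, possible_values):
    """Chance that at least two of `draws` uniform picks from `possible_values` coincide."""
    all_distinct = 1.0
    for i in range(draws):
        # the i-th pick must avoid the i values already seen
        all_distinct *= (possible_values - i) / possible_values
    return 1.0 - all_distinct


print(round(duplicate_probability(23, 365), 3))  # classic birthday case: 0.507
print(round(duplicate_probability(6, 20), 3))    # six d20 rolls: 0.564
```

In other words, seeing the same number twice in six d20 rolls is more likely than not, which is why a repeat by itself says little about whether a generator or a die is broken.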

My d20 appears to have been rolling a lot of ones, a disaster if I were playing D&D but a boon for Gods & Monsters. Is my die really random, or is it skewed towards a particular result? Use the ‘R’ open source statistics tool to find out.
