"Linux Gazette...making Linux just a little more fun!"

Taming rand() and random()

In the lower levels of the Ontario Science Center in Toronto, Canada,
there is a wide circular device made of thin rods of steel. Curious bystanders
can take billiard balls, put there for that purpose, and let them loose
on the machine. The balls whiz along their rails, ricocheting off
pins, clanging through wind chimes, grabbed by
counterweighted arms and lifted towards the ceiling. At several
places the balls choose one rail or another purely at random. How
is it that a construct not powered in any way, laid out in a rigid pattern,
still produces unexpected results?

Writing programs that use random numbers requires an understanding of
error estimation, probability theory, statistics and other advanced numeric
disciplines.

Bunk.

Random numbers are about getting your programs to do the unexpected
without a core dump being involved. They're about having fun.

Random But Not Really Random

Computers do not use "real world" random numbers. Like the
billiard-ball machine, computers are rigid, constrained by rules and logical
behaviour. For a computer to generate truly random numbers, it would
have to choose numbers by examining real world events. In the early
days, people might roll some 10-sided dice and compose a list of digits
for a program to use.

Unfortunately real-world random numbers can be unexpectedly biased.
As the old saying goes, "the real world
is a special case." Instead, computers rely on mathematics to
generate uniformly distributed (that is, random but not too random) numbers.
They are "pseudo-random", generated by mathematical functions which create
a seemingly non-repeating sequence. Over time, the numbers in the
sequence will reliably occur equally often, with no one number being favoured
over another.

The Linux standard C library (stdlib.h) has two built-in random number
functions. The first, rand(), returns a random integer between 0
and RAND_MAX.

The other standard library function, random(), returns a nonnegative long
integer. On 32-bit Linux, int and long are the same size; on 64-bit Linux
a long is wider, although random() still returns values no larger than
2^31 - 1. random() has some other properties that are discussed
below.
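As a quick illustration of the two calls, here is a minimal sketch that prints one value from each generator next to RAND_MAX. The function name show_raw_values() is made up for this example, and the values printed will depend on your C library.

```c
#define _DEFAULT_SOURCE 1   /* exposes random() when compiling with strict -std=c99 */
#include <stdio.h>
#include <stdlib.h>

/* Print one raw value from each generator alongside RAND_MAX.
   show_raw_values() is a hypothetical name used only in this sketch. */
void show_raw_values(void) {
    int  r = rand();     /* 0 .. RAND_MAX */
    long l = random();   /* nonnegative long */

    printf("RAND_MAX = %d\n", RAND_MAX);
    printf("rand()   = %d\n", r);
    printf("random() = %ld\n", l);
}
```

Neither generator has been seeded here, so running the sketch twice should print the same numbers both times.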

There are also older, obsolete functions to produce random numbers:

* drand48/erand48 return a random double from 0.0 up to, but not including, 1.0.
* lrand48/nrand48 return a nonnegative random long between 0 and 2^31 - 1.
* mrand48/jrand48 return a signed random long between -2^31 and 2^31 - 1.

These are provided for backward compatibility with other flavours of
UNIX.
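If you do meet these functions in older code, the floating-point variant is the easiest to scale. The sketch below assumes a hypothetical helper called rnd48() that turns the result of drand48() into an integer between 1 and max.

```c
#define _DEFAULT_SOURCE 1   /* exposes drand48() when compiling with strict -std=c99 */
#include <stdlib.h>

/* Scale drand48()'s result, which is at least 0.0 and less than 1.0,
   to an integer in 1..max. rnd48() is a hypothetical name for this sketch. */
int rnd48(int max) {
    return (int)(drand48() * max) + 1;
}
```

The drand48 family has its own seed, set with srand48(), so seeding rand() has no effect on it.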

rand() and random() are, of course, totally useless as they appear and
are rarely called directly. It's not often we're looking for a
number between 0 and a really big number: the numbers need to apply to
actual problems with specific ranges of alternatives. To tame rand(),
its value must be scaled to a more useful range such as between 1 and some
specific maximum. The modulus (%) operator works well: when a number
is divided, the remainder is between 0 and 1 less than the divisor.
Adding 1 to the modulus result gives the range we're looking for.

int rnd( int max ) {return (rand() % max) + 1; }

This one line function will return numbers between 1 and a specified
maximum. rnd(10) will return numbers between 1 and 10, rnd(50) will
return numbers between 1 and 50. Real life events can be simulated
by assigning numbers for different outcomes. Flipping a coin
is rnd(2)==1 for heads, rnd(2)==2 for tails. Rolling a pair of dice
is rnd(6)+rnd(6).
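The coin and dice examples can be wrapped up as small functions. This is only a sketch: coin_is_heads() and roll_two_dice() are names invented here, built on the rnd() function shown above.

```c
#include <stdlib.h>

/* The one-line scaling function from the text: returns 1..max. */
int rnd(int max) { return (rand() % max) + 1; }

/* Flip a coin: returns 1 for heads (rnd(2) == 1), 0 for tails. */
int coin_is_heads(void) {
    return rnd(2) == 1;
}

/* Roll a pair of six-sided dice and return the total, 2..12. */
int roll_two_dice(void) {
    return rnd(6) + rnd(6);
}
```

Because two dice are added together, totals near 7 will come up more often than 2 or 12, just as they do on a real table.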

The rand() discussion in the Linux manual recommends that you take the
"upper bits" (that is, use division instead of modulus) because they tend
to be more random. However, the rnd() function above is suitably
random for most applications.
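For comparison, the division approach can be sketched like this. The name rnd_upper() is hypothetical; the idea is to turn rand() into a fraction of its full range first, so the result is decided mainly by the high-order bits rather than the low-order ones.

```c
#include <stdlib.h>

/* Return 1..max by scaling rand() with division instead of modulus.
   rand() / (RAND_MAX + 1.0) is at least 0.0 and strictly less than 1.0. */
int rnd_upper(int max) {
    double fraction = rand() / (RAND_MAX + 1.0);
    return 1 + (int)(max * fraction);
}
```

It is a drop-in replacement for rnd(): rnd_upper(10) returns numbers between 1 and 10.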

The following test program generates 100 numbers between 1 and 10, counting
how often each number comes up in the sequence. If the numbers were
perfectly uniform, they would appear 10 times each.
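A minimal sketch of such a test might look like the following. The names ROLLS, MAX_VALUE, tally_rolls() and print_graph() are assumptions made for this example, and the exact counts you see will depend on your C library.

```c
#include <stdio.h>
#include <stdlib.h>

#define ROLLS     100   /* how many numbers to generate */
#define MAX_VALUE 10    /* numbers fall between 1 and MAX_VALUE */

int rnd(int max) { return (rand() % max) + 1; }

/* Count how often each value 1..MAX_VALUE comes up in ROLLS calls to rnd().
   graph[0] is unused so that graph[n] counts the number n. */
void tally_rolls(int graph[MAX_VALUE + 1]) {
    for (int i = 0; i <= MAX_VALUE; i++)
        graph[i] = 0;
    for (int i = 0; i < ROLLS; i++)
        graph[rnd(MAX_VALUE)]++;
}

/* Print the counts on one line, labelled with the generator's name. */
void print_graph(const char *label, const int graph[MAX_VALUE + 1]) {
    printf("for %s, graph[1..%d] is", label, MAX_VALUE);
    for (int i = 1; i <= MAX_VALUE; i++)
        printf(" %d", graph[i]);
    printf("\n");
}
```

Calling tally_rolls() and then print_graph( "rnd()", graph ) from main() prints ten counts that add up to 100.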

Linux's rand() function goes to great efforts to generate high-quality
random numbers and therefore uses a significant amount of CPU time. If
you need to generate a lot of mediocre-quality random numbers quickly, you
can use a function like this:
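What follows is a minimal sketch of such a function, assuming a simple multiply-and-add generator. The variable fast_seed and the two constants are illustrative choices (both happen to be prime), not values taken from any library.

```c
/* Illustrative constants for this sketch; both are prime numbers. */
#define FAST_MULTIPLIER 4079u
#define FAST_OFFSET     12923u

/* The generator's state; assign to it to restart the sequence. */
unsigned int fast_seed = 0;

/* Return a quick-and-dirty random number in 1..max.
   Unsigned arithmetic wraps around on overflow, which is what we want. */
int fast_rnd(int max) {
    fast_seed = fast_seed * FAST_MULTIPLIER + FAST_OFFSET;
    /* Skip the low 16 bits: in generators like this one they repeat
       in very short cycles. */
    return (int)((fast_seed >> 16) % (unsigned int)max) + 1;
}
```

The whole thing is one multiplication, one addition, a shift and a remainder per call.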

This function sacrifices accuracy for speed: it will produce random
numbers not quite as mathematically uniform as rnd(), but it uses only
a few short calculations. Ideally, the offset and multiplier should
be prime numbers so that fewer numbers will be favoured over others.

Replacing rnd() with fast_rnd() in the test program still gives a reasonable
approximation of rand():

for fast_rnd(), graph[1..10] is 11 4 4 1 8 8 5 7 6 5

Controlling the Sequence

A seed is the initial value given to a random number generator
to produce the first random number. If you set
the seed to a certain value, the sequence of numbers will always repeat,
starting with the same number. If you are writing a game, for example,
you can set the seed to a specific value and use fast_rnd() to position
enemies in the same place each time without actually having to save any
location information.

The seed for the Linux rand() function is set by srand(). For example,

srand( 4 );

will set the rand() seed to 4.
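To see the repeating behaviour for yourself, here is a small sketch. The helper name sequence_from_seed() is invented for this example; it seeds rand() and records the first few numbers that follow.

```c
#include <stdlib.h>

/* Seed rand() with the given value, then store the next n results in out[]. */
void sequence_from_seed(unsigned int seed, int out[], int n) {
    srand(seed);
    for (int i = 0; i < n; i++)
        out[i] = rand();
}
```

Calling sequence_from_seed( 4, ... ) twice fills both arrays with identical numbers, because the same seed always restarts the same sequence.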

There are two ways to control the sequence with the other Linux
function, random(). First, srandom(), like srand(), will set a seed
for random().

Second, if you need greater precision, Linux provides two functions
to control the speed and precision of random(). With initstate(),
you can give random() both a seed and a buffer for keeping the intermediate
function result. The buffer can be 8, 32, 64, 128 or 256 bytes in
size. Larger buffers will give better random numbers but will take
longer to calculate as a result.
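A minimal sketch of initstate() in use is shown below. The buffer state_buffer and the function first_value_with_state() are hypothetical names, and no particular output values are promised.

```c
#define _DEFAULT_SOURCE 1   /* exposes random() and friends under strict -std=c99 */
#include <stdlib.h>

/* A 256-byte buffer for random() to keep its intermediate results in. */
static char state_buffer[256];

/* Give random() a seed and the 256-byte buffer, then return the first
   number it produces. The previously active state is restored afterwards. */
long first_value_with_state(unsigned int seed) {
    char *previous = initstate(seed, state_buffer, sizeof state_buffer);
    long value = random();
    setstate(previous);
    return value;
}
```

Calling the function twice with the same seed should return the same number, since initstate() resets the buffer each time.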

using a 256 byte state, we get 510644794
using a 256 byte state, we get 625058908
resetting the state, we get 510644794

You can switch random() states with setstate(), followed by srandom()
to initialize the seed to a specific value.
On success, setstate() returns a pointer to the previous state.

oldstate = setstate( newstate );
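One use for switching states is to keep two independent sequences going at once. The sketch below assumes two hypothetical buffers, enemy_state and loot_state, each set up with initstate() and then selected with setstate() whenever a number is needed.

```c
#define _DEFAULT_SOURCE 1   /* exposes random() and friends under strict -std=c99 */
#include <stdlib.h>

/* Two separate state buffers, one per sequence (hypothetical names). */
static char enemy_state[64];
static char loot_state[64];

/* Seed both sequences, then put back whatever state was active before. */
void init_streams(unsigned int enemy_seed, unsigned int loot_seed) {
    char *original = initstate(enemy_seed, enemy_state, sizeof enemy_state);
    initstate(loot_seed, loot_state, sizeof loot_state);
    setstate(original);
}

/* Draw the next number from the given state without disturbing the
   state that was active when we were called. */
long next_from(char *state) {
    char *previous = setstate(state);
    long value = random();
    setstate(previous);
    return value;
}
```

Numbers drawn from loot_state do not advance enemy_state, so the enemy sequence stays the same no matter how much loot is generated in between.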

Unless you change the seed when your program starts, your random numbers
will always be the same. To create changing random sequences, the
seed should be set to some value outside of the program's or user's control.
Using the time code returned by time.h's time() is a good choice.

srand( time( NULL ) );

Since the time is always changing, this will give your program a new
sequence of random numbers each
time it begins execution.

Randomizing Lists

One of the classic gaming problems that seems to stump many people
is shuffling, changing the order of items in a list. While I was
at university, the Computer Center there faced the task of putting a list
of names into random order. Their solution was to print out the names on
paper, cut the paper with scissors, and pull the slips of paper from a
bucket and retype them into the computer.

So what is the best approach to shuffling a list? Cutting up a
printout? Dubious. Exchanging random items a few thousand
times? Effective, but slow, and it doesn't guarantee that all items
will have a chance to be moved. Instead, take each item in the list
and exchange it with some other item. For example, suppose we have
a list of 52 playing cards represented by the numbers 0 to 51. To
shuffle the cards, we'd do the following:
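Here is a minimal sketch of that idea. It uses the Fisher-Yates variant, in which each card is exchanged with a randomly chosen card at or after its own position; that restriction is an assumption of this sketch rather than something spelled out above. The names DECK_SIZE, new_deck() and shuffle_deck() are invented for the example.

```c
#include <stdlib.h>

#define DECK_SIZE 52

int rnd(int max) { return (rand() % max) + 1; }

/* Fill the deck with the cards 0..51 in order. */
void new_deck(int deck[DECK_SIZE]) {
    for (int i = 0; i < DECK_SIZE; i++)
        deck[i] = i;
}

/* Walk through the deck once, exchanging each card with a randomly
   chosen card from position i to the end of the deck. */
void shuffle_deck(int deck[DECK_SIZE]) {
    for (int i = 0; i < DECK_SIZE - 1; i++) {
        int j = i + rnd(DECK_SIZE - i) - 1;   /* i <= j <= DECK_SIZE - 1 */
        int tmp = deck[i];
        deck[i] = deck[j];
        deck[j] = tmp;
    }
}
```

One pass over the list is enough: every card is visited exactly once, and no card is lost or duplicated because the only operation is an exchange.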

Different Types of Randomness

People acquainted with statistics know that many real-life events
do not happen with a uniform pattern. The first major repair for
a car, for example, might happen between 5 and 9 years after purchase,
but it might be most common around the 7th year. Any year in the
range is likely, but it's most likely to be in the middle of the range.

Small unexpected events like these occur in a bell curve shape (called
a normal distribution in statistics). Creating random numbers that
conform to such a complex shape may seem like a daunting task, but it really
isn't. Since our rnd() function already produces nicely uniform "unexpected"
events, we don't need a statistics textbook formula to generate normally
distributed random numbers. All we need to do is call rnd() a few
times and take the average, simulating a normal distribution.
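The averaging trick can be sketched as follows. The name normal_rnd() and the choice of four draws are assumptions for this example; more draws give a tighter bunching around the middle.

```c
#include <stdlib.h>

#define DRAWS 4   /* how many uniform numbers to average (an arbitrary choice) */

int rnd(int max) { return (rand() % max) + 1; }

/* Return a number in 1..max that is most likely to fall near the middle
   of the range, by averaging several uniform random numbers. */
int normal_rnd(int max) {
    int total = 0;
    for (int i = 0; i < DRAWS; i++)
        total += rnd(max);
    return total / DRAWS;   /* integer division rounds the average down */
}
```

For the car repair example, 4 + normal_rnd( 5 ) gives a year between 5 and 9 with the middle of the range coming up most often.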

A function such as low_rnd() can instead favour the low end of a range by
calling itself recursively. In each recursion, low_rnd() splits the range
in half, favouring the lower half of the range. By deducting a low random
number from the top of the range, we could write a corresponding high_rnd()
favouring numbers near the max:
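The two functions can be sketched like this. The bodies are an assumption pieced together from the description: half the time low_rnd() settles for a uniform number from the current range, and otherwise it tries again with the lower half of the range.

```c
#include <stdlib.h>

int rnd(int max) { return (rand() % max) + 1; }

/* Return a number in 1..max, with low numbers coming up most often. */
int low_rnd(int max) {
    if (max <= 1)
        return 1;                 /* nothing left to split */
    if (rnd(2) == 1)
        return rnd(max);          /* accept a uniform pick from this range */
    return low_rnd(max / 2);      /* otherwise recurse on the lower half */
}

/* Return a number in 1..max, with high numbers coming up most often,
   by deducting a low random number from the top of the range. */
int high_rnd(int max) {
    return max - low_rnd(max) + 1;
}
```

Both functions still cover the whole range from 1 to max; only the odds of each number change.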

A function such as odds() is true the specified percentage of the time,
making it easy to incorporate into an if statement.
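One way to write such a function is sketched below. The body is an assumption: it draws a number from 1 to 100 and checks whether it falls within the requested percentage.

```c
#include <stdlib.h>

int rnd(int max) { return (rand() % max) + 1; }

/* Return 1 (true) roughly the given percentage of the time, else 0.
   A percent of 0 or less is never true; 100 or more is always true. */
int odds(int percent) {
    return rnd(100) <= percent;
}
```

Since rnd( 100 ) is never less than 1 or more than 100, the two extremes need no special handling.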

if ( odds( 50 ) )
   printf( "The cave did not collapse!\n" );
else
   printf( "Ouch! You are squashed beneath a mountain of boulders.\n" );

The standard C library rand() and random() functions provide a program
with uniformly distributed random numbers. The sequence and precision
can be controlled by other library functions and the distribution of numbers
can be altered by simple functions. Random numbers can add unpredictability
to a program and are, of course, the backbone of exciting play in computer
games.