I was trying to solve a hobby problem that required generating a million random numbers, but I quickly realized that it is difficult to make them unique. I picked up The Algorithm Design Manual to read about random number generation.

It has the following paragraph that I am not able to fully understand.

Unfortunately, generating random numbers looks a lot easier than it really is. Indeed, it is fundamentally impossible to produce truly random numbers on any deterministic device. Von Neumann [Neu63] said it best: “Anyone who considers arithmetical methods of producing random digits is, of course, in a state of sin.” The best we can hope for are pseudo-random numbers, a stream of numbers that appear as if they were generated randomly.

Why is it impossible to produce truly random numbers on any deterministic device? What does this sentence mean?

Are you really asking why you cannot produce truly random numbers on a deterministic device? Doesn't the question already contain the answer?
– herby, Dec 9 '11 at 17:02

If all the numbers you generate have to be unique, they aren't really random. It's entirely possible that a true random number generator will give the same result ten times in a row.
– TMN, Dec 9 '11 at 17:22

There is a flaw in looking for random numbers which are unique. If you are constraining the numbers to be unique, then they are not random, as randomness demands the possibility of repetition, no matter how improbable.
– Mark Booth, Dec 9 '11 at 17:29

Outside the computer, is any random number ever truly random? Throw a die, it's physics with just a lot of vectors.
– MPelletier, Dec 9 '11 at 17:49

@MPelletier: Not quite. Quantum mechanics might (once scientists have figured more of it out) imply the existence of true randomness, depending on your definition of randomness.
– Brian, Dec 9 '11 at 17:53

9 Answers

One should look for a cryptographically secure pseudo-random number generator (CSPRNG). Many common PRNGs are linear congruential generators (the next number is a linear function of the previous number, taken modulo some constant), so if you plot each number against the previous one you'll get a chart of parallel lines. A CSPRNG will not do that. The trade-off is that CSPRNGs are slower.
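
To make the "parallel lines" remark concrete, here is a minimal sketch of a toy linear congruential generator. The constants `a`, `c` and `m` are illustrative assumptions, chosen small so the structure is easy to see; real generators use far larger values.

```javascript
// Toy linear congruential generator (LCG): next = (a * prev + c) mod m.
// These constants are illustrative assumptions, not taken from any real library.
const a = 5, c = 3, m = 64;

function lcgNext(prev) {
  return (a * prev + c) % m;
}

// Every (prev, next) pair satisfies next = a * prev + c - k * m for some
// integer k, so each pair lies on one of a few parallel lines with slope a.
function lineIndex(prev) {
  return (a * prev + c - lcgNext(prev)) / m;
}

const lines = new Set();
let x = 1;
for (let i = 0; i < m; i++) {
  lines.add(lineIndex(x)); // record which line this (prev, next) pair sits on
  x = lcgNext(x);
}
console.log(`all ${m} pairs fall on ${lines.size} parallel lines`);
```

With these constants the 64 pairs land on only 5 lines, which is exactly the kind of visible structure a CSPRNG is designed not to have.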

Why is it impossible to produce truly random numbers in any deterministic device?

A deterministic device will always produce the same output when given the same starting conditions and inputs; that is what it means to be deterministic. "Truly random number" is more of a philosophical notion, since what it means to be random is the crux of the philosophical navel-gazing (folks aren't even certain whether atomic decay is random or follows some pattern we just can't figure out yet). A cryptographically secure random number generator takes some external source of entropy to make the device non-deterministic.

That is why it is impossible to get a truly random number. Even if the sequence never repeats, which is not guaranteed for random numbers, another run of the program with the same inputs will produce the same results. So someone else can reproduce your random numbers at a later time, which means they weren't really random at all.
– Spencer Rathbun, Dec 9 '11 at 17:12

@user973810 The problem with that definition from information theory is that you cannot exhibit an actual instance of a random sequence. We can prove, for any reasonable definition language, that almost every infinite sequence (in a technical sense) is random, because it cannot be described in the language at all. What is more useful is the concept of a random sequence generator: not one that produces a random sequence, but one that produces a sequence randomly.
– Gilles, Dec 9 '11 at 22:07

@David: We can go a bit further than that even. The various experiments on Bell's inequality show that certain quantum processes are definitively unpredictable. They may be random in some philosophical sense or they may depend on non-local hidden variables, but either case prevents reliable prediction.
– dmckee, Dec 10 '11 at 18:50

@dmckee: yeah, I just figured it would be easier to stay away from trying to explain the connection between Bell's inequality and wavefunction collapse in comments on prog.SE. People can always come to our site if they're curious ;-) Tangurena: true, Einstein did say that, but that just meant he really really wanted the universe to be deterministic. It's not, though. Experiments done after Einstein's death showed that pretty conclusively (barring non-local hidden variables, a.k.a. weirdness). Just because he's Einstein doesn't mean he was right about everything.
– David Z, Dec 12 '11 at 1:45

True randomness implies nondeterminism. If it's deterministic, it can be accurately predicted (this is what determinism means); if it can be predicted, it is not random.

The best thing you can get from a deterministic pseudo-random number generator is a stream of numbers that has a very long cycle (non-repeating is impossible unless your RNG device has unlimited storage) and which, for the length of the cycle, meets all the other properties of a random sequence (a uniform distribution of values being the most interesting one).
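
A rough sketch of why the cycle is bounded by the generator's state: with only `m` possible states, some state must come up again within `m` steps, and from then on the output repeats. The two update rules below are made-up examples, not real-world generators.

```javascript
// Count how many steps a deterministic generator takes to go once around
// the cycle it eventually falls into.
function cycleLength(next, seed) {
  const seenAt = new Map(); // state -> step index at which it first appeared
  let state = seed;
  let step = 0;
  while (!seenAt.has(state)) {
    seenAt.set(state, step);
    state = next(state);
    step++;
  }
  return step - seenAt.get(state); // length of the repeating part
}

const m = 1024; // number of possible states (an assumption for illustration)
const good = (s) => (5 * s + 3) % m; // visits every state before repeating
const poor = (s) => (4 * s + 2) % m; // quickly gets stuck on a single value

console.log(cycleLength(good, 1), cycleLength(poor, 1));
```

Neither rule can exceed `m` steps before repeating, and a badly chosen rule can do far worse than that upper bound.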

To solve this issue, many modern UNIXes and Unix-likes have kernel RNGs that use physical noise sources to generate true randomness.
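
From application code you normally reach this kind of randomness through a library rather than by talking to the kernel yourself. A minimal sketch, assuming Node.js and its built-in `crypto` module, whose generator is seeded from the operating system's entropy sources:

```javascript
// Assumes Node.js; "crypto" is the built-in module, not a third-party package.
const nodeCrypto = require("crypto");

// Eight bytes that cannot be reproduced by re-running the program.
const bytes = nodeCrypto.randomBytes(8);

// An unbiased integer in [1, 7), i.e. a die roll from 1 to 6.
const roll = nodeCrypto.randomInt(1, 7);

console.log(bytes.toString("hex"), roll);
```

Unlike a seeded generator, there is no seed here that you could write down to replay the same output later.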

Another common approach is to take the current time as the seed for a deterministic RNG (srand(time(NULL)); in C); cryptographically speaking, this is worthless, since the current time is no secret, but for things like physical simulations or video games, it is good enough.
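
To illustrate why a time-based seed is no secret, here is a sketch in which an "attacker" who only knows roughly when the generator was started recovers the seed by trying every second in a window. The generator `makeLcg` and the helper `recoverSeed` are hypothetical names written for this example (the multiplier and increment are the well-known Numerical Recipes constants), not part of any standard API.

```javascript
// A simple 32-bit LCG used as the "victim" generator.
function makeLcg(seed) {
  let state = seed >>> 0;
  return () => {
    state = (Math.imul(1664525, state) + 1013904223) >>> 0;
    return state;
  };
}

// The victim seeds with the current time in seconds, like srand(time(NULL)).
const secretSeed = Math.floor(Date.now() / 1000);
const victim = makeLcg(secretSeed);
const observed = [victim(), victim(), victim()];

// The attacker tries every second within the window around a rough guess.
function recoverSeed(outputs, approxTime, windowSeconds) {
  for (let guess = approxTime - windowSeconds; guess <= approxTime + windowSeconds; guess++) {
    const candidate = makeLcg(guess);
    if (outputs.every((value) => candidate() === value)) return guess;
  }
  return null;
}

const recovered = recoverSeed(observed, Math.floor(Date.now() / 1000), 3600);
console.log(recovered === secretSeed);
```

A two-hour window is only a few thousand candidates, so the search is instant; for a game or a simulation nobody is searching, which is why the approach is still fine there.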

Note that non-repeating is also impossible for any generator with bounded output value (limited number of bits). But of course the cycle length of a deterministic generator is most probably way shorter than the theoretical maximum which is all possible permutations.
– 9000, Oct 27 '13 at 12:51

@9000: Of course this isn't true. Take an irrational number and use the digits (any base) as your "random" sequence. Boom! non-repeating sequence (by definition) and still bounded (to your base).
– ThePopMachine, Feb 19 at 16:27

@ThePopMachine: you can generate a non-repeating sequence of bits of any length, equivalent to a non-repeating sequence of numbers of unbounded length. You cannot generate a non-repeating sequence of integer numbers of limited magnitude (e.g. 32-bit); once you have generated all permutations of 32-bit values, a sequence must repeat. You are right; we're just talking about different things.
– 9000, Feb 19 at 17:50

@9000: No weaseling. You made a blanket statement which is false. If you're really just trying to say there aren't more than n^k different sequences of length k for n different values, and therefore it must repeat, then this is pretty obvious and not interesting.
– ThePopMachine, Feb 19 at 18:40

@ThePopMachine: I'd appreciate it if you toned it down a bit. To quote, « non-repeating is also impossible for any generator with bounded output value (limited number of bits)». What you explicitly talk about is an unbounded number of bits, as a sequence of [binary] digits of an irrational number. Your statement, while true, is unrelated to the problem.
– 9000, Feb 19 at 21:00

The second chapter of the book Discrete-Event Simulation: A First Course by Lawrence Leemis gives a fantastic introduction to random number generators (or more accurately, pseudo-random number generators).

An excerpt from his book explains it well in my opinion:

Historically three types of random number generators have been
advocated for computational applications: (a) 1950's-style table
look-up generators like, for example, the RAND corporation table of a
million random digits; (b) hardware generators like, for example,
thermal "white noise" devices; and (c) algorithmic (software)
generators. Of these three types, only algorithmic generators have
achieved widespread acceptance. The reason for this is that only
algorithmic generators have the potential to satisfy all of the
following generally well-accepted random number generation criteria. A
generator should be:

random - able to produce output that passes all reasonable statistical tests of randomness;

controllable - able to reproduce its output, if desired;

portable - able to produce the same output on a wide variety of computer systems;

efficient - fast, with minimal computer resource requirements;

documented - theoretically analyzed and extensively tested.

So while it could be possible to use a white-noise generator to get "better" random numbers, they have not gained acceptance because they do not follow most of the criteria above.

I would recommend that you get your hands on a copy of that book (or on something similar). Understanding exactly how PRNGs work will definitely assist you in your efforts.

Because you need to write code to generate the random numbers, and code is NOT random. (It's deterministic.)

So you wind up starting with a seed value (or values) that is picked "at random" (usually the current timestamp), then use it in an algorithm to start generating numbers. But the entire set of numbers is based on the original seed value!

So if you run your code again with the exact same seed value(s), you will get the EXACT same set of numbers! How can any reasonable person call that random? But it sure does LOOK random.
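
A small sketch of that point, using a hand-written generator because JavaScript's `Math.random` cannot be seeded. `makeLcg` and `take` are hypothetical helpers for this example, and the seeds 42 and 43 are arbitrary.

```javascript
// A simple 32-bit linear congruential generator; the whole output stream
// is determined by the seed.
function makeLcg(seed) {
  let state = seed >>> 0;
  return () => {
    state = (Math.imul(1664525, state) + 1013904223) >>> 0;
    return state;
  };
}

// Collect the first n outputs of a generator.
function take(gen, n) {
  return Array.from({ length: n }, () => gen());
}

const run1 = take(makeLcg(42), 5);
const run2 = take(makeLcg(42), 5); // same seed: identical "random" numbers
const run3 = take(makeLcg(43), 5); // different seed: a different stream

console.log(run1.join() === run2.join(), run1.join() === run3.join());
```

The numbers in each run look patternless, yet the first two runs match value for value.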

Regarding making them unique: after generating a number, simply check whether you already have that number; if you do, throw it away and generate a new one.
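
One way to sketch this check-and-retry idea is with a `Set`, which ignores values it already holds. The function name `uniqueRandomInts` and its parameters are made up for this example.

```javascript
// Draw `count` distinct integers from [0, limit) by discarding repeats.
function uniqueRandomInts(count, limit, random = Math.random) {
  if (count > limit) {
    throw new RangeError("cannot draw more unique values than the range holds");
  }
  const seen = new Set();
  while (seen.size < count) {
    // Adding a value that is already present leaves the Set unchanged,
    // which is the "throw it away and generate a new one" step.
    seen.add(Math.floor(random() * limit));
  }
  return [...seen]; // Sets keep insertion order, so this is the draw order
}

const sample = uniqueRandomInts(1000, 1000000);
console.log(sample.length, new Set(sample).size);
```

This works well while `count` is small compared with `limit`; as the two get close, most draws are repeats, and shuffling the full range and taking a prefix is usually the better fit.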

Since you are generating random numbers, you should expect the generated values to be non-unique. This is a property of randomness - you can't say a sequence of truly random (or even pseudo-random) numbers is unique, because that requirement would allow the final value in the range to be predicted, as well as changing the probability of all the unchosen numbers each time a new one is selected.

The problem with a computer is that it always knows ALL variables. The random number is simply a mathematical function of some seed value.
The best we can do is to give the computer a pseudo-random seed value, which is usually based off a variable that we can't predict (such as exact time).

Even though a computer is absolutely unable to create a random number, it is good at introducing too many variables to predict!

Generating truly random numbers in software is indeed not possible, as others have pointed out; however, it is possible to build a hardware device which can generate truly random numbers*. There are quite a few examples of this on the internet, and a variety of methods are used, from reading the time between ticks on a Geiger counter to sampling the white noise (some of which is background radiation from the universe) of an untuned receiver. I myself have built a few using a few of the methods available.

*Any good physics geek will point out that, given the way the universe operates, none of these are hyper-technically truly random, but there is no reasonable way to predict the results, so for the sake of this discussion they are sufficient.

As a part-time physics geek, I'd point out that generators based on quantum events are (as far as we've been able to tell) truly random. People who dislike randomness have been trying to take the randomness out of quantum mechanics since it started, and all it's done is pile up more evidence that it's truly random.
– David Thornley, Dec 9 '11 at 21:43

@Chad: There is no formula in the usual sense; that was ruled out by the EPR experiments. It's certainly conceivable that it's all deterministic, but not in any easily understandable way.
– David Thornley, Dec 9 '11 at 22:28

@DavidThornley, I knew that was the wrong word to use. I think we know what I was trying to say. Nearly whenever someone says something is impossible, someone else eventually proves them wrong. It's human nature.
– CaffGeek, Dec 9 '11 at 22:38

That's like saying eventually someone is going to make a machine that can solve the halting problem because someone said it was impossible. It's not a matter of finding the equation; it actually is random according to all the experiments that have been run and the math that backs them up.
– Alex, Dec 10 '11 at 3:28

There is no way you can produce a random number without special hardware. In my freshman year, a couple of classmates and I proposed a random number generator that was basically an AM receiver tuned to 4 different channels: feed each input into an A-to-D converter and add them all (modulo your max number). Since the combination of analog input from any arbitrary number of stations is random, and we could produce a large number of random numbers from the A-to-D converter, we proposed that this could be a good generator. Of course, even this is not truly random in a philosophical sense, though for most practical purposes it could work.

Determinism is essentially a function. Remember from Algebra that a function is a correspondence between a domain and range such that each member of the domain corresponds to exactly one member of the range.

So if f(x) = z, f(x) != y unless y is z. That is a function. Imagine JavaScript:
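
(The snippet is not shown here. Judging by the text that follows, it was presumably something along these lines; the function body is an assumption.)

```javascript
// Assumed reconstruction: a pure function whose result depends only on
// its arguments, with no hidden state or outside input.
function Add(a, b) {
  return a + b;
}

console.log(Add(2, 3));
```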

No matter how many times you call Add(2,3) it will always return 5. In other words, Add() is a deterministic function.

External factors can make Add behave in a non-deterministic fashion. For example, if you introduce multithreading into the equation. Human input also causes non-determinism.

Now, this is where things get interesting.

“Anyone who considers arithmetical methods of producing random digits is, of course, in a state of sin.”

Note Von Neumann states, "arithmetical methods of producing [...]". This is not talking about human input, concurrency, sample wind speeds read from a precise instrument or other non-algorithmic ways of producing random input to a deterministic function.

This simply states a function or system of functions is not going to suddenly become non-deterministic. In other words, Add(2,3) will not somehow return 6 or anything other than 5 given the same inputs. That is impossible.

The quoting author takes it a step further.

The best we can hope for are pseudo-random numbers, a stream of numbers that appear as if they were generated randomly.

The context was previously defined to be "on any deterministic device". I could end the argument here. But what if we change the context by introducing a new element to the system? A non-deterministic element added as input makes the system non-deterministic. By removing the non-deterministic element, however, we are reduced back to a deterministic system. If we can somehow trace or otherwise reproduce the inputs, we can reproduce a result. But this entire paragraph is tangential to what the author is saying. Remember the context.

One could argue over the meaning of non-determinism. Once again, tangential. Remember the context.

So he is correct. On any deterministic device it is impossible for a deterministic system to produce a true random result.