Linux /dev/random and Other Sources of Entropy

Random Data on Linux (And Other UNIX-Family Operating Systems)

A number of scientific, engineering, and cybersecurity
tasks need random data.
We need it for scientific applications, including
Monte Carlo methods for simulating complex physical processes.
We need to generate random noise signals in order to test
digital signal processing techniques.
Finally, several cryptographic tasks need unpredictable,
thus unguessable, data.
These include long-term RSA and ECC key pairs for SSH and PGP,
one-use-only session keys for encrypting SSH and TLS/SSL
connections and for encrypting stored data,
and the initialization vectors used for the various
chaining and feedback modes of symmetric block ciphers.

The problem is that a computer is a completely deterministic
device.
If you run the same program multiple times with the same input,
it should do the same thing each time.
Otherwise it would have a serious problem.
What a program can offer instead is a
pseudorandom generator: an algorithm whose output looks random
but is fully determined by its starting state.
John von Neumann said
"Anyone who considers arithmetical methods of producing
random digits is, of course, in a state of sin."

An older way of generating a pseudorandom sequence on a
Unix-family operating system
(Linux, BSD, Apple OS X, Solaris, etc.)
is to first seed the
sequence generator with srand() and then
repeatedly call rand() to obtain the sequence.
The problem is that the output is too regular.

As the GNU manual page explains,
rand() and srand() first
appeared in Version 3 AT&T UNIX and conform to
ANSI C89 (ANSI X3.159-1989).
The low dozen bits go through a cyclic pattern.
Things were different then.
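
The cyclic low-order bits are easy to reproduce with a linear congruential generator, the kind of arithmetic many historical rand() implementations used. The sketch below is not any particular libc's code. It assumes the multiplier 1103515245 and increment 12345 from the sample implementation in the C standard, assumes a modulus of 2^31, and looks at the raw state rather than whatever bits a real rand() returns. The names lcg_states and low_bits_period are made up for this illustration.

```python
def lcg_states(seed, count, modulus=2**31):
    """Yield `count` successive raw states of a textbook LCG (assumed constants)."""
    state = seed % modulus
    for _ in range(count):
        state = (1103515245 * state + 12345) % modulus
        yield state


def low_bits_period(seed, bits):
    """Return how many steps it takes the low `bits` bits to repeat."""
    mask = (1 << bits) - 1
    states = lcg_states(seed, (1 << bits) + 1)
    first = next(states) & mask
    # The low bits depend only on the previous low bits, so the first
    # return to the starting value is the length of the cycle.
    for steps, state in enumerate(states, start=1):
        if state & mask == first:
            return steps
    return None


if __name__ == "__main__":
    # The lowest bit simply alternates between 1 and 0.
    print([s & 1 for s in lcg_states(2024, 8)])
    # The low 12 bits repeat every 4096 steps, whatever the seed.
    print(low_bits_period(2024, 12))
```

With these constants the low k bits of the state always cycle with period 2^k, so an application that takes rand() modulo a small power of two from a generator like this gets a short, predictable pattern.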

The Solaris manual page is a little more kind
but still makes the point that these old functions
are not good:

USAGE
The spectral properties of rand() are limited. The
drand48(3C) function provides a better, more elaborate
random-number generator.

The functions srandom() and random()
seed and then generate a sequence with much better
characteristics.
The GNU manual page for random() explains
that it uses a non-linear additive feedback random
number generator with a period of approximately
16×(2^31-1) or 34,359,738,352.

For Monte Carlo simulations or digital signal processing,
you just need pseudorandom data with the desired distribution.
In fact, a pseudorandom generator may be preferred because you
get the same sequence every time you start with the same seed,
which makes a simulation or test run repeatable.
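
Here is a minimal sketch of that repeatability. It uses Python's random module, which is a Mersenne Twister rather than the C library's random(), and the function name noise_samples and the choice of Gaussian noise are assumptions made for the example.

```python
import random


def noise_samples(seed, count):
    """Return `count` Gaussian noise samples from a generator seeded with `seed`."""
    rng = random.Random(seed)   # private generator, independent of global state
    return [rng.gauss(0.0, 1.0) for _ in range(count)]


if __name__ == "__main__":
    first_run = noise_samples(42, 5)
    second_run = noise_samples(42, 5)
    # Same seed, same sequence: a test signal can be regenerated on demand.
    print(first_run == second_run)
    print(first_run)
```

Keeping a private generator per experiment, instead of seeding a shared global one, means that one part of a program cannot disturb the sequence another part depends on.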

However, security applications need truly
unpredictable random numbers
for purposes including the generation of cryptographic keys.
This leads to the concept of a cryptographically strong
pseudorandom number generator,
something that makes it adequately difficult to predict
the next values even after observing the sequence so far.
These unpredictable sequences could be used to generate
long-term SSH keys for servers and users, SSL keys for
servers, or the session keys used to encrypt sensitive
files or e-mail messages.
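
In application code this usually means asking the operating system's generator rather than a seeded library routine. The following sketch assumes Python's standard secrets module, which draws from the OS-provided source. The function name new_session_key and the 256-bit default are illustrative choices, not a recommendation from any standard.

```python
import secrets


def new_session_key(bits=256):
    """Return a fresh key of `bits` bits from the OS-backed generator."""
    if bits <= 0 or bits % 8 != 0:
        raise ValueError("key size must be a positive whole number of bytes")
    return secrets.token_bytes(bits // 8)


if __name__ == "__main__":
    key = new_session_key()
    print(len(key), "bytes:", key.hex())
```

Unlike the seeded generators above, there is deliberately no way to replay this output.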

A more mundane (and therefore frequently overlooked) need
is for unpredictable TCP initial sequence numbers and
DNS transaction ID numbers.
The TCP risks were pointed out by
Robert Morris in 1985
and
Steven Bellovin in 1989,
but we still had
problems into the 2000s
with operating systems implementing TCP in a way that
allowed attackers to hijack connections.
The DNS problems are more recent, with
RFC 5452
suggesting some interesting extensions.

Random Devices

Linux was the first operating system to include a
pseudo-device producing pseudorandom data seeded by
sources of entropy or true randomness.
The Solaris manual page for the random
device says:
"An implementation of the /dev/random and /dev/urandom
kernel-based random number generator first appeared
in Linux 1.3.30."
Other Unix-family operating systems have since added them.
The Linux random(4) manual page describes
these pseudo-devices as follows:

The random number generator gathers environmental noise from
device drivers and other sources into an entropy pool.
The generator also keeps an estimate of the number of bits
of noise in the entropy pool.
From this entropy pool random numbers are created.

When read, the /dev/random device will only return random
bytes within the estimated number of bits of noise in the
entropy pool.
/dev/random should be suitable for uses that need very high
quality randomness such as one-time pad or key generation.
When the entropy pool is empty, reads from /dev/random will
block until additional environmental noise is gathered.

A read from the /dev/urandom device will not block
waiting for more entropy.
As a result, if there is not sufficient entropy in the
entropy pool, the returned values are theoretically vulnerable
to a cryptographic attack on the algorithms used by the driver.
Knowledge of how to do this is not available in the current
unclassified literature, but it is theoretically possible
that such an attack may exist.
If this is a concern in your application,
use /dev/random instead.

That same manual page continues with some guidelines
for using these kernel features:

If you are unsure about whether you should use /dev/random or
/dev/urandom, then probably you want to use the latter.
As a general rule, /dev/urandom should be used for everything
except long-lived GPG/SSL/SSH keys.

[...]

The amount of seed material required to generate a
cryptographic key equals the effective key size of the key.
For example, a 3072-bit RSA or Diffie-Hellman private key
has an effective key size of 128 bits (it requires about 2^128
operations to break) so a key generator only needs
128 bits (16 bytes) of seed material from /dev/random.

While some safety margin above that minimum is reasonable,
as a guard against flaws in the CPRNG algorithm, no
cryptographic primitive available today can hope to
promise more than 256 bits of security,
so if any program reads more than 256 bits (32 bytes) from
the kernel random pool per invocation, or per reasonable
reseed interval (not less than one minute),
that should be taken as a sign that its cryptography is
not skillfully implemented.
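
The arithmetic in that guidance is small enough to sketch. The code below assumes Python's os.urandom() as a stand-in for reading the kernel generator, and the names seed_bytes_needed, draw_seed, and MAX_SECURITY_BITS are hypothetical. The 256-bit ceiling is taken from the manual page text quoted above.

```python
import os

MAX_SECURITY_BITS = 256   # ceiling suggested by the manual page text quoted above


def seed_bytes_needed(security_bits):
    """Bytes of seed material for a key with the given effective strength."""
    if not 0 < security_bits <= MAX_SECURITY_BITS:
        raise ValueError("effective key size should be between 1 and 256 bits")
    return (security_bits + 7) // 8   # round up to whole bytes


def draw_seed(security_bits):
    """Read just enough seed material from the kernel generator."""
    return os.urandom(seed_bytes_needed(security_bits))


if __name__ == "__main__":
    # A 3072-bit RSA key has an effective key size of about 128 bits.
    print(seed_bytes_needed(128), "bytes of seed material")
    print(draw_seed(128).hex())
```

The point is that the size of the seed follows the effective strength of the key, not its length in bits.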

Devices and Kernel Data Structures

The Linux kernel data structures in
/proc/sys/kernel/random/* provide an additional
interface to the /dev/random device.
The read and write wakeup thresholds can be changed by
writing to those files; the other values are read-only.
All can be read by cat or sysctl.

boot_id

Random string generated at boot time.

entropy_avail

The number of bits of available entropy.

poolsize

The size of the entropy pool, the maximum
size of entropy_avail.

read_wakeup_threshold

The number of bits of entropy required for
waking up processes that sleep waiting for
entropy from /dev/random.

uuid

Random UUID string generated afresh at
each read.

write_wakeup_threshold

The number of bits of entropy below which we
wake up processes that do a select() or poll()
for write access to /dev/random.
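
These files can also be read from a program. The sketch below assumes the file names listed above and skips any that a given kernel does not provide. The function name read_random_sysctls and its base parameter are made up for the example, the latter so the code can be pointed at a test directory.

```python
from pathlib import Path

RANDOM_SYSCTL_DIR = "/proc/sys/kernel/random"


def read_random_sysctls(base=RANDOM_SYSCTL_DIR,
                        names=("entropy_avail", "poolsize", "boot_id")):
    """Return {name: text} for each readable file under `base`."""
    values = {}
    for name in names:
        try:
            values[name] = (Path(base) / name).read_text().strip()
        except OSError:
            continue   # missing on this kernel, or not permitted
    return values


if __name__ == "__main__":
    for name, value in read_random_sysctls().items():
        print(f"{name} = {value}")
```

Reading uuid this way returns a different value on every call, as described above, so it is left out of the default list.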

Below we check on the random devices available
in various UNIX-family operating systems.

All have random and urandom
devices.
OpenBSD is the odd one with its additional
arandom and srandom.
All of the OpenBSD devices have unique minor device numbers,
but I think that they all use the same underlying arc4random
algorithm.
All four on OpenBSD are very fast and highly random.

Hardware Random-Number Generators

If your CPU or motherboard has a hardware random number
generator, the corresponding Linux kernel module
can create a random device.

For example, you can buy a TPM or Trusted Platform
Module for about US$ 12–16 from Amazon.
With kernel support, there is now a
/dev/tpm0 and
/dev/hwrng.
The second of those is continuously read by
the rngd daemon to
feed entropy to the kernel.
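
Before starting rngd it can be useful to see which hardware generator, if any, the kernel has selected. This sketch assumes the sysfs files rng_available and rng_current under /sys/class/misc/hw_random, the location described in the kernel's hw_random documentation. The function name hwrng_status is hypothetical.

```python
from pathlib import Path

HW_RANDOM_SYSFS = "/sys/class/misc/hw_random"   # assumed sysfs location


def hwrng_status(base=HW_RANDOM_SYSFS):
    """Return (current, available) hardware RNG names, or (None, []) if absent."""
    root = Path(base)
    try:
        available = (root / "rng_available").read_text().split()
        current = (root / "rng_current").read_text().strip() or None
    except OSError:
        return None, []   # no hw_random support, or not readable
    return current, available


if __name__ == "__main__":
    current, available = hwrng_status()
    print("current:", current)
    print("available:", available)
```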

The screenshot shows the graphical configuration tool used
to define a kernel build.
See my
kernel building page
for details on configuring, building, and installing
a custom kernel.

The build configuration process is hardware specific.
Here you see a kernel build being configured on AMD64
hardware, where these five hardware RNG devices may
be found.

The result is one or more kernel modules in
/lib/modules/release/kernel/drivers/char/hw_random,
including the following on IA64/AMD64 platforms.

Selecting hardware random number generator support
under Device drivers ⇒ Character devices
in the Linux kernel build configuration.

The
Raspberry Pi
platform is based on the
Broadcom BCM2835
system-on-a-chip with a low-power ARM1176JZ-F processor
and a hardware random number generator.
The bcm2708_rng kernel module detects
and handles the hardware random number generator,
creating the device node /dev/hwrng.

Add the rng-tools package to fully take
advantage of the hardware random number generator.
You will also need to add the kernel module
bcm2708_rng to the list of automatically
loaded modules in /etc/modules.

# apt-get install rng-tools
# echo bcm2708_rng >> /etc/modules

On the Pidora distribution, Fedora ported to the
Raspberry Pi, the driver is built into the monolithic
kernel so there is no separate loadable module.

IC2 is the SoC and RAM.
It's the large module (12.5×12.5 mm)
in the center of the board,
between the yellow RCA connector and the orange-topped
HDMI connector, to the right of the "Raspberry Pi" logo.
The Samsung SDRAM is stacked on top of the
Broadcom BCM2835 SoC.
IC3 is the combined USB and Ethernet controller.
It's the chip between the blue audio connector,
the USB connector and the Ethernet connector.

You could edit /etc/default/rng-tools to specify
the hardware device, but as a comment in that file warns,
you should just leave that commented out so the boot
script will know to auto-detect the device.

Broadcom has not released any detailed documentation
on their hardware random number generator, but this
is better than nothing.
It shouldn't make things any worse, because it is just
being used as another source of entropy by the Linux kernel,
and it should make things better.
The Raspberry Pi does not have any traditional disk
controllers, leaving it without the typical good sources
of entropy.

It would make sense that the Broadcom hardware device works
somewhat like the urandom device, generating
output even when it has run low on entropy and the result
becomes less random.
Broadcom designed the device for use in telephone handsets,
generating GSM and 3G/4G session keys.
Users would not find it acceptable to have to wait through
mysterious math-based delays before placing calls.

The rngd daemon sends its collected statistics
to syslog every ten minutes and when it shuts down.
Here is an example:

How Random is the Result?

Analyze your random data with the
ent program
from John Walker,
the founder of Autodesk and co-author of AutoCAD.
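
To give a feel for what such an analysis measures, here is a rough sketch of two of the statistics ent reports: entropy in bits per byte and the arithmetic mean of the byte values. It is not ent's code, the function names are made up, and passing these two checks says nothing about whether data is unpredictable.

```python
import math
from collections import Counter


def entropy_bits_per_byte(data):
    """Shannon entropy of `data` in bits per byte, from 0.0 to 8.0."""
    if not data:
        raise ValueError("need at least one byte")
    total = len(data)
    return sum((n / total) * math.log2(total / n)
               for n in Counter(data).values())


def arithmetic_mean(data):
    """Mean byte value; close to 127.5 for uniformly distributed bytes."""
    return sum(data) / len(data)


if __name__ == "__main__":
    sample = bytes(range(256)) * 4   # perfectly uniform but utterly predictable
    print(entropy_bits_per_byte(sample), arithmetic_mean(sample))
```

The sample in the usage lines scores a perfect 8 bits per byte even though it is a simple counter, which is why statistical tests can reject a bad source but cannot certify a good one.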

This table shows the results for a 1-megabyte file
from each source, collected this way:

# dd if=/dev/name bs=1024 count=1024

On Linux add the option iflag=fullblock,
and on Linux on x86_64 be ready to wait for a long
time for the random device.
It took several hours to collect one megabyte.
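
The fullblock option matters because a read from a device may return fewer bytes than requested. A program collecting a sample has to deal with the same thing, as in this sketch. The function name read_exact and the 1024-byte chunk size are assumptions chosen to mirror the dd command above.

```python
def read_exact(path, count, chunk=1024):
    """Read exactly `count` bytes from `path`, retrying after short reads."""
    collected = bytearray()
    with open(path, "rb", buffering=0) as source:   # unbuffered, like dd
        while len(collected) < count:
            block = source.read(min(chunk, count - len(collected)))
            if not block:
                raise EOFError(f"{path} ended after {len(collected)} bytes")
            collected.extend(block)
    return bytes(collected)


# Example use (assumes the device exists on your system):
#     sample = read_exact("/dev/urandom", 1024 * 1024)
```

Without the loop, a single short read would silently leave the sample smaller than one megabyte.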

However, see the above discussion of how much random
data is really needed.
A one-time pad is the only perfectly secure cipher,
but it is an enormous bother to use one.
If you really care enough to use a one-time pad, then it
makes little sense to use a program (including kernel
modules) to generate it.
To really do randomness correctly, use physics.
The Australian National University has built a
quantum optics random number generator
and you can even
download a unique live random number stream
from their system.

A pseudo-random number generator (PRNG) is a deterministic
algorithm that produces numbers whose distribution is
indistinguishable from uniform. A formal security model
for PRNGs with input was proposed in 2005 by Barak and
Halevi (BH). This model involves an internal state that
is refreshed with a (potentially biased) external random
source, and a cryptographic function that outputs random
numbers from the continually internal state. In this work
we extend the BH model to also include a new security
property capturing how it should accumulate the entropy
of the input data into the internal state after state
compromise. This property states that a good PRNG should
be able to eventually recover from compromise even if
the entropy is injected into the system at a very slow
pace, and expresses the real-life expected behavior
of existing PRNG designs. Unfortunately, we show that
neither the model nor the specific PRNG construction
proposed by Barak and Halevi meet this new property,
despite meeting a weaker robustness notion introduced
by BH. From a practical side, we also give a precise
assessment of the security of the two Linux PRNGs,
/dev/random and /dev/urandom. In particular, we show
several attacks proving that these PRNGs are not robust
according to our definition, and do not accumulate entropy
properly. These attacks are due to the vulnerabilities of
the entropy estimator and the internal mixing function
of the Linux PRNGs. These attacks against the Linux
PRNG show that it does not satisfy the "robustness"
notion of security, but it remains unclear if these
attacks lead to actual exploitable vulnerabilities in
practice. Finally, we propose a simple and very efficient
PRNG construction that is provably robust in our new
and stronger adversarial model. We present benchmarks
between this construction and the Linux PRNG that show
that this construction is on average more efficient when
recovering from a compromised internal state and when
generating cryptographic keys. We therefore recommend
to use this construction whenever a PRNG with input is
used for cryptography.