==Phrack Inc.==
Volume 0x0b, Issue 0x3b, Phile #0x0f of 0x12
|=-------------=[ CRYPTOGRAPHIC RANDOM NUMBER GENERATORS ]=--------------=|
|=-----------------------------------------------------------------------=|
|=-----------------=[ DrMungkee <pub@drmungkee.com> ]=-------------------=|
----| Introduction
Every component in a cryptosystem is critical to its security. A single
failure in one could bring down all the others. Cryptographic random
numbers are often used as keys, padding, salt and initialization vectors.
Using a good RNG for each of these components is essential. There are many
complications imposed by the predictability of computers, but there are
means of extracting the few bits of entropy regardless of them being
exponentially out-numbered by redundancy. This article's scope covers the
design, implementation and analysis of RNGs. RNGs subject to exploration
will be NoiseSpunge, Intel RNG, Linux' /dev/random, and Yarrow.
----| Glossary
RNG - Random Number Generator
PRNG - Pseudo Random Number Generator
entropy - Unpredictable information
redundancy - Predictable or probabilistic information
----| 1) Design Principles of RNGs
1.0) Overview
A variety of factors come into play when designing an RNG. It's output must
be undissernable from white noise, there must be no way of predicting any
portion of it, and there can be no way of finding previous or future
outputs based on any known outputs. If an RNG doesn't conform to this
criteria, it is not cryptographicaly secure.
1.1) Entropy Gathering
To meet the first and second criteria, finding good sources of entropy is
an obligation. These sources must be unmoniterable by an attacker, and any
attempts by an attacker to manipulate the entropy sources should not make
them predictable or repetitive.
Mouse movement is often used as entropy, but if the entropy is improperly
interpreted by the RNG, there is a segnficant amount of redundancy. To
demonstrate, I monitered mouse movement at an interval of 100 miliseconds.
These positions were taken consecutively while the mouse was moved
hecticaly in all directions. These results say it all:
X-Position Y-Position
0000001011110101 0000000100101100 Only the last 9 bits of each
0000001000000001 0000000100001110 coordinate actualy appear
0000001101011111 0000001001101001 random.
0000001000100111 0000000111100100
0000001010101100 0000000011111110
0000000010000000 0000000111010011
0000001000111000 0000000100100111
0000000010001110 0000000100001111
0000000111010100 0000000011111000
0000000111100011 0000000100101010
The next demonstration shows a more realistic gathering of entropy by
keeping only the 4 least significant bits of the X and Y positions and
XORing them with a high-frequency counter, monitoring them at a random
interval:
X Y Timer XORed
1010 1001 00100110 01111111
0100 1100 00101010 00000110
0101 0010 01011111 01110101
1001 1100 10110000 11111100
0101 0100 11001110 11100010
0101 1100 01010000 01111100
1011 0000 01000100 00011100
0111 0111 00010111 00101000
0011 0101 01101011 01110110
0001 0001 11011000 11010001
Good entropy is gathered because 4bits from each coordinates represents a
change in 16 pixels in each direction rather than assuming a motion of
65536 can occur in all directions. The high-resolution timer is used as
well because although it is completly sequencial, it's last 8 bits will
have been updated very often during a few CPU clock cycles, thus making
those bits unmonitorable. An XOR is used to combine the entropy from the 2
sources because it has very the very good property of merging numbers in a
way that preserves the dependency of every bit.
The most common sources of entropy used all involve user interaction or
high-frequency clocks in one way, shape, or form. A hybrid of both is
always desirable. Latencies between user-triggered events (keystroke, disk
I/O, IRQs, mouse clicks) measured at high-precisions are optimal because
of the unpredictable nature of a user's behaviors and precise timing.
Some sources may seem random enough but are in fact not. Network traffic is
sometimes used but is unrecommended because it can be monitored and
manipulated by an outside source. Another pittfall is millisecond precision
clocks: they don't update frequently enough to be put to good use.
A good example of entropy gathering shortcommings is Netscape's
cryptographically _broken_ not-so-RNG. Netscape used the time and date with
its process ID and its parent's process ID as it's only source of entropy.
The process ID in Win9x is a value usualy below 100 (incremented once for
each new process) that is XORed with the time of day Win9x first started.
Even though the hashing function helped generate output that seemed random,
it is easy to estimate feseable values for the entropy, hash them, and
predict the RNG's output. It doesn't matter weather or not the output
looks random if the source of entropy is poor.
1.2 Entropy Estimations
Evaluating the quantity of entropy gathered should not be overlooked. It
must be dones in order to prevent the RNG from attempting to output more
entropy than it has gathered. Depending on system parameters, you can
assign quality estimates for each of your entropy sources. For example,
you can evaluate all keyboard generated entropy as being 4bits in size,
regardless of how many bits of entropy you collect from it. If the RNG is
on a file server and uses disk I/O as an entropy source, it could derrive
an entropy estimate proportional to the number of users accessing the disk
to prevent sequencial disk access from resulting in redundant entropy.
The entropy estimates do not need to be the same size as the inputs or
outputs of entropy gathering. They are meant as a safety precaution in
further calculations.
There are alternative methods for estimating the entropy. You could bias
entropy from a source to be of better quality if that source has not
supplied entropy for a period exceeding a certain interval. You can
accumulate large amounts of entropy in a buffer, compress it, and derive
an estimation from the compression ratio. Statistical tests comparing the
last input entropy with a large quantity of previous inputs doesn't do much
in terms of finding the current input's quality, but it gives the RNG an
oppertunity to reject inputs that increase statistical probability of the
group of entropy inputs.
The best approach to this is also a hybrid. One method of estimating
entropy quality usualy isn't enough. There are cases where an entropy
source can be assumed to provide a consistant quality of entropy however.
In these cases, a fixed size can be assigned to all entropy inputs from
that source, but carefull analysis should be done before this assumption
is made. It is wisest to calculate multiple estimates and assume the
smallest value to be the most accurate.
1.3) Entropy Pools
No entropy source should be assumed perfect. More specificaly, no entropy
source should be assumed perfect on a computer. That is why entropy is
gathered in a buffer (entropy pool) to undergo supplimentary processing.
After entropy is gathered from a source, it is input into an entropy pool.
The entropy pool must do several things with this input. It must keep track
of the amount of entropy contained within it, mix the last input uniformaly
with all the previous inputs contained within it, and provide an at least
seamingly random state regardless of the quality of the entropy input
(patternistic inputs should still look random in the pool).
Mixing the contents of the entropy pool should neither sacrifice any of
the entropy within it nor be considered to add entropy to its state. If the
mixing function expands the pool, entropy estimation of its contents should
not change. Only the entropy gathering functions are responsible for
increasing entropy and are dealt with serperately.
The best candidates for mixing functions are hashing algorithms. The
hashing algorithm should accept any size input, and have a large sized
output that reflects the speed at which entropy is gathered, and have a
non-deterministic output. To preserve gathered entropy, the hashing
function should not input more entropy than the size of it's output. With
that said, if the hashing function outputs 160bits, it should not be input
more than 160bits prior to output. If the hashing algorithm is
cryptographically secure (which it should be) the output will yield the
same amount of entropy as the input. If the output is larger than the
input, the state of the pool cannot be assumed to have increased in
entropy.
There are several approaches to using large pools of entropy. One approach
implments a pool that is hashed linearly. For this method, you would need a
buffer that is concatinated with the last input of entropy. Hashing should
be started at the end of the buffer. The rest of the buffer should be
hashed, one chunk (the size of the output) at a time, each time XORing the
output with the output of the last block's hash to ensure the entire pool
is affected by the last input, without overwritting any previous entropy.
This is only an examplar method. Whichever procedure you choose, it should
meet all the criteria mentioned in the previous paragraphs.
Another approach to maintaining a large entropy pool is using multiple
hashed contexts which are used to affect each other. A common use is a pool
that contains unmanipulated entropy. Once that pool is full, it is hashed
and used to update another pool either by updating a hashing context or
XORing. This is cascaded through as many pools as desired, but to avoid
losing previous entropy, some pools should only be updated after it's
parent pool (the one that updates it) has been updated a certain number of
times. For example, once the first hashed pool has been updated 8 times, a
second pool can be updated. Once the second hashed pool has been updated 3
times, it can update a third pool. With this method, the third pool
contains entropy from the last 24 entropy updates. This conserves less
entropy (limited by the size of the hashing contexts) but provides better
quality entropy. Entropy is of better quality because the source of the
entropy containted within the third pool is completly dependent on 24
entropy inputs.
Inputing entropy into a pool is usualy called updating or seeding. Entropy
pools combined with the output function by themselves are in fact PRNGs.
What makes a RNG is the entropy gathering process which obtains truly
random seeds. As long a good entropy is input, the RNG will have an
infinite period (no output patterns) as oposed to PRNGs which have a
semi-fixed point at whitch they will start to repeat all previous outputs
in the same order.
Entropy pools are the key to preventing any previous or future outputs of
RNG from being predicted. Attacks against an RNG to determine previous and
future outputs are either based on knowledge of the entropy pool, entropy
inputs or previous outputs. The pool should be designed to prevent
knowledge of its current state from compromising any or all future
outputs. To do this, entropy pools should undergo a drastic change from
time to time by removing protions or all of its entropy. This is called
reseeding. Reseeding should _always_ replace the entropy that is removed
with fresh entropy before outputing. If the entropy is not replaced, the
pool will be in a severely weakened state. An RNG does not need to reseed,
but if it doesn't, it must have entropy added at a rate greater than the
RNG's output.
Reseeding should only occur after sufficient unused entropy has been
accumulated to fill a large portion of the pool, and the entropy estimation
of the pool should be adjusted to the estimated size of the input entropy.
Reseeding should not occur very often, and only based on the number of
bits output by the RNG and the size of the pool. A safe estimation on the
reseeding frequency of an RNG would be the after an 95% of the size of the
entropy input has been output. This estimate assumes that entropy is added
to the pool in between the RNG's outputs. If this is not the case,
reseeding should occur more frequently. The less entropy is input between
outputs, the better the chances that an attacker who has found one output
will find the previous output (which can cascade backwards after each
output is found).
1.4) Output Functions
An RNG's output should be passed through a one-way function. A one-way
function's output is derrived from its input, but that input is
computationaly infeasable to derive from its output. One-way hash
functions are perfect for this. More complex methods involve using
portions of the pool as key data fed to a symmetric encryption algorithm
that encrypts another portion of the pool and outputs the ciphertext.
Expansion-compression is a very effective one-way function as well. To do
this you can use portions of the pool as seeds to a PRNG and generate
multiple outputs (each the size of the PRNG's seed) and then inputing all
of these into a hash function and outputing its result. This is effective
because many intermediate (expanded) states could result in the same hash
output, but only one iniciate (before expansion) state can result in that
intermediate state.
Every time the RNG outputs, its entropy estimate should be decremented by
the size of the output. This is done with the assumption that the output
entirely consists of entropy. Because that output's entropy is still in
the pool, it is now redundant and cannot be assumed as entropy (inside the
pool) any longer. If the pool is 512bits in size, and 160bits of entropy
is consumed on every output then almost all entropy hash been used after 3
outputs and the pool should be reseeded.
There is a problem nearly impossible to overcome that occurs when
implementing entropy pools: there is no way of determining what entropy
bits were output, and which were not. The best way to nullify the symptomes
of this problem is by making it impossible to know when entropy has been
used more than once based on the the RNG's output. When an output occurs,
the pool's state must be permuted so that consecutive outputs without any
entropy added or reseeding will not result in identical RNG outputs. The
pool's state permutation must be a one-way function and must apply the same
concepts and criteria used in the output function. The pool's entropy size
is always assumed to be identical after permutation as long as the
procedure follows the criteria.
1.5) Implementation
All the effort put into a well designed RNG is useless if it isn't properly
implemented. Three layers of the implemetation will be covered: media,
hardware/software, and usage of the output.
Storage and communication media each represent a risk in an unencrypted
state. The following lists various degrees of risk assigned to storage and
communication media. Risks are assigned as such:
0 - no risk
1 - low risk
2 - medium risk
3 - high risk
MEDIA RISK
------------------------------------
RAM <storage> 0 *&
Hard Drive <storage> 1 *&
Shared memory <transfer> 1 *&
Removable disks <transfer> 2
LAN <communication> 2 &
WAN <communication> 3
Any properly encrypted media's risk is 0.
* If the storage media is on a computer connected to a network, risk is
increased by 1.
& If physical access is possible (computer/LAN)., risk is increased by 1.
The highest risk of all medias should be interpreted as the
implementation's risk (weakest link, good bye!). High risk is unacceptable.
Medium risk is acceptable depending on the value of the RNG's output
(what's it worth to an attacker?). A personal diary can easily cope with
medium risk unless you have many skeletons in your closet. Industrial
secrets should only use 0 risk RNGs. Acceptable risk is usualy up to the
programmer, but the user should be aware of his choice.
Hardware RNGs should be tamper-proof. If any physical modification is
attempted, the RNG should no longer output. This precaution prevents
manipulation of the entropy pool's state and output. There should be no
way of monitoring hardware RNGs through frequencies, radiation, voltage, or
any other emissions generated by the RNG. Any of these could be used as a
source of information with whitch the RNG's entropy pool or output could be
compromised. To prevent this, all hardware RNGs should be properly
shielded.
Software implementations can be very tricky. Reverse engineering will
remain a problem until digital signing of executable files is implemented
at the operating system level. Until then, any attempts made on the
programmer's behalf to prevent reverse engineering of the RNG's software
implementation will only delay the innevitable. It is still important that
the programmer takes care in writting the software to have to lowest
possible risk factor (the chart takes into account reverse engineering of
software).
// the following applies to RNGs seperate from their calling applications
The RNG must take special care to ensure that only one program has access
to each of the RNG's outputs. The method by which the data is transfered
from the RNG to the program must not succomb to observation. Distinct
outputs are usualy guarrentied by the output function, but sometimes the
output is copied to a temporary buffer. It might be possible to trick an
RNG into conserving that buffer, or copying it elsewhere providing easy
observation. A quick solution is for an application to encrypt the RNG's
output with a key it generates by its own means. However, you could go all
out and implement a full key-escrow between the RNG and the calling
applications and still be vulnerable to a hack. The kind of _prevention_ a
programmer incorporates into software only serves as a road block, but this
is often enough to discourage 99.9% of its users from attempting to
compromise security. Not much can be done about 0.1% that can still
manipulate the software because there will always be a way to crack
software.
1.6) Analysis
There are two important aspects to analysing an RNG: randomness and
security. To evaluate an RNG's randomness, one usualy resorts to
statistical analysis of the RNG's input (entropy gathering process) and
output (output function). To evaluate it's security, one would look for
flaws in its entropy gathering, entropy pool, mixing function, and output
function that allow an attacker to find past, present, or future outputs by
any means possible. There is no guarrentying the effectiveness of either of
these aspects. The only certain thing is once the RNG is broken, it is
broken; until then, you can only speculate.
There are many statistical tests available on the internet suitable for
testing randomness of data. Most require a large sample of data stored in
a file to derive significant results. A Probabilistic value is obtained
through statistical analysis of the sample. This value is usualy in the
form of P, a floating point number between 0 and 1. Tests are done in
various block sizes usualy between 8 and 32bits. P's precision varies from
one test to the next. A P value close to 0.5 is what is usualy desired.
When P is close to 0.5, probability is at it's midrange and there is no
incline towards either 0 or 1. An RNG is not weak because it has a value
close to 1 or 0. It can occur even with purely random data. If it were
impossible to obtain a value close to 0 or 1, the RNG would be flawed
anyway. This is because when data is completly random, all outputs are
equaly likely. This is why patterned outputs are possible. When P is less
then satisfactory, many new samples should be created and tested. If other
samples result in bad Ps then the RNG most likely has deterministic output
and should not be used. DieHard offers an armada of 15 tests that use P
values. Other tests describe there results with an integer and it's target.
The closer the integer is to its target the better. An example of this is
the Maurer Universal Statistics Test.
The problem with statistical tests is that any good PRNG or hashing
function will pass them easily without any entropy. Even if the output is
non-deterministic the RNG is only an RNG if it cannot be predicted. For
that reason, the RNG's entropy must be non-deterministic as well. Unless
the entropy source can be guarrentied to function properly, it is wise to
use the same tests on the raw entropy itself. By doing this you can achieve
a sufficient level of confidence about the randomness. A big speed-bump
stares you right in the eyes when you're trying to do this, however.
Entropy is often gathered at a very slow pace making the gathering of a
sufficiently large data sample extremely tedius and in some circumstances
it might not even be worthwhile. Whether this is the case or not, it is
logical to intellegently scrutinise entropy sources, rather than depending
on statistical tests (which cannot guarrenty anything) to find flaws (see
1.1).
Evaluating an RNG's security is a complexe task with infinite means and
only one end: a break. The odds are always well stacked against an RNG. No
matter how many provisions are made to prevent breaks, new attacks will
always eventualy emerge from that RNG or another. Every aspect of the RNG
must be studied carefully, from entropy gathering right up to the delivery
of the RNG's output. Every component should be tested individualy and then
as a group. Tests include the possibility of hacks that can tamper with or
monitor entropy gathering, and cryptanalysis of mixing and output
functions. Most breaks are discovered under laboratory conditions. These
are called academic breaks and they usualy require very specific
conditions be met in order to function (usualy highly improbable). Finding
these breaks is a broad topic on its own and is beyond of the scope in
article. Successful breaks are usually the result of months (often years)
of pain-staking work done by cryptanalysts with years of experience. The
best thing to do is to carefully design the RNG from start to finish with
security in mind.
Even as the limits of mathematics and cryptanalysis are reached in testing,
advancements in sience could reak havoc on your RNG. For example, Tempest
scanning could be used by an attacker to follow keystrokes and mouse
positions. Discoveries can even be made in the analysis of white noise,
eventualy. These breaks are usualy found by scholars and professionals who
seek only to make their knowledge available before damage occurs. Not much
can be done to prevent attacks that are unknown. Finding an effective fix
quickly and learning from the is what is expected from developers.
Thankfully, these attacks emerge very rarely, but things are changing as
research increases.
Only the security analysis of the RNGs in section 2 will be discussed
because each has already been tested for and passed randomness analysis.
----| 2 Description of specific RNGs
2.1) NoiseSpunge's Design
Information Source: Uhhhh, I wrote it.
2.1.0) NoiseSpunge Overview
NoiseSpunge was specifically written for generating random 256bit keys
suitable for strong encryption. Gathering entropy for a single output
(256bits) requires a few seconds of mouse movement on the user's part. Its
structure is complex and computationaly expensive. NoiseSpunge is meant to
be a component within cryptosystems, and for that reason, special
consideration has to be made in order to prevent it from being a liability.
The trade off in this implementation is it would be clumsy at best if
large quantities of random data were needed regularly because it would
require intense user-interaction and it would consume too many CPU cycles.
2.1.1) NoiseSpunge Entropy Gathering
A PRNG is seeded with initial zeros. The PRNG then outputs a value used to
calculate the length of the interval used. When the interval is triggered,
the mouse position is checked for movement. If the mouse has moved since
the last trigger the PC's high-frequency clock is queried for its current
value. The 4 least significant bits are XORed with the 4 least significant
bits of the mouse's x & y coordinates. A new interval is then calculated
from the PRNG. The 4 bits produced are concatenated until 32 bits are
gathered and output. The 32bits are concatenated to the an entropy buffer
and also used to update the PRNG that sets the interval. The process is
then repeated. If the mouse has not moved, a new interval is set and the
process repeats until is has moved. There is also a function that allows
the programmer to input 32bits of entropy at a time. This function is
suitable if there is a hardware entropy device or another known secure
source of entropy on a particular system. However, the use of another RNG's
output would be redundant if it is good and useless if it is bad.
2.1.2) NoiseSpunge Entropy Estimation
Entropy estimation is straight forward. The worst case scenario is assumed
with each input. Only 4bits are gathered for every mouse capture. No
further estimations are done because they would only yield results 4bits or
greater. Entropy estimation for the supplementary function that allows the
programmer to supply his own entropy requires the programmer to guarrantee
his entropy is of good quality; estimation of this input's entropy is left
in his hands.
2.1.3) NoiseSpunge Entropy Pool
The internal state comprises 762bit. There is a 256bit seed, a 256bit
primary hash, and a 256bit secondary hash. 256bit Haval is used as the
hashing function. When a 32bit block of entropy is added, it is appended to
a 256bit buffer. Once the buffer is full the primary hash is updated with
it. The seed is XORed with The primary hash's output unless this is the 8th
primary reseed. In that case, the primary hash's output is input into the
secondary hash and that hash's output is permuted (see bellow) and replaces
the seed. Seed permutation is accomplished by an expansion-compression.
32bit words of the seed are fed as a PRNG's random seed and used to output
two 32bit words. All 512bits of the PRNG's output are hashed and replace
the pool's seed. After every primary reseed, a KeyReserve counter is
incremented and capped at 8. The KeyReserve reperesents the number of
256bit groups of entropy that have been added to the internal state. This
KeyReserve is a rough estimate of when there is no longer any purpose to
adding entropy into the pool and the entropy gathering thread can be paused
(until the RNG outputs).
2.1.4) NoiseSpunge Output Function
There are 2 methods provided for the RNG's output: safe and forced. A safe
output makes sure the KeyReserve is not zeroed and decrements it after
output. A forced output ignores the KeyReserve. To output, the seed is
copied to a temporary buffer and is then permuted. The new seed is used a
key to initialize Rijndael (symmetric block cipher). The temporary buffer
is encrypted with Rijndael and then permuted with an expansion-compression
(the same way the seed is). This is repeated for N rounds (chosen by the
programmer) and the buffer is then output.
2.1.5) NoiseSpunge Analysis
[1] The heavy relyance upon mouse movement could _starve_ the entropy pool
if the mouse is not in use for an extended period of time. However, a
counter prevents output when entropy is low.
[2] The programmer could forcefully input poor quality entropy and weaken
the RNG's internal state.
[3] There are no provisions for systems without high-resolution timers.
[4] Even though the pool's internal state is 762bits long, there is a
maximum of 256bits of entropy at any state. (The other bits are only there
to prevent back-tracking and to obfuscate the seed). That makes this RNG
only suitable when small amounts of secure random data are needed.
2.2) Intel RNG's Design
Information Source: Intel Random Number Generator White Paper *1
2.2.0) Intel RNG Overview
The Intel RNG is system-wide. It is designed to provide good quality random
data in massive quantities to any software that requires it. It's average
throughput is 75Kb/s (bits). The Intel Security Driver provides a bridge
between the middleware (CDSA, RSA-BSAFE, and Microsoft CryptoAPI) that will
serve out the random numbers to requesting applications and the hardware.
The hardware portion is in Intel's 810 chipset, and will be in the 82802
Firmware Hub Device for all future 8xx chipsets.
{WARNING: these are some of my personal opinions; take them with a grain of
salt}
Intel has chosen to eloquantly label its RNG as a TRNG (True Random Number
Generator), but then they go on to call it an RNG through the rest of the
paper. Thechnicaly there is no fundamental difference that sets it asside
from any other good RNG; it is a label for hype and has nothing to do with
its ability to produce random numbers (RNG==TRNG & TRNG==RNG). As for your
daily dose of corporate assurance: "The output of Intel RNG has completed
post-design validation with Cryptography Research Inc. (CRI) and the
Federal Information Processing (FIPS) Level 3 test for statistical
randomness (FIPS 140-1)." I find it reassuring that a company (CRI) has
analyzed and is supporting this RNG. That isn't something you see very
often. On the other hand FIPS140-1 is just another hype generator. After
reading FIPS140-1, one realises it has absolutely NOTHING to do with the
quality of the RNG, but hey! Who cares? Microsoft seems to think it's good
enough to use in their family of _high_quality_and_security_ products, so
it must be great. All kidding asside, despite the corporate stench, this
RNG is well designed and will prevent many RNG blunders such as Netscape's.
I think this is a step in the right direction. Rather than letting Joe,
Timmy his cousin, and Timmy's best friend's friend design their own RNGs,
they provide a good solution for everyone without having them trip on their
own feet like Netscape did.
2.2.1) Intel RNG Entropy Gathering
Intel's Random Number Generator is to be integrated into PC motherboards.
There are 2 resistors and 2 oscillators (one slow, the other fast). The
voltage difference between the 2 resistors is amplified to sample thermal
noise. This noise source is used to modulate the slow clock. This clock
with variable modulation is used to set intervals between measurements of
the fast clock. When the interval is triggered the frequency of the fast
clock is then filtered through what Intel calls the von Neumann corrector
(patent pending). The corrector compensates for the fast clocks bias
towards staying in fixed bit states (regardless of the slow clock's
variable modulation). It works by comparring pairs of bits and outputing
only one or no bits ([1,0]=0; [0,1]=1; [0,0]or[1,1]=no output;). The
output of the corrector is grouped in 32bit blocks and sent to the Intel
Security Driver.
2.2.2) Intel RNG Entropy Estimation
No estimations are done for a few reasons. Because the entropy source is
hardware based, it cannot be manipulated unless it is put into temperatures
far beyond or bellow resonable ambient conditions, or the computer's power
is cut off (in which case the entropy gathering stops). Beyond that, all
entropy is gathered in the same way and can be assumed of identical
quality.
2.2.3) Intel RNG Entropy Pool
The Intel Security Driver takes care of mixing the RNG's output. The pool
is composed of 512bits of an SHA-1 hash contexts divided into two states.
An 80bit hash of the first state is generated and appended with 32 bits of
entropy (from the hardware) and the first 160bits from the first state to
create the second state. When another 32bits of entropy are generated, the
second state becomes the first state and the same process is repeated.
2.2.4) Intel RNG Output Function
The last 16bits of the 80bit hash of the first state are output to the
middleware. The Intel Security Driver ensures that each output is
dispatched only once. If desired, additional processing of the output will
have to be done by the program that requested the random data.
2.2.5) Intel RNG Analysis
[1] The need to implement the von Neumann corrector is demonstration of
the RNG's affinity for repetitive sequences. An attacker could calculate
when 1s or 0s are disproportionatly output by estimating it's throughput
in bits/sec, but this doesn't lead to any feasable attacks (yet).
[2] The use of contracted middleware may lead to security holes. Before
using a company's middleware, you may want to wait a few months just to
see if a quick break is released.
2.3) Linux' /dev/random's Design
Information Source: /dev/random source code *2
2.3.0) /dev/random Overview
Linux provides the /dev/random character device as an interface for
applications to recieve random data with good quality entropy. It provides
a gernourously sized entropy pool (512 bytes) to accomodate the operating
system and all software running on it. When quality entropy is not
necessary, a second character device /dev/urandom is provided as a PRNG to
avoid wastefully depleting /dev/random's entropy pool.
2.3.1) /dev/random Entropy Gathering
External functions from the kernel trigger the addition of entropy into the
pool. Events that trigger this are key presses, mouse movement, and IRQs.
Uppon each trigger, 32bits of a high-frequency timer are copied, and
another 32bits are derrived depending on the type of trigger (either the
mouse coordinates, keybaord scancode, or IRQ number).
2.3.2) /dev/random Entropy Estimation
Entropy estimation is calculated with the help of three deltas. Delta1 is
the time elapsed since the last trigger of its type occured. Delta2 is the
difference between Delta1 and the previous Delta1. Delta3 is the difference
between Delta2 and the previous Delta2. The smallest of the three deltas
calculated is chosen as Delta. The least significant bit of Delta is
ignored and the next 12bits are used to increment the entropy counter.
2.3.3) /dev/random Entropy Pool
This RNG uses an entropy pool of 4096bits. Prior to input, a marker
denoting the current position along the pool is decremented by 2 32bit
words. If the position is 0, the position is wrapped around backwards to
the second last 32bit word. Entropy is added in two 32bit words: x & y. A
variable, j determines how many bits to the left the entropy should be
rotated. Before entropy is added, j is incremented by 14 (7 if the pool is
in position 0). Entropy is rotated by j. Depending on the current position
along the pool, y is XORed with 5 other fixed portions of the pool (the
following positions are wrapped around from the current position: 103,76,
51,25,1 (for a 4096bit pool) and x is XORed with each next word. x is
shifted to the right 3bits, XORed by a constant within a 1x7 table (0,
0x3b6e20c8, 0x76dc4190, 0x4db26158, 0xedb88320, 0xd6d6a3e8, 0x9b64c2b0,
0xa00ae278) the index of which is chosen by x AND 7 (bitwise, 3bits). x
XOR y is then appended to the pool skipping one word. y is shifted to the
right 3bits, XORed with the constant table the same way x was and then
copied into the word that was skipped in the pool. The pool remains at
this position (previous position - 2, possibly wrapped around the end).
2.3.4) /dev/random Output Function
When output is requested from the RNG, the timer and the number of bytes
requested is added to the pool as entropy. The pool is then hashed with
SHA-1 and the first 2 words of the hash are fed as entropy into the pool;
this is repeated 8 times, but each time the next 2 words of the hash are
fed into the pool. The first half of the final hash is then XORed to its
second half to produce the output. The output is either the requested size
or 20 bytes (half the hash size); the smallest of these is chosen.
2.3.5) Linux' /dev/random Analysis
[1] Monitoring and predicting of some IRQs is possible in a networked
environment.
[2] There is allot of redundancy in the lower 16bits of the entropy added.
For example, when a keypress occurs a 32bit variable holds 16bits from a
high-resolution timer, and the lower 16 bits are 0-255 for the keypress
(256+ are used to designate interupts). This leaves 8bits of redundancy
for every keypress.
[3] The time elapsed since the last block of entropy was added is usually
irrelevent to the quality of the entropy, unless that lapse is very short.
This doesn't take into account sequencial entropy entries like continuous
disk access while moving a file.
[4] When output occurs, the mixing mechanism re-enters allot of hashed
entropy which may or may not be of good quality. These re-entered words
are added to the entropy count but should not. They are bits of entropy
that have already been counted. After output, 512bits of entropy are
redundantly entered. If this estimate is accurate, then after 8 calls to
output there are 4096bits (the entire pool) of entropy of undifinable
quality. Under these circumstances, if no entropy is input from
user-interacting during the calls, the RNG becomes a PRNG.
2.4) Yarrow's Design
information sources: Yarrow source code and White Papers *3,*4
2.4.0) Yarrow Overview
Yarrow is designed by Bruce Schneier, auther of Applied Cryptography and
designer of block ciphers Blowfish and AES finalist Twofish. Yarrow is
Schneier's interpretation of the proper design of an RNG and is accompanied
by a detailed paper descibing its inner-workings and analysis (see the
second information source). It is the product of lengthy research and sets
standard in properties expected to be found in a secure RNG. It is
discussed here for comparisson between commonly trusted RNGs and one
designed by a seasoned proffessional.
2.4.1) Yarrow Entropy Gathering
System hooks wait for keyboard or mouse events. If a key has been pressed,
the time elapsed since the last key-press is appended to an array. The same
is done when a mouse button has been pressed. If the mouse has moved, the
x and y coordinates are appended to a mouse movement array. Once an array
is full is is passed to the entropy estimation function.
2.4.2) Yarrow Entropy Estimation
The entropy estimation function is passed an estimated number of bits of
entropy chosen by the programmer's bias towards it's source. One could
decide that that mouse movement only represents 4 bits of entropy per
movement, while keyboard latency is worth 8bits per key-press. Another
measurement uses a small compression algorithm and measures the compressed
size. The third and last measurement is half the size of the entropy
sample. The smallest of these three measurements increments the entropy
estimate.
2.4.3) Yarrow Entropy Pool
When entropy is input, it is fed into a fast pool (SHA-1 context) and an
entropy estimate is updated for that pool. Once the pool has accumulated
100bits of entropy, the hash output of this pool is fed into the slow pool
and its entropy estimate is updated. When the slow pool has accumulated
160bits of entropy it's hash output becomes the current key.
2.4.4) Yarrow Output Function
When output is required, the current key (derived from the slow pool)
encrypts a counter (its number of bits is chosen by the programmer) and
outputs the ciphertext; the counter is then incremented. After 10 outputs,
the RNG reseeds the key by replacing it with another (forced) output. The
key will next be reseeded either when the slow pool has accumulated 160bits
or 10 outputs have occured.
2.4.5) Yarrow Analysis
[1] Mouse movement on its own is very redundant, there is a very limited
range of motion between the last postion and the current position after
the OS has sent the message that the mouse has moved. Most of the bits
representing the mouse's position are unlikely to change and throw-off the
entropy estimates in this RNG.
[2] Even though the pool's internal state is 320+n+kbits long, there is a
maximum of 160bits of entropy during any state. "Yarrow-160, our current
construction, is limited to at most 160 bits of security by the size of
its entropy accumulation pools." *4
----| 3) NoiseSpunge Source Code
The Following source code is simply a brief example. Do whatever you want
with it; even that thing you do with your tongue and the rubber ... never
mind. It _WILL_NOT_COMPILE_ because about 1,200 lines have been omitted,
consisting of Haval, Rijndael and the PRNG). Haval and Rijndael source
code is readily available. Any PRNG will do, but make sure it works with
32bit inputs and outputs and has a period of at least 2^32 (4294967296).
I've devided it into 3 chunks: entropy gathering, entropy pool, output
functions.
[ENTROPY GATHERING]
This loop must run on a thread independent of the application's main
thread. For OS dependancies, I've created dummy functions that should be
replaced:
int64 CounterFreq; //high-res counter's frequency/second
int64 QueryCounter; //high-res counter's current value
Delay(int ms); //1 milisecond precision delay
int GetMouseX; //current mouse x coordinate
int GetMouseY; // " y coordinate
#define MOUSE_INTERVAL 10
{
Prng_CTX PCtx;
int x,y;
unsigned long Block;
unsigned long BitsGathered;
int65 Interval,Frequency,ThisTime,LastTime;
unsigned long BitsGathered=0;
bool Idled=false;
Frequency=CounterFreq;
bool Terminated=false; //Set value to true to end the loop
do
{
if (Idled==false)
{
Delay(MOUSE_INTERVAL);
Idled=true;
}
ThisTime=QueryCounter;
if ((ThisTime-LastTime)>Interval)
{
if ((x!=GetMouseX)&&(y!=GetMouseY)
{
x=mouse.cursorpos.x;
y=mouse.cursorpos.y;
Block|=((x^y^ThisTime)& 15)<<BitsGathered;
BitsGathered+=4;
if (BitsGathered==32)
{
PrngInit(&PCtx,Block);
AddEntropy(Block); //this function is defined lower
Block=0;
BitsGathered=0;
}
Interval=((((Prng(@PCtx)%MOUSE_INTERVAL)>>2)+MOUSE_INTERVAL)
* Frequency)/1000;
}
LastTime=QueryCounter;
Idled=false;
}
} while (Terminated==false);
}
[ENTROPY POOL]
#define SEED_SIZE 8
#define PRIMARY_RESEED 8
#define SECONDARY_RESEED 8
//parameters
#define MAX_KEY_RESERVE 8
#define KEY_BUILD_ROUNDS 16
typedef unsigned long Key256[SEED_SIZE];
Key256 Seed;
Key256 EntropyBuffer;
Haval_CTX PrimaryPool;
Haval_CTX SecondaryPool;
unsigned char PrimaryReseedCount;
unsigned char EntropyCount;
unsigned char KeyReserve;
//FUNCTIONS
void NoiseSpungeInit
{
HavalInit(&PrimaryPool);
HavalInit(&SecondaryPool);
for (int i=0;i<8;i++) Seed[i]=0;
EntropyCount=0;
PrimaryReseedCount=0;
KeyReserve=0;
}
void PermuteSeed
{
Key256 TempBuffer[2];
Prng_CTX PCtx;
Haval_CTX HCtx;
for (int i=0;i<SEED_SIZE;i++) //expand
{
PrngInit(&PCtx,Seed[i]);
TempBuffer[0][i]=Prng(&PCtx);
TempBuffer[1][i]=Prng(&PCtx);
}
HavalInit(&HCtx);
HavalUpdate(&HCtx,&TempBuffer,64); //compress
HavalOutput(&HCtx,&Seed);
}
void PrimaryReseed
{
Key256 TempSeed;
HavalUpdate(&PrimaryPool,&EntropyBuffer,32);
if (PrimaryReseedCount<SECONDARY_RESEED)
{
HavalOutput(&PrimaryPool,&TempSeed);
for (int i=0;i<SEED_SIZE;i++) Seed[i]^=TempSeed[i];
PrimaryReseedCount++;
} else SecondaryReseed;
for (int i=0;i<SEED_SIZE;i++) EntropyBuffer[i]=0;
if (KeyReserve<MAX_KEY_RESERVE) KeyReserve++;
EntropyCount=0;
}
void SecondaryReseed
{
HavalOutput(&PrimaryPool,&Seed);
HavalUpdate(&SecondaryPool,&Seed,32);
HavalOutput(&SecondaryPool,&Seed);
PermuteSeed;
HavalInit(&PrimaryPool);
PrimaryReseedCount=0;
}
void AddEntropy(unsigned long Block)
{
EntropyBuffer[EntropyCount++]=Block;
if (EntropyCount==PRIMARY_RESEED) PrimaryReseed;
}
[OUTPUT FUNCTIONS]
int SafeGetKey(Key256 *Key)
{
Key256 TempSeed;
Key256 TempBuffer[2];
Rijndael_CTX RCtx;
Prng_CTX PCtx;
Haval_CTX HCtx;
if (KeyReserve==0) Return 0;
for (int i=0;i<SEED_SIZE;i++) TempSeed[i]=Seed[i];
PermuteSeed;
RijndaelInit(&RCtx,&Seed);
for (int i=0;i<KEY_BUILD_ROUNDS;i++)
{
RijndaelEncrypt(&RCtx,&TempSeed[0]); //encrypt
RijndaelEncrypt(&RCtx,&TempSeed[4]);
for (int j=0;j<SEED_SIZE;j++) //expand
{
PrngInit(&pctx,TempSeed[j]);
TempBuffer[0,j]=Prng(&PCtx);
TempBuffer[1,j]=Prng(&PCtx);
}
HavalInit(&HCtx);
HavalUpdate(&HCtx,&TempBuffer,64);
HavalOutput(&HCtx,&TempSeed);
}
for (int i=0;i<SEED_SIZE;i++) Key[i]=TempSeed[i];
if (KeyReserve>0) KeyReserve--;
Return 1;
}
void ForcedGetKey(Key256 *Key)
{
Key256 TempSeed;
Key256 TempBuffer[2];
Rijndael_CTX RCtx;
Prng_CTX PCtx;
Haval_CTX HCtx;
for (int i=0;i<SEED_SIZE;i++) TempSeed[i]=Seed[i];
PermuteSeed;
RijndaelInit(&RCtx,&Seed);
for (int i=0;i<KEY_BUILD_ROUNDS;i++)
{
RijndaelEncrypt(&RCtx,&TempSeed[0]); //encrypt
RijndaelEncrypt(&RCtx,&TempSeed[4]);
for (int j=0;j<SEED_SIZE;j++) //expand
{
PrngInit(&pctx,TempSeed[j]);
TempBuffer[0,j]=Prng(&PCtx);
TempBuffer[1,j]=Prng(&PCtx);
}
HavalInit(&HCtx);
HavalUpdate(&HCtx,&TempBuffer,64);
HavalOutput(&HCtx,&TempSeed);
}
for (int i=0;i<SEED_SIZE;i++) Key[i]=TempSeed[i];
if (KeyReserve>0) KeyReserve--;
}
----| 4) References
*1 Intel Random Number Generator White Paper
http://developer.intel.com/design/security/rng/CRIwp.htm
*2 /dev/random source code
http://www.openpgp.net/random/
*3 Yarrow source code
http://www.counterpane.com/Yarrow0.8.71.zip
*4 Yarrow-160: Notes on the Design and Analysis of the Yarrow
Cryptographic Pseudorandom Number Generator
http://www.counterpane.com/yarrow-notes.html