Introduction

Pseudo Random Number Generators (PRNG) are used
in many cryptographic operations. One of the most important cryptographic
operations is generating encryption keys. If an attacker can compromise your
(pseudo) random number generator, they can potentially also compromise your
encryption key. This is bad; if an attacker knows the key used to encrypt your
data, they can decrypt what you have encrypted. Therefore a good PRNG must
generate 'random' numbers difficult to predict and the the state of the PRNG
itself must be difficult to compromise. Although Fortuna is new (and
cryptographers never trust anything new), it has been designed with both of
these goals in mind. Fortuna was designed by two well known people in the
cryptography / security field with experience in this area.

Fortuna is a random number generator developed by Bruce Schneier and Niels Ferguson in their book Practical
Cryptography. Fortuna addresses some of the shortcomings of their previous
PRNG Yarrow. The biggest
challenge implementing Yarrow is that Yarrow requires entropy estimates. Fortuna
overcomes this issue by removing the entropy esimators. The biggest design
change besides the removal of the entropy esimators is that Fortuna has 32 pools
to collect entropy.

My implementation of Fortuna is composed of 4 parts. There is a generator
that generates the actual pseudo random data. There is an accumulator composed
of 32 pools of source data (entropy). There is a source manager that manages the
data sources. And there is a seed file that maintains a copy of the state of the
pools and ensures that Fortuna can generate pseudo random data even after the
computer has just rebooted. Fortuna avoids the problem of requiring an accurate
estimate of the entropy by using 32 entropy pools. The entropy pools are used to
periodically reseed the generator. Therefore if one pool or stream of pseudo
random numbers is compromised, at the next reseed the generator will be using a
new seed so that future PRN's are not also compromised.

The authors give a lot of implementation details for Fortuna. For the most
part I followed their suggestions, but I did make a few changes. For more
complete info goto www.citadelsoftware.ca/fortuna.
For one thing, the authors recommend not using C or C++ for writing secure
software, and this is a C++ component. I'll try to point out along the way where
I deviate from their recommendations.

Design Goals

The main design goal of this implementation of Fortuna is to develop a PRNG
which would be suitable for use on a web server. One distinguishing feature of a
web server is that typically there is not much input from the keyboard or mouse.
Therefore, this implementation of Fortuna uses the available information from
Windows as 'entropy' sources to the generator. Note that this type of generator
is more correctly called 'chaotic deterministic' than 'pseudo random' because
the sources are not truly random. Some of the entropy for this implementation of
Fortuna comes from using the high performance timer to time operating system
calls. I realize that timing data is not true entropy because it is
deterministic. I'll say more below on the high performance counter. And being
unable to predict the outcome of the PRNG is main property desired.

Another design goal of this implementation of Fortuna is to make the PRNG
difficult to successfully attack. This is done by

There is a large volume of data fed to the entropy pools, some with high
entropy and some with low entropy. An attacker will have difficulty compromising
the state of a single pool due to the high volume of data.

Data from the Windows registry is used to so that application data specific
to your environment is included as a (low) entropy source to the accumulator

The status information of each process (page faults, memory, I/O Reads and
Writes) are used as entropy sources

Extensive use of the high precision timer is used as a source of entropy

Being careful to randomize valuable random data in memory when the valuable
data is no longer needed

Note that a true 'entropy' source requires a physical source; for a hardware
RNG see ComScire (I am not affiliated
with this company, but they appear to be a respected well established company
with a good product from all I have read).

If you:

trust your network security

are confident an attacker cannot get access to the complete registry, and
monitor the status of all the processes running on your machine

are confident that an attacker cannot run his own software on your machine
to attempt to determine the contents of the 32 data pools and compromise the
generator

I believe that this PRNG may suit your needs quite well.

In order to compromise this implementation of Fortuna, an attacker would
require access to the entire registry, process monitoring etc. An attacker would
have to severely compromise a machine to do this. Because of this, I believe
this implementation of Fortuna would be acceptable for many web based
applications. Below I list all of the data souces used to populate the data
pools. Note that I said 'data pools', not 'entropy pools'. This is because some
of the sources used don't have much entropy. For example, I use the entire
registry as a source for the pools. Now, there is not a lot of entropy in the
registry, but for an attacker to know the entire registry would require that the
entire machine has been compromised. If this happens, you have a lot of problems
besides the possibility of your PRNG being compromised.

Class Design

This implementation of Fortuna is highly multithreaded. Each data source runs
on its own thread and each data pool also runs on its own thread. Each data
source cycles over all the pools, distributing one byte to each pool in a round
robin fashion. Each pool object has a mutex protecting the pool data. Because
each source spreads it's data over each of the pools, this adds another level of
'deterministic chaos' to the distribution of source data over the pools. When
two different sources access a pool at the same time, the thread scheduler
determines which thread will have access first. While the thread scheduler is
not random, given that there are 32 pools each on their own thread, and over 32
sources each on their own thread, predicting which source byte ends up in which
pool quickly becomes intractible.

All of these threads do not add much overhead to your application because
they spend most of their time waiting. Each source calls 'WaitForSingleObject'
to determine if the component is shutting down.

UML diagrams for my implementation of Fortuna can be found on my web site at
here.

Entropy Sources

Entropy sources (using the word entropy loosely as described above, they are
more appropriately described as data sources) are distributed among the entropy
pools. In the book Practical Cryptography, each source event is added to
only one pool(see 10.5.6 in the book). In this implementation, each byte from
each source event is distributed among the pools in a round robin fashion. With
each byte of data there is also a byte which specifies the number of bits of
entropy in that source byte of data. This allows the pools to keep track of how
many bits of entropy are in the pool.

Each source must provide this virtual method to provide some 'chaotic data'
(I know, I know, it's a cheesy name)

// derived classes must implement this to return their data
// pair - unsigned char, this is the chaotic data
// pair - int, this is the number of bits of
// entropy in the associated byte
virtualbool GetChaoticData(std::vector<std::pair<unsignedchar,
int> > vData)=0;

The High Resolution Performance Counter

Most new computers now come with a high resolution performance counter. The
frequency of the counter is available from the api function:

BOOL QueryPerformanceFrequency(LARGE_INTEGER* lpFrequency);

If there is no high performance timer, the return value is 0. On my 2.2GHZ
laptop, the frequency of the high performance timer is 3,579,545. Note that the
frequency is different from the CPU frequency. The value of the high resolution
performance counter is obtained by:

BOOL QueryPerformanceCounter(LARGE_INTEGER* lpPerformanceCount);

When timing data is obtained using the high performance counter, each byte
that is different in the timer adds one bit of entropy to the pool.

I did some experiments regarding the high performance counter to determine
how many bits of entropy I could claim by using the high performance counter to
time the counts for a Sleep of 1ms. I wrote a separate program for these tests.
I used the high performance timer to get the high performance count around the
Sleep fn as follows:

where hpTimer is a simple class to wrap the high performance
timer. Then I add the lsb from the difference in time to a bitvector (called
BitBucket) I wrote myself (member m_bitbucket of type BitBucket). I
need the mutex because I used 200 threads to speed up the bit generation and had
a separate thread which extracted the accumulated lsb's and wrote them out to a
file. The file writer thread Sleeps 100ms after extracting the bits from each
bit generation thread.

I accumulated all of the lsb's, writing 60MB or so to a file and then ran the
DieHard randomness tests on the bits.
The data passed most of the diehard tests. If you're interested in more of the
gory details you can find them at my web site www.citadelsoftware.ca/hptimer.

The Windows Registry

The Windows registry is a huge data repository. This is one source of data
used to populate the data pools. Again, this is not random data, but reproducing
this data requires the attacker to have access to the entire registry. One of
the challenges of working with the registry for this sort of application is
ensuring that every handle to a registry key that is opened also gets closed.
For this PRNG to be able to run on a server for weeks or months without
rebooting, there can be no resource leaks. I have taken care to close registry
key handles using RegCloseKey() when the handle is no longer
needed. Also, registry keys are opened with KEY_ENUMERATE_SUB_KEYS |
KEY_QUERY_VALUE so that only read permission is requested.

None of the data in the registry is considered to add entropy to the pools
except for the CLSID's. Again this is a very conservative approach. Using the
registry with four different source threads spreading the data over the pools
complicates an attackers job considerably.

The registry is a system wide resource. Almost all applications will at some
point make use of the registry. The FortunaCSI PRNG must be careful not to place
too many demands on the registry. To accomplish this, every source thread has
it's thread priority lowered. This is done by

Also, after every access to the registry, each source thread walking the
registry sleeps for some length of time. The Sleep is typically from 3-10 ms.
The actual sleep time is a function of the thread id, the 'this' pointer, and
the data extracted from the registry. This makes it more difficult for an
attacker to determine which of the 32 pools the data from the registry will end
up in.

The case of a key with restricted access is also handled properly in the
code. Attempting to open registry keys can result in an error code of
ERROR_ACCESS_DENIED. When this happens, the subkey is abandoned and
execution continues with the parent key of the key with the access restriction.

I tested the Registry walking logic by writing out a log file with every key
and value in the registry. This pointed out a number of problems, particularly
where I wasn't properly handling keys that didn't have read permission.

HKEY_CLASSES_ROOT\CLSID

One source thread of this Fortuna implementation cycles over all the guids in
the HKCR\CLSID list. This is a list of every guid for every COM object installed
on the machine. On my 2.2 Ghz Windows/XP laptop (about 1 year old) I have
roughly 5600 guid's in this list and each cycle over these guids takes around 1
min.

When adding the bytes of the guids to the data pools, I only count 1 bit of
entropy for each byte of the guid that is different than the preceeding guid.
This keeps the estimate of the amount of 'entropy' very conservative. This is
important to avoid reseeding the generator too often.

Reading all the CLSID's on my system once takes about 30 minutes.

HKEY_CLASSES_ROOT

Processing HKCR takes about 2 and 1/2 hours on my machine.

HKEY_CURRENT_USER

HKCU\Software is where the applications store application specific data in
the registry. For example, your text editor will likely put the list of your
most recently used files here. Each application contains a lot of data like this
plus gui data such as your background colour etc. On my Win/XP laptop, it takes
about 8 min for a dedicated thread to cycle through all this registry data.
Dumping out the key names and only ascii data for the keys resulted in a file of
around 270K.

On my machine it takes about 30 minutes to walk HKCU.

HKEY_LOCAL_MACHINE

HKLM is the largest repository of the registry. On my machine it takes about
5 hours to process.

Process Information

There is a wealth of information available for each
process. If you look at the Task Manager, under "View", "Select Columns" you can
see some of the different information that is available:

Process Name

Process id

Memory Usage

Page Faults

Handles

Threads

I/O Reads

I/O Read Bytes

I/O Writes

I/O Write Bytes

I/O Other

I/O Other Bytes

Each of the above quantities is available via different Win32 API calls.
Following are some of the methods and the data that is available. For more
information on the specific functions, see MSDN.

Each non zero byte from the following structures is added to the data pools.
When a byte differs from the value obtained for the same process on a previous
call, the byte is marked as containing one bit of entropy. Otherwise, the data
is added to the pool but no entropy is claimed.

GetProcessIOCounters

The method fills in the IO_COUNTERS structure. This provides the
following information:

QueryWorkingSet

The Win32 API provides QueryWorkingSet to access the current
working set of memory pages loaded for a process. I found a somewhat dated
article to provide more information on the working set. If you are interested,
here is the article by Matt Pietrek which
gives more information on the working set (if someone knows of a more recent
article I'd be glad to provide a reference to that instead).

Each time the working set for a process is obtained, the bytes are compared
the bytes from the previous working set. If the size is different, then each non
zero byte adds one bit of entropy. If the number of bytes is the same, then each
each different byte adds one bit of entropy. Bytes that are the same as those
from the previous working set are added to the pool but no entropy is claimed.
This keeps the entropy estimate very conservative.

Some working sets can be quite large, well over 10MB of data. Because of
this, I decided to limit the working sets considered to only those working sets
below 1MB of data. Here are some of the applications I found with large working
sets:

svchost.exe, 22MB

Windows Explorer, 21MB

Yahoo Pager, 18MB

Visual Studio 7, 45MB

Internet Explorer, 30MB

I was surprised how large the working set was for Yahoo Pager. Before I
started filtering out the large working sets, Fortuna was using large amounts of
memory and generating a lot of page faults. Removing the large working sets
resulted in more reasonable system resource usage for the PRNG.

Microsoft Crypto API

Microsoft has introduced a cryptographic API which they claim is suitable for
cryptographic purposes. Unfortunately, nothing is published regarding the
internals of the Crypto API. Therefore some people are not comfortable using it
due to privacy concerns. Also, given some Microsoft security practices in the
past, it is difficult to know to what level should the Crypto API be trusted.

My approach here is to use the Crypto API as one source of 'random' data. Why
not? It increases the workload of the attacker; now the attacker must compromise
the Microsoft Crypto API as well as the other data sources used. The Crypto API
generates 32 bytes of random data every 100ms. These bytes are distributed over
all of the data pools using an estimate of 1 bit of entropy per byte of 'random'
data. This is a conservative estimate obviously, because the internals of the
Crypto API are unpublished.

Ping

Ping is a network command used to determine the time of a network
transmission. I use the high resolution timer to time the various calls to
perform a Ping. Even if the ping times out, the timing data still has useful
randomness for seeding the prng pools. Ping timing data only compresses by
around 2%. I wasn't able to gather enough timing data from Ping to analyze the
results using Diehard.

Seed File

I have implemented the seed file a little differently than the authors of
Practical Cryptography. If I'm understanding section 10.6.1 (page 178)
correctly, they are only writing out the state from the generator. So, each time
the PRNG is restarted, all the entropy in the pools is lost.

In my implementation, I write out the state of each of the 32 entropy pools.
I also write out the key from the generator as well as the counter used by the
generator. Some of the entropy pools are only used in the reseed very seldom and
provide security during reseeds when the generator state is compromised. So in
my opinion the state of the entropy pools is worth saving in the seed file.

I encrypt the seed file using AES with a 256 bit key in CTR mode. The key is
the SHA256 hash of the user provided seed file password concatenated to a salt
value.

One of the issues surrounding the use of the seed file is ensuring that the
same seed file is never used twice. To avoid possible repeated values if the
same seed file is used twice (see the book for a good discussion of how this
could happen), after the seed file is read and before the Fortuna PRNG is
allowed to return pseudo random bytes, more data is added to the hash of the
pool entropy. This data includes a hash of the current snapshot of all
processes, time information, and other available data. See the function
GetMachineSignature() in FortunaUtils.cpp for all the gory
details.

Using the code

A brief description of how to use the article or code. The class names, the
methods and properties, any tricks or tips.

Using the code is quite straightforward. All of the complexity is hidden from
the user. Here is how to set up the Fortuna PRNG.

There is also a method to get an arbitrarily long array of random bytes

bool GetRandomBytes(unsignedchar* pData, unsignedint numBytes);

Points of Interest

Writing server applications requires extensive testing. After adding a new
feature, I would usually test by letting the program run for 8 to 12 hours,
requesting a random number every 1 to 10 seconds. After adding the
QueryWorkingSet data source and running for 8 hours I found (from
the Task Manager) that there were some 24000 handles being used - oops. I
quickly found the problem and fixed it. The GetPerformanceInfo
function returns the number of handles; perhaps that value should be monitored
and an indication of a problem be written to a log file if the value goes over
some threshold.

To demonstrate the use of the prng I wrote a little MFC app which generates
passwords. The screen shots at the top of this CodeProject page show the Fortuna
monitor I added. This is an MFC dialog which monitors the state of the generator
and the source and pool objects. The 'Peek' edit box for the Sources shows the
actual data being fed to the pools from the currently selected source.

If you have any questions or comments on my implementation of Fortuna I'd be
glad to hear from you. My email address is ron at citadelsoftware dot ca.
Cheers!

History

Revision 1.0

License

This article has no explicit license attached to it but may contain usage terms in the article text or the download files themselves. If in doubt please contact the author via the discussion board below.

Share

About the Author

Ron got his BSc in Chemical Engineering from the University of Waterloo (Ontario, Canada) and his Masters in Chem Eng at the Univsity of Alberta (Edmonton, Alberta, Canada). After over 10 years writing software for engineering simulators, Ron started consulting in Mar 2001. Since then he has worked in chemical engineering, personal finance (Intuit, QuickTax), at a wireless startup doing network programming and done a little teaching on writing software for Palm handheld devices.

Located in Calgary, Alberta, Canada, Ron has recently taken up snowboarding as he tries to keep up with his 4 boys.

Comments and Discussions

Dear citadel,
I'm currently trying to expand the amount of algorithms available for <ahref="http://www.cryptopp.com/">Crypto++</a>.
As of now, there's only one "good" CSPRNG in the library: ANSI X9.17 Appendix C.
Would it be OK for you if I take your code, modify it at some points (i.e. generalize cipher & hash) and propose it for adding to the library. The source files would be public domain (the reason for me question) and I would enlist our both names as authors, with foxus on you.
This is not final, it may be that I find some security holes later on which need to be fixed or it won't be accepted, or I won't be able to transform it to Crypto++'s interface.
BR
JPM

Not sure if you are still monitoring this forum, but I have a question.

First, I like what you've done. Not really hard to understand, and all the work is hidden. Makes using it a breeze.

My question concerns the threading code. I realize it isn't yours. But other than the copyright I cannot find any info on use restrictions. My safari doesn't seem to have this book available, so I can't check what the author's have to say. Any info? Can the threading code be used commercially?

Great article. I'm glad to see your implementation of the Fortuna algorithm, since good PRNGs are hard to come by, and not everyone has the book. Plus I notice you've come up with some very innovative implementations for sources of pool entropy.

I think that security would be reduced by saving the state of your entropy pool. Here's why. The state is meant to give a 'fast' output, and when combined with the counter, produces a nice PRNG sequence quickly. The entropy pool is meant to slowly accumulate randomness, and update the state.

If you save the state and the pool contents, then an attacker that can 'restore' the machine from a backup would know the contents of the state (the Fortuna authors plan for this in their original design), but also the contents of the entropy pool. By saving the pool, and restoring from saved pool state, the entire pool contents become known to an attacker.

The major problem I see with this is that it takes longer for the 'known' data to flush out of the entropy pool, and thus longer before the PRNG gets to a state the attacker couldn't predict. This reduces security. Fortuna was designed to help get the PRNG to a random state an attacker can't predict, even if they know the state. If they also get to control the entropy pool, this defeats much of that design.

On the other hand, if you implement the original design, where you start with an empty pool and just accumulate the new (hopefully unknown to the attacker) entropy, then it is able to have an immediate effect, thus getting the PRNG to an unpredictable state more quickly and increasing security.

What do you think? I'm curious to know your rationale for saving this pool entropy. Maybe you are assuming if the new data is hashed against the previous contents, then using a known state is OK since it increases the starting position?

Thanks for an excellent article, source code, design docs, etc. Openness is truly the best way to get good security.

Thanks for the note, it's nice to get at least one response after doing this project 6 months ago.

First, the reason that I decided to persist the pool entropy is that with 32 pools, some of the pools will only be used after years of running Fortuna. I had a reference for the length of time it would take to use the 32nd pool if a request was made every 0.1 seconds but I can't find it right now. So wiping out the entropy pools every time you had to restart Fortuna or reboot the machine seemed like good entropy (likely never used before for the higher numbered pools) was being lost.

I agree that there is a danger that an attacker could replace the entropy pools on the machine if they were somehow able to restore known entropy pools from a backup or even just replacing the pools with state known only to the attacker. (Of course, if this was to happen then your machine is likely compromised and then you've got other problems as well). To get around this problem I've done two things. One is to provide a password and encrypt the pool data when it is persisted. I know this isn't perfect because the password is likely on the machine somewhere and so it could be found. But it does provide extra work for the attacker.

The other thing that I do to avoid the pools being compromised on starting up Fortuna, is that when Fortuna is initializing, after the state of existing pools are read in from disk, I hash into the pools 'random' bits collected from the state of the machine. I know this isn't truly random data, but I would say that it is difficult to predict.

I call this startup data the 'Machine Signature' and if you're interested here is a link to the C++ source code in Doxygen format:Machine Signature

By the way each pool is initialized with the machine signature regardless of how the initial state was initialized. This initialization is done before any random bytes can be requested. Pool::AddMachineSignatureToPool shows the machine signature being hashed in to the pool contents:Pool::AddMachineSignatureToPool and you'll have to scroll down to line 302.

Thanks for your comments Doug, let me know if there is anything else you'd like to discuss. Do you have any other suggestions for entropy sources? I was thinking about randomly walking the disk and hashing in files size, access time, create time etc (maybe do 1 a minute or something like that).

Thanks for the feedback. I see what you are doing and think it is fine. Someone would have a hard time knowing the state of your machine without some serious compromising, so it seems the way you are using the reload would be effective.

Have you considered using any other performance counters? Things like CPU, network I/O rates? There seem to be a number of metrics that could be measured, and most the Perfmon counters should be available as interesting sources. Also, I'm wondering if there is a way to sample audio from a mic or line in, and then take the LSBit as a source?.

I'm in the process of writing a C# implementation of Fortuna, and have most the underlying logic done. At this point I just need to implement the seed file, and then write the entropy sources, so I'll be doing some more thinking about these. I'll let you know if I hit on any other good sources.

Thanks for sharing your implementation with the community as well as writing the inspirational article.