The idea for this article was born in 2004 while I was working on my bachelor thesis, which included the implementation of a simulation and test environment for a flow control algorithm. For various reasons, I chose to implement it as a managed application developed with C#. The only time I regretted this choice was when I started implementing the simulation of data traffic, which requires distributed random variables. Unfortunately, I couldn't find any freely available managed implementations of random number distributions in either the .NET Framework Class Library or any other resource. So, I implemented the needed random number distributions myself. Since then, I've had the idea to create a class library on the basis of these few implementations and publish it here on CodeProject, but I've never really had the time to realize it. Until now.

The Random Class Library contains abstract base classes for random number generators and random number distributions, as well as various concrete classes that are derived from both. Before I start to describe these classes, I have to mention that none of the algorithms which generate the (distributed) random numbers are my own intellectual work, as I'm no brilliant mathematician. Therefore, this article, as well as the source code, contains references to the respective sources.

The Generator type declares common functionality for all random number generators. This includes the functionality provided by the System.Random type, plus some extensions: the class additionally declares two overloads of the NextDouble method, a NextBoolean method and the ability to reset the random number generator. Resetting can be very useful when using pseudo-random number generators. The following table lists all abstract members, together with a brief description.

| Abstract Member | Description |
| --- | --- |
| bool CanReset { get; } | Gets a value indicating whether the random number generator can be reset, so that it produces the same random number sequence again. |
| bool Reset(); | Resets the random number generator, so that it produces the same random number sequence again. Returns true if the random number generator was reset; otherwise, false. |
| int Next(); | Returns a nonnegative random number less than Int32.MaxValue; that is, the range of return values includes 0 but not Int32.MaxValue. |
| int Next(int maxValue); | Returns a nonnegative random number less than the specified maximum; that is, the range of return values includes 0 but not maxValue. maxValue must be greater than or equal to 0. |
| int Next(int minValue, int maxValue); | Returns a random number within the specified range; that is, the range of return values includes minValue but not maxValue. maxValue must be greater than or equal to minValue. |
| double NextDouble(); | Returns a nonnegative floating point random number less than 1.0; that is, the range of return values includes 0.0 but not 1.0. |
| double NextDouble(double maxValue); | Returns a nonnegative floating point random number less than the specified maximum; that is, the range of return values includes 0.0 but not maxValue. maxValue must be greater than or equal to 0.0. |
| double NextDouble(double minValue, double maxValue); | Returns a floating point random number within the specified range; that is, the range of return values includes minValue but not maxValue. maxValue must be greater than or equal to minValue. The distance between minValue and maxValue must be less than or equal to Double.MaxValue. |
| bool NextBoolean(); | Returns a random Boolean value. |
| void NextBytes(byte[] buffer); | Fills the elements of a specified array of bytes with random numbers. Each element of the array of bytes is set to a random number greater than or equal to 0, and less than or equal to Byte.MaxValue. |
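To make the contracts of the extended members more concrete, here is a minimal sketch of how they could be layered on top of System.Random. The class name SketchGenerator and all method bodies are assumptions for illustration only; they are not the library's Generator or StandardGenerator code.

```csharp
using System;

// Hypothetical sketch, not the library's Generator class: it only illustrates
// how the documented contracts could be built on top of System.Random.
public sealed class SketchGenerator
{
    private readonly int seed;
    private Random random;

    public SketchGenerator(int seed)
    {
        this.seed = seed;
        this.random = new Random(seed);
    }

    // A seeded pseudo-random generator can always replay its sequence.
    public bool CanReset => true;

    // Re-creates the inner generator from the stored seed.
    public bool Reset()
    {
        random = new Random(seed);
        return true;
    }

    // [0, Int32.MaxValue)
    public int Next() => random.Next();

    // [0.0, 1.0)
    public double NextDouble() => random.NextDouble();

    // [0.0, maxValue); maxValue must be >= 0.0.
    public double NextDouble(double maxValue)
    {
        if (maxValue < 0.0)
            throw new ArgumentOutOfRangeException(nameof(maxValue));
        return random.NextDouble() * maxValue;
    }

    // [minValue, maxValue), up to floating point rounding;
    // the distance between the bounds must not overflow a double.
    public double NextDouble(double minValue, double maxValue)
    {
        double range = maxValue - minValue;
        if (range < 0.0 || double.IsInfinity(range))
            throw new ArgumentOutOfRangeException(nameof(maxValue));
        return minValue + random.NextDouble() * range;
    }

    // Next(2) yields 0 or 1, which maps to false or true.
    public bool NextBoolean() => random.Next(2) == 1;
}
```

Calling Reset and then repeating the same sequence of calls should reproduce the same values, which is the property the CanReset/Reset pair is meant to expose.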

Currently, the library provides four classes derived from Generator. These are listed in the following table, together with a short description and links for further reading.

| Implementation | CanReset | Description / Links |
| --- | --- | --- |
| ALFGenerator | true | Represents an Additive Lagged Fibonacci pseudo-random number generator with some additional Next methods. This type is based upon the implementation in the Boost Random Number Library. It uses the modulus 2^32 and, by default, the "lags" 418 and 1279. These can be adjusted through the associated ShortLag and LongLag properties. Some popular pairs are presented on Wikipedia - Lagged Fibonacci generator. |
| MT19937Generator | true | Represents a Mersenne Twister pseudo-random number generator with period 2^19937-1 and some additional Next methods. This type is based upon information and the implementation presented on the Mersenne Twister Home Page. |
| StandardGenerator | true | Represents a simple pseudo-random number generator. This type internally uses an instance of the System.Random type to generate pseudo-random numbers. |
| XorShift128Generator | true | Represents a xorshift pseudo-random number generator with period 2^128-1 and some additional Next methods. This type is based upon the implementation presented in the CP article "A fast equivalent for System.Random" and the theoretical background on xorshift random number generators published by George Marsaglia in the paper "Xorshift RNGs". |
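As an illustration of how compact such a generator can be, the following sketch implements the 128-bit xorshift recurrence from Marsaglia's paper with the shift triple (11, 19, 8). The class name XorShift128Sketch and the seeding scheme (only x is taken from the seed) are assumptions; this is not the library's XorShift128Generator.

```csharp
using System;

// Sketch of Marsaglia's xorshift128 recurrence with shift triple (11, 19, 8).
// The seeding scheme used here is an assumption for illustration.
public sealed class XorShift128Sketch
{
    // Initial values for y, z and w as given in Marsaglia's paper.
    private const uint SeedY = 362436069, SeedZ = 521288629, SeedW = 88675123;

    private readonly uint seed;
    private uint x, y, z, w;

    public XorShift128Sketch(uint seed)
    {
        this.seed = seed;
        Reset();
    }

    // Restores the initial state so the same sequence is produced again.
    public void Reset()
    {
        x = seed;
        y = SeedY;
        z = SeedZ;
        w = SeedW;
    }

    // One step of the recurrence; returns the next 32-bit value.
    public uint NextUInt32()
    {
        uint t = x ^ (x << 11);
        x = y;
        y = z;
        z = w;
        w = (w ^ (w >> 19)) ^ (t ^ (t >> 8));
        return w;
    }

    // [0.0, 1.0): scales the 32-bit result by 2^-32.
    public double NextDouble() => NextUInt32() * (1.0 / 4294967296.0);
}
```

Because y, z and w start at non-zero constants, the state can never be all zeros, which is the one state a xorshift generator must avoid.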

The Distribution class declares common functionality for all random number distributions. Its abstract members which have to be implemented by inheritors are some properties providing information on distribution characteristics and the NextDouble method. The following table lists all abstract members together with a brief description.

| Abstract Member | Description |
| --- | --- |
| double Minimum { get; } | Gets the minimum possible value of distributed random numbers. |
| double Maximum { get; } | Gets the maximum possible value of distributed random numbers. |
| double Mean { get; } | Gets the mean of distributed random numbers. If the mean can't be computed, the Double.NaN constant will be returned. |
| double Median { get; } | Gets the median of distributed random numbers. If the median can't be computed, the Double.NaN constant will be returned. |
| double Variance { get; } | Gets the variance of distributed random numbers. If the variance can't be computed, the Double.NaN constant will be returned. |
| double[] Mode { get; } | Gets the mode of distributed random numbers. If the mode can't be computed, an empty array will be returned. |
| double NextDouble(); | Returns a distributed floating point random number. |

In addition to its abstract members, the Distribution type provides some implementation details common to all random number distributions. As the computation of distributed random numbers necessarily requires a random number generator, the generator field stores an instance of the Generator class. This instance is accessible to inheritors through its respective property. Furthermore, two protected constructors are defined: one takes a user-defined Generator object and the other is a standard constructor that applies an instance of the StandardGenerator type. The Distribution type also offers the same reset functionality as the Generator class. In fact, it simply forwards the results of the stored Generator instance, as a random number distribution can only be reset if its underlying random number generator is resettable.
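The relationship described above can be sketched roughly as follows. All type names ending in "Sketch" are hypothetical, trimmed-down stand-ins; the member names follow the article, but the bodies are assumptions rather than the library's source.

```csharp
using System;

// Hypothetical stand-in for the library's Generator base class.
public abstract class GeneratorSketch
{
    public abstract bool CanReset { get; }
    public abstract bool Reset();
    public abstract double NextDouble();
}

// Hypothetical resettable generator built on System.Random.
public sealed class SeededGeneratorSketch : GeneratorSketch
{
    private readonly int seed;
    private Random random;

    public SeededGeneratorSketch(int seed)
    {
        this.seed = seed;
        random = new Random(seed);
    }

    public override bool CanReset => true;

    public override bool Reset()
    {
        random = new Random(seed);
        return true;
    }

    public override double NextDouble() => random.NextDouble();
}

// Hypothetical stand-in for the library's Distribution base class.
public abstract class DistributionSketch
{
    private readonly GeneratorSketch generator;

    // Standard constructor: falls back to a default generator.
    protected DistributionSketch()
        : this(new SeededGeneratorSketch(Environment.TickCount)) { }

    // Constructor taking a user-defined generator.
    protected DistributionSketch(GeneratorSketch generator)
    {
        this.generator = generator ?? throw new ArgumentNullException(nameof(generator));
    }

    // Inheritors reach the private field through this property.
    protected GeneratorSketch Generator => generator;

    // Reset support is simply forwarded to the underlying generator.
    public bool CanReset => generator.CanReset;
    public virtual bool Reset() => generator.Reset();

    public abstract double NextDouble();
}

// Continuous uniform distribution on [alpha, beta) as a minimal inheritor.
public sealed class UniformSketch : DistributionSketch
{
    private readonly double alpha, beta;

    public UniformSketch(GeneratorSketch generator, double alpha, double beta)
        : base(generator)
    {
        this.alpha = alpha;
        this.beta = beta;
    }

    public override double NextDouble() =>
        alpha + Generator.NextDouble() * (beta - alpha);
}
```

With this layout, a distribution is resettable exactly when its generator is, and a distribution that buffers values of its own can override Reset to discard them as well.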

The Random Class Library currently provides Distribution inheritors for various continuous and discrete distributions. They are listed in the following tables, together with a short description, links for further reading and information on the range and specific distribution parameters.

Besides the inherited members, all classes derived from Distribution share some more similarities. Firstly, all distributions offer two constructors: one that takes a user-defined Generator object as an underlying random number generator and another as a standard constructor that uses an instance of StandardGenerator type for this purpose.

Secondly, each distribution provides methods that allow you to determine whether a value is valid for one of its specific parameters and therefore can be assigned through the corresponding property. These methods follow the naming scheme "IsValid{parameter}" and, to be consistent, are also available for parameters whose range of values isn't restricted.

Also, the classes derived from Distribution make use of helper variables to speed up the random number generation, if possible. These helpers store intermediate results that only depend on distribution parameters and therefore don't need to be recalculated in successive executions of NextDouble. The computation of the helper variables is encapsulated inside the UpdateHelperVariables method, which gets called during construction and whenever a distribution parameter involved in the helper's calculation changes. The following code snippet is taken from the PoissonDistribution type and should illustrate the preceding explanations.
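The sketch below is not the library's PoissonDistribution source; it is a hypothetical, minimal illustration of the same pattern, combining an "IsValid{parameter}" method with a cached helper value. It samples with Knuth's multiplication method, which is only practical for small λ, and it uses System.Random directly instead of a Generator to stay self-contained. The names PoissonSketch and helper1 are assumptions.

```csharp
using System;

// Hypothetical sketch of the helper-variable pattern; this is not the
// library's PoissonDistribution code.
public sealed class PoissonSketch
{
    private readonly Random random;
    private double lambda;
    private double helper1; // caches exp(-lambda)

    public PoissonSketch(Random random, double lambda)
    {
        this.random = random ?? throw new ArgumentNullException(nameof(random));
        Lambda = lambda; // validates and computes the helper during construction
    }

    public double Lambda
    {
        get => lambda;
        set
        {
            if (!IsValidLambda(value))
                throw new ArgumentOutOfRangeException(nameof(value));
            lambda = value;
            UpdateHelperVariables(); // recompute only when the parameter changes
        }
    }

    // Follows the "IsValid{parameter}" naming scheme: lambda must be positive.
    public bool IsValidLambda(double value) => value > 0.0;

    // The exponential depends only on lambda, so it is computed once here
    // instead of on every call to Next.
    private void UpdateHelperVariables()
    {
        helper1 = Math.Exp(-lambda);
    }

    // Knuth's method: multiply uniform numbers until the product drops
    // to exp(-lambda) or below; the number of extra factors is the sample.
    public int Next()
    {
        int count = 0;
        double product = random.NextDouble();
        while (product > helper1)
        {
            count++;
            product *= random.NextDouble();
        }
        return count;
    }
}
```

The saving is one Math.Exp call per generated number, which adds up quickly when millions of samples are drawn with unchanged parameters.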

Changed the Distribution.Reset method to be virtual, so it can be overridden in derived classes

Overridden the Reset method in the NormalDistribution: The override discards an already computed random number to be returned next -- the underlying generation algorithm always computes two random numbers at a time -- so the distribution is always properly reset

Overridden the Reset method in the BetaDistribution, BetaPrimeDistribution, ChiDistribution, ChiSquareDistribution, FisherSnedecorDistribution, LogNormalDistribution, RayleighDistribution and StudentsTDistribution: The override explicitly resets the respective underlying distribution(s), which in most cases is the NormalDistribution, so the listed distributions are always properly reset

1.3

Fixed bug in NormalDistribution: Changes to the parameters μ and σ now discard an already computed random number to be returned next -- the underlying generation algorithm always computes two random numbers at a time -- so in any case changes to the parameters are reflected in the generated random numbers, beginning with the first one

1.2

Changed the access modifier of field Distribution.generator from protected to private and made it accessible through the new protected property Distribution.Generator

Adapted all inheritors of Distribution to the above change

Follow the link for further explanation: Visual Studio Team System - Do not declare visible instance fields

Random Class Library is free software. You can redistribute it and/or modify it under the terms of the GNU Lesser General Public License as published by the Free Software Foundation, either version 2.1 of the License, or (at your option) any later version. This library is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY, without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU Lesser General Public License for more details. You should have received a copy of the GNU Lesser General Public License along with this library. If not, write to the Free Software Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA.

I'd begun developing the RandomTester application when I implemented the first random number distributions for my bachelor thesis. At the time, its only purpose was to visualize the distribution of generated random numbers and the effects of the specific distribution parameters upon it. During the work on the Random Class Library, I refined this visualization and added performance tests for random number generators and distributions.

The RandomTester application allows you to test and compare random number generators with respect to their performance. The following picture shows the user interface of this test after running it.

All classes derived from Generator are listed at the left edge. They can be selected/deselected either one by one by clicking the respective list entry or all at once by using the "Select all" or "Deselect all" buttons. Beneath those buttons you can specify how many samples have to be generated by each generator to benchmark their performance. Finally, you have to choose which Next methods -- declared by the Generator type and implemented in the derived classes -- should be tested. The performance of random number generators is measured in calls per second. The results are displayed in a datagrid that contains a row for each random number generator and a column for each tested Next method.
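A calls-per-second measurement of this kind can be sketched in a few lines. This is a simplified assumption of how such a benchmark might look, not RandomTester's actual code; the names GeneratorBenchmarkSketch and MeasureCallsPerSecond are hypothetical.

```csharp
using System;
using System.Diagnostics;

// Hypothetical micro-benchmark helper; not taken from RandomTester.
public static class GeneratorBenchmarkSketch
{
    // Calls next() the given number of times and returns calls per second.
    public static double MeasureCallsPerSecond(Func<int> next, int samples)
    {
        if (next == null) throw new ArgumentNullException(nameof(next));
        if (samples <= 0) throw new ArgumentOutOfRangeException(nameof(samples));

        int sink = 0; // consume the results so the calls do real work
        Stopwatch watch = Stopwatch.StartNew();
        for (int i = 0; i < samples; i++)
            sink ^= next();
        watch.Stop();
        GC.KeepAlive(sink);

        // Guard against a zero elapsed time for very small sample counts.
        double seconds = Math.Max(watch.Elapsed.TotalSeconds, 1e-9);
        return samples / seconds;
    }
}
```

For example, `GeneratorBenchmarkSketch.MeasureCallsPerSecond(new Random(1).Next, 1000000)` would time the parameterless Next method of System.Random. Results from a loop like this vary between runs and machines, so they are best used to compare generators against each other rather than as absolute figures.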

Random number distributions can be tested in two ways. Firstly, a performance test is available that is similar to the one provided for the random number generators. Secondly, the distribution of generated random numbers and the effects of the specific distribution parameters upon it can be visualized. The user interface of those tests is shown in the image below.

As mentioned, the performance test is similar to the random number generator benchmark. At the left edge, all classes derived from Distribution are listed and can be selected/deselected either one by one by clicking the respective list entry or all at once by using the "Select all" or "Deselect all" buttons. Furthermore, the number of samples each distribution has to generate during the benchmark can be specified, as well as an underlying random number generator. The latter setting lets you use either a class derived from Generator or the distribution default. If an inheritor of Generator is selected, the tested distributions are instantiated using the constructor that takes such an object. Otherwise, the standard constructor is used. The performance of the selected random number distributions is expressed as the time it takes to generate the specified number of samples. The results are shown in ascending order inside the textbox.

In contrast to both performance tests, the main focus of the visualization test isn't performance but rather the testing of a single random number distribution. Therefore it employs a ZedGraphControl to show a histogram of the distribution of generated random numbers, i.e. how often distinct values occur or, more precisely, how likely it is for them to occur (probability density function).

In case of discrete distributions, this can be done easily by counting the occurrences of discrete values and dividing by the overall number of generated samples. Unfortunately, most random number distributions are continuous and have a quite large range. Thus, the probability of a given value being generated more than once is fairly low, even if many numbers are generated. That's why the visualization test divides the domain of generated samples into a specified number of sections and then shows how likely it is for the values inside these sections to occur. Such a histogram is also used for discrete random number distributions, since it provides the same results as long as the number of sections is equal to or greater than the number of discrete values. In this case, the complicated distinction between distribution types becomes redundant.
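The sectioning described above amounts to an equal-width histogram with relative frequencies. The following is a minimal sketch of that computation under the assumption that the bounds are known up front; it is not RandomTester's code, and the names HistogramSketch and RelativeFrequencies are hypothetical.

```csharp
using System;

// Hypothetical histogram helper; not taken from RandomTester.
public static class HistogramSketch
{
    // Splits [min, max] into equally wide sections and returns the relative
    // frequency of samples per section; samples outside the bounds are ignored.
    public static double[] RelativeFrequencies(
        double[] samples, double min, double max, int sections)
    {
        if (samples == null) throw new ArgumentNullException(nameof(samples));
        if (sections <= 0) throw new ArgumentOutOfRangeException(nameof(sections));
        if (!(max > min)) throw new ArgumentException("max must be greater than min.");

        double width = (max - min) / sections;
        int[] counts = new int[sections];
        int used = 0;

        foreach (double sample in samples)
        {
            // Also skips NaN, since both comparisons are false for it.
            if (!(sample >= min && sample <= max)) continue;

            int index = (int)((sample - min) / width);
            if (index >= sections) index = sections - 1; // sample == max joins the last section
            counts[index]++;
            used++;
        }

        double[] frequencies = new double[sections];
        for (int i = 0; i < sections; i++)
            frequencies[i] = used == 0 ? 0.0 : (double)counts[i] / used;
        return frequencies;
    }
}
```

Dividing each relative frequency by the section width would turn the result into an estimate of the probability density. For a discrete distribution, choosing at least as many sections as there are distinct values keeps each value in a section of its own.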

At the top left corner, a dropdown list lets you select the distribution to be visualized from all classes derived from Distribution. Beneath that list, a groupbox displays the current characteristics of the selected distribution. A second groupbox allows you to adjust the specific parameters. Depending on the selected distribution, the change of a parameter also causes one or more of its characteristics to change. The final adjustment directly related to the distribution is the selection of an underlying random number generator that allows you to choose either the distribution default or a class derived from Generator.

Any further settings are related to the visualization of the random number distribution. You can specify how many samples and sections are used to create the histogram, whether the histogram curves should be smoothed and whether specific bounds are used, i.e. any generated random number lying outside them will be ignored.

As shown in the image above, multiple histograms can be displayed at the same time. This allows you to examine the effects of the parameters on the distribution of random numbers or even compare different distributions. Each newly generated histogram is drawn as a separate curve. The name and parameter values of the tested distribution are added to the legend. In addition to the histogram, the bottom left textbox shows how much time was needed to generate the random numbers, as well as their minimum, maximum, mean and variance.

Adjusted distribution visualization so that the last interval of histograms is displayed correctly

Until now, the histogram graphs consisted of points representing the minimum bounds of histogram intervals, so the last interval wasn't really drawn. Therefore graphs now contain an additional point for the maximum bound of the last interval which, of course, has the same y-value as the corresponding minimum bound.

1.1

Display unit "samples/s" in generator test

Use byte[64] for testing Generator.NextBytes method so the test is less time consuming

RandomTester is free software. You can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, either version 2 of the License, or any later version. This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY, without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details. You should receive a copy of the GNU General Public License along with this program; if not, write to the Free Software Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA.


License

This article has no explicit license attached to it but may contain usage terms in the article text or the download files themselves. If in doubt please contact the author via the discussion board below.

Comments and Discussions

I wanted to use your tester program but it gives a zillion problems with respect to the ZedGraph stuff. For example it says that it can't tell if it's using a ZedGraph or a System.Windows Label. If I disambiguate those then I get a bunch of other serious errors.

As I'm sure it worked on your system, I suspect that the problem arises from the fact that I have a recent ZedGraph installation on my PC and for some reason it's clashing with yours. However, my version of the .dll is identical to yours.

You're right.
Nevertheless, I checked again and found something strange. Although I copied ZedGraph.dll version 5.0.1.41097 into the RandomTester project and referenced it, Visual Studio tells me in the properties that it references version 4.2.1.35091 at a path that no longer exists, and it copied that version to the output directory (I don't know from where). I've experimented a bit, but now everything is messed up and I'm no longer able to let the RandomTester project reference the Random project, which is quite essential.
I will try to fix this mess on the weekend and update the downloads if I succeed.

Regards, Stefan

"Programming today is a race between software engineers striving to build bigger and better idiot-proof programs, and the Universe trying to produce bigger and better idiots. So far, the Universe is winning." - Rick Cook

Got some time tonight and think I fixed it.
It seems it was some weird problem with the VS project. I copied all resources to a new project where I could reference my Random project again and also the ZedGraph.dll version I want. When referencing the version of ZedGraph that I thought I had been using the whole time, I faced the same ambiguity problems as you, and they seem to have changed some other things too. So obviously, on your computer the VS project didn't show the same weird behaviour as on mine.
Because I didn't want to rewrite the code for the newer ZedGraph version and the RandomTester runs well with the older one, I stuck with it and updated the project so it hopefully uses this one correctly on every computer.


Hi there, first of all *great* article. Very useful for me.
I was wondering if in your work you have come across an FInv function (it returns the quantile according to the F distribution and is available in Excel as FInv).


Almost ten years ago I took a class in Simulations and Testing for my CE Masters. In that class I had to hand code a half dozen different random number generators in order to use them simultaneously for different variables/inputs to guarantee they were related. Boy could I have used something like this back then ... though C# wasn't even on the radar then.
Worth a 5 to me even though I don't have a current need for RNG functions.

One thing though, my text book back then had many examples of bad RNG functions and charts/graphs to show why. They were often time series plots in 3D that would show patterns, etc. If I recall correctly, often it wasn't the algorithm, but the parameters that could make or break an RNG. I no longer have the book, but I was wondering how hard it would be to add such tests to your test app so that you could experiment with the parameters to the different RNG functions and/or create more and be able to spot a bad one.

It's hard to say without having or knowing the book you refer to, and therefore not knowing how complex these tests are.


I am trying Random Tester and the DiscreteUniformDistribution, and it gives me quite strange results. The histogram is almost OK: the numbers are uniformly distributed, except for 3 points in it. The first is a number somewhat less than 1/3 of the range, the second a number somewhat less than 2/3 of the range, and the third is the upper bound.

If I generate numbers between 0 - 100, then the histogram is OK except for number(s) a bit less than 30 and 60 and at the number 100.

If I generate numbers between 0 - 5000, then the histogram is OK except for number(s) about 1400 and 2900 and 5000. To see this anomaly, you have to increase the number of samples (I used 10 000 000).

To see it clearly, it is good not to use a large range (say 0 - 1000) and to use many samples (it's visible with 1 000 000 samples, but much better with 10 000 000 samples).

The described behaviour results from the way the displayed histogram is computed. If you generate discrete random numbers between 0 and 100, there are 101 discrete numbers that can occur. At the same time, the histogram uses only 100 steps (intervals) by default, so it cannot differentiate all generated numbers.
Therefore, increase the number of histogram steps so that it at least equals the number of generated discrete numbers.

Regards, Stefan


I was looking for a way to generate randomised stockmarket data for testing purposes. Random generation of a close signal is relatively straightforward. The problem is generating open/high/low/close/volume data which is stochastic, i.e. some pattern is present related to CLV (Close Location Volume), ATR (Average True Range), OBV (On Balance Volume) etc.

At least with your set of random generators I can now simulate stochastics, by applying various distributions to OHLC prices and volumes.

I'm just beginning tonight with your classes and already I have created a better simulator using your classes than I had been able to create previously.
Cheers
Anton

Do you mind if I merge/integrate your random library with/to the Math.NET Iridium [2] package (it is LGPL, too)? We already have some distributions and generators implemented, but your implementation seems to be superior. Of course we would retain your copyright notice, and mention the contribution on the website, as usual.


The 'next' functions all have side effects. It doesn't take much work to rewrite this as side effect free. The end result is that testing is much easier, and you can use assertions more easily. For details, have a read of Object Oriented Software Construction, edition 2 by B Meyer.