Testing Random Properties With Type Classes

Given a stream of bytes whose length is unknown, select a random byte from the stream using constant memory.

I chose this example because it is a small but non-trivial problem involving random number generation.

One naive solution would be to store all of the bytes in a list in memory, wait for the stream to close, and then pick one of them at random. This violates the last requirement of the problem, though, which asks for a solution with constant memory utilization.

Let's start by defining a Stream datatype so that we can encode the problem in terms of finding a function of the correct type:
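As a minimal sketch, assuming a simple list-like encoding (the names End, Cons, fromList and toList are illustrative assumptions, not necessarily the definitions used originally), such a datatype could look like this:

```haskell
-- A hypothetical list-like stream: either the stream has closed (End),
-- or it yields one element followed by the rest of the stream (Cons).
data Stream a
  = End
  | Cons a (Stream a)

-- Build a stream from an ordinary list; handy for writing tests.
fromList :: [a] -> Stream a
fromList = foldr Cons End

-- Convert a stream back to a list; only used for inspection.
toList :: Stream a -> [a]
toList End = []
toList (Cons x rest) = x : toList rest

-- With this encoding, the problem amounts to finding a function
-- of the following shape, where IO supplies the randomness:
--
--   select :: Stream a -> IO a
```

A pure structure like this cannot model a stream that performs I/O as it is consumed, but it is enough to state the type of the function we are looking for.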

The idea is to keep two accumulator parameters. The first, of type IO a, represents some value of type a chosen with uniform probability from the values seen so far. The second, of type Int, is the number of values seen so far.

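If we reach the end of the stream, we return the first accumulator. Otherwise, writing n for the number of values seen so far including the new one, we choose the new value with probability 1/n and keep the value we had already chosen with probability (n-1)/n. By induction, each of the n values is then selected with probability 1/n.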
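The accumulator scheme described above can be sketched as follows. This is an illustration under assumptions rather than the original listing: the Stream type is the hypothetical list-like encoding, the behaviour on an empty stream (an IO error) is my own choice, and randomRIO comes from the random package. In the code, n counts the values seen before the new one, so the new value is kept with probability 1/(n+1).

```haskell
import System.Random (randomRIO)

-- Hypothetical list-like stream, assumed for this sketch.
data Stream a = End | Cons a (Stream a)

-- select' stream acc n
--   acc : an action yielding a value chosen uniformly from those seen so far
--   n   : how many values have been seen so far
select' :: Stream a -> IO a -> Int -> IO a
select' End acc _ = acc
select' (Cons x rest) acc n = select' rest acc' (n + 1)
  where
    -- Keep the new value x with probability 1/(n+1);
    -- otherwise fall back to the earlier choice.
    acc' = do
      i <- randomRIO (1, n + 1)
      if i == 1 then return x else acc

-- Seed the accumulators with the first element.
-- Assumption: an empty stream is reported as an IO error.
select :: Stream a -> IO a
select End = ioError (userError "select: empty stream")
select (Cons x rest) = select' rest (return x) 1
```

For example, select (Cons 'a' (Cons 'b' (Cons 'c' End))) returns one of the three characters, each with probability 1/3.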

Now we would like to write some QuickCheck properties to verify that the results are indeed uniform.

The problem is that the function select' works in the IO monad and is inherently non-deterministic. We could replace the use of randomRIO with a deterministic pseudo-random function driven by a seed value, but then we would not be able to guarantee full coverage of the code, since each seed exercises only one sequence of choices. How many random samples would it take to gain confidence that the function indeed performs as expected?

The trick is to replace the IO monad with some monad living in a suitable typeclass.

Let's replace the call to randomRIO with a call to the new function uniform:

class (Monad r) => MonadRandom r where
    uniform :: (Int, Int) -> r Int

The new typeclass MonadRandom has at least one instance that we know of, namely IO:

-- randomRIO is imported from System.Random (random package)
instance MonadRandom IO where
    uniform = randomRIO
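With the class and the IO instance in place, the selection function can be generalized from IO to any MonadRandom. The following is a sketch under the same assumptions as before: a hypothetical list-like Stream type, and an empty stream treated as an error. The only real change is that randomRIO becomes uniform and IO becomes a type variable r.

```haskell
import System.Random (randomRIO)

class Monad r => MonadRandom r where
  -- uniform (lo, hi) picks an Int uniformly from the inclusive range.
  uniform :: (Int, Int) -> r Int

instance MonadRandom IO where
  uniform = randomRIO

-- Hypothetical list-like stream, assumed for this sketch.
data Stream a = End | Cons a (Stream a)

-- Same accumulators as before, but in an arbitrary MonadRandom r.
select' :: MonadRandom r => Stream a -> r a -> Int -> r a
select' End acc _ = acc
select' (Cons x rest) acc n = select' rest acc' (n + 1)
  where
    -- Keep the new value with probability 1/(n+1).
    acc' = do
      i <- uniform (1, n + 1)
      if i == 1 then return x else acc

-- Assumption: calling select on an empty stream is a programming error.
select :: MonadRandom r => Stream a -> r a
select End = error "select: empty stream"
select (Cons x rest) = select' rest (return x) 1
```

Used at type IO, this behaves exactly like the randomRIO version; the point of the extra polymorphism is that other instances can be substituted later.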

Now instead of working with random values, let's identify the values with their probability distributions. This way, we keep the whole distribution rather than losing information by sampling a single value from it.

Introduce the type Dist a, of probability distributions with values in type a:
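One possible representation, given here as a sketch and not necessarily the original definition, is a list of outcomes paired with exact Rational weights. The names runDist, probability and uniformDist are assumptions made for this illustration.

```haskell
import Data.Ratio ((%))

-- A finite probability distribution: outcomes paired with their weights.
-- The same outcome may appear more than once; its weights simply add up.
newtype Dist a = Dist { runDist :: [(a, Rational)] }

instance Functor Dist where
  fmap f (Dist xs) = Dist [ (f x, p) | (x, p) <- xs ]

instance Applicative Dist where
  pure x = Dist [(x, 1)]
  Dist fs <*> Dist xs = Dist [ (f x, p * q) | (f, p) <- fs, (x, q) <- xs ]

instance Monad Dist where
  -- Sequencing multiplies probabilities along each path.
  Dist xs >>= k = Dist [ (y, p * q) | (x, p) <- xs, (y, q) <- runDist (k x) ]

-- Total probability that the distribution assigns to one outcome.
probability :: Eq a => a -> Dist a -> Rational
probability x (Dist xs) = sum [ p | (y, p) <- xs, y == x ]

-- The uniform distribution over an inclusive range of Ints.
uniformDist :: (Int, Int) -> Dist Int
uniformDist (lo, hi) = Dist [ (i, 1 % n) | i <- [lo .. hi] ]
  where
    n = fromIntegral (hi - lo + 1)
```

Because the weights are Rational, probabilities can be compared with exact equality. For instance, the sum of two dice, built with do-notation from uniformDist (1, 6), equals 7 with probability exactly 1 % 6. A function like uniformDist is also a natural candidate for the uniform method of a MonadRandom instance for Dist.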

Given a MonadRandom instance for Dist, we can now specialize select to the Dist monad for the purposes of testing our QuickCheck properties.

The idea is that since select is universally quantified in the monad r, we cannot cheat and use any specific knowledge we have about r to make our tests pass. If a test passes for one instance of MonadRandom, then we would expect the test to pass for any sensible instance of MonadRandom.

For example, let's write a property to check that selecting a random value from a Stream does not exclude any values:
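Here is a self-contained sketch of such a property. Everything in it is an assumption made for illustration: the list-based Dist, the list-like Stream, the helper names, and the cap of six elements. The cap is there because this naive Dist never merges duplicate outcomes, so its size grows roughly factorially with the stream length.

```haskell
import Data.Ratio ((%))

class Monad r => MonadRandom r where
  uniform :: (Int, Int) -> r Int

-- Hypothetical list-based distribution with exact Rational weights.
newtype Dist a = Dist { runDist :: [(a, Rational)] }

instance Functor Dist where
  fmap f (Dist xs) = Dist [ (f x, p) | (x, p) <- xs ]

instance Applicative Dist where
  pure x = Dist [(x, 1)]
  Dist fs <*> Dist xs = Dist [ (f x, p * q) | (f, p) <- fs, (x, q) <- xs ]

instance Monad Dist where
  Dist xs >>= k = Dist [ (y, p * q) | (x, p) <- xs, (y, q) <- runDist (k x) ]

instance MonadRandom Dist where
  uniform (lo, hi) = Dist [ (i, 1 % n) | i <- [lo .. hi] ]
    where
      n = fromIntegral (hi - lo + 1)

-- Total probability assigned to one outcome.
probability :: Eq a => a -> Dist a -> Rational
probability x (Dist xs) = sum [ p | (y, p) <- xs, y == x ]

-- Hypothetical list-like stream.
data Stream a = End | Cons a (Stream a)

fromList :: [a] -> Stream a
fromList = foldr Cons End

select' :: MonadRandom r => Stream a -> r a -> Int -> r a
select' End acc _ = acc
select' (Cons x rest) acc n = select' rest acc' (n + 1)
  where
    acc' = do
      i <- uniform (1, n + 1)
      if i == 1 then return x else acc

select :: MonadRandom r => Stream a -> r a
select End = error "select: empty stream"
select (Cons x rest) = select' rest (return x) 1

-- Every value in a non-empty stream is selected with non-zero probability.
-- Taking x separately guarantees that the stream is non-empty.
prop_selectCoversAll :: Int -> [Int] -> Bool
prop_selectCoversAll x xs = all (\y -> probability y dist > 0) ys
  where
    ys = take 6 (x : xs) -- cap the length to keep the Dist small
    dist = select (fromList ys)
```

Since the property is an ordinary function returning Bool, it can be handed to quickCheck from Test.QuickCheck as quickCheck prop_selectCoversAll. No randomness is involved in evaluating select here, so a failing case is always reproducible.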