Pseudo-random number generators (PRNGs) power property tests and Monte Carlo simulations. They are also used to model random behaviour in nature and to initialise machine learning models1. Their performance and quality matter.

Unfortunately, up to version 1.1, the default PRNG of the random library for Haskell was neither very fast nor very high quality. The recently announced version 1.2 fixes this. Random version 1.2 is a collaboration between Dominic Steinitz, Alexey Kuleshevich, and myself, with invaluable support from Alexey Khudyakov and Andrew Lelechenko.

In this blog post, I will focus on how we ensured that the implementation used in random v1.2 is, indeed, a high quality PRNG.

### Pseudo-random number generators

A PRNG produces from a seed a sequence of approximately uniformly distributed numbers.

init :: Seed -> State
next :: State -> (Output, State)

This allows us to generate a output sequence x1, x2, x3, … as follows:

let g0 = init seed
(x1, g1) = next g0
(x2, g2) = next g1
(x3, g3) = next g2

This is not always enough though. Often you may want two or more parallel sequences that are still (approximately) random even when assembled. Such sequences are said to be independent.

split :: State -> (State, State)

This can arise when you have different threads. But it’s of prime importance in a lazy language like Haskell, since it makes it possible to generate deterministic random numbers lazily. For instance, split is used in QuickCheck’s Gen monad.

An easy implementation of split is to duplicate the current state. But, the two parallel sequences will not be independent: indeed they will be identical. A slightly more sophisticated implementation is to generate the two split states randomly, but this will typically not yield independent sequences either.

The split function in random up to version 1.1 creates generators that are not independent (#3575, #3620).

In a blog post comparing the performance of Haskell PRNG implementations, Alexey Kuleshevich found splitmix to be the fastest pure PRNG written in Haskell, making it the top candidate to replace the default PRNG in the proposed new version of random. But it also had to be good enough at generating random sequences.

In the following, I will go through how we tested the split function in random version 1.1 and splitmix-0.1, which we used for version 1.2, in order to make sure that, while in version 1.1 split generates sequences which are not independent, in version 1.2 split generates independent sequences.

## Legacy and SplitMix.

The PRNG used by random up to version 1.1 is one of the PRNG algorithms proposed by L’Ecuyer, with an ad hoc split function tacked on without statistical justification. We shall refer to this PRNG as Legacy. Since version 1.2, the PRNG is SplitMix, a PRNG designed from the ground up to be splittable, as implemented in the splitmix package.

Every PRNG with a finite state is periodic: it will necessarily come back to the same state after a certain number of iterations, called its period. In contrast, long sequences of truly random numbers are unlikely to be periodic. As a result, every PRNG with a finite state is distinguishable from a source of true randomness, at least in theory.

Legacy produces numbers in the range $[1, 2^{31}-5)$ and has a period of just under $2^{61}$; SplitMix produces numbers in the range $[0, 2^{64})$ and has a period of exactly $2^{64}$.

Considering the output range and period of Legacy and SplitMix, we see something interesting: Legacy’s period of roughly $2^{61}$ contains its range of roughly $2^{31}$ numbers many times over. For SplitMix, on the other hand, the output range and period are both $2^{64}$. SplitMix in fact generates each output in $[0, 2^{64})$ precisely once.

We actually expect repetitions much more often than that: observing $2^{64}$ numbers where each number in the range $[0, 2^{64})$ occurs exactly once is vanishingly unlikely in a truly random sequence of numbers. This is is a manifestation of the Birthday problem.

And indeed, SplitMix fails a test that checks specifically that repetitions in the output sequence occur as frequently as expected according to the Birthday problem.

How much of an issue is this? It depends on the application. Property tests, for example, don’t benefit from repetitions. And if, as is common, you only need values in a subrange of the full $[0, 2^{64})$, you will have many repetitions indeed.

## Reproducible test environment

While certain PRNG properties like output range and period can be determined “by inspection”, there is no clear definition or test for its quality, that is, how similar to uniformly distributed numbers a PRNG’s output sequence is.

The closest thing we have are collections of tests, often called test batteries, which empirically check if a PRNG’s output sequence has certain properties expected of a random number sequence — like the distance between repetitions, the famous Birthday problem mentioned in the previous section.

To make it easy to test PRNGs, we created a repository with a Nix shell environment to run the most well-known PRNG test batteries.

### Setup

Nix is required to use the test environment. The easiest way to enter it is using this command:

$nix-shell https://github.com/tweag/random-quality/archive/master.tar.gz You can test that it works by running PractRand on a truly terrible “PRNG”: /dev/zero, which outputs zero bytes. The PractRand binary is called RNG_test, passing “stdin32” makes it read unsigned 32-bit integers from standard input: $ RNG_test stdin32 < /dev/zero

PractRand should immediately show a large number of test failures. Don’t use /dev/zero as a source of random numbers!

### PRNGs

While this post is about testing Legacy and SplitMix, the environment we created can test any PRNG written in any language. To demonstrate this, we have included programs in C, Python, Perl and Lua, each using the language’s standard PRNG to output a stream of unsigned 32-bit integers2. To see the source code of the generator scripts, run e.g. cat $(which generate.pl). You can run these programs as generate.c, generate.py, generate.pl and generate.lua within the Nix environment. All the statistical tests in this environment consume binary input. To convert the hexadecimal 32-bit numbers to binary, pipe them through xxd -r -p, e.g. $ generate.lua | xxd -r -p.

### Test batteries

The environment contains a collection of PRNG test batteries. All of them either support reading from standard input or were wrapped to support this. As a result, all the tests can be run as follows:

• To test a system-provided randomness source, run TEST < $SOURCE, e.g. $ RNG_test stdin32 < /dev/random
• To test a program that generates pseudo-random numbers, run PROGRAM | $TEST, e.g. $ generate.py | xxd -r -p | RNG_test stdin32

The test batteries include:

• PractRand, the most recent PRNG test in the collection. It is adaptive: by default, it will consume random numbers until a significant test failure occurs.

Example invocation: $RNG_test stdin32 Help: $ RNG_test -h

• TestU01, containing the test batteries SmallCrush (-s), Crush (-c) and BigCrush (-b). We have wrapped it in the executable TestU01_stdin, which accepts input from standard input.

Example invocation: $TestU01_stdin -s Help: $ TestU01_stdin -h

For a full list, see the README.

## Split sequences

These test batteries test the randomness of individual sequences. But we want to test whether sequences generated by splits are independent or not, which involves at least two sequences at a time.

Rather than coming up with new test batteries dedicated to split, Schaathun proposes four sequences built of a combination of next and split. These “split sequences” are traditional (non-splittable) PRNGs built off next and split. The split sequences are then fed into regular PRNG tests.

Here is the API for splittable PRNGs again:

init :: Seed -> State
next :: State -> (Output, State)
split :: State -> (State, State)

A naming convention for the result of split will come in handy. Let’s say that a generator g is split into gL and gR (for “left” and “right”), that is, (gL, gR) = split g. Then we get gLL and gLR by applying split to gL via (gLL, gLR) = split gL, etc. Let’s also call rX the output of next gX, i.e. (rLL, _) = next gLL.

This lets us express concisely the first of the sequences proposed by Schaathun:

Sequence S_L: given g, output rL; repeat for g = gR

This sequence recurses into the right generator returned by split while outputting a number generated by the left generator. We can visualise the sequence as a binary tree where the nodes are split operations. Asterisk (*) stands for a generated output, ellipsis (…) is where the next iteration of the sequence continues:

     split
/  \
*  split
/  \
*    …

The other sequences can be expressed as follows:

Sequence S_R: given g, output rR; repeat for g = gL

split
/  \
split *
/  \
…    *
Sequence S_A: given g, output rR, rLL; repeat for g = gLR

split
/  \
split *
/  \
*    …
Sequence S: given g, output rRLL, rRLR, rRRL, rRRR; repeat for g = gL

split
/     \
…      split
/     \
split      split
/  \       /  \
*    *     *    *

We implemented these sequences in a test program which uses Legacy or SplitMix as the underlying PRNG implementation and outputs the random numbers on stdout.

## Test results for Legacy and SplitMix

The following table shows our test results. We tested with PractRand and TestU01. The sequences S, S_A, S_L and S_R are those discussed above.

The test battery output is usually a p-value per test indicating the probability of the input being random. We summarise the results as “Fail” or “Pass” here. Each table cell links to the full result, which includes the full command used to run them. Since the seeds of the sequences are fixed, you should be able to reproduce the results exactly.

PRNG / sequence PractRand result TestU01 result
Legacy / S Fail (8 MiB) Fail (SmallCrush)
SplitMix / S Pass (2 TiB) Pass (Crush)
Legacy / S_A Pass (1 TiB) Pass (Crush)
SplitMix / S_A Pass (2 TiB) Pass (BigCrush)
Legacy / S_L Fail (8 GiB) Fail (Crush)
SplitMix / S_L Pass (2 TiB) Pass (BigCrush)
Legacy / S_R Fail (64 GiB) Fail (Crush)
SplitMix / S_R Pass (2 TiB) Pass (BigCrush)

## Conclusion

In this post, I’ve described how we’ve made sure that the PRNG used in the newly released version 1.2 of the random library is of high quality. The legacy PRNG used up to version 1.1 had a poor quality implementation of the split primitive, which we wanted to fix.

To this effect, we created a general-purpose reproducible PRNG testing environment, which makes it easy to run the most commonly used PRNG tests using Nix with a minimal amount of setup work. For testing, we used split sequences to reduce the quality of split to the quality of a traditional PRNG.

For further details, check the reddit announcement for random 1.2. And don’t hesitate to use our repository to setup your own PRNG testing.

2. In the case of C, if RAND_MAX is not equal to 2^32, the program will not output the full 32-bit range and will thus fail statistical tests. The other programs should in theory output numbers in the correct range.