Pseudo-random number generators (PRNGs) power property tests and Monte Carlo simulations. They are also used to model random behaviour in nature and to initialise machine learning models1. Their performance and quality matter.
Unfortunately, up to version 1.1, the default PRNG of the random library for Haskell was neither very fast nor very high quality. The recently announced version 1.2 fixes this. Random version 1.2 is a collaboration between Dominic Steinitz, Alexey Kuleshevich, and myself, with invaluable support from Alexey Khudyakov and Andrew Lelechenko.
In this blog post, I will focus on how we ensured that the implementation used in random v1.2 is, indeed, a high quality PRNG.
Pseudo-random number generators
A PRNG produces from a seed a sequence of approximately uniformly distributed numbers.
init :: Seed -> State next :: State -> (Output, State)
This allows us to generate a output sequence
x1, x2, x3, … as follows:
let g0 = init seed (x1, g1) = next g0 (x2, g2) = next g1 (x3, g3) = next g2
This is not always enough though. Often you may want two or more parallel sequences that are still (approximately) random even when assembled. Such sequences are said to be independent.
split :: State -> (State, State)
This can arise when you have different threads. But it’s of prime
importance in a lazy language like Haskell, since it makes it possible
to generate deterministic random numbers lazily. For instance,
split is used in
QuickCheck’s Gen monad.
An easy implementation of
split is to duplicate the current
state. But, the two parallel sequences will not be independent: indeed
they will be identical. A slightly more sophisticated implementation is to generate
the two split states randomly, but this will typically not yield
independent sequences either.
In a blog post comparing the performance of
Haskell PRNG implementations, Alexey Kuleshevich found
be the fastest pure PRNG written in Haskell, making it the top
candidate to replace the default PRNG in the proposed new version of
random. But it also had to be good enough at generating random sequences.
In the following, I will go through how we tested the
random version 1.1 and
splitmix-0.1, which we used for
version 1.2, in order to make sure that, while in version 1.1
generates sequences which are not independent, in version 1.2
Legacy and SplitMix.
The PRNG used by
random up to version 1.1 is one of the PRNG
algorithms proposed by L’Ecuyer, with an ad hoc
tacked on without statistical justification. We shall refer to this
PRNG as Legacy. Since version 1.2, the PRNG is SplitMix, a PRNG
designed from the ground up to be splittable, as implemented in the
Every PRNG with a finite state is periodic: it will necessarily come back to the same state after a certain number of iterations, called its period. In contrast, long sequences of truly random numbers are unlikely to be periodic. As a result, every PRNG with a finite state is distinguishable from a source of true randomness, at least in theory.
Legacy produces numbers in the range and has a period of just under ; SplitMix produces numbers in the range and has a period of exactly .
Considering the output range and period of Legacy and SplitMix, we see something interesting: Legacy’s period of roughly contains its range of roughly numbers many times over. For SplitMix, on the other hand, the output range and period are both . SplitMix in fact generates each output in precisely once.
We actually expect repetitions much more often than that: observing numbers where each number in the range occurs exactly once is vanishingly unlikely in a truly random sequence of numbers. This is is a manifestation of the Birthday problem.
And indeed, SplitMix fails a test that checks specifically that repetitions in the output sequence occur as frequently as expected according to the Birthday problem.
How much of an issue is this? It depends on the application. Property tests, for example, don’t benefit from repetitions. And if, as is common, you only need values in a subrange of the full , you will have many repetitions indeed.
Reproducible test environment
While certain PRNG properties like output range and period can be determined “by inspection”, there is no clear definition or test for its quality, that is, how similar to uniformly distributed numbers a PRNG’s output sequence is.
The closest thing we have are collections of tests, often called test batteries, which empirically check if a PRNG’s output sequence has certain properties expected of a random number sequence — like the distance between repetitions, the famous Birthday problem mentioned in the previous section.
To make it easy to test PRNGs, we created a repository with a Nix shell environment to run the most well-known PRNG test batteries.
Nix is required to use the test environment. The easiest way to enter it is using this command:
$ nix-shell https://github.com/tweag/random-quality/archive/master.tar.gz
You can test that it works by running PractRand on a truly terrible “PRNG”:
/dev/zero, which outputs zero bytes. The PractRand binary is called
RNG_test, passing “stdin32” makes it read unsigned 32-bit integers from standard input:
$ RNG_test stdin32 < /dev/zero
PractRand should immediately show a large number of test failures. Don’t use
/dev/zero as a source of random numbers!
While this post is about testing Legacy and SplitMix, the environment we created can test any PRNG written in any language.
To demonstrate this, we have included programs in C, Python, Perl and Lua, each using the language’s standard PRNG to output a stream of unsigned 32-bit integers2.
To see the source code of the generator scripts, run e.g.
cat $(which generate.pl).
You can run these programs as
generate.lua within the Nix environment.
All the statistical tests in this environment consume binary input.
To convert the hexadecimal 32-bit numbers to binary, pipe them through
xxd -r -p, e.g.
$ generate.lua | xxd -r -p.
The environment contains a collection of PRNG test batteries. All of them either support reading from standard input or were wrapped to support this. As a result, all the tests can be run as follows:
- To test a system-provided randomness source, run
$ $TEST < $SOURCE, e.g.
$ RNG_test stdin32 < /dev/random
- To test a program that generates pseudo-random numbers, run
$ $PROGRAM | $TEST, e.g.
$ generate.py | xxd -r -p | RNG_test stdin32
The test batteries include:
PractRand, the most recent PRNG test in the collection. It is adaptive: by default, it will consume random numbers until a significant test failure occurs.
$ RNG_test stdin32
$ RNG_test -h
TestU01, containing the test batteries SmallCrush (
-s), Crush (
-c) and BigCrush (
-b). We have wrapped it in the executable
TestU01_stdin, which accepts input from standard input.
$ TestU01_stdin -s
$ TestU01_stdin -h
For a full list, see the README.
These test batteries test the randomness of individual sequences. But we want to test whether sequences generated by splits are independent or not, which involves at least two sequences at a time.
Rather than coming up with new test batteries dedicated to
Schaathun proposes four sequences built of a
split. These “split sequences”
are traditional (non-splittable) PRNGs built off
split. The split sequences are then fed into regular PRNG tests.
Here is the API for splittable PRNGs again:
init :: Seed -> State next :: State -> (Output, State) split :: State -> (State, State)
A naming convention for the result of
split will come in handy. Let’s say that a generator
g is split into
gR (for “left” and “right”), that is,
(gL, gR) = split g.
Then we get
gLR by applying
(gLL, gLR) = split gL, etc.
Let’s also call
rX the output of
next gX, i.e.
(rLL, _) = next gLL.
This lets us express concisely the first of the sequences proposed by Schaathun:
Sequence S_L: given g, output rL; repeat for g = gR
This sequence recurses into the right generator returned by
split while outputting a number generated by the left generator.
We can visualise the sequence as a binary tree where the nodes are
Asterisk (*) stands for a generated output, ellipsis (…) is where the next iteration of the sequence continues:
split / \ * split / \ * …
The other sequences can be expressed as follows:
Sequence S_R: given g, output rR; repeat for g = gL split / \ split * / \ … *
Sequence S_A: given g, output rR, rLL; repeat for g = gLR split / \ split * / \ * …
Sequence S: given g, output rRLL, rRLR, rRRL, rRRR; repeat for g = gL split / \ … split / \ split split / \ / \ * * * *
We implemented these sequences in a test program which uses Legacy or SplitMix as the underlying PRNG implementation and outputs the random numbers on stdout.
Test results for Legacy and SplitMix
The following table shows our test results. We tested with PractRand and TestU01.
S_R are those discussed above.
The test battery output is usually a p-value per test indicating the probability of the input being random. We summarise the results as “Fail” or “Pass” here. Each table cell links to the full result, which includes the full command used to run them. Since the seeds of the sequences are fixed, you should be able to reproduce the results exactly.
|PRNG / sequence||PractRand result||TestU01 result|
|Legacy / S||Fail (8 MiB)||Fail (SmallCrush)|
|SplitMix / S||Pass (2 TiB)||Pass (Crush)|
|Legacy / S_A||Pass (1 TiB)||Pass (Crush)|
|SplitMix / S_A||Pass (2 TiB)||Pass (BigCrush)|
|Legacy / S_L||Fail (8 GiB)||Fail (Crush)|
|SplitMix / S_L||Pass (2 TiB)||Pass (BigCrush)|
|Legacy / S_R||Fail (64 GiB)||Fail (Crush)|
|SplitMix / S_R||Pass (2 TiB)||Pass (BigCrush)|
In this post, I’ve described how we’ve made sure that the PRNG used in
the newly released version 1.2 of the
random library is of high
quality. The legacy PRNG used up to version 1.1 had a poor quality
implementation of the
split primitive, which we wanted to fix.
To this effect, we created a general-purpose reproducible PRNG
testing environment, which makes it easy to run the most
commonly used PRNG tests using Nix with a minimal amount of setup
work. For testing, we used split sequences to reduce the quality of
the quality of a traditional PRNG.
Melissa O’Neill’s website on PCG, a family of PRNGs, contains a wealth of information on PRNGs and PRNG testing in general.
Security-relevant applications also often require random data. They should not rely on the kinds of PRNGs discussed in this blog post. Instead, security-relevant applications should use cryptographically-secure pseudo-random number generators that have been vetted by the cryptographic community for this purpose.↩
In the case of C, if↩
RAND_MAXis not equal to 2^32, the program will not output the full 32-bit range and will thus fail statistical tests. The other programs should in theory output numbers in the correct range.