Review of High-Quality Random Number Generators
https://doi.org/10.1007/s41781-019-0034-3
REVIEW
Received: 2 March 2019 / Accepted: 7 December 2019 / Published online: 8 January 2020
© The Author(s) 2020
Abstract
This is a review of pseudorandom number generators (RNG’s) of the highest quality, suitable for use in the most demanding Monte Carlo calculations. All the RNG’s we recommend here are based on the Kolmogorov–Anosov theory of mixing in classical mechanical systems, which guarantees, under certain conditions and in certain asymptotic limits, that points on the trajectories of these systems can be used to produce random number sequences of exceptional quality. We outline this theory of mixing and establish criteria for deciding which RNG’s are sufficiently good approximations to the ideal mathematical systems that guarantee highest quality. The well-known RANLUX (at highest luxury level) and its recent variant RANLUX++ are seen to meet our criteria, and some of the proposed versions of MIXMAX can be modified easily to meet the same criteria.
Keywords Pseudorandom numbers · High quality randomness · Kolmogorov–Anosov mixing · Dynamical chaos · RANLUX · MIXMAX
* Frederick James
[email protected]
1 CERN, Geneva, Switzerland

We are concerned here with pseudorandom number generators (RNG’s), in particular those of the highest quality. It turns out to be difficult to find an operational definition of randomness that can be used to measure the quality of a RNG, that is, the degree of independence of the numbers in a given sequence, or to prove that they are indeed independent. The situation for traditional RNG’s (not based on Kolmogorov–Anosov mixing) is well described by Knuth in [1]. The book contains a wealth of information about random number generation, but nothing about where the randomness comes from, or how to measure the quality (randomness) of a generator. Now with hindsight, it is not surprising that all the widely-used generators described there were later found to have defects (failing tests of randomness and/or giving incorrect results in Monte Carlo (MC) calculations), with the notable exception of RANLUX, which Knuth does mention briefly in the third edition, but without describing the new theoretical basis.

High-level scientific research, like many other domains, has become dependent on Monte Carlo calculations, both in theory and in all phases of experiments. It is well known that the MC method is used primarily for calculations that are too difficult or even impossible to perform analytically, so that our science has become dependent to a large extent on the random numbers used in extensive MC calculations. But how do we know if those numbers are random enough? In the early days (1960’s) the RNG’s were so poor that, even with the very slow computers of that time, their defects were sometimes obvious, and users would have to try a different RNG. When the result looked good, it was assumed to be correct, but we know now that all the generators of that period had serious defects which could give incorrect results not easily detected.

As computers got faster and RNG’s got longer periods, the situation evolved quantitatively, but still unacceptable results were occasionally obtained and of course were not published, until 1992, when the famous paper of Ferrenberg et al. [2] showed that the RNG considered at that time to be the best was giving the wrong answer to a problem in phase transitions, while the older RNG’s known to be defective gave the right answer. Since most often we don’t have any independent way to know the right answer, it became
clear that empirical testing of RNG’s, at that time the only known way to verify their quality, was not good enough. Fortunately, the particular problem which was detected by Ferrenberg et al. was soon solved by Martin Lüscher (in [3]), but it became clear that if we were to have confidence in MC calculations, we would need a better way to ensure their quality. Fortunately, the theory of Mixing, outlined below, now offers this possibility.

The experience gained from developing, using and discovering defects in many RNG’s has taught us some lessons which we summarise here (they are explained in detail in [1]):

1. The period should be much longer than any sequence that will be used in any one calculation, but a long period is not sufficient to ensure lack of defects.
2. Empirical testing can demonstrate that a RNG has defects (if it fails a test), but passing any number of empirical tests can never prove the absence of defects.
3. Making an algorithm more complicated (in particular, combining two or more methods in the same algorithm) may make a better RNG, but it can also make one much worse than a simpler component method alone if the component methods are not statistically independent.
4. It is better to use a RNG which has been studied, whose defects are known and understood, than one which looks good but whose defects are not understood.
5. There is no general method to determine how good a RNG must be for a particular MC application. The best way to ensure that a RNG is good enough for a given application is to use one designed to be good enough for all applications.

The Theory of Mixing in Classical Mechanical Systems

It has been known, at least since the time of Poincaré, that classical dynamical systems of sufficient complexity can exhibit chaotic behaviour, and numerous attempts have been made to make use of this “dynamical chaos” to produce random numbers by numerical algorithms which simulate mechanical systems. It turns out to be very difficult to find an approach which produces a practical RNG, fast enough and accurate enough for general MC applications. To our knowledge, only two such attempts have been successful, both based on the same representation and theory of dynamical systems.

This theory grew out of the study of the asymptotic behaviour of classical mechanical systems that have no analytic solutions, developed largely in the Soviet Union around the middle of the twentieth century by Kolmogorov, Rokhlin, Anosov, Arnold, Sinai and others. See, for example, Arnold and Avez [4] for the theory. At the time, these mathematicians were certainly not thinking of RNG’s, but it turns out that their results can be used to produce sequences of random numbers that have some of the properties of the trajectories of the dynamical systems (see Savvidy [5] and Savvidy [6]). The property of interest here is called Mixing, and is usually associated with the names Kolmogorov and Anosov. Mixing is a well-defined concept in the theory, and will be seen to correspond quite exactly to what is usually called independence or randomness.

The representation of dynamical systems appropriate for our purposes is the following:

x(i + 1) = A × x(i) mod 1,   (1)

where x(i) is the N-vector of real numbers specifying completely the state of the system at time i, and A is a (constant) N × N matrix which can be thought of as representing the numerical solution to the equations of motion. The N-dimensional state space is a unit hypercube which, because of the modulo function, becomes topologically equivalent to an N-dimensional torus by identifying opposite faces. The vectors x represent points along the continuous trajectory of the abstract dynamical system in N-dimensional phase space.

All the elements of the matrix A are integers, and the determinant of A must be one. This ensures that A is invertible and the elements of A⁻¹ are also integers. The theory is intended for high- (but finite) dimensional systems (1 ≪ N < ∞). In practice N will be between 8 and a few hundred.

Mixing and the Ergodic Hierarchy

Let x(i) and x(j) represent the state of the dynamical system at two times i and j. Furthermore, let v1 and v2 be any two subspaces of the entire allowed space of points x, with measures (volumes relative to the total allowed volume), respectively 𝜇(v1) and 𝜇(v2). Then the dynamical system is said to be a 1-system (with 1-mixing) if

P(x(i) ∈ v1) = 𝜇(v1)

and a 2-system (with 2-mixing) if

P(x(i) ∈ v1 and x(j) ∈ v2) = 𝜇(v1) 𝜇(v2),

for all i and j sufficiently far apart, and for all subspaces vi. Similarly, an n-system can be defined for all positive integer values n. We define a zero-system as having the ergodic property (coverage), namely that the state of the system will asymptotically come arbitrarily close to any point in the state space.

Putting all this together, we have that asymptotically:
• A system with zero-mixing covers the entire state space.
• A system with one-mixing covers uniformly.
• A system with two-mixing has 2 × 2 independence of points.
• A system with three-mixing has 3 × 3 independence.
• etc.

Finally, a system with n-mixing for arbitrarily large values of n is said to have K-mixing and is a K-system. It is a result of the theory that the degrees of mixing form a hierarchy [7], that is, a system which has n-mixing for any value of n also has i-mixing for all i < n. There are additional systems, in particular Anosov C-systems and Bernoulli B-systems, which are also K-systems, but K-systems are sufficient for our purposes.

Now the theory tells us that a dynamical system represented by Eq. (1) will be a K-system if the matrix A has determinant equal to one and eigenvalues 𝜆_i, all of which have moduli |𝜆_i| ≠ 1.

The Eigenvalues of A

We have seen that to obtain K-mixing, none of the eigenvalues of A should lie on the unit circle in the complex plane. In fact, in order to obtain sufficient mixing as early as possible (recall that complete mixing is only an asymptotic property), it is desirable to have the eigenvalues as far as possible from the unit circle.

An important measure of this distance is the Kolmogorov entropy h:

h = ∑_{k : |𝜆_k| > 1} ln |𝜆_k|,

where the sum is taken over all eigenvalues with absolute values greater than 1 [it is also equal to the sum over all eigenvalues with absolute values less than 1, but then it changes sign]. As its name implies, it is analogous to thermodynamic entropy as it measures the disorder in the system, and it must be positive for an asymptotically chaotic system [this actually follows from the definition if all |𝜆| ≠ 1].

Another important measure is the Lyapunov exponent, defined in the following section.

The Divergence of Nearby Trajectories

The mechanism by which mixing occurs in K-systems can be observed “experimentally” by noting the behaviour of two trajectories which start at nearby points in state space. Using the same matrix A, let us start the generator from two different nearby points a(0) and b(0), separated by a very small distance 𝛿(a(0), b(0)). The distance 𝛿 is defined by

𝛿(a, b) = max_𝜅 d_𝜅,   d_𝜅 = min{ |a_𝜅 − b_𝜅|, 1 − |a_𝜅 − b_𝜅| },   (2)

where 𝜅 runs over the N components of the vectors indicated. Note that the distance defined in this way is a proper distance measure and has the property 0 ≤ 𝛿 ≤ 1/2. Now we use Eq. (1) on a(0) and b(0) to produce a(1) and b(1), and calculate 𝛿(a(1), b(1)). Then we continue this process to produce two series of points a(i) and b(i), and a set of distances 𝛿(i), for i = 1, 2, 3, … until the 𝛿 reach a plateau at the value expected for truly random points, which according to Lüscher is 𝛿 = 12/25 for RANLUX. Then it is a well-known result of the theory that, if A represents a K-system, the distances 𝛿(i) will diverge exponentially with i, so that if plotted on a logarithmic scale, the points 𝛿(i) vs. i should lie on a straight line. The inevitable scatter of points should be reduced by averaging the 𝛿 over different starting pairs (a(0), b(0)) (the b(0) must of course always be the same very small distance from the a(0), but in different directions).

The rate of divergence of nearby trajectories (𝜈, the Lyapunov exponent) is equal to the logarithm of the modulus of the largest eigenvalue:

𝜈 = ln |𝜆|_max,

which, for a K-system, should be the slope of the straight line described above.

Decimation

The asymptotic independence of a and b is guaranteed by the mixing (if it is a K-system), but as long as 𝛿(i) remains small, a(i) and b(i) are clearly correlated. The point where the straight line of divergence of nearby trajectories reaches the plateau of constant 𝛿 indicates the number of iterations m required to make the K-system “sufficiently asymptotic”, in the sense that the points a and b generated on the following iteration are on average as far apart as independent points would be. We may call this criterion the “2-mixing criterion”, since it apparently assures that 2 × 2 correlations due to nearby trajectories will be negligible. The question whether this criterion is sufficient to eliminate higher-order correlations is important and will be discussed below in connection with RANLUX++.

Some plots of divergence for real RNG’s are given below. If the K-system is used to generate random numbers, then to eliminate the correlations due to nearby trajectories, after one vector a(i) is delivered to the user, the following m vectors a(i+1), a(i+2), … a(i+m) must be discarded before the next vector a(i+m+1) is delivered to the user (decimation). It has become conventional to characterise the degree of decimation by the integer p, defined such that after delivering a vector of N random numbers to the user, p − N numbers are skipped before the next N numbers are delivered [3, 8].
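As an illustration (ours, not from the paper), the divergence of nearby trajectories and the torus distance of Eq. (2) can be demonstrated with the simplest integer map of the form of Eq. (1): Arnold’s cat map, with N = 2, A = [[2, 1], [1, 1]], det A = 1 and largest eigenvalue (3 + √5)/2, so 𝜈 = ln|𝜆|_max ≈ 0.962. The starting points and tolerances below are arbitrary choices for the sketch:

```python
import math

# Arnold's cat map: a 2-D instance of Eq. (1), x(i+1) = A x(i) mod 1,
# with integer A, det A = 1 and no eigenvalue on the unit circle.
A = [[2, 1], [1, 1]]
lam_max = (3 + math.sqrt(5)) / 2      # largest eigenvalue of A
nu = math.log(lam_max)                # Lyapunov exponent, nu = ln|lambda|_max

def step(x):
    """One iteration of Eq. (1)."""
    return [(A[0][0] * x[0] + A[0][1] * x[1]) % 1.0,
            (A[1][0] * x[0] + A[1][1] * x[1]) % 1.0]

def delta(a, b):
    """Distance of Eq. (2) on the torus: max over the components kappa."""
    return max(min(abs(ak - bk), 1.0 - abs(ak - bk)) for ak, bk in zip(a, b))

# Two nearby starting points, separated by 1e-12 in one coordinate.
a, b = [0.3, 0.6], [0.3 + 1e-12, 0.6]
dists = []
for i in range(40):
    a, b = step(a), step(b)
    dists.append(delta(a, b))

# While delta is small, ln(delta) grows by ~nu per iteration (the straight
# line on a log plot); after enough iterations delta saturates below 1/2.
growth = [dists[i] / dists[i - 1] for i in range(1, 10)]
print([round(math.log(g), 3) for g in growth], "expected slope ~", round(nu, 3))
```

The plateau reached by `dists` plays the role of the asymptotic value discussed above, and the iteration at which it is reached is the m that decimation must discard.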
From the Theory to the Discrete RNG

Equation (1) will be used directly to generate random numbers, where x(0) will be the N-dimensional seed, and each successive vector x(i) will produce N random numbers. However, in the theory, x is a vector of real numbers, continuous along the unit line. The computer implementation must approximate the real line by discrete rational numbers, which is valid provided the finite period is sufficiently long, and the rational numbers are sufficiently dense, so that the effects of the discreteness are not detectable. Thus the computer implementation has access only to a rational sublattice of the continuous state space, and we must confirm that the discrete approximation preserves the mixing properties of the continuous K-system. Fortunately, the divergence of nearby trajectories offers this possibility, since a theorem usually attributed to Pesin [7] states that a dynamical system has positive Kolmogorov entropy and is, therefore, K-mixing if and only if nearby trajectories diverge exponentially. Then we can expect the discrete system to be K-mixing only if the same condition is satisfied. This is discussed in more detail below in connection with Extended MIXMAX.

The Period

The most obvious difference between continuous and discrete systems is that continuous systems can have infinitely long trajectories, whereas a computer RNG must always have a finite period. This means that a trajectory ‘eats up’ state space as it proceeds, since it can never return to a state it has previously occupied without terminating its period. The fact that the available state space becomes progressively smaller necessarily indicates a defect in the RNG, but this defect is easily seen to be undetectable if the period is long enough. According to Maclaren [9], when the period is P, using more than P^(2/3) numbers from the sequence “leads to excessive uniformity compared to a true random sequence.” This is compatible with the RNG folklore which sets the usable limit at √P.

Summary of Criteria for Highest Quality

We will consider as candidates for highest quality RNG’s only those based on the theory of chaos in classical mechanical systems, for the simple reason that we know of no other class of systems which can offer (even in some limit which may or may not be attainable) the uniform distribution and lack of correlations guaranteed by K-mixing. To be precise, our criteria are the following:

1. Matrix A must have eigenvalues away from the unit circle, and determinant = 1.
2. Kolmogorov entropy must be positive (follows from the above).
3. The discrete algorithm must have points sufficiently dense to accurately represent the continuous system.
4. Divergence of nearby trajectories must be exponential (follows from the above).
5. Decimation must be sufficient to assure that the average distance between successive vectors is the expected distance between independent points.
6. Period must be long enough, > 10^100.
7. Some practical criteria: double precision available, portable, repeatable, independent sequences possible.

The High-Quality RNG’s: 1. RANLUX

The first widely-used RNG to offer reliably random numbers was Martin Lüscher’s RANLUX, published in 1994. He considered the RNG proposed by Marsaglia and Zaman [10], installed many years ago at CERN with the name RCARRY, and now known variously as RCARRY, SWB (subtract with borrow) or AWC (add with carry). He discovered that, if the carry bit was neglected (see below), this RNG had a structure that could be represented by Eq. (1), and was, therefore, possibly related to a K-system.

The SWB (RCARRY) Algorithm

SWB operates on an internal vector of 24 numbers, each with a 24-bit mantissa, and each call to the generator produces one random number, which is produced by a single arithmetic operation (addition or subtraction) operating on two of the numbers in the internal vector. Then this random number replaces one of the entries in the internal vector. Lüscher realised that if SWB is called 24 times in succession it generates a 24-vector which is related to the starting 24-vector by (1), and he could determine the matrix A which would reproduce almost the same sequence as SWB. The only difference would be due to the “carry bit”, which is necessary for attaining a long period, but affects only the least significant bit, so it does not alter significantly the mixing properties of the generator. The carry bit is described in detail, both in the paper of Marsaglia and Zaman [10] and that of Lüscher [3].

SWB: The Lyapunov Exponent and Decimation

The eigenvalues of this “equivalent matrix” were seen to satisfy the conditions for a K-system, but the RNG nevertheless failed several standard tests of randomness. Plotting the evolution of the separation of nearby trajectories immediately indicated the reason for the failure and how to fix it.
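The SWB (RCARRY) recurrence described above is simple enough to sketch in a few lines. The following is an illustration based on the description in the text and on Marsaglia and Zaman [10], using the standard lags r = 24 and s = 10; it is not the production RANLUX code, and the seeding shown is an arbitrary choice for the sketch:

```python
# Subtract-with-borrow (SWB / RCARRY), the generator underlying RANLUX.
# State: 24 numbers with 24-bit mantissas plus one carry bit.
# Each call performs one subtraction: x(n) = x(n-10) - x(n-24) - carry mod 1.
TWOM24 = 2.0 ** -24            # one unit in the last place of a 24-bit mantissa

class RCarry:
    def __init__(self, seeds):
        assert len(seeds) == 24
        self.x = list(seeds)   # ring buffer of the last 24 numbers
        self.i24, self.j24 = 23, 9   # positions of the lag-24 and lag-10 entries
        self.carry = 0.0

    def next(self):
        delta = self.x[self.j24] - self.x[self.i24] - self.carry
        if delta < 0.0:        # borrow: wrap into [0,1) and set the carry bit
            delta += 1.0
            self.carry = TWOM24
        else:
            self.carry = 0.0
        self.x[self.i24] = delta          # new number replaces the oldest entry
        self.i24 = (self.i24 - 1) % 24
        self.j24 = (self.j24 - 1) % 24
        return delta

# Arbitrary illustrative seeding: 24 distinct 24-bit fractions.
seeds = [(69069 * (k + 1) % 2 ** 24) * TWOM24 for k in range(24)]
gen = RCarry(seeds)
vals = [gen.next() for _ in range(10000)]
print(sum(vals) / len(vals))   # roughly 0.5 for a uniform generator
```

This is the generator whose equivalent matrix satisfies the K-system conditions; RANLUX then adds the decimation, delivering 24 numbers and discarding the next p − 24.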
… itself is defective.¹

A few years before the publication of RANLUX, George Savvidy et al. in Erevan [5] were working on a different approach to the same problem using the same theory of mixing. Their approach was to look for a family of matrices A which could be defined for any dimension N, having eigenvalues far from the unit circle and, therefore, large Lyapunov exponents for different values of N, to reduce or even eliminate the need for decimation.

MIXMAX: The Matrix, the Algorithm, and Decimation

The family of matrices which they found was (for N ≥ 3)

    ⎛ 1    1    1    1   ⋯  1  1 ⎞
    ⎜ 1    2    1    1   ⋯  1  1 ⎟
    ⎜ 1   3+s   2    1   ⋯  1  1 ⎟
A = ⎜ 1    4    3    2   ⋯  1  1 ⎟   (3)
    ⎜ ⋮    ⋮    ⋮    ⋮   ⋱  ⋮  ⋮ ⎟
    ⎜ 1   N−1  N−2  N−3  ⋯  2  1 ⎟
    ⎝ 1    N   N−1  N−2  ⋯  3  2 ⎠

where the “magic” integer s is normally zero, but for some values of N, s = 0 would produce an eigenvalue |𝜆_i| = 1, in which case a different small integer must be used.

The straightforward evaluation of the matrix-vector product in Eq. (1) requires O(N²) operations to produce N random numbers, making it hopelessly slow compared with other popular RNG’s, so a faster algorithm would be needed. After considerable effort, they reduced the time to O(N ln N), much faster but still too slow.

George Savvidy’s son Konstantin, also a theoretical physicist, found the algorithm [15] that reduced the computation time to O(N), making it competitive in speed with the fastest generators and, very important, the time to generate one number would now be essentially independent of N.

Considerable work remained, the most difficult being the determination of the period of the new generator for all interesting values of N. The periods are so long that this requires the use of finite (but very large) Galois fields.

By 2014, the essential problems were solved, MIXMAX was being tested at different sites including CERN, and Konstantin Savvidy published the paper [15] giving some of the important properties of MIXMAX for 13 different values of the dimension N from 10 to 3150. The implementation uses 64-bit integer arithmetic internally, so that the user always gets standard double-precision floating-point numbers with 53-bit mantissas. All the periods are longer than 10^165, so that is not a problem. The paper gives the results of the Big Crush test of the TestU01 package [16] for 13 values of N, and the test is failed for N ≤ 64 but passes for N ≥ 88.² In practice, N = 256 became the default value, probably because of the results of these tests.

As expected, most of the eigenvalues of the Savvidy matrix are far from the unit circle (compared with those of RANLUX), although there are always a few close to one. For some values of N (including N = 256) the “magic” integer s must be invoked as described above. The moduli of the largest eigenvalues are typically of order N. For large N, the smallest eigenvalues cluster around |𝜆| = 1/4.

Thus, the Kolmogorov entropy is greater than that of RANLUX, indicating a faster convergence toward the asymptotic independence that is guaranteed by the mixing property. To demonstrate clearly that mixing is indeed occurring, one can plot the divergence of nearby trajectories as was done for RANLUX, and verify that the divergence is exponential as it should be for a continuous K-system. Then the same plot can be used to determine how much (if any) decimation is required to eliminate the correlation of nearby trajectories.

Fig. 2 Divergence of nearby trajectories for MIXMAX 1.0, N = 256 (fitted slope 11.8 ± 0.4 in log₂, compared with log₂(𝜆_max) = 11.6)

¹ Several defects have been reported in RANLUX, but in each case it turned out that the test procedure was incorrect or incorrectly applied.
² For N = 64, the failure of the Big Crush test is termed “only marginal”, but the probability of 𝜒² ≥ 372 for 232 degrees of freedom is 1.12 × 10⁻⁸.
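The matrix of Eq. (3) is easy to construct, and the two structural properties claimed for it above (integer entries and determinant exactly one, for any N ≥ 3 and any s) can be checked directly with exact arithmetic. The following sketch is ours, for illustration only; it is not the MIXMAX implementation:

```python
from fractions import Fraction

def savvidy_matrix(N, s=0):
    """Build the N x N matrix of Eq. (3): first column and first row all ones,
    entry (i, j) = i - j + 2 for 2 <= j <= i (1-indexed), 1 elsewhere,
    with the "magic" integer s added at entry (3, 2)."""
    A = [[1] * N for _ in range(N)]
    for i in range(1, N):              # 0-indexed rows 1..N-1 (rows 2..N)
        for j in range(1, i + 1):      # 0-indexed cols 1..i (cols 2..i+1)
            A[i][j] = i - j + 2
    A[2][1] += s                       # the 3+s entry
    return A

def det(M):
    """Exact determinant via Gaussian elimination over the rationals."""
    M = [[Fraction(v) for v in row] for row in M]
    n, sign, d = len(M), 1, Fraction(1)
    for c in range(n):
        piv = next(r for r in range(c, n) if M[r][c] != 0)
        if piv != c:
            M[c], M[piv] = M[piv], M[c]
            sign = -sign
        d *= M[c][c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            M[r] = [a - f * b for a, b in zip(M[r], M[c])]
    return sign * d

print(det(savvidy_matrix(8)))   # prints 1
```

Subtracting the first row from every other row leaves a lower-triangular minor with unit diagonal, which is why the determinant is one regardless of N and s; s only shifts the eigenvalues.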
Fig. 3 Divergence of nearby trajectories for MIXMAX 1.0, N = 10, 16, 44, 88, 256, 1000 (distance in log₂ vs. iteration number)

Unfortunately, there was nothing in the published paper [15] on the divergence of nearby trajectories and no mention of decimation. At CERN, we were mainly interested in getting a RNG of the highest quality with no known defects, so we did our own calculation of the divergence of nearby trajectories for N = 256 as shown in Fig. 2. It can be seen that (1) the trajectories do indeed diverge exponentially and (2) after five iterations, the average distance between two nearby trajectories has almost reached the asymptotic value (slightly less than the maximum value of 1/2). Note that as long as the average distance between trajectories is significantly less than the asymptotic value, this is direct evidence of a defect, since it indicates that the trajectories still “remember where they came from”, which would not be the case if the points were independent. But this generator passed the Big Crush test, so we see clearly that the divergence of trajectories is more sensitive to this defect than Big Crush.

Therefore, MIXMAX 256 requires decimation, but less than RANLUX, making it a candidate for the world’s fastest high-quality RNG. This version was introduced at CERN as a standard generator for ROOT, with the possibility to choose different levels of decimation.

We have also considered the properties of MIXMAX for other values of the dimension N. In particular, we have looked at the divergence of nearby trajectories for some values of N for which the period was published in [15]. These curves are shown in Fig. 3 for N = 10, 16, 44, 88, 256 and 1000. (We have also tried N = 44000, but the overheads involved in handling such a large matrix are not worth the possible gains.) It is seen that for all these values of N, the divergence is remarkably exponential, not only for small separations as required by the theory, but for all separations right up to the expected asymptotic value, which is always close to 1/2. In addition, the slope of the exponential is in all cases equal to the maximum Lyapunov exponent as predicted by the theory. And since the Lyapunov exponent increases with N, the decimation required for maximum separation decreases with N. However, there does not seem to be much to be gained in going to values of N > 256, especially since the overhead of initialization and memory usage would then start to become significant.

With the possible exception of N = 10 (which some may consider too small) all the generators shown in Fig. 3 satisfy all our criteria for highest quality RNG’s with no known defects, provided of course that appropriate decimation is applied. This decimation can be read off the figure and is given in Table 1.

MIXMAX also offers the possibility to seed the generator at different faraway points in such a way as to avoid overlapping with sequences used in other calculations. So it satisfies all our criteria for highest quality as long as appropriate decimation is applied.

However, the developers of MIXMAX were already working on an extended MIXMAX which hopefully would not require any decimation.

The High-Quality RNG’s: 3. Extended MIXMAX

The search for a generating matrix with larger entropy (for faster convergence to asymptotic mixing) consists in looking for transformations of A which do not change the determinant, and do not interfere with the algorithm for fast multiplication, but do move the eigenvalues further away from the unit circle. For the purposes of this section, it will be convenient to rewrite Eq. (1) as it is implemented in the computer algorithm, using explicitly integer arithmetic:

a(i + 1) = A × a(i) mod p,   (4)

where a(i) is now an N-vector of 60-bit integers and p = 2^61 − 1, so the real x of Eq. (1) are now approximated by the rational a / p. As before, A is an N × N matrix of …
Fig. 4 Divergence of nearby trajectories for extended MIXMAX for N = 8, m ≈ 2^36 (distance in log₂)

Fig. 5 Divergence of nearby trajectories for extended MIXMAX for N = 17, m ≈ 2^36 (distance in log₂)
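The integer form of the recurrence, Eq. (4), can be sketched directly. The following illustration is ours: it uses the straightforward O(N²) matrix-vector multiplication with the Eq. (3) matrix and an arbitrary seed vector, not the O(N) update or the seeding of the real MIXMAX:

```python
# Integer recurrence of Eq. (4): a(i+1) = A a(i) mod p, with p = 2^61 - 1.
# The user-visible random numbers are the rationals a/p, taken as doubles.
P = 2 ** 61 - 1

def savvidy_entry(i, j, s=0):
    """Entry (i, j) of the Eq. (3) matrix, 1-indexed."""
    if j == 1:
        return 1
    v = i - j + 2 if j <= i else 1
    return v + s if (i, j) == (3, 2) else v

def step(a, N, s=0):
    """One O(N^2) matrix-vector multiplication modulo p (illustration only)."""
    return [sum(savvidy_entry(i, j, s) * a[j - 1] for j in range(1, N + 1)) % P
            for i in range(1, N + 1)]

N = 16
state = list(range(1, N + 1))       # an arbitrary small seed vector
for _ in range(3):                  # three iterations of Eq. (4)
    state = step(state, N)
uniforms = [x / P for x in state]   # the N delivered numbers in [0, 1)
print(min(uniforms), max(uniforms))
```

Because every operation is exact integer arithmetic modulo p, the discrete trajectory lies exactly on the rational sublattice discussed above, which is what makes the comparison with the continuous system meaningful.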
… Lyapunov exponent, as was the case for the original MIXMAX. The same plots for extended MIXMAX for even larger values of m (but still keeping Nm < p) show a divergence so fast that it reaches the asymptotic value on the first step, so it is impossible to see whether the divergence is exponential.

We have tried to find the reason for this behaviour. The obvious suspect is the large values of some elements of A, which have caused the extended algorithm to lose the mixing properties of the continuous K-system. One upper limit on the allowed magnitude of elements of A is given by the developers of MIXMAX in [17]:

    It is most advantageous to take large values of m, but preferably keeping Nm < p, such as to have an unambiguous correspondence between the continuous system and the discrete system on the rational sublattice.

The condition Nm < p is certainly necessary as indicated in the above quotation, but the values of mN in our examples already satisfy that criterion, so there must be an additional explanation having to do with the multiplication of large integers (modulo p).

Let us look first at the continuous system of Eq. (1) and consider for example just one component of the matrix-vector multiplication in the continuous system, that is, how the ith component of the new vector x(t + 1) depends on the jth component of the previous vector x_j(t):

x_i(t + 1) = A_ij x_j(t) mod 1.

When the integer A_ij is very big, this equation describes a function that loops rapidly around the torus, as indicated by the diagonal lines in Fig. 7, and when A_ij is equal to p, there …

Although originally the spectral test was applicable only to simple LCG’s, it has been extended more recently to more complex LCG’s, and L’Ecuyer et al. have applied it [19] to extended MIXMAX, which can be analyzed as a matrix LCG. They find that the lattice structure in certain dimensions (notably in 5 dimensions for N = 240 and m = 2^51 + 1) is “very bad”, all points lying on a number of hyperplanes much smaller than should be the case for their p = 2^61 − 1. They also point out the existence of “simple linear relations” between coordinates of successive points which indicate unwanted correlations. Both problems remain even when the guilty coordinates are skipped in the output.

These problems were known to us but we did not consider them serious, because they were both eliminated (at least in RANLUX) by decimation as described above in this paper. Note that the skipping of coordinates performed by L’Ecuyer et al. in [19] (and also by the MIXMAX developers in [18]) is very different from the decimation proposed in RANLUX. L’Ecuyer et al. discard certain coordinates of every vector generated, whereas Lüscher discards whole vectors at a time. The latter is justified by the asymptotic nature of K-mixing, so the fact that it is effective for RANLUX is further indication that K-mixing is indeed occurring. For extended MIXMAX, we cannot use the theory of K-mixing to justify decimation.

The High-Quality RNG’s: 4. RANLUX++

As mentioned above in connection with the spectral test, the RANLUX algorithm is known to be equivalent to a linear congruential generator (LCG) with an enormous multiplier. This fact was used by Lüscher [3] to apply the spectral test to RANLUX, but otherwise seemed to be of little practical interest, since the computers of that time were unable to do long multiplications fast enough to be useful. More recently, however, Sibidanov [8] has produced a version called RANLUX++, implemented using the linear congruential algorithm with a clever way of doing long integer multiplication on modern computers, which makes it faster than standard RANLUX and does the decimation at the same time. The algorithm of Sibidanov cannot be programmed efficiently in the usual high-level languages, but is possible in assembler code using operations he says are now available on most computers used for scientific computing.

The heart of the new software is the procedure to multiply two 576-bit integers using 81 multiplications of 64 × 64 bits, cleverly organised to be extremely fast using modern extensions of arithmetic and fetching operations described in Sibidanov [8]. For our purposes, there are two important features of this method: first, that it is able to reproduce the sequences of RANLUX much faster than the high-level language versions, and second, that it can produce more decimation (higher luxury levels) than the predefined levels in RANLUX, without increasing the generation time.

To understand the importance of this second feature, we should look more closely at the mechanism used to implement RANLUX with the LCG algorithm. Using the notation of Sibidanov, one random number x(i + 1) is generated by

x(i + 1) = a ⋅ x(i) + c mod m.   (6)

With the parameters corresponding to RANLUX, the base b = 2^24, the modulus m = b^24 − b^10 + 1, the multiplier a = m − (m − 1)/b, and c = 0.

Decimation at level p in RANLUX is implemented by delivering 24 numbers to the user, then throwing away p − 24 numbers before delivering the next 24. In RANLUX++, the decimation is called “skipping” [8], because it is implemented directly by

x(i + p) = (a^p mod m) ⋅ x(i) mod m,   (7)

where a^p mod m is a precomputed very long integer constant. As expected, this makes generation in RANLUX++ much faster than that of the original RANLUX. But the surprise comes when we look at the values of a^p mod m for different values of p corresponding to the standard luxury levels, as well as two new higher levels, p = 1024 and p = 2048, shown in Table 1 of [8].

Here, we see that as the luxury level increases, the obvious regularity in the factor a^p mod m decreases, but is still clearly present for the value of p corresponding to the highest standard luxury level of RANLUX, p = 389. Sibidanov remarks that apparently it would be advantageous to apply still more decimation, at least to p = 1024 or p = 2048, which can be obtained “for free”. In this context, we have applied a simple test of randomness to the hex patterns, based on counting runs of identical hex digits, for the factors a^p mod m for the two new values of p. We find that they both pass the test for expected numbers of 2-runs, 3-runs and higher, which would indicate that the patterns are already as random for p = 1024 as expected for a truly random sequence of 144 hex digits. So, if the randomness of a^p mod m is an indication of the degree of mixing, p = 389 is not enough, but p = 1024 is already fully mixing.

Conclusions

We now have several high-quality RNG’s available for general use based on the theory of Mixing in classical dynamical systems.

RANLUX

This easily portable and well-documented RNG exists in both single- and double-precision versions, both of which satisfy our criteria for K-mixing. At the highest standard luxury level (currently level 2, p = 397 for the single-precision version, p = 794 for the double-precision version), it has for many years been considered the most reliable RNG available, but too slow for some applications. Recent improvements in code optimization have speeded it up considerably, so it should now be acceptable for most applications (see timings below).

However, new evidence from RANLUX++ indicates that even more decimation, up to p = 1024, may be needed to eliminate the higher-order correlations. This level of decimation would make RANLUX slower, but is obtained “for free” with RANLUX++.

MIXMAX

We consider only the original MIXMAX with small s and no m, described in [15]. This RNG produces always dou-
ble-precision floating-point numbers, and the user has the
3
Note that this p is not related to the symbol p used in the discrete freedom to choose N, the size of the generating matrix A. In
approximation of x by a / p. Since no confusion is possible, we prefer
practice one will choose a value of N for which the period is
to use the same notation as the original authors of the algorithms.
4 known, tabulated in [15]. The code is maintained and avail-
The values of p corresponding to different luxury levels have
changed, so the value used by Sibidanov (389) is slightly different able under HepForge.
from the value for the current highest luxury level for the 24-bit algo- For users who require the highest quality random num-
rithm (397), but the difference is not significant. bers, it will be necessary to introduce into MIXMAX the
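The relations (6) and (7) are easy to check with arbitrary-precision arithmetic. The following sketch (ours, in Python; not taken from the RANLUX++ code, and the function names are hypothetical) constructs b, m and a as given above and verifies that one multiplication by the precomputed constant a^p mod m, as in Eq. (7), reproduces p applications of the recurrence (6):

```python
# Sketch of the LCG underlying RANLUX (Eq. 6) and the "skipping"
# of RANLUX++ (Eq. 7), using Python's built-in big integers.
b = 2**24
m = b**24 - b**10 + 1          # a 576-bit modulus
a = m - (m - 1) // b           # multiplier; (m - 1) is divisible by b

def lcg_step(x):
    """One step of Eq. (6): x(i+1) = a * x(i) mod m (c = 0)."""
    return (a * x) % m

def skip(x, p):
    """Eq. (7): jump ahead p steps with a single long multiplication
    by the constant a^p mod m (precomputed once per luxury level)."""
    ap = pow(a, p, m)
    return (ap * x) % m

# Jumping ahead p = 389 steps agrees with 389 single steps.
x = 1
y = x
for _ in range(389):
    y = lcg_step(y)
assert y == skip(x, 389)
```

In a real implementation a^p mod m is of course stored, not recomputed, so the cost per delivered number is independent of p; this is why the higher decimations come "for free".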
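The figure of 81 multiplications quoted above comes from viewing each 576-bit operand as nine 64-bit limbs, so that a schoolbook product requires 9 × 9 limb products. The Python sketch below (ours; Sibidanov's assembler version is far more sophisticated and also folds in the reduction mod m) illustrates only this limb structure, checked against Python's big integers:

```python
import random

# Schoolbook multiplication of two 576-bit integers as nine 64-bit
# limbs each: 9 x 9 = 81 products of 64 x 64 bits.
MASK64 = (1 << 64) - 1

def to_limbs(x, n=9):
    """Split x into n 64-bit limbs, least significant first."""
    return [(x >> (64 * i)) & MASK64 for i in range(n)]

def mul_576(x, y):
    """576-bit x 576-bit -> 1152-bit product via 81 limb products."""
    xs, ys = to_limbs(x), to_limbs(y)
    acc = [0] * 18                   # 18 result limbs (1152 bits)
    for i in range(9):               # 81 limb products in total
        for j in range(9):
            acc[i + j] += xs[i] * ys[j]
    # Propagate carries and reassemble the full product.
    result, carry = 0, 0
    for k in range(18):
        s = acc[k] + carry
        result |= (s & MASK64) << (64 * k)
        carry = s >> 64
    return result

u = random.getrandbits(576)
v = random.getrandbits(576)
assert mul_576(u, v) == u * v
```

The speed of the assembler version rests on issuing these limb products and their carry chains with the wide-multiply and carry instructions of recent processors, which the sketch does not attempt to model.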
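As an illustration of the runs test described above (our toy reconstruction, not necessarily the exact procedure applied in the text), one can write a^p mod m as 144 hex digits and count maximal runs of identical digits, comparing the counts with the rough expectation for independent, uniformly distributed hex digits:

```python
# Count maximal runs of identical hex digits in a^p mod m.
b = 2**24
m = b**24 - b**10 + 1
a = m - (m - 1) // b

def run_lengths(s):
    """Lengths of maximal runs of identical characters in s."""
    runs, count = [], 1
    for prev, cur in zip(s, s[1:]):
        if cur == prev:
            count += 1
        else:
            runs.append(count)
            count = 1
    runs.append(count)
    return runs

for p in (389, 1024, 2048):
    digits = format(pow(a, p, m), "0144x")   # 576 bits = 144 hex digits
    n2 = sum(1 for r in run_lengths(digits) if r >= 2)
    # For i.i.d. hex digits, a run of length >= 2 starts at a given
    # position with probability roughly (1/16) * (15/16).
    print(f"p = {p:4d}: {n2} runs of length >= 2 "
          f"(roughly {143 * 15 / 256:.1f} expected for random digits)")
```

A serious version of such a test would of course compare the full distribution of run lengths against the multinomial expectation, not just the count of 2-runs.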
12. James F (1994) RANLUX: a Fortran implementation of the high-quality pseudorandom number generator of Lüscher. Comput Phys Commun 79:111
13. Hamilton KG, James F (1997) Acceleration of RANLUX. Comput Phys Commun 101:241–248
14. Shchur LN, Butera P (1998) The RANLUX generator: resonances in a random walk test. Int J Mod Phys C 9:607–624. arXiv:hep-lat/9805017
15. Savvidy K (2015) The MIXMAX random number generator. Comput Phys Commun 196:161–165. https://doi.org/10.1016/j.cpc.2015.06.003. arXiv:1404.5355
16. L'Ecuyer P, Simard R (2007) TestU01: a software library in ANSI C for empirical testing of random number generators. ACM Trans Math Softw 33:22
17. Savvidy K, Savvidy G (2016) Spectrum and entropy of C-systems. MIXMAX random number generator. Chaos Solitons Fractals 91:33. arXiv:1510.06274
18. Martirosyan N, Savvidy K, Savvidy G (2019) Spectral test of the MIXMAX random number generators. arXiv:1806.05243v2
19. L'Ecuyer P, Wambergue P, Bourceret E (2017) Spectral analysis of the MIXMAX random number generators. Submitted to INFORMS J Comput