Good Random Number Generators Are (Not So) Easy To Find
P. Hellekalek
Dept. of Mathematics, Salzburg University, Hellbrunner Straße 34, A-5020 Salzburg, Austria
Abstract
Every random number generator has its advantages and deficiencies. There are no ``safe'' generators. The practitioner's problem is how to decide which random number generator will suit his needs best. In this paper, we will discuss criteria for good random number generators: theoretical support, empirical evidence and practical aspects. We will study several recent algorithms that perform better than most generators in actual use. We will compare the different methods and supply numerical results as well as selected pointers and links to important literature and other sources. Additional information on random number generation, including the code of most algorithms discussed in this paper, is available from our web-server under the address http://random.mat.sbg.ac.at/. © 1998 IMACS/Elsevier Science B.V.
1. Introduction
Random number generators (``RNGs'') are the basic tools of stochastic modeling. Like any other craftsman, the modeler has to know his tools. Bad random number generators may ruin a simulation. There are several pitfalls to be avoided.
For example, if we try to check the correlations between consecutive random numbers x_0, x_1, ..., then a (still!) widely used generator produces non-overlapping pairs (x_{2n}, x_{2n+1}), n = 0, 1, ..., above suspicion at first look, see Fig. 1. In sharp contrast, the triples (x_{3n}, x_{3n+1}, x_{3n+2}) are extremely correlated and happen to lie on only fifteen planes, see Fig. 2. In both figures, 2^15 points have been generated. For details on this phenomenon, we refer to Section 4 and Section 5.
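The fifteen-planes phenomenon is easy to reproduce. The following Python sketch (our illustration, not code from the paper) implements RANDU, i.e. LCG(2^31, 65539, 0), the generator behind Figs. 1 and 2, and checks the linear relation 9x_n - 6x_{n+1} + x_{n+2} ≡ 0 (mod 2^31) that confines all normalized triples to at most fifteen parallel planes.

```python
# RANDU: x_{n+1} = 65539 * x_n mod 2^31.  Because 65539 = 2^16 + 3, every
# state satisfies x_{n+2} = 6*x_{n+1} - 9*x_n (mod 2^31), so the normalized
# triples lie on at most 15 parallel planes in the unit cube.
M = 2**31
A = 65539

def randu(seed, count):
    """Yield `count` RANDU states starting from an odd seed 0 < seed < 2^31."""
    x = seed
    for _ in range(count):
        x = (A * x) % M
        yield x

states = list(randu(1, 2**15))
planes = set()
for x0, x1, x2 in zip(states, states[1:], states[2:]):
    assert (9 * x0 - 6 * x1 + x2) % M == 0
    c = (9 * x0 - 6 * x1 + x2) // M   # integer "plane index" of this triple
    planes.add(c)
print(len(planes))  # at most 15 distinct planes
```

The plane index c is an integer because of the congruence above, and it is confined to a range of at most fifteen values, which is exactly the structure visible in Fig. 2.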
In this paper, safety-measures against such unpleasant surprises will be given. We will discuss the
current standards for good random number generators, the underlying mathematical and statistical
concepts, and some new generators that meet these standards. Further, we will summarize the
advantages and deficiencies of several algorithms to generate and test random numbers. Finally, we will
present a ``RNG Survival Kit'' that contains the most important literature on this subject and links to
web-sites that offer code and documents, and a ``RNG Checklist'' that allows the reader to assess his
preferred generator on the basis of the concepts given in this concise survey.
Mathematics and Computers in Simulation 46 (1998) 485-505
y_{n+k} = Σ_{j=0}^{k-1} a_j y_{n+j} (mod m), n ≥ 0,

defines an MRG with initial values y_0, ..., y_{k-1}. We use the normalization x_n := y_n/m to produce random numbers x_n in the unit interval [0,1[. The maximum period of this generator is m^k - 1, see [55,35]. The paper [35] contains tested parameters for MRGs with moduli m up to 2^63 and code for one MRG in C. We will denote this MRG by ``MRG1'' and exhibit test results for it in Section 7.
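As a toy illustration of this recurrence (not an example from the paper), the following sketch searches a small MRG of order k = 2 with modulus m = 7 for coefficient pairs that attain the maximum period m^k - 1 = 48:

```python
# Toy MRG of order k = 2: y_{n+2} = (a1*y_{n+1} + a0*y_n) mod m, normalized
# by x_n = y_n / m.  For a0 != 0 and prime m the state map is invertible,
# so every orbit is a pure cycle and the loop below terminates.
def mrg_period(m, a0, a1, seed=(0, 1)):
    """Length of the state cycle through the nonzero state `seed`."""
    state = seed
    n = 0
    while True:
        state = (state[1], (a1 * state[1] + a0 * state[0]) % m)
        n += 1
        if state == seed:
            return n

m = 7
best = max(mrg_period(m, a0, a1)
           for a0 in range(1, m) for a1 in range(m))
print(best)  # 48 = 7^2 - 1, the maximum period
```

The maximum is attained exactly when the characteristic polynomial x^2 - a1·x - a0 is primitive over Z_7, which the exhaustive search confirms without computing primitivity directly.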
Fig. 3. Minimal Standard: LCG(2^31 - 1, 16807, 0, 1).

One general problem with linear methods is the fact that correlations between random numbers separated by lags may be rather strong. Even in certain variants of the MRG, like the AWC and SWB generators of [46], this may lead to a very unfavorable performance in simulations, see [29,34,55] for further information. One solution to this problem is to combine generators. In the simplest version of this technique, we combine two generators by adding their output sequences (x_n^(1))_{n≥0} and (x_n^(2))_{n≥0} to obtain a new sequence (x_n)_{n≥0},

x_n := x_n^(1) + x_n^(2) (mod 1), n ≥ 0.

If the two generators are chosen properly, then the period of the sequence (x_n)_{n≥0} will be the product of the periods of the components. Combining generators without theoretical support may lead to disastrous generators. In the case of the LCG and MRG, the theory is well known, see [32,34].
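A minimal sketch of this combination technique (toy parameters of our own choosing, not from the paper): two full-period LCGs with the coprime periods 16 and 9 are added modulo 1, and the combined sequence indeed has period 16 · 9 = 144.

```python
# Combine two small full-period LCGs by adding their normalized outputs
# modulo 1.  LCG(16,5,1) has full period 16 and LCG(9,4,1) full period 9
# (Hull-Dobell); the periods are coprime, so the combination has period 144.
import itertools
from fractions import Fraction

def lcg(m, a, c, seed=0):
    x = seed
    while True:
        x = (a * x + c) % m
        yield Fraction(x, m)          # exact arithmetic: mod-1 sums stay exact

def combined():
    for u, v in zip(lcg(16, 5, 1), lcg(9, 4, 1)):
        yield (u + v) % 1

def period(seq_factory, upper):
    """Smallest p <= upper with s[n+p] == s[n] over a long sampled window."""
    s = list(itertools.islice(seq_factory(), 3 * upper))
    for p in range(1, upper + 1):
        if all(s[n] == s[n + p] for n in range(len(s) - p)):
            return p

print(period(combined, 200))  # 144 = 16 * 9
```

Exact rational arithmetic is used so that the period detection is not disturbed by floating-point rounding.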
Example 2. In [32], combined MRGs (``cMRG'') were introduced and thoroughly analyzed. Further,
a particular cMRG was assessed by the spectral test up to dimension d=20 and its implementation in C
was given. In Section 7, we will refer to this particular cMRG as ``cMRG1''.
Tausworthe generators can have unacceptably bad empirical performance. For this reason, in [33],
combined Tausworthe generators (``cTG'') were introduced to improve on the properties of single
Tausworthe generators.
Example 3. In [33], an implementation in C of a cTG is given that has a period length of order 2^88. Further, the equi-distribution properties of this type of generator are analyzed. In Section 7, we will present test results for this generator, which is denoted by ``cTG1''.
It is well-known that generalized feedback shift-register generators (``GFSR'') are fast, although a little bit tricky to initialize. Recently, a very interesting variant of this linear method has been presented in [49,50], the twisted GFSR (``tGFSR''). This generator produces a sequence (x_n)_{n≥0} of w-bit integers by the rule

x_{n+p} = x_{n+q} ⊕ x_n A, n ≥ 0,

where (w, p, q, A) are the parameters of the tGFSR and A is a w×w matrix with binary entries. This generator is fast and reliable if the parameters are chosen properly.
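The update rule can be sketched as follows (toy parameters, not the TT800 values; A is stored as w row masks so that x·A over GF(2) is the XOR of the rows selected by the set bits of x). With A equal to the identity matrix, the rule collapses to the plain GFSR recurrence, which the sketch verifies:

```python
# Sketch of the tGFSR rule x_{n+p} = x_{n+q} XOR x_n * A over GF(2).
# Row i of A is the image of basis vector e_i, stored as a w-bit mask.
W, P, Q = 8, 5, 2            # toy parameters, NOT the TT800 values

def times_A(x, A_rows):
    """Vector-matrix product over GF(2): XOR the rows picked by bits of x."""
    out = 0
    for i in range(len(A_rows)):
        if (x >> i) & 1:
            out ^= A_rows[i]
    return out

def tgfsr(init, A_rows, count, p=P, q=Q):
    """Generate `count` further words from the p initial w-bit words."""
    buf = list(init)
    out = []
    for n in range(count):
        buf.append(buf[n + q] ^ times_A(buf[n], A_rows))
        out.append(buf[-1])
    return out

identity = [1 << i for i in range(W)]    # A = I  ->  plain (untwisted) GFSR
init = [1, 2, 3, 4, 5]
plain = tgfsr(init, identity, 20)
# With A = I the rule collapses to x_{n+p} = x_{n+q} ^ x_n:
buf = list(init)
for n in range(20):
    buf.append(buf[n + Q] ^ buf[n])
assert plain == buf[P:]
```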
Fig. 4. SIMSCRIPT: LCG(2^31 - 1, 630360016, 0, 1).
Example 4. The tGFSR ``TT800'' presented in [50] has a period length of 2^800 - 1 and strong theoretical support. We will exhibit convincing empirical evidence in Section 7.
Inversive generators were constructed to overcome one property of linear generators that may turn
into a deficiency (depending on the simulation problem), the lattice structure of d-tuples of consecutive
random numbers. There are several variants of inversive generators, inversive congruential generators
(``ICG''), explicit-inversive generators (``EICG''), digital inversive congruential generators (``dICG''),
and combinations of ICGs and EICGs. Inversion certainly slows down the generation of random
numbers. Compared to LCGs of the same size, inversive generators are three to ten times slower,
depending on the processor's architecture (see [43]).
The importance of inversive random number generators stems from the fact that their intrinsic
structure and correlation behavior are strongly different from linear generators. Hence, they are very
useful in practice for verifying simulation results. We refer the reader to [17] for a concise survey of the
ICG and EICG, in comparison to LCGs. A comprehensive discussion of all available nonlinear methods
is contained in [55]. The implementation of inversive generators is discussed in [43]. This generic
implementation in C is also available from the server http://random.mat.sbg.ac.at/. In the case of the
ICG and EICG, composite moduli lead to less convincing generators than prime moduli.
At the present state of the art, the dICG is slower than the ICG. From a disappointing speed factor of about 150 in the first implementation (see [8], p. 72), this disadvantage has now been reduced to a factor of less than 8, see [58].
For a given prime number p, and for c ∈ Z_p, let c̄ := 0 if c = 0 and c̄ := c^{-1} if c ≠ 0. In other words, c̄ equals the number c^{p-2} modulo p.
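By Fermat's little theorem, the bar operation is a one-liner; the modulus 1031 below is borrowed from the first component of cICG1 in Section 7:

```python
# Modular "bar" operation: 0 -> 0, and c -> c^(p-2) mod p, which equals
# c^{-1} mod p by Fermat's little theorem for prime p.
def bar(c, p):
    return 0 if c % p == 0 else pow(c, p - 2, p)

p = 1031
for c in range(1, p):
    assert (c * bar(c, p)) % p == 1   # bar really inverts every nonzero c
assert bar(0, p) == 0
```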
Example 5a. Inversive congruential generators (``ICG'') were introduced in [9]. We have to choose the modulus p, a multiplier a, an additive term b, and an initial value y_0. Then the congruence

y_{n+1} = a·ȳ_n + b (mod p), n ≥ 0, (1)

defines an ICG. We denote this generator by ICG(p, a, b, y_0). It produces a sequence (y_n)_{n≥0} in the set Z_p = {0, 1, ..., p-1}. Pseudorandom numbers x_n in [0,1[ are obtained by the normalization x_n := y_n/p.
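To see the definition in action, a toy search (our own, with a deliberately tiny prime, not parameters from the paper) finds all pairs (a, b) for which the ICG attains the maximum period p:

```python
# Toy ICG(p, a, b, y0): y_{n+1} = (a * bar(y_n) + b) mod p, searched
# exhaustively over p = 5 for parameter pairs with the maximum period p.
def bar(c, p):
    return 0 if c % p == 0 else pow(c, p - 2, p)

def icg_period(p, a, b, y0=0):
    # The ICG map y -> a*bar(y) + b is a permutation of Z_p, so the orbit
    # of y0 is a pure cycle and this loop terminates.
    y, n = y0, 0
    while True:
        y = (a * bar(y, p) + b) % p
        n += 1
        if y == y0:
            return n

p = 5
full = [(a, b) for a in range(1, p) for b in range(p)
        if icg_period(p, a, b) == p]
print(full)  # e.g. (a, b) = (2, 3) is full-period: 0 -> 3 -> 2 -> 4 -> 1 -> 0
```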
A prominent feature of the ICG with prime modulus is the absence of any lattice structure, in sharp contrast to linear generators. In the scatter plot of Fig. 5, all possible points (x_{2n}, x_{2n+1}), n ≥ 0, in a region near the point (0.5, 0.5) are shown.
Example 5b. Explicit inversive congruential generators (``EICG'') are due to [11]. The EICG is easier to handle in practice, for example when producing uncorrelated substreams. The cost is a slightly smaller maximum usable sample size, as empirical tests have shown (see [63,42]). We choose a prime number p, a multiplier a ∈ Z_p, a ≠ 0, an additive term b ∈ Z_p, and an initial value n_0 in Z_p. Then

y_n = c̄_n with c_n := a(n + n_0) + b (mod p), n ≥ 0,

defines a sequence of pseudorandom numbers in {0, 1, ..., p-1}. As before, we put x_n := y_n/p, n ≥ 0, to obtain pseudorandom numbers in [0,1[. We shall denote this generator by EICG(p, a, b, n_0). In the definition of EICG(p, a, b, n_0), the additive term b is superfluous and can be omitted, see [18,43].
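A sketch of the EICG (the parameters below are a toy choice of ours, borrowing the modulus 1033 that also appears in cICG1): for prime p and a ≠ 0, the argument a(n + n_0) + b runs through all of Z_p as n does, and inversion permutes Z_p, so one full period visits every residue exactly once.

```python
# EICG(p, a, b, n0): y_n = bar(a*(n + n0) + b) mod p.  Over n = 0..p-1 the
# linear argument is a bijection of Z_p and inversion permutes Z_p, so the
# outputs are exactly {0, 1, ..., p-1} in some order.
def bar(c, p):
    return 0 if c % p == 0 else pow(c, p - 2, p)

def eicg_stream(p, a, b, n0):
    return [bar(a * (n + n0) + b, p) for n in range(p)]

p, a, b, n0 = 1033, 103, 1, 0        # toy parameters, not a recommendation
ys = eicg_stream(p, a, b, n0)
assert sorted(ys) == list(range(p))  # one full period is a permutation of Z_p
```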
It is easy to create ICGs and EICGs on demand. The choice of parameters for the EICG is simple. In the case of the ICG, we may use a ``mother-child'' principle that yields many ICGs from one ``mother'' ICG (see [17]).
The ``compound approach'' presented in [10,12] makes it possible to combine ICGs and EICGs, provided they have full period. This method has important advantages: we may obtain very long periods easily, modular operations may be carried out with relatively small moduli, which increases the effectiveness of our computations, and the good correlation structure of the ICG and EICG is preserved. The price to pay is a significant loss of speed that makes combined inversive generators considerably slower than linear generators of comparable period length.
4. Theoretical support
Theoretical support for random number generators is still widely ignored by practitioners. The three
main questions here are period length, the intrinsic structure of the random numbers and -vectors
produced by a generator, and correlation analysis.
It is clear that the period length of a generator will put a limit on the usable sample size. Random number generation is equivalent to drawing without replacement, see [31]. Hence, the sample size should be much smaller than the period length of the generator. In the case of linear methods, the square root of the period length seems to be a prudent upper bound for the usable sample size. This recommendation is based on empirical experience; there is no theoretical analysis available (see [44] for a short discussion). We refer to Section 7 for examples that show how different types of random number generators behave quite differently when the sample size is increased.
In the case of good random number generators, it is possible to provide conditions for the parameters
of the generator to obtain maximum period length, see [55]. Further, it is important to have algorithms
at hand to compute such parameters. The case of the ICG is a good illustration for this requirement, see
[17,55].
It is important to know the intrinsic structures of random number generators, like grid structures, and related results, like estimates of the number of points on hyperplanes. For example, if one is aware of the grid structure of LCGs, then it will come as no surprise that this type of generator has difficulties with certain simulations. An instructive example is the nearest-pair test, see [9] and [28,37].

Fig. 5. All points of ICG(2^31 - 1, 1288490188, 1, 0).
The most difficult and most important part of the theoretical assessment of random number
generators is correlation analysis. More than twenty years of experience have shown that certain figures
of merit for random number generators allow very reliable predictions of the performance of the
samples produced with a generator in empirical tests. The latter are nothing less than prototypes of
simulation problems. Hence, the importance of theoretical correlation analysis for numerical practice is
beyond question. It should be stated clearly that none of these figures of merit can give us guarantees
for the performance of the generator in our simulation. At present, there is no firm mathematical link
between any figure of merit for random number generators and the empirical performance of samples.
Nevertheless, and this fact is truly remarkable, the quality of prediction is excellent.
The basic concept to analyze correlations between random numbers is the following. Suppose we are given random numbers x_0, x_1, ... in the unit interval [0,1[. To check for correlations between consecutive numbers, we construct either overlapping d-tuples x_n := (x_n, x_{n+1}, ..., x_{n+d-1}) or non-overlapping d-tuples x_n := (x_{nd}, x_{nd+1}, ..., x_{nd+d-1}) and assess the empirical distribution of finite sequences ω = (x_n)_{n=0}^{N-1} in the d-dimensional unit cube [0,1[^d. The task is to measure how ``well'' ω is uniformly distributed. Strong correlations between consecutive random numbers will lead to significant deviations of the empirical distribution function of ω from the uniform distribution, in some dimensions d. It is clear that the restricted type of d-tuples that is considered here cannot insure against long-range correlations among the numbers x_n themselves. For this topic, we refer the reader to [5]. In the case of the EICG of [11], more general types of d-tuples have been considered (see also [54,55]).
Interestingly, it has turned out that the behavior of full-period sequences ω with respect to theoretical figures of merit allows very reliable predictions of the performance of the random numbers x_n themselves in empirical tests. If the full-period point set ω has a good empirical distribution with respect to certain figures of merit in various dimensions d, then good empirical performance of the samples is highly probable. Practical evidence is that many target distributions will be simulated very well, see, for example, the empirical results in [15,28,42,23]. This relation between properties of full-period sequences in higher dimensions and the behavior of comparatively small samples in low dimensions has not yet been put into rigorous mathematical form.
There are two schools of thought. The approach of the first school (see [30]) is to optimize the parameters of a given type of generator such that the empirical distribution function of the point sets ω in [0,1[^d approximates the uniform distribution as closely as possible, leading to a so-called ``super-uniform'' distribution. This is done in as many dimensions d as is feasible in practice. The figure of merit that is used for this task is the spectral test, due to [3]. The spectral test has an important geometrical interpretation as the maximum distance between successive parallel hyperplanes covering all possible points x_n that the generator can produce. This interpretation leads to efficient algorithms to compute the value of the spectral test. We refer the reader to [26,57,16,14,30,31,34,61] for details. The generator RANDU of Figs. 1 and 2 is not bad with respect to the spectral test in dimension d = 2, but the value of the spectral test in dimension 3 is extremely small, thereby reflecting the catastrophic lattice structure in this dimension. This example explains why we have to consider the spectral test for a whole range of dimensions and compute its value for each of them.
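For d = 2 and a toy modulus, the spectral test can even be computed by exhaustive search instead of lattice reduction (a sketch with parameters of our own choosing, not from the paper): the maximum distance between covering lines is 1/ℓ, where ℓ is the length of the shortest nonzero vector (h_1, h_2) with h_1 + a·h_2 ≡ 0 (mod m).

```python
# Brute-force 2-D spectral test for a multiplicative LCG(m, a): the maximum
# distance between parallel lines covering all pairs (x_n, x_{n+1}) equals
# 1/l, with l the shortest nonzero dual-lattice vector (h1, h2) satisfying
# h1 + a*h2 == 0 (mod m).  Exhaustive search is fine for toy moduli.
import math

def dual_shortest_sq(m, a):
    best = m * m                     # (m, 0) is always in the dual lattice
    for h2 in range(0, m + 1):
        h1 = (-a * h2) % m
        for h in (h1, h1 - m):       # both representatives of the class
            if (h, h2) != (0, 0):
                best = min(best, h * h + h2 * h2)
    return best

m, a = 16, 6                         # toy parameters, not from the paper
lsq = dual_shortest_sq(m, a)
print(1 / math.sqrt(lsq))            # maximum spacing between covering lines
```

For (m, a) = (16, 6) the shortest dual vector is (-2, 3) with squared length 13, so the covering lines are 1/√13 apart.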
The approach of the second school (see [55]) is to construct generators where the empirical distribution function does not approximate the uniform distribution ``too well''. The maximum distance between the empirical distribution function and the uniform distribution is preferred to be of order 1/√N, where N denotes the number of points x_n we consider. This is, roughly speaking, and thinking of the law of the iterated logarithm (``LIL'') for discrepancy, the order of this quantity in the case of realizations of i.i.d. random variables on [0,1[^d (see [25,55]). We will call this kind of equidistribution ``LIL-uniformity''. The figure of merit that is used here is the two-sided Kolmogoroff-Smirnov test statistic, also known as discrepancy in number theory. In terms of discrepancy, super-uniformity has an order of magnitude O((log N)^d/N). Niederreiter has developed a powerful number-theoretic method to estimate discrepancy by exponential sums, see [51,53,55]. This method has made it possible to assess this figure of merit for most types of random number generators.
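In dimension one, discrepancy is cheap to compute after sorting, in contrast to the O(N^d) cost discussed below; a sketch of the star discrepancy, checked on the centered grid whose discrepancy is exactly 1/(2N):

```python
# One-dimensional star discrepancy of a point set in [0,1[: after sorting,
# D*_N = max_i max(i/N - x_(i), x_(i) - (i-1)/N).  The centered grid
# x_i = (2i-1)/(2N) attains the minimum possible value 1/(2N).
def star_discrepancy(xs):
    xs = sorted(xs)
    n = len(xs)
    return max(max((i + 1) / n - x, x - i / n) for i, x in enumerate(xs))

n = 8
grid = [(2 * i + 1) / (2 * n) for i in range(n)]
print(star_discrepancy(grid))   # 1/(2n) = 0.0625
```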
Both the spectral test and discrepancy have their advantages and shortcomings. The advantage of the spectral test is that it is readily computable even in higher dimensions (i.e. above d = 20), under the condition that the points x_n have a lattice structure, see [39,26,57,61]. This will only be the case for full-period sequences ω and for certain types of generators, mostly linear ones. Discrepancy is not limited to point sets with lattice structure. It is well-defined for every sample ω. Unfortunately, it is not possible to compute its value in practice, due to a complexity of order O(N^d), where N denotes the number of points and d the dimension. There are only upper and lower bounds available, due to Niederreiter's advanced method. For both figures of merit, their distribution in dimensions d ≥ 2 is not known. Therefore, we cannot design an empirical test for random number generators from these quantities. In dimension one, the cumulative distribution function (``c.d.f.'') of discrepancy is known. We refer the reader to [26,53,55] for details. Recently, a probabilistic algorithm has been presented by [64].
There is a new addition to this list of figures of merit, the weighted spectral test. It is due to [18]. This figure of merit is derived from the original concept of the spectral test. It is related to discrepancy, can be estimated like the latter, it does not require a lattice structure for the point sets ω, and it takes O(d·N^2) steps to compute in any dimension d, see [20,19]. The weighted spectral test may be interpreted as a mean square integration error, see [22]. Results on its distribution are already available, see [41,22].
Beyer quotients are another figure of merit to assess lattices. Unfortunately, this quantity is known to be defined properly only in dimensions up to 6. Beyond this dimension, the Minkowski-reduced lattice bases involved need not be unique any more and their Beyer quotients might be different (see [40, Section 4.3.1]). For this reason, any results on bad Beyer quotients in dimensions higher than 6 are without mathematical justification at the present state of the art. A wrong basis might have been used.
5. Empirical evidence
Theoretical support for random number generators is not enough. Empirical evidence is
indispensable. Every empirical test is a simulation. If selected with care, then it will cover a whole
class of simulation problems. As we have indicated before, nothing can be deduced from the results of
an empirical test if the practitioner uses completely different parameters in his own simulation problem.
It is relatively easy to design an empirical (``statistical'') test for random numbers. Every function of
a finite number of U(0,1)-distributed random variables whose distribution is known and which can be
computed efficiently will serve for this purpose. What really matters here is to design tests that
constitute prototypes of simulation problems and measure different properties of random numbers.
Hence, every test in our battery should represent a whole class of empirical tests. No serious effort has
yet been undertaken to classify the many empirical tests available according to this principle.
There are well-established batteries of empirical tests, see [26,45,28]. Marsaglia's DIEHARD battery
is available on CD-ROM and from the Web-server http://stat.fsu.edu/~geo/diehard.html.
Several test statistics have been found to be rather discriminating between random number
generators. In the class of bit-oriented tests that count the number of appearances of certain blocks of
bits, Marsaglia's M-tuple test is outstanding, see [45,63,42,24]. Other reliable tests are the run test (see
[26]) and a geometric quantity, the nearest-pair test (see [28,57]). Recently, a discrete version of an
entropy test has been presented in [36]. It is not yet clear if this interesting test is really a new prototype
not covered by the M-tuple test as employed in [63]. This example shows that, while it is relatively easy
to design a new empirical test for random number generators, it is a nontrivial task to show that the new
quantity is a meaningful addition to the established batteries of tests and strongly different from known
test statistics.
On the basis of our practical experience, we recommend the following approach to test design. Suppose we use a random variable Y with known c.d.f. F_Y. With the help of a random number generator, we produce K realizations y_1, ..., y_K of this random variable. In the second step, we compare the empirical distribution function F̂_Y of the samples to the target distribution F_Y by some goodness-of-fit test, like the Kolmogoroff-Smirnov (KS) statistic. This procedure is called a two-level test, see [28,30]. Two-level test designs are a good compromise between speed and power of a test, see [28] for details.
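A minimal two-level sketch (the test statistic Y and all parameters are illustrative choices of ours, not the setup of [28,30]): level one draws K realizations of Y = max(U_1, ..., U_n), whose c.d.f. is F_Y(y) = y^n; level two transforms them by F_Y, which yields U(0,1) values under the null hypothesis, and measures their KS distance to uniformity.

```python
# Two-level test sketch.  Level 1: K realizations of Y = max(U_1, ..., U_n)
# with c.d.f. F_Y(y) = y^n.  Level 2: one-sample KS distance of the
# transformed values F_Y(y_k) to the uniform distribution on [0,1].
import random

def ks_statistic(us):
    """Two-sided one-sample KS distance to the uniform c.d.f. on [0,1]."""
    us = sorted(us)
    K = len(us)
    return max(max((i + 1) / K - u, u - i / K) for i, u in enumerate(us))

def two_level_test(rng, n=16, K=250):
    ys = [max(rng.random() for _ in range(n)) for _ in range(K)]
    return ks_statistic([y ** n for y in ys])   # F_Y(y) = y^n

rng = random.Random(12345)
d = two_level_test(rng)
# Reject at the 1% level roughly when sqrt(K) * d exceeds ~1.63.
print(d)
```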
If we want to test a random number generator without a particular application in mind, then it makes
more sense to choose a smaller number of strongly different test statistics and to vary the parameters of
the tests (like the sample size or the dimension) within large intervals. In our opinion, it is less relevant
for practice to run an enormous battery of tests without any idea if all these tests really measure
different properties of the generator. Further, if we do not vary the parameters enough and work, for example, with fixed sample sizes in our tests, then our chances of meeting the user's needs are small.
6. Practical aspects
Several aspects of a random number generator are of practical importance. For implementation, we need tables of parameters for good random number generators. Without portable implementations, a generator will not be useful to the simulation community. Power users need large samples. For certain generators, in particular linear types, the limit for the usable sample size is close to √P, P being the period of the generator, in many empirical tests. On 32-bit machines, most software packages work with LCGs of period length below 2^32. Hence, the maximum usable sample size is about 2^15, which is much too small for demanding simulations.
Parallel simulation creates additional problems (see [1]). Even reliable generators are unsafe when submitted to parallelization techniques. Basically, there are the following methods to generate random numbers on parallel processors. We may assign (i) L different generators to L different processors, or (ii) L different substreams of one large-period generator to the L processors. Technique (ii) has two variations. Either we use (a) a ``leap-frog'' method where we assign the substream (x_{nL+j})_{n≥0} to processor j, 0 ≤ j < L, or (b) we assign a whole segment (x_n)_{n≥n_j} to processor j, where n_1, ..., n_L is an appropriate set of initial values that assures disjointness of the substreams. Technique (b) is called ``splitting'' of a random number generator.
Approach (i) cannot be recommended in general. There are no results on correlations between different random number generators, with one notable exception. For the EICG, the correlation behavior of parallel generators has been analyzed. It was found to be remarkably good, see [52,54].
Approach (ii) is also unsafe territory. Linear methods like the LCG or MRG may occasionally (and unexpectedly) produce terrible sub-sequences with the leap-frog technique. We will illustrate this point with an example from [13]. Even a good generator like LCG(2^48, 55151000561141, 0) (see [14]) produces a leap-frog subsequence that performs even worse than RANDU in the spectral test. If we happen to assign the leap-frog subsequence (x_{23n})_{n≥0} to one processor, then the values of the spectral test show that this was an unfortunate decision which we might regret, see Table 1.
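Why leap-frog can go wrong is transparent for a multiplicative LCG: taking every L-th term yields another LCG whose multiplier is a^L mod m, and nothing guarantees that a^L inherits the good lattice of a. A toy verification (small parameters of our own, not those of the cited example):

```python
# Leap-frog substreams of a multiplicative LCG(m, a) are again LCGs:
# every L-th term follows x' -> (a^L mod m) * x' mod m, so the substream's
# quality is governed by the multiplier a^L, not by a itself.
m, a, L, seed = 2**13 - 1, 17, 23, 1     # toy parameters

def lcg_stream(m, a, seed, count):
    x, out = seed, []
    for _ in range(count):
        x = (a * x) % m
        out.append(x)
    return out

full = lcg_stream(m, a, seed, L * 50)
leap = full[L - 1::L]                    # substream x_L, x_{2L}, ...
direct = lcg_stream(m, pow(a, L, m), seed, 50)
assert leap == direct                    # the substream IS an LCG
```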
Splitting is not safe either. So-called long-range correlations are lurking in the shadows (see [4,6]).
Again, the EICG recommends itself for empirical testing, due to strong theoretical support with respect
to splitting, see [11,52,54].
Available parallel random number generator libraries are based on linear algorithms. We refer the
reader to [48,38,47,56,21]. These libraries should be used with the above warnings in mind.
7. Examples revisited
In the preceding sections, we have discussed several aspects of a good random number generator. We
will now revisit the generators we have presented in Section 3.
Examples 1, 2, 3, and 4 have been constructed such that super-uniformity is achieved in as many dimensions as possible. For MRG1 and cMRG1 (see Examples 1 and 2), the parameters have been chosen with the spectral test, see [35,32]. In Example 3, theoretical analysis yielded conditions for optimal equi-distribution properties. These conditions made it possible to perform exhaustive searches for optimal parameters, see [33]. A similar approach based on deep theoretical analysis was used in [50] to find good tGFSRs like TT800.
Inversive generators are much less sensitive to the choice of parameters. They yield LIL-uniformity once maximum period is assured by the parameters.
We will now report on the performance of our examples in a stringent test, Marsaglia's M-tuple test
(see [45]). The test design and the graphic presentation of the results have been developed in [63]. We
refer to this thesis for details of our setup.
From every random number x_0, x_1, ... ∈ [0,1[, we take the first r consecutive blocks of 4 bits in its binary representation, r = 6, 8. This procedure gives a sequence of random numbers (y_n)_{n≥0} in the range {0, ..., 15}. We then consider the overlapping d-tuples (y_n, y_{n+1}, ..., y_{n+d-1}) and apply the overlapping M-tuple test, where d = 4, 5. For a given sample size N, we compute 32 values of the, theoretically equi-distributed, upper tail probability of the M-tuple test. In the following figures, the sample size ranges between 2^18 and 2^26. The sample size is given on a logarithmic scale.
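The bit-extraction step can be sketched as follows (our illustration of the procedure, not the code of [63]): each x_n is scaled to a 4r-bit integer, which is then split into r blocks of 4 bits, most significant first.

```python
# Extract the r leading 4-bit blocks of x in [0,1[: scale x to a 4r-bit
# integer, then slice it into nibbles, most significant first.
def four_bit_blocks(x, r):
    v = int(x * 2 ** (4 * r))
    return [(v >> (4 * (r - 1 - i))) & 0xF for i in range(r)]

assert four_bit_blocks(0.5, 6) == [8, 0, 0, 0, 0, 0]
assert all(0 <= y <= 15 for y in four_bit_blocks(0.123456, 8))
```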
Table 1
Spectral test for dimensions 2 to 8

d    2       3       4       5       6       7       8
     0.2562  0.0600  0.0114  0.0462  0.1275  0.2031  0.2077
In Fig. 6, we show the result of a two-sided KS test applied to these 32 values for MRG1. Values of
the KS test statistic greater than the critical value 1.59 that corresponds to the significance level of 1%
are shown in dark grey and indicate that the generator has failed the test.
In Fig. 7, we plot the 32 equi-distributed values of this test statistic. The resulting patterns should be
irregular. If, for a given sample size N, the corresponding box is either totally white or black, the
generator has failed. White indicates too good approximation (which is a result of super-uniformity),
black signals too large a deviation from the expected values. We observe that the MRG performs well for the first 6 blocks of digits of length 4 in dimensions 4 and 5, but it fails to simulate the theoretical distribution if we consider 8 blocks of 4 bits in dimension 5, due to the fact that it is a 31-bit generator.
The combined generator cMRG1 (see Example 2 in Section 3) yields similar results and is therefore omitted.
The cTG of Example 3 performs considerably better, as Figs. 8 and 9 show. cTG1 has no problems with 32 bits. The tGFSR ``TT800'' (Figs. 10 and 11 of Example 4) gives a flawless performance. It is the only generator here without rejections.
Fig. 6. MRG1: KS values.
Fig. 7. MRG1: Upper tail probabilities.
Fig. 8. cTG1: KS values.
Fig. 9. cTG1: Upper tail probabilities.
Fig. 10. TT800: KS values.
In comparison to these long-period generators, we see that only a compound ICG of period length close to 2^32 is able to keep up with the large generators above. cICG1 (Figs. 16 and 17) combines ICG(1031,55,1,0), ICG(1033,103,1,0), and ICG(2027,66,1,0). An ICG with period length 2^31 - 1 like ICG1 = ICG(2^31 - 1, 1288490188, 1, 0) finally becomes ``overloaded'' (Figs. 12 and 13). An LCG of the same period length will perform poorly, as our example shows (Figs. 14 and 15).
8. RNG survival kit
The following selection of papers and links allows for easy orientation in the field of random number generation.
As starting points, we recommend [30,34], which cover a broad range of aspects. For readers who are interested in the mathematical background, [55] contains a wealth of comments and references. There are two monographs [53,61] in this field for further reading. [26] is considered to be the ``bible'' of random number generation.
Numerous links to information and software can be obtained from the Web site http://random.mat.sbg.ac.at/

Fig. 11. TT800: Upper tail probabilities.
Fig. 12. cICG1: KS values.
Fig. 13. cICG1: Upper tail probabilities.
Fig. 14. ICG1: KS values.
Fig. 15. ICG1: Upper tail probabilities.
9. RNG checklist

Fig. 16. ANSI-C: KS values.
Fig. 17. ANSI-C: Upper tail probabilities.

Theoretical support
  Period length: conditions; algorithms for parameters
  Structural properties: intrinsic structures; points on hyperplanes; equidistribution properties
  Correlation analysis: for particular parameters; for particular initializations; for parts of the period; for subsequences; for combinations of RNGs

Empirical evidence
  Variable sample size
  Two- or higher-level tests: bit-oriented tests; tests for correlations; geometric test quantities; complexity
  Transformation methods: sensitivity

Practical aspects
  Tables of parameters available
  Portable implementations available
  Parallelization techniques apply
  Large samples available
10. Summary
Random number generators are like antibiotics. Every type of generator has its unwanted side-effects. There are no safe generators. Good random number generators are characterized by theoretical support, convincing empirical evidence, and positive practical aspects. They will produce correct results in many, though not all, simulations.
Open questions in this field concern reliable parallelization, the creation of good generators on
demand, the sensitivity of transformation methods (to obtain nonuniform random numbers) to defects
of the uniform random number generators, the classification of empirical tests, and the mathematical
foundation of forecasting the empirical performance by theoretical figures of merit.
There are three rules for numerical practice that are worth keeping in mind.
1. Do not trust simulation results produced by only one (type of) generator; check the results with widely different generators before taking them seriously.
2. Do not combine, vectorize, or parallelize random number generators without theoretical and empirical support.
3. Get to know the properties of your random number generators. (We have supplied pointers and a checklist for this task.)
Nowadays, the tool-box of stochastic simulation contains numerous reliable random number
generators. It is up to the user to make the best out of them.
Acknowledgements
I would like to thank my research assistants Stefan Wegenkittl, who has carried out the necessary computations for the figures in Section 7, and Karl Entacher, who has calculated Table 1. This work has been supported by the Austrian Science Foundation, project P11143/MAT.
References
[1] S.L. Anderson, Random number generation on vector supercomputers and other advanced architectures, SIAM Review 32 (1990) 221-251.
[2] A. Compagner, Operational conditions for random-number generation, Phys. Review E 52 (1995) 5634-5645.
[3] R.R. Coveyou, R.D. MacPherson, Fourier analysis of uniform random number generators, J. Assoc. Comput. Mach. 14 (1967) 100-119.
[4] A. De Matteis, J. Eichenauer-Herrmann, H. Grothe, Computation of critical distances within multiplicative congruential pseudorandom number sequences, J. Comp. Appl. Math. 39 (1992) 49-55.
[5] A. De Matteis, S. Pagnutti, Long-range correlations in linear and non-linear random number generators, Parallel Computing 14 (1990) 207-210.
[6] A. De Matteis, S. Pagnutti, Critical distances in pseudorandom sequences generated with composite moduli, Intern. J. Computer Math. 43 (1992) 189-196.
[7] L. Devroye, Non-Uniform Random Variate Generation, Springer, New York, 1986.
[8] C. Doll, Die digitale Inversionsmethode zur Erzeugung von Pseudozufallszahlen, Master's thesis, Fachbereich Mathematik, Technische Hochschule Darmstadt, 1996.
[9] J. Eichenauer, J. Lehn, A non-linear congruential pseudo random number generator, Statist. Papers 27 (1986) 315-326.
[10] J. Eichenauer-Herrmann, Explicit inversive congruential pseudorandom numbers: the compound approach, Computing 51 (1993) 175-182.
[11] J. Eichenauer-Herrmann, Statistical independence of a new class of inversive congruential pseudorandom numbers, Math. Comp. 60 (1993) 375-384.
[12] J. Eichenauer-Herrmann, Compound nonlinear congruential pseudorandom numbers, Mh. Math. 117 (1994) 213-222.
[13] K. Entacher, A collection of selected pseudorandom number generators with linear structures, Report, The pLab Group, Department of Mathematics, University of Salzburg, 1996.
[14] G.S. Fishman, Multiplicative congruential random number generators with modulus 2