END322E System Simulation
Class 7: Random Number Generation
Housekeeping
• Last week today:
• Queueing theory:
• Calling population, system capacity, arrival distribution, queueing discipline, queue behavior,
service capacity, and service distribution
• ρ, L, W and other notations
• Little’s law
• Long-run (steady-state) behavior of known queue types:
• M/G/1
• M/M/1
• M/M/c
• M/M/1/N
• M/M/c/N
• M/M/∞
16/04/2019 END322E System Simulation 2
Introduction
• Uniform(0,1) random numbers are the key to random variate
generation in simulation — you transform uniforms to get other RVs
• So, we should start producing Uniform(0,1)s first!
Properties of Random Numbers
• Two important statistical properties:
• Uniformity
• Independence.
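• A random number, Ri, must be independently drawn from a uniform
distribution with pdf:
\[ f(x) = \begin{cases} 1, & 0 \le x \le 1 \\ 0, & \text{otherwise} \end{cases} \]
• Mean and variance:
\[ E(R) = \int_0^1 x\,dx = \left.\frac{x^2}{2}\right|_0^1 = \frac{1}{2} \]
\[ V(R) = \int_0^1 x^2\,dx - [E(R)]^2 = \left.\frac{x^3}{3}\right|_0^1 - \left(\frac{1}{2}\right)^2 = \frac{1}{3} - \frac{1}{4} = \frac{1}{12} \]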
Generation of Pseudo-Random Numbers
• “Pseudo”, because generating numbers using a known method removes the
potential for true randomness.
• Goal: To produce a sequence of numbers in [0,1] that simulates, or imitates, the
ideal properties of random numbers (RN).
• Important considerations in RN routines:
• Fast
• Portable to different computers
• Have sufficiently long cycle
• Replicable
• Closely approximate the ideal statistical properties of uniformity and
independence.
Techniques for Generating Random #s
• Some lousy generators
• output of random device
• table of random numbers
• midsquare
• Fibonacci
• Linear Congruential Method (LCM).
• Combined Linear Congruential Generators (CLCG).
Lousy generators (we won’t use these)
a) Random Devices
Nice randomness properties. However, storing the Unif(0,1) sequence is difficult, so it is tough to
repeat an experiment.
Examples:
• flip a coin
• particle count by Geiger counter
• least significant digits of atomic clock
b) Random Number Tables
List of digits supplied in tables.
• A Million Random Digits with 100,000 Normal Deviates
www.rand.org/content/dam/rand/pubs/monograph_reports/MR1418/MR1418.digits.pdf
Cumbersome, slow, tables too small: not very useful.
Once tabled, the digits are no longer random.
Lousy generators (we won’t use these)
c) Mid-Square Method (J. von Neumann)
Idea: Take the middle part of the square of the previous random
number. John von Neumann was a brilliant guy, but this method is terrible!
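To see why the method is considered poor, here is a minimal sketch of a 4-digit mid-square generator. The function names, the 4-digit width, and the seeds are illustrative assumptions, not values from the slides. The square is treated as an 8-digit number (zero-padded on the left) and the middle four digits are kept.

```python
def mid_square(x, digits=4):
    """Return the middle `digits` digits of x**2 (x**2 viewed as 2*digits digits)."""
    half = digits // 2
    # Drop the lowest `half` digits, then keep the next `digits` digits.
    return (x * x // 10**half) % 10**digits


def mid_square_sequence(seed, n, digits=4):
    """Return the next n mid-square values after `seed` (illustrative helper)."""
    out = []
    x = seed
    for _ in range(n):
        x = mid_square(x, digits)
        out.append(x)
    return out


# Seed 2100 falls into a cycle of length 4; seed 0 stays at 0 forever.
print(mid_square_sequence(2100, 8))
print(mid_square_sequence(0, 3))
```

Short cycles and collapse to zero like these are typical failure modes of the method.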
Lousy generators (we won’t use these)
d) Fibonacci and Additive Congruential Generators
These methods are also no good!
Problem: Small numbers follow small numbers.
Linear Congruential Generators
• LCGs are the most widely used generators. These are pretty good when implemented properly.
• An LCG produces a sequence of integers X1, X2, … between 0 and m-1 by following the recursive
relationship:
\[ X_{i+1} = (a X_i + c) \bmod m, \quad i = 0, 1, 2, \ldots \]
where a is the multiplier, c is the increment, and m is the modulus.
• The selection of the values for a, c, m, and X0 drastically affects the statistical properties and the
cycle length.
• The random integers are generated in [0, m-1]; to convert the integers to random numbers:
\[ R_i = \frac{X_i}{m}, \quad i = 1, 2, \ldots \]
• If c = 0, the LCG is called a multiplicative generator
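The recursion and the conversion to [0,1) can be sketched in a few lines. The parameter values used in the call (a = 17, c = 43, m = 100, X0 = 27) are illustrative assumptions chosen to keep the arithmetic easy to follow by hand; they are far too small for real use.

```python
def lcg(a, c, m, seed, n):
    """Return n pairs (X_i, R_i), i = 1..n, for X_{i+1} = (a*X_i + c) mod m."""
    x = seed
    out = []
    for _ in range(n):
        x = (a * x + c) % m      # next integer in [0, m-1]
        out.append((x, x / m))   # R_i = X_i / m
    return out


# Illustrative parameters (assumed, not from the slides).
for x, r in lcg(a=17, c=43, m=100, seed=27, n=3):
    print(x, r)
```

By hand: X1 = (17·27 + 43) mod 100 = 2, X2 = (17·2 + 43) mod 100 = 77, X3 = (17·77 + 43) mod 100 = 52.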
LCG - example
i 0
Xi 0
Ri 0
Try
• Does this achieve full cycle?
Another example
• Full period with a cycle length >2 billion
• Try implementing in Excel, C, Matlab, Python etc.
Characteristics of a Good Generator
• Maximum Density
• Such that the values assumed by Ri, i = 1,2,…, leave no large gaps on [0,1]
• Solution: a very large integer for modulus m
• Maximum Period
• To achieve maximum density and avoid cycling.
• Achieve by: proper choice of a, c, m, and X0.
• Uniformity: apply tests
• Independence: apply tests
• Most digital computers use a binary representation of numbers
• Speed and efficiency are aided by choosing a modulus, m, that is (or is close to) a power of
2.
What could go wrong with LCGs?
• Something not full period and that only produces even integers like
• Something full period but only produces very non-random output
• In any case, if m is small, you’ll get quick cycling whether or not the
generator is full period. “Small” could mean anything less than 2
billion or so.
• And just because m is big, you still have to be careful. In addition to
above, some subtle problems can arise. Take a look at IBM’s RANDU.
RANDU generator
• IBM's RANDU is widely considered to be one of the most ill-conceived
random number generators ever designed.
• It was popular in the 1960s.
• If you plot consecutive triples (Ri, Ri+1, Ri+2) in 3D, the points fall on just 15 parallel
planes (figure courtesy of Wikipedia)
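The planar structure can be checked numerically without a plot. RANDU is the multiplicative LCG with a = 65539 and m = 2^31; because 65539 = 2^16 + 3, every three consecutive outputs satisfy X(i+2) = 6 X(i+1) - 9 X(i) (mod 2^31). The sketch below verifies that relation; the function name and the seed are illustrative assumptions.

```python
def randu(seed, n):
    """Return n integers from RANDU: X_{i+1} = 65539 * X_i mod 2**31 (odd seed)."""
    m = 2**31
    x = seed
    out = []
    for _ in range(n):
        x = (65539 * x) % m
        out.append(x)
    return out


xs = randu(seed=1, n=1000)
# Every consecutive triple obeys the same linear relation modulo 2**31.
ok = all((xs[k + 2] - 6 * xs[k + 1] + 9 * xs[k]) % 2**31 == 0
         for k in range(len(xs) - 2))
print(ok)
```

A fixed linear relation among successive outputs is exactly what confines the 3D points to a few planes.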
Combined Linear Congruential Generators
• Reason: A longer-period generator is needed because of the increasing complexity of
simulated systems.
• Approach: Combine two or more multiplicative congruential generators.
• Let Xi,1, Xi,2, …, Xi,k, be the ith output from k different multiplicative congruential
generators.
• The jth generator:
• Has prime modulus mj and multiplier aj, chosen so that the period is mj-1
• Produces integers Xi,j that are approximately Uniform on the integers in [1, mj-1]
• Wi,j = Xi,j - 1 is approximately Uniform on the integers in [0, mj-2]
Combined Linear Congruential Generators
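• Suggested form:
\[ X_i = \left( \sum_{j=1}^{k} (-1)^{j-1} X_{i,j} \right) \bmod (m_1 - 1) \]
Hence,
\[ R_i = \begin{cases} \dfrac{X_i}{m_1}, & X_i > 0 \\ \dfrac{m_1 - 1}{m_1}, & X_i = 0 \end{cases} \]
• The coefficient (-1)^{j-1} performs the subtraction Xi,1 - 1
• The maximum possible period is:
\[ P = \frac{(m_1 - 1)(m_2 - 1) \cdots (m_k - 1)}{2^{k-1}} \]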
A Really Good Combined Generator due to L’Ecuyer
For 32-bit computers, L’Ecuyer [1988] suggests combining k = 2 generators with m1 = 2,147,483,563, a1 =
40,014, m2 = 2,147,483,399 and a2 = 40,692. The algorithm becomes:
Step 1: Select seeds
• X1,0 in the range [1, 2,147,483,562] for the 1st generator
• X2,0 in the range [1, 2,147,483,398] for the 2nd generator.
Step 2: For each individual generator,
X1,j+1 = 40,014 X1,j mod 2,147,483,563
X2,j+1 = 40,692 X2,j mod 2,147,483,399.
Step 3: Xj+1 = (X1,j+1 - X2,j+1 ) mod 2,147,483,562.
Step 4: Return
\[ R_{j+1} = \begin{cases} \dfrac{X_{j+1}}{2{,}147{,}483{,}563}, & X_{j+1} > 0 \\ \dfrac{2{,}147{,}483{,}562}{2{,}147{,}483{,}563}, & X_{j+1} = 0 \end{cases} \]
Step 5: Set j = j+1, go back to step 2.
• Combined generator has period: (m1 – 1)(m2 – 1)/2 ≈ 2 × 10^18
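Steps 1 to 5 can be sketched as follows. The constants are the ones quoted for L’Ecuyer's generator (with multiplier 40,692 for the second component); the function name and the seeds in the call are illustrative assumptions. Python integers do not overflow, so no special 32-bit arithmetic tricks are needed here, unlike in C.

```python
M1, A1 = 2_147_483_563, 40_014
M2, A2 = 2_147_483_399, 40_692


def lecuyer(seed1, seed2, n):
    """Return n uniforms in (0, 1) from the combined generator.

    seed1 must lie in [1, M1 - 1] and seed2 in [1, M2 - 1].
    """
    x1, x2 = seed1, seed2
    out = []
    for _ in range(n):
        x1 = (A1 * x1) % M1          # Step 2, first generator
        x2 = (A2 * x2) % M2          # Step 2, second generator
        x = (x1 - x2) % (M1 - 1)     # Step 3 (Python's % is non-negative here)
        # Step 4: map to (0, 1), handling X = 0 separately
        out.append(x / M1 if x > 0 else (M1 - 1) / M1)
    return out


print(lecuyer(seed1=12345, seed2=67890, n=5))
```

Running it twice with the same seeds reproduces the same stream, which is the replicability property listed earlier.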
Choosing a good generator
• For an LCG with c ≠ 0 and m a power of 2 (m = 2^b), the generator is full period (P = m) if (i) c is
relatively prime to m (that is, the greatest common factor of c and m is
1), and (ii) a = 1 + 4k, where k is an integer.
• A multiplicative generator (c = 0) with m = 2^b cannot reach period m; it achieves its longest
possible period, P = m/4 = 2^(b-2), if (i) the seed X0 is odd, and (ii) a = 3 + 8k or a = 5 + 8k.
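For small moduli these conditions can be checked by brute force. The sketch below measures the cycle length of an LCG by iterating until a value repeats; the function name and the small parameter sets are illustrative assumptions chosen so the loops finish instantly.

```python
def lcg_period(a, c, m, seed):
    """Return the cycle length of X_{i+1} = (a*X_i + c) mod m started at `seed`."""
    seen = {}            # value -> step at which it first appeared
    x = seed
    step = 0
    while x not in seen:
        seen[x] = step
        x = (a * x + c) % m
        step += 1
    return step - seen[x]


# Mixed LCG, m = 2^4: c = 3 is odd and a = 5 = 1 + 4*1, so the period should be m.
print(lcg_period(a=5, c=3, m=16, seed=0))
# Multiplicative, m = 2^6: a = 13 = 5 + 8*1 with an odd seed should give m/4.
print(lcg_period(a=13, c=0, m=64, seed=1))
# Same generator with an even seed gives a shorter cycle.
print(lcg_period(a=13, c=0, m=64, seed=2))
```

The same function is not practical for m around 2^31, where theoretical conditions like the ones above are used instead.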
Tests for Random Numbers
• Two categories:
• Testing for uniformity:
H0: Ri ~ U[0,1]
H1: Ri ≁ U[0,1]
• Failure to reject the null hypothesis, H0, means that evidence of non-uniformity has
not been detected.
• Testing for independence:
H0: Ri ~ independently
H1: Ri ≁ independently
• Failure to reject the null hypothesis, H0, means that evidence of dependence has not
been detected.
• Level of significance α, the probability of rejecting H0 when it is true:
α = P(reject H0 | H0 is true)
Tests for Random Numbers
• When to use these tests:
• If a well-known simulation language or random-number generator is used, it is
probably unnecessary to test
• If the generator is not explicitly known or documented, e.g., spreadsheet
programs, symbolic/numerical calculators, tests should be applied to many
sample numbers.
• Types of tests:
• Theoretical tests: evaluate the choices of m, a, and c without actually generating
any numbers
• Empirical tests: applied to actual sequences of numbers produced. The authors’
emphasis.
Frequency Tests
• Test of uniformity
• Two different methods:
• Kolmogorov-Smirnov test
• Chi-square test
Kolmogorov-Smirnov Test
• Compares the continuous cdf, F(x), of the uniform distribution with the empirical cdf,
SN(x), of the N sample observations.
• We know: F(x) = x, 0 ≤ x ≤ 1
• If the sample from the RN generator is R1, R2, …, RN, then the empirical cdf, SN(x)
is:
\[ S_N(x) = \frac{\text{number of } R_1, R_2, \ldots, R_N \text{ which are} \le x}{N} \]
• Based on the statistic: D = max |F(x) - SN(x)|
• Sampling distribution of D is known (a function of N, tabulated in Table A.8. in your
textbook)
• A more powerful test, recommended.
Kolmogorov-Smirnov Test
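• Example: Suppose 5 generated numbers are 0.44, 0.81, 0.14, 0.05, 0.93.
Step 1: Rank the data from smallest to largest (negative differences are shown as "-"):
  R(i)              0.05   0.14   0.44   0.81   0.93
  i/N               0.20   0.40   0.60   0.80   1.00
  i/N – R(i)        0.15   0.26   0.16   -      0.07
  R(i) – (i-1)/N    0.05   -      0.04   0.21   0.13
Step 2: D+ = max{i/N – R(i)} = 0.26 and D- = max{R(i) – (i-1)/N} = 0.21
Step 3: D = max(D+, D-) = 0.26
Step 4: For α = 0.05 and N = 5, the critical value from Table A.8 is Dα = 0.565. Since D = 0.26 < 0.565,
H0 is not rejected: no evidence of non-uniformity has been detected.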
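The tabulated steps are easy to reproduce in code. This is a minimal sketch of the K-S statistic against U(0,1) for the five numbers in the example; the function name is an illustrative assumption, and the critical value still has to be looked up in the table.

```python
def ks_uniform(sample):
    """Return (D_plus, D_minus, D) for testing `sample` against U(0,1)."""
    r = sorted(sample)                       # Step 1: rank the data
    n = len(r)
    # Step 2: largest deviations above and below the uniform cdf F(x) = x
    d_plus = max((i + 1) / n - x for i, x in enumerate(r))
    d_minus = max(x - i / n for i, x in enumerate(r))
    return d_plus, d_minus, max(d_plus, d_minus)   # Step 3


d_plus, d_minus, d = ks_uniform([0.44, 0.81, 0.14, 0.05, 0.93])
print(round(d_plus, 2), round(d_minus, 2), round(d, 2))
```

The same function can be applied to a column of values pasted from any generator under test.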
Example
• Let’s see if RAND() function in Excel produces uniform(0,1).
• Use K-S test
Chi-square test
• Chi-square test uses the sample statistic:
\[ \chi_0^2 = \sum_{i=1}^{n} \frac{(O_i - E_i)^2}{E_i} \]
where n is the # of classes, Oi is the observed # in the ith class, and Ei is the expected # in the ith class
• Approximately the chi-square distribution with n-1 degrees of freedom (where
the critical values are tabulated in Table A.6 in your textbook)
• For the uniform distribution, Ei, the expected number in each class, is:
\[ E_i = \frac{N}{n}, \quad \text{where } N \text{ is the total \# of observations} \]
• Valid only for large samples, e.g. N >= 50
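A minimal sketch of the statistic for n equal-width classes on [0,1) is given below. The function name, the default of 10 classes, and the two artificial samples are illustrative assumptions; the result must still be compared with the tabulated critical value for n-1 degrees of freedom.

```python
def chi_square_uniform(sample, n_classes=10):
    """Return the chi-square statistic for `sample` against U(0,1)."""
    total = len(sample)
    observed = [0] * n_classes
    for r in sample:
        # Class index of r; the min() guards against r == 1.0
        k = min(int(r * n_classes), n_classes - 1)
        observed[k] += 1
    expected = total / n_classes             # E_i = N / n
    return sum((o - expected) ** 2 / expected for o in observed)


# Perfectly even sample: every class holds exactly N/n points.
even = [(i + 0.5) / 100 for i in range(100)]
print(chi_square_uniform(even))
# Degenerate sample: all 100 points in the first class.
print(chi_square_uniform([0.05] * 100))
```

A statistic near zero, as in the first case, can itself be suspicious: truly random samples are rarely this even.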
Chi-square test
Example
• Let’s see if RAND() function in Excel produces uniform(0,1).
• Use Chi-square test
Tests for Autocorrelation
• Testing the autocorrelation between every m numbers (m is a.k.a. the lag),
starting with the ith number
• The autocorrelation ρim between numbers: Ri, Ri+m, Ri+2m, …, Ri+(M+1)m
• M is the largest integer such that i + (M+1)m ≤ N
• Hypothesis:
H0: ρim = 0, if numbers are independent
H1: ρim ≠ 0, if numbers are dependent
• If the values are uncorrelated:
• For large values of M, the distribution of the estimator of ρim, denoted \hat{ρ}im,
is approximately normal.
Tests for Autocorrelation
• Test statistic is:
\[ Z_0 = \frac{\hat{\rho}_{im}}{\hat{\sigma}_{\hat{\rho}_{im}}} \]
• Under H0, Z0 is distributed normally with mean = 0 and variance = 1, and:
\[ \hat{\rho}_{im} = \frac{1}{M+1} \left[ \sum_{k=0}^{M} R_{i+km} \, R_{i+(k+1)m} \right] - 0.25 \]
\[ \hat{\sigma}_{\hat{\rho}_{im}} = \frac{\sqrt{13M + 7}}{12(M+1)} \]
• If ρim > 0, the subsequence has positive autocorrelation
• High random numbers tend to be followed by high ones, and vice versa.
• If ρim < 0, the subsequence has negative autocorrelation
• Low random numbers tend to be followed by high ones, and vice versa.
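The estimator, its standard deviation, and Z0 can be computed directly from the formulas. The sketch below assumes the slide's 1-based indexing for i; the function name and the constant test sequence are illustrative assumptions.

```python
import math


def autocorrelation_test(data, i, m):
    """Return (M, rho_hat, sigma_hat, Z0) for start index i (1-based) and lag m."""
    n = len(data)
    big_m = (n - i) // m - 1       # largest M with i + (M+1)*m <= N
    if big_m < 0:
        raise ValueError("sequence too short for this start index and lag")
    # Sum of products R_{i+km} * R_{i+(k+1)m}, k = 0..M (shifted to 0-based)
    total = sum(data[i - 1 + k * m] * data[i - 1 + (k + 1) * m]
                for k in range(big_m + 1))
    rho_hat = total / (big_m + 1) - 0.25
    sigma_hat = math.sqrt(13 * big_m + 7) / (12 * (big_m + 1))
    return big_m, rho_hat, sigma_hat, rho_hat / sigma_hat


# With N = 30, i = 3, m = 5 the formula gives M = 4; a constant 0.5 sequence
# has products of exactly 0.25, so the estimate is 0.
print(autocorrelation_test([0.5] * 30, i=3, m=5))
```

H0 is not rejected at level α when |Z0| does not exceed the standard normal critical value for α/2.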
Shortcomings
• The test is not very sensitive for small values of M, particularly when the
numbers being tested are on the low side.
• Problem when “fishing” for autocorrelation by performing numerous tests:
• If α = 0.05, there is a probability of 0.05 of rejecting a true hypothesis.
• If 10 independent sequences are examined,
• The probability of finding no significant autocorrelation, by chance alone,
is 0.95^10 ≈ 0.60.
• Hence, the probability of detecting significant autocorrelation when it
does not exist is about 40%