0% found this document useful (0 votes)
43 views20 pages

A Single Number Is Not Random. - Random Means The Absence of Order

Copyright
© Attribution Non-Commercial (BY-NC)
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PPT, PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
43 views20 pages

A Single Number Is Not Random. - Random Means The Absence of Order

Copyright
© Attribution Non-Commercial (BY-NC)
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PPT, PDF, TXT or read online on Scribd
You are on page 1/ 20

Random Number Generation (RNG)

read “Numerical Recipes” on random numbers and


the chi-squared test.

Today we discuss how to generate and test random numbers.

What is a random number?


– A single number is not random. Only an infinite
sequence can be described as random.
– Random means the absence of order. (Negative property).
– Can an intelligent gambler make money by betting on
the next numbers that will turn up?
– All subsequences are equally distributed. This is the
property that MC uses to do integrals and that you will
test for in homework.
“Anyone who considers arithmetical methods of random digits is, of course, in
a state of sin.” John von Neumann (1951)

“Random numbers should not be chosen at random.” Knuth (1986)

MSE485/PHY466/CSE485
1
©D.D. Johnson and D. Ceperley 2009
Random numbers on a computer
• Truly Random - the result of a physical process such as
timing clocks, circuit noise, Geiger counts, bad memory
– Too slow (we need 1010/sec) and expensive
– Low quality
– Not reproducible
• Pseudo-random. prng (pseudo means fake)
– Deterministic sequence with a repeat period but with the
appearance of randomness (if you don’t know the
algorithm).
• Quasi-random (quasi means almost random)
– “half way” between random and a uniform grid.
– Meant to fill space with max. distance between space to fill
“holes” sequentially (e.g., [0,100] as 0,1,2,3,…,100).

If doing a low-D integral, QRNG gives up correlation so the sequence will be


more uniform than PRNG and converge to mean value quicker than N -1.2.

MSE485/PHY466/CSE485
2
©D.D. Johnson and D. Ceperley 2009
Desired Properties of RNG on a computer

• Deterministic: easy to debug (repeatable simulations).

• Long Periods: cycle length is long.

• Uniformity: RN generated in space evenly.


– Go through complete cycle, all integers occur once!
– TEST: N occurrences of x. The no. in (a,b) is equal to
N(b-a) ± [N(b-a)]-1/2

• Uncorrelated Sequences:
<f1(xi+1)…fk(xi+k)>= <f1(xi+1)>…<fk(xi+k)>
Monte Carlo can be very sensitive to such
correlations, especially for large k.

• Efficiency: Needs to be fast.

MSE485/PHY466/CSE485
3
©D.D. Johnson and D. Ceperley 2009
Pseudo Random Sequence

S: State and initial seed.


T: Iteration process,
F: Mapping from state to integer RN (I ) or real RN (U).

MSE485/PHY466/CSE485
4
©D.D. Johnson and D. Ceperley 2009
Period or Cycle Length (L)
• If internal state has M bits, total state space is 2M values.
• If mapping is 1-1, space divides into a finite no. of cycles.
• Best case is a single cycle of length 2M, otherwise 2M/L.
• It is easy to achieve very long cycle length: 32 or 24 bit
generators are not long enough!
• Entire period of the RNG is exhausted in:
• rand (1 processor) ~ 100 second
• drand48 (1 processor) ~ 1 year
• drand48 (100 processor) ~ 3 days
• SPRNG LFG (105 procs) ~ 10375 years
Assuming 107 RN/sec per processor

e.g., In 1997, computer has ~108 RN/proc/sec (or 226).


Hence, it takes 1 sec to exhaust generator with 26 bits and 1 year for 2 51 states.
e.g., consider 1024x1024 sites (spins) on a lattice. With MC, sweep 10 7 times to get averages.
You need to generate 1013 RN. A 32-bit RNG from 1980’s dangerous on today’s computers.

MSE485/PHY466/CSE485
5
©D.D. Johnson and D. Ceperley 2009
Common PRNG Generators
Linear Congruential Generator (LCG) modulo (mod) is remainder after division
64-bit word (or 2 32-bit LCG) give a period of ~10 18.
• Multiplicative Lagged zn = zn-k * zn-l
Fibonacci vary
• Modified Lagged initialization
zn = zn-k + zn-l
Fibonacci
(modulo 2m )

• 48 bit LCG
• 64 bit LCG zn = a*zn-1 + p vary p
• Prime Modulus LCG (modulo m )

vary a

• Combined Multiple zn = an-1*zn-1 + ...+


Recursive vary LCG
an-k*zn-k + LCG

Recurrence Parallelization
MSE485/PHY466/CSE485
6
©D.D. Johnson and D. Ceperley 2009
Uniformity
• Output consists of taking N bits from state and making
an integer in (0,2N-1) “Ik”.

• One gets a uniform real “Uk” by multiplying by 2-N.

• We will study next time how to get other distributions.

• If there is a single cycle, then integers must be uniform


in the range (0,2N-1).

• Uniformity of numbers taken 1 at a time is usually easy to


guarantee. What we need is higher dimensional uniformity!

MSE485/PHY466/CSE485
7
©D.D. Johnson and D. Ceperley 2009
Example of LCG: linear congruent generator

Example of LCG(a,c,m,I0).

e.g., LCG(5,1,16,1): In+1 = a In + c mod(m)

• I0 is seed to initiate sequence. (Allows repeatable runs.)


• Sequence is determined by values of (a,c,m,I0).
• mod(m) is leftover from multiples of m from LCG.

• In+1 are RNG integers.


Rn = In /float( m ) for [0,1)
• Real RNG are created by
or Rn = In /float(m-1) for [0,1]

For uniformity on [0,1), what is desired average and variance of R?

MSE485/PHY466/CSE485
8
©D.D. Johnson and D. Ceperley 2009
Example of LCG: In+1 = a In + c mod(m)

e.g., LCG(5,1,16,0) where a = 5, c =1, m =16 and I0 = 0


• Choosing udrn: Rn = In /float(m) for [0,1).

Thus,
LCG mod(m) Integer Binary Real
I0 : 0 + 1 = 1 - 0x16 1 0001 1/16 = 0.0625
I1: 5x1 + 1 = 6 - 0x16 6 0110 6/16 = 0.3750
I2: 5x6 + 1 = 31 - 1x16 15 1111 15/16 = 0.9375
I3: 5x15 + 1 = 76 - 4x16 12 1100 12/16 = 0.7500
I4: 5x12 + 1 = 61 - 3x16 13 1101 13/16 = 0.8125

I15: 5x3 + 1 = 16 - 1x16 0 0000 0/16 = 0.0000

Note: period is P=2M.


Here P=16, where M=Log(m)/base2 = ln(16)/ln(2) = 4
MSE485/PHY466/CSE485
9
©D.D. Johnson and D. Ceperley 2009
For LCG(5,1,16,0): In+1 = 5 In + 1 mod(16)
For LCG(5,1,16,0): In+1 = a In + c mod(m)
• We have a period P =16, which is the longest it can be.
Integer Binary Real
I0: 1 0001 1/16 = 0.0625
I1: 6 0110 6/16 = 0.3750 However, there is
I2: 15 1111 15/16 = 0.9375 correlation in the
I3: 12 1100 12/16 = 0.7500 most significant
I4: 13 1101 13/16 = 0.8125 binary digit!
I5: 2 0010 2/16 = 0.1250
I6: 11 1011 11/16 = 0.6875 …1010101…
I7: 8 1000 8/16 = 0.5000
I8: 9 1001 9/16 = 0.5625
I9: 14 1110 14/16 = 0.8750 Show analytically that
I10: 7 0111 7/16 = 0.4375 <R> = 1/2
I11: 4 0100 4/16 = 0.2500 <(R - <R>)2> = 1/12.
I12: 5 0101 5/16 = 0.3125
I13: 10 1010 10/16 = 0.6275 Try it!
I14: 3 0011 3/16 = 0.1875
I15: 0 0000 0/16 = 0.0000 MSE485/PHY466/CSE485
10
©D.D. Johnson and D. Ceperley 2009
Sequential RNG Problems
• Correlations non-uniformity in higher dimensions

Uniform in 1-D but


non-uniform in 2-D

This is the important


property to guarantee:
 f1(xi1 ) f 2 (xi2 )  f1(xi1 )  f 2 (xi2 ) 
MC uses numbers many at a time--they need to be uniform.
e.g., 128 atom MC moving 3N coords. at random needs 4 PRNG (x,y,z, i) or
 128 x 4=1024. So every 1024 moves may be correlated!
Compare error for each PRNG, if similar OK. (Use χ2-test.)
MSE485/PHY466/CSE485
11
©D.D. Johnson and D. Ceperley 2009
LCG Numbers fall in planes.
Good LCG generators have the planes close together.
For a k-bit generator, do not use them more than k/2 together.

MSE485/PHY466/CSE485
12
©D.D. Johnson and D. Ceperley 2009
SPRNG: originally a NCSA project

• A library of many well tested excellent parallel PRNGs


• Callable from FORTRAN.C, C++, and JAVA
• Ported to popular parallel and serial platforms

SPRNG Functions
• int *init_sprng(int streamnum, int nstreams, int seed, int param)
• double sprng(int *stream)
• int isprng(int *stream)
• int print_sprng(int *stream)
• int make_sprng_seed()
• int pack_sprng(int *stream, char **buffer)
• int *unpack_sprng(char *buffer)
• int free_sprng(int *stream)
• int spawn_sprng(int *stream, int nspawned, int ***newstreams)

MSE485/PHY466/CSE485
13
©D.D. Johnson and D. Ceperley 2009
What is a chisquare test
• Suppose { x1,x2,….} are N Gaussian random numbers mean zero
variance 1
N
• What is the distribution of y   x 2?
k 1

• Q(  2 | N ) is the probability that y>2

• The mean value of y is N, its variance is 2N.

• We use it to test hypotheses. Is the fluctuation we see, what we


expect?
N
 x x  fit 2

2  
k k

 k2
k 1

• Roughly speaking we should have N  2N   2  N  2N

• More precisely look up value of Q(  2 | N )

MSE485/PHY466/CSE485
14
©D.D. Johnson and D. Ceperley 2009
Chi-squared test of randomness
• HW exercise: do test of several generators
– Divide up “cube” into “N” bins.
– Sample many “P” triplets.
– Number/bin should be n=P/N ±(P/N)1/2 N=100
In example: expected n = 10 ± (10)1/2
P=1000

GOOD BAD

MSE485/PHY466/CSE485
15
©D.D. Johnson and D. Ceperley 2009
See “Numerical Recipes” on reserve or on-line
(link on class webpages)

2 (Ni  ni )2
• Chi-squared statistic is   i
ni
– ni (Ni) is the expected (observed) number in bin i.
– Ni is an integer! (Omit terms with 0 = ni = Ni)
– Term with ni = 0 and Ni ≠ 0, correctly give χ2 = ∞.)
– Large values of χ2 indicate the “null” hypothesis, i.e. unlikely.

• The chi-squared probability is Q(χ2|N-1) The incomplete Γ-fct


(N-1) because of “sum rule”. N.R. Sec. 6.2 and 14.3

• Table may help with interpreting the chi-squared probability.


Probability Interpretation 
< 1 %  reject 
1 - 5 %  suspect 
5 - 10 %  almost suspect 
10 - 90 %  pass 
90 - 95 %  almost suspect 
95 - 99 %  suspect 
> 99 %  reject  16
MSE485/PHY466/CSE485
©D.D. Johnson and D. Ceperley 2009
Recommendation:
For careful work use several generators!
• For most real-life MC simulations, passing existing
statistical tests is necessary but not sufficient.

• Random number generators are still a black art.

• Rerun with different generators, since algorithm may


be sensitive to different RNG correlations.

• Computational effort is not wasted, since results can


be combined to lower error bars.

• In SPRNG, relinking is sufficient to change RNGs.

MSE485/PHY466/CSE485
17
©D.D. Johnson and D. Ceperley 2009
Parallel Simulations

• Parallel Monte Carlo is easy? Or is it?

• Two methods for easy parallel MC:


– Cloning:
Cloning same serial run, different random numbers.
– Scanning:
Scanning different physical parameters (density,…).

• Any parallel method (MPI, ..) can be used.

• Problems:
– Big systems require excessive wall clock time.
– Excessive amounts of output data generated.
– Random number correlation?

MSE485/PHY466/CSE485
19
©D.D. Johnson and D. Ceperley 2009
Parallelization of RNGs

• Cycle Division: Leapfrog or Partition. Correlation problems

• Cycle Parameterization: If PRNG has several cycles,


assign different cycles to each stream

Set of possible states Each cycle gets a disjoint subset

• Iteration Parameterization: Assign a different iteration


function to each stream

MSE485/PHY466/CSE485
20
©D.D. Johnson and D. Ceperley 2009
Examples of Parallel MC codes and Problems

• Embarrassingly parallel simulations. Each processor has


its own simulation. Scanning and cloning.
Unlikely to lead to problems unless they stay in phase.

• Lots of integrals in parallel (e.g. thousands of Feynman


diagrams each to an accuracy of 10-6).
Problem if cycle length is exhausted.

• Particle splitting with new particles forking off new


processes. Need lots of generators.
Problem if generators are correlated initially.

• Space-time partitioning. Give each local region a


processor and a generator.
Problem if generators are correlated.

MSE485/PHY466/CSE485
22
©D.D. Johnson and D. Ceperley 2009

You might also like