VON NEUMANN'S COMPARISON METHOD FOR RANDOM SAMPLING FROM THE NORMAL AND OTHER DISTRIBUTIONS

BY GEORGE E. FORSYTHE

STAN-CS-72-254 JANUARY, 1972

COMPUTER SCIENCE DEPARTMENT School of Humanities and Sciences STANFORD UNIVERSITY

Von Neumann's Comparison Method for Random Sampling from the Normal and Other Distributions

George E. Forsythe Computer Science Department Stanford University

Abstract

The author presents a generalization he worked out in 1950 of von Neumann's method of generating random samples from the exponential distribution by comparisons of uniform random numbers on (0,1). It is shown how to generate samples from any distribution whose probability density function is piecewise both absolutely continuous and monotonic on (-∞,∞). A special case delivers normal deviates at an average cost of only 4.036 uniform deviates each. This seems more efficient than the Center-Tail method of Dieter and Ahrens, which uses a related, but different, method of generalizing the von Neumann idea to the normal distribution.

This research was supported in part by the Office of Naval Research under Contracts N-00014-67-A-0112-0057 (NR 044-402) and N-00014-67-A-0112-0029 (NR 044-211), and by the National Science Foundation under Grant GJ-992. Reproduction in whole or in part is permitted for any purpose of the United States Government.

Von Neumann's Comparison Method for Random Sampling from the Normal and Other Distributions

George E. Forsythe Computer Science Department Stanford University

1. Introduction.

In the summer of 1949, at the Institute for Numerical Analysis on the campus of the University of California, Los Angeles, John von Neumann [3] lectured on various aspects of generating pseudorandom numbers and variables. At the end he presented an ingenious method for generating a sample from an exponential distribution, based solely on comparisons of uniform deviates. In his last sentence he commented that his "method could be modified to yield a distribution satisfying any first-order differential equation". In 1949 or 1950 I wrote some notes about what I assumed von Neumann had in mind, but I do not recall ever discussing the matter with him. This belated polishing and publication of those notes is stimulated by papers by Ahrens and Dieter [1, 2] in which several related algorithms are studied, and by a personal discussion with the authors on how the von Neumann idea can be extended.

In Section 2 the general method is presented, and in Section 3 its efficiency is analyzed. In Sections 4 and 5 it is shown how the exponential and normal distributions show up as special cases. In Section 6 the method for a normal distribution is compared with the Center-Tail method of [1] and [2]. In Section 7 possible generalizations are mentioned. Although this introduction has emphasized historical matters, the method of Section 5 is a good one, and is competitive with the best known methods for generating normal deviates. I thank both Professors Ahrens and Dieter for their careful criticism of a first draft of this paper.

2. The general algorithm.

Let f(x) > 0 be defined for all x ≥ 0 and satisfy the first-order linear differential equation

(1)    f'(x) = -b(x) f(x)    (0 ≤ x < ∞),

where

(2)    b(x) ≥ 0.

Let

(3)    B(x) = ∫_0^x b(t) dt,

and assume that ∫_0^∞ e^{-B(t)} dt is finite. Then

(4)    f(x) = C e^{-B(x)},  with  C = [∫_0^∞ e^{-B(t)} dt]^{-1},

is the unique solution of (1) with ∫_0^∞ f(x) dx = 1, and hence f is the probability density distribution of a nonnegative random variable.

Suppose we have a supply of independent random variables u with a uniform distribution on [0, 1), and that we wish to generate a random variable y with the density distribution f(x). Here is one way to proceed. We first prepare three tables of constants {q_k}, {r_k}, {d_k} for k = 0, 1, ..., K, as follows. (K is defined below.) Let q_0 = 0.

For each k = 1, 2, ..., K, pick q_k as large as possible, subject to the two constraints

(5)    q_k - q_{k-1} ≤ 1

and

(6)    B(q_k) - B(q_{k-1}) ≤ 1.

Next, compute

(7)    r_k = ∫_0^{q_k} f(x) dx    (k = 0, 1, ..., K).

Here K is chosen as the least index such that r_K exceeds the largest representable number less than 1; one then sets r_K to 1, reducing any value above. (K may be chosen smaller, if one sets r_K to 1, and if one is willing to truncate the generated variable to the interval [q_{K-1}, q_K).) Finally, compute

(8)    d_k = q_k - q_{k-1}    (k = 1, 2, ..., K).

For simplicity we define the functions

(9)    G_k(x) = B(q_{k-1} + x) - B(q_{k-1})    (k = 1, 2, ..., K).

Now we present the algorithm. Steps 1 to 3 determine which interval [q_{k-1}, q_k) the variable y will belong to. Steps 4 to 11 determine the value of y within that interval.

1. [Begin choice of interval.] Set k ← 1. Generate a uniform deviate u.
2. [Test.] If u ≤ r_k, go to step 4.
3. [Increase interval.] If u > r_k, set k ← k + 1 and go back to step 2.
4. [Begin computation of y in the selected interval.] Generate another uniform deviate u and set w ← u d_k.
5. Set t ← G_k(w).
6. Generate another uniform deviate u*.
7. [Test.] If u* ≥ t, go to step 11.
8. [Trial continues.] If u* < t, generate another uniform deviate u.
9. [Test.] If u < u*, set t ← u and go back to step 6.
10. [Reject the trial.] If u ≥ u*, go back to step 4.
11. [Finish.] Return y ← q_{k-1} + w as the sample variable.
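The eleven steps can be sketched in a modern language. The following Python rendering is mine, not the paper's (the paper's computations were done in Fortran); in build_tables, finding each q_k by bisection on constraints (5)-(6) and computing the r_k by midpoint quadrature are implementation choices of this sketch.

```python
import math
import random

def build_tables(B, K=15, n_quad=2000):
    """Prepare the constants q_k, d_k, r_k of (5)-(8) for f(x) = C e^{-B(x)}."""
    def integral(a, b, g):                 # composite midpoint rule
        h = (b - a) / n_quad
        return h * sum(g(a + (i + 0.5) * h) for i in range(n_quad))

    q = [0.0]
    for _ in range(K):
        lo, hi = q[-1], q[-1] + 1.0        # (5): q_k - q_{k-1} <= 1
        while hi - lo > 1e-12:             # bisect on constraint (6)
            mid = 0.5 * (lo + hi)
            if B(mid) - B(q[-1]) <= 1.0:   # (6): B(q_k) - B(q_{k-1}) <= 1
                lo = mid
            else:
                hi = mid
        q.append(lo)
    d = [q[k] - q[k - 1] for k in range(1, K + 1)]
    total = integral(0.0, q[-1] + 30.0, lambda x: math.exp(-B(x)))
    r = [integral(0.0, qk, lambda x: math.exp(-B(x))) / total for qk in q]
    r[K] = 1.0                             # force termination of steps 1-3
    return q, d, r

def sample(B, q, d, r, rng):
    """Steps 1-11 of the general algorithm (truncated at k = K)."""
    u = rng.random()                       # step 1
    k = 1
    while u > r[k]:                        # steps 2-3
        k += 1
    while True:
        w = rng.random() * d[k - 1]        # step 4
        t = B(q[k - 1] + w) - B(q[k - 1])  # step 5: t = G_k(w)
        while True:
            ustar = rng.random()           # step 6
            if ustar >= t:                 # step 7: accept
                return q[k - 1] + w        # step 11
            u = rng.random()               # step 8
            if u < ustar:                  # step 9: the chain continues
                t = u
            else:                          # step 10: reject the trial
                break

# Example: B(x) = x^2/2 (so b(x) = x) gives the half-normal density.
rng = random.Random(1)
q, d, r = build_tables(lambda x: 0.5 * x * x)
xs = [sample(lambda x: 0.5 * x * x, q, d, r, rng) for _ in range(40000)]
mean = sum(xs) / len(xs)
```

With B(x) = x^2/2 this is the half-normal case of Section 5, so the sample mean should approach sqrt(2/pi) ≈ 0.798.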

We now show that the above algorithm works as claimed. Since we assume that each u < 1, the test in step 2 must be passed when k = K, if not sooner. Hence an interval [q_{k-1}, q_k) is selected, and the values of r_k were chosen to make the probabilities of choosing the various intervals correct.

Fix k. The remainder of the algorithm can be described as follows: First, a random number w is selected uniformly from the interval 0 ≤ w < d_k. Then the algorithm continues to generate independent uniform deviates u_i from [0, 1) until the least n ≥ 1 is found with

(10)    u_1 ≥ G_k(w)    (n = 1),

or

        u_n ≥ u_{n-1} < u_{n-2} < ... < u_2 < u_1 < G_k(w)    (n ≥ 2).

With probability 1 such an n will be found, as will be shown. If n is odd, we return y ← q_{k-1} + w. If n is even, we reject w and all the u_i, choose a new w, and repeat.

We now determine the probability P(k, w) that one w determined in step 4 will be accepted without returning to step 4. Let E_1(k, w) be the universe of all events. For n = 2, 3, ..., let E_n(k, w) be the event

        u_{n-1} < u_{n-2} < ... < u_2 < u_1 < G_k(w).

Then the probability of E_n(k, w) is given by the volume of the ordered simplex:

        Prob {E_n(k, w)} = ∫_0^{G_k(w)} dx_1 ∫_0^{x_1} dx_2 ... ∫_0^{x_{n-2}} dx_{n-1} = G_k(w)^{n-1} / (n-1)!    (all n).

The occurrence of (10) is the conjunction of E_n(k, w) and not-E_{n+1}(k, w). Since E_{n+1}(k, w) implies E_n(k, w), the probability that (10) occurs for a given n and w is

(11)    Prob {E_n(k, w) and not-E_{n+1}(k, w)} = G_k(w)^{n-1}/(n-1)! - G_k(w)^n/n! .

Summing over all odd n, we see that

(12)    P(k, w) = Σ_{odd n} [G_k(w)^{n-1}/(n-1)! - G_k(w)^n/n!] = e^{-G_k(w)} .
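Equation (12) is easy to check empirically: simulate the comparison chain of steps 5-10 with t initialized to a fixed value g and count acceptances. The helper name accept_once below is mine, not the paper's.

```python
import math
import random

def accept_once(g, rng):
    """One trial of steps 5-10, with t initialized to g = G_k(w).

    Returns True when the chain length n of (10) is odd (accept),
    False when it is even (reject)."""
    t = g
    while True:
        ustar = rng.random()
        if ustar >= t:          # step 7
            return True
        u = rng.random()        # step 8
        if u < ustar:           # step 9: the chain continues
            t = u
        else:                   # step 10: reject
            return False

rng = random.Random(42)
trials = 200000
for g in (0.2, 0.5, 1.0):
    hits = sum(accept_once(g, rng) for _ in range(trials))
    # (12) predicts an acceptance probability of exp(-g)
    assert abs(hits / trials - math.exp(-g)) < 0.01
```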

Since w ≤ d_k, we have

        G_k(w) ≤ G_k(d_k) = B(q_k) - B(q_{k-1}) ≤ 1

for all k and w. Now dg/d_k is the probability that w is selected in the interval g ≤ w < g + dg. Combining this with (12), we see that the probability that g ≤ w < g + dg and that w is accepted is given by

(13)    Prob {g ≤ w < g + dg and w is accepted} = e^{-G_k(w)} dg / d_k .

Corresponding to an accepted w, we return y = q_{k-1} + w as the sample variable. Hence, from (13), the probability that y is in the range x ≤ y < x + dx, for given k, is

(14)    (1/d_k) e^{-G_k(x - q_{k-1})} dx = (1/d_k) e^{B(q_{k-1})} e^{-B(x)} dx ,  by (9),

                                         = f(x) dx / [d_k f(q_{k-1})] ,  by (4).

That is,

(15)    Prob {x ≤ y < x + dx and y is accepted} = f(x) dx / [d_k f(q_{k-1})] .

Since this is proportional to f(x) dx, we see that any accepted y has the desired probability density distribution within the interval [q_{k-1}, q_k). Since, from (13), the probability of an infinite loop back to step 4 is zero, the second half of the algorithm terminates with probability 1. This concludes the demonstration that the algorithm works as claimed.

3. Efficiency of the algorithm.

For a general function b, I shall derive a representation for the expected number of uniformly distributed random variables u that must be used to generate one variable y with the probability density proportional to f(x). A similar derivation is given in [1].

The preliminary game to select k -- steps 1 to 3 of the algorithm -- requires one u. The rest of the algorithm is different for each k, and we shall first determine the expected number N(k) of uniform deviates needed to determine y, for fixed k.

To do this, we shall first assume that k is fixed and that w has been picked in the interval 0 ≤ w < d_k. Define E_n(k, w) as in Section 2, and introduce the abbreviations

(16)    e_n = e_n(k, w) = Prob {E_n(k, w)}    (n = 1, 2, ...)

and

(17)    g = G_k(w) .

Then, as in Section 2, we have the following expression for the probability P(k, w) of accepting w without returning to step 4:

        P(k, w) = (e_1 - e_2) + (e_3 - e_4) + (e_5 - e_6) + ... .

Moreover, given k and w and given that w is accepted, the expected number of uniform deviates u needed will be

(18)    m_a(k, w) = P(k, w)^{-1} [2(e_1 - e_2) + 4(e_3 - e_4) + 6(e_5 - e_6) + ...]

                  = P(k, w)^{-1} Σ_{odd n ≥ 1} (n+1)(e_n - e_{n+1}) .

Similarly, the probability 1 - P(k, w) that w is rejected is given by

        1 - P(k, w) = (e_2 - e_3) + (e_4 - e_5) + (e_6 - e_7) + ... .

Moreover, given k and w and that w is rejected, the expected number of uniform deviates u needed is

(19)    m_r(k, w) = [1 - P(k, w)]^{-1} [3(e_2 - e_3) + 5(e_4 - e_5) + 7(e_6 - e_7) + ...]

                  = [1 - P(k, w)]^{-1} Σ_{even n ≥ 2} (n+1)(e_n - e_{n+1}) .

(The counts n + 1 include the deviate used in step 4 to form w.)

Now, if a w is rejected, the algorithm returns to step 4, a new w is picked, and the process repeats. Let M(k, w) be the expected number of uniform deviates selected until a y is finally selected, given a fixed k and an initially chosen w. Then N(k) is the average of M(k, w) over all w uniformly distributed on 0 ≤ w < d_k.

We have

(20)    M(k, w) = P(k, w) m_a(k, w) + [1 - P(k, w)] [m_r(k, w) + N(k)] ,

since, in case w is rejected, the whole process is repeated. Using the expressions (18) and (19) for m_a(k, w) and m_r(k, w), we get from (20) that

        M(k, w) = Σ_{n ≥ 1} (n+1) [g^{n-1}/(n-1)! - g^n/n!] + [1 - P(k, w)] N(k)

                = 1 + e^g + [1 - P(k, w)] N(k) ,

or

(21)    M(k, w) = 1 + e^{G_k(w)} + [1 - P(k, w)] N(k) .

Averaging (21) for 0 ≤ w < d_k, and using (12), we find that

        N(k) = 1 + (1/d_k) ∫_0^{d_k} e^{G_k(w)} dw + N(k) (1/d_k) ∫_0^{d_k} [1 - e^{-G_k(w)}] dw .

Solving for N(k), we get

(22)    N(k) = [d_k + ∫_0^{d_k} e^{G_k(w)} dw] / ∫_0^{d_k} e^{-G_k(w)} dw .

Finally, the expected number N̄ of uniform deviates drawn in the main game until a y is returned is the average of N(k) over the intervals, weighted by the probabilities of selecting the various intervals. That is,

(23)    N̄ = Σ_{k=1}^{K} N(k) (r_k - r_{k-1}) .

If we make use of (4), (7), and (9) to express N̄ in terms of B(x), we obtain the ugly representation

(24)    N̄ = [ Σ_{k=1}^{∞} ( d_k e^{-B(q_{k-1})} + e^{-2B(q_{k-1})} ∫_{q_{k-1}}^{q_k} e^{B(x)} dx ) ] / ∫_0^{∞} e^{-B(x)} dx .
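As a numerical sanity check on (24) (a sketch of mine, not from the paper): for b(x) = 1 each bracketed term collapses to e·e^{-(k-1)}, so the sum telescopes to e/(1 - e^{-1}) ≐ 4.30026, in agreement with (25).

```python
import math

def midpoint(a, b, g, n=4000):             # composite midpoint rule
    h = (b - a) / n
    return h * sum(g(a + (i + 0.5) * h) for i in range(n))

# (24) for b(x) = 1: B(x) = x, q_k = k, d_k = 1, and the
# denominator is integral_0^inf e^{-x} dx = 1 exactly.
num = 0.0
for k in range(1, 60):                     # terms decay like e^{-(k-1)}
    a = float(k - 1)
    num += math.exp(-a) + math.exp(-2 * a) * midpoint(a, a + 1.0, math.exp)
Nbar = num / 1.0

assert abs(Nbar - math.e / (1 - math.exp(-1))) < 1e-4
```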

4. Special case: exponential distribution.

If b(x) = 1 in (1), then B(x) = x and f(x) = e^{-x}, corresponding to the exponential distribution treated in [3]. For the algorithm of Section 2 we have q_k = k, d_k = 1, r_k = 1 - e^{-k}, and G_k(x) = x, for all k. Since d_k and G_k(x) are independent of k, steps 4 to 10 of the algorithm are the same for all k. They can therefore be carried out independently of steps 1 to 3.

By (12), the probability that a chosen w is not accepted is 1 - P(k, w) = 1 - e^{-w} (for all k), and the average value of 1 - e^{-w} over 0 ≤ w < 1 is e^{-1}.

If the preliminary game of steps 1 to 3 were played, the interval [k-1, k) would be selected with probability r_k - r_{k-1} = e^{-(k-1)} - e^{-k}, for k = 1, 2, ... . Thus the interval [0, 1) would be accepted with probability 1 - e^{-1}, and rejected with probability e^{-1}. If [k-1, k) is rejected, then [k, k+1) would be accepted with probability 1 - e^{-1}, and rejected with probability e^{-1}, for k = 1, 2, ... . Since the rejection ratio for each interval has the same value e^{-1}, which is the a priori probability of rejecting in the main game any w selected in step 4, von Neumann could use the rejection of w as the signal to change the interval from [k-1, k) to [k, k+1). Thus the preliminary game of steps 1-3 is unnecessary for the exponential distribution. This made von Neumann's game very elegant. I know of no comparable trick for general b(x).

From (22) and (23), since N(k) = N̄ for all k, we see that for the exponential distribution

(25)    N̄ = [1 + (e - 1)] / (1 - e^{-1}) = e / (1 - e^{-1}) ≐ 4.30026 ,

as stated in [1].
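Von Neumann's interval-advancing trick is compact enough to test directly. This Python sketch is mine (the function name vn_exponential is not the paper's); it also counts the uniform deviates consumed, which should average the value in (25).

```python
import random

def vn_exponential(rng):
    """Von Neumann's exponential generator: steps 4-10 of Section 2,
    with rejection of w used as the signal to move from the interval
    [k-1, k) to [k, k+1) -- no preliminary game is needed."""
    k = 0                              # current interval is [k, k+1)
    used = 0                           # uniform deviates consumed
    while True:
        w = rng.random(); used += 1    # step 4 (d_k = 1)
        t = w                          # step 5: G_k(w) = w here
        while True:
            ustar = rng.random(); used += 1
            if ustar >= t:             # accept: chain length n is odd
                return k + w, used
            u = rng.random(); used += 1
            if u < ustar:              # the descending chain continues
                t = u
            else:                      # reject: advance the interval
                k += 1
                break

rng = random.Random(7)
n = 200000
pairs = [vn_exponential(rng) for _ in range(n)]
mean = sum(x for x, _ in pairs) / n
avg_used = sum(c for _, c in pairs) / n
```

The returned value k + w is exponential with mean 1, and avg_used should be near 4.30026.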

(There was an error in [3].)

5. Special case: normal distribution.

If b(x) = x in (1), then B(x) = x^2/2 and f(x) = √(2/π) e^{-x^2/2}, corresponding to the positive half of the normal distribution. For the algorithm of Section 2 we have

        q_0 = 0,  q_1 = 1, ...,  q_k = √(2k - 1)    (k ≥ 2).

Hence

        d_1 = 1,  d_2 = √3 - 1, ...,  d_k = √(2k - 1) - √(2k - 3)    (k ≥ 2).

Also,

        G_k(x) = x^2/2 + q_{k-1} x .

The values of r_k must be computed from the probability integral. The table below gives 15-decimal values of q_k, d_k, r_k, and N(k) for k = 1, 2, ..., 36, as computed in Fortran on Stanford's IBM 360/67 computer in double precision.

To generate normal deviates, one selects K and prestores the values of r_k, q_k, and d_k for k = 1, 2, ..., K. Then set q_0 ← 0 and r_K ← 1. (The limit K = 13 permits normal deviates up to ±5.0 to be generated, and the deviates will be truncated less than once in a million trials. A higher limit will decrease the probability of truncation.)
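The table constants are reproducible from the error function, since r_k = erf(q_k/√2) for the half-normal density. A sketch of the precomputation (using Python's math.erf, my choice of tool):

```python
import math

K = 13                                       # q_K = sqrt(2*13 - 1) = 5.0
q = [0.0, 1.0] + [math.sqrt(2 * k - 1) for k in range(2, K + 1)]
d = [q[k] - q[k - 1] for k in range(1, K + 1)]
# r_k = integral_0^{q_k} sqrt(2/pi) e^{-x^2/2} dx = erf(q_k / sqrt(2))
r = [math.erf(qk / math.sqrt(2.0)) for qk in q]

assert abs(q[2] - math.sqrt(3.0)) < 1e-12    # q_2 = sqrt(3)
assert abs(r[1] - 0.6826894921) < 1e-9       # P(0 <= |Z| <= 1) for unit normal Z
assert all(dk <= 1.0 for dk in d)            # constraint (5) holds
assert q[K] == 5.0
```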

As suggested in [2], one should start the algorithm with a preliminary determination of the sign of the normal deviate. We do this in steps N1-N3 of the following algorithm. At entry to step N4, u is a uniform deviate on the interval [0, 1). The rest is the algorithm of Section 2, with the sign appended in the last step.

N1. [Begin choice of sign and interval.] Set k ← 1. Generate a uniform deviate u on [0, 1).
N2. Set u ← 2u.
N3. [Test for sign.] If u < 1, set s ← 1 and go to step N4. If u ≥ 1, set s ← -1, and set u ← u - 1.
N4. [Test for interval.] If u ≤ r_k, go to step N6.
N5. [Increase interval.] If u > r_k, set k ← k + 1 and go back to step N4.
N6. [Begin generation of |y| in the selected interval.] Generate another uniform deviate u on [0, 1) and set w ← u d_k.
N7. Set t ← G_k(w).
N8. Generate another uniform deviate u* on [0, 1).
N9. [Test.] If u* ≥ t, go to step N13.
N10. [Trial continues.] If u* < t, generate another uniform deviate u on [0, 1).
N11. [Test.] If u < u*, set t ← u and go back to step N8.
N12. [Reject the trial.] If u ≥ u*, go back to step N6.
N13. [Finish.] Return y ← s(q_{k-1} + w) as the sample normal variable.
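Steps N1-N13 translate directly. The following Python sketch is mine (the paper's implementation was in Fortran), with the tables of Section 5 built inline from math.erf:

```python
import math
import random

K = 13                                    # q_K = 5.0; truncation < 1e-6
q = [0.0, 1.0] + [math.sqrt(2 * k - 1) for k in range(2, K + 1)]
d = [q[k] - q[k - 1] for k in range(1, K + 1)]
r = [math.erf(qk / math.sqrt(2.0)) for qk in q]
r[K] = 1.0                                # so step N4 always terminates

def normal_deviate(rng):
    """One standard normal deviate via steps N1-N13."""
    u = 2.0 * rng.random()                # N1-N2
    if u < 1.0:
        s = 1.0                           # N3: sign from the doubled deviate
    else:
        s, u = -1.0, u - 1.0
    k = 1
    while u > r[k]:                       # N4-N5: choose [q_{k-1}, q_k)
        k += 1
    while True:
        w = rng.random() * d[k - 1]       # N6
        t = 0.5 * w * w + q[k - 1] * w    # N7: G_k(w) = w^2/2 + q_{k-1} w
        while True:
            ustar = rng.random()          # N8
            if ustar >= t:                # N9: accept
                return s * (q[k - 1] + w) # N13
            u2 = rng.random()             # N10
            if u2 < ustar:
                t = u2                    # N11: the chain continues
            else:
                break                     # N12: reject, back to N6

rng = random.Random(2024)
n = 200000
xs = [normal_deviate(rng) for _ in range(n)]
mean = sum(xs) / n
var = sum(x * x for x in xs) / n
```

The sample mean and variance should approach 0 and 1 respectively.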

As in Section 3, we let N(k) be the expected number of selections of uniform deviates in steps N6-N13, as a function of k. We have from (22):

        N(1) = [1 + ∫_0^1 e^{w^2/2} dw] / ∫_0^1 e^{-w^2/2} dw ,

        N(k) = [d_k + e^{-q_{k-1}^2/2} ∫_{q_{k-1}}^{q_k} e^{w^2/2} dw] / [e^{q_{k-1}^2/2} ∫_{q_{k-1}}^{q_k} e^{-w^2/2} dw]    (k ≥ 2).

Numerical values of N(k) are given in the table. Using the asymptotic formula

        ∫_x^{x+h} e^{±t^2/2} dt ~ ± [e^{±(x+h)^2/2} - e^{±x^2/2}] / x    (x → ∞, 0 < h ≤ 1/x),

one can show that

(26)    lim_{k→∞} N(k) = e / (1 - e^{-1}) ≐ 4.30026 .

Cf. (25). The equality (26) was written to me by U. Dieter.

I have used the same computer to establish that

        N̄ = Σ_{k=1}^{∞} N(k) (r_k - r_{k-1}) ≐ 3.03585 ,

so that the expected number of uniform deviates chosen in order to generate one normal deviate is 1 + N̄ ≐ 4.03585 .

The correctness of this algorithm for generating normal deviates, as well as the value of N̄, have been confirmed in unpublished experiments by A. I. Forsythe and independently by J. H. Ahrens.

6. Comparison with the Center-Tail method of Dieter and Ahrens.

In [1], Dieter and Ahrens give a related but different modification of the von Neumann idea for the generation of normal deviates. There are only two intervals, the center and the tail, and the algorithms are quite different for the two. The expected number of uniform deviates needed is near 6.321, and computation of a square root is required in approximately 16 per cent of the cases. The algorithm of Section 5 above requires no function call, but its main advantage over the Center-Tail method lies in requiring about two-thirds the number of uniform deviates. This should be reflected in a shorter average time of execution.

The Dieter-Ahrens algorithm for the center interval closely resembles my algorithm for each interval, and the proofs are very close to those given above. The big difference is that in [1] all variables u_i have the cumulative distribution function x^2 (0 ≤ x < 1), and the comparisons are of the form

        u_{n+1} ≥ u_n < u_{n-1} < ... < u_2 < u_1 .

In contrast, in this paper all variables u_i have uniform distributions, and the comparisons take the form (for the principal case k = 1):

        u_{n+1} ≥ u_n < u_{n-1} < ... < u_2 < u_1^2/2 .

Changing the distribution function in [1] costs an extra uniform deviate and a comparison for each u_i, whereas forming u_1^2/2 = G_1(w) in Section 5 is done only once for each chain of u_i's. Moreover, the fact that u_1^2/2 is usually small means that most of the time u_1 is accepted immediately. This contributes to keeping N̄ low in my algorithm. Finally, the use of G_k(w) makes it possible to use the von Neumann technique in any interval in which G_k(w) can be evaluated.

In a more recent manuscript [2] Dieter and Ahrens have improved their Center-Tail method so that the comparisons are simpler and the expected number of uniform deviates needed is reduced to near 5.236. According to the authors, the improved Center-Tail method is still somewhat slower than my algorithm.

7. Further generalizations.

Let f(x) (-∞ < x < ∞) be the probability density function of a random variable F. Under what conditions on f could the von Neumann idea be applied to pick a sample from F? It is sufficient that the interval (-∞, ∞) be the union of a set of abutting intervals I_k = [q_{k-1}, q_k] (k = ..., -2, -1, 0, 1, 2, ...) such that in each closed interval I_k either f(x) ≡ 0 or the following three conditions all hold: f(x) > 0, f is absolutely continuous, and f is monotonic.

Then a preliminary game can be played to select an interval I_k. If b(x) = -f'(x)/f(x) ≥ 0 in I_k, the algorithm of Section 2 can be adapted to select a value in I_k with a density distribution proportional to f(x). (It may be necessary to subdivide I_k so that (5) and (6) hold.) If b(x) < 0 in I_k, change x to -x and follow an analogous algorithm.

Two kinds of computation are required:

(a) One must evaluate various integrals like ∫_a^x f(t) dt in order to determine the parameters needed to pick the intervals I_k during execution, and to evaluate the needed r_k, q_k, and d_k. These computations have to be done only once, in designing the algorithm.

(b) One must evaluate G_k(w) for arbitrary w in [0, d_k] during each execution of the algorithm. Note that

        G_k(w) = B(q_{k-1} + w) - B(q_{k-1}) = ∫_{q_{k-1}}^{q_{k-1}+w} b(t) dt = -ln [f(q_{k-1} + w) / f(q_{k-1})] .

Since only (b) is done on-line, the success of an algorithm would seem to depend only on the ability to evaluate f(x) rapidly. We thus see that having an equation of the type f(x) = C exp(-B(x)) (and hence a solution of (1)) is of great practical advantage, but it is not essential in principle to the use of von Neumann's idea.

References.

1. J. H. Ahrens and U. Dieter, "Computer methods for sampling from the exponential and normal distributions," Comm. Assoc. Computing Mach., vol. 15 (1972), pp. 000-000.

2. U. Dieter and J. Ahrens, "A combinatorial method for the generation of normally distributed random numbers," to appear.

3. John von Neumann, "Various techniques used in connection with random digits" (summary written by George E. Forsythe), pp. 36-38 of Monte Carlo Method, [U.S.] National Bureau of Standards, Applied Mathematics Series, vol. 12 (1951). Reprinted in John von Neumann, Collected Works, vol. 5, pp. 768-770, Pergamon Press, 1963.

February 9, 1972.

[Table of q_k, d_k, r_k, and N(k) to 15 decimals for k = 1, 2, ..., 36: not legible in this copy.]