
Copyright © 2016 by Karl Sigman

1 Inverse Transform Method and some alternative algorithms


Assuming our computer can hand us, upon demand, iid copies of rvs that are uniformly distributed on (0, 1), it is imperative that we be able to use these uniforms to generate rvs of any desired distribution (exponential, Bernoulli, etc.). The first general method that we present is called the inverse transform method.
As a quick example for motivation, suppose that we want a random variable X that has an exponential distribution at rate λ. The cumulative distribution function (CDF) is given by
F(x) = P(X ≤ x) = 1 − e^{−λx},   x ≥ 0.

Then simply define

X = −(1/λ) ln(U),
where ln (y) denotes the natural logarithm of y > 0.
Proof:

P(X ≤ x) = P(−(1/λ) ln(U) ≤ x)
         = P(ln(U) ≥ −λx)
         = P(U ≥ e^{−λx})
         = 1 − e^{−λx}.

(Recall that P(U ≥ y) = 1 − y, y ∈ (0, 1).) It turns out that this kind of clever “transformation” involves only the inverse function of a CDF. We now present this general method.
Let F (x), x ∈ IR, denote any cumulative distribution function (cdf) (continuous or not). In
other words, F (x) = P (X ≤ x), x ∈ IR, for some random variable X. Recall that F : IR −→
[0, 1] is thus a non-negative and non-decreasing (monotone) function that is continuous from
the right and has left-hand limits, with values in [0, 1]; moreover F(∞) = lim_{x→∞} F(x) = 1 and F(−∞) = lim_{x→−∞} F(x) = 0. Our objective is to generate (simulate) rvs X distributed
as F ; that is, we want to simulate a rv X such that P (X ≤ x) = F (x), x ∈ IR.
Define the generalized inverse of F, F^{-1} : [0, 1] −→ IR, via

(1)   F^{-1}(y) = min{x : F(x) ≥ y},   y ∈ [0, 1].
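For instance (an illustration, not in the original notes), if X is a Bernoulli (p) rv, so that F(x) = 0 for x < 0, F(x) = 1 − p for 0 ≤ x < 1, and F(x) = 1 for x ≥ 1, then (1) gives F^{-1}(y) = 0 for 0 < y ≤ 1 − p and F^{-1}(y) = 1 for y > 1 − p; the generalized inverse simply jumps over the flat pieces of F.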

If F is continuous and strictly increasing, then F is invertible, in which case F^{-1}(y) = min{x : F(x) = y} is the ordinary inverse function, and thus F(F^{-1}(y)) = y and F^{-1}(F(x)) = x. In general it holds that F^{-1}(F(x)) ≤ x and F(F^{-1}(y)) ≥ y; moreover, F^{-1}(y) is a non-decreasing (monotone) function of y.
This simple fact yields a simple method for simulating a rv X distributed as F :

Proposition 1.1 (The Inverse Transform Method) Let F(x), x ∈ IR, denote any cumulative distribution function (cdf) (continuous or not). Let F^{-1}(y), y ∈ [0, 1], denote the inverse function defined in (1). Define X = F^{-1}(U), where U has the continuous uniform distribution over the interval (0, 1). Then X is distributed as F, that is, P(X ≤ x) = F(x), x ∈ IR.

Proof: We must show that P(F^{-1}(U) ≤ x) = F(x), x ∈ IR. First suppose that F is continuous and strictly increasing, so that F^{-1} is the ordinary inverse. We will show the equality of events {F^{-1}(U) ≤ x} = {U ≤ F(x)}; taking probabilities (and letting a = F(x) in P(U ≤ a) = a) then yields the result: P(F^{-1}(U) ≤ x) = P(U ≤ F(x)) = F(x).
To this end: F(F^{-1}(y)) = y, and so (by monotonicity of F) if F^{-1}(U) ≤ x, then U = F(F^{-1}(U)) ≤ F(x), that is, U ≤ F(x). Similarly F^{-1}(F(x)) = x, and so if U ≤ F(x), then F^{-1}(U) ≤ x. We conclude equality of the two events, as was to be shown. In the general (continuous or not) case, it is easily shown that

{U < F(x)} ⊆ {F^{-1}(U) ≤ x} ⊆ {U ≤ F(x)},

which yields the same result after taking probabilities (since P(U = F(x)) = 0, because U is a continuous rv).
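As a minimal Python sketch of Proposition 1.1 (not part of the original notes): it assumes we are handed a callable F_inv implementing the generalized inverse (1), and it uses Python's random.random() as the source of unif(0, 1) rvs.

import math
import random

def inverse_transform(F_inv):
    """Draw one copy of X = F_inv(U), where U ~ unif(0, 1).

    F_inv is assumed to implement the generalized inverse (1):
    F_inv(y) = min{x : F(x) >= y}.
    """
    U = random.random()  # U ~ unif(0, 1)
    return F_inv(U)

# Usage: the exponential rv of the motivating example, with F_inv(y) = -(1/lam) ln(1 - y)
lam = 2.0
X = inverse_transform(lambda y: -math.log(1.0 - y) / lam)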

1.1 Examples
The inverse transform method can be used in practice as long as we are able to get an explicit
formula for F −1 (y) in closed form. We illustrate with some examples. We use the notation
U ∼ unif (0, 1) to denote that U is a rv with the continuous uniform distribution over the
interval (0, 1).

1. Exponential distribution: F(x) = 1 − e^{−λx}, x ≥ 0, where λ > 0 is a constant. Solving the equation y = 1 − e^{−λx} for x in terms of y ∈ (0, 1) yields x = F^{-1}(y) = −(1/λ) ln(1 − y). This yields X = −(1/λ) ln(1 − U). But (as is easily checked, see below) 1 − U has a unif(0, 1) distribution too, since U does, and thus we can simplify the algorithm by replacing 1 − U by U:

Algorithm for generating an exponential rv at rate λ:

i Generate U ∼ unif(0, 1).

ii Set X = −(1/λ) ln(U).

P(1 − U ≤ x) = P(U ≥ 1 − x) = 1 − (1 − x) = x, x ∈ (0, 1); indeed 1 − U is unif(0, 1).
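A minimal Python sketch of this two-step algorithm (the function name exponential_rv and the use of the standard random module are illustrative choices, not from the notes):

import math
import random

def exponential_rv(lam):
    """Generate X ~ exponential(rate = lam) via the inverse transform method."""
    # random.random() returns values in [0, 1); using 1 - U keeps the argument of
    # the logarithm strictly positive (and 1 - U is unif(0, 1) as well).
    U = 1.0 - random.random()   # step i
    return -math.log(U) / lam   # step ii: X = -(1/lam) ln(U)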

2. Discrete rvs: discrete inverse-transform method: Consider a non-negative discrete rv X with probability mass function (pmf) p(k) = P(X = k), k ≥ 0. In this case, F(x) is not continuous, but the construction X = F^{-1}(U) is explicitly given by (see proof below):
Algorithm for generating a discrete random variable:

(a) Generate U ∼ unif (0, 1).


(b) Set X = 0 if U ≤ p(0); otherwise set

X = k,  if  \sum_{i=0}^{k−1} p(i) < U ≤ \sum_{i=0}^{k} p(i),  k ≥ 1.

This is known as the discrete inverse-transform method. It easily can be extended to cover discrete random variables that are not necessarily non-negative. The algorithm is easily verified directly by recalling that P(a < U ≤ b) = b − a, for 0 ≤ a < b ≤ 1; here we use a = \sum_{i=0}^{k−1} p(i) < b = \sum_{i=0}^{k} p(i), and so b − a = p(k).
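A minimal Python sketch of the discrete inverse-transform method, assuming the pmf is supplied as a finite list p with p[k] = P(X = k) (the names are illustrative); for an infinite support one would generate further terms of the pmf on the fly:

import random

def discrete_inverse_transform(p):
    """Generate X with pmf p, where p[k] = P(X = k), k = 0, 1, ..., len(p) - 1."""
    U = random.random()
    F = 0.0                      # running cdf value, F = p(0) + ... + p(k)
    for k, pk in enumerate(p):
        F += pk
        if U <= F:               # smallest k with F(k) >= U
            return k
    return len(p) - 1            # guard against round-off when sum(p) is slightly below 1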

3. Bernoulli (p) and Binomial (n, p) distributions:


Suppose we want to generate a Bernoulli (p) rv X, in which case P (X = 0) = 1 − p and
P (X = 1) = p for some p ∈ (0, 1). Then the discrete inverse-transform method yields:
Algorithm for generating a Bernoulli (p) rv X:

i Generate U ∼ unif (0, 1).


ii Set X = 0 if U ≤ 1 − p; X = 1 if U > 1 − p.

Since 1 − U is also unif (0, 1), we can re-do the above by replacing U by 1 − U and obtain
Another Algorithm for generating a Bernoulli (p) rv X:

i Generate U ∼ unif (0, 1).


ii Set X = 1 if U ≤ p; X = 0 if U > p.

Suppose we want X to have a binomial (n, p) distribution, that is,

p(k) = P(X = k) = \binom{n}{k} p^k (1 − p)^{n−k},   0 ≤ k ≤ n.

One could, in principle, use the discrete inverse-transform method with these p(k), but we also can note that X can be represented (in distribution) as the sum of n iid Bernoulli (p) rvs, Y1, . . . , Yn:

X = \sum_{i=1}^{n} Yi,

the number of successes out of n iid Bernoulli (p) trials.


Alternative algorithm for generating a binomial (n, p) rv X:

i Generate n iid rvs U1, U2, . . . , Un ∼ unif(0, 1).

ii For each 1 ≤ i ≤ n, set Yi = 1 if Ui ≤ p; Yi = 0 if Ui > p. (This yields n iid Bernoulli (p) rvs.)

iii Set X = \sum_{i=1}^{n} Yi.

The advantage of this algorithm is its simplicity: we do not need to do the various computations involving the p(k). On the other hand, this algorithm requires n uniforms for each copy of X, versus only one uniform when using the discrete inverse-transform method. Thus we might not want to use this algorithm when n is quite large.
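A Python sketch of both pieces: the Bernoulli algorithm above and the alternative binomial algorithm built from it (the function names are mine):

import random

def bernoulli_rv(p):
    """Generate Y ~ Bernoulli(p): Y = 1 if U <= p, Y = 0 if U > p."""
    return 1 if random.random() <= p else 0

def binomial_rv(n, p):
    """Generate X ~ binomial(n, p) as the sum of n iid Bernoulli(p) rvs."""
    return sum(bernoulli_rv(p) for _ in range(n))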
Poisson approximation to the binomial distribution
In fact, when n is very large and p is small, it can be proved (there is a theorem lurking here, stated below) that the distribution of X is very approximately the Poisson distribution with mean np: For α > 0, consider a sequence of Binomial (n, p(n)) rvs Xn in which p(n) = α/n, n ≥ 1. Note how E(Xn) = np(n) = α, n ≥ 1, but the distribution of Xn is changing as n increases: more Bernoulli trials are performed but with a decreasing probability of success; p(n) → 0 as n → ∞, even though each Xn has the same expected number of successes, α.
Then Xn converges in distribution to the Poisson distribution with mean α:

lim_{n→∞} P(Xn = k) = e^{−α} α^k / k!,   k ≥ 0.

The proof is based on writing out

P(Xn = k) = \binom{n}{k} (α/n)^k (1 − α/n)^{n−k},

taking the limit as n → ∞ and showing that it converges to e^{−α} α^k / k!, k ≥ 0.
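For completeness, a sketch of that computation (the factorization below is standard but not spelled out in the original): write

P(Xn = k) = \binom{n}{k} (α/n)^k (1 − α/n)^{n−k} = (α^k / k!) · [n(n − 1) · · · (n − k + 1) / n^k] · (1 − α/n)^n (1 − α/n)^{−k}.

As n → ∞ with k fixed, the bracketed factor tends to 1, (1 − α/n)^n → e^{−α}, and (1 − α/n)^{−k} → 1, which yields the Poisson limit e^{−α} α^k / k!.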
As an application, suppose a circuit board has n = 1000 independent components within,
each defective with probability p = 1/1000. Then letting X denote the total number of
defective components within the circuit, it has a binomial (1000, 1/1000) distribution, but
we can say that X approximately has a Poisson distribution with mean 1000(1/1000) = 1.
Thus, the probability that none of the components are defective, P(X = 0), can be closely approximated (to 3 decimal places) by e^{−1} ≈ 0.368. The exact answer is (1 − p)^{1000} = (0.999)^{1000} ≈ 0.368. This motivates our next example.
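A throwaway Python check of these numbers (not from the notes):

import math

n, p = 1000, 1.0 / 1000.0
exact = (1.0 - p) ** n        # P(X = 0) for the binomial(1000, 1/1000) rv
approx = math.exp(-n * p)     # Poisson approximation with mean np = 1, i.e. e^{-1}
print(exact, approx)          # both print as roughly 0.368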

4. Poisson distribution with mean α:


In this case

p(k) = P(X = k) = e^{−α} α^k / k!,   k ≥ 0.
We could thus use the discrete inverse-transform method, but of course it involves computing (in advance) pieces like α^k / k!. Here we present an alternative algorithm that makes use of properties of a Poisson process at rate α. The trick is to recall that if {N(t) : t ≥ 0} is the counting process of a Poisson process at rate α, then N(t) has a Poisson distribution with mean αt; in particular (t = 1), N(1) has a Poisson distribution with mean α. Thus if we can simulate N(1), then we can set X = N(1) and we are done. Let Y = N(1) + 1, and let tn = X1 + · · · + Xn denote the nth point of the Poisson process; the Xi are iid with an exponential distribution at rate α. Note that Y = min{n ≥ 1 : tn > 1} = min{n ≥ 1 : X1 + · · · + Xn > 1}, a stopping time. Using the inverse transform method to generate the iid exponential interarrival times Xi, we can represent Xi = −(1/α) ln(Ui). We then can rewrite (recalling that ln(xy) = ln(x) + ln(y))

Y = min{n ≥ 1 : ln(U1) + · · · + ln(Un) < −α}
  = min{n ≥ 1 : ln(U1 · · · Un) < −α}
  = min{n ≥ 1 : U1 · · · Un < e^{−α}}.

We thus can simulate Y by consecutively generating independent uniforms Ui and taking the product until the product first falls below e^{−α}. The number of uniforms required yields Y. Then we get X = N(1) = Y − 1 as our desired Poisson rv. Here then is the resulting algorithm:
Alternative algorithm for generating a Poisson rv X with mean α:

i Set X = 0, P = 1.
ii Generate U ∼ unif(0, 1), set P = U P.
iii If P < e^{−α}, then stop and output X. Otherwise (P ≥ e^{−α}), set X = X + 1 and go back to ii.

Note that, unlike the inverse transform method, which would require only one U, the above algorithm requires a random number of Ui (Y = X + 1 uniforms, to be precise), unknown in advance. On the other hand, this algorithm does not require computing pieces like α^k / k!. On average, the number of Ui required is E(X + 1) = α + 1. If α is not too big, then this algorithm can be considered very efficient.
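A Python sketch of this product-of-uniforms algorithm (the name poisson_rv is mine):

import math
import random

def poisson_rv(alpha):
    """Generate X ~ Poisson(mean = alpha) by multiplying uniforms until the
    running product first falls below e^{-alpha}."""
    threshold = math.exp(-alpha)
    X = 0
    P = 1.0
    while True:
        P *= random.random()     # step ii: P = U * P with a fresh U ~ unif(0, 1)
        if P < threshold:        # step iii: stop; Y = X + 1 uniforms were used
            return X
        X += 1                   # otherwise increment X and generate another U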
