5th Unit Random Variate Generation
06CS82
5.1 Inverse Transform Technique
The inverse transform technique can be used to sample from the exponential, uniform, Weibull, and triangular distributions, and from empirical distributions. It is also the underlying principle for sampling from a wide variety of discrete distributions.
5.1.1 Exponential Distribution
Step 1: Compute the cdf of the desired random variable X. For the exponential distribution with rate $\lambda$, the cdf is $F(x) = 1 - e^{-\lambda x}$, $x \ge 0$.
Step 2: Set $F(X) = R$ on the range of X. For the exponential distribution, this becomes $1 - e^{-\lambda X} = R$ on the range $X \ge 0$. Since X is a random variable (with the exponential distribution in this case), it follows that $1 - e^{-\lambda X}$ is also a random variable, here called R. R has a uniform distribution over the interval [0,1].
Step 3: Solve the equation $F(X) = R$ for X in terms of R. For the exponential distribution, the solution proceeds as follows:
    $1 - e^{-\lambda X} = R$
    $e^{-\lambda X} = 1 - R$
    $-\lambda X = \ln(1 - R)$
    $X = -\frac{1}{\lambda} \ln(1 - R)$    (5.1)
Equation (5.1) is called a random-variate generator for the exponential distribution. In general, Equation (5.1) is written as $X = F^{-1}(R)$. Generating a sequence of values is accomplished through step 4.
Step 4: Generate uniform random numbers $R_1, R_2, R_3, \ldots$ and compute the desired random variates by $X_i = F^{-1}(R_i)$. For the exponential case, $F^{-1}(R) = -\frac{1}{\lambda} \ln(1 - R)$ by Equation (5.1), so that
    $X_i = -\frac{1}{\lambda} \ln(1 - R_i)$    (5.2)
for i = 1, 2, 3, .... One simplification that is usually employed in Equation (5.2) is to replace $1 - R_i$ by $R_i$ to yield
    $X_i = -\frac{1}{\lambda} \ln R_i$    (5.3)
which is justified because both $R_i$ and $1 - R_i$ are uniformly distributed on [0,1].
Table 5.1. Generation of Exponential Variates $X_i$ with Mean 1, Given Random Numbers $R_i$
    i    R_i       X_i
    1
    2
    3
    4    0.7965    1.592
    5    0.7696    1.468
Question 1: Generate 5 exponential random variates with mean 1, given R1 = 0.3056, R2 = 0.8591, R3 = 0.0568, R4 = 0.7891, and R5 = 0.3653.
Solution: With mean 1 the generator is $X_i = -\ln(1 - R_i)$, so
    $X_1 = -\ln(1 - R_1) = -\ln(1 - 0.3056) = 0.3647$
    $X_2 = -\ln(1 - R_2) = -\ln(1 - 0.8591) = 1.9597$
and similarly for the remaining values:
    i    R_i       X_i
    1    0.3056    0.3647
    2    0.8591    1.9597
    3    0.0568    0.0584
    4    0.7891    1.5563
    5    0.3653    0.4546
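The calculation in Question 1 can be reproduced with a few lines of Python. This is only a minimal sketch: the function name `exponential_variate` and the parameter name `lam` (the rate, equal to 1/mean) are illustrative choices, not part of the original notes.

```python
import math

def exponential_variate(r, lam=1.0):
    """Inverse-transform generator X = -(1/lam) * ln(1 - r).

    r is a random number in [0, 1); lam is the rate (1 / mean).
    """
    return -math.log(1.0 - r) / lam

# Random numbers R1..R5 from Question 1; mean 1 corresponds to lam = 1.
randoms = [0.3056, 0.8591, 0.0568, 0.7891, 0.3653]
variates = [exponential_variate(r) for r in randoms]

for i, (r, x) in enumerate(zip(randoms, variates), start=1):
    print(f"{i}  R = {r:.4f}  X = {x:.4f}")
```

The printed values agree with the solution table to within rounding in the last digit.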
Chandrashekar B, Asst. Professor, MIT-Mysore
Figure 5.1. (a) Empirical histogram of 200 uniform random numbers; (b) empirical histogram of 200 exponential variates; (c) theoretical uniform density on [0,1]; (d) theoretical exponential density with mean 1.
Figure 5.2 gives a graphical interpretation of the inverse transform technique. The cdf shown is $F(x) = 1 - e^{-x}$, an exponential distribution with rate $\lambda = 1$. To generate a value $X_1$ with cdf $F(x)$, first a random number $R_1$ between 0 and 1 is generated, then a horizontal line is drawn from $R_1$ to the graph of the cdf, and finally a vertical line is dropped to the x axis to obtain $X_1$, the desired result. Notice the inverse relation between $R_1$ and $X_1$, namely $R_1 = 1 - e^{-X_1}$ and $X_1 = -\ln(1 - R_1)$. In general, the relation is written as $R_1 = F(X_1)$ and $X_1 = F^{-1}(R_1)$.

Why does the random variable $X_1$ generated by this procedure have the desired distribution? Pick a value $x_0$ and compute the cumulative probability
    $P(X_1 \le x_0) = P(R_1 \le F(x_0)) = F(x_0)$    (5.4)
To see the first equality in Equation (5.4), refer to Figure 5.2, where the fixed numbers $x_0$ and $F(x_0)$ are drawn on their respective axes. It can be seen that $X_1 \le x_0$ when and only when $R_1 \le F(x_0)$. Since $0 \le F(x_0) \le 1$, the second equality in Equation (5.4) follows immediately from the fact that $R_1$ is uniformly distributed on [0,1]. Equation (5.4) shows that the cdf of $X_1$ is F; hence, $X_1$ has the desired distribution.
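The argument behind Equation (5.4) can also be checked empirically. The sketch below generates many exponential variates with rate 1 by the inverse transform and compares the observed fraction that fall at or below a point x0 with the exact cdf value. The helper name `empirical_cdf_at`, the seed, the sample size, and the choice x0 = 1 are all arbitrary assumptions made for illustration.

```python
import math
import random

def empirical_cdf_at(x0, samples):
    """Fraction of samples that are <= x0, an estimate of P(X <= x0)."""
    return sum(1 for x in samples if x <= x0) / len(samples)

rng = random.Random(12345)  # fixed seed so the run is repeatable

# Inverse transform for the exponential distribution with rate 1.
samples = [-math.log(1.0 - rng.random()) for _ in range(100_000)]

x0 = 1.0
estimate = empirical_cdf_at(x0, samples)
exact = 1.0 - math.exp(-x0)  # F(x0) = 1 - e^{-x0}

print(f"empirical P(X <= {x0}) = {estimate:.4f}, exact F({x0}) = {exact:.4f}")
```

With this many samples the two numbers should differ only in the third decimal place or so, which is what Equation (5.4) predicts.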
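5.1.2 Uniform Distribution
Consider a random variable X that is uniformly distributed on the interval [a, b]. A reasonable guess for generating X is given by
    $X = a + (b - a)R$    (5.5)
(Recall that R is always a random number on [0,1].) The pdf of X is given by
    $f(x) = \begin{cases} \frac{1}{b - a}, & a \le x \le b \\ 0, & \text{otherwise} \end{cases}$
The derivation of Equation (5.5) follows steps 1 through 3 of Section 5.1.1:
Step 1: The cdf is given by
    $F(x) = \begin{cases} 0, & x < a \\ \frac{x - a}{b - a}, & a \le x \le b \\ 1, & x > b \end{cases}$
Step 2: Set $F(X) = (X - a)/(b - a) = R$.
Step 3: Solving for X in terms of R yields $X = a + (b - a)R$, which agrees with Equation (5.5).

5.1.3 Triangular Distribution
Consider a random variable X that has pdf
    $f(x) = \begin{cases} x, & 0 \le x \le 1 \\ 2 - x, & 1 < x \le 2 \\ 0, & \text{otherwise} \end{cases}$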
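This distribution is called a triangular distribution with endpoints (0, 2) and mode at 1.
Step 1: Its cdf is given by
    $F(x) = \begin{cases} 0, & x \le 0 \\ \frac{x^2}{2}, & 0 < x \le 1 \\ 1 - \frac{(2 - x)^2}{2}, & 1 < x \le 2 \\ 1, & x > 2 \end{cases}$
Step 2: For $0 \le X \le 1$, set $R = X^2/2$; for $1 \le X \le 2$, set $R = 1 - (2 - X)^2/2$.
Step 3: The range $0 \le X \le 1$ implies $0 \le R \le 1/2$, in which case $X = \sqrt{2R}$. The range $1 \le X \le 2$ implies $1/2 \le R \le 1$, in which case $X = 2 - \sqrt{2(1 - R)}$. Thus X is generated by
    $X = \begin{cases} \sqrt{2R}, & 0 \le R \le 1/2 \\ 2 - \sqrt{2(1 - R)}, & 1/2 < R \le 1 \end{cases}$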
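The uniform generator of Section 5.1.2 and the triangular generator of Section 5.1.3 can be sketched as follows. The triangular inverse used here is the one obtained by solving R = X^2/2 on the lower half and R = 1 - (2 - X)^2/2 on the upper half for the triangular distribution on (0, 2) with mode 1. The function names are hypothetical and chosen only for this illustration.

```python
import math

def uniform_variate(r, a, b):
    """Uniform on [a, b] by inverse transform: X = a + (b - a) * r."""
    return a + (b - a) * r

def triangular_variate(r):
    """Triangular on (0, 2) with mode 1, by inverse transform.

    For r <= 1/2 solve r = x^2 / 2; otherwise solve r = 1 - (2 - x)^2 / 2.
    """
    if r <= 0.5:
        return math.sqrt(2.0 * r)
    return 2.0 - math.sqrt(2.0 * (1.0 - r))

# A few sample points across the range of r.
for r in (0.0, 0.125, 0.5, 0.875, 1.0):
    print(f"R = {r:.3f}  uniform on [2, 6] -> {uniform_variate(r, 2, 6):.3f}  "
          f"triangular -> {triangular_variate(r):.3f}")
```

Note that R = 1/2 maps to the mode X = 1 from either branch, so the two pieces of the inverse join continuously.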
5.1.4 Weibull Distribution
The Weibull distribution is a model for time to failure of machines or electronic components. When the location parameter $\nu$ is set to zero, its pdf is given by
    $f(x) = \begin{cases} \frac{\beta}{\alpha^{\beta}} x^{\beta - 1} e^{-(x/\alpha)^{\beta}}, & x \ge 0 \\ 0, & \text{otherwise} \end{cases}$
where $\alpha > 0$ and $\beta > 0$ are the scale and shape parameters of the distribution. To generate a Weibull variate, follow steps 1 through 3 of Section 5.1.1:
Step 1: The cdf is given by $F(x) = 1 - e^{-(x/\alpha)^{\beta}}$, $x \ge 0$.
Step 2: Set $F(X) = 1 - e^{-(X/\alpha)^{\beta}} = R$.
Step 3: Solving for X in terms of R yields
    $X = \alpha [-\ln(1 - R)]^{1/\beta}$    (5.6)
By comparing Equations (5.6) and (5.1), it can be seen that, if X is a Weibull variate, then $X^{\beta}$ is an exponential variate with mean $\alpha^{\beta}$. Conversely, if Y is an exponential variate with mean $\mu$, then $Y^{1/\beta}$ is a Weibull variate with shape parameter $\beta$ and scale parameter $\alpha = \mu^{1/\beta}$.

Discrete Distributions
All discrete distributions can be generated using the inverse transform technique, either numerically through a table-lookup procedure or, in some cases, algebraically, with the final generation scheme expressed as a formula. Other techniques are sometimes used for certain distributions, such as the convolution technique for the binomial distribution.

Table 5.5 (the distribution used in Example 1)
    x       0       1       2
    p(x)    0.50    0.30    0.20
    F(x)    0.50    0.80    1.00
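Returning to the Weibull generator of Section 5.1.4, the inverse transform X = alpha * [-ln(1 - R)]^(1/beta) is a one-line function. The sketch below also checks numerically that substituting the generated value back into the Weibull cdf recovers R. The function name and the example parameter values are assumptions made for illustration.

```python
import math

def weibull_variate(r, alpha, beta):
    """Weibull variate with scale alpha and shape beta (location 0).

    Inverse transform: X = alpha * (-ln(1 - r)) ** (1 / beta).
    """
    return alpha * (-math.log(1.0 - r)) ** (1.0 / beta)

alpha, beta, r = 2.0, 1.5, 0.3   # illustrative values only
x = weibull_variate(r, alpha, beta)

# Substituting x into the cdf F(x) = 1 - exp(-(x/alpha)^beta) should give r back.
recovered = 1.0 - math.exp(-((x / alpha) ** beta))
print(f"X = {x:.4f}, F(X) = {recovered:.4f}")

# With beta = 1 the Weibull reduces to an exponential with mean alpha.
print(weibull_variate(r, alpha, 1.0), -alpha * math.log(1.0 - r))
```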
Example: Consider the discrete uniform distribution on {1, 2, ..., k} with pmf and cdf given by
    $p(x) = \frac{1}{k}, \quad x = 1, 2, \ldots, k$
and
    $F(x) = \begin{cases} 0, & x < 1 \\ \frac{1}{k}, & 1 \le x < 2 \\ \frac{2}{k}, & 2 \le x < 3 \\ \vdots \\ \frac{k - 1}{k}, & k - 1 \le x < k \\ 1, & k \le x \end{cases}$
Let $x_i = i$ and $r_i = p(1) + \cdots + p(x_i) = F(x_i) = i/k$ for i = 1, 2, ..., k. Then, by using the table-lookup inequality $r_{i-1} < R \le r_i$, it can be seen that if the generated random number R satisfies
    $r_{i-1} = \frac{i - 1}{k} < R \le r_i = \frac{i}{k}$    (5.7)
then X is generated by setting X = i. Now, Inequality (5.7) can be solved for i:
    $i - 1 < Rk \le i$
    $Rk \le i < Rk + 1$    (5.15)
Let $\lceil y \rceil$ denote the smallest integer $\ge y$. For example, $\lceil 7.82 \rceil = 8$, $\lceil 5.13 \rceil = 6$, and $\lceil -1.32 \rceil = -1$. For $y \ge 0$, $\lceil y \rceil$ is a function that rounds up. This notation and Inequality (5.15) yield a formula for generating X, namely
    $X = \lceil Rk \rceil$    (5.16)
For example, consider generating a random variate X, uniformly distributed on {1, 2, ..., 10}. The variate, X, might represent the number of pallets to be loaded onto a truck. Using Table A.1 as a source of random numbers, R, and Equation (5.16) with k = 10 yields
    R1 = 0.78, X1 = $\lceil 7.8 \rceil$ = 8
    R2 = 0.03, X2 = $\lceil 0.3 \rceil$ = 1
    R3 = 0.23, X3 = $\lceil 2.3 \rceil$ = 3
    R4 = 0.97, X4 = $\lceil 9.7 \rceil$ = 10

Example 1 (An Empirical Discrete Distribution): At the end of the day, the number of shipments on the loading dock of the IHW Company (whose main product is the famous, incredibly huge widget) is 0, 1, or 2, with observed relative frequency of occurrence of 0.50, 0.30, and 0.20, respectively. Internal consultants have been asked to develop a model to improve the efficiency of the loading and hauling operations, and as part of this model they will need to be able to generate values, X, to represent the number of shipments on the loading dock at the end of each day. The consultants decide to model X as a discrete random variable with distribution as given in Table 5.5 and shown in Figure 5.6. The probability mass function (pmf), p(x), is given by
    p(0) = P(X = 0) = 0.50
    p(1) = P(X = 1) = 0.30
    p(2) = P(X = 2) = 0.20
and the cdf, $F(x) = P(X \le x)$, is given by
    $F(x) = \begin{cases} 0, & x < 0 \\ 0.5, & 0 \le x < 1 \\ 0.8, & 1 \le x < 2 \\ 1.0, & 2 \le x \end{cases}$
Figure 5.6. The cdf of number of shipments, X.

Table 5.6. Table for Generating the Discrete Variate X
    i    Input r_i    Output x_i
    1    0.50         0
    2    0.80         1
    3    1.00         2
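Both discrete schemes are short enough to sketch in code: the closed-form rule X = ceil(Rk) for the discrete uniform distribution, and a lookup against the cumulative inputs of Table 5.6 that returns the first output x_i with R <= r_i. The function names are hypothetical, and the use of Python's `bisect` module for the search is an implementation choice of this sketch rather than something prescribed by the notes.

```python
import bisect
import math

def discrete_uniform_variate(r, k):
    """Discrete uniform on {1, ..., k}: X = ceil(r * k).

    max(1, ...) guards the edge case r == 0.0, where ceil(0) would give 0.
    """
    return max(1, math.ceil(r * k))

def table_lookup_variate(r, cumulative, outputs):
    """Return outputs[i] for the first i with r <= cumulative[i].

    bisect_left finds the leftmost position whose cumulative value is >= r,
    which matches the rule r_{i-1} < R <= r_i.
    """
    return outputs[bisect.bisect_left(cumulative, r)]

# Pallet example: k = 10 with the four random numbers from the text.
pallets = [discrete_uniform_variate(r, 10) for r in (0.78, 0.03, 0.23, 0.97)]
print(pallets)

# Table 5.6: inputs r_i and outputs x_i for the shipments example.
CUMULATIVE = [0.50, 0.80, 1.00]
OUTPUTS = [0, 1, 2]
print(table_lookup_variate(0.73, CUMULATIVE, OUTPUTS))
```

A linear scan over the table would work equally well for three rows; a binary search only starts to matter when the table is long.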
Recall that the cdf of a discrete random variable always consists of horizontal line segments with jumps of size p(x) at those points, x, which the random variable can assume. For example, in Figure 5.6 there is a jump of size p(0) = 0.5 at x = 0, of size p(1) = 0.3 at x = 1, and of size p(2) = 0.2 at x = 2. For generating discrete random variables, the inverse transform technique becomes a table-lookup procedure, but unlike the case of continuous variables, interpolation is not required. To illustrate the procedure, suppose that R1 = 0.73 is generated. Graphically, as illustrated in Figure 5.6, first locate R1 = 0.73 on the vertical axis, next draw a horizontal line segment until it hits a "jump" in the cdf, and then drop a perpendicular to the horizontal axis to get the generated variate. Here R1 = 0.73 is transformed to X1 = 1. This procedure is analogous to the procedure
used for empirical continuous distributions, except that the final step of linear interpolation is eliminated.

5.2 Acceptance-Rejection Technique
Suppose that an analyst needed to devise a method for generating random variates, X, uniformly distributed between 1/4 and 1. One way to proceed would be to follow these steps:
Step 1: Generate a random number R.
Step 2a: If $R \ge 1/4$, accept X = R, then go to step 3.
Step 2b: If $R < 1/4$, reject R, and return to step 1.
Step 3: If another uniform random variate on [1/4, 1] is needed, repeat the procedure beginning at step 1. If not, stop.
Each time step 1 is executed, a new random number R must be generated. Step 2a is an "acceptance" and step 2b is a "rejection" in this acceptance-rejection technique. To summarize the technique, random variates (R) with some distribution (here uniform on [0,1]) are generated until some condition ($R \ge 1/4$) is satisfied. When the condition is finally satisfied, the desired random variate, X (here uniform on [1/4, 1]), can be computed (X = R).

This procedure can be shown to be correct by recognizing that the accepted values of R are conditioned values; that is, R itself does not have the desired distribution, but R conditioned on the event {$R \ge 1/4$} does have the desired distribution. To show this, take $1/4 \le a < b \le 1$; then
    $P(a < R \le b \mid 1/4 \le R \le 1) = \frac{P(a < R \le b)}{P(1/4 \le R \le 1)} = \frac{b - a}{3/4}$    (5.28)
which is the correct probability for a uniform distribution on [1/4, 1].
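Steps 1 through 3 of the acceptance-rejection procedure above translate almost directly into a loop. The sketch below is only an illustration: the function name, the seed, and the sample size are assumptions, and Python's `random.Random` stands in for whatever source of random numbers R is actually used.

```python
import random

def uniform_quarter_to_one(rng):
    """Generate X uniform on [1/4, 1) by acceptance-rejection."""
    while True:
        r = rng.random()   # Step 1: generate a random number R
        if r >= 0.25:      # Step 2a: accept X = R
            return r
        # Step 2b: R < 1/4, reject and go back to step 1

rng = random.Random(2024)  # fixed seed so the run is repeatable
samples = [uniform_quarter_to_one(rng) for _ in range(20_000)]

mean = sum(samples) / len(samples)
print(f"min = {min(samples):.4f}, max = {max(samples):.4f}, mean = {mean:.4f}")
```

A uniform distribution on [1/4, 1] has mean 5/8 = 0.625, so the sample mean gives a quick sanity check. On average one in four random numbers is thrown away, which is the price paid for the simplicity of the method.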
Equation (5.28) says that the probability distribution of R, given that R is between 1/4 and 1 (all other values of R are thrown out), is the desired distribution. Therefore, if $1/4 \le R \le 1$, set X = R.

5.2.1 Poisson Distribution
A Poisson random variable, N, with mean $\alpha > 0$ has pmf
    $p(n) = P(N = n) = \frac{e^{-\alpha} \alpha^{n}}{n!}, \quad n = 0, 1, 2, \ldots$    (5.29)
but, more important, N can be interpreted as the number of arrivals from a Poisson arrival process in one unit of time. The interarrival times, $A_1, A_2, \ldots$, of successive customers are exponentially distributed with rate $\alpha$ (i.e., $\alpha$ is the mean number of arrivals per unit time); in addition, an exponential variate can be generated by Equation (5.3). Thus there is a relationship between the (discrete) Poisson distribution and the (continuous) exponential distribution, namely: N = n if and only if
    $A_1 + A_2 + \cdots + A_n \le 1 < A_1 + \cdots + A_n + A_{n+1}$    (5.30)
Equation (5.29), N = n, says there were exactly n arrivals during one unit of time; relation (5.30) says that the nth arrival occurred before time 1 while the (n + 1)st arrival occurred after time 1. Clearly, these two statements are equivalent. Proceed now by generating exponential interarrival times until some arrival, say n + 1, occurs after time 1; then set N = n.

For efficient generation purposes, relation (5.30) is usually simplified by first using Equation (5.3), $A_i = (-1/\alpha) \ln R_i$, to obtain
    $\sum_{i=1}^{n} -\frac{1}{\alpha} \ln R_i \le 1 < \sum_{i=1}^{n+1} -\frac{1}{\alpha} \ln R_i$
Next multiply through by $-\alpha$, which reverses the sign of the inequality, and use the fact that a sum of logarithms is the logarithm of a product, to get
    $\ln \prod_{i=1}^{n} R_i = \sum_{i=1}^{n} \ln R_i \ge -\alpha > \sum_{i=1}^{n+1} \ln R_i = \ln \prod_{i=1}^{n+1} R_i$
Finally, use the relation $e^{\ln x} = x$ to obtain
    $\prod_{i=1}^{n} R_i \ge e^{-\alpha} > \prod_{i=1}^{n+1} R_i$    (5.31)
which is equivalent to relation (5.30). The procedure for generating a Poisson random variate, N, is given by the following steps:
Step 1: Set n = 0, P = 1.
Step 2: Generate a random number $R_{n+1}$ and replace P by $P \cdot R_{n+1}$.
Step 3: If $P < e^{-\alpha}$, then accept N = n. Otherwise, reject the current n, increase n by one, and return to step 2.
Notice that upon completion of step 2, P is equal to the rightmost expression in relation (5.31). The basic idea of a rejection technique is again exhibited; if $P \ge e^{-\alpha}$ in step 3, then n is rejected and the generation process must proceed through at least one more trial.

How many random numbers will be required, on the average, to generate one Poisson variate, N? If N = n, then n + 1 random numbers are required, so the average number is given by
    $E(N + 1) = \alpha + 1$
which is quite large if the mean, $\alpha$, of the Poisson distribution is large.

Example 4: Generate three Poisson variates with mean $\alpha = 0.2$. First compute $e^{-\alpha} = e^{-0.2} = 0.8187$. Next get a sequence of random numbers R from Table A.1 and follow steps 1 to 3 above:
Step 1: Set n = 0, P = 1.
Step 2: R1 = 0.4357, P = 1 · R1 = 0.4357.
Step 3: Since P = 0.4357 < $e^{-\alpha}$ = 0.8187, accept N = 0.
Steps 1-3: (R1 = 0.4146 leads to N = 0.)
Step 1: Set n = 0, P = 1.
Step 2: R1 = 0.8353, P = 1 · R1 = 0.8353.
Step 3: Since $P \ge e^{-\alpha}$, reject n = 0 and return to step 2 with n = 1.
Step 2: R2 = 0.9952, P = R1 · R2 = 0.8313.
Step 3: Since $P \ge e^{-\alpha}$, reject n = 1 and return to step 2 with n = 2.
Step 2: R3 = 0.8004, P = R1 · R2 · R3 = 0.6654.
Step 3: Since $P < e^{-\alpha}$, accept N = 2.

The calculations are summarized below:
    n    R_{n+1}    P         Accept/Reject                    Result
    0    0.4357     0.4357    P < e^{-alpha} (accept)          N = 0
    0    0.4146     0.4146    P < e^{-alpha} (accept)          N = 0
    0    0.8353     0.8353    P >= e^{-alpha} (reject)
    1    0.9952     0.8313    P >= e^{-alpha} (reject)
    2    0.8004     0.6654    P < e^{-alpha} (accept)          N = 2

Five random numbers were used to generate three Poisson variates here (N = 0, N = 0, and N = 2), but in the long run, to generate, say, 1000 Poisson variates with mean $\alpha = 0.2$, it would require approximately $1000(\alpha + 1)$, or 1200, random numbers.
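The three-step Poisson procedure can be sketched as a short function that multiplies random numbers together until the running product drops below e^{-alpha}. To make the run comparable with Example 4, the sketch feeds in the five random numbers quoted there instead of calling a random number generator. The function name and the use of an iterator as the source of random numbers are assumptions of this illustration.

```python
import math

def poisson_variate(alpha, random_numbers):
    """Poisson variate with mean alpha by acceptance-rejection.

    random_numbers is an iterator that yields U(0,1) random numbers.
    """
    threshold = math.exp(-alpha)
    n, p = 0, 1.0                   # Step 1
    while True:
        p *= next(random_numbers)   # Step 2: P = P * R_{n+1}
        if p < threshold:           # Step 3: accept N = n
            return n
        n += 1                      # otherwise reject n and try n + 1

# The five random numbers used in Example 4, consumed in order.
stream = iter([0.4357, 0.4146, 0.8353, 0.9952, 0.8004])
results = [poisson_variate(0.2, stream) for _ in range(3)]
print(results)  # [0, 0, 2]
```

In a simulation the iterator would normally wrap a generator, for example `iter(rng.random, None)` for some `random.Random` instance `rng`.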
5.2.2 Non-stationary Poisson Process (NSPP)
An NSPP is a Poisson arrival process with an arrival rate $\lambda(t)$ that varies with time. The idea behind thinning is to generate a stationary Poisson arrival process at the fastest rate, $\lambda^* = \max \lambda(t)$, but to accept only a portion of the arrivals, thinning out just enough to get the desired time-varying rate.
Step 1: Let $\lambda^* = \max \lambda(t)$ be the maximum of the arrival rate function, and set t = 0 and i = 1.
Step 2: Generate E from the exponential distribution with rate $\lambda^*$ and let t = t + E.
Step 3: Generate a random number R from the U(0,1) distribution. If $R \le \lambda(t)/\lambda^*$, then $T_i = t$ and i = i + 1.
Step 4: Go to step 2.

    t (min)    Arrival rate $\lambda(t)$ (arrivals/min)
    0          1/15
    60         1/12
    120        1/7
    180        1/5
    240        1/8
    300        1/10
    360        1/15
    420        1/20
    480        1/20

Procedure:
Step 1: $\lambda^* = \max \lambda(t) = 1/5$, t = 0 and i = 1.
Step 2: For random number R = 0.2130, E = -5 ln(0.2130) = 7.73 and t = 7.73.
Step 3: Generate R = 0.8830. Here $\lambda(7.73)/\lambda^* = (1/15)/(1/5) = 1/3$. Since R > 1/3, do not generate the arrival.
Step 2: For random number R = 0.5530, E = -5 ln(0.5530) = 2.96 and t = 7.73 + 2.96 = 10.69.
Step 3: Generate R = 0.0240. Here $\lambda(10.69)/\lambda^* = (1/15)/(1/5) = 1/3$. Since R < 1/3, T1 = t = 10.69 and i = i + 1 = 2.
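The thinning steps can be sketched as a loop over pairs of random numbers: the first of each pair drives the exponential interarrival time at the maximum rate, and the second decides whether the candidate arrival is kept. Several things here are assumptions of the sketch and not stated in the notes: the arrival rate is treated as piecewise constant, with each tabulated rate holding from its listed time until the next one; the loop stops at a `horizon` parameter, since Step 4 as written never terminates; and all function and variable names are illustrative.

```python
import math

# (start time in minutes, arrival rate in arrivals per minute) from the table.
RATE_TABLE = [(0, 1/15), (60, 1/12), (120, 1/7), (180, 1/5), (240, 1/8),
              (300, 1/10), (360, 1/15), (420, 1/20), (480, 1/20)]

def arrival_rate(t):
    """Assumed piecewise-constant rate: the last table entry starting at or before t."""
    rate = RATE_TABLE[0][1]
    for start, value in RATE_TABLE:
        if t >= start:
            rate = value
    return rate

def thinning(random_numbers, horizon):
    """Return accepted arrival times, using random numbers (each in (0,1)) in pairs."""
    lam_star = max(rate for _, rate in RATE_TABLE)    # Step 1
    t, arrivals = 0.0, []
    numbers = iter(random_numbers)
    for r_exp, r_accept in zip(numbers, numbers):     # consecutive pairs
        t += -math.log(r_exp) / lam_star              # Step 2: E ~ exponential(lam_star)
        if t > horizon:
            break
        if r_accept <= arrival_rate(t) / lam_star:    # Step 3: keep this arrival
            arrivals.append(t)
    return arrivals

# The four random numbers from the worked procedure.
# Note: -5 * ln(0.2130) evaluates to about 7.73 and -5 * ln(0.5530) to about 2.96.
arrivals = thinning([0.2130, 0.8830, 0.5530, 0.0240], horizon=480.0)
print([round(t, 2) for t in arrivals])
```

The first candidate arrival is thinned out and the second is accepted, so the list holds a single arrival time.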
5.3 Special Properties
These techniques are based on features of a particular family of probability distributions. For example:
    Direct transformation for the normal and lognormal distributions
    Convolution
    Beta distribution (from the gamma distribution)

5.3.1 Direct Transformation for the Normal and Lognormal Distributions
Many methods have been developed for generating normally distributed random variates. The inverse transform technique cannot easily be applied, however, because the inverse cdf cannot be written in closed form. The standard normal cdf is given by
    $\Phi(x) = \int_{-\infty}^{x} \frac{1}{\sqrt{2\pi}} e^{-t^2/2} \, dt, \quad -\infty < x < \infty$
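Consider two standard normal variates, $Z_1$ and $Z_2$, plotted as a point in the plane as shown in Figure 8.7 and represented in polar coordinates as
    $Z_1 = B \cos\theta$
    $Z_2 = B \sin\theta$    (8.26)

[Figure 8.7: the point $(Z_1, Z_2)$ plotted against the $Z_1$-axis and $Z_2$-axis, at radius B and angle $\theta$.]

It is known that $B^2 = Z_1^2 + Z_2^2$ has the chi-square distribution with 2 degrees of freedom, which is equivalent to an exponential distribution with mean 2. Thus, the radius, B, can be generated by the equation
    $B = (-2 \ln R)^{1/2}$    (8.27)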
By the symmetry of the normal distribution, it seems reasonable to suppose, and indeed it is the case, that the angle $\theta$ is uniformly distributed between 0 and $2\pi$ radians. The radius, B, and the angle, $\theta$, are mutually independent. Combining Equations (8.26) and (8.27) gives a direct method for generating two independent standard normal variates, $Z_1$ and $Z_2$, from two independent random numbers, $R_1$ and $R_2$:
    $Z_1 = (-2 \ln R_1)^{1/2} \cos(2\pi R_2)$
    $Z_2 = (-2 \ln R_1)^{1/2} \sin(2\pi R_2)$

5.3.2 Convolution Method
The probability distribution of a sum of two or more independent random variables is called a convolution of the distributions of the original variables. The convolution method thus refers to adding together two or more random variables to obtain a new random variable with the desired distribution. This technique can be applied to obtain Erlang variates and binomial variates.

5.3.3 More Special Properties
There are many relationships among probability distributions that can be exploited for random-variate generation. The convolution method is one example. Another particularly useful example is the relationship between the beta distribution and the gamma distribution.
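Returning to Sections 5.3.1 and 5.3.2, both ideas are easy to sketch. The first function applies the direct transformation to a pair of random numbers; the second builds an Erlang variate by convolution, using the standard fact that an Erlang variate with k phases is the sum of k independent exponential variates. The parameterization of the Erlang by its overall mean (so each phase has mean `mean / k`), the function names, the seed, and the sample sizes are assumptions made for this illustration.

```python
import math
import random

def standard_normal_pair(r1, r2):
    """Direct transformation: two independent N(0,1) variates from r1 in (0,1], r2 in [0,1)."""
    b = math.sqrt(-2.0 * math.log(r1))   # radius B = (-2 ln R1)^(1/2)
    theta = 2.0 * math.pi * r2           # angle uniform on [0, 2*pi)
    return b * math.cos(theta), b * math.sin(theta)

def erlang_variate(k, mean, rng):
    """Convolution: sum of k independent exponentials, each with mean mean / k."""
    return sum(-(mean / k) * math.log(1.0 - rng.random()) for _ in range(k))

rng = random.Random(7)  # fixed seed so the run is repeatable

normals = []
for _ in range(10_000):
    # 1 - random() lies in (0, 1], which keeps log() away from zero.
    z1, z2 = standard_normal_pair(1.0 - rng.random(), rng.random())
    normals.extend((z1, z2))

erlangs = [erlang_variate(3, 6.0, rng) for _ in range(20_000)]

normal_mean = sum(normals) / len(normals)
normal_var = sum((z - normal_mean) ** 2 for z in normals) / len(normals)
erlang_mean = sum(erlangs) / len(erlangs)
print(f"normal mean = {normal_mean:.3f}, variance = {normal_var:.3f}")
print(f"Erlang(k=3) sample mean = {erlang_mean:.3f} (target 6)")
```

The sample mean and variance of the normal variates should come out close to 0 and 1, and the Erlang sample mean close to the requested mean.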
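Suppose that $X_1$ has a gamma distribution with shape parameter $\beta_1$ and scale parameter $\theta_1 = 1/\beta_1$, while $X_2$ has a gamma distribution with shape $\beta_2$ and scale $\theta_2 = 1/\beta_2$ (so that, in the gamma parameterization with mean $1/\theta$, both variables have unit rate $\beta\theta = 1$), and that these two random variables are independent; then
    $Y = \frac{X_1}{X_1 + X_2}$
has a beta distribution with parameters $\beta_1$ and $\beta_2$ on the interval (0, 1). If, instead, we want Y to be defined on the interval (a, b), then set
    $Y = a + (b - a) \frac{X_1}{X_1 + X_2}$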
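A minimal sketch of the beta generator follows. It relies on the standard result that X1 / (X1 + X2) is beta distributed when X1 and X2 are independent gamma variates with a common scale, and it maps the result onto (a, b) by the linear transformation a + (b - a)Y. Python's `random.gammavariate(alpha, beta)` takes a shape and a scale, so both calls use scale 1.0 here; the function name, the seed, and the parameter values are illustrative assumptions.

```python
import random

def beta_variate(beta1, beta2, rng, a=0.0, b=1.0):
    """Beta(beta1, beta2) variate on (a, b) built from two gamma variates."""
    x1 = rng.gammavariate(beta1, 1.0)   # shape beta1, scale 1
    x2 = rng.gammavariate(beta2, 1.0)   # shape beta2, same scale
    return a + (b - a) * x1 / (x1 + x2)

rng = random.Random(99)  # fixed seed so the run is repeatable
unit_samples = [beta_variate(2.0, 3.0, rng) for _ in range(20_000)]
scaled_samples = [beta_variate(2.0, 3.0, rng, a=10.0, b=20.0) for _ in range(20_000)]

unit_mean = sum(unit_samples) / len(unit_samples)
scaled_mean = sum(scaled_samples) / len(scaled_samples)
print(f"mean on (0, 1): {unit_mean:.3f} (theory 0.4)")
print(f"mean on (10, 20): {scaled_mean:.3f} (theory 14.0)")
```

The mean of a beta distribution with parameters 2 and 3 on (0, 1) is 2 / (2 + 3) = 0.4, which gives a simple check on the output.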