Continuous and Discrete Distributions
6.2 USEFUL PROBABILITY DISTRIBUTIONS
The purpose of this section is to discuss a variety of distributions that have been found
to be useful in simulation modeling and to provide a unified listing of relevant prop-
erties of these distributions [see also Forbes et al. (2011); Johnson, Kotz, and
Balakrishnan (1994, 1995); and Johnson, Kotz, and Kemp (1992)]. Section 6.2.1
provides a short discussion of common methods by which continuous distributions
are defined, or parameterized. Then Secs. 6.2.2 and 6.2.3 contain compilations of
several continuous and discrete distributions. Finally, Sec. 6.2.4 suggests how the
data themselves can be used directly to define an empirical distribution.
For a given family of continuous distributions, e.g., normal or gamma, there are
usually several alternative ways to define, or parameterize, the probability density
function. However, if the parameters are defined correctly, they can be classified, on
the basis of their physical or geometric interpretation, as being one of three basic
types: location, scale, or shape parameters.
A location parameter γ specifies an abscissa (x axis) location point of a distribution's range of values; usually γ is the midpoint (e.g., the mean μ for a normal distribution) or lower endpoint (see Sec. 6.8) of the distribution's range. (In the latter case, location parameters are sometimes called shift parameters.) As γ changes, the associated distribution merely shifts left or right without otherwise changing. Also, if the distribution of a random variable X has a location parameter of 0, then the distribution of the random variable Y = X + γ has a location parameter of γ.
A scale parameter β determines the scale (or unit) of measurement of the values in the range of the distribution. (The standard deviation σ is a scale parameter for the normal distribution.) A change in β compresses or expands the associated distribution without altering its basic form. Also, if the distribution of the random variable X has a scale parameter of 1, then the distribution of the random variable Y = βX has a scale parameter of β.
A shape parameter α determines, distinct from location and scale, the basic form or shape of a distribution within the general family of distributions of interest. A change in α generally alters a distribution's properties (e.g., skewness) more fundamentally than a change in location or scale. Some distributions (e.g., exponential and normal) do not have a shape parameter, while others (e.g., beta) may have two.
Table 6.3 gives information relevant to simulation modeling applications for 13 con-
tinuous distributions. Possible applications are given first to indicate some (certainly
not all) uses of the distribution [see Hahn and Shapiro (1994) and Lawless (2003) for
other applications]. Then the density function and distribution function (if it exists in
simple closed form) are listed. Next is a short description of the parameters, includ-
ing their possible values. The range indicates the interval where the associated ran-
dom variable can take on values. Also listed are the mean (expected value), variance,
and mode, i.e., the value at which the density function is maximized. MLE refers to
the maximum-likelihood estimator(s) of the parameter(s), treated later in Sec. 6.5.
General comments include relationships of the distribution under study to other dis-
tributions. Graphs are given of the density functions for each distribution. The nota-
tion following the name of each distribution is our abbreviation for that distribution,
which includes the parameters. The symbol ~ is read "is distributed as."
Note that we have included the less familiar Johnson SB, Johnson SU, log-
logistic, Pearson type V, and Pearson type VI distributions, because we have found
that these distributions often provide a better fit to data sets than standard distribu-
tions such as gamma, lognormal, and Weibull.
TABLE 6.3
Continuous distributions
Uniform U(a, b)
Possible Used as a “first” model for a quantity that is felt to be randomly varying between
applications a and b but about which little else is known. The U(0, 1) distribution is essential
in generating random values from all other distributions (see Chaps. 7 and 8).
Density f(x) = 1/(b − a) if a ≤ x ≤ b, and f(x) = 0 otherwise
(see Fig. 6.5)
Distribution F(x) = 0 if x < a, F(x) = (x − a)/(b − a) if a ≤ x ≤ b, and F(x) = 1 if b < x
Parameters a and b real numbers with a < b; a is a location parameter, b − a is a scale parameter
Range [a, b]
Mean (a + b)/2
Variance (b − a)²/12
Mode Does not uniquely exist
MLE â = min_{1≤i≤n} Xi, b̂ = max_{1≤i≤n} Xi
Comments 1. The U(0, 1) distribution is a special case of the beta distribution (when α1 = α2 = 1).
2. If X ~ U(0, 1) and [x, x + Δx] is a subinterval of [0, 1] with Δx ≥ 0, then
P(X ∈ [x, x + Δx]) = ∫_x^{x+Δx} 1 dy = (x + Δx) − x = Δx
FIGURE 6.5 U(a, b) density function (height 1/(b − a) on [a, b]).
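The role of U(0, 1) in generating values from U(a, b) (the inverse-transform idea of Chaps. 7 and 8) can be sketched in Python; the function name and the use of Python here are illustrative, not from the text:

```python
import random

def uniform_variate(a, b, u=None):
    # Inverse transform of F(x) = (x - a)/(b - a): set F(x) = u and solve,
    # giving x = a + (b - a) * u for u in [0, 1].
    if u is None:
        u = random.random()
    return a + (b - a) * u
```

Passing an explicit u makes the transform easy to check: u = 0 maps to a, u = 1 to b, and u = 0.5 to the midpoint.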
Exponential expo(β)
Possible Interarrival times of “customers” to a system that occur at a constant rate, time
applications to failure of a piece of equipment.
Density f(x) = (1/β)e^{−x/β} if x ≥ 0, and f(x) = 0 otherwise
(see Fig. 6.6)
Distribution F(x) = 1 − e^{−x/β} if x ≥ 0, and F(x) = 0 otherwise
Parameter Scale parameter β > 0
Range [0, ∞)
Mean β
Variance β²
Mode 0
MLE β̂ = X̄(n)
Comments 1. The expo(β) distribution is a special case of both the gamma and Weibull distributions (for shape parameter α = 1 and scale parameter β in both cases).
2. If X1, X2, . . . , Xm are independent expo(β) random variables, then X1 + X2 + . . . + Xm ~ gamma(m, β), also called the m-Erlang(β) distribution.
3. The exponential distribution is the only continuous distribution with the memoryless property (see Prob. 4.30).
FIGURE 6.6 expo(1) density function.
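Since F has a simple closed form, exponential variates come directly from inverting it; the following Python sketch (names illustrative, not from the text) is the standard inverse-transform recipe:

```python
import math
import random

def expo_variate(beta, u=None):
    # Inverting F(x) = 1 - exp(-x/beta) gives x = -beta * ln(1 - u)
    # for u in [0, 1).
    if u is None:
        u = random.random()
    return -beta * math.log(1.0 - u)
```

A quick sanity check: plugging the generated x back into F must recover u.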
Gamma gamma(α, β)
Possible Time to complete some task, e.g., customer service or machine repair
applications
Density f(x) = β^{−α} x^{α−1} e^{−x/β} / Γ(α) if x > 0, and f(x) = 0 otherwise
(see Fig. 6.7)
where Γ(α) is the gamma function, defined by Γ(z) = ∫_0^∞ t^{z−1} e^{−t} dt for any real number z > 0. Some properties of the gamma function: Γ(z + 1) = zΓ(z) for any z > 0, Γ(k + 1) = k! for any nonnegative integer k, Γ(k + 1/2) = √π · 1 · 3 · 5 · · · (2k − 1)/2^k for any positive integer k, Γ(1/2) = √π
Distribution If α is not an integer, there is no closed form. If α is a positive integer, then
F(x) = 1 − e^{−x/β} ∑_{j=0}^{α−1} (x/β)^j / j! if x > 0, and F(x) = 0 otherwise
Parameters Shape parameter α > 0, scale parameter β > 0
Range [0, ∞)
Mean αβ
Variance αβ²
Mode β(α − 1) if α ≥ 1, 0 if α < 1
MLE The following two equations must be satisfied:
ln β̂ + Ψ(α̂) = (∑_{i=1}^{n} ln Xi)/n,  α̂β̂ = X̄(n)
which could be solved numerically. [Ψ(α̂) = Γ′(α̂)/Γ(α̂) and is called the digamma function; Γ′ denotes the derivative of Γ.] Alternatively, approximations to α̂ and β̂ can be obtained by letting T = [ln X̄(n) − ∑_{i=1}^{n} (ln Xi)/n]^{−1}, using Table 6.21 (see App. 6A) to obtain α̂ as a function of T, and letting β̂ = X̄(n)/α̂. [See Choi and Wette (1969) for the derivation of this procedure and of Table 6.21.]
FIGURE 6.7 gamma(α, 1) density functions (α = 1/2, 1, 2, 3).
Comments 1. The expo(β) and gamma(1, β) distributions are the same.
2. For a positive integer m, the gamma(m, β) distribution is called the m-Erlang(β) distribution.
3. The chi-square distribution with k df is the same as the gamma(k/2, 2) distribution.
4. If X1, X2, . . . , Xm are independent random variables with Xi ~ gamma(αi, β), then X1 + X2 + . . . + Xm ~ gamma(α1 + α2 + . . . + αm, β).
5. If X1 and X2 are independent random variables with Xi ~ gamma(αi, β), then X1/(X1 + X2) ~ beta(α1, α2).
6. X ~ gamma(α, β) if and only if Y = 1/X has a Pearson type V distribution with shape and scale parameters α and 1/β, denoted PT5(α, 1/β).
7. lim_{x→0} f(x) = ∞ if α < 1, 1/β if α = 1, and 0 if α > 1
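The two MLE equations above can be solved numerically without special tables. The Python sketch below (an illustration, not the book's procedure) substitutes β = X̄(n)/α into the first equation and bisects on α, approximating the digamma function Ψ by a central difference of the standard-library log-gamma:

```python
import math

def digamma(z, h=1e-6):
    # Central-difference approximation to psi(z) = d/dz ln Gamma(z)
    return (math.lgamma(z + h) - math.lgamma(z - h)) / (2.0 * h)

def gamma_mle(data):
    # Solve ln(beta) + psi(alpha) = mean of ln X_i, with beta = mean/alpha,
    # by bisection on alpha (the left side is increasing in alpha).
    n = len(data)
    mean = sum(data) / n
    mean_log = sum(math.log(x) for x in data) / n
    def g(a):
        return math.log(mean / a) + digamma(a) - mean_log
    lo, hi = 1e-3, 1e3
    for _ in range(200):
        mid = (lo + hi) / 2.0
        if g(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    alpha = (lo + hi) / 2.0
    return alpha, mean / alpha
```

By construction α̂β̂ equals the sample mean exactly, and the remaining equation is satisfied to the bisection tolerance.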
Weibull Weibull(α, β)
Possible Time to complete some task, time to failure of a piece of equipment; used as a
applications rough model in the absence of data (see Sec. 6.11)
Density f(x) = αβ^{−α} x^{α−1} e^{−(x/β)^α} if x > 0, and f(x) = 0 otherwise
(see Fig. 6.8)
Distribution F(x) = 1 − e^{−(x/β)^α} if x > 0, and F(x) = 0 otherwise
Parameters Shape parameter α > 0, scale parameter β > 0
Range [0, ∞)
Mean (β/α)Γ(1/α)
Variance (β²/α){2Γ(2/α) − (1/α)[Γ(1/α)]²}
Mode β[(α − 1)/α]^{1/α} if α ≥ 1, 0 if α < 1
MLE The following two equations must be satisfied:
∑_{i=1}^{n} Xi^{α̂} ln Xi / ∑_{i=1}^{n} Xi^{α̂} − 1/α̂ = ∑_{i=1}^{n} (ln Xi)/n,  β̂ = [∑_{i=1}^{n} Xi^{α̂} / n]^{1/α̂}
The first can be solved for α̂ numerically by Newton's method, and the second equation then gives β̂ directly. The general recursive step for the Newton iterations is
α̂_{k+1} = α̂_k + [A + 1/α̂_k − C_k/B_k] / [1/α̂_k² + (B_k H_k − C_k²)/B_k²]
FIGURE 6.8 Weibull(α, 1) density functions [(a) α = 1/2, 1, 2, 3; (b) α = 4, 5].
where
A = ∑_{i=1}^{n} (ln Xi)/n,  B_k = ∑_{i=1}^{n} Xi^{α̂_k},  C_k = ∑_{i=1}^{n} Xi^{α̂_k} ln Xi
and
H_k = ∑_{i=1}^{n} Xi^{α̂_k} (ln Xi)²
[See Thoman, Bain, and Antle (1969) for these formulas, as well as for confidence intervals on the true α and β.] As a starting point for the iterations, the estimate
α̂_0 = {(6/π²)[∑_{i=1}^{n} (ln Xi)² − (∑_{i=1}^{n} ln Xi)²/n] / (n − 1)}^{−1/2}
[due to Menon (1963) and suggested in Thoman, Bain, and Antle (1969)] may be used. With this choice of α̂_0, it was reported in Thoman, Bain, and Antle (1969) that an average of only 3.5 Newton iterations were needed to achieve four-place accuracy.
Comments 1. The expo(β) and Weibull(1, β) distributions are the same.
2. X ~ Weibull(α, β) if and only if X^α ~ expo(β^α) (see Prob. 6.2).
3. The (natural) logarithm of a Weibull random variable has a distribution known as the extreme-value or Gumbel distribution [see Averill M. Law & Associates (2013), Lawless (2003), and Prob. 8.1(b)].
4. The Weibull(2, β) distribution is also called a Rayleigh distribution with parameter β, denoted Rayleigh(β). If Y and Z are independent normal random variables with mean 0 and variance β² (see the normal distribution), then X = (Y² + Z²)^{1/2} ~ Rayleigh(2^{1/2}β).
5. As α → ∞, the Weibull distribution becomes degenerate at β. Thus, Weibull densities for large α have a sharp peak at the mode.
6. The Weibull distribution has a negative skewness when α > 3.6 [see Fig. 6.8(b)].
7. lim_{x→0} f(x) = ∞ if α < 1, 1/β if α = 1, and 0 if α > 1
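The Newton iteration and Menon starting point above translate directly into code. This Python sketch (function name illustrative; formulas are the Thoman–Bain–Antle ones quoted in the MLE entry) iterates on the shape estimate and then computes the scale in closed form:

```python
import math

def weibull_mle(data, tol=1e-6):
    # Newton's method on the Weibull shape-parameter MLE equation,
    # started at Menon's estimate; then the closed-form scale estimate.
    n = len(data)
    logs = [math.log(x) for x in data]
    A = sum(logs) / n
    num = sum(L * L for L in logs) - sum(logs) ** 2 / n
    a = ((6.0 / math.pi ** 2) * num / (n - 1)) ** -0.5   # Menon's alpha_0
    for _ in range(100):
        B = sum(x ** a for x in data)
        C = sum((x ** a) * L for x, L in zip(data, logs))
        H = sum((x ** a) * L * L for x, L in zip(data, logs))
        step = (A + 1.0 / a - C / B) / (1.0 / a ** 2 + (B * H - C * C) / B ** 2)
        a += step
        if abs(step) < tol:
            break
    beta = (sum(x ** a for x in data) / n) ** (1.0 / a)
    return a, beta
```

After convergence the first MLE equation should hold to roughly the step tolerance, which gives a built-in correctness check.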
σ² = ∑_{i=1}^{m} ∑_{j=1}^{m} bi bj Cij
(where Cij denotes Cov(Xi, Xj)). Note that we need not assume independence of the Xi's. If the Xi's are independent, then
σ² = ∑_{i=1}^{m} bi² Var(Xi)
3. The N(0, 1) distribution is often called the standard or unit normal distribution.
4. If X1, X2, . . . , Xk are independent standard normal random variables, then X1² + X2² + . . . + Xk² has a chi-square distribution with k df, which is also the gamma(k/2, 2) distribution.
FIGURE 6.9 N(0, 1) density function.
5. If X ~ N(μ, σ²), then e^X has the lognormal distribution with scale parameter e^μ and shape parameter σ, denoted LN(μ, σ²).
6. If X ~ N(0, 1), if Y has a chi-square distribution with k df, and if X and Y are independent, then X/√(Y/k) has a t distribution with k df (sometimes called Student's t distribution).
7. If the normal distribution is used to represent a nonnegative quantity (e.g., time), then its density should be truncated at x = 0 (see Sec. 6.8).
8. As σ → 0, the normal distribution becomes degenerate at μ.
FIGURE 6.10 LN(0, σ²) density functions.
MLE μ̂ = ∑_{i=1}^{n} (ln Xi)/n,  σ̂ = [∑_{i=1}^{n} (ln Xi − μ̂)²/n]^{1/2}, MLE for scale parameter = e^{μ̂}
Comments 1. X ~ LN(μ, σ²) if and only if ln X ~ N(μ, σ²). Thus, if one has data X1, X2, . . . , Xn that are thought to be lognormal, the logarithms of the data points ln X1, ln X2, . . . , ln Xn can be treated as normally distributed data for purposes of hypothesizing a distribution, parameter estimation, and goodness-of-fit testing.
2. As σ → 0, the lognormal distribution becomes degenerate at e^μ. Thus, lognormal densities for small σ have a sharp peak at the mode.
3. lim_{x→0} f(x) = 0, regardless of the parameter values.
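Comment 1 and the MLE entry together make lognormal fitting a two-line job: take logs, then apply the normal MLEs. A minimal Python sketch (names illustrative):

```python
import math

def lognormal_mle(data):
    # Fit LN(mu, sigma^2) by treating ln X_1, ..., ln X_n as normal data:
    # mu-hat = mean of the logs, sigma-hat = sqrt of their mean squared
    # deviation (the normal MLEs applied to the logged sample).
    logs = [math.log(x) for x in data]
    n = len(logs)
    mu = sum(logs) / n
    sigma = math.sqrt(sum((y - mu) ** 2 for y in logs) / n)
    return mu, sigma
```

The estimated scale parameter of the lognormal itself is then e^{μ̂}, per the MLE entry above.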
FIGURE 6.11 beta(α1, α2) density functions.
Distribution F(x) = 1 − F_G(1/x) if x > 0, and F(x) = 0 otherwise
where F_G(x) is the distribution function of a gamma(α, 1/β) random variable
Parameters Shape parameter α > 0, scale parameter β > 0
FIGURE 6.12 PT5(α, 1) density functions (α = 1/2, 1, 2, 4).
Range [0, ∞)
Mean β/(α − 1) for α > 1
Variance β²/[(α − 1)²(α − 2)] for α > 2
Mode β/(α + 1)
MLE If one has data X1, X2, . . . , Xn, then fit a gamma(αG, βG) distribution to 1/X1, 1/X2, . . . , 1/Xn, resulting in the maximum-likelihood estimators α̂G and β̂G. Then the maximum-likelihood estimators for the PT5(α, β) distribution are α̂ = α̂G and β̂ = 1/β̂G (see comment 1 below).
Comments 1. X ~ PT5(α, β) if and only if Y = 1/X ~ gamma(α, 1/β). Thus, the Pearson type V distribution is sometimes called the inverted gamma distribution.
2. Note that the mean and variance exist only for certain values of the shape parameter.
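Comment 1 also gives a direct way to evaluate the PT5 density from the gamma density: if Y = 1/X, the change of variables yields f_X(x) = f_Y(1/x)/x². The Python sketch below (function names illustrative) uses exactly this relationship:

```python
import math

def gamma_pdf(y, alpha, beta):
    # gamma(alpha, beta) density, as listed in the table
    return y ** (alpha - 1) * math.exp(-y / beta) / (beta ** alpha * math.gamma(alpha))

def pt5_pdf(x, alpha, beta):
    # X ~ PT5(alpha, beta) iff 1/X ~ gamma(alpha, 1/beta), so by the
    # change of variables y = 1/x: f_X(x) = f_gamma(1/x) / x^2.
    return gamma_pdf(1.0 / x, alpha, 1.0 / beta) / x ** 2
```

Numerically integrating this density recovers total mass 1 and (for α > 1) the listed mean β/(α − 1).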
FIGURE 6.13 PT6(α1, α2, 1) density functions.
Log-logistic LL(α, β)
Possible Time to perform some task
applications
Density f(x) = α(x/β)^{α−1} / {β[1 + (x/β)^α]²} if x > 0, and f(x) = 0 otherwise
(see Fig. 6.14)
Distribution F(x) = 1/[1 + (x/β)^{−α}] if x > 0, and F(x) = 0 otherwise
Parameters Shape parameter α > 0, scale parameter β > 0
Range [0, ∞)
Mean βθ cosecant(θ) for α > 1, where θ = π/α
Variance β²θ{2 cosecant(2θ) − θ[cosecant(θ)]²} for α > 2
Mode β[(α − 1)/(α + 1)]^{1/α} if α > 1, 0 otherwise
MLE Let Yi = ln Xi. Solve the following two equations for â and b̂:
∑_{i=1}^{n} [1 + e^{(Yi − â)/b̂}]^{−1} = n/2    (6.1)
and
∑_{i=1}^{n} [(Yi − â)/b̂] · [1 − e^{(Yi − â)/b̂}] / [1 + e^{(Yi − â)/b̂}] = n    (6.2)
Then the MLEs for the log-logistic distribution are α̂ = 1/b̂ and β̂ = e^{â}. Johnson, Kotz, and Balakrishnan (1995, chap. 23) suggest solving Eqs. (6.1) and (6.2) by using Newton's method.
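Although fitting requires the numerical work above, generating from a fitted LL(α, β) is immediate because F inverts in closed form; a minimal Python sketch (names illustrative, not from the text):

```python
import math
import random

def loglogistic_variate(alpha, beta, u=None):
    # Inverting F(x) = 1 / (1 + (x/beta)^(-alpha)) gives
    # x = beta * (u / (1 - u))**(1/alpha) for u in (0, 1).
    if u is None:
        u = random.random()
    return beta * (u / (1.0 - u)) ** (1.0 / alpha)
```

At u = 0.5 the generated value is the median β, and plugging any generated x back into F recovers u.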
FIGURE 6.14 LL(α, 1) density functions (α = 1/2, 1, 2, 3).
Johnson SB JSB(α1, α2, a, b)
Density f(x) = α2(b − a) / [(x − a)(b − x)√(2π)] · e^{−(1/2)[α1 + α2 ln((x − a)/(b − x))]²} if a < x < b, and f(x) = 0 otherwise
(see Fig. 6.15)
Distribution F(x) = Φ[α1 + α2 ln((x − a)/(b − x))] if a < x < b, and F(x) = 0 otherwise
where Φ(x) is the distribution function of a normal random variable with μ = 0 and σ² = 1
Parameters Location parameter a ∈ (−∞, ∞), scale parameter b − a (b > a), shape parameters α1 ∈ (−∞, ∞) and α2 > 0
Range [a, b]
Mean All moments exist but are extremely complicated [see Johnson, Kotz, and
Balakrishnan (1994, p. 35)]
Mode The density is bimodal when α2 < 1/√2 and
|α1| < √(1 − 2α2²)/α2 − 2α2 tanh^{−1}(√(1 − 2α2²))
[tanh^{−1} is the inverse hyperbolic tangent]; otherwise the density is unimodal. The equation satisfied by any mode x, other than at the endpoints of the range, is
2(x − a)/(b − a) = 1 + α1α2 + α2² ln[(x − a)/(b − x)]
Comments 1. X ~ JSB(α1, α2, a, b) if and only if
α1 + α2 ln[(X − a)/(b − X)] ~ N(0, 1)
2. The density function is skewed to the left, symmetric, or skewed to the right if α1 > 0, α1 = 0, or α1 < 0, respectively.
3. lim_{x→a} f(x) = lim_{x→b} f(x) = 0 for all values of α1 and α2.
4. The four parameters may be estimated using a number of methods [see, for example, Swain, Venkatraman, and Wilson (1988) and Slifker and Shapiro (1980)].
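Comment 1 can be inverted to generate JSB variates from a standard normal: solving α1 + α2 ln[(X − a)/(b − X)] = Z for X gives a logistic-style transform. A minimal Python sketch (names illustrative):

```python
import math
import random

def jsb_variate(a1, a2, a, b, z=None):
    # Inverting comment 1: with Z ~ N(0,1) and y = (Z - a1)/a2,
    # (X - a)/(b - X) = e^y, so X = a + (b - a) / (1 + exp(-y)).
    if z is None:
        z = random.gauss(0.0, 1.0)
    y = (z - a1) / a2
    return a + (b - a) / (1.0 + math.exp(-y))
```

The generated X always lies strictly inside (a, b), matching the bounded range of the distribution; z = α1 maps to the midpoint of the range.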
FIGURE 6.15 JSB(α1, α2, 0, 1) density functions.
Johnson SU JSU(α1, α2, γ, β)
Distribution F(x) = Φ{α1 + α2 ln[(x − γ)/β + √(((x − γ)/β)² + 1)]} for −∞ < x < ∞, where Φ(x) is the distribution function of a normal random variable with μ = 0 and σ² = 1
Parameters Location parameter γ ∈ (−∞, ∞), scale parameter β > 0, shape parameters α1 ∈ (−∞, ∞) and α2 > 0
Range (−∞, ∞)
Mean γ − β e^{1/(2α2²)} sinh(α1/α2), where sinh is the hyperbolic sine
Mode The equation satisfied by the mode, other than at the endpoints of the range, is x = γ + βy, where y satisfies
y + α1α2√(y² + 1) + α2²√(y² + 1) ln(y + √(y² + 1)) = 0
Comments 1. X ~ JSU(α1, α2, γ, β) if and only if
α1 + α2 ln[(X − γ)/β + √(((X − γ)/β)² + 1)] ~ N(0, 1)
2. The density function is skewed to the left, symmetric, or skewed to the right if α1 > 0, α1 = 0, or α1 < 0, respectively.
3. The four parameters may be estimated by a number of methods [see, for example, Swain, Venkatraman, and Wilson (1988) and Slifker and Shapiro (1980)].
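Since ln[u + √(u² + 1)] is the inverse hyperbolic sine, comment 1 inverts to X = γ + β sinh[(Z − α1)/α2], giving a one-line JSU generator; a minimal Python sketch (names illustrative):

```python
import math
import random

def jsu_variate(a1, a2, gamma, beta, z=None):
    # Inverting comment 1 via ln[u + sqrt(u^2 + 1)] = asinh(u):
    # X = gamma + beta * sinh((Z - a1)/a2) with Z ~ N(0,1).
    if z is None:
        z = random.gauss(0.0, 1.0)
    return gamma + beta * math.sinh((z - a1) / a2)
```

Unlike the JSB transform, this one is unbounded on both sides, matching the range (−∞, ∞).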
FIGURE 6.16 JSU(α1, α2, 0, 1) density functions.
Triangular triang(a, b, m)
Possible Used as a rough model in the absence of data (see Sec. 6.11)
applications
Density f(x) = 2(x − a)/[(b − a)(m − a)] if a ≤ x ≤ m, f(x) = 2(b − x)/[(b − a)(b − m)] if m < x ≤ b, and f(x) = 0 otherwise
(see Fig. 6.17)
Distribution F(x) = 0 if x < a, F(x) = (x − a)²/[(b − a)(m − a)] if a ≤ x ≤ m, F(x) = 1 − (b − x)²/[(b − a)(b − m)] if m < x ≤ b, and F(x) = 1 if b < x
Parameters a, b, and m real numbers with a < m < b. a is a location parameter, b − a is a scale parameter, m is a shape parameter
Range [a, b]
Mean (a + b + m)/3
Variance (a² + b² + m² − ab − am − bm)/18
Mode m
MLE Our use of the triangular distribution, as described in Sec. 6.11, is as a rough model when there are no data. Thus, MLEs are not relevant.
Comment The limiting cases as m → b and m → a are called the right triangular and left triangular distributions, respectively, and are discussed in Prob. 8.7. For a = 0 and b = 1, both the left and right triangular distributions are special cases of the beta distribution.
FIGURE 6.17 triang(a, b, m) density function (height 2/(b − a) at x = m).
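The piecewise-quadratic F above inverts branch by branch, so triangular variates need only a square root; a minimal Python sketch (names illustrative, not from the text):

```python
import math
import random

def triangular_variate(a, b, m, u=None):
    # Inverting the piecewise-quadratic F(x): the left branch applies
    # when u <= F(m) = (m - a)/(b - a), the right branch otherwise.
    if u is None:
        u = random.random()
    cut = (m - a) / (b - a)
    if u <= cut:
        return a + math.sqrt(u * (b - a) * (m - a))
    return b - math.sqrt((1.0 - u) * (b - a) * (b - m))
```

At u = F(m) both branches meet at the mode m, and plugging a generated x back into the matching branch of F recovers u.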
The descriptions of the six discrete distributions in Table 6.4 follow the same pattern
as for the continuous distributions in Table 6.3.
In some situations we might want to use the observed data themselves to specify
directly (in some sense) a distribution, called an empirical distribution, from which
random values are generated during the simulation, rather than fitting a theoretical
distribution to the data. For example, it could happen that we simply cannot find a
theoretical distribution that fits the data adequately (see Secs. 6.4 through 6.6). This
section explores ways of specifying empirical distributions.
For continuous random variables, the type of empirical distribution that can be
defined depends on whether we have the actual values of the individual original
observations X1, X2, . . . , Xn rather than only the number of Xi’s that fall into each of
several specified intervals. (The latter case is called grouped data or data in the
form of a histogram.) If the original data are available, we can define a continuous,
piecewise-linear distribution function F by first sorting the Xi’s into increasing
TABLE 6.4
Discrete distributions
Bernoulli Bernoulli( p)
Possible applications Random occurrence with two possible outcomes; used to generate other dis-
crete random variates, e.g., binomial, geometric, and negative binomial
Mass p(x) = 1 − p if x = 0, p if x = 1, and 0 otherwise
(see Fig. 6.18)
Distribution F(x) = 0 if x < 0, 1 − p if 0 ≤ x < 1, and 1 if 1 ≤ x
Parameter p ∈ (0, 1)
Range {0, 1}
Mean p
Variance p(1 − p)
Mode 0 if p < 1/2, 0 and 1 if p = 1/2, 1 if p > 1/2
MLE p̂ = X̄(n)
Comments 1. A Bernoulli(p) random variable X can be thought of as the outcome of an experiment that either "fails" or "succeeds." If the probability of success is p, and we let X = 0 if the experiment fails and X = 1 if it succeeds, then X ~ Bernoulli(p). Such an experiment, often called a Bernoulli trial, provides a convenient way of relating several other discrete distributions to the Bernoulli distribution.
2. If t is a positive integer and X1, X2, . . . , Xt are independent Bernoulli(p) random variables, then X1 + X2 + . . . + Xt has the binomial distribution with parameters t and p. Thus, a binomial random variable can be thought of as the number of successes in a fixed number of independent Bernoulli trials.
3. Suppose we begin making independent replications of a Bernoulli trial with probability p of success on each trial. Then the number of failures before observing the first success has a geometric distribution with parameter p. For a positive integer s, the number of failures before
FIGURE 6.18 Bernoulli(p) mass function (p > 0.5 here).
observing the sth success has a negative binomial distribution with parameters s and p.
4. The Bernoulli(p) distribution is a special case of the binomial distribution (with t = 1 and the same value for p).
FIGURE 6.19 DU(i, j) mass function (each mass 1/(j − i + 1)).
Binomial bin(t, p)
Possible applications Number of successes in t independent Bernoulli trials with probability p of
success on each trial; number of “defective” items in a batch of size t;
number of items in a batch (e.g., a group of people) of random size;
number of items demanded from an inventory
Mass p(x) = C(t, x) p^x (1 − p)^{t−x} if x ∈ {0, 1, . . . , t}, and 0 otherwise
(see Fig. 6.20)
where C(t, x) is the binomial coefficient, defined by C(t, x) = t!/[x!(t − x)!]
Distribution F(x) = 0 if x < 0, ∑_{i=0}^{⌊x⌋} C(t, i) p^i (1 − p)^{t−i} if 0 ≤ x ≤ t, and 1 if t < x
Parameters t a positive integer, p ∈ (0, 1)
Range {0, 1, . . . , t}
FIGURE 6.20 bin(t, p) mass functions (t = 5, 10; p = 0.1, 0.5).
Mean tp
Variance tp(1 − p)
Mode p(t + 1) − 1 and p(t + 1) if p(t + 1) is an integer, ⌊p(t + 1)⌋ otherwise
MLE If t is known, then p̂ = X̄(n)/t. If both t and p are unknown, then t̂ and p̂ exist if and only if X̄(n) > (n − 1)S²(n)/n = V(n). Then the following approach could be taken. Let M = max_{1≤i≤n} Xi, and for k = 0, 1, . . . , M, let fk be the number of Xi's ≥ k. Then it can be shown that t̂ and p̂ are the values for t and p that maximize the function
g(t, p) = ∑_{k=1}^{M} fk ln(t − k + 1) + nt ln(1 − p) + nX̄(n) ln[p/(1 − p)]
subject to the constraints that t ∈ {M, M + 1, . . .} and 0 < p < 1. It is easy to see that for a fixed value of t, say t0, the value of p that maximizes g(t0, p) is X̄(n)/t0, so t̂ and p̂ are the values of t and X̄(n)/t that lead to the largest value of g[t, X̄(n)/t] for t ∈ {M, M + 1, . . . , M′}, where M′ is given by [see DeRiggi (1983)]
M′ = ⌊X̄(n)(M − 1)/{1 − [V(n)/X̄(n)]}⌋
Note also that g[t, X̄(n)/t] is a unimodal function of t.
Comments 1. If Y1, Y2, . . . , Yt are independent Bernoulli(p) random variables, then Y1 + Y2 + . . . + Yt ~ bin(t, p).
2. If X1, X2, . . . , Xm are independent random variables and Xi ~ bin(ti, p), then X1 + X2 + . . . + Xm ~ bin(t1 + t2 + . . . + tm, p).
3. The bin(t, p) distribution is symmetric if and only if p = 1/2.
4. X ~ bin(t, p) if and only if t − X ~ bin(t, 1 − p).
5. The bin(1, p) and Bernoulli(p) distributions are the same.
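The mass function and the mode formula above are easy to check numerically; a minimal Python sketch of the pmf (function name illustrative):

```python
import math

def binomial_pmf(x, t, p):
    # Mass function from the table: C(t, x) p^x (1 - p)^(t - x)
    return math.comb(t, x) * p ** x * (1 - p) ** (t - x)
```

For t = 10 and p = 0.3, p(t + 1) = 3.3 is not an integer, so the mode formula gives ⌊3.3⌋ = 3, which agrees with the x that maximizes the pmf.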
Geometric geom( p)
Possible applications Number of failures before the first success in a sequence of independent
Bernoulli trials with probability p of success on each trial; number of items
inspected before encountering the first defective item; number of items in a
batch of random size; number of items demanded from an inventory
Mass p(x) = p(1 − p)^x if x ∈ {0, 1, . . .}, and 0 otherwise
(see Fig. 6.21)
Distribution F(x) = 1 − (1 − p)^{⌊x⌋+1} if x ≥ 0, and 0 otherwise
Parameter p ∈ (0, 1)
Range {0, 1, . . .}
Mean (1 − p)/p
Variance (1 − p)/p²
Mode 0
MLE p̂ = 1/[X̄(n) + 1]
Comments 1. If Y1, Y2, . . . is a sequence of independent Bernoulli(p) random variables and X = min{i: Yi = 1} − 1, then X ~ geom(p).
2. If X1, X2, . . . , Xs are independent geom(p) random variables, then X1 + X2 + . . . + Xs has a negative binomial distribution with parameters s and p.
3. The geometric distribution is the discrete analog of the exponential distribution, in the sense that it is the only discrete distribution with the memoryless property (see Prob. 4.31).
4. The geom(p) distribution is a special case of the negative binomial distribution (with s = 1 and the same value for p).
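Because F has the closed form 1 − (1 − p)^{⌊x⌋+1}, a geometric variate is the smallest integer x with F(x) ≥ u; a minimal Python sketch of this inversion (names illustrative):

```python
import math
import random

def geometric_variate(p, u=None):
    # Smallest integer x with 1 - (1-p)^(x+1) >= u, i.e. the discrete
    # inverse transform: floor(ln(1-u) / ln(1-p)).
    if u is None:
        u = random.random()
    return math.floor(math.log(1.0 - u) / math.log(1.0 - p))
```

For p = 0.5 the CDF steps through 0.5, 0.75, 0.875, . . . , so u = 0.4, 0.6, 0.8 should map to x = 0, 1, 2.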
FIGURE 6.21 geom(p) mass functions (p = 0.25, 0.50).
Negative binomial negbin(s, p)
Possible applications Number of failures before the sth success in a sequence of independent
Bernoulli trials with probability p of success on each trial; number of good
items inspected before encountering the sth defective item; number of items
in a batch of random size; number of items demanded from an inventory
Mass p(x) = C(s + x − 1, x) p^s (1 − p)^x if x ∈ {0, 1, . . .}, and 0 otherwise
(see Fig. 6.22)
Distribution F(x) = ∑_{i=0}^{⌊x⌋} C(s + i − 1, i) p^s (1 − p)^i if x ≥ 0, and 0 otherwise
Parameters s a positive integer, p ∈ (0, 1)
Range {0, 1, . . .}
Mean s(1 − p)/p
Variance s(1 − p)/p²
Mode Let y = [s(1 − p) − 1]/p; then Mode = y and y + 1 if y is an integer, ⌊y⌋ + 1 otherwise
MLE If s is known, then p̂ = s/[X̄(n) + s]. If both s and p are unknown, then ŝ and p̂ exist if and only if V(n) = (n − 1)S²(n)/n > X̄(n). Let M = max_{1≤i≤n} Xi, and for k = 0, 1, . . . , M, let fk be the number of Xi's ≥ k. Then we can show that ŝ and p̂ are the values for s and p that maximize the function
h(s, p) = ∑_{k=1}^{M} fk ln(s + k − 1) + ns ln p + nX̄(n) ln(1 − p)
FIGURE 6.22 negbin(s, p) mass functions (s = 2, 5; p = 0.5).
Poisson Poisson(λ)
Possible applications Number of events that occur in an interval of time when the events are
occurring at a constant rate (see Sec. 6.12); number of items in a batch
of random size; number of items demanded from an inventory
Mass p(x) = e^{−λ} λ^x/x! if x ∈ {0, 1, . . .}, and 0 otherwise
(see Fig. 6.23)
Distribution F(x) = 0 if x < 0, and e^{−λ} ∑_{i=0}^{⌊x⌋} λ^i/i! if x ≥ 0
Parameter λ > 0
Range {0, 1, . . .}
Mean λ
Variance λ
Mode λ − 1 and λ if λ is an integer, ⌊λ⌋ otherwise
MLE λ̂ = X̄(n).
Comments 1. Let Y1, Y2, . . . be a sequence of nonnegative IID random variables, and let X = max{i: ∑_{j=1}^{i} Yj ≤ 1}. Then the distribution of the Yi's is expo(1/λ) if and only if X ~ Poisson(λ). Also, if X′ = max{i: ∑_{j=1}^{i} Yj ≤ λ}, then the Yi's are expo(1) if and only if X′ ~ Poisson(λ) (see also Sec. 6.12).
2. If X1, X2, . . . , Xm are independent random variables and Xi ~ Poisson(λi), then X1 + X2 + . . . + Xm ~ Poisson(λ1 + λ2 + . . . + λm).
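Comment 1 is the basis of the classical Poisson generator: count how many expo(1/λ) interarrival times fit in one time unit, which is equivalent to multiplying uniforms until the running product drops below e^{−λ}. A minimal Python sketch, reasonable for moderate λ (names illustrative):

```python
import math
import random

def poisson_variate(lam, rng=random.random):
    # Count expo(1/lam) gaps fitting in 1 time unit: since each gap is
    # -(1/lam) * ln(U), the sum exceeds 1 exactly when the product of the
    # uniforms drops below exp(-lam).
    threshold = math.exp(-lam)
    count, prod = 0, rng()
    while prod >= threshold:
        count += 1
        prod *= rng()
    return count
```

Passing a deterministic rng makes the counting logic directly testable.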
FIGURE 6.23 Poisson(λ) mass functions (λ = 0.5, 1, 2, 6).
order. Let X(i) denote the ith smallest of the Xj's, so that X(1) ≤ X(2) ≤ . . . ≤ X(n). Then F is given by
F(x) = 0 if x < X(1)
F(x) = (i − 1)/(n − 1) + (x − X(i))/[(n − 1)(X(i+1) − X(i))] if X(i) ≤ x < X(i+1), for i = 1, 2, . . . , n − 1
F(x) = 1 if X(n) ≤ x
Figure 6.24 gives an illustration for n 5 6. Note that F(x) rises most rapidly
over those ranges of x in which the Xi’s are most densely distributed, as desired.
FIGURE 6.24 Continuous, piecewise-linear empirical distribution function from original data.
Also, for each i, F(X(i)) = (i − 1)/(n − 1), which is approximately (for large n) the proportion of the Xj's that are less than X(i); this is also the way we would like a continuous distribution function to behave. (See Prob. 6.5 for a discussion of another way to define F.) However, one clear disadvantage of specifying this particular empirical distribution is that random values generated from it during a simulation run can never be less than X(1) or greater than X(n) (see Sec. 8.3.16). Also, the mean of F(x) is not equal to the sample mean X̄(n) of the Xi's (see Prob. 6.6).
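The piecewise-linear construction above can be sketched directly; the Python below (names illustrative, and assuming distinct observations so no interval collapses) returns F as a function:

```python
import bisect

def empirical_cdf(data):
    # Piecewise-linear empirical F from the original observations:
    # F(X_(i)) = (i - 1)/(n - 1), linear in between, 0 below X_(1) and
    # 1 above X_(n). Assumes the data values are distinct.
    xs = sorted(data)
    n = len(xs)
    def F(x):
        if x < xs[0]:
            return 0.0
        if x >= xs[-1]:
            return 1.0
        i = bisect.bisect_right(xs, x)   # index so X_(i) <= x < X_(i+1)
        lo, hi = xs[i - 1], xs[i]
        return (i - 1) / (n - 1) + (x - lo) / ((n - 1) * (hi - lo))
    return F
```

With the n = 6 data of Fig. 6.24 in mind, F rises by 1/5 between consecutive order statistics, so the midpoint between the 3rd and 4th smallest values maps to 0.5.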
If, however, the data are grouped, then a different approach must be taken since we do not know the values of the individual Xi's. Suppose that the n Xi's are grouped into k adjacent intervals [a0, a1), [a1, a2), . . . , [ak−1, ak), so that the jth interval contains nj observations, where n1 + n2 + . . . + nk = n. (Often the aj's will be equally spaced, but we need not make this assumption.) A reasonable piecewise-linear empirical distribution function G could be specified by first letting G(a0) = 0 and G(aj) = (n1 + n2 + . . . + nj)/n for j = 1, 2, . . . , k. Then, interpolating linearly between the aj's, we define
between the aj’s, we define
0 if x , a0
x 2 aj21
G(x) 5 µ G(aj21 ) 1 [G(aj ) 2 G(aj21 )] if aj21 # x , aj
aj 2 aj21
for j 5 1, 2, . . . , k
1 if ak # x
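The grouped-data version of the construction can be sketched the same way; this Python illustration (names my own) accumulates the G(aj) values and interpolates linearly within the bracketing interval:

```python
def grouped_cdf(breaks, counts):
    # breaks = [a_0, a_1, ..., a_k]; counts = [n_1, ..., n_k].
    # G(a_j) = (n_1 + ... + n_j)/n, linear between the a_j's.
    n = sum(counts)
    G = [0.0]
    for c in counts:
        G.append(G[-1] + c / n)
    def F(x):
        if x < breaks[0]:
            return 0.0
        if x >= breaks[-1]:
            return 1.0
        for j in range(1, len(breaks)):
            if x < breaks[j]:
                frac = (x - breaks[j - 1]) / (breaks[j] - breaks[j - 1])
                return G[j - 1] + frac * (G[j] - G[j - 1])
    return F
```

For example, with breakpoints 0, 1, 2, 4 and counts 2, 4, 2 (n = 8), G jumps to 0.25, 0.75, and 1 at the interval endpoints and is linear in between.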
Figure 6.25 illustrates this specification of an empirical distribution for k = 4. In this case, G(aj) is the proportion of the Xi's that are less than aj, and G(x) rises most rapidly over ranges of x where the observations are most dense. The random values generated from this distribution, however, will still be bounded both below (by a0) and above (by ak); see Sec. 8.3.16.
In practice, many continuous distributions are skewed to the right and have a
density with a shape similar to that in Fig. 6.26. Thus, if the sample size n is not very
large, we are likely to have few, if any, observations from the right tail of the true
underlying distribution (since these tail probabilities are usually small). Moreover,
the above empirical distributions do not allow random values to be generated
FIGURE 6.25 Continuous, piecewise-linear empirical distribution function from grouped data.
beyond the largest observation. On the other hand, very large generated values can
have a significant impact on the disposition of a simulation run. For example, a large
service time can cause considerable congestion in a queueing-type system. As a
result, Bratley, Fox, and Schrage (1987, pp. 131–133, 150–151) suggest append-
ing an exponential distribution to the right side of the empirical distribution, which
allows values larger than X(n) to be generated.
For discrete data, it is quite simple to define an empirical distribution, provided
that the original data values X1, X2, . . . , Xn are available. For each possible value x,
an empirical mass function p(x) can be defined to be the proportion of the Xi’s that
are equal to x. For grouped discrete data we could define a mass function such that
the sum of the p(x)’s over all possible values of x in an interval is equal to the
FIGURE 6.26 Typical density function experienced in practice.
proportion of the Xi’s in that interval. How the individual p(x)’s are allocated for the
possible values of x within an interval is essentially arbitrary.
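For ungrouped discrete data, the empirical mass function described above is just a table of proportions; a minimal Python sketch (names illustrative):

```python
from collections import Counter

def empirical_pmf(data):
    # p(x) = proportion of the X_i's equal to x
    n = len(data)
    return {x: c / n for x, c in Counter(data).items()}
```

By construction the p(x)'s sum to 1, and each value's mass is its observed relative frequency.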