Lecture Notes 3 Discrete Probability Distributions For Students
Topics:
I. Random Variables and their Probability Distributions
II. Cumulative Distribution Functions
III. Expected Values of Random Variables
IV. The Binomial Distribution
V. The Poisson Distribution
A. Random Variables
One of the fundamental concepts of probability theory is that of a random variable.
Consider tossing a coin two times. We can think of the following ordered sample space:
S = {HH, HT, TH, TT}
The outcome of a random experiment need not be a number, but we are often interested in some
(numerical) measurement of the outcome. For example, we may be interested in the total number of heads that
occur and not care at all about the actual head–tail sequence that results. The number of heads
obtained is a numerical quantity that can be 0, 1, or 2. These quantities of interest, or, more formally,
these real-valued functions defined on the sample space, are known as random variables.
Definition 1
A random variable is a variable that assumes numerical values associated with events of an
experiment.
– If X is a function that assigns a real number to every possible outcome in a sample space
of interest, X is called a random variable.
– The specified value of the random variable is unknown until the experimental outcome is
observed.
Example 1 In tossing two dice, we are often interested in the sum of the two dice and are not really
concerned about the separate values of each die. That is, we may be interested in knowing that the
sum is 7 and may not be concerned over whether the actual outcome was (1, 6), (2, 5), (3, 4), (4, 3), (5,
2), or (6, 1).
Example 2 Observe 100 babies born in a clinic. The number of boys born is a
random variable. It may take values from 0 to 100.
Example 4 Select one student from a university, measure his/her height, and record this height
as x. Then x is a random variable, assuming values from, say, 100 cm to 250 cm depending
on the specific student.
Example 5 The weight of babies at birth also is a random variable. It can assume values in the
interval, for example, from 800 grams to 6000 grams.
Example 6 Statisticians use sampling plans to either accept or reject batches or lots of material.
Suppose one of these sampling plans involves sampling independently 10 items from a lot of 100 items
in which 12 are defective.
Lecture Notes 3 – Discrete Probability Distributions 2
Engr. Caesar Pobre Llapitan
Let X be the random variable defined as the number of items found defective in the sample of 10. In
this case, the random variable takes on the values 0, 1, 2, . . ., 9, 10.
Example 7 Suppose a sampling plan involves sampling items from a process until a defective is
observed. The evaluation of the process will depend on how many consecutive items are observed. In
that regard, let X be a random variable defined by the number of items observed until a defective is
found, counting the defective itself. With N a nondefective and D a defective, the sample spaces are
S = {D} given X = 1, S = {ND} given X = 2, S = {NND} given X = 3, and so on.
Definition 2
A discrete random variable is one that can assume only a countable number of values.
A continuous random variable can assume any value in one or more intervals on a line.
Among the random variables described above, the sum of the two dice in Example 1 and the number of
boys in Example 2 are discrete random variables, while the height of students and the weight of babies
are continuous random variables.
In most practical problems, continuous random variables represent measured data, such as all
possible heights, weights, temperatures, distance, or life periods, whereas discrete random variables
represent count data, such as the number of defectives in a sample of k items or the number of
highway fatalities per year in a given state.
Example 1 Suppose you randomly select a student attending your university. Classify each of the
following random variables as discrete or continuous:
a) Number of credit hours taken by the student this semester
b) Current grade point average of the student.
Solution
a) The number of credit hours taken by the student this semester is a discrete random variable
because it can assume only a countable number of values (for example 10, 11, 12, and so on). It
is not continuous since the number of credit hours cannot assume values such as 11.5678, 15.3456,
or 12.9876 hours.
b) The grade point average for the student is a continuous random variable because it could
theoretically assume any value (for example, 5.455, 8.986) corresponding to the points on the
interval from 0 to 10 of a line.
Example 2 Listed is a series of experiments and associated random variables. In each case, identify
the values that the random variable can assume and state whether the random variable is discrete or
continuous.
Experiment                                           Random Variable (X)
a. Take a 20-question examination                    Number of questions answered correctly
b. Observe cars arriving at a tollbooth for 1 hour   Number of cars arriving at tollbooth
c. Audit 50 tax returns                              Number of returns containing errors
d. Observe an employee’s work                        Number of non-productive hours in an eight-hour workday
e. Weigh a shipment of goods                         Number of pounds
Example 1:
A voice communication system for a business contains 48 external lines. At a particular time, the
system is observed, and some of the lines are being used. Let the random variable X denote the
number of lines in use. Then, X can assume any of the integer values 0 through 48. When the system
is observed, if 10 lines are in use, x = 10.
Example 2:
In a semiconductor manufacturing process, two wafers from a lot are tested. Each wafer is classified as
pass or fail. Assume that the probability that a wafer passes the test is 0.8 and that wafers are
independent. The sample space for the experiment and associated probabilities are shown in Table 1.
For example, because of the independence, the probability of the outcome that the first wafer tested
passes and the second wafer tested fails, denoted as pf, is
P(pf) = (0.8)(0.2) = 0.16
The random variable X is defined to be equal to the number of wafers that pass. The last column of
the table shows the values of X that are assigned to each outcome in the experiment.
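The sample space, the outcome probabilities, and the values of X described in Example 2 can be sketched by direct enumeration. This is only an illustration under the example's stated assumptions (pass probability 0.8, independent wafers); the names P_PASS and table are ours and do not come from the notes.

```python
from itertools import product

# Assumption from the example: each wafer passes with probability 0.8,
# independently of the other wafer.
P_PASS = 0.8
prob = {"p": P_PASS, "f": 1 - P_PASS}

# Build the sample space for two wafers: pp, pf, fp, ff.
table = []
for outcome in product("pf", repeat=2):
    label = "".join(outcome)
    # Independence: multiply the probabilities of the two results.
    probability = prob[outcome[0]] * prob[outcome[1]]
    x = label.count("p")  # X = number of wafers that pass
    table.append((label, probability, x))

for label, probability, x in table:
    print(f"{label}  P = {probability:.2f}  X = {x}")
```

The four probabilities add to 1, as they must for a complete sample space.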
Example 3:
Define the random variable X to be the number of contamination particles on a wafer in
semiconductor manufacturing. Although wafers possess a number of characteristics, the random
variable X summarizes the wafer only in terms of the number of particles.
The possible values of X are integers from zero up to some large value that represents the maximum
number of particles that can be found on one of the wafers. If this maximum number is very large, we
might simply assume that the range of X is the set of integers from zero
to infinity.
Note that more than one random variable can be defined on a sample space. In Example 3, we might
define the random variable Y to be the number of chips from a wafer that fail the final test.
Exercises 1
Classify the following random variables as discrete or continuous:
1. X: the number of automobile accidents per year in Virginia.
2. Y: the length of time to play 18 holes of golf.
3. M: the amount of milk produced yearly by a particular cow.
4. N: the number of eggs laid each month by a hen.
5. P: the number of building permits issued each month in a certain city.
6. Q: the weight of grain produced per acre.
For each of the following exercises, determine the range (possible values) of the random variable.
1. The random variable is the number of nonconforming solder connections on a printed circuit
board with 1000 connections.
2. An electronic scale that displays weights to the nearest pound is used to weigh packages. The
display shows only five digits. Any weight greater than the display can indicate is shown as
99999. The random variable is the displayed weight.
3. A batch of 500 machined parts contains 10 that do not conform to customer requirements.
Parts are selected successively, without replacement, until a nonconforming part is obtained.
The random variable is the number of parts selected.
4. The random variable is the number of surface flaws in a large coil of galvanized steel.
5. An order for an automobile can select the base model or add any number of 15 options. The
random variable is the number of options selected in an order.
6. A group of 10,000 people are tested for a gene called Ifi202 that has been found to increase the
risk for lupus. The random variable is the number of people who carry the gene.
D. The Probability Distribution and Mass Function for A Discrete Random Variable
Random variables are so important in random experiments that sometimes we essentially ignore the
original sample space of the experiment and focus on the probability distribution of the random
variable.
Definition 3
The probability distribution for a discrete random variable X is a table, graph, or formula
that gives the probability of observing each value of X. We shall denote the probability that X
takes the value x by the symbol p(X = x),
with the interpretation that p(X = x1) = p1, p(X = x2) = p2, . . . , p(X = xn) = pn.
Thus, the probability distribution for a discrete random variable X may be given by one of the
following ways:
1. the table
X p
x1 p1
x2 p2
... ...
xn pn
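where pk is the probability that the variable X assumes the value xk (k = 1, 2, ..., n).
2. a formula for calculating p(xk) (k = 1, 2, ..., n).
3. a graph presenting the probability of each value xk.
The probabilities in a discrete probability distribution must satisfy two requirements:
1. $0 \le p(x_i) \le 1$ for every value $x_i$
2. $\sum_{\text{all } x_i} p(x_i) = 1$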
Note
– Variable names are capital letters (e.g., X)
– Values of variables are lower case letters (e.g., x1)
Example 1 A balanced coin is tossed twice and the number X of heads is observed. Find the
probability distribution for X.
Solution
Let Hk and Tk denote the observation of a head and a tail, respectively, on the kth toss, for k = 1, 2. The
four simple events and the associated values of x are shown in Table 1.
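The event X = 0 is the collection of all simple events that yield a value of X = 0, namely, the simple
event E4. Therefore, the probability that X assumes the value 0 is
P(X = 0) = p(0) = P(E4) = 0.25
The event X = 1 contains the two simple events with exactly one head, E2 and E3, so
P(X = 1) = p(1) = P(E2) + P(E3) = 0.25 + 0.25 = 0.5
Finally,
P(X = 2) = p(2) = P(E1) = 0.25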
The probability distribution p(x) is displayed in tabular form in Table 2 and as a probability histogram
in Figure 1.
Table 2 Probability distribution for X, the number of heads in two tosses of a coin
X p(X)
0 0.25
1 0.5
2 0.25
Figure 1 Probability distribution for X, the number of heads in two tosses of a coin
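The distribution of the number of heads in two tosses can also be obtained by listing the equally likely outcomes and counting. The sketch below assumes a balanced coin, as in the example; the variable names are illustrative only.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# Enumerate the four equally likely outcomes of two tosses of a balanced coin.
outcomes = list(product("HT", repeat=2))

# Count how many outcomes give each value of X = number of heads.
counts = Counter(outcome.count("H") for outcome in outcomes)

# p(x) = (number of outcomes with X = x) / (total number of outcomes)
pmf = {x: Fraction(n, len(outcomes)) for x, n in sorted(counts.items())}

for x, p in pmf.items():
    print(x, float(p))
```

Exact fractions are used so that the check that the probabilities sum to 1 is not affected by rounding.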
Example 2
There is a chance that a bit transmitted through a digital transmission channel is received in error. Let
X equal the number of bits in error in the next four bits transmitted. The possible values for X are {0,
1, 2, 3, 4}. Based on a model for the errors that is presented in the following section, probabilities for
these values will be determined. Suppose that the probabilities are
P(X = 0) = 0.6561 P(X = 1) = 0.2916 P(X = 2) = 0.0486
P(X = 3) = 0.0036 P(X = 4) = 0.0001
The probability distribution of X is specified by the possible values along with the probability of each.
A graphical description of the probability distribution of X is shown in Fig. 2.
Let’s assign 1 for head and 0 for tail. The sample space is
S = {TTT, TTH, THT, HTT, THH, HTH, HHT, HHH}
Example 4 Let the random variable X denote the number of semiconductor wafers that need to be
analysed in order to detect a large particle of contamination. Assume that the probability that a wafer
contains a large particle is 0.01 and that the wafers are independent. Determine the probability
distribution of X.
Let p denote a wafer in which a large particle is present, and let a denote a wafer in which it is absent.
The sample space of the experiment is infinite, and it can be represented as all possible sequences that
start with a string of a’s and end with p. That is,
s = {p, ap, aap, aaap, aaaap, ...}
Consider a few special cases. We have P(X = 1) = P(p) = 0.01. Also, using the independence
assumption,
P(X = 2) = P(ap) = 0.99(0.01) = 0.0099
A general formula is
$P(X = x) = P(\underbrace{aa \ldots a}_{x-1}\,p) = 0.99^{x-1}(0.01)$, for x = 1, 2, 3, ...
Describing the probabilities associated with X in terms of this formula is the simplest method of
describing the distribution of X in this example. Clearly $f(x) \ge 0$.
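As a rough numerical check of Example 4, the sketch below evaluates the probability that the first large particle is found on wafer x, assuming (as the example does) a particle probability of 0.01 per wafer and independent wafers. The constant P_PARTICLE and the function name f are our own labels.

```python
# Assumption from Example 4: each wafer contains a large particle with
# probability 0.01, independently of the other wafers.
P_PARTICLE = 0.01

def f(x: int) -> float:
    """P(X = x): the first x - 1 wafers are clean and wafer x has a particle."""
    if x < 1:
        return 0.0
    return (1 - P_PARTICLE) ** (x - 1) * P_PARTICLE

print(f(1))  # first wafer analysed contains the particle
print(f(2))  # one clean wafer, then a particle

# The probabilities over x = 1, 2, 3, ... form a geometric series that sums
# to 1; a long partial sum gets very close to that limit.
partial_sum = sum(f(x) for x in range(1, 5001))
print(partial_sum)
```

The partial sum illustrates that the probabilities are nonnegative and add up to 1 even though the sample space is infinite.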
Example 5 A shipment of 20 similar laptop computers to a retail outlet contains 3 that are defective.
If a school makes a random purchase of 2 of these computers, find the probability distribution for the
number of defectives.
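One way to approach Example 5 is by counting: choose x of the 3 defective laptops and the remaining 2 − x from the 17 good ones, out of all ways to choose 2 from 20. The sketch below follows that counting argument; it is offered as an illustration, and the names used are ours.

```python
from fractions import Fraction
from math import comb

# Values from Example 5: 20 laptops, 3 defective, 2 purchased at random.
TOTAL, DEFECTIVE, PURCHASED = 20, 3, 2

def f(x: int) -> Fraction:
    """P(X = x): choose x of the defectives and the rest from the good laptops."""
    ways = comb(DEFECTIVE, x) * comb(TOTAL - DEFECTIVE, PURCHASED - x)
    return Fraction(ways, comb(TOTAL, PURCHASED))

pmf = {x: f(x) for x in range(PURCHASED + 1)}
for x, p in pmf.items():
    print(x, p)
```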
Definition 4
For a discrete random variable X with possible values x1, x2, …, xn, a probability mass
function is a function such that
1. $f(x_i) \ge 0$
2. $\sum_{i=1}^{n} f(x_i) = 1$
3. $f(x_i) = P(X = x_i)$
Example 5 The sample space of a random experiment is {a, b, c, d, e, f}, and each outcome is equally
likely. A random variable
is defined as follows:
Outcome a b c d e f
x 0 0 1.5 1.5 2 3
Exercises 2:
1. An overseas shipment of 5 foreign automobiles contains 2 that have slight paint blemishes. If an
agency receives 3 of these automobiles at random, list the elements of the sample space S, using
the letters B and N for blemished and non-blemished, respectively; then to each sample point
assign a value x of the random variable X representing the number of automobiles with paint
blemishes purchased by the agency.
2. Let W be a random variable giving the number of heads minus the number of tails in three tosses
of a coin. List the elements of the sample space S for the three tosses of the coin and to each
sample point assign a value w of W.
3. The grades of n = 50 students in a statistics class are summarized as follows:
Grade (X)         A (x = 1)   B (x = 2)   C (x = 3)   D or below (x = 4)
No. of students   10          20          15          5
Let X denote a grade in statistics. Let the values 1, 2, 3, and 4 represent an A, B, C, and D or below,
respectively. Determine the probability mass function of X and plot f(xi).
4. The orders from n = 100 customers for wooden panels of various thickness (X) are summarized as
follows:
5. An optical inspection system is to distinguish among different part types. The probability of a
correct classification of any part is 0.98. Suppose that three parts are inspected and that the
classifications are independent. Let the random variable X denote the number of parts that are
correctly classified. Determine the probability mass function of X.
6. The following data were collected by counting the number of operating rooms in use at Tampa
General Hospital over a 20-day period: On three of the days only one operating room was used, on
five of the days two were used, on eight of the days three were used, and on four days all four of
the hospital’s operating rooms were used.
a. Use the relative frequency approach to construct a probability distribution for the number of
operating rooms in use on any given day.
b. Draw a graph of the probability distribution.
c. Show that your probability distribution satisfies the required conditions for a valid discrete
probability distribution.
7. An assembly consists of two mechanical components. Suppose that the probabilities that the first
and second components meet specifications are 0.95 and 0.98. Assume that the components are
independent. Determine the probability mass function of the number of components in the
assembly that meet specifications.
8. Marketing estimates that a new instrument for the analysis of soil samples will be very successful,
moderately successful, or unsuccessful, with probabilities 0.3, 0.6, and 0.1, respectively. The yearly
revenue associated with a very successful, moderately successful, or unsuccessful product is $10
million, $5 million, and $1 million, respectively. Let the random variable X denote the yearly
revenue of the product. Determine the probability mass function of X.
Example 1
Suppose you were to toss two coins over and over again a very large number of times and record the
number X of heads for each toss. A relative frequency distribution for the resulting collection of 0’s, 1’s
and 2’s would be very similar to the probability distribution shown in Figure 1. In fact, if it were
possible to repeat the experiment an infinitely large number of times, the two distributions would be
almost identical.
Thus, the probability distribution of Figure 1 provides a model for a conceptual population of values X –
the values of X that would be observed if the experiment were to be repeated an infinitely large number
of times.
Example 2
A survey reveals the following frequencies (1,000s) for the number of color TVs per household.
Number of TVs Number of Households x p(x)
0 1,218 0 1,218/101,501 = 0.012
1 32,379 1 0.319
2 37,961 2 0.374
3 19,387 3 0.191
4 7,714 4 0.076
5 2,842 5 0.028
Total 101,501 1
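The relative frequency calculation behind the p(x) column can be sketched as follows, using the household counts from the table. The dictionary name households is an illustrative choice.

```python
# Household counts (in thousands) by number of color TVs, as tabulated above.
households = {0: 1218, 1: 32379, 2: 37961, 3: 19387, 4: 7714, 5: 2842}

total = sum(households.values())

# Relative frequency approach: p(x) = count for x / total count.
pmf = {x: count / total for x, count in households.items()}

for x, p in pmf.items():
    print(f"{x}  {p:.3f}")
```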
Exercises 3:
1. The percent frequency distributions of job satisfaction scores for a sample of information systems
(IS) senior executives and middle managers are as follows. The scores range from a low of 1 (very
dissatisfied) to a high of 5 (very satisfied).
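2. The following data were collected by counting the number of operating rooms in use at Tampa
General Hospital over a 20-day period: On three of the days only one operating room was used, on
five of the days two were used, on eight of the days three were used, and on four days all four of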
the hospital’s operating rooms were used.
a. Use the relative frequency approach to construct a probability distribution for the number of
operating rooms in use on any given day.
b. Draw a graph of the probability distribution.
c. Show that your probability distribution satisfies the required conditions for a valid discrete
probability distribution.
3. A technician services mailing machines at companies in the Phoenix area. Depending on the type
of malfunction, the service call can take 1, 2, 3, or 4 hours. The different types of malfunctions
occur at about the same frequency.
a. Develop a probability distribution for the duration of a service call.
b. Draw a graph of the probability distribution.
c. Show that your probability distribution satisfies the conditions required for a discrete
probability function.
4. The following table is a partial probability distribution for the MRA Company’s projected profits
(x = profit in $1000s) for the first year of operation (the negative value denotes a loss).
x      f(x)
-100   .10
0      .20
50     .30
100    .25
150    .10
200
a. What is the proper value for f(200)? What is your interpretation of this value?
b. What is the probability that MRA will be profitable?
c. What is the probability that MRA will make at least $100,000?
5. A shipment of 7 television sets contains 2 defective sets. A hotel makes a random purchase of 3 of
the sets. If x is the number of defective sets purchased by the hotel, find the probability
distribution of X. Express the results graphically as a probability histogram.
6. From a box containing 4 dimes and 2 nickels, 3 coins are selected at random without replacement.
Find the probability distribution for the total T of the 3 coins. Express the probability distribution
graphically as a probability histogram.
7. From a box containing 4 black balls and 2 green balls, 3 balls are drawn in succession, each ball
being replaced in the box before the next draw is made. Find the probability distribution for the
number of green balls.
Definition:
The cumulative distribution function of a discrete random variable X, denoted as F(x), is
$$F(x) = P(X \le x) = \sum_{x_i \le x} f(x_i)$$
For a discrete random variable X, F(x) satisfies the following properties:
1. $F(x) = P(X \le x) = \sum_{x_i \le x} f(x_i)$
2. $0 \le F(x) \le 1$
3. If $x \le y$, then $F(x) \le F(y)$
Like a probability mass function, a cumulative distribution function provides probabilities. Notice
that even if the random variable X can only assume integer values, the cumulative distribution
function can be defined at non-integer values.
Example 1 Determine the probability mass function of X from the following cumulative distribution
function:
$$F(x) = \begin{cases} 0, & x < -2 \\ 0.2, & -2 \le x < 0 \\ 0.7, & 0 \le x < 2 \\ 1, & 2 \le x \end{cases}$$
From the plot, the only points that receive nonzero probability are -2, 0, and 2. The probability mass
function at each point is the change in the cumulative distribution function at the point. Therefore,
f (-2) = 0.2 - 0 = 0.2 f (0) = 0.7 - 0.2 = 0.5 f (2) = 1.0 - 0.7 = 0.3
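The rule used in Example 1, that the probability mass at a point equals the jump in the cumulative distribution function there, can be sketched as below. The list cdf_steps is our own encoding of the jump points of F(x) in this example.

```python
# Jump points of the cumulative distribution function in Example 1,
# listed as (x, F(x)) in increasing order of x.
cdf_steps = [(-2, 0.2), (0, 0.7), (2, 1.0)]

pmf = {}
previous = 0.0  # F(x) = 0 to the left of the first jump
for x, F_x in cdf_steps:
    # f(x) is the size of the jump in F at x; rounding removes
    # floating-point noise such as 0.30000000000000004.
    pmf[x] = round(F_x - previous, 10)
    previous = F_x

print(pmf)
```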
Example 2 Suppose that a day’s production of 850 manufactured parts contains 50 parts that do not
conform to customer requirements. Two parts are selected at random, without replacement, from the
batch. Let the random variable X equal the number of nonconforming parts in the sample.
What is the cumulative distribution function of X?
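A possible way to work Example 2 is to compute the probability mass function by counting samples and then accumulate it. The sketch below does this with exact fractions; the function names f and F mirror the notation of the notes, while the constants are our own labels for the numbers in the example.

```python
from fractions import Fraction
from math import comb

# Values from Example 2: 850 parts, 50 nonconforming, sample of 2
# drawn without replacement.
PARTS, NONCONFORMING, SAMPLE = 850, 50, 2

def f(x: int) -> Fraction:
    """P(X = x) for x nonconforming parts in the sample."""
    ways = comb(NONCONFORMING, x) * comb(PARTS - NONCONFORMING, SAMPLE - x)
    return Fraction(ways, comb(PARTS, SAMPLE))

def F(x: float) -> Fraction:
    """F(x) = P(X <= x): add f(x_i) over all possible values x_i <= x."""
    return sum((f(k) for k in range(SAMPLE + 1) if k <= x), Fraction(0))

for x in (-1, 0, 0.5, 1, 2):
    print(x, round(float(F(x)), 3))
```

Evaluating F at 0.5 gives the same value as at 0, which illustrates that the cumulative distribution function is defined, and constant, between the integer values of X.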
Exercises 4
1. An investment firm offers its customers municipal bonds that mature after varying numbers of
years. Given that the cumulative distribution function of T, the number of years to maturity
for a randomly selected bond, is
$$F(t) = \begin{cases} 0, & t < 1 \\ 1/4, & 1 \le t < 3 \\ 1/2, & 3 \le t < 5 \\ 3/4, & 5 \le t < 7 \\ 1, & t \ge 7 \end{cases}$$
Find
a) P(T = 5)
b) P(T > 3)
c) P(1.4 < T < 6)
d) $P(T \le 5 \mid T \ge 2)$
3. A shipment of 7 television sets contains 2 defective sets. A hotel makes a random purchase of 3
of the sets. If x is the number of defective sets purchased by the hotel, find the cumulative
distribution function of the random variable X representing the number of defectives. Then
using F(x), find
a. P(X = 1)
b. $P(0 < X \le 2)$
c. Construct a graph of the cumulative distribution function
4. Errors in an experimental transmission channel are found when the transmission is checked
by a certifier that detects missing pulses. The number of errors found in an eight-bit byte is a
random variable with the following cumulative distribution function:
$$F(x) = \begin{cases} 0, & x < 1 \\ 0.7, & 1 \le x < 4 \\ 0.9, & 4 \le x < 7 \\ 1, & 7 \le x \end{cases}$$
Two numbers are often used to summarize a probability distribution for a random variable X. The
mean is a measure of the center or middle of the probability distribution, and the variance is a
measure of the dispersion, or variability in the distribution. These two measures do not uniquely
identify a probability distribution. That is, two different distributions can have the same mean and
variance. Still, these measures are simple, useful summaries of the probability distribution of X.
Definition
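The mean or expected value of the discrete random variable X, denoted as E(X) or $\mu$, is
$$\mu = E(X) = \sum_x x f(x)$$
The variance of X, denoted as $\sigma^2$ or V(X), is
$$\sigma^2 = V(X) = E(X - \mu)^2 = \sum_x (x - \mu)^2 f(x) = \sum_x x^2 f(x) - \mu^2$$
The standard deviation of X is
$$\sigma = \sqrt{\sigma^2}$$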
The mean of a discrete random variable X is a weighted average of the possible values of X, with
weights equal to the probabilities. If f(x) is the probability mass function of a loading on a long, thin
beam, E(X) is the point at which the beam balances. Consequently, E(X) describes the “center’’ of the
distribution of X in a manner similar to the balance point of a loading.
Figure 3 A probability distribution can be viewed as a loading with the mean equal to the balance
point. Parts (a) and (b) illustrate equal means, but Part (a) illustrates a larger variance.
Figure 4 illustrates that two probability distributions can differ even though they have identical means
and variances.
Figure 4 The probability distributions illustrated in Parts (a) and (b) differ even though they have
equal means and equal variances.
Example 1 There is a chance that a bit transmitted through a digital transmission channel is received
in error. Let X equal the number of bits in error in the next four bits transmitted. The possible values
for X are {0, 1, 2, 3, 4}. Based on a model for the errors that is presented in the following section,
probabilities for these values will be determined. Suppose that the probabilities are
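P(X = 0) = f(0) = 0.6561 P(X = 1) = f(1) = 0.2916 P(X = 2) = f(2) = 0.0486
P(X = 3) = f(3) = 0.0036 P(X = 4) = f(4) = 0.0001
Now,
$$\mu = E(X) = \sum_x x f(x) = 0(0.6561) + 1(0.2916) + 2(0.0486) + 3(0.0036) + 4(0.0001) = 0.4$$
$$\sigma^2 = V(X) = \sum_{i=1}^{5} (x_i - \mu)^2 f(x_i) = 0.36$$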
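The mean and variance in Example 1 can be checked numerically. The sketch below evaluates the variance both from its definition and from the shortcut form (sum of x² f(x) minus the squared mean); the variable names are illustrative.

```python
from math import sqrt

# Probability mass function from Example 1 (bits in error among four bits).
pmf = {0: 0.6561, 1: 0.2916, 2: 0.0486, 3: 0.0036, 4: 0.0001}

# Mean: weighted average of the values, with the probabilities as weights.
mean = sum(x * p for x, p in pmf.items())

# Variance, computed two equivalent ways.
variance = sum((x - mean) ** 2 * p for x, p in pmf.items())
variance_shortcut = sum(x**2 * p for x, p in pmf.items()) - mean**2

std_dev = sqrt(variance)
print(round(mean, 4), round(variance, 4), round(std_dev, 4))
```

Both variance formulas agree up to floating-point rounding, which is a useful sanity check when working such problems by hand.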
Example 2 Two new product designs are to be compared on the basis of revenue potential. Marketing
feels that the revenue from design A can be predicted quite accurately to be $3 million. The revenue
potential of design B is more difficult to assess. Marketing concludes that there is a probability of 0.3
that the revenue from design B will be $7 million, but there is a 0.7 probability that the revenue will be
only $2 million. Which design do you prefer?
Example 3 The number of messages sent per hour over a computer network has the following
distribution:
x = number of messages   10     11     12     13     14     15
f(x)                     0.08   0.15   0.30   0.20   0.20   0.07
Determine the mean and standard deviation of the number of messages sent per hour.
Example 5 A lot containing 7 components is sampled by a quality inspector; the lot contains 4 good
components and 3 defective components. A sample of 3 is taken by the inspector. Find the expected
value of the number of good components in this sample.
Example 6 A salesperson for a medical device company has two appointments on a given day. At the
first appointment, he believes that he has a 70% chance to make the deal, from which he can earn
$1000 commission if successful. On the other hand, he thinks he only has a 40% chance to make the
deal at the second appointment, from which, if successful, he can make $1,500. What is his expected
commission based on his own probability belief? Assume that the appointment results are
independent of each other.
Example 7 Suppose that the number of cars X that pass through a car wash between 4:00 P.M. and
5:00 P.M. on any sunny Friday has the following probability distribution:
x          4      5      6     7     8     9
P(X = x)   1/12   1/12   1/4   1/4   1/6   1/6
Let g(X) = 2X − 1 represent the amount of money, in dollars, paid to the attendant by the manager.
Find the attendant’s expected earnings for this particular time period.
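Example 7 asks for the expected value of a function of X. A standard approach, sketched below, is to weight each value g(x) by P(X = x) and add the results. The code uses exact fractions, and the names pmf and expected_earnings are ours.

```python
from fractions import Fraction

# Distribution of X, the number of cars, from Example 7.
pmf = {
    4: Fraction(1, 12), 5: Fraction(1, 12), 6: Fraction(1, 4),
    7: Fraction(1, 4), 8: Fraction(1, 6), 9: Fraction(1, 6),
}

def g(x: int) -> int:
    """Amount in dollars paid to the attendant when x cars pass through."""
    return 2 * x - 1

# E[g(X)] = sum of g(x) * P(X = x) over all values x.
expected_earnings = sum(g(x) * p for x, p in pmf.items())
print(expected_earnings, float(expected_earnings))
```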
Exercises 5
1. The probability distribution of X, the number of imperfections per 10 meters of a synthetic
fabric in continuous rolls of uniform width, is given as
x      0      1      2      3      4
f(x)   0.41   0.37   0.16   0.05   0.01
Find the average number of imperfections per 10 meters of this fabric.
2. A coin is biased such that a head is three times as likely to occur as a tail. Find the expected
number of tails when this coin is tossed twice.
3. In a gambling game, a woman is paid $3 if she draws a jack or a queen and $5 if she draws a
king or an ace from an ordinary deck of 52 playing cards. If she draws any other card, she
loses. How much should she pay to play if the game is fair?
4. An attendant at a car wash is paid according to the number of cars that pass through. Suppose
the probabilities are 1/12, 1/12, 1/4, 1/4, 1/6, and 1/6, respectively, that the attendant receives $7,
$9, $11, $13, $15, or $17 between 4:00 P.M. and 5:00 P.M. on any sunny Friday. Find the
attendant’s expected earnings for this particular period.
5. By investing in a particular stock, a person can make a profit in one year of $4000 with
probability 0.3 or take a loss of $1000 with probability 0.7. What is this person’s expected gain?
6. Suppose that an antique jewelry dealer is interested in purchasing a gold necklace for which
the probabilities are 0.22, 0.36, 0.28, and 0.14, respectively, that she will be able to sell it for a
profit of $250, sell it for a profit of $150, break even, or sell it for a loss of $150. What is her
expected profit?
7. A private pilot wishes to insure his airplane for $200,000. The insurance company estimates
that a total loss will occur with probability 0.002, a 50% loss with probability 0.01, and a 25%
loss with probability 0.1. Ignoring all other partial losses, what premium should the insurance
company charge each year to realize an average profit of $500?
A handful of discrete probability distributions describes many real-life random phenomena. For instance, in a study
involving testing the effectiveness of a new drug, the number of cured patients among all the patients
who use the drug approximately follows a binomial distribution.
In an industrial example, when a sample of items selected from a batch of production is tested, the
number of defective items in the sample usually can be modelled as a hypergeometric random variable.
In a statistical quality control problem, the experimenter will signal a shift of the process mean when
observational data exceed certain limits. The number of samples required to produce a false alarm
follows a geometric distribution which is a special case of the negative binomial distribution.
On the other hand, the number of white cells from a fixed amount of an individual’s blood sample is
usually random and may be described by a Poisson distribution.
Binomial Distributions
An experiment often consists of repeated trials, each with two possible outcomes that may be labeled
success or failure.
The random variable in each case is a count of the number of trials that meet a specified criterion. The
outcome from each trial either meets the criterion that X counts or it does not; consequently, each
trial can be summarized as resulting in either a success or a failure.
The process is referred to as a Bernoulli process. Each trial is called a Bernoulli trial.
Consider the set of Bernoulli trials where three items are selected at random from a
manufacturing process, inspected, and classified as defective or non-defective. A defective item is
designated a success. The number of successes is a random variable X assuming integral values from 0
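through 3. With N denoting a non-defective item and D a defective item, the eight possible outcomes
and the corresponding values of X are
Outcome   NNN   NDN   NND   DNN   NDD   DND   DDN   DDD
x         0     1     1     1     2     2     2     3
Since the items are selected independently and we assume that the process produces 25%
defectives, we have, for example,
P(NDN) = P(N)P(D)P(N) = (3/4)(1/4)(3/4) = 9/64
Similar calculations for the other outcomes give the probability distribution of X:
x      0       1       2      3
f(x)   27/64   27/64   9/64   1/64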
Binomial Distribution
The number X of successes in n Bernoulli trials is called a binomial random variable. The
probability distribution of this discrete random variable is called the binomial distribution, and its
values will be denoted by b(x; n, p) since they depend on the number of trials and the probability of a
success on a given trial. Thus, for the probability distribution of X, the number of defectives is
P(X = 2) = f(2) = b(2; 3, 1/4) = 9/64
Definition
A Bernoulli trial can result in a success with probability p and a failure with probability q = 1 − p. Then
the probability distribution of the binomial random variable X, the number of successes in n
independent trials, is
$$f(x) = b(x; n, p) = \binom{n}{x} p^x q^{n-x}, \quad x = 0, 1, 2, \ldots, n$$
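The binomial formula can be checked against the three-item inspection example, where each item is defective with probability 1/4. The following is a minimal sketch with exact fractions; the function name b follows the notation b(x; n, p) of the notes.

```python
from fractions import Fraction
from math import comb

def b(x: int, n: int, p: Fraction) -> Fraction:
    """Binomial probability b(x; n, p) = C(n, x) * p**x * q**(n - x)."""
    q = 1 - p
    return comb(n, x) * p**x * q ** (n - x)

# Three inspected items, each defective (a "success") with probability 1/4.
n, p = 3, Fraction(1, 4)
pmf = {x: b(x, n, p) for x in range(n + 1)}

for x, prob in pmf.items():
    print(x, prob)
```

The output reproduces the values 27/64, 27/64, 9/64, and 1/64 tabulated for this example.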
Example 1 Each sample of water has a 10% chance of containing a particular organic pollutant.
Assume that the samples are independent with regard to the presence of the pollutant.
a. Find the probability that in the next 18 samples, exactly 2 contain the pollutant.
b. Determine the probability that at least four samples contain the pollutant.
c. Determine the probability that $3 \le X < 7$.
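The three parts of Example 1 can be evaluated with the binomial formula, taking n = 18 and p = 0.1 from the problem statement. The sketch below is one way to organize the arithmetic; the helper name binomial_pmf is our own.

```python
from math import comb

def binomial_pmf(x: int, n: int, p: float) -> float:
    """P(X = x) for a binomial random variable with n trials and success probability p."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)

# Example 1: n = 18 water samples, each containing the pollutant with p = 0.1.
n, p = 18, 0.1

exactly_two = binomial_pmf(2, n, p)
# "At least four" is the complement of "three or fewer".
at_least_four = 1 - sum(binomial_pmf(x, n, p) for x in range(4))
# 3 <= X < 7 means X is 3, 4, 5, or 6.
three_to_six = sum(binomial_pmf(x, n, p) for x in range(3, 7))

print(round(exactly_two, 3), round(at_least_four, 3), round(three_to_six, 3))
```

Using the complement in part (b) avoids summing the fifteen terms from x = 4 to x = 18.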
Example 2 Test for impurities commonly found in drinking water from private wells showed that 30%
of all wells in a particular country have impurity A. If a random sample of 5 wells is selected from the
large number of wells in the country, what is the probability that:
a) Exactly 3 will have impurity A?
b) At least 3?
c) Fewer than 3?
Exercises 6:
1. For each scenario described below, state whether or not the binomial distribution is a reasonable
model for the random variable and why. State any assumptions you make.
b. If the rework percentage increases to 4%, what is the probability that X exceeds 1?
c. If the rework percentage increases to 4%, what is the probability that X exceeds 1 in at least
one of the next five hours of samples?
7. Because not all airline passengers show up for their reserved seat, an airline sells 125 tickets for a
flight that holds only 120 passengers. The probability that a passenger does not show up is 0.10,
and the passengers behave independently.
a. What is the probability that every passenger who shows up can take the flight?
b. What is the probability that the flight departs with empty seats?
8. This exercise illustrates that poor quality can affect schedules and costs. A manufacturing process
has 100 customer orders to fill. Each order requires one component part that is purchased from a
supplier. However, typically, 2% of the components are identified as defective, and the
components can be assumed to be independent.
a. If the manufacturer stocks 100 components, what is the probability that the 100 orders
can be filled without reordering components?
b. If the manufacturer stocks 102 components, what is the probability that the 100 orders
can be filled without reordering components?
c. If the manufacturer stocks 105 components, what is the probability that the 100 orders
can be filled without reordering components?
9. A multiple-choice test contains 25 questions, each with four answers. Assume a student just
guesses on each question.
a. What is the probability that the student answers more than 20 questions correctly?
b. What is the probability that the student answers fewer than 5 questions correctly?
10. A particularly long traffic light on your morning commute is green 20% of the time that you
approach it. Assume that each morning represents an independent trial.
a. Over five mornings, what is the probability that the light is green on exactly one day?
b. Over 20 mornings, what is the probability that the light is green on exactly four days?
c. Over 20 mornings, what is the probability that the light is green on more than four days?
The Poisson probability distribution is named for the French mathematician S. D. Poisson (1781-1840).
It is used to describe a number of processes, including the distribution of telephone calls going
through a switchboard system, the demand of patients for service at a health institution, the arrivals
of trucks and cars at a tollbooth, and the number of accidents at an intersection.
3. The number of events that occur in one unit of time is independent of the number that occur
in other units.
4. The mean number of events in each unit will be denoted by the Greek letter λ.
The formulas for the probability distribution, the mean and the variance of a Poisson random variable
are shown in the next box.
The probability distribution, mean and variance for a Poisson random variable x:
1. The probability distribution:
$$p(x) = \frac{\lambda^x e^{-\lambda}}{x!}, \quad x = 0, 1, 2, \ldots,$$
where
λ = mean number of events during the given time period,
e = 2.71828... (the base of natural logarithm).
Note that instead of time, the Poisson random variable may be considered in the experiment of
counting the number x of times a particular event occurs during a given unit of area, volume, etc.
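The Poisson formula can be evaluated with the standard library alone. The sketch below is illustrative: the function name poisson_pmf and the value λ = 3 are not from the notes. It also checks numerically the standard result that the mean and the variance of a Poisson random variable both equal λ, truncating the infinite sum at a point where the remaining tail is negligible.

```python
from math import exp, factorial

def poisson_pmf(x: int, lam: float) -> float:
    """p(x) = lam^x * e^(-lam) / x! for x = 0, 1, 2, ..."""
    return lam**x * exp(-lam) / factorial(x)

lam = 3.0          # illustrative mean number of events per unit
xs = range(60)     # truncation point; the tail beyond 59 is negligible for lam = 3
probs = [poisson_pmf(x, lam) for x in xs]

total = sum(probs)
mean = sum(x * p for x, p in zip(xs, probs))
variance = sum((x - mean) ** 2 * p for x, p in zip(xs, probs))

print(f"sum of probabilities = {total:.6f}")
print(f"mean = {mean:.6f}, variance = {variance:.6f}")
```

Both the mean and the variance print as 3.000000, matching λ.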
Example 1 Suppose that we are investigating the safety of a dangerous intersection. Past police
records indicate a mean of 5 accidents per month at this intersection. Suppose the number of
accidents is distributed according to a Poisson distribution. Calculate the probability in any month of
exactly 0, 1, 2, 3 or 4 accidents.
Solution Since the number of accidents follows a Poisson distribution with a mean of 5 accidents per
month, the probability of x accidents in any month is
$$p(x) = \frac{5^x e^{-5}}{x!}.$$
From this formula we calculate
p(0) = 0.00674, p(1) = 0.03369, p(2) = 0.08422, p(3) = 0.14037, p(4) = 0.17547.
The probability distribution of the number of accidents per month is presented in Table 5.3 and
Figure 5.2.
10 0.018133
11 0.008242
12 0.003434
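The probabilities for the intersection example, including the tail rows listed above, can be regenerated with a short script. This is a sketch assuming λ = 5 accidents per month. The function name poisson_pmf is illustrative, and the range x = 0 to 12 is chosen to match the last row shown.

```python
from math import exp, factorial

def poisson_pmf(x, lam):
    """p(x) = lam^x * e^(-lam) / x!"""
    return lam**x * exp(-lam) / factorial(x)

lam = 5  # mean number of accidents per month
table = {x: poisson_pmf(x, lam) for x in range(13)}

print(" x   p(x)")
for x, prob in table.items():
    print(f"{x:2d}   {prob:.6f}")
```

Since the mean is 5, the probabilities rise up to x = 4 and x = 5 (which are equally likely) and fall off steadily afterwards.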
Example 2 For the case of the thin copper wire, suppose that the number of flaws follows a Poisson
distribution with a mean of 2.3 flaws per millimeter. Determine the probability of exactly 2 flaws in 1
millimeter of wire.
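A worked evaluation of Example 2, assuming X denotes the number of flaws in 1 millimeter of wire, so that λ = 2.3:

```latex
P(X = 2) = \frac{e^{-2.3}\,(2.3)^2}{2!}
         = \frac{(0.10026)(5.29)}{2}
         \approx 0.265
```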
Example 3 Contamination is a problem in the manufacture of optical storage disks. The number of
particles of contamination that occur on an optical disk has a Poisson distribution, and the average
number of particles per square centimeter of media surface is 0.1. The area of a disk under study is
100 square centimeters. Find the probability that 12 particles occur in the area of a disk under study.
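In Example 3 the rate is given per unit area, so the Poisson mean must first be scaled to the whole disk. The sketch below assumes that the mean for the disk is the rate multiplied by the area. The variable and function names are illustrative.

```python
from math import exp, factorial

def poisson_pmf(x, lam):
    """p(x) = lam^x * e^(-lam) / x!"""
    return lam**x * exp(-lam) / factorial(x)

rate_per_cm2 = 0.1      # mean particles per square centimeter
disk_area_cm2 = 100     # area of the disk under study

# Mean number of particles on the whole disk: rate times area.
lam = rate_per_cm2 * disk_area_cm2

p_12 = poisson_pmf(12, lam)
print(f"lambda = {lam:g}, P(X = 12) = {p_12:.4f}")
```

With a mean of 10 particles per disk, the probability of exactly 12 particles is about 0.095.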
Exercises 7
1. Suppose X has a Poisson distribution with a mean of 4. Determine the following probabilities:
a. P(X = 0) c. P(X 2)
b. P(X = 4) d. P(X 8)
2. Suppose that the number of customers that enter a bank in an hour is a Poisson random
variable, and suppose that Determine the mean and variance of X.
3. The number of telephone calls that arrive at a phone exchange is often modeled as a Poisson
random variable. Assume that on the average there are 10 calls per hour.
a. What is the probability that there are exactly 5 calls in one hour?
b. What is the probability that there are 3 or fewer calls in one hour?
c. What is the probability that there are exactly 15 calls in two hours?
d. What is the probability that there are exactly 5 calls in 30 minutes?
4. The number of flaws in bolts of cloth in textile manufacturing is assumed to be Poisson
distributed with a mean of 0.1 flaw per square meter.
a. What is the probability that there are two flaws in 1 square meter of cloth?
b. What is the probability that there is one flaw in 10 square meters of cloth?
c. What is the probability that there are no flaws in 20 square meters of cloth?
d. What is the probability that there are at least two flaws in 10 square meters of cloth?
5. The number of cracks in a section of interstate highway that are significant enough to require
repair is assumed to follow a Poisson distribution with a mean of two cracks per mile.
a. What is the probability that there are no cracks that require repair in 5 miles of
highway?
b. What is the probability that at least one crack requires repair in ½ mile of highway?
c. If the number of cracks is related to the vehicle load on the highway and some sections
of the highway have a heavy load of vehicles whereas other sections carry a light load,
how do you feel about the assumption of a Poisson distribution for the number of
cracks that require repair?
6. The number of surface flaws in plastic panels used in the interior of automobiles has a Poisson
distribution with a mean of 0.05 flaw per square foot of plastic panel. Assume an automobile
interior contains 10 square feet of plastic panel.
a. What is the probability that there are no surface flaws in an auto’s interior?
b. If 10 cars are sold to a rental company, what is the probability that none of the 10 cars
has any surface flaws?
c. If 10 cars are sold to a rental company, what is the probability that at most one car has
any surface flaws?
7. The number of failures for a cytogenics machine from contamination in biological samples is a
Poisson random variable with a mean of 0.01 per 100 samples.
a. If the lab usually processes 500 samples per day, what is the expected number of
failures per day?
b. What is the probability that the machine will not fail during a study that includes 500
participants? (Assume one sample per participant.)
8. The number of failures of a testing instrument from contamination particles on the product is
a Poisson random variable with a mean of 0.02 failure per hour.
a. What is the probability that the instrument does not fail in an 8-hour shift?
b. What is the probability of at least one failure in a 24-hour day?
Research:
1. Geometric and Negative Binomial Distributions
2. Hypergeometric Distribution