HELM Workbook 37 Discrete Probability Distributions
Discrete Probability
Distributions
Learning outcomes
In this Workbook you will learn what a discrete random variable is. You will find how to
calculate the expectation and variance of a discrete random variable. You will then
examine two of the most important examples of discrete random variables: the binomial
distribution and Poisson distribution.
The Poisson distribution can be deduced from the binomial distribution and is often
used as a way of finding good approximations to the binomial probabilities. The binomial
is a finite discrete random variable whereas the Poisson distribution has an infinite
number of possibilities.
Discrete Probability Distributions 37.1
Introduction
It is often possible to model real systems by using the same or similar random experiments and
their associated random variables. Numerical random variables may be classified in two broad but
distinct categories called discrete random variables and continuous random variables. Often, discrete
random variables are associated with counting while continuous random variables are associated with
measuring. In HELM Workbook 42 you will meet contingency tables and deal with non-numerical random
variables. Generally speaking, discrete random variables can take values which are separate and can
be listed. Strictly speaking, the real situation is a little more complex but it is sufficient for our
purposes to equate the word discrete with a finite list. In contrast, continuous random variables
can take values anywhere within a specified range. This Section will familiarize you with the idea
of a discrete random variable and the associated probability distributions. The Workbook makes
no attempt to cover the whole of this large and important branch of statistics but concentrates on
the discrete distributions most commonly met in engineering. These are the binomial, Poisson and
hypergeometric distributions.
2 HELM (2008):
Workbook 37: Discrete Probability Distributions
Factorials
The factorial of an integer n, commonly called ‘factorial n’ and written n!, is defined as follows:
n! = n × (n − 1) × (n − 2) × · · · × 3 × 2 × 1,   n ≥ 1
Simple examples are:
3! = 3×2×1 = 6    5! = 5×4×3×2×1 = 120    8! = 8×7×6×5×4×3×2×1 = 40320
As you can see, factorial notation enables us to express large numbers in a very compact format. You
will see that this characteristic is very useful when we discuss the topic of permutations. A further
point is that the definition above falls down when n = 0 and we define
0! = 1
Permutations
A permutation of a set of distinct objects places the objects in order. For example the set of three
numbers {1, 2, 3} can be placed in the following orders:
1,2,3 1,3,2 2,1,3 2,3,1 3,2,1 3,1,2
Note that we can choose the first item in 3 ways, the second in 2 ways and the third in 1 way. This
gives us 3×2×1 = 3! = 6 distinct orders. We say that the set {1, 2, 3} has the distinct permutations
1,2,3 1,3,2 2,1,3 2,3,1 3,2,1 3,1,2
Example 1
Write out the possible permutations of the letters A, B, C and D.
Solution
The possible permutations are
ABCD ABDC ADBC ADCB ACBD ACDB
BADC BACD BCDA BCAD BDAC BDCA
CABD CADB CDBA CDAB CBAD CBDA
DABC DACB DCAB DCBA DBAC DBCA
There are 4! = 24 permutations of the four letters A, B, C and D.
Example 2
Find the number of permutations of the four letters A, B, C and D taken three
at a time.
Solution
We may choose the first letter in 4 ways, either A, B, C or D. Suppose, for the purposes of
illustration we choose A. We may choose the second letter in 3 ways, either B, C or D. Suppose,
for the purposes of illustration we choose B. We may choose the third letter in 2 ways, either C
or D. Suppose, for the purposes of illustration we choose C. The total number of choices made is
4 × 3 × 2 = 24.
Example 3
Find the number of permutations of the four letters A, B, C and D taken two at
a time.
Solution
We may choose the first letter in 4 ways and the second letter in 3 ways giving us
$$4 \times 3 = \frac{4 \times 3 \times 2 \times 1}{1 \times 2} = \frac{4!}{2!} = 12 \text{ permutations}$$
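The counting arguments in Examples 1 to 3 can be checked by brute-force enumeration. The following is a minimal Python sketch, assuming the standard-library `itertools.permutations`; the helper name `count_permutations` is introduced here purely for illustration and is not part of the Workbook.

```python
from itertools import permutations
from math import factorial

LETTERS = "ABCD"


def count_permutations(n, r):
    """Number of ordered selections of r objects from n distinct objects: n!/(n-r)!."""
    return factorial(n) // factorial(n - r)


# Enumerate the ordered arrangements explicitly and compare with the formula.
for r in (4, 3, 2):
    arrangements = ["".join(p) for p in permutations(LETTERS, r)]
    print(r, len(arrangements), count_permutations(len(LETTERS), r))
```

For each r the enumerated count and the formula should agree, which is a quick way to catch a missed or repeated arrangement when listing permutations by hand.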
Combinations
A combination of objects takes no account of order whereas a permutation does. The formula
$${}^nP_r = \frac{n!}{(n-r)!}$$
gives us the number of ordered sets of r objects chosen from n. Suppose the number of sets of r objects (taken from n objects) in which order is not taken into account is C. It follows that
$$C \times r! = \frac{n!}{(n-r)!} \quad \text{and so C is given by the formula} \quad C = \frac{n!}{r!\,(n-r)!}$$
We normally denote the right-hand side of this expression by ${}^nC_r$ so that
$${}^nC_r = \frac{n!}{r!\,(n-r)!}$$
A common alternative notation for ${}^nC_r$ is $\binom{n}{r}$.
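The relationship between ordered and unordered selections can be checked numerically. This is a small sketch, assuming Python 3.8 or later (where `math.perm` and `math.comb` are available); the values of `n` and `r` are illustrative choices, not taken from the text.

```python
from math import comb, factorial, perm

n, r = 4, 2  # illustrative values (assumption), e.g. two letters chosen from A, B, C, D

ordered = perm(n, r)    # n!/(n-r)!      : order matters
unordered = comb(n, r)  # n!/(r!(n-r)!)  : order ignored

# Each unordered set of r objects can be arranged in r! different orders.
assert unordered * factorial(r) == ordered
print(ordered, unordered)
```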
Example 4
How many car registrations are there beginning with N P 05 followed by three
letters? Note that, conventionally, I, O and Q may not be chosen.
Solution
We have to choose 3 letters from 23 allowing repetition. Hence the number of registrations beginning
with N P 05 must be $23^3 = 12167$.
Task
(a) How many different signals consisting of five symbols can be sent using the
dot and dash of Morse code?
(b) How many can be sent if five symbols or less can be sent?
Your solution
Answer
(a) Clearly, the order of the symbols is important. We can choose each symbol in two ways, either
a dot or a dash. The number of distinct signals is
$$2 \times 2 \times 2 \times 2 \times 2 = 2^5 = 32$$
(b) If five or fewer symbols may be used, the total number of signals may be calculated as follows:
Using one symbol: 2 ways
Using two symbols: 2 × 2 = 4 ways
Using three symbols: 2 × 2 × 2 = 8 ways
Using four symbols: 2 × 2 × 2 × 2 = 16 ways
Using five symbols: 2 × 2 × 2 × 2 × 2 = 32 ways
The total number of signals which may be sent is 62.
Task
A box contains 50 resistors of which 20 are deemed to be ‘very high quality’ , 20
‘high quality’ and 10 ‘standard’. In how many ways can a batch of 5 resistors be
chosen if it is to contain 2 ‘very high quality’, 2 ‘high quality’ and 1 ‘standard’
resistor?
Your solution
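Answer The order in which the resistors are chosen does not matter so that the number of ways
in which the batch of 5 can be chosen is:
$${}^{20}C_2 \times {}^{20}C_2 \times {}^{10}C_1 = \frac{20!}{18! \times 2!} \times \frac{20!}{18! \times 2!} \times \frac{10!}{9! \times 1!} = \frac{20 \times 19}{1 \times 2} \times \frac{20 \times 19}{1 \times 2} \times \frac{10}{1} = 361000$$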
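The product of combinations in this answer can be confirmed with a few lines of Python. This is only a sketch; the variable names are hypothetical labels for the three quality grades in the Task.

```python
from math import comb

# Independent choices from each quality grade (order within the batch is ignored).
very_high = comb(20, 2)  # 2 of the 20 'very high quality' resistors
high = comb(20, 2)       # 2 of the 20 'high quality' resistors
standard = comb(10, 1)   # 1 of the 10 'standard' resistors

batches = very_high * high * standard
print(batches)
```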
2. Random variables
A random variable X is a quantity whose value cannot be predicted with certainty. We assume that
for every real number a the probability P(X = a) in a trial is well-defined. In practice, engineers are
often concerned with two broad types of variables and their probability distributions: discrete random
variables and their distributions, and continuous random variables and their distributions. Discrete
distributions arise from experiments involving counting, for example, road deaths, car production
and aircraft sales, while continuous distributions arise from experiments involving measurement, for
example, voltage, corrosion and oil pressure.
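To obtain the cumulative distribution function F(x) = P(X ≤ x) of a discrete random variable, we sum the probabilities $p_i$ for which $x_i$ is less than or equal to x. This gives a step function with
jumps of size $p_i$ at each value $x_i$ of X. The step function is defined for all values, not just the values
$x_i$ of X.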
Key Point 1
Probability Distribution of a Discrete Random Variable
Let X be a random variable associated with an experiment. Let the values of X be denoted by
x1 , x2 , . . . , xn and let P(X = xi ) be the probability that xi occurs. We have two necessary conditions
for a valid probability distribution:
• $P(X = x_i) \geq 0$ for all $x_i$
• $\sum_{i=1}^{n} P(X = x_i) = 1$
Example 5
Turbo Generators plc manufacture seven large turbines for a customer. Three of
these turbines do not meet the customer’s specification. Quality control inspectors
choose two turbines at random. Let the discrete random variable X be defined to
be the number of turbines inspected which meet the customer’s specification.
(a) Find the probabilities that X takes the values 0, 1 or 2.
(b) Find and graph the cumulative distribution function.
Solution
(a) The possible values of X are clearly 0, 1 or 2 and may occur as follows:
Sample Space Value of X
Turbine faulty, Turbine faulty 0
Turbine faulty, Turbine good 1
Turbine good, Turbine faulty 1
Turbine good, Turbine good 2
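We can easily calculate the probability that X takes the values 0, 1 or 2 as follows:
$$P(X = 0) = \frac{3}{7} \times \frac{2}{6} = \frac{1}{7} \qquad P(X = 1) = \frac{4}{7} \times \frac{3}{6} + \frac{3}{7} \times \frac{4}{6} = \frac{4}{7} \qquad P(X = 2) = \frac{4}{7} \times \frac{3}{6} = \frac{2}{7}$$
The values of $F(x) = \sum_{x_i \le x} P(X = x_i)$ are clearly
$$F(0) = \frac{1}{7}, \quad F(1) = \frac{5}{7} \quad \text{and} \quad F(2) = \frac{7}{7} = 1$$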
(b) The graph of the step function F (x) is shown below.
[Figure 1: step function F(x) plotted against x, with steps at x = 0, 1, 2 and levels 1/7 and 5/7 marked on the vertical axis]
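The probabilities and the step function of Example 5 can be reproduced exactly with rational arithmetic. The sketch below counts selections with `math.comb` rather than multiplying conditional probabilities as in the Solution, which is an equivalent alternative route; the names `pmf` and `cdf` are hypothetical helpers introduced here.

```python
from fractions import Fraction
from math import comb

GOOD, FAULTY, SAMPLE = 4, 3, 2  # seven turbines, three faulty, two inspected


def pmf(k):
    """P(X = k): exactly k of the inspected turbines meet the specification."""
    ways = comb(GOOD, k) * comb(FAULTY, SAMPLE - k)
    return Fraction(ways, comb(GOOD + FAULTY, SAMPLE))


probabilities = {k: pmf(k) for k in range(SAMPLE + 1)}


def cdf(x):
    """F(x): sum of P(X = x_i) over all values x_i <= x (defined for any real x)."""
    return sum((p for k, p in probabilities.items() if k <= x), Fraction(0))


for k, p in probabilities.items():
    print(k, p, cdf(k))
```

Because `cdf` accepts any real argument, evaluating it between the integer values (for example at 1.5) shows the flat sections of the step function.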
Replacing $\frac{f_i}{N}$ with the probability $p_i$ leads to the following definition of the mean or expectation of
the discrete random variable X.
Key Point 2
The expectation E(X) of X is the value of X which we expect on average. In a similar way we can
write down the expected value of the function g(X) as E[g(X)], the value of g(X) we expect on
average. We have
$$E[g(X)] = \sum_{i=1}^{n} g(x_i) f(x_i)$$
In particular if $g(X) = X^2$, we obtain $E[X^2] = \sum_{i=1}^{n} x_i^2 f(x_i)$
The variance is usually written as $\sigma^2$. For a frequency distribution it is:
$$\sigma^2 = \frac{1}{N} \sum_{i=1}^{n} f_i (x_i - \mu)^2 \quad \text{where } \mu \text{ is the mean value}$$
and can be expanded and ‘simplified’ to appear as:
$$\sigma^2 = \frac{1}{N} \sum_{i=1}^{n} f_i x_i^2 - \mu^2$$
This is often quoted in words:
The variance is equal to the mean of the squares minus the square of the mean.
We now extend the concept of variance to a random variable.
Key Point 3
The Variance of a Discrete Random Variable
Let X be a random variable with values x1 , x2 , . . . , xn . The variance of X, which is written V(X)
is defined by
$$V(X) = \sum_{i=1}^{n} p_i (x_i - \mu)^2$$
where µ ≡ E(X). We note that V(X) can be written in the alternative form
$$V(X) = E(X^2) - [E(X)]^2$$
Example 6
A traffic engineer is interested in the number of vehicles reaching a particular
crossroads during periods of relatively low traffic flow. The engineer finds that
the number of vehicles X reaching the crossroads per minute is governed by the
probability distribution:
x 0 1 2 3 4
P(X = x) 0.37 0.39 0.19 0.04 0.01
(a) Calculate the expected value, the variance and the standard deviation of the
random variable X.
(b) Graph the probability distribution P(X = x) and the corresponding cumulative
probability distribution $F(x) = \sum_{x_i \le x} P(X = x_i)$.
Solution
(a) The expectation, variance and standard deviation and cumulative probability values are calculated
as follows:
x x2 P(X = x) F (x)
0 0 0.37 0.37
1 1 0.39 0.76
2 4 0.19 0.95
3 9 0.04 0.99
4 16 0.01 1.00
$$E(X) = \sum_{x=0}^{4} x\,P(X = x) = 0 \times 0.37 + 1 \times 0.39 + 2 \times 0.19 + 3 \times 0.04 + 4 \times 0.01 = 0.93$$
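The remaining quantities asked for in part (a) follow from the same table. The following Python sketch assumes the ‘mean of the squares minus the square of the mean’ form of the variance; the variable names are illustrative only.

```python
from math import sqrt

values = [0, 1, 2, 3, 4]
probs = [0.37, 0.39, 0.19, 0.04, 0.01]  # P(X = x) from the table in Example 6

mean = sum(x * p for x, p in zip(values, probs))             # E(X)
mean_square = sum(x * x * p for x, p in zip(values, probs))  # E(X^2)
variance = mean_square - mean ** 2                           # V(X) = E(X^2) - [E(X)]^2
std_dev = sqrt(variance)

# Running totals give the cumulative probabilities F(x).
cumulative = []
running = 0.0
for p in probs:
    running += p
    cumulative.append(round(running, 2))

print(round(mean, 4), round(variance, 4), round(std_dev, 4), cumulative)
```

Under these assumptions the variance comes out at about 0.805 and the standard deviation at about 0.897 vehicles per minute.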
(b) [Figure 2: the cumulative distribution function F(x) and the probability distribution P(X = x), each plotted against x = 0, 1, 2, 3, 4 on a vertical scale from 0 to 1.0]
Task
Find the expectation, variance and standard deviation of the number of Heads in
the three-coin toss experiment.
Your solution
Answer
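$$E(X) = \frac{1}{8} \times 0 + \frac{3}{8} \times 1 + \frac{3}{8} \times 2 + \frac{1}{8} \times 3 = \frac{12}{8}$$
$$\sum p_i x_i^2 = \frac{1}{8} \times 0^2 + \frac{3}{8} \times 1^2 + \frac{3}{8} \times 2^2 + \frac{1}{8} \times 3^2 = \frac{1}{8} \times 0 + \frac{3}{8} \times 1 + \frac{3}{8} \times 4 + \frac{1}{8} \times 9 = 3$$
$$V(X) = 3 - 2.25 = 0.75 = \frac{3}{4}$$
$$\sigma = \frac{\sqrt{3}}{2}$$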
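The three-coin result can also be obtained by listing all eight equally likely outcomes. The sketch below assumes fair, independent coins and uses exact fractions; all names are illustrative.

```python
from fractions import Fraction
from itertools import product

# All 2^3 equally likely sequences of Heads (H) and Tails (T).
outcomes = list(product("HT", repeat=3))
weight = Fraction(1, len(outcomes))

heads = [outcome.count("H") for outcome in outcomes]

mean = sum(weight * h for h in heads)             # E(X)
mean_square = sum(weight * h * h for h in heads)  # E(X^2)
variance = mean_square - mean ** 2                # V(X)

print(mean, mean_square, variance)
```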
Exercises
1. A machine is operated by two workers. There are sixteen workers available. How many possible
teams of two workers are there?
2. A factory has 52 machines. Two of these have been given an experimental modification. In the
first week after this modification, problems are reported with thirteen of the machines. What
is the probability that both of the modified machines are among the thirteen with problems
assuming that all machines are equally likely to give problems?
3. A factory has 52 machines. Four of these have been given an experimental modification. In
the first week after this modification, problems are reported with thirteen of the machines.
What is the probability that exactly two of the modified machines are among the thirteen with
problems assuming that all machines are equally likely to give problems?
5. A hand-held calculator has a clock cycle time of 100 nanoseconds; these are positions numbered
0, 1, . . . , 99. Assume a flag is set during a particular cycle at a random position. Thus, if X is
the position number at which the flag is set,
$$P(X = k) = \frac{1}{100}, \qquad k = 0, 1, 2, \ldots, 99.$$
Evaluate the average position number E(X), and σ, the standard deviation.
(Hint: The sum of the first k integers is k(k + 1)/2 and the sum of their squares is:
k(k + 1)(2k + 1)/6.)
6. Concentric circles of radii 1 cm and 3 cm are drawn on a circular target radius 5 cm. A darts
player receives 10, 5 or 3 points for hitting the target inside the smaller circle, middle annular
region and outer annular region respectively. The player has only a 50-50 chance of hitting the
target at all but if he does hit it he is just as likely to hit any one point on it as any other. If
X = ‘number of points scored on a single throw of a dart’ calculate the expected value of X.
Answers
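2. There are $\binom{52}{13}$ possible different selections of 13 machines and all are equally likely. There is only $\binom{2}{2} = 1$ way to pick two machines from those which were modified but there are $\binom{50}{11}$ different choices for the 11 other machines with problems so this is the number of possible selections containing the 2 modified machines. Hence the required probability is
$$\frac{\binom{50}{11}}{\binom{52}{13}} = \frac{13 \times 12}{52 \times 51} = \frac{1}{17} \approx 0.0588$$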
Alternatively, let S be the event “first modified machine is in the group of 13” and C be the
event “second modified machine is in the group of 13”. Then the required probability is
$$P(S) \times P(C \mid S) = \frac{13}{52} \times \frac{12}{51}.$$
3. There are $\binom{52}{13}$ different selections of 13, $\binom{4}{2}$ different choices of two modified machines
and $\binom{48}{11}$ different choices of 11 non-modified machines.
Thus the required probability is
$$\frac{\binom{4}{2}\binom{48}{11}}{\binom{52}{13}} = \frac{(4!/2!2!)(48!/11!37!)}{(52!/13!39!)} = \frac{4!\,48!\,13!\,39!}{52!\,2!\,2!\,11!\,37!} = \frac{4 \times 3 \times 13 \times 12 \times 39 \times 38}{52 \times 51 \times 50 \times 49 \times 2} \approx 0.2135$$
Alternatively, let I(i) be the event “modified machine i is in the group of 13” and O(i)
be the negation of this, for i = 1, 2, 3, 4. The number of choices of two modified machines is $\binom{4}{2}$
so the required probability is
$$\binom{4}{2}\, P\{I(1)\} \times P\{I(2) \mid I(1)\} \times P\{O(3) \mid I(1), I(2)\} \times P\{O(4) \mid I(1), I(2), O(3)\}$$
$$= \binom{4}{2} \times \frac{13}{52} \times \frac{12}{51} \times \frac{39}{50} \times \frac{38}{49} = \frac{4 \times 3 \times 13 \times 12 \times 39 \times 38}{52 \times 51 \times 50 \times 49 \times 2}$$
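The counting argument used in Answers 2 and 3 has the same shape in both cases, so it can be wrapped in one function. This is a hedged sketch: the function name and its parameter names are hypothetical, and the code simply evaluates the ratio of combinations described in the text.

```python
from fractions import Fraction
from math import comb


def prob_modified_among_faulty(total, modified, faulty, k):
    """P(exactly k of the modified machines are among the faulty ones),
    assuming every selection of `faulty` machines is equally likely."""
    favourable = comb(modified, k) * comb(total - modified, faulty - k)
    return Fraction(favourable, comb(total, faulty))


exercise_2 = prob_modified_among_faulty(52, 2, 13, 2)
exercise_3 = prob_modified_among_faulty(52, 4, 13, 2)

print(exercise_2, float(exercise_3))
```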
4.
x          0     1     2     3     4     5     6     7     8     9
P(X = x)   1/10  1/10  1/10  1/10  1/10  1/10  1/10  1/10  1/10  1/10
$$E(X) = \frac{1}{10}\{0 + 1 + 2 + 3 + \ldots + 9\} = 4.5$$
5. Same as Q.4 but with 100 positions
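$$E(X) = \frac{1}{100}\{0 + 1 + 2 + 3 + \ldots + 99\} = \frac{1}{100}\left[\frac{99(99 + 1)}{2}\right] = 49.5$$
$\sigma^2$ = mean of squares − square of mean
$$\therefore \sigma^2 = \frac{1}{100}\left[1^2 + 2^2 + \ldots + 99^2\right] - (49.5)^2 = \frac{1}{100}\left[\frac{99(100)(199)}{6}\right] - 49.5^2 = 833.25$$
so the standard deviation is $\sigma = \sqrt{833.25} = 28.87$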
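The closed-form sums in the hint can be cross-checked by summing over the 100 positions directly. A brief sketch, with illustrative variable names:

```python
from math import sqrt

positions = range(100)  # position numbers 0, 1, ..., 99, each with probability 1/100

mean = sum(positions) / 100
mean_square = sum(k * k for k in positions) / 100
variance = mean_square - mean ** 2
std_dev = sqrt(variance)

print(mean, variance, round(std_dev, 2))
```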
6. X can take 4 values 0, 3, 5 or 10
P(X = 0) = 0.5 [only 50/50 chance of hitting target]
The probability that a particular points score is obtained is related to the areas of the annular
regions which are, from the centre: π, (9π − π) = 8π, (25π − 9π) = 16π
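$$P(X = 3) = \frac{16\pi}{25\pi} \cdot \frac{1}{2} = \frac{16}{50}$$
$$P(X = 5) = \frac{8\pi}{25\pi} \cdot \frac{1}{2} = \frac{8}{50}$$
$$P(X = 10) = \frac{\pi}{25\pi} \cdot \frac{1}{2} = \frac{1}{50}$$
x          0      3      5      10
P(X = x)   25/50  16/50  8/50   1/50
$$\therefore E(X) = \frac{48 + 40 + 10}{50} = 1.96.$$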
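The area argument for the darts question can be sketched in Python as follows. The factor of π cancels in every ratio, so only squared radii are needed; the list `RINGS` and the other names are hypothetical.

```python
from fractions import Fraction

TARGET_RADIUS = 5                   # cm
P_HIT = Fraction(1, 2)              # 50-50 chance of hitting the target at all
RINGS = [(1, 10), (3, 5), (5, 3)]   # (outer radius in cm, points), from the centre outwards

distribution = {0: 1 - P_HIT}       # a miss scores nothing
inner = 0
for outer, points in RINGS:
    # Area of the ring as a fraction of the whole target (pi cancels).
    area_fraction = Fraction(outer ** 2 - inner ** 2, TARGET_RADIUS ** 2)
    distribution[points] = P_HIT * area_fraction
    inner = outer

expected = sum(x * p for x, p in distribution.items())
print(distribution, expected)
```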
The Binomial
Distribution 37.2
Introduction
A situation in which an experiment (or trial) is repeated a fixed number of times can be modelled,
under certain assumptions, by the binomial distribution. Within each trial we focus attention on
a particular outcome. If the outcome occurs we label this as a success. The binomial distribution
allows us to calculate the probability of observing a certain number of successes in a given number
of trials.
You should note that the terms ‘success’ and (by implication) ‘failure’ are simply labels and as such
might be misleading. For example counting the number of defective items produced by a machine
might be thought of as counting successes if you are looking for defective items! Trials with two
possible outcomes are often used as the building blocks of random experiments and can be useful to
engineers. Two examples are:
1. The binomial model
We have introduced random variables from a general perspective and have seen that there are two
basic types: discrete and continuous. We examine four particular examples of distributions for
random variables which occur often in practice and have been given special names. They are the
binomial distribution, the Poisson distribution, the Hypergeometric distribution and the Normal
distribution. The first three are distributions for discrete random variables and the fourth is for a
continuous random variable. In this Section we focus attention on the binomial distribution.
The binomial distribution can be used in situations in which a given experiment (often referred to,
in this context, as a trial) is repeated a number of times. For the binomial model to be applied the
following four criteria must be satisfied:
• the trial is carried out a fixed number of times n
• the outcomes of each trial can be classified into two ‘types’ conventionally named ‘success’ or
‘failure’
• the occurrence of Heads on any given trial (i.e. throw) may be called a ‘success’ and Tails
called a ‘failure’
• the probability of success is $p = \frac{1}{2}$ and remains constant for each trial
1. An electronic product has a total of 30 integrated circuits built into it. The product is capable
of operating successfully only if at least 27 of the circuits operate properly. What is the
probability that the product operates successfully if the probability of any integrated circuit
failing to operate is 0.01?
Before developing the general binomial distribution we consider the following examples which, as you
will soon recognise, have the basic characteristics of a binomial distribution.
Example 7
In a box of floppy discs it is known that 95% will work. A sample of three of the
discs is selected at random.
Find the probability that (a) none (b) 1, (c) 2, (d) all 3 of the sample will work.
Solution
Let the event {the disc works} be W and the event {the disc fails} be F . The probability that a
disc will work is denoted by P(W ) and the probability that a disc will fail is denoted by P(F ). Then
P(W ) = 0.95 and P(F ) = 1 − P(W ) = 1 − 0.95 = 0.05.
(a) The probability that none of the discs works equals the probability that all 3 discs fail.
This is given by:
P(none work) = P(F F F ) = P(F )×P(F )×P(F ) as the events are independent
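$$= 0.05 \times 0.05 \times 0.05 = 0.05^3 = 0.000125$$
(b) If only one disc works then you could select the three discs in the following orders
(F F W ) or (F W F ) or (W F F ) hence
$$P(\text{one works}) = P(FFW) + P(FWF) + P(WFF) = 3 \times (0.05)^2 \times 0.95 = 0.007125$$
(c) If two discs work then you could select the three discs in the following orders
(F W W ) or (W F W ) or (W W F ) hence
$$P(\text{two work}) = P(FWW) + P(WFW) + P(WWF) = 3 \times 0.05 \times (0.95)^2 = 0.135375$$
(d) The probability that all 3 discs work is given by $P(WWW) = 0.95^3 = 0.857375$.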
Notice that since the 4 outcomes we have dealt with are all possible outcomes
of selecting 3 discs, the probabilities should add up to 1. It is an easy check to verify
that they do.
One of the most important assumptions above is that of independence. The probability
of selecting a working disc remains unchanged no matter whether the previous selected
disc worked or not.
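The four probabilities of Example 7 can be recovered by enumerating every ordered outcome and multiplying the individual probabilities, which relies on the independence assumption just described. A minimal sketch with illustrative names:

```python
from itertools import product

P_WORK = 0.95
P_FAIL = 1 - P_WORK

# Accumulate the probability of each number of working discs (0 to 3).
prob_by_count = {k: 0.0 for k in range(4)}
for outcome in product("WF", repeat=3):  # e.g. ('W', 'F', 'W')
    probability = 1.0
    for disc in outcome:
        probability *= P_WORK if disc == "W" else P_FAIL
    prob_by_count[outcome.count("W")] += probability

for count, probability in prob_by_count.items():
    print(count, round(probability, 6))
print("total:", round(sum(prob_by_count.values()), 6))
```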
Example 8
A worn machine is known to produce 10% defective components. If the random
variable X is the number of defective components produced in a run of 3 components,
find the probabilities that X takes the values 0 to 3.
Solution
Assuming that the production of components is independent and that the probability p = 0.1 of
producing a defective component remains constant, the following table summarizes the production
run. We let G represent a good component and let D represent a defective component.
Note that since we are only dealing with two possible outcomes, we can say that the probability q of
the machine producing a good component is 1 − 0.1 = 0.9. More generally, we know that q+p = 1
if we are dealing with a binomial distribution.
Outcome   Value of X   Probability of Occurrence
GGG       0            $(0.9)(0.9)(0.9) = (0.9)^3$
GGD       1            $(0.9)(0.9)(0.1) = (0.9)^2(0.1)$
GDG       1            $(0.9)(0.1)(0.9) = (0.9)^2(0.1)$
DGG       1            $(0.1)(0.9)(0.9) = (0.9)^2(0.1)$
DDG       2            $(0.1)(0.1)(0.9) = (0.9)(0.1)^2$
DGD       2            $(0.1)(0.9)(0.1) = (0.9)(0.1)^2$
GDD       2            $(0.9)(0.1)(0.1) = (0.9)(0.1)^2$
DDD       3            $(0.1)(0.1)(0.1) = (0.1)^3$
From this table it is easy to see that
$$P(X = 0) = (0.9)^3$$
$$P(X = 1) = 3 \times (0.9)^2(0.1)$$
$$P(X = 2) = 3 \times (0.9)(0.1)^2$$
$$P(X = 3) = (0.1)^3$$
Clearly, a pattern is developing. In fact you may have already realized that the probabilities we have
found are just the terms of the expansion of the expression $(0.9 + 0.1)^3$ since
$$(0.9 + 0.1)^3 = (0.9)^3 + 3 \times (0.9)^2(0.1) + 3 \times (0.9)(0.1)^2 + (0.1)^3$$
We now develop the binomial distribution from a more general perspective. If you find the theory
getting a bit heavy simply refer back to this example to help clarify the situation.
First we shall find it convenient to denote the probability of failure on a trial, which is 1 − p, by q,
that is:
q = 1 − p.
What we shall do is to calculate probabilities of the number of ‘successes’ occurring in n trials,
beginning with n = 1.
n=1 With only one trial we can observe either 1 success (with probability p) or 0 successes
(with probability q).
n=2 Here there are 3 possibilities: We can observe 2, 1 or 0 successes. Let S denote a success
and F denote a failure. So a failure followed by a success would be denoted by F S whilst two failures
followed by one success would be denoted by F F S and so on.
Then
P(2 successes in 2 trials) = P(SS) = P(S)P(S) = $p^2$
(where we have used the assumption of independence between trials and hence multiplied probabili-
ties). Now, using the usual rules of basic probability, we have:
P(1 success in 2 trials) = P[(SF ) ∪ (F S)] = P(SF ) + P(F S) = pq + qp = 2pq
Task
List the outcomes for the binomial model for the case n = 3, calculate their
probabilities and display the results in a table.
Your solution
Answer
{three successes, two successes, one success, no successes}
Three successes occur only as SSS with probability $p^3$.
Two successes can occur as SSF with probability $(p^2 q)$, as SFS with probability $(pqp)$ or as FSS
with probability $(qp^2)$.
These are mutually exclusive events so the combined probability is the sum $3p^2 q$.
Similarly, we can calculate the other probabilities and obtain the following table of results.
Number of successes   3       2         1         0
Probability           $p^3$   $3p^2q$   $3pq^2$   $q^3$
Note that the probabilities you have obtained:
$$q^3, \quad 3q^2p, \quad 3qp^2, \quad p^3$$
are the terms which arise in the binomial expansion of $(q + p)^3 = q^3 + 3q^2p + 3qp^2 + p^3$
Task
Repeat the previous Task for the binomial model for the case with n = 4.
Your solution
Answer
Number of successes   4       3         2           1         0
Probability           $p^4$   $4p^3q$   $6p^2q^2$   $4pq^3$   $q^4$
Again we explore the connection between the probabilities and the terms in the binomial expansion
of $(q + p)^4$. Consider this expansion
$$(q + p)^4 = q^4 + 4q^3p + 6q^2p^2 + 4qp^3 + p^4$$
Then, for example, the term $4p^3q$ is the probability of 3 successes in the four trials. These successes
can occur anywhere in the four trials and there must be one failure hence the $p^3$ and $q$ components
which are multiplied together. The remaining part of this term, 4, is the number of ways of selecting
three objects from 4.
Similarly there are ${}^4C_2 = \frac{4!}{2!\,2!} = 6$ ways of selecting two objects from 4 so that the coefficient 6
combines with $p^2$ and $q^2$ to give the probability of two successes (and hence two failures) in four
trials.
The approach described here can be extended for any number n of trials.
Key Point 4
Notation
If a random variable X follows a binomial distribution in which an experiment is repeated n times
each with probability p of success then we write X ∼ B(n, p).
Example 9
A worn machine is known to produce 10% defective components. If the random
variable X is the number of defective components produced in a run of 4 components,
find the probabilities that X takes the values 0 to 4.
Solution
From Example 8, we know that the probabilities required are the terms of the expansion of the
expression:
$(0.9 + 0.1)^4$ so X ∼ B(4, 0.1)
Hence the required probabilities are (using the general formula with n = 4 and p = 0.1)
$$P(X = 0) = (0.9)^4 = 0.6561$$
$$P(X = 1) = 4(0.9)^3(0.1) = 0.2916$$
$$P(X = 2) = \frac{4 \times 3}{1 \times 2}(0.9)^2(0.1)^2 = 0.0486$$
$$P(X = 3) = \frac{4 \times 3 \times 2}{1 \times 2 \times 3}(0.9)(0.1)^3 = 0.0036$$
$$P(X = 4) = (0.1)^4 = 0.0001$$
Also, since we are using the expansion of $(0.9 + 0.1)^4$, the probabilities should sum to 1. This is a
useful check on your arithmetic when you are using a binomial distribution.
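The general binomial term can be written as a short function and used to reproduce the values in Example 9, including the sum-to-one check. This is a sketch only; `binomial_pmf` is a hypothetical helper name, not a library function.

```python
from math import comb


def binomial_pmf(r, n, p):
    """P(X = r) for X ~ B(n, p): nCr * p^r * (1 - p)^(n - r)."""
    return comb(n, r) * p ** r * (1 - p) ** (n - r)


n, p = 4, 0.1  # run of 4 components, 10% defective
probabilities = [binomial_pmf(r, n, p) for r in range(n + 1)]

for r, probability in enumerate(probabilities):
    print(r, round(probability, 4))
print("sum:", round(sum(probabilities), 10))
```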
Example 10
In a box of switches it is known 10% of the switches are faulty. A technician is
wiring 30 circuits, each of which needs one switch. What is the probability that
(a) all 30 work, (b) at most 2 of the circuits do not work?
Solution
The answers involve binomial distributions because there are only two states for each circuit - it
either works or it doesn’t work.
A trial is the operation of testing each circuit.
A success is that it works. We are given P(success) = p = 0.9
Also we have the number of trials n = 30
Applying the binomial distribution $P(X = r) = {}^nC_r\, p^r (1 - p)^{n-r}$.
(a) Probability that all 30 work is $P(X = 30) = {}^{30}C_{30}\,(0.9)^{30}(0.1)^0 = 0.04239$
(b) The statement that “at most 2 circuits do not work” implies that 28, 29 or 30 work.
That is X ≥ 28
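Part (b) requires P(X ≥ 28) = P(X = 28) + P(X = 29) + P(X = 30). The following sketch evaluates this sum numerically under the stated model X ∼ B(30, 0.9); `binomial_pmf` is a hypothetical helper defined here for illustration.

```python
from math import comb


def binomial_pmf(r, n, p):
    """P(X = r) for X ~ B(n, p)."""
    return comb(n, r) * p ** r * (1 - p) ** (n - r)


N_CIRCUITS, P_WORKS = 30, 0.9

p_all_work = binomial_pmf(30, N_CIRCUITS, P_WORKS)
p_at_most_two_fail = sum(binomial_pmf(r, N_CIRCUITS, P_WORKS) for r in (28, 29, 30))

print(round(p_all_work, 5), round(p_at_most_two_fail, 5))
```

Under these assumptions the sum is roughly 0.411, so there is a little over a 40% chance that no more than two circuits fail.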
Example 11
A University Engineering Department has introduced a new software package called
SOLVIT. To save money, the University’s Purchasing Department has negotiated
a bargain price for a 4-user licence that allows only four students to use SOLVIT
at any one time. It is estimated that this should allow 90% of students to use
the package when they need it. The Students’ Union has asked for more licences
to be bought since engineering students report having to queue excessively to use
SOLVIT. As a result the Computer Centre monitors the use of the software. Their
findings show that on average 20 students are logged on at peak times and 4 of
these want to use SOLVIT. Was the Purchasing Department’s estimate correct?
Solution
$$P(\text{student wanted to use SOLVIT}) = \frac{4}{20} = 0.2$$
Let X be the number of students wanting to use SOLVIT at any one time, then
$$P(X = 0) = {}^{20}C_0\,(0.2)^0(0.8)^{20} = 0.0115$$
$$P(X = 1) = {}^{20}C_1\,(0.2)^1(0.8)^{19} = 0.0576$$
$$P(X = 2) = {}^{20}C_2\,(0.2)^2(0.8)^{18} = 0.1369$$
$$P(X = 3) = {}^{20}C_3\,(0.2)^3(0.8)^{17} = 0.2054$$
$$P(X = 4) = {}^{20}C_4\,(0.2)^4(0.8)^{16} = 0.2182$$
Therefore
$$P(X \le 4) = 0.0115 + 0.0576 + 0.1369 + 0.2054 + 0.2182 = 0.6296$$
The probability that more than 4 students will want to use SOLVIT is
$$P(X > 4) = 1 - P(X \le 4) = 0.3704$$
That is, 37% of the time there will be more than 4 students wanting to use the software. The
Purchasing Department has grossly overestimated the availability of the software on the basis of a
4-user licence.
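The five terms of this example and their sum can be recomputed directly from the binomial formula. The sketch below assumes X ∼ B(20, 0.2) as in the Solution; with these inputs the five terms sum to approximately 0.6296. The helper name `binomial_pmf` is hypothetical.

```python
from math import comb


def binomial_pmf(r, n, p):
    """P(X = r) for X ~ B(n, p)."""
    return comb(n, r) * p ** r * (1 - p) ** (n - r)


N_STUDENTS, P_WANTS = 20, 0.2
LICENCES = 4

terms = [binomial_pmf(r, N_STUDENTS, P_WANTS) for r in range(LICENCES + 1)]
p_enough = sum(terms)          # P(X <= 4): everyone who wants SOLVIT can use it
p_queue = 1 - p_enough         # P(X > 4): at least one student has to wait

print([round(t, 4) for t in terms], round(p_enough, 4), round(p_queue, 4))
```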
Task
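Using the binomial model, and assuming that a success occurs with probability $\frac{1}{5}$
in each trial, find the probability that in 6 trials there are
(a) 0 successes (b) 3 successes (c) 2 failures.
Answer
In each case $p = \frac{1}{5}$ and $q = 1 - p = \frac{4}{5}$.
(a) Here r = 0 and
$$P(X = 0) = q^6 = \left(\frac{4}{5}\right)^6 = \frac{4096}{15625} \approx 0.262$$
Your solution
(b) P(X = 3) =
Answer
r = 3 and
$$P(X = 3) = {}^6C_3\, p^3 q^3 = \frac{6 \times 5 \times 4}{1 \times 2 \times 3} \times \left(\frac{1}{5}\right)^3 \times \left(\frac{4}{5}\right)^3 = \frac{20 \times 64}{5^6} = \frac{1280}{15625} = 0.0819$$
Your solution
(c) P(X = 4) =
Answer
Here r = 4 and
$$P(X = 4) = {}^6C_4\, p^4 q^2 = \frac{6 \times 5}{1 \times 2} \times \left(\frac{1}{5}\right)^4 \times \left(\frac{4}{5}\right)^2 = \frac{15 \times 4^2}{5^6} = \frac{240}{15625} = 0.01536$$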
Task
Calculate the mean and variance of a random variable X which follows a binomial
distribution X ∼ B(3, p).
Your solution
Answer
The table of values appropriate to this case is:
x   $x^2$   P(X = x)   xP(X = x)   $x^2$P(X = x)
0   0       $q^3$      0           0
1   1       $3q^2p$    $3q^2p$     $3q^2p$
2   4       $3qp^2$    $6qp^2$     $12qp^2$
3   9       $p^3$      $3p^3$      $9p^3$
Hence $E(X) = \sum xP(X = x) = 0 + 3q^2p + 6qp^2 + 3p^3 = 3p(q + p)^2 = 3p$ since q + p = 1
From the results given above, it is reasonable to assert the following result in Key Point 5.
Key Point 5
Expectation and Variance of the Binomial Distribution
If a random variable X which can assume the values 0, 1, 2, 3, . . . , n follows a binomial distribution
X ∼ B(n, p) so that
$$P(X = r) = {}^nC_r\, p^r q^{n-r} = {}^nC_r\, p^r (1 - p)^{n-r}$$
then the expectation and variance of the distribution are given by the formulae
E(X) = np and V (X) = np(1 − p) = npq
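The formulae in Key Point 5 can be checked against the defining sums for any particular n and p. The sketch below uses exact fractions; the function name `binomial_moments` and the values n = 10, p = 3/10 are illustrative assumptions, not taken from the Workbook.

```python
from fractions import Fraction
from math import comb


def binomial_moments(n, p):
    """Return (E(X), V(X)) for X ~ B(n, p) by summing over r = 0, 1, ..., n."""
    q = 1 - p
    pmf = [comb(n, r) * p ** r * q ** (n - r) for r in range(n + 1)]
    mean = sum(r * pr for r, pr in enumerate(pmf))
    mean_square = sum(r * r * pr for r, pr in enumerate(pmf))
    return mean, mean_square - mean ** 2


n, p = 10, Fraction(3, 10)  # illustrative values (assumption)
mean, variance = binomial_moments(n, p)

print(mean, n * p)                 # direct sum against np
print(variance, n * p * (1 - p))   # direct sum against npq
```

A numerical check of this kind does not prove the general result, but agreement for several choices of n and p is a useful sanity test.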
Task
A die is thrown repeatedly 36 times in all. Find E(X) and V (X) where X is the
number of sixes obtained.
Your solution
Answer
Consider the occurrence of a six, with X being the number of sixes thrown in 36 trials.
The random variable X follows a binomial distribution. (Why? Refer to page 18 for the criteria
if necessary). A trial is the operation of throwing a die. A success is the occurrence of a 6 on a
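particular trial, so $p = \frac{1}{6}$. We have n = 36, $p = \frac{1}{6}$ so that
$$E(X) = np = 36 \times \frac{1}{6} = 6 \quad \text{and} \quad V(X) = npq = 36 \times \frac{1}{6} \times \frac{5}{6} = 5.$$
Hence the standard deviation is $\sigma = \sqrt{5} \simeq 2.236$.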
E(X) = 6 implies that in 36 throws of a fair die we would expect, on average, to see 6 sixes. This
makes perfect sense, of course.
Exercises
1. The probability that a mountain-bike rider travelling along a certain track will have a tyre burst
is 0.05. Find the probability that among 17 riders:
2. (a) A transmission channel transmits zeros and ones in strings of length 8, called ‘words’.
Possible distortion may change a one to a zero or vice versa; assume this distortion occurs
with probability .01 for each digit, independently. An error-correcting code is employed
in the construction of the word such that the receiver can deduce the word correctly if at
most one digit is in error. What is the probability the word is decoded incorrectly?
(b) Assume that a word is a sequence of 10 zeros or ones and, as before, the probability of
incorrect transmission of a digit is .01. If the error-correcting code allows correct decoding
of the word if no more than two digits are incorrect, compute the probability that the
word is decoded correctly.
4. The probability that a machine will produce all bolts in a production run within specification
is 0.998. A sample of 8 machines is taken at random. Calculate the probability that
5. The probability that a machine develops a fault within the first 3 years of use is 0.003. If 40
machines are selected at random, calculate the probability that 38 or more will not develop any
faults within the first 3 years of use.
6. A computer installation has 10 terminals. Independently, the probability that any one terminal
will require attention during a week is 0.1. Find the probabilities that
(a) 0, (b) 1, (c) 2, (d) 3 or more, terminals will require attention during the next week.
7. The quality of electronic chips is checked by examining samples of 5. The frequency distribution
of the number of defective chips per sample obtained when 100 samples have been examined
is:
No. of defectives 0 1 2 3 4 5
No. of samples 47 34 16 3 0 0
Calculate the proportion of defective chips in the 500 tested. Assuming that a binomial distri-
bution holds, use this value to calculate the expected frequencies corresponding to the observed
frequencies in the table.
8. In a large school, 80% of the pupils like mathematics. A visitor to the school asks each of 4
pupils, chosen at random, whether they like mathematics.
(a) Calculate the probabilities of obtaining an answer yes from 0, 1, 2, 3, 4 of the pupils.
(b) Find the probability that the visitor obtains the answer yes from at least 2 pupils.
9. A machine has two drive belts, one on the left and one on the right. From time to time the
drive belts break. When one breaks the machine is stopped and both belts are replaced. Details of n
consecutive breakages are recorded. Assume that the left and right belts are equally likely to break
first. Let X be the number of times the break is on the left.
(a) How many possible different sequences of “left” and “right” are there?
(b) How many of these sequences contain exactly j “lefts”?
(c) Find an expression, in terms of n and j, for the probability that X = j.
(d) Let n = 6. Find the probability distribution of X.
10. A machine is built to make mass-produced items. Each item made by the machine has a
probability p of being defective. Given the value of p, the items are independent of each other.
Because of the way in which the machines are made, p could take one of several values. In fact
p = X/100 where X has a discrete uniform distribution on the interval [0, 5]. The machine is tested
by counting the number of items made before a defective is produced. Find the conditional probability
distribution of X given that the first defective item is the thirteenth to be made.
11. Seven batches of articles are manufactured. Each batch contains ten articles. Each article has,
independently, a probability of 0.1 of being defective. Find the probability that there is at least one
defective article
12. A service engineer can be called out for maintenance on the photocopiers in the offices of
four large companies, A, B, C and D. On any given day there is a probability of 0.1 that he will
be called to each of these companies. The event of being called to one company is independent of
whether or not he is called to any of the others.
(a) Find the probability that, on a particular day,
(i) he is called to all four companies,
(ii) he is called to at least three companies,
(iii) he is called to all four given that he is called to at least one,
(iv) he is called to all four given that he is called to Company A.
(b) Find the expected value and variance of the number of these companies which call the
engineer on a given day.
Exercises continued
13. There are five machines in a factory. Of these machines, three are working properly and two
are defective. Machines which are working properly produce articles each of which has independently
a probability of 0.1 of being imperfect. For the defective machines this probability is 0.2. A machine
is chosen at random and five articles produced by the machine are examined. What is the probability
that the machine chosen is defective given that, of the five articles examined, two are imperfect and
three are perfect?
14. A company buys mass-produced articles from a supplier. Each article has a probability p of being
defective, independently of other articles. If the articles are manufactured correctly then p = 0.05.
However, a cheaper method of manufacture can be used and this results in p = 0.1.
(a) Find the probability of observing exactly three defectives in a sample of twenty articles
(i) given that p = 0.05
(ii) given that p = 0.1.
(b) The articles are made in large batches. Unfortunately batches made by both methods
are stored together and are indistinguishable until tested, although all of the articles
in any one batch will be made by the same method. Suppose that a batch delivered
to the company has a probability of 0.7 of being made by the correct method. Find the
conditional probability that such a batch is correctly manufactured given that, in a sample
of twenty articles from the batch, there are exactly three defectives.
(c) The company can either accept or reject a batch. Rejecting a batch leads to a loss for
the company of £150. Accepting a batch which was manufactured by the cheap method
will lead to a loss for the company of £400. Accepting a batch which was correctly
manufactured leads to a profit of £500. Determine a rule for what the company should
do if a sample of twenty articles contains exactly three defectives, in order to maximise
the expected value of the profit (where loss is negative profit). Should such a batch be
accepted or rejected?
(d) Repeat the calculation for four defectives in a sample of twenty and hence, or otherwise,
determine a rule for how the company should decide whether to accept or reject a batch
according to the number of defectives.
Answers
1. Binomial distribution $P(X = r) = {}^{n}C_r\, p^r (1 - p)^{n-r}$ where p is the probability of a single ‘success’,
which is ‘tyre burst’.
(a) $P(X = 1) = {}^{17}C_1 (0.05)^1 (0.95)^{16} = 0.3741$
(b)
2.
(a) P (distortion) = 0.01 for each digit. This is a binomial situation in which the probability
of ‘success’ is 0.01 = p and there are n = 8 trials.
A word is decoded incorrectly if there are two or more digits in error
3. Let X be a random variable ‘number of answers guessed correctly’ then for each question
(i.e. trial) the probability of a ‘success’ = $\frac{1}{5}$. It is clear that X follows a binomial distribution
with n = 10 and p = 0.2.
P (randomly choosing correct answer) = $\frac{1}{5}$, n = 10
Answers
9.
(c) $P(X = j) = \binom{n}{j} 2^{-n}$.
j 0 1 2 3 4 5 6
P(X = j) 0.015625 0.09375 0.234375 0.3125 0.234375 0.09375 0.015625
10. Let Y be the number of the first defective item.
Answers
11.
The probability of at least one defective in a batch is $1 - 0.9^{10} = 0.6513$.
Let the probability of at least one defective in exactly j batches be $p_j$.
(a) $p_4 = \binom{7}{4} (1 - 0.9^{10})^4 (0.9^{10})^3 = 35 \times 0.6513^4 \times 0.3487^3 = 0.2670$.
(b)
$p_5 = \binom{7}{5} (1 - 0.9^{10})^5 (0.9^{10})^2 = 21 \times 0.6513^5 \times 0.3487^2 = 0.2993$.
$p_6 = \binom{7}{6} (1 - 0.9^{10})^6 (0.9^{10})^1 = 7 \times 0.6513^6 \times 0.3487^1 = 0.1863$.
$p_7 = \binom{7}{7} (1 - 0.9^{10})^7 (0.9^{10})^0 = 0.6513^7 = 0.0497$.
Answers
13. Let D denote the event that the chosen machine is defective and D̄ denote the event
“not D”.
Let Y be the number of imperfect articles in the sample of five.
Then
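$$P(D \mid Y = 2) = \frac{P(D) \times P(Y = 2 \mid D)}{P(D) \times P(Y = 2 \mid D) + P(\bar{D}) \times P(Y = 2 \mid \bar{D})}$$
$$= \frac{\frac{2}{5} \times \binom{5}{2} \times 0.2^2 \times 0.8^3}{\frac{2}{5} \times \binom{5}{2} \times 0.2^2 \times 0.8^3 + \frac{3}{5} \times \binom{5}{2} \times 0.1^2 \times 0.9^3}$$
$$= \frac{2 \times 0.2^2 \times 0.8^3}{2 \times 0.2^2 \times 0.8^3 + 3 \times 0.1^2 \times 0.9^3}$$
$$= \frac{0.04096}{0.04096 + 0.02187} = 0.6519.$$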
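The Bayes' theorem calculation for the defective machine can be checked numerically. The Python sketch below is illustrative only: the helper name `binomial_pmf` and the variable names are assumptions introduced here and are not part of the Workbook.

```python
from math import comb


def binomial_pmf(r, n, p):
    """P(X = r) for X ~ B(n, p)."""
    return comb(n, r) * p**r * (1 - p)**(n - r)


# Prior probabilities: 2 of the 5 machines are defective, 3 work properly.
prior_defective = 2 / 5
prior_working = 3 / 5

# Likelihood of 2 imperfect articles out of 5 for each kind of machine.
like_defective = binomial_pmf(2, 5, 0.2)
like_working = binomial_pmf(2, 5, 0.1)

# Bayes' theorem: P(D | Y = 2).
posterior_defective = (prior_defective * like_defective) / (
    prior_defective * like_defective + prior_working * like_working
)
print(round(posterior_defective, 4))  # 0.6519
```

The binomial coefficient is the same in both likelihoods, so it cancels in the ratio, exactly as in the hand calculation.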
14.
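(a) (i) $p_3 = \binom{20}{3} 0.1^3 \times 0.9^{17} = \frac{20 \times 19 \times 18}{1 \times 2 \times 3} \times 0.1^3 \times 0.9^{17} = 0.190$.
(ii)
$p_2 = \binom{20}{2} 0.1^2 \times 0.9^{18} = \frac{3}{18} \times 9 \times p_3 = 0.28518$
$p_1 = \binom{20}{1} 0.1 \times 0.9^{19} = \frac{2}{19} \times 9 \times p_2 = 0.27017$
$p_0 = \binom{20}{0} 0.9^{20} = 0.12158$.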
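(b)
$$\frac{0.2 \times \binom{4}{1} \times 0.3^1 \times 0.7^3}{0.2 \times \binom{4}{1} \times 0.3^1 \times 0.7^3 + 0.8 \times \binom{4}{1} \times 0.1^1 \times 0.9^3} = \frac{0.02058}{0.02058 + 0.05832} = 0.2608.$$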
The Poisson
Distribution 37.3
Introduction
In this Section we introduce a probability model which can be used when the outcome of an experiment
is a random variable taking on non-negative integer values and where the only information available is a
measurement of its average value. This has widespread applications, for example in analysing traffic
flow, in fault prediction on electric cables and in the prediction of randomly occurring accidents. We
shall look at the Poisson distribution in two distinct ways. Firstly, as a distribution in its own right.
This will enable us to apply statistical methods to a set of problems which cannot be solved using
the binomial distribution. Secondly, as an approximation to the binomial distribution X ∼ B(n, p)
in the case where n is large and p is small. You will find that this approximation can often save the
need to do much tedious arithmetic.
Prerequisites
Before starting this Section you should . . .
• understand the concepts of probability
• understand the concepts and notation for the binomial distribution
Learning Outcomes
• recognise and use the formula for probabilities calculated from the Poisson model
1. The Poisson approximation to the binomial distribution
The probability of the outcome X = r of a set of Bernoulli trials can always be calculated by using
the formula
$$P(X = r) = {}^{n}C_r\, q^{n-r} p^r$$
given above. Clearly, for very large values of n the calculation can be rather tedious, particularly
so when very small values of p are also present. In the situation when n is large and p is small and the
product np is constant we can take a different approach to the problem of calculating the probability
that X = r. In the table below the values of P(X = r) have been calculated for various combinations
of n and p under the constraint that np = 1. You should try some of the calculations for yourself
using the formula given above for some of the smaller values of n.
Probability of X successes
n p X=0 X=1 X=2 X=3 X=4 X=5 X=6
Applying condition (1) allows us to approximate terms such as (n − 1), (n − 2), . . . to n (mathematically,
we are allowing n → ∞) and the right-hand side of our expansion becomes
$$1 + np + \frac{n^2}{2!} p^2 + \cdots + \frac{n^r}{r!} p^r + \ldots$$
Note that the term $p^n \to 0$ under these conditions and hence has been omitted.
We now have the series
$$1 + np + \frac{(np)^2}{2!} + \cdots + \frac{(np)^r}{r!} + \ldots$$
which, using condition (3) may be written as
$$1 + \lambda + \frac{\lambda^2}{2!} + \cdots + \frac{\lambda^r}{r!} + \ldots$$
You may recognise this as the expansion of $e^{\lambda}$.
If we are to be able to claim that the terms of this expansion represent probabilities, we must be sure
that the sum of the terms is 1. We divide by $e^{\lambda}$ to satisfy this condition. This gives the result
$$\frac{e^{\lambda}}{e^{\lambda}} = 1 = \frac{1}{e^{\lambda}} \left( 1 + \lambda + \frac{\lambda^2}{2!} + \cdots + \frac{\lambda^r}{r!} + \ldots \right)$$
$$= e^{-\lambda} + e^{-\lambda} \lambda + e^{-\lambda} \frac{\lambda^2}{2!} + e^{-\lambda} \frac{\lambda^3}{3!} + \cdots + e^{-\lambda} \frac{\lambda^r}{r!} + \cdots$$
The terms of this expansion are very good approximations to the corresponding binomial expansion
under the conditions
1. n is large
2. p is small
3. np = λ (λ constant)
The Poisson approximation to the binomial distribution is summarized below.
Key Point 6
Poisson Approximation to the Binomial Distribution
Assuming that n is large, p is small and that np is constant, the terms
$$P(X = r) = {}^{n}C_r (1 - p)^{n-r} p^r$$
of a binomial distribution may be closely approximated by the terms
$$P(X = r) = e^{-\lambda} \frac{\lambda^r}{r!}$$
of the Poisson distribution for corresponding values of r.
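The quality of the approximation can be explored numerically. The Python sketch below holds np = 1 fixed and lets n grow, reporting the largest gap between the binomial and Poisson probabilities for r = 0 to 6. The function names and the particular values of n are assumptions chosen for illustration; they are not taken from the Workbook's table.

```python
from math import comb, exp, factorial


def binomial_pmf(r, n, p):
    """P(X = r) for X ~ B(n, p)."""
    return comb(n, r) * p**r * (1 - p)**(n - r)


def poisson_pmf(r, lam):
    """P(X = r) for a Poisson distribution with parameter lam."""
    return exp(-lam) * lam**r / factorial(r)


lam = 1.0  # the product np is held constant
for n in (10, 100, 1000):  # illustrative sample sizes
    p = lam / n
    worst = max(
        abs(binomial_pmf(r, n, p) - poisson_pmf(r, lam)) for r in range(7)
    )
    print(f"n = {n:4d}  p = {p:.3f}  largest gap for r = 0..6: {worst:.5f}")
```

The gap shrinks by roughly a factor of ten each time n is multiplied by ten, which is consistent with the conditions "n large, p small, np constant".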
Example 12
We introduced the binomial distribution by considering the following scenario. A
worn machine is known to produce 10% defective components. If the random vari-
able X is the number of defective components produced in a run of 3 components,
find the probabilities that X takes the values 0 to 3.
Suppose now that a similar machine which is known to produce 1% defective
components is used for a production run of 40 components. We wish to calculate
the probability that two defective items are produced. Essentially we are assuming
that X ∼ B(40, 0.01) and are asking for P(X = 2). We use both the binomial
distribution and its Poisson approximation for comparison.
Solution
Using the binomial distribution we have the solution
$$P(X = 2) = {}^{40}C_2 (0.99)^{40-2} (0.01)^2 = \frac{40 \times 39}{1 \times 2} \times 0.99^{38} \times 0.01^2 = 0.0532$$
Note that the arithmetic involved is unwieldy. Using the Poisson approximation with λ = np = 40 × 0.01 = 0.4 we have the solution
$$P(X = 2) = e^{-0.4} \frac{0.4^2}{2!} = 0.0536$$
Note that the arithmetic involved is simpler and the approximation is reasonable.
Practical considerations
In practice, we can use the Poisson distribution to very closely approximate the binomial distribution
provided that the product np is constant with
n ≥ 100 and p ≤ 0.05
Note that this is not a hard-and-fast rule and we simply say that
‘the larger n is the better and the smaller p is the better provided that np is a sensible size.’
The approximation remains good provided that np < 5 for values of n as low as 20.
Task
Mass-produced needles are packed in boxes of 1000. It is believed that 1 needle
in 2000 on average is substandard. What is the probability that a box contains
2 or more defectives? The correct model is the binomial distribution with n = 1000,
$p = \frac{1}{2000}$ (and $q = \frac{1999}{2000}$).
(a) Using the binomial distribution calculate P(X = 0), P(X = 1) and hence P(X ≥ 2):
Your solution
Answer
$$P(X = 0) = \left( \frac{1999}{2000} \right)^{1000} = 0.60645$$
$$P(X = 1) = 1000 \times \left( \frac{1999}{2000} \right)^{999} \times \frac{1}{2000} = \frac{1}{2} \left( \frac{1999}{2000} \right)^{999} = 0.30338$$
∴ P(X = 0) + P(X = 1) = 0.60645 + 0.30338 = 0.90983 ≈ 0.9098 (4 d.p.)
Hence P(2 or more defectives) ≈ 1 − 0.9098 = 0.0902.
(b) Now choose a suitable value for λ in order to use a Poisson model to approximate the probabilities:
Your solution
λ=
Answer
$\lambda = np = 1000 \times \frac{1}{2000} = \frac{1}{2}$
Now recalculate the probability that there are 2 or more defectives using the Poisson distribution
with $\lambda = \frac{1}{2}$:
Your solution
P(X = 0) =
P(X = 1) =
Answer
$P(X = 0) = e^{-\frac{1}{2}}$, $P(X = 1) = \frac{1}{2} e^{-\frac{1}{2}}$
∴ $P(X = 0) + P(X = 1) = \frac{3}{2} e^{-\frac{1}{2}} = 0.9098$ (4 d.p.)
Hence P(2 or more defectives) ≈ 1 − 0.9098 = 0.0902.
In the above Task we have obtained the same answer to 4 d.p., as the exact binomial calculation,
essentially because p was so small. We shall not always be so lucky!
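The agreement in the needles Task can be reproduced in a few lines. This Python sketch is an illustrative check under the Task's stated values (n = 1000, p = 1/2000); the variable names are assumptions introduced here.

```python
from math import exp

n, p = 1000, 1 / 2000
q = 1 - p

# Exact binomial: P(X >= 2) = 1 - P(X = 0) - P(X = 1).
binom_p0 = q**n
binom_p1 = n * p * q**(n - 1)
binom_tail = 1 - (binom_p0 + binom_p1)

# Poisson approximation with lambda = np.
lam = n * p
pois_tail = 1 - (exp(-lam) + lam * exp(-lam))

print(round(binom_tail, 4), round(pois_tail, 4))  # 0.0902 0.0902
```

The two tails differ only from the fifth decimal place onwards, which is why both methods give 0.0902 to 4 d.p.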
Example 13
In the manufacture of glassware, bubbles can occur in the glass which reduces the
status of the glassware to that of a ‘second’. If, on average, one in every 1000
items produced has a bubble, calculate the probability that exactly six items in a
batch of three thousand are seconds.
Solution
Suppose that X = number of items with bubbles, then X ∼ B(3000, 0.001)
Since n = 3000 > 100 and p = 0.001 < 0.05 we can use the Poisson distribution with λ = np =
3000 × 0.001 = 3. The calculation is:
$$P(X = 6) = e^{-3} \frac{3^6}{6!} \approx 0.0498 \times 1.0125 \approx 0.05$$
The result means that we have about a 5% chance of finding exactly six seconds in a batch of three
thousand items of glassware.
Example 14
A manufacturer produces light-bulbs that are packed into boxes of 100. If quality
control studies indicate that 0.5% of the light-bulbs produced are defective, what
percentage of the boxes will contain:
(a) no defective? (b) 2 or more defectives?
Solution
As n is large and p, the P(defective bulb), is small, use the Poisson approximation to the binomial
probability distribution. If X = number of defective bulbs in a box, then
X ∼ P(µ) where µ = n × p = 100 × 0.005 = 0.5
(a) $P(X = 0) = \frac{e^{-0.5} (0.5)^0}{0!} = \frac{e^{-0.5} (1)}{1} = 0.6065 \approx 61\%$
(b) P(X = 2 or more) = P(X = 2) + P(X = 3) + P(X = 4) + . . . but it is easier to consider:
P(X ≥ 2) = 1 − [P(X = 0) + P(X = 1)]
$P(X = 1) = \frac{e^{-0.5} (0.5)^1}{1!} = \frac{e^{-0.5} (0.5)}{1} = 0.3033$
i.e. P(X ≥ 2) = 1 − [0.6065 + 0.3033] = 0.0902 ≈ 9%
1. the probability of more than one event occurring in the subinterval is zero
2. the probability of one event occurring in a subinterval is proportional to the length of the
subinterval
Key Point 7
The Poisson Probabilities
If X is the random variable
‘number of occurrences in a given interval’
for which the average rate of occurrence is λ then, according to the Poisson model, the probability
of r occurrences in that interval is given by
$$P(X = r) = e^{-\lambda} \frac{\lambda^r}{r!}, \qquad r = 0, 1, 2, 3, \ldots$$
Task
Using the Poisson distribution $P(X = r) = e^{-\lambda} \frac{\lambda^r}{r!}$ write down the formulae for
P(X = 0), P(X = 1), P(X = 2) and P(X = 6), noting that 0! = 1.
Your solution
P(X = 0) =
P(X = 1) =
P(X = 2) =
P(X = 6) =
Answer
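$P(X = 0) = e^{-\lambda} \times \frac{\lambda^0}{0!} = e^{-\lambda} \times \frac{1}{1} \equiv e^{-\lambda}$
$P(X = 1) = e^{-\lambda} \times \frac{\lambda}{1!} = \lambda e^{-\lambda}$
$P(X = 2) = e^{-\lambda} \times \frac{\lambda^2}{2!} = \frac{\lambda^2}{2} e^{-\lambda}$
$P(X = 6) = e^{-\lambda} \times \frac{\lambda^6}{6!} = \frac{\lambda^6}{720} e^{-\lambda}$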
Task
Calculate P(X = 0) to P(X = 5) when λ = 2, accurate to 4 d.p.
Your solution
Answer
r 0 1 2 3 4 5
P(X = r) 0.1353 0.2707 0.2707 0.1804 0.0902 0.0361
Notice how the values for P(X = r) in the above answer increase, stay the same and then decrease
relatively rapidly (due to the significant increase in r! with increasing r). Here two of the probabilities
are equal and this will always be the case when λ is an integer.
In this last Task we only went up to P(X = 5) and calculated each entry separately. However, each
probability need not be calculated directly. We can use the following relations (which can be checked
from the formulae for P(X = r)) to get the next probability from the previous one:
$$P(X = 1) = \frac{\lambda}{1} P(X = 0), \quad P(X = 2) = \frac{\lambda}{2} P(X = 1), \quad P(X = 3) = \frac{\lambda}{3} P(X = 2), \quad \text{etc.}$$
Key Point 8
Recurrence Relation for Poisson Probabilities
In general, for ease of calculation the recurrence relation below can be used
$$P(X = r) = \frac{\lambda}{r} P(X = r - 1) \quad \text{for } r \geq 1.$$
Example 15
Calculate the value for P(X = 6) to extend the Table in the previous Task using
the recurrence relation and the value for P(X = 5).
Solution
The recurrence relation gives the formula
$$P(X = 6) = \frac{2}{6} P(X = 5) = \frac{1}{3} \times 0.0361 = 0.0120$$
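The recurrence relation lends itself to a short loop. The Python sketch below is one possible way to build the table for λ = 2, including the extended value P(X = 6); the function name `poisson_table` is an assumption introduced for illustration.

```python
from math import exp


def poisson_table(lam, r_max):
    """Return [P(X = 0), ..., P(X = r_max)] using P(r) = (lam / r) * P(r - 1)."""
    probs = [exp(-lam)]  # P(X = 0) = e^(-lam) starts the recurrence
    for r in range(1, r_max + 1):
        probs.append(probs[-1] * lam / r)
    return probs


table = poisson_table(2.0, 6)
print([round(p, 4) for p in table])
# [0.1353, 0.2707, 0.2707, 0.1804, 0.0902, 0.0361, 0.012]
```

Each probability is obtained from the previous one with a single multiplication, so no factorials or powers need to be evaluated. Full precision is carried between steps and rounding is applied only when printing.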
We now look further at the Poisson distribution by considering an example based on traffic flow.
Example 16
Suppose it has been observed that, on average, 180 cars per hour pass a specified
point on a particular road in the morning rush hour. Due to impending roadworks
it is estimated that congestion will occur closer to the city centre if more than
5 cars pass the point in any one minute. What is the probability of congestion
occurring?
Solution
We note that we cannot use the binomial model since we have no values of n and p. Essentially we
are saying that there is no fixed number (n) of cars passing the specified point and that we have
no way of estimating p. The only information available is the average rate at which cars pass the
specified point.
Let X be the random variable X = number of cars arriving in any minute. We need to calculate
the probability that more than 5 cars arrive in any one minute. Note that in order to do this we
need to convert the information given on the average rate (cars arriving per hour) into a value for
λ (cars arriving per minute). This gives the value λ = 3.
Using λ = 3 to calculate the required probabilities gives:
r 0 1 2 3 4 5 Sum
P(X = r) 0.04979 0.149361 0.22404 0.22404 0.168031 0.10082 0.91608
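Congestion occurs when more than 5 cars pass in a minute, so the required probability is the complement of the tabulated sum, P(X > 5) = 1 − P(X ≤ 5). The Python sketch below carries out that step; the function name `poisson_cdf` is an assumption introduced here for illustration.

```python
from math import exp


def poisson_cdf(k, lam):
    """P(X <= k) for a Poisson distribution with parameter lam."""
    term = exp(-lam)  # P(X = 0)
    total = term
    for r in range(1, k + 1):
        term *= lam / r  # recurrence: P(r) = (lam / r) * P(r - 1)
        total += term
    return total


lam = 180 / 60  # 180 cars per hour is 3 cars per minute
p_congestion = 1 - poisson_cdf(5, lam)
print(round(poisson_cdf(5, lam), 5))  # 0.91608
print(round(p_congestion, 5))         # 0.08392
```

On this model there is roughly an 8.4% chance of congestion in any given minute.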
Example 17
The mean number of bacteria per millilitre of a liquid is known to be 6. Find the
probability that in 1 ml of the liquid, there will be:
(a) 0, (b) 1, (c) 2, (d) 3, (e) less than 4, (f) 6 bacteria.
Solution
Here we have an average rate of occurrences but no estimate of the probability so it looks as though
we have a Poisson distribution with λ = 6. Using the formula in Key Point 7 we have:
(a) $P(X = 0) = e^{-6} \frac{6^0}{0!} = 0.00248$.
That is, the probability of having no bacteria in 1 ml of liquid is 0.00248
(b) $P(X = 1) = \frac{\lambda}{1} \times P(X = 0) = 6 \times 0.00248 = 0.0149$.
That is, the probability of having 1 bacterium in 1 ml of liquid is 0.0149
(c) $P(X = 2) = \frac{\lambda}{2} \times P(X = 1) = \frac{6}{2} \times 0.01487 = 0.0446$.
That is, the probability of having 2 bacteria in 1 ml of liquid is 0.0446
(d) $P(X = 3) = \frac{\lambda}{3} \times P(X = 2) = \frac{6}{3} \times 0.04462 = 0.0892$.
That is, the probability of having 3 bacteria in 1 ml of liquid is 0.0892
(e) P(X < 4) = P(X = 0) + P(X = 1) + P(X = 2) + P(X = 3) = 0.1512
(f) $P(X = 6) = e^{-6} \frac{6^6}{6!} = 0.1606$
Note that in working out the first 6 answers, which link together, all the digits were kept in the
calculator to ensure accuracy. Answers were rounded off only when written down.
Never copy down answers correct to, say, 4 decimal places and then use those rounded figures to
calculate the next figure as rounding-off errors will become greater at each stage. If you did so here
you would get answers 0.0025, 0.0150, 0.0450, 0.0900 and P(X < 4) = 0.1525. The difference is
not great but could be significant.
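The effect of rounding at each stage can be demonstrated directly. The Python sketch below runs the recurrence for λ = 6 twice, once carrying full precision and once rounding every intermediate value to 4 d.p.; the function name `first_four` and its flag are assumptions introduced for illustration.

```python
from math import exp

lam = 6.0


def first_four(round_each_step):
    """P(X = 0..3) by recurrence, optionally rounding to 4 d.p. at every stage."""
    p = exp(-lam)
    if round_each_step:
        p = round(p, 4)
    probs = [p]
    for r in range(1, 4):
        p = p * lam / r
        if round_each_step:
            p = round(p, 4)
        probs.append(p)
    return probs


exact = first_four(False)
rounded = first_four(True)
print(round(sum(exact), 4))    # 0.1512
print(rounded)                 # [0.0025, 0.015, 0.045, 0.09]
print(round(sum(rounded), 4))  # 0.1525
```

The initial rounding error in P(X = 0) is multiplied by 6, then 3, then 2, so it grows at each step rather than averaging out.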
Task
A Council is considering whether to base a recovery vehicle on a stretch of road to
help clear incidents as quickly as possible. The road concerned carries over 5000
vehicles during the peak rush hour period. Records show that, on average, the
number of incidents during the morning rush hour is 5. The Council won’t base a
vehicle on the road if the probability of having more than 5 incidents in any one
morning is less than 30%. Based on this information should the Council provide a
vehicle?
Your solution
(Do the calculation on separate paper and record the main results here.)
Answer
We need to calculate the probability that more than 5 incidents occur i.e. P(X > 5). To find this
we use the fact that P(X > 5) = 1 − P(X ≤ 5). Now, for this problem:
$$P(X = r) = e^{-5} \frac{5^r}{r!}$$
Writing answers to 5 d.p. gives:
$P(X = 0) = e^{-5} \frac{5^0}{0!} = 0.00674$
$P(X = 1) = 5 \times P(X = 0) = 0.03369$
$P(X = 2) = \frac{5}{2} \times P(X = 1) = 0.08422$
$P(X = 3) = \frac{5}{3} \times P(X = 2) = 0.14037$
$P(X = 4) = \frac{5}{4} \times P(X = 3) = 0.17547$
$P(X = 5) = \frac{5}{5} \times P(X = 4) = 0.17547$
The probability of more than 5 incidents is P(X > 5) = 1 − P(X ≤ 5) = 0.38403, which is 38.4%
(to 3 s.f.) so the Council should provide a vehicle.
Key Point 9
The Poisson Distribution
If X is the random variable {number of occurrences in a given interval}
for which the average rate of occurrences is λ and X can assume the values 0, 1, 2, 3, . . . and the
probability of r occurrences in that interval is given by
$$P(X = r) = e^{-\lambda} \frac{\lambda^r}{r!}$$
then the expectation and variance of the distribution are given by the formulae
E(X) = λ and V(X) = λ
For a Poisson distribution the Expectation and Variance are equal.
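The equality of expectation and variance can be checked numerically from the definitions E(X) = Σ r P(X = r) and V(X) = Σ r² P(X = r) − [E(X)]². The Python sketch below truncates the infinite sums at an assumed cut-off `r_max`, which is adequate for moderate values of λ; the function name is also an assumption.

```python
from math import exp


def poisson_mean_and_variance(lam, r_max=200):
    """Approximate E(X) and V(X) by summing the first r_max + 1 terms."""
    p = exp(-lam)  # P(X = 0)
    mean = 0.0
    second_moment = 0.0
    for r in range(r_max + 1):
        mean += r * p
        second_moment += r * r * p
        p *= lam / (r + 1)  # recurrence gives P(X = r + 1)
    return mean, second_moment - mean**2


for lam in (0.5, 3.0, 6.0):
    mean, variance = poisson_mean_and_variance(lam)
    print(lam, round(mean, 6), round(variance, 6))
```

For each λ both printed values agree with λ to the displayed precision, in line with the Key Point.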
Exercises
1. Large sheets of metal have faults in random positions but on average have 1 fault per 10 m².
What is the probability that a sheet 5 m × 8 m will have at most one fault?
2. If 250 litres of water are known to be polluted with $10^6$ bacteria what is the probability that a
sample of 1 cc of the water contains no bacteria?
3. Suppose vehicles arrive at a signalised road intersection at an average rate of 360 per hour and
the cycle of the traffic lights is set at 40 seconds. In what percentage of cycles will the number
of vehicles arriving be (a) exactly 5, (b) less than 5? If, after the lights change to green, there
is time to clear only 5 vehicles before the signal changes to red again, what is the probability
that waiting vehicles are not cleared in one cycle?
(a) Find the probability that there are 4 defective transistors in a batch of 2000.
(b) What is the largest number, N , of transistors that can be put in a box so that the
probability of no defectives is at least 1/2?
5. A manufacturer sells a certain article in batches of 5000. By agreement with a customer the
following method of inspection is adopted: A sample of 100 items is drawn at random from
each batch and inspected. If the sample contains 4 or fewer defective items, then the batch
is accepted by the customer. If more than 4 defectives are found, every item in the batch is
inspected. If inspection costs are 75 p per hundred articles, and the manufacturer normally
produces 2% of defective articles, find the average inspection costs per batch.
6. A book containing 150 pages has 100 misprints. Find the probability that a particular page
contains (a) no misprints, (b) 5 misprints, (c) at least 2 misprints, (d) more than 1 misprint.
7. For a particular machine, the probability that it will break down within a week is 0.009. The
manufacturer has installed 800 machines over a wide area. Calculate the probability that (a)
5, (b) 9, (c) less than 5, (d) more than 4 machines break down in a week.
8. At a given university, the probability that a member of staff is absent on any one day is 0.001.
If there are 800 members of staff, calculate the probabilities that the number absent on any
one day is (a) 6, (b) 4, (c) 2, (d) 0, (e) less than 3, (f) more than 1.
9. The number of failures occurring in a machine of a certain type in a year has a Poisson
distribution with mean 0.4. In a factory there are ten of these machines. What is
Exercises continued
10. A factory uses tools of a particular type. From time to time failures in these tools occur and
they need to be replaced. The number of such failures in a day has a Poisson distribution with
mean 1.25. At the beginning of a particular day there are five replacement tools in stock. A
new delivery of replacements will arrive after four days. If all five spares are used before the
new delivery arrives then further replacements cannot be made until the delivery arrives.
Find
(a) the probability that three replacements are required over the next four days.
(b) the expected number of replacements actually made over the next four days.
Answers
1. Poisson Process. In a sheet of size 40 m² we expect 4 faults
∴ λ = 4, $P(X = r) = \lambda^r e^{-\lambda} / r!$
Answers
5. P(defective) = 0.02. Poisson approximation to binomial λ = np = 100(0.02) = 2
P(4 or fewer defectives in sample of 100)
$$= e^{-2} + 2e^{-2} + \frac{2^2}{2} e^{-2} + \frac{2^3}{3!} e^{-2} + \frac{2^4}{4!} e^{-2} = 0.947347$$
Inspection costs
Cost c        75          75 × 50
P(X = c)      0.947347    0.0526
E(Cost) = 75(0.947347) + 75 × 50(0.0526) = 268.5 p
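The expected inspection cost can be reproduced with a short calculation. The Python sketch below follows the cost table in this answer (75 p if the batch is accepted on the sample, 75 × 50 p if every item is inspected); the variable names are assumptions introduced for illustration.

```python
from math import exp, factorial

lam = 100 * 0.02  # Poisson parameter for the sample of 100 items

# P(4 or fewer defectives in the sample), i.e. the batch is accepted.
p_accept = sum(exp(-lam) * lam**r / factorial(r) for r in range(5))

cost_sample_only = 75      # pence, one hundred articles inspected
cost_full_batch = 75 * 50  # pence, all 5000 articles inspected

expected_cost = cost_sample_only * p_accept + cost_full_batch * (1 - p_accept)
print(round(p_accept, 6), round(expected_cost, 1))  # 0.947347 268.5
```

Using the unrounded rejection probability 1 − 0.947347 rather than 0.0526 is what gives 268.5 p.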
10. Let the number required over 4 days be X. Then E(X) = 4 × 1.25 = 5 and X ∼ Poisson(5).
(a) $P(X = 3) = \frac{e^{-5} 5^3}{3!} = 0.1404$.
(b) Let R be the number of replacements made.
and
The Hypergeometric
Distribution 37.4
Introduction
The hypergeometric distribution enables us to deal with situations arising when we sample from
batches with a known number of defective items. In essence, the number of defective items in a
batch is not a random variable - it is a known, fixed, number.
Prerequisites
Before starting this Section you should . . .
• understand the concepts of probability
• understand the notation ${}^{n}C_r$ used in probability calculations
1. The Hypergeometric distribution
Suppose we are sampling without replacement from a batch of items containing a variable number of
defectives. We are essentially assuming that we know the probability p that a given item is defective
but not the actual number of defective items contained in the batch. The number of defective items
in the batch is a random variable in this case.
When we sample from the batch, we are left with:
1. a smaller batch;
2. a (possibly) smaller (but still variable) number of defective items. The number of defective
items is still a random variable.
While the probability of finding a given number of defectives in a sample drawn from the second
batch will (in general) be different from the probability of finding a given number of defectives in a
sample drawn from the first batch, sampling from both batches may be described by the binomial
distribution for which:
$$P(X = r) = {}^{n}C_r (1 - p)^{n-r} p^r$$
Sampling in this case varies the values of n and p in general but not the underlying distribution
describing the sampling process.
Example 18
A batch of 100 piston rings is known to contain 10 defective rings. If two piston
rings are drawn from the batch, write down the probabilities that:
Solution
(a) The probability that the first ring is defective is clearly $\frac{10}{100} = \frac{1}{10}$.
(b) Assuming that the first ring selected is defective and we do not replace it, the probability
that the second ring is defective is equally clearly $\frac{9}{99} = \frac{1}{11}$.
The hypergeometric distribution may be thought of as arising from sampling from a batch of items
where the number of defective items contained in the batch is known.
Essentially the number of defectives contained in the batch is not a random variable, it is fixed.
The calculations involved when using the hypergeometric distribution are usually more complex than
their binomial counterparts.
If we sample without replacement we may proceed in general as follows:
• we may select n items from a population of N items in ${}^{N}C_n$ ways;
• we may select r defective items from M defective items in ${}^{M}C_r$ ways;
• we may select n − r non-defective items from N − M non-defective items in ${}^{N-M}C_{n-r}$ ways;
• hence we may select n items containing r defectives in ${}^{M}C_r \times {}^{N-M}C_{n-r}$ ways;
• hence the probability that we select a sample of size n containing r defective items from a
population of N items known to contain M defective items is
$$\frac{{}^{M}C_r \times {}^{N-M}C_{n-r}}{{}^{N}C_n}$$
Key Point 10
Hypergeometric Distribution
The distribution given by
$$P(X = r) = \frac{{}^{M}C_r \times {}^{N-M}C_{n-r}}{{}^{N}C_n}$$
which describes the probability of obtaining a sample of size n containing r defective items from
a population of size N known to contain M defective items is known as the hypergeometric
distribution.
Example 19
A batch of 10 rocker cover gaskets contains 4 defective gaskets. If we draw samples
of size 3 without replacement, from the batch of 10, find the probability that a
sample contains 2 defective gaskets.
Solution
Using $P(X = r) = \frac{{}^{M}C_r \times {}^{N-M}C_{n-r}}{{}^{N}C_n}$ we know that N = 10, M = 4, n = 3 and r = 2.
Hence $P(X = 2) = \frac{{}^{4}C_2 \times {}^{6}C_1}{{}^{10}C_3} = \frac{6 \times 6}{120} = 0.3$
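The hypergeometric formula translates directly into code using the standard-library function `math.comb`. The Python sketch below is illustrative; the function name `hypergeom_pmf` and its argument order are assumptions introduced here.

```python
from math import comb


def hypergeom_pmf(r, N, M, n):
    """P(X = r): r defectives in a sample of n drawn without replacement
    from a population of N items containing M defectives."""
    return comb(M, r) * comb(N - M, n - r) / comb(N, n)


# Rocker cover gaskets: N = 10, M = 4, n = 3, r = 2.
print(round(hypergeom_pmf(2, 10, 4, 3), 4))  # 0.3

# The probabilities over all possible r should sum to 1.
print(round(sum(hypergeom_pmf(r, 10, 4, 3) for r in range(4)), 10))  # 1.0
```

Because `math.comb` works with exact integers, the only rounding occurs in the final division.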
It is possible to derive formulae for the mean and variance of the hypergeometric distribution. However,
the calculations are more difficult than their binomial counterparts, so we will simply state the
results.
Key Point 11
Expectation and Variance of the Hypergeometric Distribution
The expectation (mean) and variance of the hypergeometric random variable
$$P(X = r) = \frac{{}^{M}C_r \times {}^{N-M}C_{n-r}}{{}^{N}C_n}$$
are given by
$$E(X) = \mu = np \quad \text{and} \quad V(X) = np(1 - p) \frac{N - n}{N - 1} \quad \text{where } p = \frac{M}{N}$$
Example 20
For the previous Example, concerning rocker cover gaskets, find the expectation
and variance of the number of defective gaskets in a sample.
Solution
Using $P(X = r) = \frac{{}^{M}C_r \times {}^{N-M}C_{n-r}}{{}^{N}C_n}$ we know that N = 10, M = 4 and n = 3.
Hence
$$E(X) = np = 3 \times \frac{4}{10} = 1.2$$
and
$$V(X) = np(1 - p) \frac{N - n}{N - 1} = 3 \times \frac{4}{10} \times \frac{6}{10} \times \frac{10 - 3}{10 - 1} = 0.56$$
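The mean and variance of a hypergeometric random variable can be checked by direct summation over its probability function. The Python sketch below does this for the gasket example and compares the result with the closed forms np and np(1 − p)(N − n)/(N − 1), where (N − n)/(N − 1) is the finite population correction. The function names are assumptions introduced for illustration.

```python
from math import comb


def hypergeom_pmf(r, N, M, n):
    """P(X = r) for a sample of n from N items containing M defectives."""
    return comb(M, r) * comb(N - M, n - r) / comb(N, n)


def mean_and_variance_by_summation(N, M, n):
    """E(X) and V(X) computed from the definition, over all feasible r."""
    r_low = max(0, n - (N - M))
    r_high = min(n, M)
    mean = sum(r * hypergeom_pmf(r, N, M, n) for r in range(r_low, r_high + 1))
    second = sum(r * r * hypergeom_pmf(r, N, M, n) for r in range(r_low, r_high + 1))
    return mean, second - mean**2


def mean_and_variance_closed_form(N, M, n):
    """E(X) = np and V(X) = np(1 - p)(N - n)/(N - 1) with p = M/N."""
    p = M / N
    return n * p, n * p * (1 - p) * (N - n) / (N - 1)


print([round(v, 4) for v in mean_and_variance_by_summation(10, 4, 3)])  # [1.2, 0.56]
print([round(v, 4) for v in mean_and_variance_closed_form(10, 4, 3)])   # [1.2, 0.56]
```

The correction factor depends on the sample size n, not on the number of defectives M: as n approaches N the variance falls to zero, because a sample consisting of the whole batch always contains exactly M defectives.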
Task
In the manufacture of car tyres, a particular production process is known to yield 10
tyres with defective walls in every batch of 100 tyres produced. From a production
batch of 100 tyres, a sample of 4 is selected for testing to destruction. Find:
Your solution
Answer
Sampling is clearly without replacement and we use the hypergeometric distribution with
N = 100, M = 10, n = 4, r = 1 and p = 0.1. Hence:
(a) $P(X = r) = \frac{{}^{M}C_r \times {}^{N-M}C_{n-r}}{{}^{N}C_n}$ gives
$$P(X = 1) = \frac{{}^{10}C_1 \times {}^{100-10}C_{4-1}}{{}^{100}C_4} = \frac{10 \times 117480}{3921225} \approx 0.3$$
(b) The expectation is E(X) = np = 4 × 0.1 = 0.4
(c) The variance is $V(X) = np(1 - p) \frac{N - n}{N - 1} = 0.4 \times 0.9 \times \frac{96}{99} \approx 0.35$
Task
A company (the producer) supplies microprocessors to a manufacturer (the con-
sumer) of electronic equipment. The microprocessors are supplied in batches of
50. The consumer regards a batch as acceptable provided that there are not
more than 5 defective microprocessors in the batch. Rather than test all of the
microprocessors in the batch, 10 are selected at random and tested.
(a) Find the probability that out of a sample of 10, d = 0, 1, 2, 3, 4, 5 are defec-
tive when there are actually 5 defective microprocessors in the batch.
(b) Suppose that the consumer will accept the batch provided that not more
than m defectives are found in the sample of 10.
(i) Find the probability that the batch is accepted when there are 5 defec-
tives in the batch.
(ii) Find the probability that the batch is rejected when there are 3 defec-
tives in the batch.
Your solution
Answer
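(a) Let X = the number of defectives in a sample. Then
$$P(X = d) = \frac{{}^{45}C_{10-d} \times {}^{5}C_d}{{}^{50}C_{10}}$$
Hence
$P(X = 0) = \frac{{}^{45}C_{10} \times {}^{5}C_0}{{}^{50}C_{10}} = 0.311$
$P(X = 1) = \frac{{}^{45}C_9 \times {}^{5}C_1}{{}^{50}C_{10}} = 0.431$
$P(X = 2) = \frac{{}^{45}C_8 \times {}^{5}C_2}{{}^{50}C_{10}} = 0.210$
$P(X = 3) = \frac{{}^{45}C_7 \times {}^{5}C_3}{{}^{50}C_{10}} = 0.044$
$P(X = 4) = \frac{{}^{45}C_6 \times {}^{5}C_4}{{}^{50}C_{10}} = 0.004$
$P(X = 5) = \frac{{}^{45}C_5 \times {}^{5}C_5}{{}^{50}C_{10}} = 0.0001$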
Exercise
A company buys batches of n components. Before a batch is accepted, m of the components are
selected at random from the batch and tested. The batch is rejected if more than d components in
the sample are found to be below standard.
(a) Find the probability that a batch which actually contains six below-standard components
is rejected when n = 20, m = 5 and d = 1.
(b) Find the probability that a batch which actually contains nine below-standard components
is rejected when n = 30, m = 10 and d = 1.
Answer
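(a) Let the number of below-standard components in the sample be X. The probability of
acceptance is
$$P(X = 0) + P(X = 1) = \frac{\binom{14}{5} \binom{6}{0}}{\binom{20}{5}} + \frac{\binom{14}{4} \binom{6}{1}}{\binom{20}{5}}$$
$$= \frac{\frac{14}{5} \times \frac{13}{4} \times \frac{12}{3} \times \frac{11}{2} \times \frac{10}{1} + \frac{14}{4} \times \frac{13}{3} \times \frac{12}{2} \times \frac{11}{1} \times \frac{6}{1}}{\frac{20}{5} \times \frac{19}{4} \times \frac{18}{3} \times \frac{17}{2} \times \frac{16}{1}}$$
$$= \frac{2002 + 6006}{15504} = 0.5165$$
Hence the probability of rejection is 1 − 0.5165 = 0.4835
(b) The probability of acceptance is
$$P(X = 0) + P(X = 1) = \frac{\binom{21}{10} \binom{9}{0}}{\binom{30}{10}} + \frac{\binom{21}{9} \binom{9}{1}}{\binom{30}{10}}$$
Now
$$\binom{21}{10} \binom{9}{0} = \frac{21}{10} \times \frac{20}{9} \times \frac{19}{8} \times \frac{18}{7} \times \frac{17}{6} \times \frac{16}{5} \times \frac{15}{4} \times \frac{14}{3} \times \frac{13}{2} \times \frac{12}{1} = 352716$$
$$\binom{21}{9} \binom{9}{1} = \frac{21}{9} \times \frac{20}{8} \times \frac{19}{7} \times \frac{18}{6} \times \frac{17}{5} \times \frac{16}{4} \times \frac{15}{3} \times \frac{14}{2} \times \frac{13}{1} \times \frac{9}{1} = 2645370$$
$$\binom{30}{10} = \frac{30}{10} \times \frac{29}{9} \times \frac{28}{8} \times \frac{27}{7} \times \frac{26}{6} \times \frac{25}{5} \times \frac{24}{4} \times \frac{23}{3} \times \frac{22}{2} \times \frac{21}{1} = 30045015$$
so the probability of acceptance is
$$\frac{352716 + 2645370}{30045015} = 0.0998$$
Hence the probability of rejection is 1 − 0.0998 = 0.9002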
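The acceptance sampling rule in this Exercise can be written as a small function. The Python sketch below is illustrative; the function name `rejection_probability` and its parameter names are assumptions, chosen to avoid clashing with the symbols n, m and d used in the Exercise.

```python
from math import comb


def rejection_probability(batch_size, below_standard, sample_size, d):
    """Probability that more than d below-standard components appear in a
    sample drawn without replacement, so that the batch is rejected."""
    total = comb(batch_size, sample_size)
    good = batch_size - below_standard
    accept = sum(
        comb(below_standard, r) * comb(good, sample_size - r)
        for r in range(d + 1)
    ) / total
    return 1 - accept


print(round(rejection_probability(20, 6, 5, 1), 4))   # 0.4835
print(round(rejection_probability(30, 9, 10, 1), 4))  # 0.9002
```

Working through the complement (acceptance) keeps the sum short, since only the cases X = 0, . . . , d need to be counted.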