
Unit 3

Discrete Probability Distributions

Introduction
You are welcome to this unit. In Unit 1, we discussed both discrete and continuous random
variables. In this unit, we consider some specific discrete distributions and how their
probabilities are calculated and applied in real-life situations. The most important of these,
which we consider in this unit, are the binomial, Poisson, negative binomial, geometric and
hypergeometric distributions. Unit 3 is structured under the following sections:

Section 1 Binomial Distribution


Section 2 Poisson Distribution
Section 3 Negative Binomial Distribution
Section 4 Geometric Distribution
Section 5 Hypergeometric Distribution
Section 6 Sums of Binomial Random Variables

Unit Objectives
By the end of this unit, you should be able to:
 Identify each of the discrete distributions under study by their properties
 Use each of the discrete distributions to solve problems on probabilities
 Apply the specific discrete probability distributions in real life situations
SECTION 1: Binomial Distribution
Introduction
We welcome you to Section 1 of Unit 3. In this section, we consider one of the most important
distributions in probability: the binomial distribution. We will enumerate the various properties of the
binomial distribution that will enable you to easily identify a distribution as binomial. We then
proceed to determine how to calculate probabilities for the binomial distribution and their
applications in real-life situations. We wish you well in this section.

Objectives
By the end of this section, you should be able to:
 State the properties of the binomial distribution
 Identify the binomial distribution as a discrete probability distribution
 Apply the binomial probability distributions in real life situations

Dear learner, we start this section by reminding you that many statistical problems deal with
situations referred to as repeated trials. Repeated trials play a very important role in probability
and statistics, especially when the number of trials is fixed, the probability of a success is the
same for each trial, and the trials are all independent. For example, we may want to know the
probability that 9 out of 10 VCRs will run at least 1000 hours, the probability that 45 of 120
drivers stopped at a roadblock will be wearing seatbelts, the probability that 7 out of 10 persons
will recover from malaria, or the probability that 35 out of 47 candidates will pass the impending
end-of-semester examination.

In each of these examples, we are interested in the probability of getting x successes in n trials,
or in other words x successes and n − x failures in n attempts. Many real situations conform
either exactly or approximately to this description.
A binomial experiment is one that possesses the following properties:

i. The experiment consists of n repeated trials.
ii. There are only two possible outcomes for each trial: "success" and "failure".
iii. The probability of a success, denoted by p, remains constant from trial to trial.
iv. The repeated n trials are independent.

If p and 1 − p (where q = 1 − p) are the probabilities of success and failure on any one trial, then the
probability of getting x successes and n − x failures in some specific order is

p^x (1-p)^{n-x}.

The number of ways in which we can select the x trials on which there is to be a success is \binom{n}{x},
and it follows that the desired probability for x successes in n trials is

\binom{n}{x} p^x (1-p)^{n-x}.
Definition  A random variable X has a binomial distribution, and is referred to as a binomial
random variable, if its probability distribution is given by

b(x; n, p) = \binom{n}{x} p^x (1-p)^{n-x}, for x = 0, 1, 2, \ldots, n.

That is, the number of successes in n trials is a random variable having a binomial distribution
with parameters n and p. This probability distribution is called the binomial distribution
because for x = 0, 1, 2, \ldots, n the values of the probabilities are the successive terms of the
binomial expansion of [(1-p) + p]^n. This also shows that the probabilities sum to one,
as they should.
We consider some examples:

Example 3.1.1

Each of five questions on a multiple-choice examination has four choices, only one of which is
correct. What is the probability that:
1. a student will get exactly three answers correct?
2. a student will get at most three answers correct?
3. a student will get at least four answers correct?

Solution

Let X be the number of correct answers. The number of trials is n = 5, and the probability of a success
(guessing a correct answer) is p = 1/4. Notice that this problem satisfies all four properties
of the binomial distribution. Here n = 5, p = 1/4 and x = 3.

1. P(exactly 3 correct) = b(3; 5, 1/4) = 10 (1/4)^3 (3/4)^2 = 10 (1/64)(9/16) = 90/1024 ≈ 0.0879

2. P(at most 3 correct) = P(X ≤ 3) = \sum_{x=0}^{3} \binom{5}{x} (1/4)^x (3/4)^{5-x}
   = \binom{5}{0}(1/4)^0(3/4)^5 + \binom{5}{1}(1/4)^1(3/4)^4 + \binom{5}{2}(1/4)^2(3/4)^3 + \binom{5}{3}(1/4)^3(3/4)^2
   = 1008/1024 ≈ 0.9844

3. P(at least 4 correct) = P(X ≥ 4) = 1 − P(X ≤ 3) = 1 − 0.9844 = 0.0156
Example 3.1.2

Find the probability that on the streets of Accra, 7 out of 10 tro-tro drivers stopped at a
roadblock will be wearing seatbelts, if we can assume independence and the probability is 0.2 that
any one of them wears a seatbelt.

Solution

We find that there are only two outcomes (wearing and not wearing a seatbelt). Also the
probability of wearing a seatbelt is 0.2, which is the same throughout the experiment. This
experiment can therefore be seen as binomial.
Substituting x = 7, n = 10 and p = 0.2 into the formula for the binomial distribution, we get

b(7; 10, 0.2) = \binom{10}{7} (0.2)^7 (0.8)^3 = 120 (0.2)^7 (0.8)^3 ≈ 0.00079.
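Calculations like those in Examples 3.1.1 and 3.1.2 are easy to check with a few lines of code. The sketch below is a minimal illustration in Python (standard library only; the helper name binomial_pmf is ours, not part of the text) that evaluates the binomial formula directly.

```python
from math import comb

def binomial_pmf(x: int, n: int, p: float) -> float:
    """b(x; n, p) = C(n, x) * p^x * (1 - p)^(n - x)."""
    return comb(n, x) * p**x * (1 - p)**(n - x)

# Example 3.1.1 (n = 5, p = 1/4)
print(binomial_pmf(3, 5, 0.25))                             # 1. ~0.0879
print(sum(binomial_pmf(x, 5, 0.25) for x in range(4)))      # 2. ~0.9844
print(1 - sum(binomial_pmf(x, 5, 0.25) for x in range(4)))  # 3. ~0.0156

# Example 3.1.2 (n = 10, p = 0.2)
print(binomial_pmf(7, 10, 0.2))                             # ~0.00079
```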

If n is large, the calculation of probabilities can become quite tedious. In that case, it may be
desirable to use numerical approximations or refer to special tables. We therefore consider the
numerical characteristics of the binomial distribution.

The mean of a probability distribution is simply the mathematical expectation of the corresponding
random variable. If a random variable takes on the values x_1, x_2, \ldots, x_k with probabilities
f(x_1), f(x_2), \ldots, f(x_k), its mathematical expectation is

x_1 f(x_1) + x_2 f(x_2) + \cdots + x_k f(x_k).

That is,

\mu = \sum_{\text{all } x} x f(x),

where the mean is denoted by the Greek letter \mu. The mean of a probability distribution measures
its centre in the sense of an average, or better still, in the sense of a centre of gravity.

Theorem  For a random variable X that has a binomial distribution with parameters n and p,
the mathematical expectation is given by

E[X] = \mu = np.

Proof  For the random variable X, the expectation is given by

E[X] = \sum_{x} x f(x).

We substitute the expression that defines b(x; n, p) and get

E[X] = \sum_{x=0}^{n} x \binom{n}{x} p^x (1-p)^{n-x} = \sum_{x=0}^{n} x \frac{n!}{x!(n-x)!} p^x (1-p)^{n-x}.

But

\frac{x}{x!} = \frac{1}{(x-1)!}, \quad p^x = p \cdot p^{x-1} \quad \text{and} \quad n! = n(n-1)!.

Making use of these expressions and factoring out n and p, we obtain

E[X] = np \sum_{x=1}^{n} \frac{(n-1)!}{(x-1)!(n-x)!} p^{x-1} (1-p)^{n-x},

where the summation starts at x = 1 since the original summand is zero for x = 0. Now let
y = x − 1 and m = n − 1. Then we obtain

E[X] = np \sum_{y=0}^{m} \frac{m!}{y!(m-y)!} p^{y} (1-p)^{m-y}.

This last sum can be recognised as the sum of all terms of the binomial distribution with parameters
m and p. Hence the sum equals one, and it follows that E[X] = np.

Definition  The variance \sigma^2 and the standard deviation \sigma of the binomial distribution with
parameters n and p are, respectively,

\sigma^2 = np(1-p) \quad \text{and} \quad \sigma = \sqrt{np(1-p)}.
Example 3.1.3

If 75% of all consultations handled by lecturer consultants at a computing centre involve
programs with syntax errors, and X is the number of programs with syntax errors in 10 randomly
chosen consultations, then the mathematical expectation is

E[X] = np = 10(0.75) = 7.5.

The variance is \sigma^2 = np(1-p) = 10(0.75)(0.25) = 1.875, and the standard deviation is therefore
\sigma = \sqrt{1.875} ≈ 1.37.
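As a quick numerical check of these formulae, the sketch below (Python, standard library only; the helper names are our own, not from the text) computes the mean and variance of Example 3.1.3 both from the closed forms np and np(1 − p) and directly from the definition of expectation.

```python
from math import comb, sqrt

def binomial_pmf(x: int, n: int, p: float) -> float:
    return comb(n, x) * p**x * (1 - p)**(n - x)

n, p = 10, 0.75

# Closed-form values from the theorem and definition above
mean_formula = n * p                  # 7.5
var_formula = n * p * (1 - p)         # 1.875

# Direct computation from the definition of expectation
mean_direct = sum(x * binomial_pmf(x, n, p) for x in range(n + 1))
var_direct = sum((x - mean_direct) ** 2 * binomial_pmf(x, n, p) for x in range(n + 1))

print(mean_formula, mean_direct)      # both 7.5
print(var_formula, var_direct)        # both 1.875
print(sqrt(var_formula))              # standard deviation ~1.37
```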
SECTION 2: Poisson Distribution

Introduction
In this section, we consider another example of discrete probability distributions called the
Poisson distribution. The Poisson distribution is widely used in the study of business and
industrial processes. This distribution was discovered by the French mathematician Siméon Denis
Poisson.

Objectives
By the end of this section, you should be able to:
 State the properties of the Poisson distribution
 Identify the Poisson distribution as a discrete probability distribution
 Use the Poisson distribution to calculate probabilities
 Apply the Poisson probability distribution in real life situations

When n is large, the calculation of binomial probabilities with the formula that was derived in
the last section, will usually involve a prohibitive amount of work. For example, if in the
binomial process n = 7000, x = 25 and p = 0.001, then we need to calculate

b(x; n, p) = \binom{n}{x} p^x (1-p)^{n-x} = \binom{7000}{25} (0.001)^{25} (0.999)^{6975}.

This becomes very difficult to evaluate. The Poisson distribution provides a way out of this
problem. The Poisson distribution comes into play in situations in which discrete events are
being observed in some continuous interval of time or space. The random variable of interest is
the number of occurrences of the event in the observation period. The given time interval may be of any
length, such as a minute, a day, a week, a month or even a year. The Poisson experiment might
generate observations for the random variable X, representing the number of telephone calls per
hour received by an office, the number of weeks the Universities are closed down due to strike
actions, or the number of postponed games due to rain during a football season.

A Poisson experiment is one that possesses the following properties:

i. The number of successes occurring in one time interval or specified region is independent of
the number occurring in any other disjoint time interval or region of space.
ii. The probability of a single success occurring during a very short time interval is proportional
to the length of the interval and does not depend on the number of successes occurring outside
this interval.
iii. The probability of more than one success occurring in such a short interval is negligible.

When we observe the occurrence of discrete events in a continuous interval, we are observing
what is called a Poisson process. Each Poisson process is characterised by one parameter, which
we denote by \lambda. This parameter gives the average number of occurrences of the event in a unit
interval. The probability distribution of the Poisson variable X is called the Poisson distribution
and will be denoted by p(x; \lambda), since its values depend only on \lambda, the average number of
successes occurring in a given time interval or specified region.

Definition  The probability distribution of the Poisson random variable X, representing the
number of successes occurring in a given time interval or specified region, is given by

p(x; \lambda) = \frac{\lambda^x e^{-\lambda}}{x!}, for x = 0, 1, 2, \ldots,

where \lambda is the average number of successes occurring in the given time interval or specified
region and e ≈ 2.71828.
Statistical tables contain the Poisson probability sums

P(r; \lambda) = \sum_{x=0}^{r} p(x; \lambda).
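If statistical tables are not to hand, these cumulative sums can be computed directly. A minimal sketch follows (Python, standard library only; the helper names poisson_pmf and poisson_cdf are ours); the two values printed are the tabulated sums used in Example 3.2.1 below.

```python
from math import exp, factorial

def poisson_pmf(x: int, lam: float) -> float:
    """p(x; lambda) = lambda^x * e^(-lambda) / x!"""
    return lam**x * exp(-lam) / factorial(x)

def poisson_cdf(r: int, lam: float) -> float:
    """P(r; lambda) = sum of p(x; lambda) for x = 0, 1, ..., r."""
    return sum(poisson_pmf(x, lam) for x in range(r + 1))

# Two of the tabulated cumulative sums for lambda = 4
print(poisson_cdf(6, 4))   # ~0.8893
print(poisson_cdf(5, 4))   # ~0.7851
```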
Example 3.2.1
The average number of taxi drivers wearing seatbelts at Winneba junction in one hour is four.
What is the probability that six drivers will be found to wear seatbelts in a given hour?

Solution
Using the Poisson distribution with x = 6 and \lambda = 4, we find from the tables that

p(6; 4) = \frac{e^{-4} 4^6}{6!} = \sum_{x=0}^{6} p(x; 4) - \sum_{x=0}^{5} p(x; 4) = 0.8893 - 0.7851 = 0.1042.

Example 3.2.2
Suppose the average number of oil tankers arriving each day at Tema harbour is 10 and the
facilities at the harbour can handle at most 15 tankers per day. What is the probability that on a
given day tankers will have to be sent away?

Solution
Let X be the number of tankers arriving in a day. If tankers have to be sent away, then X, the
number of tankers arriving that day, is more than 15. Then

P(X > 15) = 1 - P(X \le 15) = 1 - \sum_{x=0}^{15} p(x; 10) = 1 - 0.9513 = 0.0487.

The mean and variance of the Poisson distribution


As in the case of the binomial distribution, there are general formulae that can be used to find the
mean and variance of a Poisson random variable.

Theorem  The mean and the variance of the Poisson distribution both have the value \lambda.
Proof  To find the mean, we write down the general definition of the expectation of a discrete
random variable. That is,

\mu = E[X] = \sum_{\text{all } x} x f(x).

We now substitute the expression that defines p(x; \lambda). In our case

f(x) = p(x; \lambda) = \frac{\lambda^x e^{-\lambda}}{x!}.

Therefore

E[X] = \sum_{x=0}^{\infty} x \frac{\lambda^x e^{-\lambda}}{x!} = \sum_{x=1}^{\infty} x \frac{\lambda \cdot \lambda^{x-1} e^{-\lambda}}{x(x-1)!},

since x! = x(x-1)! and \lambda^x = \lambda \cdot \lambda^{x-1}. We obtain

E[X] = \lambda \sum_{x=1}^{\infty} \frac{\lambda^{x-1} e^{-\lambda}}{(x-1)!}.

In this expression, let y = x − 1; then

E[X] = \lambda \sum_{y=0}^{\infty} \frac{\lambda^{y} e^{-\lambda}}{y!} = \lambda,

since

\sum_{y=0}^{\infty} \frac{\lambda^{y} e^{-\lambda}}{y!} = \sum_{y=0}^{\infty} p(y; \lambda) = 1.

We then conclude that the mean, or mathematical expectation, of the Poisson distribution is \lambda.
Similarly, it can be shown that \sigma^2 = \lambda.
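As a brief numerical illustration of this theorem, the sketch below (Python, standard library only; the value \lambda = 3.5 and the truncation of the sums at x = 60 are our own illustrative choices) computes the mean and variance directly from the probability distribution.

```python
from math import exp, factorial

def poisson_pmf(x: int, lam: float) -> float:
    return lam**x * exp(-lam) / factorial(x)

lam = 3.5   # an arbitrary illustrative value of lambda

# The sums run to infinity in theory; terms beyond x = 60 are negligible for this lambda.
xs = range(60)
mean = sum(x * poisson_pmf(x, lam) for x in xs)
variance = sum((x - mean) ** 2 * poisson_pmf(x, lam) for x in xs)

print(mean, variance)   # both approximately 3.5, i.e. both equal lambda
```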

The Poisson approximation to the binomial distribution


It was pointed out in the introduction that when n is large, the calculation of binomial
probabilities becomes difficult. Let us now state and prove a theorem that will help us approximate
binomial probabilities using the Poisson distribution.

Theorem  Let X be a binomial random variable with probability distribution b(x; n, p).
If n \to \infty and p \to 0 in such a way that \lambda = np remains constant, then b(x; n, p) \to p(x; \lambda).

This theorem states that if n is large, p is small and \lambda = np remains constant, then the
binomial distribution can be approximated by the Poisson distribution.

Proof  We are given that \lambda = np, so p = \lambda/n. Making this substitution into the
formula for the binomial distribution and simplifying the resulting expression, we have

b(x; n, p) = \binom{n}{x} p^x (1-p)^{n-x}
           = \frac{n!}{x!(n-x)!} \left(\frac{\lambda}{n}\right)^x \left(1 - \frac{\lambda}{n}\right)^{n-x}
           = \frac{n(n-1)(n-2)\cdots(n-x+1)}{x!} \cdot \frac{\lambda^x}{n^x} \left(1 - \frac{\lambda}{n}\right)^{n-x}.

Dividing each of the x factors n, n-1, \ldots, n-x+1 by n, we obtain

b(x; n, p) = 1\left(1 - \tfrac{1}{n}\right)\left(1 - \tfrac{2}{n}\right)\cdots\left(1 - \tfrac{x-1}{n}\right) \frac{\lambda^x}{x!} \left(1 - \frac{\lambda}{n}\right)^{n-x}.

If we let n \to \infty, we find that

1\left(1 - \tfrac{1}{n}\right)\left(1 - \tfrac{2}{n}\right)\cdots\left(1 - \tfrac{x-1}{n}\right) \to 1.

Also

\left(1 - \frac{\lambda}{n}\right)^{n-x} = \left(1 - \frac{\lambda}{n}\right)^{n} \left(1 - \frac{\lambda}{n}\right)^{-x},

and as n \to \infty, \left(1 - \frac{\lambda}{n}\right)^{-x} \to 1. This implies that

\left(1 - \frac{\lambda}{n}\right)^{n-x} \to \left(1 - \frac{\lambda}{n}\right)^{n}, as n \to \infty.

We also notice that

\left(1 - \frac{\lambda}{n}\right)^{n} = \left[\left(1 - \frac{\lambda}{n}\right)^{-n/\lambda}\right]^{-\lambda} \to e^{-\lambda}, as n \to \infty.

We have shown therefore that \left(1 - \frac{\lambda}{n}\right)^{n-x} \to e^{-\lambda} and that
1\left(1 - \tfrac{1}{n}\right)\left(1 - \tfrac{2}{n}\right)\cdots\left(1 - \tfrac{x-1}{n}\right) \to 1 as n \to \infty. We obtain

b(x; n, p) \to \frac{\lambda^x e^{-\lambda}}{x!} = p(x; \lambda), for x = 0, 1, 2, \ldots

We have proved that if n is large, p is small and \lambda = np remains constant, then
b(x; n, p) \approx p(x; \lambda), for x = 0, 1, 2, \ldots

An acceptable rule of thumb is to use this approximation if n ≥ 20 and p ≤ 0.05. If n ≥ 100, the
approximation is generally excellent, so long as np ≤ 10.

Example 3.2.3
A publishing Company of mathematics books, Ghana Mathematics Group (GMG), takes pains to
ensure that its books are free of typographical errors. The probability of any given page
containing at least one such error is 0.005 and errors are independent from page to page. What is
the probability that one of its 400 page book will contain
1. exactly one page with errors?
2. at most three pages with errors?

Solution
We find from this problem that n = 400 and p = 0.005, so np = 2. These values satisfy
the conditions of the theorem. Although this is a typical binomial problem, it is convenient to use
the Poisson approximation. We have

b(1; 400, 0.005) \approx p(1; 2) = \frac{2^1 e^{-2}}{1!} = 0.271.

Similarly,

P(X \le 3) = \sum_{x=0}^{3} p(x; 2) = \sum_{x=0}^{3} \frac{2^x e^{-2}}{x!} = 0.135 + 0.271 + 0.271 + 0.180 = 0.857.

Example 3.2.4
It is known that 5% of the books bound at a certain bindery have defective bindings. Find the
probability that 2 of 100 books bound by this bindery will have defective bindings using
1. the formula for the binomial distribution
2. the Poisson approximation to the binomial distribution.

Solution
1. Substituting x = 2, n = 100 and p = 0.05 into the formula for the binomial distribution, we get

b(2; 100, 0.05) = \binom{100}{2} (0.05)^2 (0.95)^{98} = 4950 (0.05)^2 (0.95)^{98} \approx 0.081.

2. Substituting x = 2 and \lambda = np = (100)(0.05) = 5 into the formula for the Poisson distribution, we get

p(2; 5) = \frac{5^2 e^{-5}}{2!} \approx 0.084.

It is interesting to note that the difference between the two values we obtain (the error we would
make by using the Poisson approximation) is only 0.003.
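The comparison made in Examples 3.2.3 and 3.2.4 can be reproduced with the sketch below (Python, standard library only; the helper names are ours), which prints each exact binomial probability next to its Poisson approximation.

```python
from math import comb, exp, factorial

def binomial_pmf(x: int, n: int, p: float) -> float:
    return comb(n, x) * p**x * (1 - p)**(n - x)

def poisson_pmf(x: int, lam: float) -> float:
    return lam**x * exp(-lam) / factorial(x)

# Example 3.2.3: n = 400, p = 0.005, lambda = np = 2
print(binomial_pmf(1, 400, 0.005), poisson_pmf(1, 2))        # ~0.271 vs ~0.271
print(sum(binomial_pmf(x, 400, 0.005) for x in range(4)),
      sum(poisson_pmf(x, 2) for x in range(4)))              # ~0.858 vs ~0.857

# Example 3.2.4: n = 100, p = 0.05, lambda = np = 5
print(binomial_pmf(2, 100, 0.05), poisson_pmf(2, 5))         # ~0.081 vs ~0.084
```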
SECTION 3: Negative Binomial Distribution

Section Introduction
Dear learner, you are once again welcome. In this section, we consider the negative binomial
distribution. Do not be misled by the name: we are not talking about a negative version of the
binomial distribution. We will explain the concept of the negative binomial distribution, give its
properties and show how its probabilities are calculated. We wish you happy reading.

Section Objectives
By the end of this section, you should be able to:
 State the properties of the negative binomial distribution
 Identify the negative binomial distribution as a discrete probability distribution
 Use the negative binomial distribution to calculate probabilities
 Apply the negative binomial probability distribution in real life situations

Let us consider an experiment in which the properties are the same as those listed for a binomial
experiment, with the exception that the trials are repeated until a fixed number of successes
occur. Therefore, instead of finding the probability of x successes in n trials, where n is fixed,
we are now interested in the probability that the kth success occurs on the xth trial. Experiments of
this nature are called negative binomial experiments. For instance, we may be interested in the
probability that the tenth driver observed at a roadblock will be the third to wear a seatbelt, the
probability that the fifth person to hear a rumour will be the first one to believe it, or the
probability that a burglar will be caught for the second time on his or her ninth job.

Definition The number of trials X to produce k successes in a negative binomial experiment


is called a negative binomial variable.

The negative binomial distribution is based on an experiment satisfying the following
conditions:
i. The experiment consists of a sequence of independent trials.
ii. Each trial can result in either a success or a failure.
iii. The probability of success is constant from trial to trial.
iv. The experiment continues until a total of k successes have been observed, where k is a
specified positive integer.

The random variable of interest is X, the number of the trial on which the kth success occurs. X is
called a negative binomial random variable because, in contrast to the binomial case, the
number of successes is fixed and the number of trials is random. The probability distribution of the
negative binomial variable X is called the negative binomial distribution and will be denoted by b*(x; k, p), since
its values depend on the number of successes desired and the probability of a success on a given
trial. To obtain the general formula for b*(x; k, p), consider the probability of a success on the
xth trial preceded by k − 1 successes and x − k failures in some specified order. Since the trials
are independent, we multiply all the probabilities corresponding to each desired outcome. Each
success occurs with probability p and each failure with probability q = 1 − p. Therefore the
probability for the specified order, ending in a success, is

p^{k-1}(1-p)^{x-k} \cdot p = p^{k}(1-p)^{x-k}.

The number of ways in which the k − 1 successes and x − k failures can be arranged among the
first x − 1 trials is \binom{x-1}{k-1}. The required probability is therefore

\binom{x-1}{k-1} p^{k} (1-p)^{x-k}.

Definition  If repeated independent trials can result in a success with probability p and a
failure with probability 1 − p, then the probability distribution of the random variable X, the
number of the trial on which the kth success occurs, is given by

b^*(x; k, p) = \binom{x-1}{k-1} p^{k} (1-p)^{x-k}, for x = k, k+1, k+2, \ldots

In statistics, negative binomial distributions are also referred to as binomial waiting-time


distributions or Pascal distributions.

Example 3.3.1
If the probability is 0.40 that a driver is observed to wear a seatbelt, what is the probability that
the fifteenth driver observed will be the third to wear a seatbelt?

Solution
Substituting x = 15, k = 3 and p = 0.40 into the formula for the negative binomial distribution, we get

b^*(15; 3, 0.40) = \binom{14}{2} (0.40)^3 (0.60)^{12} = \frac{14!}{2!\,12!} (0.40)^3 (0.60)^{12} \approx 0.0127.

Example 3.3.2
Find the probability that a person tossing three coins will get either all heads or all tails for the
second time on the fifth toss.

Solution
Using the negative binomial distribution with x = 5, k = 2 and p = 0.25, we get

b^*(5; 2, 0.25) = \binom{4}{1} (0.25)^2 (0.75)^3 = \frac{4!}{1!\,3!} (0.25)^2 (0.75)^3 \approx 0.1055.

The negative binomial distribution derives its name from the fact that each term in the binomial
expansion of p^k (1-q)^{-k}, where q = 1 − p, corresponds to a value of b^*(x; k, p) for x = k, k+1, k+2, \ldots
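The two examples above can be checked directly from the formula. A minimal sketch follows (Python, standard library only; the helper name negative_binomial_pmf is ours).

```python
from math import comb

def negative_binomial_pmf(x: int, k: int, p: float) -> float:
    """b*(x; k, p) = C(x-1, k-1) * p^k * (1-p)^(x-k): the k-th success lands on trial x."""
    return comb(x - 1, k - 1) * p**k * (1 - p)**(x - k)

# Example 3.3.1: the fifteenth driver observed is the third to wear a seatbelt
print(negative_binomial_pmf(15, 3, 0.40))   # ~0.0127

# Example 3.3.2: all heads or all tails for the second time on the fifth toss
print(negative_binomial_pmf(5, 2, 0.25))    # ~0.1055
```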
Section 4: Geometric Distribution

Introduction
Dear learner, this section is going to be a short one because all the concepts that we need have
already been treated in the previous sections of this unit. The geometric distribution is another
discrete probability distribution. Again, the geometric distribution is widely used in the study of
business and industrial processes.

Objectives
By the end of this section, you should be able to:
 Identify the geometric distribution as a discrete probability distribution
 Use the geometric distribution to calculate probabilities
 Apply the geometric probability distribution in real life situations

Dear student, so far you have treated three discrete probability distributions, and you will have
realised that they are all linked in some way. If we consider the special case of the negative
binomial distribution where k = 1, we have a probability distribution for the number of trials
required for a single success. The negative binomial distribution reduces to the form

b^*(x; 1, p) = p(1-p)^{x-1}, for x = 1, 2, 3, \ldots

Since the successive terms constitute a geometric progression, it is customary to refer to this
special case as the geometric distribution, denoted by g(x; p). The geometric distribution has
many important applications. Suppose that in a sequence of trials we are interested in the number
of the trial on which the first success occurs. If the first success is to come on the xth trial, it has
to be preceded by x − 1 failures. If the probability of a success is p, then the probability of x − 1
failures in x − 1 trials is (1-p)^{x-1}, and the probability that the first success occurs on the xth
trial is therefore p(1-p)^{x-1}. This probability distribution is called the geometric distribution.

Definition If repeated independent trials can result in a success with probability p and a
failure with probability 1-p, then the probability distribution of the random variable X, the
number of the trial on which the first success occurs is given by
g(x; p) = p(1-p)^{x-1}, for x = 1, 2, 3, \ldots

Example 3.4.1
If the probability is 0.20 that a burglar will be caught on any given job, what is the probability of
being caught for the first time on the fourth job?

Solution

This is a geometric experiment (the special case of the negative binomial distribution with k = 1).
We substitute x = 4 and p = 0.20 in

g(x; p) = p(1-p)^{x-1}.

We get

g(4; 0.20) = (0.20)(0.80)^3 \approx 0.102.

Example 3.4.2
In a certain manufacturing process, it is known that on the average, 1 in every 100 items is
defective. What is the probability that 5 items are inspected before a defective item is found?

Solution
We find that g(5; 0.01) = (0.01)(0.99)^4 \approx 0.0096.

We see that in such geometric trials, items are drawn one at a time with replacement from a batch,
and X is the number of draws up to and including the first defective item discovered. In this case
the number of trials is often called the waiting time.
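Both examples can be reproduced with the sketch below (Python, standard library only; the helper name geometric_pmf is ours).

```python
def geometric_pmf(x: int, p: float) -> float:
    """g(x; p) = p * (1-p)^(x-1): first success on trial x."""
    return p * (1 - p) ** (x - 1)

# Example 3.4.1: first caught on the fourth job, p = 0.20
print(geometric_pmf(4, 0.20))   # ~0.102

# Example 3.4.2: first defective item found on the fifth inspection, p = 0.01
print(geometric_pmf(5, 0.01))   # ~0.0096
```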
SECTION 5: Hypergeometric Distribution

Introduction
The hypergeometric distribution is the last discrete probability we will study in this unit. This
distribution depends on 3 parameters and it is considered as the most complex among the
discrete distributions. Let us read through this work and find out how it is linked with the other
distributions already discussed above.

Objectives
By the end of this section, you should be able to:
 State the properties of the hypergeometric distribution
 Identify the hypergeometric distribution as a discrete probability distribution
 Use the hypergeometric distribution to calculate probabilities
 Apply the hypergeometric probability distribution in real life situations

The negative binomial and the hypergeometric distributions are both closely related to the
binomial distribution. Whereas the binomial distribution is the approximate probability model
for sampling without replacement from a finite dichotomous population, the hypergeometric
distribution is the exact probability model for the number of successes in the sample. The
hypergeometric distribution satisfies the following properties:

i. The population or set to be sampled consists of N individuals, objects or elements (a finite
population).
ii. Each individual can be characterised as a success or a failure, and there are k successes in
the population.
iii. A sample of n individuals is drawn in such a way that each subset of size n is equally
likely to be chosen.

The random variable of interest is X, the number of successes in the sample. The probability
distribution of X depends on the parameters n, k and N. We therefore wish to obtain
P(X = x) = h(x; n, k, N). Suppose that we are interested in the number of defectives in a sample
of n units drawn from a lot containing N units, of which k are defective. If the sample is drawn
in such a way that at each successive drawing whatever units are left in the lot have the same
chance of being selected, then we are solving a problem of sampling without replacement
(unlike the binomial distribution).

The x successes (defectives) can be chosen in \binom{k}{x} ways, the n − x failures (non-defectives) can
be chosen in \binom{N-k}{n-x} ways, and hence x successes and n − x failures can be chosen in
\binom{k}{x}\binom{N-k}{n-x} ways. Also, n objects can be chosen from a set of N objects in \binom{N}{n} ways. If we
consider all these possibilities as equally likely, it follows that for sampling without replacement
the probability of getting x successes in n trials is

h(x; n, k, N) = \frac{\binom{k}{x}\binom{N-k}{n-x}}{\binom{N}{n}}, for x = 0, 1, 2, \ldots, n, with x \le k and n - x \le N - k.

This equation defines the hypergeometric distribution whose parameters are the sample size n,
the lot size N and the number of successes k in the lot.

Example 3.5.1
A shipment of 20 tape recorders contains 5 that are defective. If 10 of them are randomly
chosen for inspection, what is the probability that 2 of the 10 will be defective?

Solution
We substitute x = 2, n = 10, k = 5 and N = 20 into the formula for the hypergeometric
distribution. We obtain

h(2; 10, 5, 20) = \frac{\binom{5}{2}\binom{15}{8}}{\binom{20}{10}} \approx 0.348.

When N is large and n is relatively small compared to N, there is not much difference between
sampling with replacement and sampling without replacement, and the formula for the binomial
distribution with parameters n and p = k/N may be used to approximate hypergeometric
probabilities. In general, it can be shown that

h(x; n, k, N) \to b(x; n, p), with p = k/N, as N \to \infty.

A good rule of thumb is to use the binomial distribution as an approximation to the
hypergeometric distribution if n \le N/10, that is, if the sample size is at most 10% of the
population size.

Example 3.5.2
Repeat the preceding example for a lot of 100 tape recorders of which 25 are defective by using;
1. the formula for the hypergeometric distribution
2. the formula for the binomial distribution as an approximation.

Solution
We find from the given problem that x = 2, n = 10, k = 25, N = 100 and k/N = 0.25.

1. h(2; 10, 25, 100) = \frac{\binom{25}{2}\binom{75}{8}}{\binom{100}{10}} \approx 0.292

2. b(2; 10, 0.25) = \binom{10}{2} (0.25)^2 (0.75)^8 \approx 0.282
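Examples 3.5.1 and 3.5.2 can be reproduced, and the quality of the binomial approximation inspected, with the sketch below (Python, standard library only; the helper names are ours).

```python
from math import comb

def hypergeometric_pmf(x: int, n: int, k: int, N: int) -> float:
    """h(x; n, k, N) = C(k, x) * C(N-k, n-x) / C(N, n)."""
    return comb(k, x) * comb(N - k, n - x) / comb(N, n)

def binomial_pmf(x: int, n: int, p: float) -> float:
    return comb(n, x) * p**x * (1 - p)**(n - x)

# Example 3.5.1: 2 defectives in a sample of 10 from a lot of 20 containing 5 defectives
print(hypergeometric_pmf(2, 10, 5, 20))     # ~0.348

# Example 3.5.2: exact hypergeometric value versus the binomial approximation (p = k/N = 0.25)
print(hypergeometric_pmf(2, 10, 25, 100))   # ~0.292
print(binomial_pmf(2, 10, 0.25))            # ~0.282
```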
Section 6: Sums of Binomial Random Variables

Introduction
In this section, we turn to the important question of determining the distribution of a sum of
independent random variables in terms of the distributions of the individual constituents. In
particular, we consider sums of discrete random variables, focusing on binomial random
variables. We hope you enjoy this section as well.

Objectives
By the end of this section you should be able to find the sum of binomial variables.

Consider the sum of two independent discrete random variables X and Y whose values are
restricted to the non-negative integers. Let f_X(\cdot) denote the probability distribution of X and
f_Y(\cdot) denote the probability distribution of Y. The distribution of their sum Z = X + Y is given
by the discrete convolution formula.

Theorem (Discrete Convolution Formula)  The random variable Z = X + Y has the probability
distribution f_Z(\cdot) given by

f_Z(z) = f_{X+Y}(z) = P(Z = z) = \sum_{x=0}^{z} f_X(x) f_Y(z - x), for z = 0, 1, 2, \ldots

Proof  For each z, the event [Z = z] is the union of the disjoint events [X = x and Y = z − x] for
x = 0, 1, 2, \ldots, z. Consequently,

P(Z = z) = f_Z(z) = \sum_{x=0}^{z} P(X = x \text{ and } Y = z - x) = \sum_{x=0}^{z} f_X(x) f_Y(z - x),

where the last step follows by independence.
Let X_1 and X_2 be independent binomial random variables having the same probability of
success. Their sum is again binomial.

Let X_1 and X_2 be independent binomial random variables where X_i has a binomial(n_i, p)
distribution for i = 1, 2. Then X_1 + X_2 has a binomial distribution with n_1 + n_2 trials and
probability of success p.

More generally, let X_1, X_2, \ldots, X_k be independent binomial random variables where X_i has a
binomial(n_i, p) distribution for i = 1, 2, \ldots, k. Then X_1 + X_2 + \cdots + X_k has a
binomial(n_1 + n_2 + \cdots + n_k, p) distribution.

Proof  By the discrete convolution formula, Z = X_1 + X_2 has probability distribution

P(X_1 + X_2 = z) = f_Z(z) = \sum_{x=0}^{z} f_{X_1}(x) f_{X_2}(z - x),

so

f_Z(z) = \sum_{x=0}^{z} \binom{n_1}{x} p^x (1-p)^{n_1 - x} \binom{n_2}{z-x} p^{z-x} (1-p)^{n_2 - (z-x)}
       = p^z (1-p)^{n_1 + n_2 - z} \sum_{x=0}^{z} \binom{n_1}{x} \binom{n_2}{z-x}.

Now, equating the coefficients of s^z in the binomial expansion of both sides of

(1+s)^{n_1} (1+s)^{n_2} = (1+s)^{n_1 + n_2},

we conclude that

\sum_{x=0}^{z} \binom{n_1}{x} \binom{n_2}{z-x} = \binom{n_1 + n_2}{z}.

Hence f_Z(z) = \binom{n_1 + n_2}{z} p^z (1-p)^{n_1 + n_2 - z}, which is the binomial(n_1 + n_2, p)
distribution. The case of several binomial random variables follows by induction.
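The result can be illustrated numerically. The sketch below (Python, standard library only; the parameter values n1 = 3, n2 = 4 and p = 0.3 are our own illustrative choices) computes the distribution of X1 + X2 by the discrete convolution formula and compares it with the binomial(n1 + n2, p) probabilities.

```python
from math import comb

def binomial_pmf(x: int, n: int, p: float) -> float:
    # comb(n, x) returns 0 when x > n, so out-of-range terms vanish automatically
    return comb(n, x) * p**x * (1 - p)**(n - x)

n1, n2, p = 3, 4, 0.3   # arbitrary illustrative parameters

for z in range(n1 + n2 + 1):
    # Discrete convolution: sum over the ways of splitting z successes between X1 and X2
    convolution = sum(binomial_pmf(x, n1, p) * binomial_pmf(z - x, n2, p)
                      for x in range(z + 1))
    direct = binomial_pmf(z, n1 + n2, p)   # binomial(n1 + n2, p) probability
    print(z, round(convolution, 6), round(direct, 6))   # the two columns agree
```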

Activity Set 3

1. The probability that a patient recovers from a rare blood disease is 0.4. If 15 people are
known to have contracted this disease, what is the probability that:
(a) At least 10 survive?
(b) From 3 to 8 survive?
(c) Exactly 5 survive?

2. One prominent physician claims that 70% of those with lung cancer are chain smokers. If
his assertion is correct:
(a) find the probability that of 10 such patients recently admitted to a hospital, fewer than
half are chain smokers;
(b) find the probability that of 20 such patients recently admitted to a hospital, fewer than
half are chain smokers.

3. Suppose that the number of bush fires observed in a particular region of Ghana during a one-
year period has a Poisson distribution with λ = 8. Find:
(a) P(X ≤ 5)
(b) P(6 ≤ X ≤ 9)
(c) P(X ≥ 10)
(d) How many bush fires would you expect to be observed during the one-year
period and what is the standard deviation of the number of the observed bush
fires?

4. The number of telephone calls that pass through a switchboard has a Poisson distribution
with mean equal to 2 per minute. Find the probability that no telephone calls pass through
the switchboard in two consecutive minutes.

5. If the probability is 0.75 that a person will believe a rumour about misappropriation of funds
by a certain politician, find the probability that:
(a) the eighth person to hear the rumour will be the fifth to believe it.
(b) The fifteenth person to hear the rumour will be the tenth to believe it.

6. Among the 16 applicants for a job, ten have University degrees. If three of the applicants
are randomly chosen for interviews, what are the probabilities that:
(a) none has a University degree;
(b) one has a University degree;
(c) two have University degrees;
(d) all three have University degrees.
7. The probability that a candidate passes the entrance examination to UEW is 0.7. Find
the probability that a candidate passes the test:
(a) on the third trial
(b) before the fourth trial.

8. The probability that a person, living in a certain city, owns a dog is estimated to be 0.3.
Find the probability that the tenth person randomly interviewed in that city is the fifth one
to own a dog.

9. A restaurant chef prepares a tossed salad containing, on average, 5 vegetables. Find the
probability that the salad contains more than 5 vegetables:
(a) On a given day;
(b) On 3 of the next 4 days;
(c) For the first time in April on April 5

10. Population studies of biology and the environment often tag and release subjects in order
to estimate size and degrees of certain features in the population. Ten animals of a
certain population thought to be extinct are caught, tagged and released in a certain
region. After a period of time, a random sample of 15 of this type of animal is selected in
the region. What is the probability that 5 of those selected are tagged animals if there are
25 animals of this type in the region?

Solution Set 3
1. (a) 0.034, (b) 0.878 (c) 0.185
2. (a) 0.0474 (b) 0.0171
3. (a) 0.191 (b) 0.526 (c) 0.283 (d) 8, 2.83
4. (a) 0.0183
5. (a) 0.130 (b) 0.110
6. (a) 0.036 (b) 0.268 (c) 0.482 (d) 0.214
7. (a) 0.063 (b) 0.973
8. (a) 0.0515
9. (a) 0.3840 (b) 0.1395 (c) 0.0553
10. (a) 0.2315
