Unit 11

Uploaded by

rithy khouy
Copyright
© © All Rights Reserved
UNIT 11 DISCRETE UNIFORM AND HYPERGEOMETRIC DISTRIBUTIONS
Structure
11.1 Introduction
Objectives

11.2 Discrete Uniform Distribution


11.3 Hypergeometric Distribution
11.4 Summary
11.5 Solutions/Answers

11.1 INTRODUCTION
In the previous two units, we discussed the binomial distribution and its limiting form, the Poisson distribution. Continuing the study of discrete distributions, the present unit discusses two more: the discrete uniform distribution and the hypergeometric distribution.
The discrete uniform distribution applies to experiments in which the different values of the random variable are equally likely. The hypergeometric distribution is used when the population is finite and sampling is done without replacement, i.e. when the trials are random but not independent.
The discrete uniform and hypergeometric distributions are discussed in Secs. 11.2 and 11.3, respectively, together with their properties and applications.
Objectives
After studying this unit, you should be able to:
• define the discrete uniform and hypergeometric distributions;
• compute their means and variances;
• compute probabilities of events associated with these distributions; and
• identify the situations where these distributions are applicable.

11.2 DISCRETE UNIFORM DISTRIBUTION


The discrete uniform distribution arises in practice when, under the given experimental conditions, the different values of the random variable are equally likely. For example, the number on an unbiased die when thrown may be 1, 2, 3, 4, 5 or 6. These values of the random variable, “the number on an unbiased die when thrown”, are equally likely, and for such an experiment the discrete uniform distribution is appropriate.

Definition: A random variable X is said to have a discrete uniform (rectangular) distribution if it takes any positive integer value from 1 to n, and its probability mass function is given by
\[
P[X = x] =
\begin{cases}
\dfrac{1}{n}, & \text{for } x = 1, 2, \ldots, n \\
0, & \text{otherwise,}
\end{cases}
\]
where n is called the parameter of the distribution.
For example, the random variable X, “the number on the unbiased die when thrown”, takes the positive integer values from 1 to 6 and follows the discrete uniform distribution with probability mass function
\[
P[X = x] =
\begin{cases}
\dfrac{1}{6}, & \text{for } x = 1, 2, 3, 4, 5, 6 \\
0, & \text{otherwise.}
\end{cases}
\]

Mean and Variance of the Distribution


\[
\text{Mean} = E(X) = \sum_{x=1}^{n} x\,p(x) = \sum_{x=1}^{n} x \cdot \frac{1}{n} = \frac{1}{n} \sum_{x=1}^{n} x
= \frac{1}{n}\,(1 + 2 + 3 + \cdots + n)
= \frac{1}{n} \cdot \frac{n(n+1)}{2}
= \frac{n+1}{2}.
\]
[The sum of the first n natural numbers is n(n + 1)/2; see Unit 3 of Course MST-001.]
\[
\text{Variance} = E(X^2) - [E(X)]^2 \qquad [\because\ \mu_2 = \mu_2' - (\mu_1')^2]
\]
where
\[
E(X) = \frac{n+1}{2} \qquad [\text{obtained above}]
\]
and
\[
E(X^2) = \sum_{x=1}^{n} x^2\,p(x) = \sum_{x=1}^{n} x^2 \cdot \frac{1}{n}
= \frac{1}{n}\,\left[1^2 + 2^2 + 3^2 + \cdots + n^2\right]
= \frac{1}{n} \cdot \frac{n(n+1)(2n+1)}{6}
= \frac{(n+1)(2n+1)}{6}.
\]
[The sum of squares of the first n natural numbers is n(n + 1)(2n + 1)/6; see Unit 3 of Course MST-001.]
\[
\therefore\ \text{Variance} = \frac{(n+1)(2n+1)}{6} - \left(\frac{n+1}{2}\right)^2
= \frac{n+1}{12}\,\left[2(2n+1) - 3(n+1)\right]
= \frac{n+1}{12}\,(4n + 2 - 3n - 3)
= \frac{(n+1)(n-1)}{12}
= \frac{n^2 - 1}{12}.
\]
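The two closed forms just derived can be checked numerically. The following is a minimal sketch, not part of the original unit: the function name `uniform_mean_variance` is chosen here for illustration, and exact fractions are used so that no rounding is involved.

```python
from fractions import Fraction


def uniform_mean_variance(n):
    """Return (mean, variance) of the discrete uniform distribution on 1..n,
    computed directly from the probability mass function p(x) = 1/n."""
    p = Fraction(1, n)
    mean = sum(x * p for x in range(1, n + 1))               # E(X)
    second_moment = sum(x * x * p for x in range(1, n + 1))  # E(X^2)
    return mean, second_moment - mean ** 2


# Compare the direct computation with the closed forms (n+1)/2 and (n^2-1)/12.
for n in range(1, 21):
    mean, variance = uniform_mean_variance(n)
    assert mean == Fraction(n + 1, 2)
    assert variance == Fraction(n * n - 1, 12)

print(uniform_mean_variance(6))  # (Fraction(7, 2), Fraction(35, 12))
```

For n = 6 this gives mean 7/2 and variance 35/12, the case of an unbiased die.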
Example 1: Find the mean and variance of a number on an unbiased die when
thrown.
Solution: Let X be the number on an unbiased die when thrown.
Therefore, X can take the values 1, 2, 3, 4, 5, 6 with
\[
P[X = x] = \frac{1}{6}; \quad x = 1, 2, 3, 4, 5, 6.
\]
Hence, by the uniform distribution with n = 6, we have
\[
\text{Mean} = \frac{n+1}{2} = \frac{6+1}{2} = \frac{7}{2}, \quad \text{and}
\]
\[
\text{Variance} = \frac{n^2 - 1}{12} = \frac{(6)^2 - 1}{12} = \frac{35}{12}.
\]
Uniform Frequency Distribution
If an experiment satisfying the requirements of the discrete uniform distribution is repeated N times, then the expected frequency of a value of the random variable is given by
\[
f(x) = N \cdot P[X = x] = N \cdot \frac{1}{n}; \quad x = 1, 2, 3, \ldots, n.
\]
Example 2: If an unbiased die is thrown 120 times, find the expected
frequency of appearing 1, 2, 3, 4, 5, 6 on the die.
Solution: Let X be the uniform discrete random variable, “the number on the unbiased die when thrown”.
\[
\therefore\ P[X = x] = \frac{1}{6}; \quad x = 1, 2, \ldots, 6.
\]
Hence, the expected frequencies of the values of the random variable are as computed in the following table:

X    P[X = x]    Expected/Theoretical frequency f(x) = N.P[X = x] = 120.P[X = x]
1    1/6         120 × 1/6 = 20
2    1/6         120 × 1/6 = 20
3    1/6         120 × 1/6 = 20
4    1/6         120 × 1/6 = 20
5    1/6         120 × 1/6 = 20
6    1/6         120 × 1/6 = 20
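To see how expected frequencies compare with what one run of an experiment might actually produce, the table can be reproduced in a few lines. This is only an illustrative sketch: the function names and the fixed seed are assumptions made here, and the simulated counts will differ from run to run if the seed is changed.

```python
import random
from collections import Counter
from fractions import Fraction


def expected_frequencies(trials, n):
    """Expected frequency N * (1/n) of each value 1..n in `trials` repetitions."""
    return {x: trials * Fraction(1, n) for x in range(1, n + 1)}


def simulated_frequencies(trials, n, seed=0):
    """Observed frequencies from one simulated run (seed fixed for repeatability)."""
    rng = random.Random(seed)
    counts = Counter(rng.randint(1, n) for _ in range(trials))
    return {x: counts[x] for x in range(1, n + 1)}


expected = expected_frequencies(120, 6)
observed = simulated_frequencies(120, 6)
for x in range(1, 7):
    print(x, expected[x], observed[x])
```

The expected frequency is 20 for every face, while the simulated frequencies only scatter around 20; the expected values describe a long-run average, not a guarantee for any single set of 120 throws.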

Now, you can try the following exercise:


E1) Obtain the mean, variance of the discrete uniform distribution for the
random variable, “the number on a ticket drawn randomly from an urn
containing 10 tickets numbered from 1 to 10”. Also obtain the expected
frequencies if the experiment is repeated 150 times.

11.3 HYPERGEOMETRIC DISTRIBUTION


In the previous section, we studied the discrete uniform probability distribution, in which the probability distribution is obtained for the possible outcomes of a single trial, like drawing a ticket from an urn containing 10 tickets as mentioned in exercise E1). But if there is more than one trial (a finite number of them), with only two possible outcomes in each trial, we apply some other distribution. One distribution applicable in such a situation is the binomial distribution, which you studied in Unit 9. The binomial distribution deals with a finite number of independent trials, each of which has exactly two possible outcomes (Success or Failure), with constant probability of success in each trial. For example, consider again drawing a ticket at random from an urn containing 10 tickets bearing numbers from 1 to 10. The probability that the drawn ticket bears an odd number is 5/10 = 1/2. If we replace the ticket, the probability of drawing a ticket bearing an odd number is again 5/10 = 1/2. So, if we draw tickets again and again with replacement, the trials are independent and the probability of getting an odd number is the same in each trial. Suppose we are asked for the probability of getting 2 tickets bearing odd numbers in 3 draws. Then we apply the binomial distribution as follows:
Let X be the number of times an odd number appears in 3 draws. Then, by the binomial distribution,
\[
P[X = 2] = {}^{3}C_{2} \left(\frac{1}{2}\right)^{2} \left(\frac{1}{2}\right)^{3-2}
= 3 \times \frac{1}{4} \times \frac{1}{2} = \frac{3}{8}.
\]
But if, in the example discussed above, we do not replace the ticket after each draw, the probability of getting an odd number changes from trial to trial, the trials are no longer independent, and hence the binomial distribution is not applicable. Suppose that in this case too we are interested in the probability of getting a ticket bearing an odd number twice in 3 draws. It is computed as follows:
Let \(A_i\) be the event that the i-th ticket drawn bears an odd number and \(\bar{A}_i\) be the event that the i-th ticket drawn does not bear an odd number.
Therefore, the probability of getting a ticket bearing an odd number twice in 3 draws
\[
= P(A_1 \cap A_2 \cap \bar{A}_3) + P(A_1 \cap \bar{A}_2 \cap A_3) + P(\bar{A}_1 \cap A_2 \cap A_3)
\]
[As done in Unit 3 of this Course]
\[
= P(A_1)\,P(A_2 \mid A_1)\,P(\bar{A}_3 \mid A_1 \cap A_2)
+ P(A_1)\,P(\bar{A}_2 \mid A_1)\,P(A_3 \mid A_1 \cap \bar{A}_2)
+ P(\bar{A}_1)\,P(A_2 \mid \bar{A}_1)\,P(A_3 \mid \bar{A}_1 \cap A_2)
\]
[Multiplication theorem for dependent events (see Unit 3 of this Course)]
\[
= \frac{5}{10} \cdot \frac{4}{9} \cdot \frac{5}{8}
+ \frac{5}{10} \cdot \frac{5}{9} \cdot \frac{4}{8}
+ \frac{5}{10} \cdot \frac{5}{9} \cdot \frac{4}{8}
= 3 \times \frac{5 \times 5 \times 4}{10 \times 9 \times 8}
\]
This result can also be written in the following form:
\[
= \frac{5 \times 4 \times 5 \times 3 \times 2}{2 \times 10 \times 9 \times 8} \qquad [\text{multiplying and dividing by 2}]
\]
\[
= \frac{5 \times 4}{2} \times 5 \times \frac{3 \times 2}{10 \times 9 \times 8}
= {}^{5}C_{2} \times {}^{5}C_{1} \times \frac{1}{{}^{10}C_{3}}
= \frac{{}^{5}C_{2} \cdot {}^{5}C_{1}}{{}^{10}C_{3}}
\]
In the above result, \({}^{5}C_{2}\) is the number of ways of selecting 2 out of the 5 tickets bearing an odd number, \({}^{5}C_{1}\) is the number of ways of selecting 1 out of the 5 tickets bearing an even number (i.e. not bearing an odd number), and \({}^{10}C_{3}\) is the number of ways of selecting 3 out of the total 10 tickets.
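As a check on the argument above, all ordered draws of 3 tickets can simply be listed and counted. The sketch below is an illustration added here, not the unit's own method; it compares the count with the combinatorial formula.

```python
from fractions import Fraction
from itertools import permutations
from math import comb

tickets = range(1, 11)

# Every ordered draw of 3 distinct tickets is equally likely.
draws = list(permutations(tickets, 3))
favourable = sum(1 for d in draws if sum(t % 2 for t in d) == 2)

by_enumeration = Fraction(favourable, len(draws))
by_formula = Fraction(comb(5, 2) * comb(5, 1), comb(10, 3))

print(by_enumeration, by_formula)  # 5/12 5/12
```

Both routes give 5/12, which is smaller than the 3/8 obtained with replacement: removing tickets changes the probabilities from draw to draw.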
Let us consider another similar example of a bag containing 20 balls, of which 5 are white and 15 are black. Suppose 10 balls are drawn at random one by one without replacement. Then, as discussed in the above example, the probability that in these 10 draws there are 2 white and 8 black balls is
\[
\frac{{}^{5}C_{2} \cdot {}^{15}C_{8}}{{}^{20}C_{10}}.
\]
Note: The result remains exactly the same whether the items are drawn one by one without replacement or drawn all at once.
Let us now generalize the above argument for N balls, of which M are white and N − M are black. Of these, n balls are chosen at random without replacement. Let X be a random variable that denotes the number of white balls drawn. Then, the probability of X = x white balls among the n balls drawn is given by
\[
P[X = x] = \frac{{}^{M}C_{x} \cdot {}^{N-M}C_{n-x}}{{}^{N}C_{n}}
\]
[for x = 0, 1, 2, ..., n (if n ≤ M) or x = 0, 1, 2, ..., M (if n > M)]
The above probability function of the discrete random variable X is called the hypergeometric distribution.
Remark 1: We have a hypergeometric distribution under the following
conditions:
i) There is a finite number of dependent trials.
ii) A single trial results in one of two possible outcomes: Success or Failure.
iii) The probability of success, and hence that of failure, is not the same in each trial, i.e. sampling is done without replacement.
Remark 2: If the number (n) of balls drawn does not exceed the number (M) of white balls in the bag, i.e. if n ≤ M, then the number (x) of white balls drawn cannot be greater than n; and if n > M, then the number of white balls drawn cannot be greater than M. So x can take values up to n (if n ≤ M) and up to M (if n > M), i.e. x can take values up to n or M, whichever is less: x ≤ min{n, M}.
This discussion leads to the following definition.
Definition: A random variable X is said to follow the hypergeometric distribution with parameters N, M and n if it assumes only non-negative integer values and its probability mass function is given by
\[
P[X = x] =
\begin{cases}
\dfrac{{}^{M}C_{x} \cdot {}^{N-M}C_{n-x}}{{}^{N}C_{n}}, & \text{for } x = 0, 1, 2, \ldots, \min\{n, M\} \\
0, & \text{otherwise,}
\end{cases}
\]
where n, M, N are positive integers such that n ≤ N, M ≤ N.
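The definition translates almost directly into code. The following is a minimal sketch: the function name `hypergeom_pmf` and its argument order are choices made here, and the extra guard `n - x > N - M` (more black balls requested than exist) is an added assumption that simply makes explicit the cases in which the binomial coefficient is zero.

```python
from fractions import Fraction
from math import comb


def hypergeom_pmf(x, N, M, n):
    """P(X = x): x white balls among n drawn without replacement
    from N balls of which M are white."""
    if x < 0 or x > min(n, M) or n - x > N - M:
        return Fraction(0)
    return Fraction(comb(M, x) * comb(N - M, n - x), comb(N, n))


# Bag of 20 balls (5 white, 15 black), 10 drawn: 2 white and 8 black.
p = hypergeom_pmf(2, N=20, M=5, n=10)
print(p, float(p))

# The probabilities over all possible x add up to 1.
assert sum(hypergeom_pmf(x, 20, 5, 10) for x in range(0, 6)) == 1
```

Using exact fractions keeps the check that the probabilities sum to 1 free of rounding error; for large N, converting to `float` at the end is usually enough.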
Mean and Variance
\[
\text{Mean} = E(X) = \sum_{x=0}^{n} x\,P[X = x]
= \sum_{x=1}^{n} x \cdot \frac{{}^{M}C_{x} \cdot {}^{N-M}C_{n-x}}{{}^{N}C_{n}}
\]
\[
= \sum_{x=1}^{n} x \cdot \frac{M}{x} \cdot \frac{{}^{M-1}C_{x-1} \cdot {}^{N-M}C_{n-x}}{{}^{N}C_{n}}
= \frac{M}{{}^{N}C_{n}} \sum_{x=1}^{n} {}^{M-1}C_{x-1} \cdot {}^{N-M}C_{n-x}
\]
\[
= \frac{M}{{}^{N}C_{n}} \left[ {}^{M-1}C_{0} \cdot {}^{N-M}C_{n-1} + {}^{M-1}C_{1} \cdot {}^{N-M}C_{n-2} + \cdots + {}^{M-1}C_{n-1} \cdot {}^{N-M}C_{0} \right]
= \frac{M}{{}^{N}C_{n}} \cdot {}^{N-1}C_{n-1}
\]
[This result is obtained using properties of binomial coefficients. It involves a lot of calculation, so its derivation may be skipped. Notice that in this result the left upper suffix and the right lower suffix are each the sum of the corresponding suffixes of the binomial coefficients in each product term. The result used in the above expression is derived below for interested learners.]
We know that
\[
(1 + x)^{m+n} = (1 + x)^{m} \cdot (1 + x)^{n} \qquad [\text{by the law of indices}]
\]
Expanding using the binomial theorem as explained in Unit 9 of this course, we have
\[
{}^{m+n}C_{0}\,x^{m+n} + {}^{m+n}C_{1}\,x^{m+n-1} + {}^{m+n}C_{2}\,x^{m+n-2} + \cdots + {}^{m+n}C_{m+n}
\]
\[
= \left( {}^{m}C_{0}\,x^{m} + {}^{m}C_{1}\,x^{m-1} + {}^{m}C_{2}\,x^{m-2} + \cdots + {}^{m}C_{m} \right)
\cdot \left( {}^{n}C_{0}\,x^{n} + {}^{n}C_{1}\,x^{n-1} + {}^{n}C_{2}\,x^{n-2} + \cdots + {}^{n}C_{n} \right)
\]
Comparing coefficients of \(x^{m+n-r}\), we have
\[
{}^{m+n}C_{r} = {}^{m}C_{0} \cdot {}^{n}C_{r} + {}^{m}C_{1} \cdot {}^{n}C_{r-1} + \cdots + {}^{m}C_{r} \cdot {}^{n}C_{0}
\]
Returning to the mean, we therefore have
\[
E(X) = \frac{M}{{}^{N}C_{n}} \cdot {}^{N-1}C_{n-1}
= M \cdot \frac{n!\,(N-n)!}{N!} \cdot \frac{(N-1)!}{(n-1)!\,(N-n)!}
= \frac{M \cdot n\,(n-1)!}{N\,(N-1)!} \cdot \frac{(N-1)!}{(n-1)!}
= \frac{nM}{N}.
\]
\[
E(X^2) = E[X(X-1) + X] = E[X(X-1)] + E(X)
\]
\[
= \sum_{x=0}^{n} x(x-1) \cdot \frac{{}^{M}C_{x} \cdot {}^{N-M}C_{n-x}}{{}^{N}C_{n}} + \frac{nM}{N}
\]
The terms with x = 0 and x = 1 vanish, so the sum effectively starts at x = 2:
\[
= \sum_{x=2}^{n} x(x-1) \cdot \frac{M}{x} \cdot \frac{M-1}{x-1} \cdot \frac{{}^{M-2}C_{x-2} \cdot {}^{N-M}C_{n-x}}{{}^{N}C_{n}} + \frac{nM}{N}
\]
\[
= \frac{M(M-1)}{{}^{N}C_{n}} \sum_{x=2}^{n} {}^{M-2}C_{x-2} \cdot {}^{N-M}C_{n-x} + \frac{nM}{N}
= \frac{M(M-1)}{{}^{N}C_{n}} \cdot {}^{N-2}C_{n-2} + \frac{nM}{N}
\]
[The result in the first term has been obtained using a property of binomial coefficients, as done above for finding E(X).]
\[
= M(M-1) \cdot \frac{n!\,(N-n)!}{N!} \cdot \frac{(N-2)!}{(n-2)!\,(N-n)!} + \frac{nM}{N}
= \frac{M(M-1)\,n(n-1)}{N(N-1)} + \frac{nM}{N}
\]
Thus,
\[
V(X) = E(X^2) - [E(X)]^2
= \frac{M(M-1)\,n(n-1)}{N(N-1)} + \frac{nM}{N} - \left(\frac{nM}{N}\right)^2
= \frac{nM\,(N-M)(N-n)}{N^2\,(N-1)} \qquad [\text{on simplification}]
\]

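Since the derivations of the mean and variance lean on a property of binomial coefficients and on a final simplification that is not written out, a numerical cross-check is reassuring. The sketch below uses helper names chosen here for illustration. It verifies the identity for sums of products of binomial coefficients on small cases, then compares moments computed directly from the probability mass function with the closed forms nM/N and nM(N − M)(N − n)/[N²(N − 1)].

```python
from fractions import Fraction
from math import comb


def coefficient_sum(m, n, r):
    """Sum of mCk * nC(r-k) for k = 0..r, which should equal (m+n)Cr."""
    return sum(comb(m, k) * comb(n, r - k) for k in range(r + 1))


def hypergeom_moments(N, M, n):
    """Return (mean, variance) computed term by term from the pmf."""
    denom = comb(N, n)
    probs = {x: Fraction(comb(M, x) * comb(N - M, n - x), denom)
             for x in range(n + 1)}
    mean = sum(x * p for x, p in probs.items())
    second = sum(x * x * p for x, p in probs.items())
    return mean, second - mean ** 2


def closed_form(N, M, n):
    """Mean nM/N and variance nM(N-M)(N-n) / (N^2 (N-1))."""
    return (Fraction(n * M, N),
            Fraction(n * M * (N - M) * (N - n), N * N * (N - 1)))


# Identity used in both derivations, checked on small cases.
assert all(
    comb(m + n, r) == coefficient_sum(m, n, r)
    for m in range(6) for n in range(6) for r in range(m + n + 1)
)

# Direct moments agree with the closed forms for several parameter sets.
for N, M, n in [(10, 5, 3), (20, 5, 10), (25, 10, 2), (100, 40, 5)]:
    assert hypergeom_moments(N, M, n) == closed_form(N, M, n)

print(closed_form(10, 5, 3))  # (Fraction(3, 2), Fraction(7, 12))
```

For the 10-ticket example (N = 10, M = 5, n = 3) the mean is 3/2 and the variance 7/12. Note that the variance carries the sample size n, not N, in its numerator.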

Example 2: A jury of 5 members is drawn at random from a voters’ list of 100 persons, out of which 60 are non-graduates and 40 are graduates. What is the probability that the jury will consist of 3 graduates?
Solution: The computation of the actual probability is hypergeometric, which is shown as follows:
\[
P[\text{2 non-graduates and 3 graduates}] = \frac{{}^{60}C_{2} \cdot {}^{40}C_{3}}{{}^{100}C_{5}}
= \frac{60 \times 59 \times 40 \times 39 \times 38 \times 5 \times 4 \times 3 \times 2}{2 \times 6 \times 100 \times 99 \times 98 \times 97 \times 96}
= 0.2323
\]
Example 3: Suppose that there are N fish in a lake. A catch of 500 fish (all at the same time) is made and these fish are returned alive into the lake after marking each with a red spot. After two days, assuming that during this time these ‘marked’ fish have distributed themselves ‘at random’ in the lake and there is no change in the total number of fish, a fresh catch of 400 fish (again, all at once) is made. What is the probability that, of these 400 fish, 100 will have red spots?
Solution: The computation of the probability is hypergeometric and is shown as follows. As there are 500 marked fish in the lake and N − 500 others,
\[
P[\text{100 marked fish and 300 others}] = \frac{{}^{500}C_{100} \cdot {}^{N-500}C_{300}}{{}^{N}C_{400}}.
\]
We cannot evaluate this numerically if N is not given. N can be estimated using the method of maximum likelihood estimation, which you will read about in Unit 2 of MST-004, but we are not going to estimate it here. You may try it as an exercise after reading Unit 2 of MST-004.
Here, let us take an assumed value of N, say 5000.
Then,
\[
P[X = 100] = \frac{{}^{500}C_{100} \cdot {}^{4500}C_{300}}{{}^{5000}C_{400}}
\]
You will agree that the exact computation of this probability is complicated. This difficulty is common with the hypergeometric distribution, especially if N and M are large. However, if n is small compared to N, say n/N ≤ 0.05, there is not much difference between sampling with and without replacement, and in such cases the probability obtained by the binomial distribution is approximately equal to that obtained using the hypergeometric distribution.
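With a computer, the exact value is no longer hard to obtain, and the quality of the binomial approximation can be inspected directly. The sketch below is an added illustration with helper names chosen here. It uses the jury example (n/N = 5/100 = 0.05, with p = M/N = 40/100 as the binomial success probability) for the comparison, and also evaluates the fish probability for the assumed N = 5000, where n/N = 0.08.

```python
from math import comb


def hypergeom_pmf(x, N, M, n):
    """Exact hypergeometric probability (integer arithmetic, then one division)."""
    return comb(M, x) * comb(N - M, n - x) / comb(N, n)


def binomial_pmf(x, n, p):
    """Binomial probability with success probability p."""
    return comb(n, x) * p ** x * (1 - p) ** (n - x)


# Jury example: 3 graduates in a jury of 5 from 100 voters (40 graduates).
exact = hypergeom_pmf(3, N=100, M=40, n=5)
approx = binomial_pmf(3, n=5, p=40 / 100)
print(round(exact, 4), round(approx, 4))  # 0.2323 0.2304

# Fish example with the assumed N = 5000: 100 marked fish in a catch of 400.
fish = hypergeom_pmf(100, N=5000, M=500, n=400)
print(fish)
```

In the jury example the two values differ only in the third decimal place. The fish probability turns out to be extremely small, which is plausible because the expected number of marked fish in the catch is only nM/N = 40.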
You may now try the following exercise.
E2) A lot of 25 units contains 10 defective units. An engineer inspects 2
randomly selected units from the lot. He/She accepts the lot if both the
units are found in good condition, otherwise all the remaining units are
inspected. Find the probability that the lot is accepted without further
inspection.

We now conclude this unit by giving a summary of what we have covered in it.

11.4 SUMMARY
The following main points have been covered in this unit:
1) A random variable X is said to have a discrete uniform (rectangular) distribution if it takes any positive integer value from 1 to n, and its probability mass function is given by
\[
P[X = x] =
\begin{cases}
\dfrac{1}{n}, & \text{for } x = 1, 2, \ldots, n \\
0, & \text{otherwise,}
\end{cases}
\]
where n is called the parameter of the distribution.
2) For the discrete uniform distribution, mean = \(\dfrac{n+1}{2}\) and variance = \(\dfrac{n^2 - 1}{12}\).
3) A random variable X is said to follow the hypergeometric distribution with parameters N, M and n if it assumes only non-negative integer values and its probability mass function is given by
\[
P[X = x] =
\begin{cases}
\dfrac{{}^{M}C_{x} \cdot {}^{N-M}C_{n-x}}{{}^{N}C_{n}}, & \text{for } x = 0, 1, 2, \ldots, \min\{n, M\} \\
0, & \text{otherwise,}
\end{cases}
\]
where n, M, N are positive integers such that n ≤ N, M ≤ N.
4) For the hypergeometric distribution, mean = \(\dfrac{nM}{N}\) and variance = \(\dfrac{nM\,(N-M)(N-n)}{N^2\,(N-1)}\).


11.5 SOLUTIONS/ANSWERS
E1) Let X be the number on the ticket drawn randomly from an urn containing tickets numbered from 1 to 10.
Therefore, X is a discrete uniform random variable taking the values 1, 2, 3, 4, …, 10, each with probability 1/10.
With n = 10, the results of Sec. 11.2 give
\[
\text{Mean} = \frac{n+1}{2} = \frac{11}{2} = 5.5, \qquad
\text{Variance} = \frac{n^2 - 1}{12} = \frac{99}{12} = 8.25.
\]
The expected frequencies for the values of X are obtained as in the following table:

X     P[X = x]    Expected/Theoretical frequency f(x) = N.P[X = x] = 150.P[X = x]
1     1/10        150 × 1/10 = 15
2     1/10        150 × 1/10 = 15
3     1/10        150 × 1/10 = 15
4     1/10        150 × 1/10 = 15
5     1/10        150 × 1/10 = 15
6     1/10        150 × 1/10 = 15
7     1/10        150 × 1/10 = 15
8     1/10        150 × 1/10 = 15
9     1/10        150 × 1/10 = 15
10    1/10        150 × 1/10 = 15

E2) Here N = 25, M = 10 and n = 2.
The desired probability = P[none of the 2 randomly selected units is found defective]
\[
= \frac{{}^{10}C_{0} \cdot {}^{25-10}C_{2}}{{}^{25}C_{2}}
= \frac{1 \cdot {}^{15}C_{2}}{{}^{25}C_{2}}
= \frac{15 \times 14}{25 \times 24} = 0.35.
\]
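The same answer follows from a direct evaluation of the binomial coefficients. This is a short sketch added for checking; the variable name `p_accept` is chosen here.

```python
from fractions import Fraction
from math import comb

# N = 25 units, M = 10 defective, n = 2 inspected, x = 0 defective found.
p_accept = Fraction(comb(10, 0) * comb(15, 2), comb(25, 2))
print(p_accept, float(p_accept))  # 7/20 0.35
```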
