Student Notes 2.2

This document covers important discrete probability distributions, specifically the discrete uniform, binomial, and hypergeometric distributions. It explains the definitions, properties, and formulas associated with these distributions, along with examples to illustrate their applications. Additionally, it discusses the conditions under which the hypergeometric distribution can be approximated by the binomial distribution.

Uploaded by

202218et518

Module 2.

Discrete Distributions
Student Notes 2.2: Binomial and Hypergeometric Distribution

Important Discrete Random Variables


1. Discrete Uniform Probability Distribution

You are about to launch a new product. The product was test marketed, but preference for
body colour was not.
There are six colours: Violet (1), Blue (2), Green (3), Yellow (4), Orange (5) and Red (6).
Without any historical data, we must initially assume that each colour is equally likely to be preferred.
The probability function is P(x) = 1/6 for x = 1, 2, …, 6.

P(1) + P(2) + P(3) + P(4) + P(5) + P(6) = 6(1/6) = 1,

and P(x) ≥ 0 for every x, hence P(x) is a probability mass function.

If X takes values 1, 2, …, n, all equally likely, then X has the (discrete) uniform distribution with pmf
f(x) = 1/n, x = 1, 2, …, n.
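
The colour example can be checked with a few lines of Python. This is only a sketch: the function name `uniform_pmf` is made up for illustration, and exact fractions are used so that the sum comes out as exactly 1.

```python
from fractions import Fraction

def uniform_pmf(x, n):
    """P(X = x) for X uniform on 1, 2, ..., n; zero outside that range."""
    return Fraction(1, n) if 1 <= x <= n else Fraction(0)

n_colours = 6  # Violet (1) to Red (6)
probabilities = [uniform_pmf(x, n_colours) for x in range(1, n_colours + 1)]
total = sum(probabilities)

print(probabilities[0])  # probability of any single colour
print(total)             # should equal 1 for a valid pmf
```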

2. Binomial Probability Distribution b(x; n, p)

Let E be an experiment with only two outcomes: an event A either occurs or does not occur. Call these outcomes success and failure.
Such an experiment with only two outcomes is called a Bernoulli trial.

Suppose this random experiment is repeated n times independently (i.e. the outcome of any
repetition does not depend on any other repetition).
P(success) = P(A) = p and P(failure) = P(A^c) = 1 - p, and these remain the same for all n repetitions.
Suppose success occurs x times among the n repetitions of the experiment E, and let X denote the number of successes.
Then the random variable X is called a binomial variate with parameters n and p.
Its distribution is called the binomial distribution, which is given by the following formula.

$$f(x) = P(X = x) = \binom{n}{x} p^{x} q^{n-x} = \frac{n!}{x!\,(n-x)!}\, p^{x} q^{n-x}, \qquad x = 0, 1, \ldots, n,$$
where q = 1 - p.
The binomial distribution is a discrete probability distribution, denoted by
b(x; n, p), where n is the number of trials and p is the probability of success in each trial.
The random variable X counts the number of successes in n trials.
The cumulative distribution function F(x) = P(X ≤ x) is denoted by B(x; n, p).
$$P(X \le x) = B(x; n, p) = \sum_{k=0}^{x} \binom{n}{k} p^{k} q^{n-k}.$$
B(x; n,p) is tabulated for different values of n and p.
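
When a table is not at hand, b(x; n, p) and B(x; n, p) can be computed directly from the formulas. The sketch below uses only the Python standard library; the names `binom_pmf` and `binom_cdf` are chosen here for illustration and are not from any particular package.

```python
from math import comb

def binom_pmf(x, n, p):
    """b(x; n, p) = C(n, x) p^x q^(n-x), with q = 1 - p."""
    return comb(n, x) * p**x * (1 - p)**(n - x)

def binom_cdf(x, n, p):
    """B(x; n, p) = sum of b(k; n, p) for k = 0, 1, ..., x."""
    return sum(binom_pmf(k, n, p) for k in range(x + 1))

# The pmf values over x = 0..n should add up to 1.
n, p = 10, 0.3
pmf_total = sum(binom_pmf(x, n, p) for x in range(n + 1))
print(pmf_total)
print(binom_cdf(3, n, p))
```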
Four properties of a Binomial distribution
1. The experiment consists of a sequence of n identical trials.
2. Two outcomes, success and failure, are possible on each trial.
3. The probability of a success, denoted by p, does not change
from trial to trial.
4. The trials are independent.

Example: A coin is tossed 10 times. Suppose heads occurs x times among these 10 tosses. We
are repeating the random experiment 10 times independently. Call it a success when a head
occurs.
P(success) = P(head in a single toss) = 1/2 remains the same for all 10 repetitions of the experiment.
X has a binomial distribution with parameters n = 10 and p = 1/2.

$$P(X = x) = \binom{10}{x} p^{x} q^{10-x}, \qquad x = 0, 1, 2, \ldots, 10.$$

$$P(\text{5 heads}) = P(X = 5) = \binom{10}{5} \left(\frac{1}{2}\right)^{10}$$

Example 2: Suppose an unbiased die is tossed 8 times. Suppose the number 1 occurs x times
among these 8 tosses of the die.
Call it a success when the number 1 occurs on the top of the die.
P(success) = 1/6 for a single toss and remains the same for all 8 tosses.
X is the number of successes in 8 repetitions of the experiment.
All 8 repetitions are independent. So X has a binomial distribution with n = 8 and p = 1/6.
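$$P(X = x) = \binom{8}{x} \left(\frac{1}{6}\right)^{x} \left(\frac{5}{6}\right)^{8-x}, \qquad x = 0, 1, 2, \ldots, 8.$$

Probability that 1 occurs 2 times among 8 tosses:

$$P(X = 2) = \binom{8}{2} \left(\frac{1}{6}\right)^{2} \left(\frac{5}{6}\right)^{6}$$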
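
The coin and die examples stop at the formula. As a sketch, the two probabilities can be evaluated exactly with Python's `fractions` module; the helper name `binom_pmf_exact` is an assumption made for this illustration.

```python
from fractions import Fraction
from math import comb

def binom_pmf_exact(x, n, p):
    """Exact binomial probability C(n, x) p^x (1 - p)^(n - x) as a fraction."""
    p = Fraction(p)
    return comb(n, x) * p**x * (1 - p)**(n - x)

# Coin example: 5 heads in 10 tosses, p = 1/2.
coin = binom_pmf_exact(5, 10, Fraction(1, 2))
# Die example: the number 1 appears twice in 8 tosses, p = 1/6.
die = binom_pmf_exact(2, 8, Fraction(1, 6))

print(coin, float(coin))
print(die, float(die))
```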

Example 3: A factory is producing 5% defective items. 10 items are chosen from a large lot of
items produced by this factory. What is the probability that no defective item is found among these
10 items? Here we are repeating the experiment 10 times. Call it a success when a defective item is
chosen. P(success) = p = 0.05 and remains the same for all 10 repetitions. What is X? Its
distribution? Parameters? Why?
Solution: Here X = number of defective items found among the 10 items, p = probability of
choosing a defective item = 0.05 and n = 10. All 10 trials are independent. So X has a binomial
distribution with n = 10 and p = 0.05. The probability distribution is
$$P(X = x) = \binom{10}{x} (0.05)^{x} (0.95)^{10-x}, \qquad x = 0, 1, 2, \ldots, 10.$$

$$\text{Required probability} = P(X = 0) = \binom{10}{0} (0.05)^{0} (0.95)^{10} = (0.95)^{10} = 0.5987.$$

Example 4: A card is drawn 5 times with replacement. Find the probability that at most 2 spades
are found in 5 draws.
Solution: Let X be the number of times a spade is drawn. Call it a success when a spade is drawn.
p = P(spade) = 13/52 = 1/4 remains the same for all 5 draws. All 5 trials are independent.
So X has a binomial distribution with parameters n = 5, p = 1/4.
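$$P(X = x) = \binom{5}{x} \left(\frac{1}{4}\right)^{x} \left(\frac{3}{4}\right)^{5-x}, \qquad x = 0, 1, 2, 3, 4, 5.$$

$$\text{Required probability} = P(X = 0) + P(X = 1) + P(X = 2) = \binom{5}{0} \left(\frac{1}{4}\right)^{0} \left(\frac{3}{4}\right)^{5} + \binom{5}{1} \left(\frac{1}{4}\right) \left(\frac{3}{4}\right)^{4} + \binom{5}{2} \left(\frac{1}{4}\right)^{2} \left(\frac{3}{4}\right)^{3} = \frac{918}{1024}$$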

Computations using tables: A table of the binomial cdf is given at the back of the book.
The same problem can be solved using the table. We need B(2; 5, 0.25): look in the block for n = 5, the row
for x = 2 and the column for p = 0.25. The entry there is B(2; 5, 0.25) = 0.8965, which agrees with 918/1024.
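
The table entries used in Examples 3 and 4 can be reproduced by summing the binomial pmf. The sketch below assumes nothing beyond the standard library; `binom_cdf` is an illustrative name, not a library function.

```python
from math import comb

def binom_cdf(x, n, p):
    """B(x; n, p): probability of at most x successes in n trials."""
    return sum(comb(n, k) * p**k * (1 - p)**(n - k) for k in range(x + 1))

# Example 3: no defective among 10 items, p = 0.05.
no_defective = binom_cdf(0, 10, 0.05)
# Example 4: at most 2 spades in 5 draws with replacement, p = 0.25.
at_most_two_spades = binom_cdf(2, 5, 0.25)

print(round(no_defective, 4))        # 0.5987
print(round(at_most_two_spades, 4))  # 0.8965
```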

3. Hypergeometric Distribution

There are N items; a of them are of one type and the remaining N - a are of some
other type. n items are chosen without replacement. Let X be the number of items of type one found among
these n items. Then the probability mass function of X is

$$P(X = x) = \frac{\binom{a}{x}\binom{N-a}{n-x}}{\binom{N}{n}}, \qquad x = 0, 1, 2, \ldots, n \quad (\text{if } n \le a)$$

X is called a hypergeometric variate and its distribution is called the hypergeometric distribution.


Properties of hypergeometric distribution:
i) The experiment consists of drawing a random sample of size n without replacement and
without regard to order from a collection of N objects.
ii) Out of the N objects, a have a trait of interest to us; the other (N-a) do not have the trait.
iii) The random variable X is the number of objects in the sample with the trait.

$\binom{a}{x}\binom{N-a}{n-x}$ = number of ways x favourable items (together with n - x items without the trait) can be chosen

$\binom{N}{n}$ = total number of ways the sample can be chosen

Definition: A random variable X with integer values has a hypergeometric distribution with
parameters N, n, a if its probability mass function is

$$f(x) = P(X = x) = \frac{\binom{a}{x}\binom{N-a}{n-x}}{\binom{N}{n}}, \qquad \max[0,\, n-(N-a)] \le x \le \min(n, a)$$

Example: Suppose a box contains 6 white and 4 black balls, and 3 balls are drawn without
replacement. Find the probability that 1 white and 2 black balls are found among these 3 balls.
Solution: Here N = 10 = total number of balls in the box = size of the population,
n = 3 = size of the sample, a = 6 = number of white balls, N - a = 4 = number of black balls.

Let x = number of white balls chosen. Then the required probability is P(X = 1) =


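$$\frac{\binom{6}{1}\binom{4}{2}}{\binom{10}{3}} = \frac{6 \cdot \dfrac{4 \cdot 3}{2 \cdot 1}}{\dfrac{10 \cdot 9 \cdot 8}{3 \cdot 2 \cdot 1}} = \frac{6 \times 6}{120} = \frac{3}{10}.$$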
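
The hypergeometric formula, including the range max[0, n-(N-a)] ≤ x ≤ min(n, a) from the definition, can be sketched in Python as follows. The function name `hypergeom_pmf` and its argument order are choices made for this illustration; exact fractions are used so the box example comes out as 3/10.

```python
from fractions import Fraction
from math import comb

def hypergeom_pmf(x, N, n, a):
    """P(X = x) when n items are drawn without replacement from N items,
    a of which have the trait of interest."""
    lowest, highest = max(0, n - (N - a)), min(n, a)
    if not lowest <= x <= highest:
        return Fraction(0)
    return Fraction(comb(a, x) * comb(N - a, n - x), comb(N, n))

# Box example: 6 white and 4 black balls, 3 drawn, exactly 1 white.
one_white = hypergeom_pmf(1, N=10, n=3, a=6)
total = sum(hypergeom_pmf(x, 10, 3, 6) for x in range(0, 4))

print(one_white)  # 3/10
print(total)      # 1
```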
Sampling with replacement vs without replacement:
- Sampling without replacement → hypergeometric distribution
- Sampling with replacement → binomial distribution
Now suppose we are sampling with replacement. Then the situation is identical each time an item
is picked. Also, if each item is picked at random, the trials are independent. So X = number of
items chosen of a particular type has a binomial distribution with n = number of trials = sample size and
p = probability that a randomly chosen item at any pick is of the desired type = a/N, where N =
population size and a = number of items of the desired type in the population.
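
A small simulation can make the difference between the two sampling schemes visible. The sketch below reuses the box of 6 white and 4 black balls; the function name `simulate`, the number of trials and the seed are arbitrary choices for illustration. It estimates P(exactly 1 white in 3 draws), which should come out near 3/10 without replacement (hypergeometric) and near 3(0.6)(0.4)^2 = 0.288 with replacement (binomial).

```python
import random

def simulate(trials=50_000, seed=0):
    """Estimate P(exactly 1 white in 3 draws) under both sampling schemes."""
    rng = random.Random(seed)
    box = ["W"] * 6 + ["B"] * 4
    hits_without = hits_with = 0
    for _ in range(trials):
        # Without replacement: rng.sample never picks the same ball twice.
        if rng.sample(box, 3).count("W") == 1:
            hits_without += 1
        # With replacement: rng.choices may pick the same ball again.
        if rng.choices(box, k=3).count("W") == 1:
            hits_with += 1
    return hits_without / trials, hits_with / trials

freq_without, freq_with = simulate()
print(freq_without, freq_with)
```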

Binomial approximation to hypergeometric:


When the population size is large compared to the sample size, even if sampling is without replacement,
the non-availability of already picked items for the next choice hardly changes the situation from
trial to trial. We can treat the trials as identical and independent.
Thus if n ≤ N/10, the hypergeometric distribution can be approximated by the binomial
distribution with n = sample size and p = a/N, where a = number of favourable items in the population of size N.
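
To see how the approximation improves as N grows relative to n, one can compare the two probability functions directly. This is a sketch under assumed values: the sample size n = 10 and the proportion a/N = 0.2 are picked only for illustration, and `max_gap` is a made-up helper that returns the largest difference between the two pmfs.

```python
from math import comb

def hypergeom_pmf(x, N, n, a):
    """Hypergeometric probability; comb returns 0 when x exceeds a."""
    return comb(a, x) * comb(N - a, n - x) / comb(N, n)

def binom_pmf(x, n, p):
    return comb(n, x) * p**x * (1 - p)**(n - x)

def max_gap(N, n, a):
    """Largest |hypergeometric - binomial| over x = 0..n, with p = a/N."""
    p = a / N
    return max(abs(hypergeom_pmf(x, N, n, a) - binom_pmf(x, n, p))
               for x in range(n + 1))

# Fixed sample size n = 10, 20% favourable items, growing population.
for N in (20, 100, 1000, 10000):
    print(N, max_gap(N, n=10, a=N // 5))
```

The printed gaps should shrink as N increases, which is the behaviour the rule n ≤ N/10 relies on.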

Example: During the course of an hour, 1000 bottles of beer are filled by a particular machine.
Each hour a sample of 20 bottles is randomly selected and the number of ounces of beer per bottle
is checked. Let X denote the number of bottles selected that are underfilled. Suppose during a
particular hour, 100 underfilled bottles are produced. Find the probability that at least 3
underfilled bottles will be among those sampled.
Solution: (using hypergeometric) Here N = 1000, n = 20, a = 100 = number of underfilled bottles, and x =
number of underfilled bottles among the selected 20. The probability distribution is

$$f(x) = P(X = x) = \frac{\binom{100}{x}\binom{900}{20-x}}{\binom{1000}{20}}$$
Required probability = P[X ≥ 3] = 1 - P[X = 0] - P[X = 1] - P[X = 2]

$$= 1 - \frac{\binom{100}{0}\binom{900}{20}}{\binom{1000}{20}} - \frac{\binom{100}{1}\binom{900}{19}}{\binom{1000}{20}} - \frac{\binom{100}{2}\binom{900}{18}}{\binom{1000}{20}} = 0.3224.$$
Solution using binomial: Population size N = 1000, a = 100, sample size n = 20. Since n = 20 ≤ N/10 = 100, we can use the binomial distribution with p = a/N = 0.1.
Required probability = 1 - F(2) = 1 - B(2; 20, 0.1) = 1 - 0.6769 = 0.3231.
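
Both solutions of the bottle example can be evaluated side by side. The sketch below is one way to do it with the standard library only; the helper names are illustrative. The two results should land close to each other and close to the figures quoted in the two solutions.

```python
from math import comb

N, a, n = 1000, 100, 20  # population, underfilled bottles, sample size

def hypergeom_pmf(x, N, n, a):
    return comb(a, x) * comb(N - a, n - x) / comb(N, n)

def binom_pmf(x, n, p):
    return comb(n, x) * p**x * (1 - p)**(n - x)

# P(X >= 3) = 1 - P(X = 0) - P(X = 1) - P(X = 2) under each model.
hyper_tail = 1 - sum(hypergeom_pmf(x, N, n, a) for x in range(3))
binom_tail = 1 - sum(binom_pmf(x, n, a / N) for x in range(3))

print("hypergeometric:", hyper_tail)
print("binomial      :", binom_tail)  # about 0.3231
```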

Example 2
A 1000-strong IT firm is concerned about a low retention rate for its employees. In recent
years, management has seen a turnover of 10% of the employees annually. HR takes a random
sample of 5 employees and meets each one separately to understand their concerns and also
whether they are planning to leave.
1. The number of trials, n, is 5; the trials are identical in the sense that each trial is an interview with an employee.
2. Two outcomes: resign (success) or no plans of resigning (failure).
3. Trials are independent, since the employees were randomly selected. We need to interview them
separately so that one person's decision does not influence the others; in that sense the trials are
independent.
4. The probability will change: if we interview one person and he says he wants to resign,
then p = 100/1000 = 0.1 for that trial. For the second person, the probability that he wants to leave is
99/999, so the probability of success changes. But we still want to model this as a binomial
distribution. Since 99/999 is almost equal to 0.1, we can take the probability to be 0.1 on every trial.
This works because our sample size n = 5 is very small compared to the population size
N = 1000. So if the sample size is much smaller than the population size, the probability changes very little,
even though this process is sampling without replacement.
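
How little p moves can be seen by listing the success probability at each interview. The sketch below looks at the most extreme case, where every employee interviewed so far turned out to be a resigner; the function name `conditional_success_probs` is made up for this illustration.

```python
from fractions import Fraction

def conditional_success_probs(N, a, n):
    """Probability of success on draw k + 1, given that all k earlier draws
    (made without replacement) were successes: (a - k) / (N - k)."""
    return [Fraction(a - k, N - k) for k in range(n)]

# 1000 employees, 100 expected to leave, 5 interviewed.
probs = conditional_success_probs(N=1000, a=100, n=5)
for draw, p in enumerate(probs, start=1):
    print(draw, p, round(float(p), 4))

sampling_fraction = 5 / 1000  # n / N
print(sampling_fraction)
```

Even in this extreme case the probability stays close to 0.1 across all five interviews, which is why the binomial model with p = 0.1 is reasonable here.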
Rule of thumb for using b(x; n, p): n/N < 5%.
The supplier claims that the defective rate is 1%. We test the consignment of 1000 items by
sampling 10 items and classifying each as Defective or Not Defective.
Note: Since this is sampling without replacement, p changes from one draw to the next.
However, N >> n (here n/N = 10/1000 = 1%), so p does not change by much and the binomial model can be used.

Learning Outcome:
Discrete uniform probability distribution, Bernoulli trial, binomial distribution and
hypergeometric distribution. Binomial approximation to the hypergeometric distribution.

Questions to consider:
Q1. What is a binomial experiment?
Q2. What are the four properties of a binomial distribution?
Q3. What is the formula for binomial distribution?
Q4. What is a hypergeometric experiment?
Q5. What are the properties of a hypergeometric distribution?
Q6. What is the formula for hypergeometric distribution?
Q7. When and how can the hypergeometric distribution be approximated by the binomial distribution?
Q8. What is discrete uniform probability distribution?
