
Lec 01

The document discusses probability distributions for discrete and continuous random variables. It defines key terms like random variable, probability mass function, probability density function, binomial, Poisson and normal distributions. Examples are provided to illustrate concepts like finding probabilities and determining what type of distribution applies in a given scenario.
Copyright
© © All Rights Reserved

Probability distribution

Prepared by-
Lakshmi Rani Kundu
Assistant Professor
Department of Public Health and Informatics
Jahangirnagar University
Learning Outcomes:
At the end of the session, students will be able to:
 Use the probability distribution for a discrete random
variable to find the probability of events of interest.
 Explain how a density function is used to find probabilities
involving continuous random variables.
 Explain the similarities and differences between distributions
of the discrete type and the continuous type and when the
use of each is appropriate.
 Random variable: A random variable assigns a unique
numerical value to the outcome of a random experiment.
Example: Consider the random experiment of flipping a coin
twice.
The sample space of possible outcomes is S = { HH, HT, TH, TT }.
Now, let’s define the variable X to be the number of
tails that the random experiment will produce.
If the outcome is HH, we have no tails, so the value for X is 0.
If the outcome is HT, we got one tail, so the value for X is 1.
If the outcome is TH, we again got one tail, so the value for X is 1.
Lastly, if the outcome is TT, we got two tails, so the value for X is 2.
X is a quantitative variable that takes the possible values of
0, 1, or 2.
It is random because we do not know which of the three values
the variable will eventually take.
 Types of random variable:
➢ Discrete random variable: A random variable whose
possible values form a list of distinct values is called
a discrete random variable. For example, number of
heads, number of accidents in a day, number of patients
admitted to a hospital, etc.
➢ Continuous random variable: A random variable
that can take any value in an interval is called
a continuous random variable. For example, weight,
blood pressure, time to recovery, cholesterol, etc.
 Probability distribution: The list of all possible values that the
random variable can assume and their corresponding probabilities is
called the probability distribution.
Probability distributions are of two types:

Discrete distributions:
* Bernoulli distribution
* Binomial distribution
* Poisson distribution
* Geometric distribution
* Hypergeometric distribution
* Negative binomial distribution
* Multinomial distribution

Continuous distributions:
* Normal distribution
* Exponential distribution
* Gamma distribution
* Beta distribution
* Cauchy distribution
* Lognormal and Pareto distributions
 Probability mass function : A probability mass
function (pmf) assigns probabilities to all possible
outcomes for a discrete random variable.
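 Probability density function: A probability density
function (pdf) describes the distribution of a continuous
random variable. The probability that the variable falls in an
interval is the area under the pdf over that interval; the
probability of any single exact value is 0.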
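 Example: What is the probability distribution of X, where the
random variable X is the number of tails appearing
in two tosses of a fair coin?
Answer: Each of the four outcomes HH, HT, TH, TT has probability 1/4, so
P(X = 0) = 1/4, P(X = 1) = 1/2, P(X = 2) = 1/4.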
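The distribution of the number of tails can be checked by listing the sample space and counting. The following is a minimal Python sketch, assuming a fair coin so that all four outcomes are equally likely; the variable names are illustrative only.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# All equally likely outcomes of two tosses of a fair coin: HH, HT, TH, TT.
outcomes = ["".join(toss) for toss in product("HT", repeat=2)]

# X = number of tails in each outcome; count how often each value occurs.
counts = Counter(outcome.count("T") for outcome in outcomes)

# P(X = x) = (number of outcomes with x tails) / (total number of outcomes).
pmf = {x: Fraction(k, len(outcomes)) for x, k in sorted(counts.items())}

for x, prob in pmf.items():
    print(f"P(X = {x}) = {prob}")
```

The probabilities are kept as exact fractions so that it is easy to see that they add up to 1.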
Binomial distribution
 When a random process or experiment, called a trial, can
result in only one of two mutually exclusive outcomes, such
as dead or alive, sick or well, full-term or premature, the
trial is called a Bernoulli trial. The binomial distribution is
derived from a process known as a Bernoulli trial.
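 Binomial distribution: A discrete random variable X is said to
have a binomial distribution if its probability function is defined by
$$P(X = x) = f(x; n, p) = \binom{n}{x} p^{x} q^{n-x}, \quad x = 0, 1, 2, \ldots, n$$
where n is the number of trials, p is the probability of success
on each trial, and q = 1 − p is the probability of failure.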
Note:
 Example: Random Experiments (Binomial or Not?)
➢ A fair coin is flipped 20 times; X represents the number of heads.
X is binomial with n = 20 and p = 0.5.
➢ You roll a fair die 50 times; X is the number of times you get a
six.
X is binomial with n = 50 and p = 1/6.
➢ Roll a fair die repeatedly; X is the number of rolls it takes to get a six.
X is not binomial, because the number of trials is not
fixed.
 Mean and variance of the binomial distribution:
Mean = np
Variance = npq
Mean > Variance (because 0 < q < 1)
 Example: Suppose we sample 120 people at random, and 10% of
the population has blood type B. On average, how many
would you expect to have blood type B? What is the standard
deviation of the number X who have blood type B?
Mean = np = 120 × 0.1 = 12
Variance = npq = 120 × 0.1 × 0.9 = 10.8
Standard deviation = √10.8 ≈ 3.29
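The arithmetic for the blood type example can be reproduced in a few lines. This is a small sketch, assuming X is binomial with n = 120 and p = 0.1 as in the example; the standard deviation is the square root of the variance.

```python
from math import sqrt

n, p = 120, 0.1   # sample size and proportion with blood type B
q = 1 - p         # probability of not having blood type B

mean = n * p              # expected number with blood type B
variance = n * p * q      # binomial variance npq
sd = sqrt(variance)       # standard deviation

print(f"mean = {mean:.1f}, variance = {variance:.1f}, sd = {sd:.2f}")
```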
 Example: In a community, the probability that a newborn
child is a boy is 2/5. Among 4 newborn children in
that community, what is the probability of (i) all four boys,
(ii) at least two boys, (iii) no boys, (iv) exactly one boy, and (v)
at most two boys?
Solution: Let us treat the event that a newborn child is a
boy as a success in a Bernoulli trial with probability of success 2/5.
Let the number of boys be a random variable X, which can take the values
0, 1, 2, 3, and 4.
The probability function of X is
$$f(x; 4, 2/5) = \binom{4}{x} \left(\frac{2}{5}\right)^{x} \left(\frac{3}{5}\right)^{4-x}, \quad x = 0, 1, 2, 3, 4$$
(i) P[all boys] = P[X = 4] = (2/5)^4 = 16/625 = 0.0256
(ii) P[at least two boys] = P[X ≥ 2] = 1 − P[X < 2]
= 1 − {P[X = 0] + P[X = 1]}
= 1 − [(3/5)^4 + 4(2/5)(3/5)^3] = 1 − 297/625 = 328/625 = 0.5248
(iii) P[no boys] = P[X = 0] = (3/5)^4 = 81/625 = 0.1296
(iv) P[exactly one boy] = P[X = 1] = 4(2/5)(3/5)^3 = 216/625 = 0.3456
(v) P[at most two boys] = P[X ≤ 2]
= P[X = 0] + P[X = 1] + P[X = 2]
= 81/625 + 216/625 + 216/625 = 513/625 = 0.8208
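The five probabilities in this example can be verified with a short script. The sketch below assumes X is binomial with n = 4 and p = 2/5; `binomial_pmf` is a helper name introduced here for illustration, and exact fractions are used so the results can be compared with hand calculation.

```python
from fractions import Fraction
from math import comb

def binomial_pmf(x, n, p):
    """P(X = x) for a binomial variable with n trials and success probability p."""
    return comb(n, x) * p**x * (1 - p)**(n - x)

n, p = 4, Fraction(2, 5)
pmf = [binomial_pmf(x, n, p) for x in range(n + 1)]

answers = {
    "(i) all four boys": pmf[4],
    "(ii) at least two boys": 1 - (pmf[0] + pmf[1]),
    "(iii) no boys": pmf[0],
    "(iv) exactly one boy": pmf[1],
    "(v) at most two boys": pmf[0] + pmf[1] + pmf[2],
}

for label, prob in answers.items():
    print(f"{label}: {prob} = {float(prob):.4f}")
```

Note that in part (ii) the whole sum P[X = 0] + P[X = 1] is subtracted from 1, which is why it is wrapped in parentheses.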
Poisson distribution
 Poisson distribution: A discrete random variable X is
said to have a Poisson distribution if its probability
function is given by
$$P(X = x) = f(x; \lambda) = \frac{e^{-\lambda} \lambda^{x}}{x!}, \quad x = 0, 1, 2, \ldots$$
where e ≈ 2.71828 and λ is the parameter of the
distribution, which is the mean number of successes.
 Mean and variance of the Poisson distribution:
Mean = Variance = λ
 Consider the number of telemarketing phone calls received by a
household during a given day. In this example, the receiving of a
telemarketing phone call by a household is called an occurrence, the
interval is one day (an interval of time), and the occurrences are
random (that is, there is no specified time for such a phone call to come
in) and discrete. The total number of telemarketing phone calls
received by a household during a given day may be 0, 1, 2, 3, 4, and so
forth. The independence of occurrences in this example means that the
telemarketing phone calls are received individually and no two (or
more) of these phone calls are related.
 Some examples where the Poisson probability distribution
may be successfully applied:
➢ Number of cars passing through a certain street in time t,
➢ Number of suicides reported on a particular day,
➢ Number of accidents that occur on a given highway during a 1-
week period,
➢ Number of deaths from a disease such as heart attack or cancer,
➢ Number of customers entering a grocery store during a 1-hour
interval,
➢ Number of robbers caught on a given day in a certain city,
➢ Number of television sets sold at a department store during a
given week, and so on.
 Example: In a certain population an average of 13 new
cases of esophageal cancer are diagnosed each year. If the
annual incidence of esophageal cancer follows a Poisson
distribution, find the probability that in a given year the
number of newly diagnosed cases of esophageal cancer will
be:
(a) Exactly 10 (b) At least three
(c) No more than five (d) Between six and eight
(e) Fewer than seven
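One way to work through parts (a) to (e) is to code the Poisson probability function directly. The sketch below assumes λ = 13 and reads "between six and eight" as inclusive (6, 7, or 8 cases); that reading is an assumption, since the slide does not say. The helper names `poisson_pmf` and `poisson_cdf` are introduced here for illustration.

```python
from math import exp, factorial

def poisson_pmf(x, lam):
    """P(X = x) for a Poisson variable with mean lam."""
    return exp(-lam) * lam**x / factorial(x)

def poisson_cdf(x, lam):
    """P(X <= x), obtained by summing the pmf from 0 to x."""
    return sum(poisson_pmf(k, lam) for k in range(x + 1))

lam = 13  # average number of new cases per year

exactly_10 = poisson_pmf(10, lam)                          # (a) P(X = 10)
at_least_3 = 1 - poisson_cdf(2, lam)                       # (b) P(X >= 3)
no_more_than_5 = poisson_cdf(5, lam)                       # (c) P(X <= 5)
six_to_eight = poisson_cdf(8, lam) - poisson_cdf(5, lam)   # (d) P(6 <= X <= 8), assumed inclusive
fewer_than_7 = poisson_cdf(6, lam)                         # (e) P(X < 7) = P(X <= 6)

for label, prob in [("(a)", exactly_10), ("(b)", at_least_3),
                    ("(c)", no_more_than_5), ("(d)", six_to_eight),
                    ("(e)", fewer_than_7)]:
    print(f"{label} {prob:.4f}")
```

"At least" and "more than" events are computed through the complement, which avoids summing an infinite number of terms.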
 The Poisson distribution is a limiting case of the
binomial distribution under the following
conditions:
i) The probability of success or failure in a Bernoulli trial is very
small, that is, p → 0 or q → 0.
ii) n, the number of trials, is very large.
iii) np = λ is a finite constant.
 Example: Suppose the probability that a car has an accident on a
very busy road in an hour is 0.001. If 2000 cars pass along that road in one
hour, what is the probability that
(i) exactly 3, (ii) more than two car accidents happen on
that road in that hour?
Solution: Let X be the number of accidents, which follows a
Poisson distribution with λ = np = 2000 × 0.001 = 2, since the
probability of an accident is very small and the number of cars is large.
(i) P[exactly 3 accidents] = P[X = 3] = e^(−2) 2^3 / 3! ≈ 0.180
(ii) P[more than 2 accidents] = P[X > 2] = 1 − P[X ≤ 2] = 1 − 5e^(−2) ≈ 0.323
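To see how close the Poisson approximation is in the accident example, the exact binomial probabilities (n = 2000, p = 0.001) can be compared with the Poisson values (λ = 2). This is an illustrative sketch; the helper names are not from the slides.

```python
from math import comb, exp, factorial

def binomial_pmf(x, n, p):
    """Exact binomial probability P(X = x)."""
    return comb(n, x) * p**x * (1 - p)**(n - x)

def poisson_pmf(x, lam):
    """Poisson probability P(X = x) with mean lam."""
    return exp(-lam) * lam**x / factorial(x)

n, p = 2000, 0.001
lam = n * p  # Poisson mean, np = 2

# (i) exactly 3 accidents
poisson_exactly_3 = poisson_pmf(3, lam)
binomial_exactly_3 = binomial_pmf(3, n, p)

# (ii) more than 2 accidents = 1 - P(X <= 2)
poisson_more_than_2 = 1 - sum(poisson_pmf(k, lam) for k in range(3))
binomial_more_than_2 = 1 - sum(binomial_pmf(k, n, p) for k in range(3))

print(f"P(X = 3): Poisson {poisson_exactly_3:.4f}, binomial {binomial_exactly_3:.4f}")
print(f"P(X > 2): Poisson {poisson_more_than_2:.4f}, binomial {binomial_more_than_2:.4f}")
```

With n this large and p this small, the two sets of values should agree to about three decimal places.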
Home Work
1. Suppose it is known that the probability of recovery from a certain
disease is 0.4. If a random sample of 20 people is drawn from
this population, what is the probability that (i) all will recover,
(ii) none will recover, (iii) exactly two will recover, (iv) at least two
will recover, (v) at most two will recover, (vi) six or more will
recover, (vii) five or fewer will recover, and (viii) between six
and nine will recover?
2. If the mean number of serious accidents per year in a large
factory (where the number of employees remains constant) is five,
find the probability that in the current year there will be:
(i) Exactly seven accidents (ii) Four or more accidents
(iii) No accidents (iv) Fewer than five accidents
3. In a certain population the probability that a new case of
esophageal cancer will be diagnosed is 0.2. A random sample
of 35 people is drawn from this population. Find the
probability that in a given year the number of newly
diagnosed cases of esophageal cancer will be:
(i) Five or fewer (ii) At least six
(iii) Between nine and 13 (iv) No more than six
(v) Fewer than five (vi) Greater than eight
Continuous Probability distribution
Normal distribution
 Normal distribution: A continuous random variable X is
said to have a normal distribution if its probability density function
is given by
$$f(x; \mu, \sigma^{2}) = \frac{1}{\sigma\sqrt{2\pi}} \, e^{-\frac{1}{2}\left(\frac{x-\mu}{\sigma}\right)^{2}}, \quad -\infty < x < \infty$$
where the parameters μ and σ² satisfy −∞ < μ < ∞ and σ² > 0.
 The parameters μ and σ² are the mean and variance of the
normal variate X.
 X ~ N(μ, σ²)
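The density formula can be coded directly and checked numerically. The sketch below uses hypothetical values μ = 10 and σ = 2 (chosen only for illustration), approximates the total area under the curve with a simple midpoint sum over μ ± 8σ, and compares one density value with Python's standard-library `statistics.NormalDist`.

```python
from math import exp, pi, sqrt
from statistics import NormalDist

def normal_pdf(x, mu, sigma):
    """Normal density f(x; mu, sigma^2) written out from the formula."""
    return exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * sqrt(2 * pi))

mu, sigma = 10, 2  # hypothetical parameters, not taken from the slides

# Midpoint-rule approximation of the area under the curve on [mu - 8s, mu + 8s].
n_steps = 32000
lower = mu - 8 * sigma
step = 16 * sigma / n_steps
area = sum(normal_pdf(lower + (i + 0.5) * step, mu, sigma)
           for i in range(n_steps)) * step

print(f"approximate total area = {area:.6f}")
print(f"f(12) from formula     = {normal_pdf(12, mu, sigma):.6f}")
print(f"f(12) from NormalDist  = {NormalDist(mu, sigma).pdf(12):.6f}")
```

The area should come out very close to 1, in line with the property that the total area under a density curve is 1.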
 The graph of the normal curve is
[Figure: bell-shaped normal curve centered at the mean μ]
 The curve is symmetric about the mean
 The mean, the median, and the mode are all equal
 The total area under the curve is 1 or 100%
 The black and the red normal curves have means (centers)
at μ = 10. The red curve is more spread out and thus
has a larger standard deviation.
 The black and the green normal curves have the same
standard deviation (spread) but different means.
 Standard normal distribution: A continuous random
variable Z is said to be a standard normal variate if its
density function is given by
$$f(z) = \frac{1}{\sqrt{2\pi}} \, e^{-\frac{1}{2} z^{2}}, \quad -\infty < z < \infty$$
 Z ~ N(0, 1)
 The Standard Deviation Rule for Normal Random
Variables (Empirical rule):
If X is a normal random variable, then approximately
➢ 68% of observations fall within 1 standard deviation of the mean
➢ 95% of observations fall within 2 standard deviations of the mean
➢ 99.7% of observations fall within 3 standard deviations of the mean
Using probability notation, we may write
P(μ − σ < X < μ + σ) ≈ 0.68
P(μ − 2σ < X < μ + 2σ) ≈ 0.95
P(μ − 3σ < X < μ + 3σ) ≈ 0.997
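The three percentages in the empirical rule can be checked against the standard normal distribution. The sketch below uses `statistics.NormalDist` from the Python standard library; because standardizing removes μ and σ, the same values hold for any normal variable.

```python
from statistics import NormalDist

Z = NormalDist()  # standard normal: mean 0, standard deviation 1

# P(-k < Z < k) for k = 1, 2, 3 standard deviations.
within = {k: Z.cdf(k) - Z.cdf(-k) for k in (1, 2, 3)}

for k, prob in within.items():
    print(f"within {k} standard deviation(s): {prob:.4f}")
```

The rule's 68%, 95% and 99.7% are rounded versions of these values.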
 Suppose that the foot length of a randomly chosen adult male is a
normal random variable with mean μ = 11 inches and standard
deviation σ = 1.5 inches.
(i) What is the probability that a randomly chosen adult male
will have a foot length between 8 and 14 inches?
Answer: 8 and 14 are 11 ± 2(1.5), that is, μ ± 2σ, so the probability is about 0.95 or 95%.
(ii) An adult male is almost guaranteed (0.997 probability) to
have a foot length between what two values?
Answer: 11 ± 3(1.5), that is, between 6.5 and 15.5 inches.
 What is the probability that a male’s foot length will be more than 13
inches?
 How many standard deviations below or above the mean male foot
length is 13 inches?
$$z = \frac{x - \mu}{\sigma} = \frac{13 - 11}{1.5} = +1.33$$
We have just found the z-score for a male foot length of 13 inches
to be z = +1.33. In other words, we have standardized the value of 13.
 The standardized value z tells how many standard deviations
below or above the mean the original value is, and is calculated as
follows:
z-score = (value – mean)/standard deviation
 What is the standardized value for a male foot length of 8.5
inches? How does this foot length relate to the mean?
Answer: z = (8.5 – 11) / 1.5 = -1.67.
This foot length is 1.67 standard deviations below the mean.
 A man’s standardized foot length is +2.5. What is his actual
foot length in inches?
x = μ + zσ
x = 11 + 2.5(1.5) = 14.75 inches.
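The two conversions used above, from a value to its z-score and back, can be written as a pair of small functions. This is a sketch with illustrative function names, using the foot length parameters μ = 11 and σ = 1.5.

```python
def z_score(x, mu, sigma):
    """Standardize x: number of standard deviations above (+) or below (-) the mean."""
    return (x - mu) / sigma

def from_z(z, mu, sigma):
    """Convert a z-score back to the original scale."""
    return mu + z * sigma

mu, sigma = 11, 1.5  # male foot length in inches

z_13 = z_score(13, mu, sigma)      # foot length of 13 inches
z_8_5 = z_score(8.5, mu, sigma)    # foot length of 8.5 inches
x_2_5 = from_z(2.5, mu, sigma)     # standardized foot length of +2.5

print(f"z for 13 inches  = {z_13:+.2f}")
print(f"z for 8.5 inches = {z_8_5:+.2f}")
print(f"x for z = +2.5   = {x_2_5:.2f} inches")
```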
 Example: Male foot lengths have a normal distribution
with mean 11 inches and standard deviation 1.5 inches.
What is the probability that a male’s foot length will be
more than 13 inches?
Solution: The standardized value of 13 is
z = (x − μ)/σ = (13 − 11)/1.5 = 1.33
P(X > 13) = P(Z > 1.33)
= 1 − P(Z < 1.33)
= 1 − 0.9082
= 0.0918
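The same tail probability can be computed without a table. The sketch below uses `statistics.NormalDist`; it shows both the value obtained with z rounded to 1.33, as in a printed z-table, and the value from the unrounded z, which differs slightly in the third decimal place.

```python
from statistics import NormalDist

foot = NormalDist(mu=11, sigma=1.5)  # male foot length in inches
Z = NormalDist()                     # standard normal

# Direct calculation on the original scale (z is not rounded).
p_exact = 1 - foot.cdf(13)

# Table-style calculation: round z to two decimals first.
z = round((13 - 11) / 1.5, 2)        # 1.33
p_table = 1 - Z.cdf(z)

print(f"P(X > 13), unrounded z  = {p_exact:.4f}")
print(f"P(X > 13), z = {z:.2f}      = {p_table:.4f}")
```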
 Suppose that the growth in inches during the tenth year of
life of a Bangladeshi boy is a normal random variate with mean
2 inches and standard deviation 1 inch. Find the probability
that a randomly selected boy will grow (i) between 1 and 2
inches in his tenth year, (ii) more than 3 inches in his tenth
year, (iii) at least 1 inch in his tenth year, and (iv) less than 1
inch in his tenth year.
Solution:
(i) P[1 < X < 2] = P[(1 − 2)/1 < (X − 2)/1 < (2 − 2)/1] = P[−1 < Z < 0]
= P[Z < 0] − P[Z < −1]
= 0.5000 − 0.1587 = 0.3413
(ii) P[X > 3] = P[Z > (3 − 2)/1] = P[Z > 1] = 1 − P[Z < 1]
= 1 − 0.8413 = 0.1587
(iii) P[X ≥ 1] = P[Z ≥ −1] = 1 − P[Z < −1] = 1 − 0.1587 = 0.8413
(iv) P[X < 1] = P[Z < −1] = 0.1587
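The four answers can be confirmed numerically. This is a brief sketch, assuming growth X is normal with mean 2 and standard deviation 1 as stated in the example, and using `statistics.NormalDist` for the cumulative probabilities.

```python
from statistics import NormalDist

growth = NormalDist(mu=2, sigma=1)  # growth in inches during the tenth year

p_between_1_and_2 = growth.cdf(2) - growth.cdf(1)   # (i)   P(1 < X < 2)
p_more_than_3 = 1 - growth.cdf(3)                   # (ii)  P(X > 3)
p_at_least_1 = 1 - growth.cdf(1)                    # (iii) P(X >= 1)
p_less_than_1 = growth.cdf(1)                       # (iv)  P(X < 1)

print(f"(i)   {p_between_1_and_2:.4f}")
print(f"(ii)  {p_more_than_3:.4f}")
print(f"(iii) {p_at_least_1:.4f}")
print(f"(iv)  {p_less_than_1:.4f}")
```

Because X is continuous, P(X ≥ 1) and P(X > 1) are the same, and parts (iii) and (iv) add up to 1.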
Normal Approximation to Binomial
and Poisson
 The normal distribution is obtained from the binomial
distribution under the following conditions:
➢ The probabilities of success and failure are not too small.
➢ n, the number of trials, is very large (say n > 30), with np > 5.
Then mean μ = np and standard deviation σ = √(npq).
 The normal distribution is obtained from the Poisson
distribution under the following condition:
➢ The mean λ of the Poisson distribution is large (say λ > 1000).
Then mean μ = λ and standard deviation σ = √λ.
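The quality of the normal approximation to the binomial can be explored numerically. The sketch below uses hypothetical values n = 100 and p = 0.5 (not from the slides) and compares the exact P(X ≤ 55) with the normal approximation using μ = np and σ = √(npq). The half-unit continuity correction (using 55.5 instead of 55) is a common refinement that the slides do not mention; it is included here as an assumption for comparison.

```python
from math import comb, sqrt
from statistics import NormalDist

n, p = 100, 0.5          # hypothetical values chosen for illustration
q = 1 - p
mu = n * p               # mean of the approximating normal
sigma = sqrt(n * p * q)  # standard deviation of the approximating normal

# Exact binomial probability P(X <= 55).
exact = sum(comb(n, k) * p**k * q**(n - k) for k in range(56))

approx = NormalDist(mu, sigma)
plain = approx.cdf(55)        # normal approximation without correction
corrected = approx.cdf(55.5)  # with continuity correction (an added assumption)

print(f"exact binomial          = {exact:.4f}")
print(f"normal, no correction   = {plain:.4f}")
print(f"normal, with correction = {corrected:.4f}")
```

In this setting the corrected value should sit noticeably closer to the exact probability than the uncorrected one.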
Home Work
1. Suppose the average length of stay in a chronic disease hospital of a
certain type of patient is 60 days with a standard deviation of 15. If
it is reasonable to assume an approximately normal distribution of
lengths of stay, find the probability that a randomly selected
patient from this group will have a length of stay:
(a) Greater than 50 days (b) Less than 30 days
(c) Between 30 and 60 days (d) Greater than 90 days
2. The weights of a certain population of young adult females are
approximately normally distributed with a mean of 132 pounds
and a standard deviation of 15. Find the probability that a subject
selected at random from this population will weigh:
(a) More than 155 pounds (b) 100 pounds or less
(c) Between 105 and 145 pounds
3. Suppose it is known that the probability of recovery for a
certain disease is 0.4. If 35 people are stricken with the
disease, what is the probability that:
(a) 25 or more will recover?
(b) Fewer than five will recover?
(Use the normal approximation.)
