Probability Distributions
Chapter 3
Probability Distribution of Discrete
Variable
Definition: The probability distribution of a
discrete random variable is a table, graph, formula,
or other device used to specify all possible
values of the variable along
with their respective probabilities
Example 3.2.1
A public health nurse has a case load of
50 families.
The goal is to construct the probability
distribution of X, the number of children
per family for the population
Table 3.2.1 gives the probability
distribution on the next slide
Table 3.2.1 Probability distribution of number of
children per family in a population of 50 families
x        Frequency of occurrence of x    P(X = x)
0        1                               1/50
1        4                               4/50
2        6                               6/50
3        4                               4/50
4        9                               9/50
5        10                              10/50
6        7                               7/50
7        4                               4/50
8        2                               2/50
9        2                               2/50
10       1                               1/50
Total    50                              50/50
Figure 3.2.1 Graphical representation of the probability
distribution of number of children per family in a
population of 50 families
(horizontal axis: x, from 0 to 10; vertical axis: probability P(X = x), from 0 to 0.25)
Probability Distribution of Discrete
Variable
There are two essential properties of the
probability distribution of a discrete
variable
1. 0 ≤ P(X = x) ≤ 1
2. Σ P(X = x) = 1
Example 3.2.1
With this probability distribution, it is
possible to make probability statements
regarding the random variable X
Question: For a family selected at random,
what is the probability that the
family will have three children?
The answer: P(X = 3) = 4/50 = 0.08
Example 3.2.1
What is the probability that a family
chosen at random will have either three or
four children?
The answer: From the addition rule
P(X=3 or X=4) = P(X=3) + P(X=4)
P(X=3 or X=4) = 0.08 + 0.18 = 0.26
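The table lookup and the addition rule used above can be reproduced in a few lines. The following is a minimal Python sketch; the names `frequencies` and `pmf` are illustrative assumptions, not part of the original example, and the counts are copied from Table 3.2.1.

```python
from fractions import Fraction

# Number of children x -> number of families (Table 3.2.1)
frequencies = {0: 1, 1: 4, 2: 6, 3: 4, 4: 9, 5: 10, 6: 7, 7: 4, 8: 2, 9: 2, 10: 1}
total = sum(frequencies.values())  # 50 families

# Relative frequencies give P(X = x); exact fractions avoid rounding noise
pmf = {x: Fraction(count, total) for x, count in frequencies.items()}

# The two essential properties of a discrete probability distribution
assert all(0 <= p <= 1 for p in pmf.values())
assert sum(pmf.values()) == 1

p3 = pmf[3]                 # P(X = 3)
p3_or_4 = pmf[3] + pmf[4]   # addition rule for mutually exclusive events

print(float(p3))       # 0.08
print(float(p3_or_4))  # 0.26
```

Exact fractions are used here only so that the two properties can be checked with equality; plain floating-point division would work equally well for the final answers.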
Cumulative Distribution
It is obtained by successive
addition of probabilities
The cumulative probability
distribution can help us to
answer several questions
1. What is the probability that a
family picked at random from
the 50 will have fewer than five
children?
Look at the excel example1
x      Frequency of occurrence of x    P(X = x)    P(X ≤ x)
0      1                               0.02        0.02
1      4                               0.08        0.10
2      6                               0.12        0.22
3      4                               0.08        0.30
4      9                               0.18        0.48
5      10                              0.20        0.68
6      7                               0.14        0.82
7      4                               0.08        0.90
8      2                               0.04        0.94
9      2                               0.04        0.98
10     1                               0.02        1.00
Sum    50                              1.00
Cumulative Distribution
1. What is the probability that a
family picked at random from
the 50 will have fewer than five
children?
For X = 0 to X = 4
P(X<5) = 24/50 = 0.48
Cumulative Distribution
2. What is the probability that a
randomly picked family will have
five or more children?
Here, the set of families with
five or more children is the
complement of the set of
families with fewer than five
children
The sum of the two probabilities is always equal to 1
As a result
P(X ≥ 5) = 1 − P(X < 5)
P(X ≥ 5) = 1 − 0.48 = 0.52
Cumulative Distribution
3. What is the probability that a
randomly picked family will have
between three and six children,
inclusive?
Here, we need to find:
P(3 ≤ X ≤ 6)
This is equal to:
P(3 ≤ X ≤ 6) = P(X ≤ 6) − P(X < 3)
P(3 ≤ X ≤ 6) = 0.82 − 0.22
P(3 ≤ X ≤ 6) = 0.60
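The three cumulative-distribution questions can be checked by successive addition of the frequencies. This is a minimal Python sketch under the assumption that the frequencies are stored in a list indexed by x; the variable names are illustrative only.

```python
from itertools import accumulate

# Frequencies for x = 0, 1, ..., 10 (list index = number of children)
frequencies = [1, 4, 6, 4, 9, 10, 7, 4, 2, 2, 1]
total = sum(frequencies)                    # 50 families

# Successive addition gives the cumulative counts, i.e. 50 * P(X <= x)
cum_counts = list(accumulate(frequencies))

p_fewer_than_5 = cum_counts[4] / total                   # P(X < 5) = P(X <= 4)
p_5_or_more = (total - cum_counts[4]) / total            # complement of the above
p_3_to_6 = (cum_counts[6] - cum_counts[2]) / total       # P(X <= 6) - P(X <= 2)

print(p_fewer_than_5)  # 0.48
print(p_5_or_more)     # 0.52
print(p_3_to_6)        # 0.6
```

Working with integer counts and dividing once at the end keeps the results free of accumulated floating-point error.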
Theoretical Probability Distributions
Binomial distribution
Poisson distribution
Normal distribution
Binomial distribution
One of the most frequently encountered distributions
It originates from the Bernoulli trial, named
in honor of the Swiss mathematician
James Bernoulli
When a single trial of some process or
experiment can result in only one of two
mutually exclusive outcomes (e.g., male or
female), the trial is called a Bernoulli trial
Binomial distribution
A sequence of Bernoulli trials forms a Bernoulli process under the
following conditions
1. Each trial results in one of two possible,
mutually exclusive outcomes (one denoted a success and
the other a failure)
2. The probability of a success is denoted by p and
the probability of a failure by q, where q = 1 − p
3. The trials are independent; that is, the
outcome of any particular trial is not affected by
the outcome of any other trial
Example 3.3.1
The task is to calculate the probability of x
successes in n Bernoulli trials
Suppose that in a certain population 52 percent of all
recorded births are male
Then we say that the probability of a recorded
male birth is 0.52
If we randomly select five birth records from this
population, what is the probability that exactly
three of the records will be for male births?
Example 3.3.1
Let us say that a male birth is a success (an arbitrary choice)
Suppose the five birth records selected
resulted in this sequence of sexes:
MFMMF
If we code a male birth as 1 and a female birth as 0, this sequence is 10110
Now, with the p and q notation, the probability
of the above sequence is found by means of the
multiplication rule as
$P(1,0,1,1,0) = pqppq = q^{2}p^{3}$
Example 3.3.1
The sequences with three males and
two females could occur as
Number    Sequence
1         11100
2         11010
3         11001
4         10110
5         10011
6         10101
7         01110
8         00111
9         01011
10        01101
Example 3.3.1
So, the question:
What is the probability, in a random sample of size 5
drawn from the specified population,
of observing three successes and two failures?
Example 3.3.1
Since in the population p = 0.52 and
q = 1 − p = 1 − 0.52 = 0.48,
the answer to the question is
$10 \times q^{2} \times p^{3} = 10 \times (0.48)^{2} \times (0.52)^{3} = 10 \times 0.2304 \times 0.140608 = 0.32$
So, where does the 10 come from?
Example 3.3.1
Listing the sequences becomes
difficult as the sample size increases
Thus, if we have n things, x of
which are of one type and the remainder
of another type, we use the
equation:
$_nC_x = \dfrac{n!}{x!\,(n-x)!}$
This equation gives the number of
combinations of n things taken x at
a time
Example 3.3.1
Thus, for our example, with n = 5 and x = 3
$_nC_x = \dfrac{n!}{x!\,(n-x)!}$
$_5C_3 = \dfrac{5!}{3!\,(5-3)!} = \dfrac{120}{6 \times 2} = \dfrac{120}{12} = 10$
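The count of 10 can be confirmed both by brute-force listing and by the combination formula. The sketch below is illustrative only: it assumes the 1/0 coding for male/female used in this example and relies on Python's standard `itertools.product` and `math.comb`.

```python
from itertools import product
from math import comb, factorial

n, x = 5, 3  # five birth records, three of them coded 1 (male)

# Enumerate every 0/1 sequence of length n and keep those with exactly x ones
sequences = ["".join(bits) for bits in product("10", repeat=n)
             if bits.count("1") == x]

# The same count from the factorial formula and from math.comb
by_formula = factorial(n) // (factorial(x) * factorial(n - x))
by_comb = comb(n, x)

print(len(sequences), by_formula, by_comb)  # 10 10 10
```

Enumeration is only practical for small n (there are 2^n sequences to scan), which is the reason the formula is preferred as the sample size grows.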
The Binomial Distribution
Then, the probability of obtaining
exactly x successes in n trials is
$f(x) = {}_nC_x\, q^{n-x} p^{x} = {}_nC_x\, p^{x} q^{n-x}$
for x = 0, 1, 2, ..., n
The Binomial Distribution
Number of Successes, x    Probability, f(x)
0                         $_nC_0\, q^{n-0} p^{0}$
1                         $_nC_1\, q^{n-1} p^{1}$
2                         $_nC_2\, q^{n-2} p^{2}$
...                       ...
x                         $_nC_x\, q^{n-x} p^{x}$
...                       ...
n                         $_nC_n\, q^{n-n} p^{n}$
Total                     1
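The table above can be generated for any n and p. The following is a minimal Python sketch; the helper name `binomial_pmf` is an assumption made for illustration, and the values n = 5, p = 0.52 are those of the birth-record example.

```python
from math import comb

def binomial_pmf(x: int, n: int, p: float) -> float:
    """Probability of exactly x successes in n independent Bernoulli trials."""
    q = 1.0 - p
    return comb(n, x) * p**x * q**(n - x)

n, p = 5, 0.52  # birth-record example: five records, P(male) = 0.52

# One row of the table per possible number of successes
table = {x: binomial_pmf(x, n, p) for x in range(n + 1)}
for x, fx in table.items():
    print(x, round(fx, 4))

total = sum(table.values())  # should be 1, up to floating-point rounding
```

The row for x = 3 is approximately 0.324, in agreement with the hand calculation that rounds to 0.32.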
Example 3.3.2
30 % of a certain population are immune to some
disease.
If a random sample of size 10 is selected from this
population, what is the probability that it will contain
exactly 4 immune persons?
Example 3.3.2
Here we can set the probability of an immune person to
be 0.3
Then
$f(4) = {}_{10}C_4\,(0.7)^{6}(0.3)^{4} = \dfrac{10!}{4!\,6!}\,(0.117649)(0.0081) = 0.2001$
This problem can also be solved with Excel
Example 3.3.2
For the birth record example
Here, the objective was to find the probability of observing exactly
3 male birth records when n = 5 and p = 0.52
However, there is no p = 0.52 in Table A in Appendix II
But if we restate the question as the probability of observing exactly 2
female births for n = 5 and p = 1 − 0.52 = 0.48, we can solve the
problem
From the table: P(X ≤ 2) = 0.5373 and P(X ≤ 1) = 0.2135
Then P(X = 2) = P(X ≤ 2) − P(X ≤ 1) = 0.5373 − 0.2135 = 0.3238 ≈ 0.324
This agrees with the result of our previous calculation
Example 3.3.3
Suppose that in a certain population 10% of the
population is color blind, and that a random sample of 25
people is drawn from this population.
a. Find the probability that five or fewer will be color
blind
From Table A, for n = 25 and p = 0.10
P(X ≤ 5) = 0.9666
Example 3.3.3
b. Find the probability that six or more will be color
blind
This is the complement of the set specified in part a.
Thus
P(X ≥ 6) = 1 − P(X ≤ 5) = 1 − 0.9666 = 0.0334
Example 3.3.3
c. Find the probability that between six and nine,
inclusive, will be color blind
P(6 ≤ X ≤ 9) = P(X ≤ 9) − P(X ≤ 5)
P(6 ≤ X ≤ 9) = 0.9999 − 0.9666 = 0.0333
Example 3.3.3
d. Find the probability that two, three, or four will be
color blind
P(2 ≤ X ≤ 4) = P(X ≤ 4) − P(X ≤ 1)
P(2 ≤ X ≤ 4) = 0.9020 − 0.2712 = 0.6308
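The Table A lookups in this example can be reproduced by summing binomial probabilities directly. The sketch below is a hedged illustration: `binomial_cdf` is a hypothetical helper written for this purpose, not a routine from the textbook or from Excel.

```python
from math import comb

def binomial_cdf(x: int, n: int, p: float) -> float:
    """P(X <= x) for a binomial variable, by summing the individual terms."""
    return sum(comb(n, k) * p**k * (1 - p)**(n - k) for k in range(x + 1))

n, p = 25, 0.10  # sample of 25, 10% color blind

p_five_or_fewer = binomial_cdf(5, n, p)                      # about 0.9666
p_six_or_more = 1 - p_five_or_fewer                          # complement
p_six_to_nine = binomial_cdf(9, n, p) - binomial_cdf(5, n, p)
p_two_to_four = binomial_cdf(4, n, p) - binomial_cdf(1, n, p)

for label, value in [("five or fewer", p_five_or_fewer),
                     ("six or more", p_six_or_more),
                     ("six to nine", p_six_to_nine),
                     ("two to four", p_two_to_four)]:
    print(f"{label}: {value:.4f}")
```

Small differences in the fourth decimal place relative to the slide values are expected, because the table entries are rounded before being subtracted.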
When we have p > 0.50
The probability that X is equal to some specified
value x, given the sample size n and a probability of
success p greater than 0.50, is equal to the
probability that X is equal to n − x, given the sample
size and a probability of failure of 1 − p.
This statement is given as
P(X = x | n, p > 0.50) = P(X = n − x | n, 1 − p)
When we have p > 0.50
If we are looking for a cumulative probability when
p > 0.50
P(X ≤ x | n, p > 0.50) = P(X ≥ n − x | n, 1 − p)
And to find the probability that X is greater than or
equal to some x when p > 0.50, we use the following
equation
P(X ≥ x | n, p > 0.50) = P(X ≤ n − x | n, 1 − p)
Example 3.3.4
In a certain community, on a given evening,
someone is at home in 85% of households. A health
care research team conducting a telephone survey
selects a random sample of 12 households.
a. Find the probability that the team will find
someone at home in exactly 7 households
We can look at this problem as the probability that
the team conducting the survey gets no answer
from exactly 5 calls out of 12, if no one is at home in
15% of households
Then the answer is
P(X = 5 | n = 12, p = 0.15) = P(X ≤ 5) − P(X ≤ 4)
P(X = 5 | n = 12, p = 0.15) = 0.9954 − 0.9761 = 0.0193
Example 3.3.4
b. Find the probability that the team will find
someone at home in 5 or fewer households
P(X ≤ 5 | n = 12, p = 0.85) = P(X ≥ 12 − 5 | n = 12, p = 1 − 0.85)
P(X ≤ 5 | n = 12, p = 0.85) = P(X ≥ 7 | n = 12, p = 0.15)
P(X ≤ 5 | n = 12, p = 0.85) = 1 − P(X ≤ 6 | n = 12, p = 0.15)
P(X ≤ 5 | n = 12, p = 0.85) = 1 − 0.9993 = 0.0007
Example 3.3.4
c. Find the probability that the team will find
someone at home in 8 or more households
P(X ≥ 8 | n = 12, p = 0.85) = P(X ≤ 4 | n = 12, p = 0.15)
P(X ≥ 8 | n = 12, p = 0.85) = 0.9761
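The relations used when p > 0.50 exist only because the printed table stops at p = 0.50; a direct calculation gives the same numbers either way. The following Python sketch checks this for the household example. The helper names `pmf` and `cdf` are illustrative assumptions.

```python
from math import comb

def pmf(x: int, n: int, p: float) -> float:
    """Binomial P(X = x)."""
    return comb(n, x) * p**x * (1 - p)**(n - x)

def cdf(x: int, n: int, p: float) -> float:
    """Binomial P(X <= x)."""
    return sum(pmf(k, n, p) for k in range(x + 1))

n, p = 12, 0.85   # someone at home in 85% of households
q = 1 - p         # no one at home

# Exactly 7 at home  <->  exactly 5 not at home
exactly_7 = pmf(7, n, p)
exactly_7_via_q = pmf(n - 7, n, q)

# 5 or fewer at home  <->  7 or more not at home
at_most_5 = cdf(5, n, p)
at_most_5_via_q = 1 - cdf(n - 5 - 1, n, q)

# 8 or more at home  <->  4 or fewer not at home
at_least_8 = 1 - cdf(7, n, p)
at_least_8_via_q = cdf(n - 8, n, q)

print(round(exactly_7, 4), round(at_most_5, 4), round(at_least_8, 4))
```

Each pair of values agrees to within floating-point rounding, which is the content of the three identities.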
The Poisson Distribution
The Poisson distribution is named for the French
mathematician Siméon Denis Poisson
It has been used in biology and medicine
If x is the number of occurrences of some random
event in an interval of time or space, the probability
that x will occur is given by
$$f(x) = P(X = x) = \frac{e^{-\lambda}\lambda^{x}}{x!}, \quad x = 0, 1, 2, \ldots$$
where λ is the average number of occurrences of the event in the interval
Example 3.4.1
b. Find the probability that no emergency admissions
will occur on a particular day
$$f(0) = \frac{e^{-3}\,3^{0}}{0!} = \frac{(0.05)(1)}{1} = 0.05$$
Example 3.4.1
c. Find the probability that either three or four
emergency admissions will occur on a particular
day
$$f(3) + f(4) = \frac{e^{-3}\,3^{3}}{3!} + \frac{e^{-3}\,3^{4}}{4!} = \frac{(0.05)(27)}{6} + \frac{(0.05)(81)}{24} = 0.39$$
All of these values can also be obtained from Table B in Appendix II for the
known λ and x
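As an alternative to Table B, the Poisson probabilities can be computed directly. This is a minimal sketch that assumes λ = 3, the value appearing in the calculations above; the helper name `poisson_pmf` is hypothetical.

```python
from math import exp, factorial

def poisson_pmf(x: int, lam: float) -> float:
    """P(X = x) for a Poisson variable with mean lam."""
    return exp(-lam) * lam**x / factorial(x)

lam = 3  # assumed average number of emergency admissions per day

p_none = poisson_pmf(0, lam)
p_three_or_four = poisson_pmf(3, lam) + poisson_pmf(4, lam)

print(round(p_none, 4))
print(round(p_three_or_four, 4))
```

The hand calculation rounds e^(-3) to 0.05 before multiplying, so the unrounded results (roughly 0.0498 and 0.392) differ from it slightly in the third decimal place.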
The Poisson distribution is defined by one parameter: lambda.
This parameter equals the mean and variance.
As lambda increases, the Poisson distribution approaches a
normal distribution.
A variable follows a Poisson distribution if the following
conditions are met:
Data are counts of events (non-negative integers with no
upper bound).
All events are independent.
Average rate does not change over the period of interest.
The Poisson distribution is similar to the binomial distribution
because they both model counts of events.
However, the Poisson distribution models a finite observation
space with any integer number of events greater than or equal to
zero.
The binomial distribution models a fixed number of discrete trials
from 0 to n events.
The formula for the Poisson cumulative probability
function is
$$F(x; \lambda) = \sum_{i=0}^{x} \frac{e^{-\lambda}\lambda^{i}}{i!}$$
The following is the plot of the Poisson cumulative distribution
function with the same values of λ as the pdf plots above.
Example 3.4.2
In the study of a certain aquatic organism, a large
number of samples were taken from a pond, and
the number of organisms in each sample was
counted.
The average number of organisms per sample was
found to be two.
Assuming that the number of organisms follows a
Poisson distribution:
a. Find the probability that the next sample taken
will contain one or fewer organisms
Example 3.4.2
a. Find the probability that the next sample taken
will contain one or fewer organisms
In Table B, when λ = 2, the probability that X ≤ 1
is 0.406
That is, P(X ≤ 1 | λ = 2) = 0.406
Example 3.4.2
b. Find the probability that the next sample taken
will contain exactly three organisms
P(X = 3 | λ = 2) = P(X ≤ 3) − P(X ≤ 2)
P(X = 3 | λ = 2) = 0.857 − 0.677 = 0.180
Example 3.4.2
c. Find the probability that the next sample taken
will contain more than five organisms
P(X > 5 | λ = 2) = 1 − P(X ≤ 5)
P(X > 5 | λ = 2) = 1 − 0.983 = 0.017
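The cumulative values read from Table B can be reproduced by summing Poisson terms. The sketch below is illustrative; `poisson_cdf` is a hypothetical helper, and λ = 2 is the average number of organisms per sample given in the example.

```python
from math import exp, factorial

def poisson_cdf(x: int, lam: float) -> float:
    """P(X <= x) for a Poisson variable with mean lam."""
    return sum(exp(-lam) * lam**k / factorial(k) for k in range(x + 1))

lam = 2  # average number of organisms per sample

p_one_or_fewer = poisson_cdf(1, lam)
p_exactly_three = poisson_cdf(3, lam) - poisson_cdf(2, lam)
p_more_than_five = 1 - poisson_cdf(5, lam)

print(round(p_one_or_fewer, 3))
print(round(p_exactly_three, 3))
print(round(p_more_than_five, 3))
```

The three results correspond to the three parts of the example and agree with the table values to about three decimal places.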
Continuous Probability Distribution
The distributions we have seen so far (binomial and Poisson)
are distributions of discrete variables
A continuous variable is one that can assume any value within
a specified interval of values assumed by the variable.
Figure 3.5.2 Graphical representation of a continuous distribution (f(x) plotted against x)
Continuous Probability Distribution
The total area under the curve is equal to 1
The probability of any specific value of a random variable is
zero
Continuous Probability Distribution
To find the area under a smooth curve between any two points, a and b, the density
function is integrated from a to b.
A density function is a formula used to represent the distribution of a continuous
random variable
Figure 3.5.2 Graph of a continuous distribution showing the area between a and b
Continuous Probability Distribution
Definition: A nonnegative function f(x) is called a
probability distribution (probability density function)
of the continuous random variable X if the total area
bounded by its curve and the x-axis is equal to 1
and if the subarea under the curve bounded by the
curve, the x-axis, and perpendiculars drawn at any
two points a and b gives the probability that X is
between the points a and b
The Normal Distribution
The most important distribution in all of statistics
It was first formulated by Abraham De Moivre in 1733
It is also called the Gaussian distribution in honor of
Carl Friedrich Gauss
The normal density function is given as
$$f(x) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-(x-\mu)^{2}/(2\sigma^{2})}, \quad -\infty < x < \infty$$
The Normal Distribution
Here μ is the mean
and σ is the standard
deviation
Characteristics of the Normal Distribution
It is symmetrical around the mean, μ
The mean, median, and mode are all the same
The total area under the curve above the x-axis is one
square unit.
The area within ±1σ of the mean is approximately
68% of the total area
The area within ±2σ of the mean is approximately
95% of the total area
The area within ±3σ of the mean is approximately
99.7% of the total area
The normal distribution is completely determined by
two parameters, the mean and the standard deviation
The standard normal distribution is a special case in which
the mean equals zero and the standard deviation equals one
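The 68%, 95%, and 99.7% figures can be checked numerically. This sketch assumes the standard identity that the area within k standard deviations of the mean equals erf(k/√2), which holds for any normal distribution; the function name `area_within` is illustrative.

```python
from math import erf, sqrt

def area_within(k: float) -> float:
    """Area under a normal curve within k standard deviations of the mean."""
    return erf(k / sqrt(2))

for k in (1, 2, 3):
    print(k, round(area_within(k), 4))
```

The printed areas do not depend on μ or σ, which is why the rule can be stated once for every normal distribution.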
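The Standard Normal Distribution
For a random variable defined as
$$z = \frac{x-\mu}{\sigma}$$
the equation for the standard normal
distribution is given as
$$f(z) = \frac{1}{\sqrt{2\pi}}\, e^{-z^{2}/2}, \quad -\infty < z < \infty$$
The area under the curve between two points $z_0$ and $z_1$ is
$$\int_{z_0}^{z_1} \frac{1}{\sqrt{2\pi}}\, e^{-z^{2}/2}\, dz$$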
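The Standard Normal Distribution
[Figure: a normal curve f(x) plotted against x and the corresponding standard normal curve f(z) plotted against z, with μ = 0 and σ = 1]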
Example 3.6.1
Given the standard normal distribution, find the area under
the curve, above the z-axis, between
z = −∞ and z = 2
From Table C in Appendix II, the area is given as 0.9772
[Figure: standard normal curve with 0 and 2 marked on the horizontal axis]
This value is the probability that a z picked at
random from the population of z's will have a value
between −∞ and 2
It is also the relative frequency of
occurrence (or proportion) of values of z between −∞
and 2
Put another way, 97.72% of the z's have a value
between −∞ and 2
Example 3.6.2
What is the probability that a z picked at random from
the population of z's will have a value between
−2.55 and +2.55?
Example 3.6.2
P(−2.55 < z < 2.55) = 0.9946 − 0.0054 = 0.9892
If the question had asked for the interval inclusive of −2.55 and +2.55,
then
P(−2.55 ≤ z ≤ 2.55) = P(−2.55 < z < 2.55) = 0.9892
since P(z = z_0) = 0 for any single value z_0
Example 3.6.3
What proportion of z values are between
−2.74 and +1.53?
P(−2.74 ≤ z ≤ 1.53) = 0.9370 − 0.0031 = 0.9339
Example 3.6.4
Given the standard normal distribution,
find P(z > 2.71)
P(z > 2.71) = 1 − P(z ≤ 2.71) = 1 − 0.9966 = 0.0034
Example 3.6.5
Given the standard normal distribution,
find P(0.84 ≤ z ≤ 2.45)
P(0.84 ≤ z ≤ 2.45) = P(z ≤ 2.45) − P(z ≤ 0.84)
= 0.9929 − 0.7995 = 0.1934
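The Table C lookups in Examples 3.6.1 to 3.6.5 can be reproduced with the error function from Python's standard library. This is a minimal sketch; the function name `phi` for the standard normal cumulative distribution is an illustrative choice.

```python
from math import erf, sqrt

def phi(z: float) -> float:
    """Standard normal cumulative probability P(Z <= z)."""
    return 0.5 * (1 + erf(z / sqrt(2)))

area_below_2 = phi(2)                       # area from -infinity to 2
between_pm_255 = phi(2.55) - phi(-2.55)     # between -2.55 and +2.55
between_274_153 = phi(1.53) - phi(-2.74)    # between -2.74 and +1.53
above_271 = 1 - phi(2.71)                   # right tail beyond 2.71
between_084_245 = phi(2.45) - phi(0.84)     # between 0.84 and 2.45

for value in (area_below_2, between_pm_255, between_274_153,
              above_271, between_084_245):
    print(round(value, 4))
```

Because a continuous variable has zero probability at any single point, the same calls answer both the strict and the inclusive versions of each question.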
Example 3.6.6
A physical therapist believes
that scores on a certain
manual dexterity test are
approximately normally
distributed with a mean of
10 and a standard
deviation of 2.5. If a
randomly selected
individual takes the test,
what is the probability
that he or she will make a
score of 15 or better?
[Figure: normal curve with μ = 10 and σ = 2.5 and the point x = 15 marked; the corresponding standard normal curve with σ = 1 and the points 0 and 2 marked]
Example 3.6.6
For x = 15:
$$z = \frac{x-\mu}{\sigma} = \frac{15-10}{2.5} = 2$$
P(x ≥ 15) = P(z ≥ 2) = 1 − 0.9772 = 0.0228
Example 3.6.7
Suppose it is known that
the weights of a certain
population of individuals
are approximately normally
distributed with a mean of
140 pounds and a
standard deviation of 25
pounds.
What is the probability
that a person picked at
random from this group
will weigh between 100
and 170 pounds?
[Figure: normal curve with μ = 140 and σ = 25 and the points 100 and 170 marked; the corresponding standard normal curve with σ = 1 and the points −1.6, 0, and 1.2 marked]
Example 3.6.7
For x = 100:
$$z = \frac{x-\mu}{\sigma} = \frac{100-140}{25} = -1.6$$
For x = 170:
$$z = \frac{x-\mu}{\sigma} = \frac{170-140}{25} = 1.2$$
P(100 ≤ x ≤ 170) = P(−1.6 ≤ z ≤ 1.2)
= P(−∞ < z ≤ 1.2) − P(−∞ < z ≤ −1.6) = 0.8849 − 0.0548
= 0.8301
Example 3.6.8
In a population of 10,000 of the people described in
Example 3.6.7, how many would you expect to
weigh more than 200 pounds?
For x = 200, z = (200 − 140)/25 = 2.4
P(x > 200) = P(z > 2.4) = 1 − 0.9918 = 0.0082
So, for 10,000 people
10,000 × 0.0082 = 82 would be expected to weigh more than 200 pounds
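Examples 3.6.6 to 3.6.8 can also be worked without converting to z by hand. The sketch below uses `statistics.NormalDist` from the Python standard library (available from Python 3.8); the variable names are illustrative assumptions.

```python
from statistics import NormalDist

# Example 3.6.6: dexterity scores with mean 10 and standard deviation 2.5
scores = NormalDist(mu=10, sigma=2.5)
p_score_15_or_better = 1 - scores.cdf(15)

# Examples 3.6.7 and 3.6.8: weights with mean 140 and standard deviation 25
weights = NormalDist(mu=140, sigma=25)
p_between_100_and_170 = weights.cdf(170) - weights.cdf(100)
p_over_200 = 1 - weights.cdf(200)
expected_over_200 = 10_000 * p_over_200

# The z-transformation used on the slides, for comparison
z_for_200 = (200 - weights.mean) / weights.stdev   # 2.4

print(round(p_score_15_or_better, 4))
print(round(p_between_100_and_170, 4))
print(round(expected_over_200))
```

`NormalDist.cdf` performs the standardization internally, so the results match the table-based answers up to the rounding of the table entries.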