C-5 Probability
Probability Distributions
01/23/2023 1
Learning objectives
of an event
chance
probability experiment
• Probability is the language of chance, i.e., how likely is
an event to occur?
• The frequentist definition is usually used in statistics.
• This states that the probability of the occurrence of a
particular event equals the proportion of times that
the event would (or does) occur in a large number of
similar repeated trials
• It has a value between 0 and 1
• When can we talk about probability ?
• When dealing with a process that has an uncertain
outcome
Definitions of some terms commonly encountered in
probability
Experiment = any process with an uncertain outcome.
An experiment is a trial and all possible outcomes are
events
Event = something that may happen or not when the
experiment is performed (either occur or not)
Events are represented by uppercase letters such as A,
B, C, etc
• Sample space: The set of all possible outcomes of an
experiment, for example {H, T} for a single coin toss.
• Probability
– Can be defined as the proportion of times in which an event
occurs in a very large number of trials
• Probability of an Event E
– A number between 0 and 1 representing the proportion of
times that event E is expected to happen when the experiment
is done over and over again under the same conditions
Two Categories of Probability
1. Objective and
2. Subjective Probabilities
Objective probability
Classical probability
• If we toss a die, what is the probability that a 4 comes up?
– m = 1 (the single favorable outcome, 4) and N = 6 (possible outcomes)
– The probability of 4 coming up is m/N = 1/6
• There are 2 possible outcomes {H, T} when a coin
is tossed
P(H) = 0.5
P(T) = 0.5
SUM = 1.0
Relative Frequency Probability
• The proportion of times the event A occurs in a large
number of trials repeated under essentially identical
conditions
• Probabilities can be expressed as proportions that
range from 0 to 1
– Definition: If a process is repeated a large number
of times (n), and if an event with the characteristic
E occurs m times, the relative frequency of E, m/n,
is approximately equal to the probability of E:
Probability of E = m/n
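The relative frequency definition can be illustrated with a small simulation. This is only a sketch: the function name `relative_frequency`, the fair-coin assumption (true P(H) = 0.5), and the fixed seed are illustrative choices, not part of the original slides.

```python
import random

def relative_frequency(n_trials, seed=0):
    """Estimate P(heads) as m/n from n simulated tosses of a fair coin."""
    rng = random.Random(seed)  # fixed seed so repeated runs give the same result
    # m = number of tosses that come up heads (assumed true probability 0.5)
    m = sum(1 for _ in range(n_trials) if rng.random() < 0.5)
    return m / n_trials

# The estimate m/n tends to settle near 0.5 as n grows.
for n in (10, 1_000, 100_000):
    print(n, relative_frequency(n))
```

With few trials the estimate can be far from 0.5; with many trials it usually lies close to it, which is the idea behind the frequentist definition.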
P(H) = 0.5562.
• Mutually exclusive events cannot happen together: P(A ∩ B) = 0
– If E1 occurs, then E2 does not occur
– If E2 occurs, then E1 does not occur
Example:
– A coin toss cannot produce head and tail
simultaneously.
– The weight of an individual can’t be classified
simultaneously as “underweight”, “normal”, and
“overweight”
– Blood pressure reading: A = (DBP < 90) and
B = (90 ≤ DBP < 95) can’t occur at the same time
Independent Events
• Two events A and B are independent if the probability of the
first one happening is the same no matter how the second
one turns out, or equivalently if P(A ∩ B) = P(A) × P(B)
Example:
– The outcomes on the first and second coin tosses are
independent
Dependent Events
• Occurrence of one affects the probability of the other
– P(A ∩ B) ≠ P(A) x P(B)
• Example: Consider the DBP measurements from a
mother and her first-born child.
Let:
– A = {mother’s DBP ≥ 95}
– B = {first-born child’s DBP ≥ 80}
• Suppose P(A ∩ B) = 0.05, P(A) = 0.1, and P(B) = 0.2. Then
P(A ∩ B) = 0.05 > P(A) × P(B) = 0.02
• So events A and B are dependent
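The comparison of P(A ∩ B) with P(A) × P(B) can be sketched in a few lines. The helper name `are_independent` and the tolerance are illustrative assumptions; the probabilities are the ones quoted in the DBP example.

```python
import math

def are_independent(p_a, p_b, p_a_and_b, tol=1e-9):
    """Return True when P(A and B) equals P(A) * P(B) within a small tolerance."""
    return math.isclose(p_a_and_b, p_a * p_b, abs_tol=tol)

# Mother / first-born child DBP example: 0.05 versus 0.1 * 0.2 = 0.02
print(are_independent(0.1, 0.2, 0.05))   # dependent, so this prints False

# Two fair coin tosses: P(H and H) = 0.25 = 0.5 * 0.5
print(are_independent(0.5, 0.5, 0.25))   # independent, so this prints True
```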
Dependent events
E1 = Rain forecasted on the news
Intersection and union
• The intersection of two events A and B, A ∩ B, is the event
that A and B happen simultaneously
– P ( A and B ) = P (A ∩ B )
0 ≤ P(E) ≤ 1
A value of 0 means the event cannot occur
A value of 1 means the event definitely will occur
A value of 0.5 means that the probability that the event will
occur is the same as the probability that it will not occur
Basic Probability Rules
1. Addition rule
More generally:
P(A or B) = P(A) + P(B) - P(A and B)
P (event A or event B occurs or they both occur)
Example: The probabilities below represent years of
schooling completed by mothers of newborn infants
Mother’s education    Probability
≤ 8 years             0.056
9 to 11 years         0.159
12 years              0.321
13 to 15 years        0.218
≥ 16 years            0.230
Not reported          0.016
• What is the probability that a mother has completed < 12
years of schooling?
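One way to work this out is with the addition rule, since the education categories are mutually exclusive and their intersection is therefore 0. The sketch below is illustrative; the function name `p_union` and the dictionary labels are assumptions, and only the two categories below 12 years are used.

```python
def p_union(p_a, p_b, p_a_and_b=0.0):
    """Addition rule: P(A or B) = P(A) + P(B) - P(A and B)."""
    return p_a + p_b - p_a_and_b

# Probabilities taken from the schooling table (categories below 12 years)
education = {"<= 8 years": 0.056, "9 to 11 years": 0.159}

# The two categories cannot overlap, so P(A and B) = 0
p_less_than_12 = p_union(education["<= 8 years"], education["9 to 11 years"])
print(round(p_less_than_12, 3))  # 0.215
```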
P(ECG abnormal and RA abnormal) = 7/19 = 0.37
P(ECG abnormal or RA abnormal)
= P(ECG abnormal) + P(RA abnormal) − P(both ECG and RA
abnormal)
= 17/19 + 2/19 − 7/19 = 19/19 − 7/19 = 1 − 0.37 = 0.63
Note: If the intersection were not subtracted, the 7 patients
whose ECGs and RAs are both abnormal would be counted twice.
2. Multiplication rule
– If A and B are independent events,
– Then P(A ∩ B) = P(A) × P(B)
• More generally (both independent & dependent),
– P(A ∩ B) = P(A) P(B|A) = P(B) P(A|B)
– P (A and B) denotes the probability that A and B
both occur at the same time
Example
=½x½=¼
Conditional Probability
• Refers to the probability of an event, given that
another event is known to have occurred.
– “What happened first is assumed”
• The conditional probability that event B has occurred
given that event A has already occurred is denoted
P(B|A) and is defined as
P(B|A) = P(A ∩ B) / P(A),
provided that P(A) > 0
Example:
Light exposure    Retinopathy    No retinopathy    Total
Bright light      18             3                 21
Reduced light     21             18                39
Total             39             21                60
• The probability of developing retinopathy is:
P(Retinopathy) = (No. of infants with retinopathy) / (Total No. of infants)
= (18 + 21)/(21 + 39)
= 0.65
• The probability of retinopathy given exposure to bright light is:
P(Retinopathy | Bright light) = 18/21 = 0.86
• The probability of retinopathy given exposure to reduced light is:
P(Retinopathy | Reduced light) = 21/39 = 0.54
• The conditional probabilities suggest that premature
infants exposed to bright light have a higher risk of
retinopathy than premature infants exposed to reduced
light
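The conditional probabilities in this example can be recomputed directly from the cell counts. This is a sketch under the assumption that the first count in each row is the number of infants with retinopathy and the second the number without; the names `counts` and `p_given_light` are illustrative.

```python
# Cell counts from the light-exposure table (column meaning is an assumption)
counts = {
    "bright": {"retinopathy": 18, "no_retinopathy": 3},
    "reduced": {"retinopathy": 21, "no_retinopathy": 18},
}

def p_given_light(light):
    """P(retinopathy | light exposure) = count with retinopathy / row total."""
    row = counts[light]
    return row["retinopathy"] / sum(row.values())

# Marginal probability of retinopathy over all 60 infants
total = sum(sum(row.values()) for row in counts.values())
p_retinopathy = sum(row["retinopathy"] for row in counts.values()) / total

print(round(p_given_light("bright"), 2))   # 0.86
print(round(p_given_light("reduced"), 2))  # 0.54
print(round(p_retinopathy, 2))             # 0.65
```

Because each conditional probability differs from the marginal 0.65, the two events are not independent in this sample.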
Example
          Optic-nerve degeneration
Sex       Present    Not present
Female    4          1
Male      4          1
Solution
P(Optic-nerve degeneration | Female)
= (No. of females with optic-nerve degeneration) / (No. of females)
= 4/5 = 0.80
P(Optic-nerve degeneration) = (4 + 4)/10 = 0.80
Light exposure    Retinopathy    No retinopathy    Total
Bright light      18             3                 21
Reduced light     21             18                39
Total             39             21                60
• Are the events “presence of retinopathy” and “light
exposure” independent or dependent for this sample?
If independent, P(Retinopathy | Bright light) = P(Retinopathy)
18/21 ≠ 39/60 (0.86 ≠ 0.65), so the events are dependent
If independent, P(Retinopathy | Reduced light) = P(Retinopathy)
21/39 ≠ 39/60 (0.54 ≠ 0.65), so the events are dependent
Example
• Culture and Gonodectin (GD) test results for 240
Urethral discharge specimens
Culture result
GD test result Gonorrhea No Gonorrhea Total
• What is the probability that a man has gonorrhea?
– P(gonorrhea) = (No. of persons with gonorrhea) / (Total No. of sampled persons)
= 183/240
= 0.76
– P(GD test positive | gonorrhea) = (No. of true positives) / (No. of persons with gonorrhea)
= 175/183
= 0.96 = 96%
– N.B: True positives = Positive test result and the disease
Probability distribution
• It is a device used to describe the behavior that a
random variable may have by applying the theory of
probability
∑ P(X = x) = 1
– Poisson distribution
• The following data show the number of diagnostic
services a patient receives
No. of services    P(X = x)
0                  0.671
1                  0.229
2                  0.053
3                  0.031
4                  0.010
5                  0.006
• What is the probability that a patient receives exactly
3 diagnostic services?
P(X=3) = 0.031
• What is the probability that a patient receives at most
one diagnostic service (i.e., less than or equal to one)?
P (X≤1) = P(X = 0) + P(X = 1)
= 0.671 + 0.229
= 0.900
• The value 0.016 matches the probability of four or more services:
P(X ≥ 4) = P(X = 4) + P(X = 5)
= 0.010 + 0.006
= 0.016
The Expected Value of a Discrete Random variable
• The average value assumed by a random variable is
called its expected value, or the population mean
• It is represented by E(X) or µ
• To obtain the expected value of a discrete random
variable X, we multiply each possible outcome by its
associated probability and sum all values with a
probability greater than 0
• For the diagnostic service data:
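As a sketch of that calculation, the snippet below multiplies each number of services by its probability and sums the products. The names `pmf` and `expected_value` are illustrative; the probabilities are the ones in the diagnostic service table.

```python
# Probability distribution of the number of diagnostic services, P(X = x)
pmf = {0: 0.671, 1: 0.229, 2: 0.053, 3: 0.031, 4: 0.010, 5: 0.006}

def expected_value(distribution):
    """E(X) = sum of x * P(X = x) over all possible outcomes x."""
    return sum(x * p for x, p in distribution.items())

mu = expected_value(pmf)
print(round(mu, 3))  # 0.498, i.e. about 0.5 services per patient
```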
The Variance of a Discrete Random Variable
For the diagnostic service data (µ = 0.5):
σ² = ∑ (xᵢ − µ)² P(X = xᵢ)
= (0 − 0.5)²(0.671) + (1 − 0.5)²(0.229)
+ (2 − 0.5)²(0.053) + (3 − 0.5)²(0.031)
+ (4 − 0.5)²(0.010) + (5 − 0.5)²(0.006)
= 0.782
Standard deviation = σ = √0.782 = 0.884
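The result can be cross-checked with the equivalent shortcut σ² = E(X²) − µ², which is a standard identity rather than the method shown on the slide. The variable names below are illustrative, and the unrounded mean is used instead of 0.5.

```python
import math

# Probability distribution of the number of diagnostic services, P(X = x)
pmf = {0: 0.671, 1: 0.229, 2: 0.053, 3: 0.031, 4: 0.010, 5: 0.006}

mean = sum(x * p for x, p in pmf.items())               # E(X)
mean_of_squares = sum(x**2 * p for x, p in pmf.items())  # E(X^2)

variance = mean_of_squares - mean**2  # shortcut form of the variance
sd = math.sqrt(variance)

print(round(variance, 3))  # 0.782
print(round(sd, 3))        # 0.884
```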
Binomial Distribution
It is one of the most widely encountered discrete
probability distributions
Consider a dichotomous (binary) random variable
It is based on the Bernoulli trial, named after James Bernoulli (1654–1705).
– When a single trial of an experiment can result in only
one of two mutually exclusive outcomes (success or
failure; dead or alive; sick or well, male/female, etc)
Example:
• If you take blood sample for HIV test, then
– Let Y= 1 positive and Y = 0 negative
Characteristics of a Binomial Distribution
–p : probability of success
Example:
• If the probability that any individual in the population
is a smoker is p = 0.40, then the probability of finding x = 4
smokers among n = 10 selected subjects is:
P(X = 4) = ¹⁰C₄ (0.4)⁴ (1 − 0.4)¹⁰⁻⁴
= ¹⁰C₄ (0.4)⁴ (0.6)⁶ = 210 (0.0256) (0.04666)
= 0.25
• The probability of obtaining exactly 4 smokers in the
sample is about 0.25.
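The same number can be reproduced with Python's standard library. This is a minimal sketch; the function name `binomial_pmf` is an illustrative choice, while `math.comb` supplies the number of combinations nCk.

```python
import math

def binomial_pmf(k, n, p):
    """P(X = k) = nCk * p^k * (1 - p)^(n - k) for a binomial random variable."""
    return math.comb(n, k) * p**k * (1 - p) ** (n - k)

# Smoker example: n = 10 subjects, p = 0.40, exactly k = 4 smokers
p4 = binomial_pmf(4, 10, 0.40)
print(math.comb(10, 4))  # 210
print(round(p4, 2))      # 0.25
```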
Example:
• 70% of a certain population has been immunized for
polio. If a sample of size 50 is taken, what is the
“expected total number”, in the sample who have been
immunized?
– µ = np = 50(.70) = 35
• This tells us that “on the average” we expect to see 35
immunized subjects in a sample of 50 from this
population.
• If repeated samples of size 10 are selected from the
population of infants born, the mean number of children
per sample who survive to age 70 would be (given p=0.72)
µ = np = (10)(0.72) = 7.2
• The variance would be npq = (10)(0.72)(0.28) = 2.02 and
the SD would be:
√2.02 = 1.42
Exercise: Suppose that in a certain malarious area past
experience indicates that the probability that a person with a high
fever will be positive for malaria is 0.7. Consider 3 randomly
selected patients (with high fever) in that same area.
1) What is the probability that no patient will be positive for
malaria?
2) What is the probability that exactly one patient will be positive for
malaria?
3) What is the probability that exactly two of the patients will be
positive for malaria?
4) What is the probability that all patients will be positive for
malaria?
5) Find the mean and the SD of the probability distribution given
above.
Answer:
1) 0.027
2) 0.189
3) 0.441
4) 0.343
5) μ = 2.1 and σ = 0.794
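These answers can be checked by treating the number of positive patients as binomial with n = 3 and p = 0.7, using µ = np and σ = √(npq) as in the earlier slides. The helper name `binomial_pmf` and the variable names are illustrative.

```python
import math

def binomial_pmf(k, n, p):
    """P(X = k) for a binomial random variable with n trials and success probability p."""
    return math.comb(n, k) * p**k * (1 - p) ** (n - k)

n, p = 3, 0.7  # 3 febrile patients, P(positive for malaria) = 0.7

# Answers 1 to 4: P(X = 0), P(X = 1), P(X = 2), P(X = 3)
probs = [binomial_pmf(k, n, p) for k in range(n + 1)]
print([round(q, 3) for q in probs])  # [0.027, 0.189, 0.441, 0.343]

# Answer 5: mean = np and SD = sqrt(npq), where q = 1 - p
mean = n * p
sd = math.sqrt(n * p * (1 - p))
print(round(mean, 1), round(sd, 3))  # 2.1 0.794
```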
Continuous Probability Distributions
• Instead of assigning probabilities to specific outcomes of
the random variable X, probabilities are assigned to
ranges of values
– A continuous probability distribution describes how likely
it is that a continuous random variable takes values
within certain ranges
• The probability associated with any one particular value is
equal to 0
– Therefore, P(X=x) = 0
• The normal distribution is a theoretical, continuous
probability distribution whose equation is:
f(x) = [1 / (σ√(2π))] e^(−(x − µ)² / (2σ²)), for −∞ < x < +∞
Properties of the Normal Distribution
It is a probability distribution of a continuous variable. It
extends from minus infinity (−∞) to plus infinity (+∞).
It is symmetrical about its mean
The mean, the median and the mode are equal. It is
unimodal
The total area under the curve above the x-axis is 1
square unit
The curve never touches the x-axis, i.e., it is asymptotic
1. The mean μ tells you about location
– Increase μ: location shifts right
– Change only σ: location is unchanged
Perpendiculars of:
±1 SD contain about 68%;
±2 SD contain about 95%;
±3 SD contain about 99.7% of the area under the curve.
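These three percentages can be verified numerically. The sketch below relies on the identity P(−k < Z < k) = erf(k/√2) for a standard normal Z; the function name `area_within` is an illustrative choice.

```python
import math

def area_within(k):
    """Area under the normal curve within k standard deviations of the mean."""
    return math.erf(k / math.sqrt(2))

for k in (1, 2, 3):
    print(k, round(area_within(k), 4))
# 1 0.6827
# 2 0.9545
# 3 0.9973
```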
Standard Normal Distribution
Z – Transformation
Z = (x − µ) / σ
Z represents the Z-score for a given x value
• This process is known as standardization and gives
the position on a normal curve with µ = 0 and σ = 1,
i.e., the standard normal distribution (SND), Z
• A Z-score is the number of standard deviations that
a given x value is above or below the mean (µ)
• A z-score measures the distance of any observation
from its distribution’s mean in units of standard
deviation
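Standardization amounts to one subtraction and one division. In this sketch the function name `z_score` is illustrative, and the example values (µ = 120, σ = 10) are the systolic blood pressure figures used in the solutions later in the deck.

```python
def z_score(x, mu, sigma):
    """Number of standard deviations that x lies above (+) or below (-) the mean."""
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    return (x - mu) / sigma

print(z_score(130, 120, 10))  # 1.0  (one SD above the mean)
print(z_score(100, 120, 10))  # -2.0 (two SDs below the mean)
```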
The total area under the curve is 1.0, and the curve is
symmetric, so half is above the mean, half is below
Finding normal curve areas
• To find the area under any normal curve, we first find the
area
• Read the value of the area (P) from the body of the
table where the row and column intersect
• Values of P are in the form of a decimal point and
four places
Exercise
Example-1
Example-2
Solutions
• X ~ N(µ = 120, σ = 10)
• P(X > 130) = P(Z > (130 − 120)/10)
= P(Z>1)
= 0.1587
⇒ 15.9% of normal healthy individuals have a systolic
blood pressure greater than 130mm Hg.
P(100 < X < 140) = P((100 − 120)/10 < Z < (140 − 120)/10)
= P(−2 < Z < 2)
= 0.9544
• ⇒ 95.4% of normal healthy individuals have a systolic
blood pressure between 100 and 140 mm Hg.
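Instead of a printed table, the same areas can be obtained from `statistics.NormalDist` in the Python standard library (Python 3.8 or later). This is a sketch; the variable names are illustrative, and the last digit can differ slightly from table values because tables round Z and the area.

```python
from statistics import NormalDist

# Systolic blood pressure of healthy individuals: mean 120, SD 10 (mm Hg)
sbp = NormalDist(mu=120, sigma=10)

# P(X > 130) is the upper tail above 130
p_above_130 = 1 - sbp.cdf(130)

# P(100 < X < 140) is the difference of two cumulative areas
p_100_to_140 = sbp.cdf(140) - sbp.cdf(100)

print(round(p_above_130, 4))   # 0.1587
print(round(p_100_to_140, 4))  # 0.9545 (the table-based value is 0.9544)
```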
Example-3
• Assume that among diabetics the fasting blood glucose
level is approximately normally distributed with a mean of
105 mg per 100 ml and an SD of 9 mg per 100 ml.
a. P(90 < X < 125) = P((90 − 105)/9 < Z < (125 − 105)/9)
= P(−1.67 < Z < 2.22)
= 0.9393 (93.9%)
b. P(X < 87.4) = P(Z < (87.4 − 105)/9)
= P(Z < −1.96)
= 0.025 (2.5%)
c. The lower 10% of diabetics
• p = 0.1: find this probability in the
SND table and read off the corresponding Z-value
Z = −1.28 (since it is the lower 10%)
Z = (X − 105)/9
−1.28 = (X − 105)/9
X = (9 × −1.28) + 105
X = 93.48
d. The points that encompass the central 95% of diabetics
95% means Z from −1.96 to 1.96
The lower (−1.96) and the upper (1.96)
−1.96 = (X − 105)/9 and 1.96 = (X − 105)/9
X = (−1.96 × 9) + 105 and X = (1.96 × 9) + 105
X = 87.36 and X = 122.64
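Going from a probability back to a value of X is the inverse of the table lookup. The sketch below uses `NormalDist.inv_cdf` from the standard library; the variable names are illustrative. Small differences from hand calculations arise because tables round Z to two decimals (for example, Z = −1.28 gives 9 × (−1.28) + 105 = 93.48).

```python
from statistics import NormalDist

# Fasting blood glucose among diabetics: mean 105, SD 9 (mg per 100 ml)
glucose = NormalDist(mu=105, sigma=9)

# Part c: the value below which the lowest 10% of diabetics fall
x_lower_10 = glucose.inv_cdf(0.10)

# Part d: the two values enclosing the central 95% (2.5% in each tail)
lower_95 = glucose.inv_cdf(0.025)
upper_95 = glucose.inv_cdf(0.975)

print(round(x_lower_10, 2))                   # 93.47
print(round(lower_95, 2), round(upper_95, 2)) # 87.36 122.64
```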
Thank
you