
C-5 Probability

This document defines key concepts in elementary probability and probability distributions. It discusses probability as it relates to random events and outcomes, different definitions of probability including classical and relative frequency, mutually exclusive and independent events, and properties of probability including the sum of probabilities equaling 1. It also introduces probability distributions and differentiates between the binomial and normal distributions.

Uploaded by

Biruk Mengstie
Copyright
© © All Rights Reserved

Elementary Probability and Probability Distributions

01/23/2023 1
Learning objectives

At the end of this chapter, the student will be able to:
• Recognize the concepts and characteristics of probability and probability distributions
• Compute probabilities of events and conditional probabilities
• Differentiate between the binomial and normal distributions
• Explain the concepts and uses of the standard normal distribution
Probability

• The chance of observing a particular outcome, or the likelihood of an event
• Assumes a “stochastic” or “random” process, i.e., the outcome is not predetermined; there is an element of chance
• An outcome is a specific result of a single trial of a probability experiment
Con…
• Probability is the language of chance: it expresses how likely an event is to occur
• The frequentist definition is usually used in statistics
• It states that the probability of a particular event equals the proportion of times that the event would (or does) occur in a large number of similar repeated trials
• It has a value between 0 and 1
Con…

• Probability theory developed from the study of games of chance such as dice and cards
• Processes such as flipping a coin, rolling a die, or drawing a card from a deck are probability experiments
• Probability is central to the understanding of inferential statistics
Con…
• When can we talk about probability?
• When dealing with a process that has an uncertain outcome:
– Birth of a male or female child
– Tossing a coin
– A patient taking a certain drug (cured or not)
– The fate of a patient
– Etc.
Con…

• Probability theory is a foundation for statistical inference, and
• Allows us to draw conclusions about a population of patients based on information obtained from a sample of patients drawn from that population
Definitions of some terms commonly encountered in probability
• Experiment = any process with an uncertain outcome
• Each performance of an experiment is a trial, and its possible outcomes are events
• Event = something that may or may not happen when the experiment is performed
• Events are represented by uppercase letters such as A, B, C, etc.
• Sample space: the set of all possible outcomes of an experiment, for example {H, T} for a coin toss
• Probability
– Can be defined as the proportion of times an event occurs in a very large number of trials
• Probability of an event E
– A number between 0 and 1 representing the proportion of times that event E is expected to happen when the experiment is repeated over and over under the same conditions
Two Categories of Probability

1. Objective probability
2. Subjective probability

Objective probability:
– Classical probability
– Relative frequency probability
Classical Probability

• Is based on gambling ideas
• Definition: If an event can occur in N mutually exclusive and equally likely ways, and if m of these possess a characteristic E, then the probability of the occurrence of E is m/N
– P(E) = the probability of E = m/N
Con..
• If we roll a die, what is the probability of a 4 coming up?
– m = 1 (the face showing 4) and N = 6
– The probability of a 4 coming up is 1/6
• A coin toss has 2 possible outcomes, {H, T}:
P(H) = 0.5
P(T) = 0.5
SUM = 1.0
Relative Frequency Probability
• The proportion of times the event A occurs in a large number of trials repeated under essentially identical conditions
• Probabilities can be expressed as proportions that range from 0 to 1
– Definition: If a process is repeated a large number of times (n), and if an event with the characteristic E occurs m times, then the relative frequency of E, m/n, is approximately equal to the probability of E:
Probability of E = m/n
Con…

• If you toss a coin 100 times and heads comes up 40 times, P(H) = 40/100 = 0.4
• If we toss a coin 10,000 times and heads comes up 5562 times, P(H) = 5562/10,000 = 0.5562
• Therefore, the longer the series of trials (the larger the sample size), the closer the estimate tends to be to the true value (0.5)
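The relative frequency idea can be illustrated with a small simulation. This is only a sketch: the fair coin (true P(H) = 0.5), the function name, and the fixed random seed are assumptions made for the illustration, not part of the slides.

```python
import random

def relative_frequency_of_heads(n_tosses, seed=0):
    """Simulate n_tosses of a fair coin and return the proportion of heads."""
    rng = random.Random(seed)  # fixed seed (an assumption) so runs are repeatable
    heads = sum(rng.random() < 0.5 for _ in range(n_tosses))
    return heads / n_tosses

# The estimate m/n tends to settle near 0.5 as the number of tosses grows.
for n in (100, 10_000, 200_000):
    print(n, relative_frequency_of_heads(n))
```

With only 100 tosses the estimate can be noticeably off; with many more tosses it is typically much closer to 0.5.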
Example:
• The tuberculin skin test is a routine screening test to detect TB. The results can be categorized as positive, negative, or uncertain.
– If the probability of a positive test is 0.1, it means that if a large number of such tests were performed, about 10% would be positive.
• The actual percentage of positive tests will be increasingly close to 10% as more tests are performed.
Subjective Probability
• Personalistic (an opinion or judgment by a decision maker about the likelihood of an event)
• Personal assessment of which treatment, traditional or modern, is more effective in providing a cure
• Personal assessment of which sports team will win a match
• May also draw on classical and relative frequency methods to assess the likelihood of an event, but does not rely on the repeatability of any process
Mutually Exclusive Events

• Two events A and B are mutually exclusive if they cannot both happen at the same time
P(A ∩ B) = 0
– If A occurs, then B does not occur
– If B occurs, then A does not occur
Con…

Example:
– A coin toss cannot produce a head and a tail simultaneously
– The weight of an individual cannot be classified simultaneously as “underweight”, “normal”, and “overweight”
– Blood pressure reading: A = (DBP < 90) and B = (90 ≤ DBP < 95) cannot occur at the same time
Independent Events
• Two events A and B are independent if the probability of the first one happening is the same no matter how the second one turns out, or
• The outcome of one event has no effect on the occurrence or non-occurrence of the other

Example:
– The outcomes of the first and second coin tosses are independent
Dependent Events
• The occurrence of one event affects the probability of the other
– P(A ∩ B) ≠ P(A) × P(B)
• Example: Consider the DBP measurements from a mother and her first-born child.
Let:
– A = {mother’s DBP ≥ 95}
– B = {first-born child’s DBP ≥ 80}
• Suppose P(A ∩ B) = 0.05, P(A) = 0.1, and P(B) = 0.2. Then P(A ∩ B) = 0.05 > P(A) × P(B) = 0.02
• Therefore, events A and B are dependent
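The product check used in the DBP example can be sketched in a few lines. The function name and the tolerance are hypothetical choices for this illustration; the probabilities are the ones given on the slide.

```python
import math

def are_independent(p_a, p_b, p_a_and_b, tol=1e-9):
    """Return True when P(A and B) equals P(A) * P(B) up to a small tolerance."""
    return math.isclose(p_a_and_b, p_a * p_b, abs_tol=tol)

# Mother/child DBP example: P(A) = 0.1, P(B) = 0.2, P(A and B) = 0.05
print(are_independent(0.1, 0.2, 0.05))   # 0.05 differs from 0.1 * 0.2 = 0.02
# Two fair coin tosses: P(H on first) = P(H on second) = 0.5, P(both) = 0.25
print(are_independent(0.5, 0.5, 0.25))
```

A tolerance is used because floating-point products such as 0.1 * 0.2 are not stored exactly.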
Con…
• Dependent events
E1 = Rain forecasted on the news
E2 = Take umbrella to work
• The probability of the second event is affected by the occurrence of the first event
Intersection and union
• The intersection of two events A and B, A ∩ B, is the event that A and B happen simultaneously
– P(A and B) = P(A ∩ B)
• Let A represent the event that a randomly selected newborn is LBW (low birth weight), and B the event that he or she is from a multiple birth
• The intersection of A and B is the event that the infant is both LBW and from a multiple birth
Con….

• The union of A and B, A ∪ B, is the event that either A happens or B happens or they both happen simultaneously
– P(A or B) = P(A ∪ B)
• Here, the union of A and B is the event that the newborn is either LBW or from a multiple birth, or both
Properties of Probability

1. The numerical value of a probability always lies between 0 and 1, inclusive.
0 ≤ P(E) ≤ 1
• A value of 0 means the event cannot occur
• A value of 1 means the event will definitely occur
• A value of 0.5 means that the probability that the event will occur is the same as the probability that it will not occur
Con…

2. The sum of the probabilities of all mutually exclusive and exhaustive outcomes is equal to 1.
P(E1) + P(E2) + .... + P(En) = 1
3. For two mutually exclusive events A and B,
P(A or B) = P(A ∪ B) = P(A) + P(B)
• If they are not mutually exclusive:
P(A or B) = P(A) + P(B) - P(A and B)
Con..

4. The complement of an event A, denoted by Ā or Aᶜ, is the event that A does not occur
• It consists of all the outcomes in which event A does not occur
P(Ā) = P(not A) = 1 – P(A)
• Ā occurs only when A does not occur
• A and Ā are complementary events
Co…

• In the example of LBW, the complement of A is the event that a newborn is not LBW
• In other words, Ā is the event that the child weighs 2500 grams or more at birth
Basic Probability Rules

1. Addition rule
If events A and B are mutually exclusive:
• P(A or B) = P(A) + P(B)
• P(A and B) = 0
More generally:
• P(A or B) = P(A) + P(B) - P(A and B)
• P(A or B) is the probability that event A occurs, event B occurs, or both occur
Example: The probabilities below represent the years of schooling completed by mothers of newborn infants

Mother’s education    Probability
≤ 8 years             0.056
9 to 11 years         0.159
12 years              0.321
13 to 15 years        0.218
≥ 16 years            0.230
Not reported          0.016
Con…
• What is the probability that a mother has completed < 12 years of schooling?
– P(≤ 8 years) = 0.056 and
– P(9-11 years) = 0.159
• Since these two events are mutually exclusive,
– P(≤ 8 or 9-11) = P(≤ 8 ∪ 9-11)
= P(≤ 8) + P(9-11)
= 0.056 + 0.159
= 0.215
Con…

• What is the probability that a mother has completed 12 or more years of schooling?
– P(≥ 12) = P(12 or 13-15 or ≥ 16)
= P(12 ∪ 13-15 ∪ ≥ 16)
= P(12) + P(13-15) + P(≥ 16)
= 0.321 + 0.218 + 0.230
= 0.769
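The two schooling calculations apply the addition rule for mutually exclusive categories. A minimal sketch, assuming the category labels below (they are shorthand invented for this illustration) and the probabilities from the table:

```python
# Probabilities from the mother's-education table; the dictionary keys are
# hypothetical labels chosen for this sketch.
schooling = {
    "<=8": 0.056,
    "9-11": 0.159,
    "12": 0.321,
    "13-15": 0.218,
    ">=16": 0.230,
    "not reported": 0.016,
}

def prob_any(categories, dist=schooling):
    """Addition rule for mutually exclusive categories: sum their probabilities."""
    return sum(dist[c] for c in categories)

p_less_than_12 = prob_any(["<=8", "9-11"])
p_12_or_more = prob_any(["12", "13-15", ">=16"])
print(round(p_less_than_12, 3), round(p_12_or_more, 3))
```

Because the categories do not overlap, no intersection term has to be subtracted.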
• The following data are the results of electrocardiograms (ECGs) and radionuclide angiocardiograms (RAs) for 19 patients with post-traumatic myocardial contusions:
– 7 patients had both ECG and RA abnormalities
– 17 patients had an abnormal ECG
– 2 patients had an abnormal RA
Con…
• P(ECG abnormal and RA abnormal) = 7/19 = 0.37
• P(ECG abnormal or RA abnormal)
= P(ECG abnormal) + P(RA abnormal) - P(both ECG and RA abnormal)
= 17/19 + 2/19 - 7/19 = 12/19 = 0.63
• Note: If the intersection were not subtracted, the 7 patients whose ECGs and RAs are both abnormal would be counted twice.
Con…

2. Multiplication rule
– If A and B are independent events,
– Then P(A ∩ B) = P(A) × P(B)
• More generally (both independent and dependent events),
– P(A ∩ B) = P(A) P(B|A) = P(B) P(A|B)
– P(A and B) denotes the probability that A and B both occur at the same time
Example

• In tossing two coins, what is the probability that a head will occur both on the first coin and the second coin?
– Since the two events are independent,
– P(A ∩ B) = P(A) × P(B)
= ½ × ½ = ¼
Conditional Probability
• Refers to the probability of an event, given that another event is known to have occurred
– “What happened first is assumed”
• The conditional probability that event B occurs given that event A has already occurred is denoted P(B|A) and is defined as
P(B|A) = P(A ∩ B) / P(A)
• Provided that P(A) > 0
Example:

• A study investigating the effect of prolonged exposure to bright light on retina damage in premature infants:

                   Retinopathy
Light exposure     Yes    No    Total
Bright light       18     3     21
Reduced light      21     18    39
Total              39     21    60
Con…
• The probability of developing retinopathy is:
P(Retinopathy) = No. of infants with retinopathy / Total No. of infants
= (18+21)/(21+39)
= 39/60 = 0.65
Con…

• The conditional probability of retinopathy, given exposure to bright light, is:
• P(Retinopathy | exposure to bright light)
= No. of infants with retinopathy exposed to bright light / No. of infants exposed to bright light
= 18/21 = 0.86
Con…

• P(Retinopathy | exposure to reduced light)
= No. of infants with retinopathy exposed to reduced light / No. of infants exposed to reduced light
= 21/39 = 0.54
• The conditional probabilities suggest that premature infants exposed to bright light have a higher risk of retinopathy than premature infants exposed to reduced light
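The marginal and conditional probabilities for the retinopathy table can be reproduced directly from the cell counts. This is a sketch only; the dictionary layout and the function name are assumptions made for the illustration.

```python
# Cell counts from the light-exposure table: (exposure, retinopathy) -> infants
counts = {
    ("bright", "yes"): 18,
    ("bright", "no"): 3,
    ("reduced", "yes"): 21,
    ("reduced", "no"): 18,
}

def p_retinopathy(exposure=None):
    """P(retinopathy), or P(retinopathy | exposure) when an exposure is given."""
    cells = {k: v for k, v in counts.items() if exposure is None or k[0] == exposure}
    cases = sum(v for (_, status), v in cells.items() if status == "yes")
    return cases / sum(cells.values())

print(f"{p_retinopathy():.2f}")            # overall: 39/60
print(f"{p_retinopathy('bright'):.2f}")    # given bright light: 18/21
print(f"{p_retinopathy('reduced'):.2f}")   # given reduced light: 21/39
```

Conditioning simply restricts the denominator to the row of interest.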
Con…

• For independent events A and B:
– P(A|B) = P(A), so P(A and B) = P(A) P(B)
• For any events A and B (General Multiplication Rule):
– P(A and B) = P(A|B) P(B)
• Given that P(B) > 0
Example
• In a study of optic-nerve degeneration in Alzheimer’s disease, postmortem examinations were conducted on 10 Alzheimer’s patients. The following table shows the distribution of these patients according to sex and evidence of optic-nerve degeneration
• Are the events “patient has optic-nerve degeneration” and “patient is female” independent for this sample of 10 patients?

          Optic-nerve degeneration
Sex       Present   Not present
Female    4         1
Male      4         1
Solution
• P(Optic-nerve degeneration | Female) = No. of females with optic-nerve degeneration / No. of females
= 4/5 = 0.80
• P(Optic-nerve degeneration) = No. of patients with optic-nerve degeneration / Total No. of patients
= 8/10 = 0.80
• Since P(Optic-nerve degeneration | Female) = P(Optic-nerve degeneration), the events are independent for this sample.
• Recall the previous example, a study investigating the effect of prolonged exposure to bright light on retina damage in premature infants:

                   Retinopathy
Light exposure     Yes    No    Total
Bright light       18     3     21
Reduced light      21     18    39
Total              39     21    60
Con…
• Are the events “presence of retinopathy” and “light exposure” independent or dependent for this sample?
• If independent, P(Retinopathy | Bright light) = P(Retinopathy)
18/21 ≠ 39/60, i.e., 0.86 ≠ 0.65: dependent
• If independent, P(Retinopathy | Reduced light) = P(Retinopathy)
21/39 ≠ 39/60, i.e., 0.54 ≠ 0.65: dependent
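The same comparison of a conditional with a marginal probability works for any 2×2 table. The sketch below uses exact fractions to avoid rounding; the function name and the argument order (a, b in the first row, c, d in the second) are assumptions for this illustration.

```python
from fractions import Fraction

def conditional_vs_marginal(a, b, c, d):
    """For a 2x2 table [[a, b], [c, d]] whose first column counts 'event present',
    return (P(event | first row), P(event)) as exact fractions."""
    p_given_first_row = Fraction(a, a + b)
    p_event = Fraction(a + c, a + b + c + d)
    return p_given_first_row, p_event

# Optic-nerve degeneration by sex (rows: female, male)
optic = conditional_vs_marginal(4, 1, 4, 1)
# Retinopathy by light exposure (rows: bright, reduced)
retina = conditional_vs_marginal(18, 3, 21, 18)

print(optic, optic[0] == optic[1])     # equal, so independent in this sample
print(retina, retina[0] == retina[1])  # unequal, so dependent in this sample
```

Exact equality is a property of the sample only; whether the events are independent in the population is a question for statistical inference.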
Example
• Culture and Gonodectin (GD) test results for 240 urethral discharge specimens

                   Culture result
GD test result     Gonorrhea    No gonorrhea    Total
Positive           175          9               184
Negative           8            48              56
Total              183          57              240
Con…
• What is the probability that a man has gonorrhea?
– P(gonorrhea) = No. of persons with gonorrhea / Total No. of persons in the sample
= 183/240
= 0.76
Con…

• What is the probability that a man has a positive GD test, given that he has gonorrhea?
– P(test positive | they have the disease)
– P(+ve test | gonorrhea) = No. of persons who are true positives for the test / Total No. of persons with gonorrhea
= 175/183
= 0.96 = 96%
– N.B.: True positives = positive test result and have the disease
Con…

• What is the probability that a man has a negative GD test, given that he does not have gonorrhea?
– P(test negative | they don’t have the disease)
– P(-ve test | no gonorrhea) = No. of persons who are true negatives for the test / Total No. of persons without gonorrhea
= 48/57
= 0.84 = 84%
– N.B.: True negatives = negative test result and don’t have the disease
Con…

• What is the probability that a man has the disease (gonorrhea), given that the test reads positive?
• P(gonorrhea | the test reads positive)
= No. of persons who are true positives for the test / Total No. of persons with a positive test result
= 175/184
= 0.95 = 95%
Con…
• What is the probability that a man does not have the disease (gonorrhea), given that the test reads negative?
• P(no gonorrhea | the test reads negative)
= No. of persons who are true negatives for the test / Total No. of persons with a negative test result
= 48/56
= 0.86 = 86%
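The four conditional probabilities worked out for the GD test can be collected in one short calculation. The names sensitivity, specificity, and positive/negative predictive value are the usual terms for these quantities; the slides do not use them, so treat the labels as an addition for this sketch.

```python
# Counts from the culture vs. GD test table
tp, fp = 175, 9   # test positive: with gonorrhea, without gonorrhea
fn, tn = 8, 48    # test negative: with gonorrhea, without gonorrhea

total = tp + fp + fn + tn
prevalence = (tp + fn) / total    # P(gonorrhea) = 183/240
sensitivity = tp / (tp + fn)      # P(test + | gonorrhea) = 175/183
specificity = tn / (tn + fp)      # P(test - | no gonorrhea) = 48/57
ppv = tp / (tp + fp)              # P(gonorrhea | test +) = 175/184
npv = tn / (tn + fn)              # P(no gonorrhea | test -) = 48/56

for name, value in [("prevalence", prevalence), ("sensitivity", sensitivity),
                    ("specificity", specificity), ("PPV", ppv), ("NPV", npv)]:
    print(f"{name}: {value:.2f}")
```

Note that the first two conditionals divide by column totals (disease status), while the last two divide by row totals (test result).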
Probability Distributions
• Random Variable: Any quantity or characteristic
that is able to assume a number of different values
such that any particular outcome is determined by
chance
• The variable X is a random or stochastic variable since
the value that it takes is subject to chance
• Random variables can be categorical, discrete or
continuous
Con...

• A discrete random variable is able to assume only a finite or countable number of outcomes
• A continuous random variable can take on any value in a specified interval
• For categorical variables, we obtain the frequency distribution of each variable
Con...

• With numeric variables, the aim is to determine whether or not normality may be assumed
– If not, we may consider transforming the variable or categorizing it for analysis (e.g., age group)
Con…
Probability distribution
• It is a device used to describe the behavior that a random variable may have by applying the theory of probability
• The distribution of all possible values (outcomes) of a random variable along with their respective probabilities
Con…
• It is a list of the probabilities associated with the
values of the random variable obtained in an
experiment
• Therefore, the probability distribution of a random
variable is a table, graph, or mathematical formula that
gives the probabilities with which the random variable
takes different values or ranges of values.
Con…

• It is a function that assigns a probability to each possible value of a random variable
• A probability distribution defines the relationship between the outcomes and their likelihood of occurrence
• Probabilities and probability distributions are nothing more than extensions of the ideas of relative frequency and histograms, respectively
Discrete Probability Distributions

• A discrete random variable is a variable that can assume only a countable number of values
• Many possible outcomes:
– No. of patients tested for HIV
– No. of patients attending a health facility per day
• Only two possible outcomes:
– Yes or no responses, positive or negative test result
– Gender: male or female
Con…
• For a discrete random variable, the probability distribution specifies each of the possible outcomes of the random variable along with the probability that each will occur
• The relationship between the values and their associated probabilities is called the probability mass function
• Examples can be:
– Frequency distribution
– Relative frequency distribution
– Cumulative frequency distribution
Con….
Properties of the probability distribution of a discrete random variable
0 ≤ P(X = x) ≤ 1
∑ P(X = x) = 1

Examples of probability distributions for discrete random variables:
– Binomial distribution
– Poisson distribution
Con…
• The following data show the number of diagnostic services a patient receives

No. of services (x)    P(X = x)
0                      0.671
1                      0.229
2                      0.053
3                      0.031
4                      0.010
5                      0.006

• The sum of all the individual probabilities is 1
Con…
• The cumulative probability distribution of X, F(x):
– It shows the probability that the variable X is less than or equal to a certain value: F(x) = P(X ≤ x)
Con…
• What is the probability that a patient receives exactly 3 diagnostic services?
P(X = 3) = 0.031
• What is the probability that a patient receives at most one diagnostic service (i.e., one or fewer)?
P(X ≤ 1) = P(X = 0) + P(X = 1)
= 0.671 + 0.229
= 0.900
Con…

• What is the probability that a patient receives at least four diagnostic services?
P(X ≥ 4) = P(X = 4) + P(X = 5)
= 0.010 + 0.006
= 0.016
The Expected Value of a Discrete Random variable

• If a random variable is able to take on a large number of values, then a probability mass function might not be the most useful way to summarize its behavior
• Instead, measures of location and dispersion can be calculated (as long as the data are not categorical)
• The average value assumed by a random variable is
called its expected value, or the population mean
• It is represented by E(X) or µ
• To obtain the expected value of a discrete random
variable X, we multiply each possible outcome by its
associated probability and sum all values with a
probability greater than 0

• For the diagnostic service data:
Mean (X) = 0(0.671) + 1(0.229) + 2(0.053) + 3(0.031) + 4(0.010) + 5(0.006)
= 0.498 ≈ 0.5
• We would expect an average of 0.5 services for each visit
The Variance of a Discrete Random Variable

• The variance of a random variable X is called the population variance and is represented by Var(X) or σ²
• It quantifies the dispersion of the possible outcomes of X around the expected value μ
For the diagnostic service data (µ = 0.5):
σ² = ∑ (xi − µ)² P(X = xi)
= (0 − 0.5)²(0.671) + (1 − 0.5)²(0.229)
+ (2 − 0.5)²(0.053) + (3 − 0.5)²(0.031)
+ (4 − 0.5)²(0.010) + (5 − 0.5)²(0.006)
= 0.782
Standard deviation = σ = √0.782 = 0.884
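The mean, variance, and standard deviation of the diagnostic service data can be checked with a short script. This is a sketch; the variable names are chosen for the illustration, and the variance here uses the unrounded mean 0.498 rather than 0.5, which gives the same result to three decimals.

```python
from math import sqrt

# Probability mass function from the diagnostic services table: x -> P(X = x)
pmf = {0: 0.671, 1: 0.229, 2: 0.053, 3: 0.031, 4: 0.010, 5: 0.006}

mean = sum(x * p for x, p in pmf.items())                    # E(X)
variance = sum((x - mean) ** 2 * p for x, p in pmf.items())  # Var(X)
sd = sqrt(variance)

# Cumulative probabilities use the same table, e.g. P(X <= 1)
p_at_most_one = sum(p for x, p in pmf.items() if x <= 1)

print(round(mean, 3), round(variance, 3), round(sd, 3), round(p_at_most_one, 3))
```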
Binomial Distribution
• It is one of the most widely encountered discrete probability distributions
• It applies to a dichotomous (binary) random variable
• It is based on the Bernoulli trial, named after James Bernoulli (1654–1705)
– A Bernoulli trial is a single trial of an experiment that can result in only one of two mutually exclusive outcomes (success or failure, dead or alive, sick or well, male or female, etc.)
Example:
• If you take a blood sample for an HIV test, then
– Let Y = 1 if positive and Y = 0 if negative
• Suppose you are also interested in determining whether a newborn infant will survive until his/her 70th birthday
• Let Y represent the survival status of the child at age 70 years
– Y = 1 if the child survives and Y = 0 if he/she does not
Con…
• The outcomes are mutually exclusive and exhaustive
• Suppose that 72% of infants born survive to age 70 years
• P(Y = 1) = p = 0.72
• P(Y = 0) = 1 − p = 0.28
• The probability distribution of Y is:

Y = y    P(Y = y)
0        0.28
1        0.72
• A binomial probability distribution occurs when the following requirements are met:
1. The procedure has a fixed number of trials (n)
2. The trials must be independent
3. Each trial must have outcomes that fall into one of two categories (dichotomous outcome)
4. The probabilities must remain constant for each trial [P(success) = p]
Characteristics of a Binomial Distribution
• The experiment consists of n identical trials
• Only two outcomes are possible on each trial
• The probability of A (success), denoted by p, remains the same from trial to trial
• The probability of B (failure) is denoted by q (q = 1 - p)
• The trials are independent
• n and p are the parameters of the binomial distribution
• The mean is np and the variance is np(1 - p)
• If an experiment is repeated n times and the outcome is independent from one trial to another, the probability that the outcome of interest occurs exactly x times is:
P(X = x) = nCx p^x q^(n-x), for x = 0, 1, ..., n
– n: denotes the fixed number of trials
– x: denotes the number of successes in the n trials
– p: denotes the probability of success
– q: denotes the probability of failure (1 - p)
• Where nCx is the number of combinations of n distinct objects taken x at a time:
nCx = n! / [x! (n - x)!]
– x! = x (x-1) (x-2) ... (1)
– Note: 0! = 1 (by definition)
Con…
• The probability of x successes in n trials, with probability of success p in each trial, is:
P(X = x) = nCx p^x q^(n-x)
– n: number of trials (sample size)
– x: number of successes in the n trials
– p: probability of success
– q: probability of failure (1 - p)
Example:

• Suppose we know that 40% of a certain population are cigarette smokers. If we take a random sample of 10 people from this population, what is the probability that we will have exactly 4 smokers in our sample?
• If the probability that any individual in the population is a smoker is p = 0.40, then the probability of x = 4 smokers among n = 10 subjects selected is:
P(X = 4) = 10C4 (0.4)^4 (1 - 0.4)^(10-4)
= 10C4 (0.4)^4 (0.6)^6 = 210 (0.0256)(0.04666)
= 0.25
• The probability of obtaining exactly 4 smokers in the sample is about 0.25.
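The binomial calculation for the smoker example can be sketched with the standard library. The helper names `binom_pmf` and `binom_cdf` are hypothetical, chosen for this illustration; `math.comb` computes nCx and needs Python 3.8 or later.

```python
from math import comb

def binom_pmf(x, n, p):
    """P(X = x) for a binomial variable: nCx * p^x * (1 - p)^(n - x)."""
    return comb(n, x) * p ** x * (1 - p) ** (n - x)

def binom_cdf(x, n, p):
    """P(X <= x): sum of the individual probabilities for 0, 1, ..., x."""
    return sum(binom_pmf(k, n, p) for k in range(x + 1))

# 40% smokers, random sample of 10 people
print(comb(10, 4))                       # number of combinations, 210
print(round(binom_pmf(4, 10, 0.4), 4))   # exactly 4 smokers
print(round(binom_cdf(3, 10, 0.4), 4))   # 3 or fewer smokers
```

The cumulative value is simply the running total of the individual probabilities, which is how a cumulative column in a binomial table is built.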
Con…

• We can compute the probability of observing zero smokers out of 10 subjects selected at random, exactly 1 smoker, and so on, and display the results in a table, as given (next slide)
• E.g., the third column of the table, P(X ≤ x), gives cumulative probabilities; the probability of selecting 3 or fewer smokers in the sample of 10 subjects is
– P(X ≤ 3) = 0.3823, or about 38%
Exercise

• Each child born to a particular set of parents has a probability of 0.25 of having blood type O. If these parents have 5 children, what is the probability that:
– Exactly two of them have blood type O
– At most 2 have blood type O
– At least 4 have blood type O
– 2 do not have blood type O
The mean and variance of binomial distribution
• Once n and p are specified, the distribution is fully determined; the observed proportion of successes in a sample is x/n
• The mean and variance of the distribution are given by:
E(X) = μ = np, σ² = npq, σ = √(npq)
Example:
• 70% of a certain population has been immunized for polio. If a sample of size 50 is taken, what is the “expected total number” in the sample who have been immunized?
– µ = np = 50(0.70) = 35
• This tells us that “on the average” we expect to see 35 immunized subjects in a sample of 50 from this population.
Con…
• If repeated samples of size 10 are selected from the population of infants born, the mean number of children per sample who survive to age 70 would be (given p = 0.72)
µ = np = (10)(0.72) = 7.2
• The variance would be npq = (10)(0.72)(0.28) = 2.02 and the SD would be:
√2.02 = 1.42
• Exercise: Suppose that in a certain malarious area past experience indicates that the probability that a person with a high fever will be positive for malaria is 0.7. Consider 3 randomly selected patients (with high fever) in that same area.
1) What is the probability that no patient will be positive for malaria?
2) What is the probability that exactly one patient will be positive for malaria?
3) What is the probability that exactly two of the patients will be positive for malaria?
4) What is the probability that all patients will be positive for malaria?
5) Find the mean and the SD of the probability distribution given above.
Con…

Answer:
1) 0.027
2) 0.189
3) 0.441
4) 0.343
5) μ = 2.1 and σ = 0.794
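The answers above follow from the binomial formula with n = 3 and p = 0.7. A minimal sketch for checking them, with variable names chosen for this illustration:

```python
from math import comb, sqrt

n, p = 3, 0.7  # three patients, P(positive for malaria) = 0.7

# P(X = x) for x = 0, 1, 2, 3 positive patients
probabilities = [comb(n, x) * p ** x * (1 - p) ** (n - x) for x in range(n + 1)]

mean = n * p                  # np
sd = sqrt(n * p * (1 - p))    # square root of npq

print([round(pr, 3) for pr in probabilities])
print(round(mean, 1), round(sd, 3))
```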

Continuous Probability Distributions
• A continuous random variable X can take on any value in a specified interval or range
• With a large number of class intervals, the frequency polygon begins to resemble a smooth curve
• The probability distribution of X is represented by a smooth curve called a probability density function
Con…
• Instead of assigning probabilities to specific outcomes of the random variable X, probabilities are assigned to ranges of values
– A continuous probability distribution describes how likely it is that a continuous random variable takes values within certain ranges
• The probability associated with any one particular value is equal to 0
– Therefore, P(X = x) = 0
– Also, P(X ≥ x) = P(X > x)
Con…
• Probability distributions for a continuous random variable differ from discrete distributions in several ways:
– The variable can take on any value within its range, not just integers
– The probability of any specific value is zero
– Probabilities are expressed in terms of an area under a curve (probability is measured as the area under the curve)
Con…

• Thus, the probability that X will assume some value in the interval between two points, say x1 and x2, is the area under the curve between x1 and x2
• As a continuous variable can take an infinite number of values, it helps to visualize the probability distribution as a curve and probabilities as ‘area under the curve’
• The most widely used continuous distribution is the normal distribution
The Normal Distribution
• The normal distribution (ND) is the most important probability distribution in statistics
• Frequently called the “Gaussian distribution” or bell-shaped curve
• A normal distribution is a continuous, symmetric, bell-shaped distribution of a variable
• Variables such as blood pressure, weight, height, serum cholesterol level, and IQ score are approximately normally distributed
Con…

• The ND is vital to statistical work; most estimation procedures and hypothesis tests are based on it
• The concept of “probability of X = x” in the discrete probability distribution is replaced by the “probability density function” f(x)
Con…
• The normal distribution is a theoretical, continuous probability distribution whose equation is:
f(x) = [1 / (σ√(2π))] e^(-(x - µ)² / (2σ²)), for -∞ < x < +∞
Properties of the Normal Distribution
• It is a probability distribution of a continuous variable. It extends from minus infinity (-∞) to plus infinity (+∞)
• It is symmetrical about its mean
• The mean, the median, and the mode are equal. It is unimodal
• The total area under the curve and above the x-axis is 1 square unit
• The curve never touches the x-axis, i.e., it is asymptotic
Con…

• The distribution is completely determined by the parameters μ and σ
• Changing μ alone shifts the entire normal curve to the left or right
• Changing σ alone changes the degree to which the distribution is spread out
1. The mean μ tells you about location
– Increase μ: location shifts right
– Decrease μ: location shifts left
– Shape is unchanged
2. The standard deviation tells you about the narrowness or flatness of the bell
– Increase standard deviation: bell flattens
• Extreme values are more likely
– Decrease standard deviation: bell narrows
• Extreme values are less likely
– Location is unchanged
Con…

• If the variable X has a normal distribution, the probability that X has a value within:
– One standard deviation of the mean is 0.68
– Two standard deviations of the mean is 0.95
– Three standard deviations of the mean is 0.997
Perpendiculars of:
±1 SD contain about 68%;
±2 SD contain about 95%;
±3 SD contain about 99.7% of the area under the curve.

Con…

• 68% of the values of X fall within one standard deviation of the mean,
• 95% of the values are found within two standard deviations of the mean, and
• 99.7% of the values are found within three standard deviations of the mean
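The 68%, 95%, and 99.7% figures can be verified numerically. The sketch below uses `statistics.NormalDist` from the Python standard library (Python 3.8 or later); the helper name `prob_within` is an assumption for this illustration.

```python
from statistics import NormalDist

standard_normal = NormalDist(mu=0, sigma=1)

def prob_within(k):
    """Probability that a normal variable lies within k standard deviations of its mean."""
    return standard_normal.cdf(k) - standard_normal.cdf(-k)

for k in (1, 2, 3):
    print(k, round(prob_within(k), 4))
```

The result is the same for any μ and σ, because the area depends only on the distance from the mean measured in standard deviations.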
Con…

• We have different normal distributions depending on the values of the mean and standard deviation
• We cannot tabulate every possible distribution
• Tabulated normal probability calculations are available only for the ND with μ = 0 and σ = 1
Standard Normal Distribution

• It is a normal distribution that has a mean equal to 0 and a SD equal to 1, and is denoted by N(0, 1)
• The main idea is to standardize all the data that are given by using Z-scores
• These Z-scores can then be used to find the area (and thus the probability) under the normal curve
Z – Transformation
• If a random variable X ~ N(µ, σ), then we can transform it to a standard normal distribution (SND) with the help of the Z transformation
• Translate from x to the standard normal (the “z” distribution) by subtracting the mean of x and dividing by its standard deviation:
Z = (X - µ) / σ
• Z represents the Z-score for a given x value
Con…

• If x is normally distributed with a mean of 100 and a standard deviation of 50, the z value for x = 250 is:
z = (250 - 100) / 50 = 3
• This says that x = 250 is three standard deviations (3 increments of 50 units) above the mean of 100
Con…
• This process is known as standardization and gives the position on a normal curve with µ = 0 and σ = 1, i.e., the SND, Z
• A Z-score is the number of standard deviations that a given x value is above or below the mean (µ)
• A z-score measures the distance of any observation from its distribution’s mean in units of standard deviation
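Standardization is a one-line calculation. A minimal sketch, with a hypothetical function name and a guard against a non-positive standard deviation added as an assumption:

```python
def z_score(x, mu, sigma):
    """Number of standard deviations that x lies above (+) or below (-) the mean."""
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    return (x - mu) / sigma

print(z_score(250, 100, 50))  # 3.0: three SDs above the mean
print(z_score(50, 100, 50))   # -1.0: one SD below the mean
```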
The total area under the curve is 1.0, and the curve is
symmetric, so half is above the mean, half is below

Finding normal curve areas

• The table gives different areas of probability
• To find the area under any normal curve, we first find the z-score of the normal random variable by using the formula Z = (X - µ) / σ
• Then we use a table to find the area
• Draw a standard normal curve and shade the desired area
Con…

• Find the z value in tenths in the column at the left margin and locate its row
• Find the hundredths place in the appropriate column
• Read the value of the area (P) from the body of the table where the row and column intersect
• Values of P are in the form of a decimal point and four places
Exercise

• If Z = 0.10, the area to the left of Z is ______.
• If Z = 1.14, the area to the right of Z is ______.
• If Z = -1.14, the area to the left of Z is ______.
Example-1

• Suppose that total carbohydrate intake in 12-14-year-old males is normally distributed with mean 124 g/1000 cal and SD 20 g/1000 cal
A) What percent of boys in this age range have carbohydrate intake above 140 g/1000 cal?
B) What percent of boys in this age range have carbohydrate intake below 90 g/1000 cal?
Con…

• Let X be carbohydrate intake in 12-14-year-old males, with X ∼ N(124, 20), i.e., µ = 124 and σ = 20
a) P(X > 140) = P(Z > (140-124)/20) = P(Z > 0.8)
= 0.2119 (21.2%)
b) P(X < 90) = P(Z < (90-124)/20) = P(Z < -1.7)
= 0.0446 (4.5%)
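The table look-ups in this example can be cross-checked in software. The sketch below uses `statistics.NormalDist` (Python 3.8 or later); the variable names are assumptions for the illustration.

```python
from statistics import NormalDist

# Carbohydrate intake in g/1000 cal: mean 124, SD 20
intake = NormalDist(mu=124, sigma=20)

p_above_140 = 1 - intake.cdf(140)  # area to the right of z = 0.8
p_below_90 = intake.cdf(90)        # area to the left of z = -1.7

print(round(p_above_140, 4), round(p_below_90, 4))
```

`cdf` always returns the area to the left of a value, so an upper-tail probability is obtained by subtracting from 1.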
Example-2

• Data collected on systolic blood pressure (SBP) in normal healthy individuals are normally distributed with μ = 120 mmHg and σ = 10 mmHg
1) What proportion of normal healthy individuals have a SBP above 130 mmHg?
2) What proportion of normal healthy individuals have a SBP between 100 and 140 mmHg?
Solutions
• X ~ N(120, 10)
• P(X > 130) = P(Z > (130-120)/10)
= P(Z > 1)
= 0.1587
⇒ 15.9% of normal healthy individuals have a systolic blood pressure greater than 130 mmHg.
• P(100 < X < 140) = P((100-120)/10 < Z < (140-120)/10)
= P(-2 < Z < 2)
= 0.9544
⇒ 95.4% of normal healthy individuals have a systolic blood pressure between 100 and 140 mmHg.
Example-3
• Assume that among diabetics the fasting blood level of glucose is approximately normally distributed with a mean of 105 mg per 100 ml and an SD of 9 mg per 100 ml.
a) What proportion of diabetics have levels between 90 and 125 mg per 100 ml?
b) What proportion of diabetics have levels below 87.4 mg per 100 ml?
c) What level cuts off the lower 10% of diabetics?
d) What are the two levels which encompass 95% of diabetics?
Con…

a. P(90 < X < 125) = P((90-105)/9 < Z < (125-105)/9)
= P(-1.67 < Z < 2.22)
= 0.9393 (93.9%)
b. P(X < 87.4) = P(Z < (87.4-105)/9)
= P(Z < -1.96)
= 0.025 (2.5%)
Con…
c. The lower 10% of diabetics
• p = 0.1: find this probability in the SND table and read off the corresponding Z-value
Z = -1.28 (since it is the lower 10%)
Z = (X - 105)/9
-1.28 = (X - 105)/9
X = (9 × -1.28) + 105
X = 93.48
Con…
d. The points that encompass the central 95% of diabetics
95% corresponds to Z from -1.96 to 1.96
The lower point (Z = -1.96) and the upper point (Z = 1.96):
-1.96 = (X - 105)/9 and 1.96 = (X - 105)/9
X = (-1.96 × 9) + 105 and X = (1.96 × 9) + 105
X = 87.36 and X = 122.64
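Parts c and d run the table look-up in reverse: start from a probability and recover the value of X. A sketch using `statistics.NormalDist.inv_cdf` (Python 3.8 or later), with variable names chosen for this illustration; small differences from the hand calculation come from rounding z to two decimals in the table.

```python
from statistics import NormalDist

# Fasting glucose in diabetics, mg per 100 ml: mean 105, SD 9
glucose = NormalDist(mu=105, sigma=9)

lower_10_cutoff = glucose.inv_cdf(0.10)   # level below which 10% of diabetics fall
central_95 = (glucose.inv_cdf(0.025), glucose.inv_cdf(0.975))

print(round(lower_10_cutoff, 2))
print(tuple(round(v, 2) for v in central_95))
```

`inv_cdf(p)` returns the value x for which P(X ≤ x) = p, which is the same as solving X = µ + Zσ with the Z-value read from the table.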
Thank you