
VI - Probability Distributions

This document discusses probability distributions, which describe the behavior of random variables. It covers discrete distributions like the binomial distribution and continuous distributions like the normal distribution. The key points are: 1) Probability distributions can be discrete (take countable values) or continuous (take infinitely many intermediate values). The normal distribution is the most important continuous distribution. 2) Distributions are defined by their parameters, like the mean (μ) and standard deviation (σ) for the normal distribution. These determine the shape of the distribution. 3) For a normal distribution, about 68%, 95%, and 99.7% of values fall within 1, 2, and 3 standard deviations of the mean, respectively. The standard normal distribution has μ

PROBABILITY DISTRIBUTIONS
Probability Distributions

• A probability distribution is a device used to describe the behaviour that a random variable may have by applying the theory of probability.
• It is a listing of all the outcomes of an experiment and their associated probabilities.
• Random Variable = Any quantity or characteristic
that assumes a number of different values such that
any particular outcome is determined by chance.
Cont’d…
• Therefore, the probability distribution of a
random variable is a table, graph, or
mathematical formula that gives the
probabilities with which the random variable
takes different values or ranges of values.

• The distribution can be discrete or continuous.
Discrete Probability
Distributions
• A discrete random variable is a variable that can assume only a countable number of values.
• Many possible outcomes:
  – No. of children per woman
  – No. of students in a class
  – No. of patients attending a health facility per day
• Only two possible outcomes:
  – Gender: male or female
  – Tossing a coin: head or tail
  – Yes or no responses
Example: frequency distribution of the number of babies a mother has had.
No. of children    Count    Percent
1                  6076     43.1
2                  4267     30.2
3                  1977     14.0
4                  955      6.8
5                  479      3.4
6                  216      1.5
7                  96       0.7
8                  44       0.31
• However, with too many categories it becomes difficult to use a frequency distribution to describe the probability distribution.
• Instead we use theoretical probability distributions.
Binomial distribution
• One of the most widely encountered discrete probability distributions.
• A random variable that can take only two possible outcomes (dichotomous/binary) is called a Bernoulli random variable.
• It is based on the Bernoulli trial, named after James Bernoulli (1654-1705).
• A Bernoulli trial is a single trial of an experiment that can result in only one of two mutually exclusive outcomes (success/failure, female/male, dead/alive, etc.).
Example
Toss a fair coin three times and define x = number of heads.
Outcome    Probability    x
HHH        1/8            3
HHT        1/8            2
HTH        1/8            2
THH        1/8            2
HTT        1/8            1
THT        1/8            1
TTH        1/8            1
TTT        1/8            0
Probability distribution of x:
x    p(x)
0    1/8
1    3/8
2    3/8
3    1/8
[Figure: probability histogram for x]
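The coin-toss distribution above can be reproduced by brute-force enumeration. The following is a minimal sketch in Python (the variable names are illustrative and not part of the slides); it lists the eight equally likely outcomes and counts the heads in each.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# Enumerate the 2**3 = 8 equally likely outcomes of three fair tosses.
outcomes = ["".join(t) for t in product("HT", repeat=3)]

# Count how many outcomes give each value of x = number of heads.
counts = Counter(o.count("H") for o in outcomes)

# p(x) = (number of outcomes with x heads) / (total number of outcomes).
pmf = {x: Fraction(c, len(outcomes)) for x, c in sorted(counts.items())}

for x, p in pmf.items():
    print(f"P(x = {x}) = {p}")
```

Using exact fractions keeps the result in the same 1/8, 3/8 form as the table.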
• Let X be a random variable indicating HIV status: X = 1 if an individual is HIV infected and X = 0 if HIV uninfected.
• Suppose that HIV prevalence among pregnant mothers in country X is 22%.
• P(X=1) = p = 0.22, and
• P(X=0) = (1-p) = 0.78
• Suppose we select 2 individuals and let Y be the number that are HIV+.
• Then Y can take values 0, 1 or 2.
• The outcome of this trial can be presented as:
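The table for this slide did not survive in the text. As a hedged sketch, assuming the two individuals are selected independently and each is HIV+ with probability 0.22, the distribution of the number who are HIV+ can be worked out by listing the four possible status pairs (the names `p` and `dist` are illustrative):

```python
from itertools import product

p = 0.22  # prevalence quoted on the slide

# dist[k] accumulates the probability that k of the 2 individuals are HIV+.
dist = {0: 0.0, 1: 0.0, 2: 0.0}
for a, b in product([0, 1], repeat=2):
    # Independence assumption: multiply the two individual probabilities.
    prob = (p if a else 1 - p) * (p if b else 1 - p)
    dist[a + b] += prob

for k, prob in dist.items():
    print(f"P(number HIV+ = {k}) = {prob:.4f}")
```

The middle value collects two pairs, (1, 0) and (0, 1), which is where the factor of 2 in 2p(1-p) comes from.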
Assumptions for Binomial
distribution
• The experiment has a fixed number of trials n
• The probability of success p is constant for each trial
• Each trial results in one of two mutually exclusive outcomes
• The outcomes of the n trials are independent
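Formula
• $P(X = x) = \binom{n}{x} p^{x} (1 - p)^{n - x}$ for x = 0, 1, ..., n, where n is the number of trials and p is the probability of success on each trial.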
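Example:
• Suppose we know that 40% of a certain population are cigarette smokers. If we take a random sample of 10 people from this population, what is the probability that we will have exactly 4 smokers in our sample?
Example:
• With n = 10, p = 0.4 and x = 4: $P(X = 4) = \binom{10}{4} (0.4)^{4} (0.6)^{6} = 210 \times 0.0256 \times 0.046656 \approx 0.2508$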
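The smoker example is a direct application of the binomial probability P(X = x) = C(n, x) p^x (1 - p)^(n - x). A minimal Python sketch follows; the function name `binomial_pmf` is an illustrative choice, not something defined in the slides.

```python
from math import comb

def binomial_pmf(x, n, p):
    """P(X = x) for X ~ Binomial(n, p): C(n, x) * p^x * (1 - p)^(n - x)."""
    return comb(n, x) * p**x * (1 - p)**(n - x)

# 40% smokers, sample of 10 people, exactly 4 smokers.
p_four_smokers = binomial_pmf(4, 10, 0.4)
print(round(p_four_smokers, 4))  # 0.2508
```

Summing `binomial_pmf(x, 10, 0.4)` over x = 0, ..., 10 gives 1, which is a quick sanity check on any implementation.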
Class Exercise 1
• Suppose that a fair coin was tossed 6 times; the
probability of heads coming up is 0.5. What is
the probability of Heads coming up exactly 3
times in this trial?
Class Exercise 2
• Each child born to a particular set of parents has a
probability of 0.25 of having blood type O. If these
parents have 5 children.
• What is the probability that:
• a. Exactly 2 of them have blood type O?
• b. At most 2 have blood type O?
• c. At least 4 have blood type O?
• d. 2 do not have blood type O?
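Answer
• a. P(X = 2) ≈ 0.264
• b. P(X ≤ 2) = P(0) + P(1) + P(2) ≈ 0.896
• c. P(X ≥ 4) = P(4) + P(5) ≈ 0.016
• d. Reading "2 do not have blood type O" as exactly 3 of the 5 children having type O: P(X = 3) ≈ 0.088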
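Both class exercises can be checked numerically. The sketch below assumes the binomial model applies (independent trials, constant p) and reads part (d) as "exactly 3 of the 5 children have type O"; that reading, like the helper name `binomial_pmf`, is an assumption.

```python
from math import comb

def binomial_pmf(x, n, p):
    """P(X = x) for X ~ Binomial(n, p)."""
    return comb(n, x) * p**x * (1 - p)**(n - x)

# Class Exercise 1: fair coin, 6 tosses, exactly 3 heads.
ex1 = binomial_pmf(3, 6, 0.5)

# Class Exercise 2: 5 children, P(blood type O) = 0.25 for each child.
n, p = 5, 0.25
exactly_two = binomial_pmf(2, n, p)
at_most_two = sum(binomial_pmf(k, n, p) for k in range(0, 3))
at_least_four = sum(binomial_pmf(k, n, p) for k in range(4, n + 1))
two_not_o = binomial_pmf(n - 2, n, p)  # 2 without type O means 3 with type O

print(ex1, exactly_two, at_most_two, at_least_four, two_not_o)
```

"At most" and "at least" questions are sums of individual binomial probabilities over the relevant range of x.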
• What if the population you are interested in is very large and the associated probability of the event is very small?
The Poisson distribution
• The Poisson distribution is used to model discrete events that occur infrequently in time and space, i.e. rare events that occur at a constant rate.
• Example: death rates, road traffic accident rates.
• Sometimes called the distribution of rare events.
Poisson…
• If X is a random variable that follows a Poisson distribution, then the probability of x occurrences is given by
$P(X = x) = \frac{e^{-\lambda} \lambda^{x}}{x!}$
• λ = the mean number of occurrences in a given interval
• e ≈ 2.718
• The mean of the Poisson distribution is λ; when the Poisson distribution is used to approximate a binomial distribution with large n and small p, λ = np.
The Poisson distribution…
• Example: Suppose X is a random variable representing the number of individuals involved in a road traffic accident (RTA) each year.
• In the US the probability that an individual is involved in an RTA is 0.00024.
• Suppose we are interested in the number of people in a population of 10,000 who will be involved in an RTA each year.
• I.e. λ = np = 10,000 × 0.00024 = 2.4
Poisson…
• The probability that no one in this population will be involved in an accident:
$P(X = 0) = \frac{e^{-2.4} \, 2.4^{0}}{0!} = 0.091$
• The probability that exactly one person will be involved in an accident:
$P(X = 1) = \frac{e^{-2.4} \, 2.4^{1}}{1!} = 0.218$
• The probability that exactly two persons will be involved:
$P(X = 2) = \frac{e^{-2.4} \, 2.4^{2}}{2!} = 0.261$
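The three Poisson probabilities above can be recomputed, and compared with the exact binomial probabilities for n = 10,000 and p = 0.00024, with a short Python sketch. The function names are illustrative; the comparison is included only to show how close the Poisson approximation is for a large n and small p.

```python
from math import comb, exp, factorial

def poisson_pmf(x, lam):
    """P(X = x) for X ~ Poisson(lam): e^(-lam) * lam^x / x!."""
    return exp(-lam) * lam**x / factorial(x)

def binomial_pmf(x, n, p):
    """P(X = x) for X ~ Binomial(n, p)."""
    return comb(n, x) * p**x * (1 - p)**(n - x)

n, p = 10_000, 0.00024
lam = n * p  # mean number of people involved per year, 2.4

for x in range(3):
    print(x, round(poisson_pmf(x, lam), 4), round(binomial_pmf(x, n, p), 4))
```

The two columns agree to about four decimal places here, which is why the Poisson form is convenient for rare events in large populations.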
Continuous Probability
Distributions
• Under different circumstances, the outcome of a random variable may not be limited to categories or counts.
• E.g. suppose X represents the continuous variable 'Height'; rarely is an individual exactly 170 cm tall.
• X can assume an infinite number of intermediate values: 170.1, 170.2, 170.3, etc.
• Because a continuous random variable X can take on an uncountably infinite number of values, the probability associated with any particular single value is equal to zero.
Characteristics of a distribution
• Features commonly used to describe a distribution are location, dispersion, modality and skewness.
• Location tells us something about the average value of the variable.
• Dispersion tells us something about how spread out the values of the variable are.
• Modality refers to the number of peaks in the distribution.
• Skewness refers to whether or not the distribution is symmetric.
• A distribution is said to be symmetric if its left and right halves are mirror images about its centre (for a single-peaked distribution, its mode).
• The area under the smooth curve must be equal to 1.
Cont’d…
•Instead of assigning probabilities to
specific outcomes of the random
variable X, probabilities are
assigned to ranges of values
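To illustrate why probabilities are attached to ranges, the sketch below uses a hypothetical height distribution, normal with mean 170 cm and SD 10 cm (these values are assumptions for illustration, not from the slides), and shows the probability shrinking towards zero as the interval around 170 cm narrows.

```python
from statistics import NormalDist

# Hypothetical height distribution (illustrative values, not from the slides).
height = NormalDist(mu=170, sigma=10)

def prob_between(dist, low, high):
    """P(low < X < high) as a difference of cumulative probabilities."""
    return dist.cdf(high) - dist.cdf(low)

# Intervals centred on 170 cm with shrinking half-width (in cm).
half_widths = [10, 1, 0.1, 0.001]
probs = [prob_between(height, 170 - w, 170 + w) for w in half_widths]

for w, p in zip(half_widths, probs):
    print(f"P({170 - w} < X < {170 + w}) = {p:.6f}")
```

`NormalDist` is part of the Python standard library (version 3.8 and later).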
The Normal Distribution
• The ND is the most important probability distribution in statistics.
• Frequently called the "Gaussian distribution" or bell-shaped curve.
• Variables such as blood pressure, weight, height, serum cholesterol level, and IQ score are approximately normally distributed.
Cont’d…
• The ND is vital to statistical work: most estimation procedures and hypothesis tests are based on the ND.
• A random variable is said to have a normal distribution if it has a probability distribution that is symmetric and bell-shaped.
• μ and σ are the parameters of the normal distribution; these two completely define its shape.
Normal Distribution
Fig.3. Percentage of area under a normal distribution with mean μ and standard deviation σ (horizontal axis marked at μ-3σ, μ-2σ, μ-σ, μ, μ+σ, μ+2σ, μ+3σ)
For any normal distribution:
• about 68% (most) of the observations are contained within one SD of the mean,
• about 95% (the majority) are contained within two SDs of the mean,
• and about 99.7% (almost all) are contained within three SDs of the mean.
Normal Distribution…
•Most biological phenomena such as BP,
serum cholesterol level, weight, birth
weight are approximately normally
distributed.
•Thus we can use the ND to estimate
probabilities associated with these
variables.
Cont’d…
• For example, in a population in which SBP is normally distributed with mean μ and variance σ²,
• you might be interested in finding the probability that a randomly selected individual has an SBP of >160 mmHg,
• which might be essential for planning anti-hypertensive services.
Cont’d…
• We have different normal distributions depending on the values of μ and σ².
• We cannot tabulate every possible distribution.
• Tabulated normal probability calculations are available only for the ND with μ = 0 and σ² = 1.
Standard Normal Distribution
• It is a normal distribution that has μ = 0 and σ = 1, and is denoted by N(0, 1).
• The main idea is to standardize all the data that
is given by using Z-scores.
• These Z-scores can then be used to find the area
(and thus the probability) under the normal
curve.
Cont’d…
• The standard normal distribution has mean 0 and standard deviation of 1.
• Approximately 68% of the area under the standard normal curve lies within 1 SD of the mean, about 95% within 2 SDs, and about 99.7% within 3 SDs.
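The areas within 1, 2 and 3 SDs of the mean can be computed rather than memorised. A minimal sketch using the error function from Python's standard library (the function name `within_k_sd` is illustrative):

```python
from math import erf, sqrt

def within_k_sd(k):
    """P(mu - k*sigma < X < mu + k*sigma) for any normally distributed X."""
    return erf(k / sqrt(2))

for k in (1, 2, 3):
    print(f"within {k} SD: {within_k_sd(k):.4f}")
```

The result does not depend on μ or σ, and the three-SD figure is closer to 99.7% than to 99%.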
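Z-transformation
• $Z = \frac{X - \mu}{\sigma}$, where X is the original value, μ is the mean and σ is the standard deviation of the distribution.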
Example
• 
Cont’d…
• This process is known as standardization and gives the position on a normal curve with μ = 0 and σ = 1, i.e., the SND, Z.
• A Z-score is the number of standard deviations that a given x value is above or below the mean.
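Standardization is the calculation z = (x - μ)/σ. A small sketch follows, with hypothetical values (μ = 170, σ = 10) chosen only for illustration:

```python
def z_score(x, mu, sigma):
    """Number of standard deviations that x lies above (+) or below (-) the mean."""
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    return (x - mu) / sigma

# Hypothetical values chosen only for illustration.
print(z_score(180, mu=170, sigma=10))  # 1.0
print(z_score(155, mu=170, sigma=10))  # -1.5
```

A positive z means the value is above the mean and a negative z means it is below.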
Finding normal curve areas

• 1. The table gives areas between -∞ and the value of Z.


• 2. Find the z value in tenths in the column at left margin
and locate its row. Find the hundredths place in the
appropriate column.
• 3. Read the value of the area (P) from the body of the table
where the row and column intersect. Values of P are in the
form of a decimal point and four places.
NB: The total area under the curve is 1.0,
and the curve is symmetric, so half (i.e.
0.5) is above the mean, half (0.5) is below
the mean.
1. What is the probability that z < -1.96?
• (1) Sketch a normal curve
• (2) Draw a perpendicular line for z = -1.96
• (3) Find the area in the table
• (4) The answer is the area to the left of the line: P(z < -1.96) = 0.0250
Exercise
2. What is the probability that -1.96 < z
<1.96?
3. What is the probability that z > 1.96?
Exercise…

4. Compute P(-1 ≤ Z ≤ 1.5)


5. Find the area under the SND from 0 to 1.45
6. Compute P(-1.66 < Z < 2.85)
Exercise…

4. Compute P(-1 ≤ Z ≤ 1.5) = 0.7745


5. Find the area under the SND from 0 to 1.45
= 0.4265
6. Compute P(-1.66 < Z < 2.85) = 0.9493
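The table look-ups in exercises 1 to 6 can be cross-checked with the standard normal cumulative distribution function. The sketch below builds it from `math.erf`; the name `phi` and the dictionary layout are illustrative choices.

```python
from math import erf, sqrt

def phi(z):
    """Area under the standard normal curve from -infinity to z."""
    return 0.5 * (1 + erf(z / sqrt(2)))

answers = {
    "1. P(Z < -1.96)": phi(-1.96),
    "2. P(-1.96 < Z < 1.96)": phi(1.96) - phi(-1.96),
    "3. P(Z > 1.96)": 1 - phi(1.96),
    "4. P(-1 <= Z <= 1.5)": phi(1.5) - phi(-1),
    "5. Area from 0 to 1.45": phi(1.45) - phi(0),
    "6. P(-1.66 < Z < 2.85)": phi(2.85) - phi(-1.66),
}

for label, value in answers.items():
    print(f"{label} = {value:.4f}")
```

Every area is obtained as a difference of two left-tail areas, which mirrors how the printed table is used.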
Applications of the Normal
Distribution
• The ND is used as a model to study many different
variables.
• The ND can be used to answer probability questions
about continuous random variables.
• Following the model of the ND, a given value of x
must be converted to a z score before it can be looked
up in the z table.
Z-transformation
• $Z = \frac{X - \mu}{\sigma}$
Example
• Question –
• Systolic blood pressure of a group of old people was normally distributed with a mean of 175 and a variance of 169; what is the probability that a randomly selected individual will have an SBP of less than 140 mmHg?
Solution
• σ = √169 = 13, so $Z = \frac{140 - 175}{13} \approx -2.69$ and P(X < 140) = P(Z < -2.69) ≈ 0.0036.
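The SBP question can be worked through in code: take the square root of the variance to get σ, standardize 140, and read off the left-tail area. This is a sketch using `statistics.NormalDist` from the Python standard library; the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

mu = 175            # mean SBP from the question
sigma = sqrt(169)   # SD is the square root of the variance: 13.0

# Step 1: standardize the cut-off value.
z = (140 - mu) / sigma

# Step 2: left-tail area under the standard normal curve.
p_below_140 = NormalDist().cdf(z)

print(round(z, 2), round(p_below_140, 4))
```

Calling `NormalDist(mu, sigma).cdf(140)` directly gives the same probability without the explicit standardization step.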
Exercise
• The DBP of males 35–44 years of age is normally distributed with μ = 80 mm Hg and σ² = 144 mm Hg², hence σ = 12 mm Hg.
• Individuals with DBP above 95 mm Hg are considered to be hypertensive.
Cont’d…
1. What is the probability that a randomly selected male has a DBP above 95 mm Hg?
2. What is the probability that a randomly selected
male has a DBP above 110 mm Hg?
3. What is the probability that a randomly selected
male has a DBP below 60 mm Hg?
Cont’d…
1. What is the probability that a randomly selected male has a DBP above 95 mm Hg? = 0.106
2. What is the probability that a randomly selected
male has a DBP above 110 mm Hg? = 0.006
3. What is the probability that a randomly selected
male has a DBP below 60 mm Hg? = 0.047
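The three DBP answers can be checked with the same approach, using μ = 80 and σ = 12 from the exercise. This is a sketch with illustrative variable names; small differences from table-based answers come from rounding z to two decimals before the look-up.

```python
from statistics import NormalDist

# DBP of males aged 35-44: mean 80 mm Hg, SD 12 mm Hg (from the exercise).
dbp = NormalDist(mu=80, sigma=12)

p_above_95 = 1 - dbp.cdf(95)    # upper tail beyond z = 1.25
p_above_110 = 1 - dbp.cdf(110)  # upper tail beyond z = 2.50
p_below_60 = dbp.cdf(60)        # lower tail below z = -1.67 (approx.)

print(round(p_above_95, 3), round(p_above_110, 3), round(p_below_60, 3))
```

Upper-tail probabilities are obtained as 1 minus the cumulative area, because the total area under the curve is 1.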
Key Concepts
I. Experiments and the Sample Space
1. Experiments, events, mutually exclusive events,
simple events
2. The sample space
II. Probabilities
1. Relative frequency definition of probability
2. Properties of probabilities
a. Each probability lies between 0 and 1.
b. Sum of all simple-event probabilities equals 1.
3. P(A), the sum of the probabilities for all simple events in A
Key Concepts
• $P(A \mid B) = \frac{P(A \cap B)}{P(B)}$
• $P(A \cup B) = P(A) + P(B) - P(A \cap B)$
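The conditional-probability and addition rules can be verified on a small sample space. The example below, one roll of a fair die with A = "even" and B = "greater than 3", is hypothetical and chosen only for illustration.

```python
from fractions import Fraction

# Sample space for one roll of a fair die; every simple event has probability 1/6.
sample_space = {1, 2, 3, 4, 5, 6}
A = {2, 4, 6}  # even number
B = {4, 5, 6}  # number greater than 3

def prob(event):
    """P(event) = sum of the simple-event probabilities in the event."""
    return Fraction(len(event & sample_space), len(sample_space))

p_a_given_b = prob(A & B) / prob(B)             # conditional probability
p_a_or_b = prob(A) + prob(B) - prob(A & B)      # addition rule

print(p_a_given_b, p_a_or_b)  # 2/3 2/3
```

The addition rule gives the same value as counting the union {2, 4, 5, 6} directly, since subtracting P(A ∩ B) removes the double-counted outcomes 4 and 6.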
Key Concepts
V. Discrete Random Variables and
Probability Distributions
1. Random variables, discrete and
continuous
2. Properties of probability distributions
$0 \le p(x) \le 1$ and $\sum p(x) = 1$
Thank you!
