Biostats Lecture 7 Bernoulli, Binomial and Poisson Distributions
Biostats Lecture 7 (Diez Chapter 4)
Not covering:
Geometric Distribution (4.2.2)
Negative Binomial Distribution (4.4)
Bernoulli & Binomial Distributions
Bernoulli Trial (Experiment, or Random
Variable)
Definition: A Bernoulli trial (experiment, or random
variable) results in one of two possible outcomes
(success/failure), where the success probability is p, and
the failure probability is q=(1-p)
Examples:
• Coin flip (head, tail)
• A person gets the flu in 2018 (yes, no)
• A terminally ill patient survives the next 5 years (yes, no)
• Answer to a true/false question (true, false)
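Bernoulli Distribution
A Bernoulli random variable X takes the value 1 (success) with
probability p and the value 0 (failure) with probability q = 1 - p.
For example: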
1. X is the number of heads in a coin flip.
2. X is the number of persons having a disease in a
random draw of an individual from the population.
Bernoulli Random Variable Examples
Example 1: Let X be the breast cancer status of a 50+
year old woman with 0.1 prevalence of breast cancer in
this age group. Then X has a Bernoulli distribution with
p = 0.1.
Variance of X: p(1-p) = 0.1 x 0.9 = 0.09
Standard deviation of X: sqrt(p(1-p)) = sqrt(0.09) = 0.3
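A minimal Python sketch of the Bernoulli summary values for the breast cancer example. The function name `bernoulli_summary` is hypothetical, and the mean E[X] = p is the standard result rather than something shown on this slide.

```python
import math

def bernoulli_summary(p):
    """Return (mean, variance, standard deviation) of a Bernoulli(p) variable."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be between 0 and 1")
    mean = p                    # E[X] = 1*p + 0*(1-p)
    variance = p * (1.0 - p)    # Var(X) = p(1-p)
    return mean, variance, math.sqrt(variance)

# Example 1: prevalence p = 0.1
mean, variance, sd = bernoulli_summary(0.1)
print(mean, round(variance, 4), round(sd, 4))
```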
Example: Multiple Choice Quiz
Setup and Answer
Let
$X_i = 1$, if question i is answered correctly
$X_i = 0$, if question i is answered incorrectly
where i = 1, 2, 3.
Probability of guessing correctly on one question with four answer
choices is ¼. Each $X_i$ is a Bernoulli random variable with p = ¼.
There is only one way to answer all three correctly.
Assuming all 3 questions are independent,
P(all 3 questions answered correctly) = $(1/4)^3 = 1/64 \approx 0.0156$
Multiple Choice Quiz Continues: From
Bernoulli to Binomial
Let’s look at this a slightly different way.
Multiple Choice Quiz Continues
What is the chance you answer 2 questions correctly?
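There are 3 ways to choose which question is missed, so
P(2 questions answered correctly) = $3 \times (1/4)^2 \times (3/4) = 9/64 \approx 0.1406$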
Multiple Choice Quiz Continues
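What is the chance you answer 1 question correctly?
P(1 question answered correctly) = $3 \times (1/4) \times (3/4)^2 = 27/64 \approx 0.4219$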
Multiple Choice Quiz Continues
What is the chance you answer 0 questions correctly?
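One way to check every case of the quiz example is to list all 8 correct/incorrect patterns and add up their probabilities. The sketch below assumes three independent questions with p = ¼ each; the helper name `quiz_distribution` is hypothetical.

```python
from fractions import Fraction
from itertools import product

P_CORRECT = Fraction(1, 4)   # chance of guessing one four-choice question

def quiz_distribution(n_questions=3, p=P_CORRECT):
    """Map number of correct answers -> exact probability, by enumeration."""
    dist = {k: Fraction(0) for k in range(n_questions + 1)}
    # Each outcome is a tuple of 1s (correct) and 0s (incorrect).
    for outcome in product((0, 1), repeat=n_questions):
        prob = Fraction(1)
        for answer in outcome:
            prob *= p if answer == 1 else 1 - p
        dist[sum(outcome)] += prob
    return dist

dist = quiz_distribution()
for k in sorted(dist):
    print(k, dist[k], float(dist[k]))
```

Using exact fractions avoids rounding, so the four probabilities can be seen to add up to exactly 1.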
Formula for calculating such
probabilities
General formula:
P(answering x questions correctly) = $\binom{3}{x} (1/4)^x (3/4)^{3-x}$
Number of ways to answer x of 3 questions correctly = $\binom{3}{x} = \frac{3!}{x!\,(3-x)!}$
where $\binom{3}{x}$ denotes the number of ways to answer x
questions correctly out of 3. This is an example of
the Binomial Distribution.
Formula for calculating binomial
probabilities
More general formula:
P(x successes out of n trials) = $\binom{n}{x} p^x (1-p)^{n-x}$
where $\binom{n}{x}$ denotes the number of ways to have x successes
out of n trials, and p is the success probability in one trial.
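The general formula translates directly into a few lines of Python. This is a sketch only; `binomial_pmf` is a hypothetical helper name, and `math.comb` (Python 3.8+) supplies the number of ways to have x successes in n trials.

```python
from math import comb

def binomial_pmf(x, n, p):
    """P(X = x) for X ~ Binomial(n, p)."""
    if not 0 <= x <= n:
        return 0.0
    return comb(n, x) * p**x * (1 - p)**(n - x)

# Quiz example: n = 3 questions, p = 1/4 per question
n, p = 3, 0.25
probs = [binomial_pmf(x, n, p) for x in range(n + 1)]
print(probs)
print(sum(probs))   # the probabilities over x = 0..n should total 1
```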
How can we get the binomial
probabilities and know they are correct?
1. Set n = 1, then
$P(X = 1) = p$ and $P(X = 0) = 1 - p$
are the binomial probabilities for 1 and 0 successes, respectively, out
of 1 trial.
2. Set n = 2, then $P(X = 2) = p^2$, $P(X = 1) = 2p(1-p)$, and $P(X = 0) = (1-p)^2$.
How can we get the binomial
probabilities and know they are correct?
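3. Set n = 3, then $P(X = 3) = p^3$, $P(X = 2) = 3p^2(1-p)$, $P(X = 1) = 3p(1-p)^2$, and $P(X = 0) = (1-p)^3$.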
4. In general,
P(x successes out of n trials) = $\binom{n}{x} p^x (1-p)^{n-x}$
Choose Function (continued)
Computing The # Of Ways
The choose function is useful for calculating the number of ways
to choose k successes in n trials.
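• n! is “n factorial”, $n! = n \times (n-1) \times \cdots \times 2 \times 1$
• By definition, $0! = 1$ and $1! = 1$
• $\binom{n}{k} = \frac{n!}{k!\,(n-k)!}$ is “n choose k”
Examples:
k = 1, n = 4: $\binom{4}{1} = \frac{4!}{1!\,3!} = 4$
k = 2, n = 9: $\binom{9}{2} = \frac{9!}{2!\,7!} = 36$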
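The two examples can be reproduced with a short Python sketch of the factorial formula. The function name `choose` is hypothetical; Python's built-in `math.comb` computes the same quantity and is used here only as a cross-check.

```python
from math import comb, factorial

def choose(n, k):
    """Number of ways to choose k items out of n, from the factorial formula."""
    if not 0 <= k <= n:
        raise ValueError("need 0 <= k <= n")
    return factorial(n) // (factorial(k) * factorial(n - k))

print(choose(4, 1), choose(9, 2))

# Cross-check against the standard library for small n
agrees = all(choose(n, k) == comb(n, k) for n in range(12) for k in range(n + 1))
print(agrees)
```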
Practice (Choose Function)
Which of the following is false?
Conditions for Binomial Distribution
● n Bernoulli trials
● n is fixed (determined in advance and cannot
change)
● all n Bernoulli trials are independent
● success probability p is the same in all Bernoulli
trials
Practice – Binomial Distribution
Practice – Using the Binomial Distribution
A 2012 Gallup survey suggests that 26.2% of
Americans are obese. Among a random sample of 10
Americans, what is the probability that exactly 8 are
obese?
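A sketch of the calculation in Python, treating the 10 sampled Americans as independent Bernoulli trials with p = 0.262 (the binomial conditions are assumed to hold for a random sample). The variable names are illustrative.

```python
from math import comb

n, p, x = 10, 0.262, 8

# P(X = 8) = C(10, 8) * 0.262^8 * 0.738^2
prob_exactly_8 = comb(n, x) * p**x * (1 - p)**(n - x)
print(f"{prob_exactly_8:.6f}")
```

The result is very small (roughly 0.0005) because 8 out of 10 is far above the expected count np = 2.62.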
Mean, Variance, and Standard Deviation of
Binomial Distribution
Mean: µ = np
Variance: σ² = np(1-p)
Standard deviation: σ = sqrt(np(1-p))
Note: Mean and standard deviation of a binomial might not always be whole
numbers. These values represent what we would expect to see on average.
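The formulas µ = np and σ² = np(1-p) can be checked numerically by computing the mean and variance directly from the binomial probabilities. A sketch follows; the values n = 10, p = 0.262 are chosen only for illustration, and the helper names are hypothetical.

```python
from math import comb, sqrt

def binomial_pmf(x, n, p):
    """P(X = x) for X ~ Binomial(n, p)."""
    return comb(n, x) * p**x * (1 - p)**(n - x)

def binomial_moments(n, p):
    """Mean, variance and SD computed directly from the pmf."""
    mean = sum(x * binomial_pmf(x, n, p) for x in range(n + 1))
    variance = sum((x - mean) ** 2 * binomial_pmf(x, n, p) for x in range(n + 1))
    return mean, variance, sqrt(variance)

n, p = 10, 0.262   # illustrative values
mean, variance, sd = binomial_moments(n, p)
print(mean, n * p)                  # both about 2.62, not a whole number
print(variance, n * p * (1 - p))
print(sd, sqrt(n * p * (1 - p)))
```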
Probability Distribution Function (pdf)
reviewed
Cumulative Distribution Function (cdf)
Definition: If X is a random variable, then P(X ≤ x),
where x is in the sample space of X, is the
cumulative distribution function (cdf).
From the cdf you can obtain the “less than or
equal to” or “at most” cumulative probability.
Note: P(X > x) = 1 - P(X ≤ x)
Why?
This works for both discrete and continuous
random variables
Why?
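For a discrete variable the cdf is just a running sum of the pmf, and the events "X ≤ x" and "X > x" are complements, so their probabilities add to 1. A sketch using the three-question quiz (n = 3, p = ¼); the helper names are hypothetical.

```python
from math import comb

def binomial_pmf(x, n, p):
    """P(X = x) for X ~ Binomial(n, p)."""
    return comb(n, x) * p**x * (1 - p)**(n - x)

def binomial_cdf(x, n, p):
    """P(X <= x): sum the pmf over 0..x."""
    return sum(binomial_pmf(k, n, p) for k in range(0, x + 1))

n, p = 3, 0.25
at_most_1 = binomial_cdf(1, n, p)        # P(X <= 1)
more_than_1 = 1 - at_most_1              # P(X > 1) by the complement rule
direct = binomial_pmf(2, n, p) + binomial_pmf(3, n, p)
print(at_most_1, more_than_1, direct)
```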
Practice – Free Throw Probability
A consistent free throw (FT) basketball player has a 75%
FT percentage, i.e., 75% of the time the player scores on
free throws. Suppose in a game she was awarded 3 FTs.
Each successful free throw is 1 point.
Practice – Free Throw Probability
(continued)
Let X be the random variable for the number of points scored from 3 FTs.
1. What is the pdf of the number of points from 3 FTs?
X ~ Binomial(n, p) = Binomial(3, 0.75)
2. What is the probability that she scores 2 points?
P(X = 2) = $\binom{3}{2} (0.75)^2 (0.25)^1$
= 3 x 0.5625 x 0.25
= 0.421875
3. What is the probability that she scores at most 2 points?
P(X ≤ 2) = P(X=0) + P(X=1) + P(X=2)
= $\binom{3}{0}(0.75)^0(0.25)^3 + \binom{3}{1}(0.75)^1(0.25)^2 + \binom{3}{2}(0.75)^2(0.25)^1$
= (1 x 1 x 0.015625) + (3 x 0.75 x 0.0625) + (3 x 0.5625 x 0.25)
= 0.015625 + 0.140625 + 0.421875
= 0.578125
Practice (continued)
4. What is the probability that she scores at least 1 point?
P(X ≥ 1)
= P(X=1) + P(X=2) + P(X=3)
= 1 - P(X=0)
= 1 - 0.015625
= 0.984375
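The free throw answers can be checked with a short Python sketch, assuming X ~ Binomial(3, 0.75) as in the practice problem. Variable and function names are illustrative.

```python
from math import comb

def binomial_pmf(x, n, p):
    """P(X = x) for X ~ Binomial(n, p)."""
    return comb(n, x) * p**x * (1 - p)**(n - x)

n, p = 3, 0.75
pmf = {x: binomial_pmf(x, n, p) for x in range(n + 1)}

p_two = pmf[2]                            # question 2: exactly 2 points
p_at_most_two = pmf[0] + pmf[1] + pmf[2]  # question 3: at most 2 points
p_at_least_one = 1 - pmf[0]               # question 4: at least 1 point

print(p_two, p_at_most_two, p_at_least_one)
```

Question 4 uses the complement because subtracting one term is less work than adding three.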
Distribution of Number of Successes
in n Trials as n Increases
An Analysis of Facebook Users
A recent study found that “Facebook users get more than they
give”. For example:
1. 40% of Facebook users in our sample made a friend request, but
63% received at least one request.
2. Users in our sample pressed the like button next to friends'
content an average of 14 times, but had their content “liked” an
average of 20 times.
3. Users sent 9 personal messages, but received 12.
4. 12% of users tagged a friend in a photo, but 35% were
themselves tagged in a photo.
Practice – Facebook Users (continued)
What is the probability that the average Facebook user
with 245 friends has 70 or more friends who would be
considered power users?
How Large is Large Enough for the Normal to
be a Good Approximation for the Binomial?
In this example:
np = 245 x 0.25 = 61.25
n(1-p) = 245 x 0.75 = 183.75
Both are at least 10, so the normal approximation can be used.
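A sketch comparing the exact binomial answer for P(X ≥ 70) with the normal approximation, taking n = 245 and p = 0.25 from the calculation on this slide. The function names are hypothetical, and the continuity-corrected version (using 69.5 instead of 70) is an added illustration, not something stated in the slides.

```python
from math import comb, erf, sqrt

def binomial_upper_tail(k, n, p):
    """Exact P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, x) * p**x * (1 - p)**(n - x) for x in range(k, n + 1))

def normal_upper_tail(x, mu, sigma):
    """P(Y >= x) for Y ~ Normal(mu, sigma), via the error function."""
    z = (x - mu) / sigma
    return 0.5 * (1 - erf(z / sqrt(2)))

n, p, k = 245, 0.25, 70
mu = n * p                          # 61.25
sigma = sqrt(n * p * (1 - p))       # sqrt(45.9375), about 6.78

exact = binomial_upper_tail(k, n, p)
approx = normal_upper_tail(k, mu, sigma)
approx_cc = normal_upper_tail(k - 0.5, mu, sigma)   # continuity correction (assumption)

print(round(exact, 4), round(approx, 4), round(approx_cc, 4))
```

The plain normal approximation tends to come out a little low here; shifting the cutoff by 0.5 usually brings it closer to the exact value.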
Example – HPV Risk and # of Sexual
Partners
Consider a random sample of n = 5 participants who all reported having
greater than or equal to 3 sex partners within the last 12 months. Using
the high-risk population prevalence for HPV, p = 0.6, answer these
questions.
Example – HPV Risk and # of Sexual
Partners (continued)
a. What is the expected (mean) number of high-risk HPV cases in this
sample, and what is the associated standard deviation?
µ = np = 5 x 0.6 = 3
σ = sqrt(np(1-p)) = sqrt(5 x 0.6 x 0.4) = sqrt(1.2) = 1.095
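These values describe what would be seen on average over many samples of 5 participants. A small simulation sketch makes that concrete; the number of simulated samples (20,000) and the seed are arbitrary choices, and `simulate_case_counts` is a hypothetical helper.

```python
import random
from statistics import mean, pstdev

def simulate_case_counts(n, p, n_samples, seed=0):
    """Draw n_samples samples of size n; return the case count in each."""
    rng = random.Random(seed)   # fixed seed so the run is repeatable
    return [sum(1 for _ in range(n) if rng.random() < p)
            for _ in range(n_samples)]

counts = simulate_case_counts(n=5, p=0.6, n_samples=20000)
sim_mean = mean(counts)
sim_sd = pstdev(counts)
print(sim_mean, sim_sd)   # should land near 3 and 1.095
```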
Practice – When Can the Normal
Approximation be Used?
Below are four pairs of Binomial distribution parameters.
Which distribution can be approximated by the normal
distribution?
1. n = 100, p = 0.95
2. n = 25, p = 0.45
3. n = 150, p = 0.05
4. n = 500, p = 0.015
Practice – When Can the Normal
Approximation be Used?
Below are four pairs of Binomial distribution parameters.
Which distribution can be approximated by the normal
distribution?
1. n = 100, p = 0.95 (np = 95, n(1-p) = 5)
2. n = 25, p = 0.45 → np = 25 x 0.45 = 11.25, n(1-p) = 25 x 0.55 = 13.75
3. n = 150, p = 0.05 (np = 7.5, n(1-p) = 142.5)
4. n = 500, p = 0.015 (np = 7.5, n(1-p) = 492.5)
Answer: 2, the only pair for which both np and n(1-p) are at least 10.
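The check for each pair can be sketched in Python. The threshold of 10 for both np and n(1-p) is assumed here as the rule of thumb; `normal_approx_ok` is a hypothetical helper name.

```python
def normal_approx_ok(n, p, threshold=10):
    """True when both np and n(1-p) are at least `threshold` (assumed rule)."""
    return n * p >= threshold and n * (1 - p) >= threshold

pairs = [(100, 0.95), (25, 0.45), (150, 0.05), (500, 0.015)]
results = [normal_approx_ok(n, p) for n, p in pairs]

for (n, p), ok in zip(pairs, results):
    print(n, p, round(n * p, 2), round(n * (1 - p), 2), ok)
```

Note that a large n alone is not enough: with p close to 0 or 1, the expected number of successes or failures can still be too small.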
Example: Expected Value and Standard
Deviation of a Binomial Random Variable
Practice – Attitudes About Home Schooling
An August 2012 Gallup poll suggests that 13% of Americans
think home schooling provides an excellent education for
children. Would a random sample of 1,000 Americans where
100 share this opinion be considered unusual?
http://www.gallup.com/poll/156974/private-schools-top-marks-educating-children.aspx
Practice – Attitudes About Home Schooling
(continued)
An August 2012 Gallup poll suggests that 13% of Americans
think home schooling provides an excellent education for
children. Would a random sample of 1,000 Americans where
100 share this opinion be considered unusual?
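(a) Yes, because 100 is an unusual observation (b) No
Range of usual observations: np ± 2 x sqrt(np(1-p))
= 130 ± (2 x 10.6)
= (108.8, 151.2)
Answer: (a). 100 is below 108.8, so it falls outside the range of usual observations.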
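A sketch of the "within 2 standard deviations of the mean" check in Python. The helper name `usual_range` and the default k = 2 are illustrative; the endpoints differ slightly from the slide because the standard deviation is not rounded to 10.6 first.

```python
from math import sqrt

def usual_range(n, p, k=2):
    """Interval mean ± k standard deviations for X ~ Binomial(n, p)."""
    mu = n * p
    sigma = sqrt(n * p * (1 - p))
    return mu - k * sigma, mu + k * sigma

low, high = usual_range(1000, 0.13)
observed = 100
is_unusual = not (low <= observed <= high)
print(round(low, 2), round(high, 2), is_unusual)
```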
Poisson distribution
Poisson Distribution
• Increasingly being used in public health and clinical research
• The random variable takes the form: the number of events in
a time interval
• There are multiple time intervals
• Examples:
number of heart attacks in a month
number of marriages in a year
number of people getting struck by lightning in a year
Poisson Distribution Shape
● The value of λ determines the shape of the Poisson
distribution
● λ is the expected number of events per time interval
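To see how λ changes the distribution, the probabilities can be tabulated for a few rates. The sketch below uses the standard Poisson formula P(X = k) = e^(-λ) λ^k / k!, which does not appear in this part of the slides; the rates 1, 4 and 10 and the helper names are illustrative.

```python
from math import exp, factorial

def poisson_pmf(k, lam):
    """P(X = k) for X ~ Poisson(lam), using the standard formula."""
    if k < 0:
        return 0.0
    return exp(-lam) * lam**k / factorial(k)

def poisson_mean(lam, k_max=60):
    """Mean computed from the pmf, truncated at k_max events."""
    return sum(k * poisson_pmf(k, lam) for k in range(k_max + 1))

for lam in (1.0, 4.0, 10.0):   # illustrative rates per time interval
    first_few = [round(poisson_pmf(k, lam), 3) for k in range(6)]
    print(lam, first_few, round(poisson_mean(lam), 4))
```

For small λ most of the probability sits at 0 and 1 events; as λ grows the distribution shifts right and looks more symmetric.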