W4 Lecture4

- The document discusses theoretical probability distributions and introduces several key distributions: Bernoulli, binomial, and normal distributions.
- It provides examples of how to calculate probabilities of events using these distributions and their parameters. Formulas for the binomial and normal distributions are presented.
- The key aspects and properties of the normal distribution are described, including how its mean and standard deviation parameters affect the shape of the distribution curve. Methods for finding probabilities and quantiles are illustrated.

Biostatistics

Lecture 4
Theoretical Probability Distributions
2022-2 Fall Semester

Instructor: Min Jin Ha

Department of Health Informatics and Biostatistics
Graduate School of Public Health
Yonsei University
Reading
• Pagano and Gauvreau, Chapter 7.1, 7.2 and 7.4
• Credits:
• CMU Open Learning Initiative: Probability & Statistics v5.0
Random Variable
• Any characteristic that can be measured or categorized is called a variable
• If a variable can assume a number of different values such that any
particular outcome is determined by chance, it is a random variable
• Ex. Serum cholesterol level of a 25- to 34-year-old male in the US
• Random variables are typically represented by upper case letters such as X, Y
and Z.
• A discrete random variable can assume only a finite or countable number
of outcomes (e.g., X takes one of the values x1, ..., x100)
• Ex. Marital status, gender, the number of pregnancies
• A continuous random variable can take on any value within a specified
interval
• Ex. Weight, Height, Serum cholesterol level
Probability Distribution
• Every random variable has a corresponding probability distribution to
describe the behavior of the random variable
• We need a probability distribution to make statements about how likely
an event may be
• Empirical Distributions: Revisit Lecture 2
Discrete Case
• Research Question: Perception of their own body among college
students in Korea
• Survey Question: Do you feel you are overweight, underweight, or
about right?
• From a random sample of the target population (i.e., college students
in Korea), we obtain categorical responses.
Category      Frequency   Relative Frequency
About right   855         (855/1200) × 100 = 71.3%
Overweight    235         (235/1200) × 100 = 19.6%
Underweight   110         (110/1200) × 100 = 9.2%
Total         n = 1200    100%
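The relative frequencies in the table are just counts divided by the sample size. A minimal sketch in R, using the counts from the table (the variable names are illustrative):

```r
# Counts taken from the survey table.
counts <- c("About right" = 855, "Overweight" = 235, "Underweight" = 110)

n <- sum(counts)          # total sample size
rel_freq <- counts / n    # empirical probabilities (proportions)

# Percentages to two decimals; the table rounds these to one decimal.
pct <- round(100 * rel_freq, 2)
print(n)
print(pct)
```

The proportions sum to 1, which is what makes them usable as an empirical probability distribution.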
Continuous Case: Empirical Distribution
• Pima Indian Women
• A population of women who were at least 21 years old, of Pima Indian
heritage and living near Phoenix, Arizona
• Interested in BMI

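Rcode:
library(MASS)  # provides the Pima.tr data set
# Histogram of BMI on the density scale, with a kernel density estimate overlaid
hist(Pima.tr$bmi, col = "coral", main = "BMI of Pima Indian Women", xlab = "BMI", border = "burlywood", breaks = 30, freq = FALSE)
lines(density(Pima.tr$bmi, adjust = 0.7), col = "burlywood", lwd = 3)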
Theoretical Distributions
• Probabilities that are calculated from a finite amount of data are
called empirical probabilities
• Probability distributions can also be determined from theoretical
considerations; these are called theoretical probability distributions
Bernoulli Distribution
• Consider a dichotomous (two-level) random variable Y.
• By definition, Y must assume one of two possible values:
• Failure or success
• Dead or alive
• Male or Female
• Current smoker or not
• Heads or tails (coin flip)
• A random variable of this type is known as a Bernoulli random
variable, and we describe the probability of response using the
parameter 𝜋
Bernoulli Random Variable
• Often coded so that 𝑌 = 1 is called an event or success, and 𝑌 = 0 is
called a failure
• 𝜋 is defined as the probability of success, 𝜋 = 𝑃(𝑌 = 1)
• Coin flip: let 𝑌 = 1 if heads and 𝑌 = 0 if tails, then 𝜋 = 𝑃(𝑌 = 1) = 0.5 = 𝑃(𝑌 = 0)
• Gender at birth in US: let 𝑌 = 1 if male and 𝑌 = 0 if female, then 𝜋 = 0.512
and 𝑃(𝑌 = 0) = 1 − 𝜋 = 0.488
Bernoulli Distribution
• Y takes value 1 with probability 𝜋 and 0 with probability 1 − 𝜋
• $P(Y = y) = \pi^{y}(1 - \pi)^{1 - y}$ for $y \in \{0, 1\}$
• Calculate $P(Y = 0)$ and $P(Y = 1)$
• We want to extend this to a more complex setting: in a randomly
selected group of 3 students, how surprising would it be to get 2
smokers?
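As a quick check of the Bernoulli formula, the sketch below evaluates it for the gender-at-birth example (𝜋 = 0.512). The function name `bernoulli_pmf` is illustrative, not something from the lecture.

```r
# Bernoulli probability mass function: P(Y = y) = pi^y * (1 - pi)^(1 - y)
bernoulli_pmf <- function(y, prob) {
  stopifnot(all(y %in% c(0, 1)), prob >= 0, prob <= 1)
  prob^y * (1 - prob)^(1 - y)
}

prob_male <- 0.512                  # pi from the gender-at-birth example

p1 <- bernoulli_pmf(1, prob_male)   # P(Y = 1) = pi
p0 <- bernoulli_pmf(0, prob_male)   # P(Y = 0) = 1 - pi

# A Bernoulli variable is a binomial with a single trial, so dbinom() with
# size = 1 should agree with the hand-written formula.
print(all.equal(c(p0, p1), dbinom(c(0, 1), size = 1, prob = prob_male)))
```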
Case Study: Smoking
• Suppose it is reported that roughly 20% of Korean adults are smokers
• $\pi = P(Y = 1) = 0.2$
• Now suppose we randomly select two adults from the Korean adult
population and let a new random variable 𝑋 represent the number of
smokers: the possible values of 𝑋 are 0, 1, or 2. Assume these persons are
independent (so we can use the multiplicative rule)
Case Study: Smoking
• The probability distribution of the number of smokers out of two people
is given by
$P(X = 0) = 0.64,\ P(X = 1) = 0.32,\ P(X = 2) = 0.04$
• Interpretation: if we randomly sample two people from the Korean
population, the probability that both are smokers is 4%. The
probability that both are nonsmokers is 64%. The probability that
exactly one is a smoker is 32%.
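These three probabilities follow from the multiplicative rule for independent events. A small sketch of the arithmetic, assuming 𝜋 = 0.2 as stated in the case study:

```r
p_smoker <- 0.2             # pi = P(Y = 1), from the case study
p_non    <- 1 - p_smoker    # P(Y = 0)

# Multiplicative rule for two independent people.
p_x0 <- p_non * p_non                         # neither smokes
p_x1 <- p_smoker * p_non + p_non * p_smoker   # exactly one smokes (two orders)
p_x2 <- p_smoker * p_smoker                   # both smoke

probs <- c("X=0" = p_x0, "X=1" = p_x1, "X=2" = p_x2)
print(probs)
```

Note that the case 𝑋 = 1 is counted twice because either the first or the second person can be the smoker.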
Case Study: Smoking
• If we randomly sample 3 people, what is the chance all 3 are smokers?
Case Study: Smoking
• The probability distribution of the number of smokers out of three people
is given by
$P(X = 0) = 0.512,\ P(X = 1) = 0.384,\ P(X = 2) = 0.096,\ P(X = 3) = 0.008$
• If we randomly sample 4 people, what is the chance all 4 are smokers?
• This is getting ridiculous, now we need a formula!
• We can use the binomial distribution to help determine this probability
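Before turning to a formula, the brute-force approach can be sketched in R: list every possible sequence of smokers and non-smokers, compute each sequence's probability, and add up the sequences with the same count. The helper `enumerate_smokers` is a hypothetical name used only for this illustration.

```r
# Distribution of the number of smokers among n independent people,
# obtained by listing every possible outcome (1 = smoker, 0 = non-smoker).
enumerate_smokers <- function(n, prob) {
  outcomes <- expand.grid(rep(list(0:1), n))   # all 2^n sequences
  k <- rowSums(outcomes)                       # smokers in each sequence
  p_seq <- prob^k * (1 - prob)^(n - k)         # probability of each sequence
  tapply(p_seq, k, sum)                        # add up sequences with equal k
}

dist3 <- enumerate_smokers(3, 0.2)
dist4 <- enumerate_smokers(4, 0.2)
print(dist3)
print(dist4["4"])   # chance that all four are smokers
```

The number of sequences doubles with every added person, which is why enumeration quickly becomes impractical.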
Binomial Distribution
• The binomial distribution is used to give us the probability of 𝑋 ‘successes’
from a sequence of 𝑛 independent Bernoulli trials.
• In our example, each person would be an independent Bernoulli trial
(either a smoker or not)
• This distribution involves three assumptions
• There is a fixed number of Bernoulli trials, 𝑛, each of which results in one of two
mutually exclusive outcomes
• The outcomes of the 𝑛 trials are independent
• The probability of success 𝜋 is the same for each trial
• The distribution is
$$P(X = x) = \binom{n}{x} \pi^{x} (1 - \pi)^{n - x}, \quad x = 0, 1, \dots, n$$
which has mean $n\pi$ and variance $n\pi(1 - \pi)$
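The binomial formula can be written out directly and compared with R's built-in `dbinom()`. This is a sketch with illustrative values 𝑛 = 10 and 𝜋 = 0.2; `binom_pmf` is a hypothetical helper name.

```r
# Binomial pmf written out from the formula.
binom_pmf <- function(x, n, prob) {
  choose(n, x) * prob^x * (1 - prob)^(n - x)
}

n <- 10
prob <- 0.2
x <- 0:n
pmf <- binom_pmf(x, n, prob)

# Should agree with R's built-in dbinom().
print(all.equal(pmf, dbinom(x, size = n, prob = prob)))

# Mean and variance computed directly from the pmf.
mean_x <- sum(x * pmf)                # equals n * pi = 2
var_x  <- sum((x - mean_x)^2 * pmf)   # equals n * pi * (1 - pi) = 1.6
print(c(mean = mean_x, variance = var_x))
```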
Math
• $n! = n(n - 1)(n - 2) \cdots (3)(2)(1)$ is called $n$ factorial. It gives the
number of ways in which the $n$ individuals can be ordered
($n$ choices for the 1st, $n - 1$ choices for the 2nd, …)
• By definition $0! = 1$
• $\binom{n}{x} = \dfrac{n!}{x!\,(n - x)!}$ is the combination of $n$ objects chosen $x$ at a time: the
number of ways in which $x$ objects can be selected from a total of $n$
objects regardless of order
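R has built-in functions for both quantities. The sketch below uses 𝑛 = 3 and 𝑥 = 2 (the "two smokers out of three people" setting) as an illustration.

```r
n <- 3
x <- 2

print(factorial(n))   # 3! = 6 ways to order three people
print(factorial(0))   # 0! = 1 by definition

# Combination from the factorial definition, and R's built-in choose().
comb_manual  <- factorial(n) / (factorial(x) * factorial(n - x))
comb_builtin <- choose(n, x)

# Each column lists one way to pick 2 of the persons numbered 1, 2, 3.
pairs <- combn(n, x)
print(pairs)
```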
Binomial Distribution
• $X \sim \text{Binomial}(n, \pi) \Leftrightarrow P(X = x) = \binom{n}{x} \pi^{x} (1 - \pi)^{n - x}$
• $\pi^{x}(1 - \pi)^{n - x}$ accounts for the probability of two
smokers in order
• $\binom{n}{x}$ accounts for all the possible ways in which we
have two smokers regardless of order
[Figure: Binomial(10, 0.2) distribution; x-axis: No. of successes (0 to 10), y-axis: Density (0.00 to 0.30)]
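A plot like the Binomial(10, 0.2) figure can be drawn with `dbinom()` and `barplot()`. This is only a sketch; the colors and axis limits are guesses, not the settings used for the slide.

```r
x <- 0:10
pmf <- dbinom(x, size = 10, prob = 0.2)

# Bar chart of P(X = x) for X ~ Binomial(10, 0.2).
barplot(pmf, names.arg = x, col = "coral", border = "burlywood",
        main = "Binomial(10, 0.2)", xlab = "No. of successes",
        ylab = "Density", ylim = c(0, 0.35))

mode_x <- x[which.max(pmf)]   # most likely number of successes
print(mode_x)
```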
Exercise
R Workshop
• Random number generation
• Calculate a density given a value
• Calculate right tail and left tail areas
• Calculate quantiles
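In R these four tasks map onto the `r`, `d`, `p`, and `q` prefixes of each distribution family. A minimal sketch for the binomial, with illustrative parameters 𝑛 = 10 and 𝜋 = 0.2 (the workshop itself may use different values):

```r
set.seed(1)   # arbitrary seed, only so that the random draws are reproducible

draws <- rbinom(5, size = 10, prob = 0.2)    # random number generation
d2    <- dbinom(2, size = 10, prob = 0.2)    # density (pmf) at a value: P(X = 2)
left  <- pbinom(2, size = 10, prob = 0.2)    # left tail: P(X <= 2)
right <- pbinom(2, size = 10, prob = 0.2,
                lower.tail = FALSE)          # right tail: P(X > 2)
q50   <- qbinom(0.5, size = 10, prob = 0.2)  # smallest x with P(X <= x) >= 0.5

print(draws)
print(c(density = d2, left = left, right = right, median = q50))
```

The same pattern applies to other families, for example `rnorm`, `dnorm`, `pnorm`, and `qnorm` for the normal distribution.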
Continuous Distribution
• We discussed discrete random variables
• As we move to discussion of continuous random variables, we will consider
the distribution of a continuous random variable 𝑋
• Suppose 𝑋 represents height
• An individual exactly 163 cm tall is rare
• Theoretically, 𝑋 can assume an infinite number of intermediate values,
such as 163.0001 cm or 163.01 cm
• In reality we measure only discrete values due to the limitations of our
measuring instruments
• As a result, the distribution of a continuous random variable is represented
by a smooth curve, called a density function
Continuous Distributions
• A continuous distribution describes the probabilities of possible
values of a continuous random variable (infinite and uncountable)
• Density functions/curves, like histograms, can have any shape. The
area under the density curve is always 1.
• How do you find the area of interest in the curves?
• Integration!
[Figure: left panel, empirically observed frequency (count the number of values observed); right panel, analytical probability density (area under the curve)]
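Areas under a density curve can be computed numerically with `integrate()`. The sketch below uses the standard normal density `dnorm` purely as an example curve; any valid density would do.

```r
# Total area under the density curve should be 1.
total <- integrate(dnorm, lower = -Inf, upper = Inf)

# Area over an interval of interest, here P(-1 < X < 1).
area <- integrate(dnorm, lower = -1, upper = 1)

# The same interval probability from the cumulative distribution function.
area_cdf <- pnorm(1) - pnorm(-1)

print(c(total = total$value, interval = area$value, via_cdf = area_cdf))
```

In practice the `p` functions (such as `pnorm`) are used instead of explicit integration, since they return the left-tail area directly.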
Normal Distributions
• For the normal distribution, the parameters are:
- μ = Mean (x)
- σ = StDev (x)
• Also called Gaussian Distribution
• Symmetric bell curve
• The normal is symmetric and centered on μ
• σ affects the width of the curve (curves shown for σ = 1, σ = 2, σ = 3)
• μ affects the position of the center of the curve
• The SD measures the distance
from the mean to the point of
inflection
• About 68% of the data fall
within 1 SD of the mean.
• About 5% of the data are further
than 2 SD from the mean (about
2.5% in each tail)
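These rule-of-thumb percentages can be checked with `pnorm()`. The values of `mu` and `sigma` in the second part are hypothetical and only show that the result does not depend on the parameters.

```r
within_1sd <- pnorm(1) - pnorm(-1)   # about 0.683
beyond_2sd <- 2 * pnorm(-2)          # about 0.046, both tails combined
each_tail  <- pnorm(-2)              # about 0.023 in each tail

# The same proportion holds for any mean and SD (hypothetical values here).
mu <- 100
sigma <- 15
within_1sd_general <- pnorm(mu + sigma, mean = mu, sd = sigma) -
  pnorm(mu - sigma, mean = mu, sd = sigma)

print(c(within_1sd, beyond_2sd, each_tail, within_1sd_general))
```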
Finding tail probabilities (P given x)
P(X < x1) = …, P(X < x2) = …
In R, pnorm(q = x1, mean = …, sd = …)
Input: quantile and parameters
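A short sketch of `pnorm()` in use. The mean, SD, and cutoff below are hypothetical numbers chosen for illustration, not values from the lecture.

```r
mu    <- 163   # hypothetical mean (e.g., height in cm)
sigma <- 6     # hypothetical standard deviation
x1    <- 157   # hypothetical cutoff, one SD below the mean

left_tail  <- pnorm(q = x1, mean = mu, sd = sigma)   # P(X < x1)
right_tail <- pnorm(q = x1, mean = mu, sd = sigma,
                    lower.tail = FALSE)              # P(X > x1)

print(c(left = left_tail, right = right_tail))
```

By default `pnorm()` returns the left-tail area; `lower.tail = FALSE` gives the right tail, which equals one minus the left tail.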
Finding quantiles (x given P)
P(X < …) = P1 (left tail area = P1)
In R, qnorm(p = P1, mean = …, sd = …)
Input: left tail probability and parameters
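`qnorm()` is the inverse of `pnorm()`: it takes a left-tail probability and returns the corresponding quantile. A sketch with hypothetical parameters:

```r
mu    <- 163    # hypothetical mean
sigma <- 6      # hypothetical standard deviation
p1    <- 0.975  # left-tail probability of interest

x_p <- qnorm(p = p1, mean = mu, sd = sigma)   # about mu + 1.96 * sigma

# Round trip: feeding the quantile back into pnorm() recovers p1.
p_back <- pnorm(q = x_p, mean = mu, sd = sigma)

print(c(quantile = x_p, probability = p_back))
```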
Use the property of symmetry

P(X < μ − k) = Pk
P(X > μ + k) = Pk
P(X < μ + k) = 1 − Pk

Probabilities between two values
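The symmetry property and the probability between two values can be illustrated as follows. This is a sketch with hypothetical values of `mu`, `sigma`, and `k`; the probability between two values is the difference of two left-tail areas.

```r
mu    <- 163   # hypothetical mean
sigma <- 6     # hypothetical standard deviation
k     <- 9     # hypothetical distance from the mean

p_k   <- pnorm(mu - k, mean = mu, sd = sigma)                      # P(X < mu - k)
upper <- pnorm(mu + k, mean = mu, sd = sigma, lower.tail = FALSE)  # P(X > mu + k)
below <- pnorm(mu + k, mean = mu, sd = sigma)                      # P(X < mu + k)

# Probability between two values: subtract the smaller left tail from the larger.
between <- pnorm(mu + k, mean = mu, sd = sigma) -
  pnorm(mu - k, mean = mu, sd = sigma)

print(c(p_k = p_k, upper = upper, below = below, between = between))
```

Here `upper` equals `p_k`, `below` equals `1 - p_k`, and `between` equals `1 - 2 * p_k`.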
Z-scores
• The z-score is a standard normal variable, following a normal distribution
with mean zero and unit standard deviation
• The z-score is used to transform a normally distributed variable X with mean μ
and SD σ into a variable that follows the standard normal distribution
• Z = (X − μ) / σ
• Z ~ N(0,1)
• When we standardize by finding z-scores, we change the normal
distribution by moving the location (mean moves to zero) and changing the
scale (SD moves to 1)
• Check Workshop!
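Standardizing subtracts the mean and divides by the SD. The sketch below uses hypothetical values of `mu`, `sigma`, and `x` to show that probabilities computed on the original scale and on the z-scale agree.

```r
mu    <- 163                     # hypothetical mean
sigma <- 6                       # hypothetical standard deviation
x     <- c(151, 157, 163, 172)   # hypothetical observations

z <- (x - mu) / sigma            # z-scores: -2, -1, 0, 1.5

# Left-tail probabilities are the same on either scale.
p_original <- pnorm(x, mean = mu, sd = sigma)
p_standard <- pnorm(z)           # standard normal: mean 0, sd 1

print(z)
print(all.equal(p_original, p_standard))
```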
Quantile-Quantile plot (Q-Q plot)
• A Q-Q plot is designed to compare two probability distributions by
plotting their quantiles against each other.
• Many statistical methods are developed under a normality assumption
• A Q-Q plot used to check normality is called a normal Q-Q plot
• We obtain data and plan to use a statistical method that assumes
normality
• We need to check whether the method is appropriate for our data.
• Try a Q-Q plot, which is a scatterplot of the quantiles of the data vs. the
(theoretical) quantiles of the normal distribution
Example: Annual Precipitation in US Cities
The average amount of
rainfall in inches for
each of 70 US cities
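A normal Q-Q plot for precipitation data can be sketched as below. It assumes that the example corresponds to R's built-in `precip` data set (average annual precipitation for 70 cities); the slide does not name the data source, so treat that as a guess. The plot is built by hand first to show what `qqnorm()` computes.

```r
rain <- as.numeric(precip)   # built-in data set; assumed to match the example
n <- length(rain)

# Q-Q points by hand: sorted data vs. standard normal quantiles
# evaluated at evenly spaced probabilities.
sample_q <- sort(rain)
theo_q   <- qnorm(ppoints(n))

plot(theo_q, sample_q, xlab = "Theoretical quantiles",
     ylab = "Sample quantiles", main = "Normal Q-Q plot (by hand)")

# Built-in equivalent, with a reference line through the quartiles.
qq <- qqnorm(rain)
qqline(rain)
```

Points that fall close to the reference line suggest that the normality assumption is reasonable; systematic curvature suggests skewness or heavy tails.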
