Unit 05 - Sampling Distributions With Solutions - 1 Per Page

This document provides an overview of sampling distributions and the central limit theorem. It discusses how the binomial distribution models experiments with dichotomous outcomes and a fixed number of independent trials. For large samples, the binomial distribution is well approximated by the normal distribution. The mean and variance of the binomial are defined. For large n, the sampling distribution of the sample proportion p̂ is also approximately normal, with mean equal to the population proportion p and variance p(1 − p)/n, which is inversely proportional to n. Examples are provided to illustrate these concepts.

Uploaded by

Kase1

Unit 5: Sampling Distributions

Chapter 5 in IPS
Unit 5 Outline: Sampling Distributions

• The Binomial Probability Distribution

• Normal Approximation to the Binomial
• p̂ = X/n and its sampling distribution
• Law of Large Numbers and the Central Limit Theorem
• Sampling Distribution of X̄

The Binomial Distribution

• When sampling n subjects randomly (independently) from a population with a dichotomous (two-level, like yes/no) characteristic:
• X will be the total number with the characteristic labeled 'success'
• p̂ = X/n will be the sample proportion of subjects labeled 'success'
• Main example is yes/no answers to polls
• Think coin flips

The Binomial Probability Model
• Prototype for many simple experiments and surveys
• Characterized by 4 properties
1) Fixed number (n) of observations, or 'trials'.
2) The n trials are all independent of each other.
3) Each trial has two possible outcomes: 'success' or 'failure'.
4) The probability (denoted by p) of success at each trial is constant.
• Let X = the total number of successes in the n independent trials. X has a binomial distribution with parameters n and p.
• The possible values of the binomial random variable X are 0, 1, 2, …, n
• The probability that X = k, where k = 0, 1, 2, …, n, is given by
$$P(X = k) = \binom{n}{k} p^k (1 - p)^{n - k}$$
where
$$\binom{n}{k} = \frac{n!}{k!\,(n - k)!}, \qquad n! = n(n - 1)(n - 2) \cdots (1)$$

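The binomial probability formula can be sketched in a few lines of Python. This is only an illustration: the function name `binomial_pmf` and the coin-flip values n = 4, p = 0.5 are choices made here, not something taken from the slides.

```python
from math import comb

def binomial_pmf(k: int, n: int, p: float) -> float:
    """P(X = k) for X ~ Bin(n, p): C(n, k) * p^k * (1 - p)^(n - k)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

# Full distribution of the number of heads in n = 4 fair coin flips.
coin_pmf = [binomial_pmf(k, 4, 0.5) for k in range(5)]

print(coin_pmf)       # probabilities for k = 0, 1, 2, 3, 4
print(sum(coin_pmf))  # a probability distribution sums to 1
```

Summing the probabilities over k = 0, …, n is a quick sanity check that the formula was coded correctly.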
Combinatorics
n n!
• So what does    mean?
 k  k!(n  k )!
• It counts, out of a total of n individuals, the number of
ways to select k individuals to form one group and (n – k)
individuals to form the other group (only two options)

• Or in the context of counting # of heads in n total coin flips,

k flips are heads, and (n – k) flips are tails.

• Simple example: n = 4, k = 2
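For n = 4 and k = 2 the formula gives 4!/(2!·2!) = 24/4 = 6. A short sketch can list those six arrangements explicitly; labeling the flips 1 to 4 and the variable names are illustrative choices.

```python
from itertools import combinations
from math import comb

flips = [1, 2, 3, 4]                          # label the n = 4 flips
heads_choices = list(combinations(flips, 2))  # which k = 2 flips come up heads

for heads in heads_choices:
    # Write out the H/T sequence for this choice of heads positions.
    print("".join("H" if flip in heads else "T" for flip in flips))

# The enumeration and the counting formula should agree.
print(len(heads_choices), comb(4, 2))
```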

The Binomial Distribution: Example

• If a couple are both carriers of a certain disease, their child has probability 0.25 of being born with the disease. Suppose that a couple has 4 children:
  ◦ What is the probability that none of their children have the disease?
  ◦ What is the probability that at least two children have the disease?
• Let’s use the formula to calculate the answers, and the table to check both calculations
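As a check on the hand calculation, here is a minimal sketch that applies the binomial formula with n = 4 and p = 0.25. The helper `binomial_pmf` is defined here for illustration; "at least two" is computed through the complement, 1 − P(X = 0) − P(X = 1).

```python
from math import comb

def binomial_pmf(k: int, n: int, p: float) -> float:
    """P(X = k) for X ~ Bin(n, p)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

n, p = 4, 0.25  # 4 children, each with probability 0.25 of having the disease

p_none = binomial_pmf(0, n, p)                                      # (0.75)^4
p_at_least_two = 1 - binomial_pmf(0, n, p) - binomial_pmf(1, n, p)  # complement rule

print(round(p_none, 4))          # 0.3164
print(round(p_at_least_two, 4))  # 0.2617
```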

Mean and Variance of a Binomial R.V.
Let X ~ B(n, p). Then the mean, variance and standard deviation of X are:
$$\mu_X = np$$
$$\sigma_X^2 = np(1 - p)$$
$$\sigma_X = \sqrt{np(1 - p)}$$
(Derivations are on pg 320 in IPS; you will not be asked to reproduce the derivation.)

Example: What are the mean and standard deviation of the number of children who will have the disease (from the previous example)? How are the mean and standard deviation interpreted here?
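The arithmetic for this example is short enough to sketch directly, using n = 4 and p = 0.25 from the previous example (the variable names are illustrative).

```python
from math import sqrt

n, p = 4, 0.25

mean_x = n * p                # expected number of affected children
sd_x = sqrt(n * p * (1 - p))  # standard deviation of that count

print(mean_x)          # 1.0
print(round(sd_x, 3))  # 0.866
```

One way to read the result: across many such families of four, the number of affected children averages 1 and typically differs from that average by a little under 1.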

Based on these calculations and the shape of the binomial probability histogram, we can then assert...

Normal approximation to the distribution of a
binomial variable
• Draw a Simple Random Sample (SRS) of size n from a
population having proportion p of successes. Let X be the
number of successes.

• The Binomial distribution is awkward to use when n is large, but for large n the binomial distribution looks very much like the normal distribution
• When n is large, approximately:
$$X \sim N\left(\mu = np,\ \sigma = \sqrt{np(1 - p)}\right)$$
• The approximation is accurate as long as both np ≥ 10 and n(1 – p) ≥ 10.
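The rule of thumb is easy to apply mechanically. The sketch below checks it for four (n, p) pairs; the helper name `normal_approx_ok` and the `threshold` parameter are hypothetical names introduced here.

```python
def normal_approx_ok(n: int, p: float, threshold: float = 10) -> bool:
    """Rule of thumb: both np and n(1 - p) must be at least the threshold."""
    return n * p >= threshold and n * (1 - p) >= threshold

cases = [(4, 0.5), (50, 0.5), (10, 0.25), (50, 0.25)]
results = {case: normal_approx_ok(*case) for case in cases}

for (n, p), ok in results.items():
    print(f"n = {n}, p = {p}: np = {n * p}, n(1 - p) = {n * (1 - p)}, ok = {ok}")
```

Only the two n = 50 cases pass, which matches the intuition that small samples give a histogram too coarse or too skewed to look Normal.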

[Figure: probability histograms (vertical axis: Density) of four binomial random variables: X1 ~ Bin(n = 4, p = 0.5), X2 ~ Bin(n = 50, p = 0.5), X3 ~ Bin(n = 10, p = 0.25), X4 ~ Bin(n = 50, p = 0.25)]

Example (similar to Example 5.21, IPS p325)

A worker inspects a simple random sample of 100 switches from a large shipment. 10% of the switches in shipments are usually bad switches. What is the probability that the number of bad switches is at least 17?

Approximating the count of bad switches with a Normal Density

Solution

$$P(X \ge 17) = P\left(\frac{X - np}{\sqrt{np(1 - p)}} \ge \frac{17 - np}{\sqrt{np(1 - p)}}\right)$$
$$\approx P\left(Z \ge \frac{17 - 100(0.10)}{\sqrt{100(0.10)(0.90)}}\right) = P\left(Z \ge \frac{7}{3}\right) = P(Z \ge 2.33) = 0.0099$$

Using the exact binomial distribution:

P(X ≥ 17) = [P(X = 17) + P(X = 18) + … + P(X = 100)] = 0.0206

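Both numbers in this solution can be reproduced with the standard library. The sketch below is one possible way to do it: the Normal tail comes from `math.erfc` rather than a printed table, and the helper names are illustrative. Because it uses the unrounded z = 7/3, the Normal result differs slightly in the last digit from the table value for z = 2.33.

```python
from math import comb, erfc, sqrt

def normal_upper_tail(z: float) -> float:
    """P(Z >= z) for a standard Normal Z, via the complementary error function."""
    return 0.5 * erfc(z / sqrt(2))

def binomial_upper_tail(k_min: int, n: int, p: float) -> float:
    """Exact P(X >= k_min) for X ~ Bin(n, p), summing the binomial formula."""
    return sum(comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(k_min, n + 1))

n, p = 100, 0.10
mu = n * p                     # 10 bad switches expected
sigma = sqrt(n * p * (1 - p))  # 3

z = (17 - mu) / sigma          # 7/3
approx = normal_upper_tail(z)
exact = binomial_upper_tail(17, n, p)

print(f"Normal approximation: {approx:.4f}")
print(f"Exact binomial:       {exact:.4f}")
```

The approximation is roughly half the exact value here. Bin(100, 0.10) only just meets np ≥ 10 and is still noticeably right-skewed, so the Normal curve underestimates the upper tail.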
Sampling Distribution of a Sample Proportion (p̂)
Setup…
  ◦ For a two-level characteristic (success/failure) in a population which has a true proportion of success, p
  ◦ We will take one sample of size n from the population and compute the sample proportion
$$\hat{p} = \frac{X}{n}$$
  ◦ The sampling distribution of the random variable p̂ is the theoretical distribution of values we would see if we could look at the sample proportions in all possible random samples of size n.
• Then…

Sampling Distribution of a Sample Proportion

• Since p̂ = X/n is just a linear transformation of X (which is a binomial random variable), dividing the mean and standard deviation of X by n gives:
$$\mu_{\hat{p}} = p$$
$$\sigma_{\hat{p}} = \sqrt{\frac{p(1 - p)}{n}}$$
• Also, if n is large and p is not 'too close' to 0 or 1, then the sampling distribution of p̂ is…
  ◦ [Approximately] Normally distributed

Again, under the conditions: np ≥ 10 and n(1 – p) ≥ 10
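A small simulation can make these two formulas concrete. This is a sketch under assumed values: n = 100, p = 0.3, 5000 simulated samples, and a fixed random seed are all choices made here for illustration.

```python
import random
from math import sqrt
from statistics import mean, pstdev

random.seed(0)      # fixed seed so the run is repeatable

n, p = 100, 0.3     # illustrative values; np = 30 and n(1 - p) = 70
num_samples = 5000  # number of simulated samples of size n

def sample_proportion(n: int, p: float) -> float:
    """Simulate n independent success/failure trials and return p-hat = X / n."""
    successes = sum(random.random() < p for _ in range(n))
    return successes / n

p_hats = [sample_proportion(n, p) for _ in range(num_samples)]

simulated_mean = mean(p_hats)
simulated_sd = pstdev(p_hats)
theoretical_sd = sqrt(p * (1 - p) / n)

print(f"mean of p-hat: simulated {simulated_mean:.4f}, theory {p}")
print(f"SD of p-hat:   simulated {simulated_sd:.4f}, theory {theoretical_sd:.4f}")
```

The simulated mean and standard deviation should land close to p and √(p(1 − p)/n); a histogram of `p_hats` would look roughly bell-shaped.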

Interpretation, and a return to some examples

• The main implications of the result
  ◦ The sampling distribution of p̂ is centered over the true population proportion, p.
  ◦ Note the formula for the standard deviation of p̂. What happens as n increases?
  ◦ The standard deviation also depends on the unknown parameter p.
• Summarized in the graphics from IPS, Sections 3.4 and 5.1
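To see what happens as n increases, the standard deviation formula √(p(1 − p)/n) can be tabulated for a few sample sizes. The value p = 0.5 and the sample sizes below are illustrative choices.

```python
from math import sqrt

def sd_of_p_hat(p: float, n: int) -> float:
    """Standard deviation of the sample proportion: sqrt(p(1 - p) / n)."""
    return sqrt(p * (1 - p) / n)

p = 0.5  # illustrative value of the (usually unknown) population proportion
sds = {n: sd_of_p_hat(p, n) for n in (100, 400, 1600)}

for n, sd in sds.items():
    print(n, sd)
```

Each fourfold increase in n cuts the standard deviation in half, because n enters the formula through a square root.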

From IPS

More generally (graphic from IPS, Section 5.2)…

Gallup Poll: Supreme Court Healthcare Ruling
• “Americans are sharply divided over Thursday's Supreme Court decision on the 2010 healthcare law, with 46% both agree and disagree with the Healthcare ruling.”
• “Results are based on telephone interviews with 1,012 national adults, aged 18 and older, conducted June 26-28, 2012, as part of Gallup Poll Daily tracking. For results based on the total sample of national adults, one can say with 95% confidence that the maximum margin of sampling error is ±4 percentage points.”
• “In addition to sampling error, question wording and practical difficulties in conducting surveys can introduce error or bias into the findings of public opinion polls.”

Gallup polls
• What supports the claim: “Results are based on telephone interviews with 1,012 national adults, aged 18 and older, conducted June 26-28, 2012, as part of Gallup Poll Daily tracking. For results based on the total sample of national adults, one can say with 95% confidence that the maximum margin of sampling error is ±4 percentage points.”
• With n = 1,012 the standard deviation (called the standard error) of the sample proportion will be:
$$\sigma_{\hat{p}} = \sqrt{\frac{p(1 - p)}{1012}} = \frac{\sqrt{p(1 - p)}}{31.81} = 0.0314\,\sqrt{p(1 - p)}$$

Gallup Poll, continued…

$\sqrt{p(1 - p)}$ is at its largest when p = ½. Why? Look at the graph of f(x) = x(1 − x) over the range x ∈ [0, 1].

So the largest value of
$$\sigma_{\hat{p}} = \sqrt{\frac{p(1 - p)}{1012}} = 0.0314\,\sqrt{p(1 - p)}$$
will be
$$0.0314\,\sqrt{\left(\tfrac{1}{2}\right)\left(\tfrac{1}{2}\right)} = 0.0157$$

Where do we find 95% of the observations of a normal distribution?

We will return to this idea when we study confidence intervals.
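As a rough numerical companion to the question above: about 95% of a Normal distribution lies within roughly 2 standard deviations of its mean (1.96 to be more precise). The sketch below applies that to the largest possible standard error for n = 1,012; the variable names are illustrative.

```python
from math import sqrt

n = 1012

max_sd = sqrt(0.5 * 0.5 / n)  # standard error is largest at p = 1/2
margin_2sd = 2 * max_sd       # "about 2 standard deviations"
margin_196 = 1.96 * max_sd    # the more precise 95% multiplier

print(f"largest standard error: {max_sd:.4f}")
print(f"2 SD margin:            {margin_2sd:.4f}")
print(f"1.96 SD margin:         {margin_196:.4f}")
```

Both margins come out near 3 percentage points, which sits inside the ±4 percentage point maximum that Gallup reports.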

Compact (and technical) summary…
Suppose that X ~ B(n, p). Then

X is approximately distributed as
$$N\left(\mu_X = np,\ \sigma_X = \sqrt{np(1 - p)}\right)$$
p̂ is approximately distributed as
$$N\left(\mu_{\hat{p}} = p,\ \sigma_{\hat{p}} = \sqrt{\frac{p(1 - p)}{n}}\right)$$
Reminder: the approximation is a good one provided that np ≥ 10 and n(1 – p) ≥ 10.

An old practice problem (with some new parts)
Ultrasound is often used to determine the sex of an unborn baby. However, because the procedure relies on visual detection of anatomic differences between male and female babies, the error rates differ according to whether the baby is a boy or a girl.
  ◦ Pr(Ultrasound predicts male | baby is male) = 75%
  ◦ Pr(Ultrasound predicts female | baby is female) = 90%

(a) Consider an individual woman who comes to the clinic for an ultrasound to predict her baby’s sex. What is the probability that the ultrasound gives the wrong result? (Answer = 0.175)

(b) Suppose a clinic performs 10 ultrasounds a day. What is the probability of two or more incorrect sex determinations?

(c) What is the probability that a baby is male, given that the ultrasound predicts a male?

(d) Suppose a clinic performs 100 ultrasounds in a month. What is the probability that 25 or more ultrasounds make incorrect determinations?

Tables of binomial probability distributions

Solutions
Let B = {baby is a boy}, B^C = {baby is a girl}
A = {ultrasound predicts boy}, A^C = {ultrasound predicts girl}
The calculations below take P(B) = P(B^C) = 0.50.

(a)
$$P(\text{wrong result}) = P(A \text{ and } B^C) + P(A^C \text{ and } B)$$
$$= P(A \mid B^C)P(B^C) + P(A^C \mid B)P(B)$$
$$= 0.10(0.50) + 0.25(0.50) = 0.175$$

(b)
$$P(\text{at least 2 out of 10 are wrong predictions}) = 1 - P(\text{0 or 1 wrong predictions}) = 1 - P(\text{10 or 9 correct predictions})$$
$$= 1 - \left[(0.825)^{10} + 10(0.175)^{1}(0.825)^{9}\right] = 1 - [0.146 + 0.310] = 0.544$$

(c)
$$P(\text{boy} \mid \text{ultrasound predicts boy}) = P(B \mid A) = \frac{P(A \mid B)P(B)}{P(A \mid B)P(B) + P(A \mid B^C)P(B^C)}$$
$$= \frac{0.75(0.50)}{0.75(0.50) + 0.10(0.50)} = \frac{0.375}{0.425} = 0.882$$

(d) Let X ~ Bin(n = 100, p = 0.175); then approximately
$$X \sim N\left(\mu = 17.5,\ \sigma = \sqrt{100(0.175)(0.825)} = 3.80\right)$$
So,
$$P(\text{at least 25 wrong calls out of 100}) = P(X \ge 25) = P\left(\frac{X - 17.5}{3.80} \ge \frac{25 - 17.5}{3.80}\right) = P(Z \ge 1.97) = 0.0244$$

Law of Large Numbers: Sampling Results for x̄
• When thinking about the average of a population, the notation often used is
  ◦ population mean: μ
  ◦ sample mean: x̄
• The Law of Large Numbers
  ◦ If one draws independent observations from a population with (finite) mean μ, then as the number of observations increases, the sample mean eventually becomes arbitrarily close to (and stays close to) the population mean.
  ◦ We can summarize the specific results much like we did for sample proportions…
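The Law of Large Numbers can be watched in action with a simulation. The sketch below uses rolls of a fair six-sided die (population mean 3.5) as a stand-in population; the die, the seed, and the checkpoint sizes are all illustrative assumptions.

```python
import random

random.seed(1)  # fixed seed so the run is repeatable

population_mean = 3.5  # mean of a fair six-sided die
checkpoints = (10, 100, 1_000, 10_000, 100_000)

running_total = 0
running_means = {}
for i in range(1, checkpoints[-1] + 1):
    running_total += random.randint(1, 6)  # one more independent observation
    if i in checkpoints:
        running_means[i] = running_total / i

for size, x_bar in running_means.items():
    print(size, round(x_bar, 4), round(abs(x_bar - population_mean), 4))
```

The gap between the sample mean and 3.5 tends to shrink as the number of observations grows, although it need not shrink at every single step.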

The Sampling Distribution
for the Sample Mean
Notation:
$$E(\bar{x}) = \mu_{\bar{x}} = \mu = E(X)$$
where μ is the population mean of individuals (also written E(X)), and
$$SD(\bar{x}) = \sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}} = \frac{SD(X)}{\sqrt{n}}$$
where σ is the population standard deviation (also written SD(X)).

Interpretation: The average of the sampling distribution of the sample mean is the population mean, and the standard deviation of the sample mean is the population standard deviation divided by the square root of the sample size.

It is also known that the distribution of the sample mean will be approximately normally distributed. More on this in a few slides.

Sampling Distribution of the sample mean

$$E(\bar{x}) = \mu_{\bar{x}} = \mu = E(X)$$
$$SD(\bar{x}) = \sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}} = \frac{SD(X)}{\sqrt{n}}$$

Compact (and technical) summary…
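Arbitrary Random Variables: distribution of the sample mean
$$E(\bar{x}) = \mu_{\bar{x}} = \mu = E(X)$$
$$SD(\bar{x}) = \sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}} = \frac{SD(X)}{\sqrt{n}}$$

Binomial Random Variables
X is approximately distributed as
$$N\left(\mu_X = np,\ \sigma_X = \sqrt{np(1 - p)}\right)$$
p̂ is approximately distributed as
$$N\left(\mu_{\hat{p}} = p,\ \sigma_{\hat{p}} = \sqrt{\frac{p(1 - p)}{n}}\right)$$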

March 2,
A typical problem…
It is known that math SAT scores in the entire US population (in 2007) are approximately normal with an average of 515 with a standard deviation of 100.

(a) What is the probability a randomly selected SAT-taker scores above 550 on math?

(b) What is the probability that the average SAT score for 20 people in a random sample is above 550?

[Figure: density curve of SAT_math; horizontal axis from 0 to 1000, vertical axis Density]

Solution

(a) What is the probability a randomly selected SAT-taker scores above 550
on math?

Let X = SAT math score of a random test-taker. Then:

$$P(X > 550) = P\left(\frac{X - \mu_X}{\sigma_X} > \frac{550 - 515}{100}\right) = P(Z > 0.35) = 0.363$$

(b) What is the probability that the average SAT score for 20 people in a random sample is above 550?

Let X̄ = mean SAT math score of a random sample of 20 test-takers. Then:

$$P(\bar{X} > 550) = P\left(\frac{\bar{X} - \mu_{\bar{X}}}{\sigma_{\bar{X}}} > \frac{550 - 515}{100/\sqrt{20}}\right) = P(Z > 1.57) = 0.058$$

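The two SAT calculations differ only in the standard deviation used: σ for one test-taker, σ/√n for the mean of n test-takers. A minimal sketch, with the Normal tail computed from `math.erfc` instead of a table (the helper name is illustrative):

```python
from math import erfc, sqrt

def normal_upper_tail(z: float) -> float:
    """P(Z > z) for a standard Normal Z."""
    return 0.5 * erfc(z / sqrt(2))

mu, sigma, n = 515, 100, 20

# (a) One randomly selected test-taker.
z_single = (550 - mu) / sigma
p_single = normal_upper_tail(z_single)

# (b) Mean of a random sample of n = 20 test-takers.
se_mean = sigma / sqrt(n)      # standard deviation of the sample mean
z_mean = (550 - mu) / se_mean
p_mean = normal_upper_tail(z_mean)

print(f"(a) z = {z_single:.2f}, P = {p_single:.3f}")
print(f"(b) SE = {se_mean:.2f}, z = {z_mean:.2f}, P = {p_mean:.3f}")
```

Part (b) uses the unrounded z of about 1.565, so its probability can differ in the third decimal place from the table value for z = 1.57.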
Two central ideas: IPS, p 302-303

Central Limit Theorem
• What this means is, no matter what the underlying distribution of the individual observations of X is (as long as it has a finite standard deviation), if you take a large enough sample, the sampling distribution of the random variable X̄ will be:

[Approximately] Normally Distributed!!!

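A simulation gives a feel for the theorem. The sketch below draws sample means from a strongly right-skewed population and checks how often they fall within 1.96 standard errors of the population mean, which should happen about 95% of the time if the Normal approximation holds. The Exponential(1) population, n = 50, the number of repetitions, and the seed are all illustrative assumptions.

```python
import random
from math import sqrt
from statistics import mean

random.seed(2)  # fixed seed so the run is repeatable

mu = sigma = 1.0    # an Exponential(rate = 1) population has mean 1 and SD 1
n = 50              # size of each sample
num_samples = 4000  # number of simulated samples
se = sigma / sqrt(n)

# One sample mean per simulated sample of n skewed observations.
sample_means = [
    mean(random.expovariate(1.0) for _ in range(n)) for _ in range(num_samples)
]

# Fraction of sample means within 1.96 standard errors of mu.
within = sum(abs(x_bar - mu) <= 1.96 * se for x_bar in sample_means) / num_samples

print(f"mean of sample means: {mean(sample_means):.4f}")
print(f"fraction within 1.96 SE: {within:.3f}")
```

Rerunning with a small n (say 2 or 5) would show the fraction drifting away from 0.95 and the histogram of sample means staying visibly skewed.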
Example Problem

It is known that [Personal Per Capita] Income in Massachusetts has a mean of $60,000 and standard deviation of $20,000.

(a) If we select one individual out of the entire Massachusetts population, what is the probability of selecting someone whose income is greater than $70,000?

(b) If we select a random sample of 25 individuals from the Mass population, what is the probability that the average income in your sample will be greater than $70,000?

Example Problem
(a) If we select one individual out of the entire Massachusetts population, what is the probability of selecting someone whose income is greater than $70,000?

Great question…we could try to use the Normal Distribution here, but the calculation would not be very accurate. Why not? It's likely a very right-skewed distribution (and the CLT does not take hold here since n = 1).

(b) If we select a random sample of 25 individuals from the Mass population, what is the probability that the average income in your sample will be greater than $70,000?

This we can compute. Let X̄ be the random variable for the mean income of a random sample of 25 people. Then:

$$P(\bar{X} > 70{,}000) = P\left(\frac{\bar{X} - \mu_{\bar{X}}}{\sigma_{\bar{X}}} > \frac{70{,}000 - 60{,}000}{20{,}000/\sqrt{25}}\right) = P(Z > 2.5) = 0.0062$$

Recap of important ideas…
• Random variables are an abstract way to describe the numerical outcome of an experiment, survey, or study
  ◦ they come in 2 forms: discrete and continuous
• Probability distributions (example: binomial model) are used to describe the distribution of random variables
• Probability distribution models also apply to summary statistics, such as the sample proportion and sample mean
  ◦ The probability model of a summary statistic is often called its sampling distribution
• The sampling distributions of most summary statistics
  ◦ Have an expected value (mean) that is identical to the population parameter of interest (population mean, population proportion)
  ◦ Have a standard deviation that decreases as the sample size increases (by Law of Large Numbers)
• The Normal distribution can be used to approximate the sampling distribution of a sample mean for large enough samples (by the CLT), no matter how the individuals are distributed in the population!!!
