
12 - Sampling Distribution of Statistics

This document discusses key concepts related to sampling distributions including parameters, statistics, point estimation, and the law of large numbers. It provides examples to illustrate point estimation and explains that as sample size increases, sample statistics get closer to population parameters according to the law of large numbers.

Uploaded by

arpanksidhu044

12: Sampling Distribution of Statistics

Parameters & Statistics


• A parameter is a number that describes the population.
 In practice, the value of a parameter is not known because we can rarely examine the entire
population.

• A statistic is a number that can be computed from the sample data without making
use of any unknown parameters.
 In practice, we often use a statistic to estimate an unknown parameter.



Parameters & Statistics
• Memory Trick:
 statistics come from samples and
 parameters come from populations
• Notation:
 Population Level:
• µ (the Greek letter mu) for the mean of the population and
• σ for the standard deviation of the population.
 Sample Level:
• x̄ (“x-bar”) for the mean of the sample and
• s for the standard deviation of the sample.



Statistical Estimation
• The process of statistical inference involves using information from a sample to draw conclusions
about a wider population.

• Different random samples yield different statistics (recall sampling error). We need to be able to
describe the sampling distribution of possible statistic values in order to perform statistical
inference.

• We can think of a statistic as a random variable because it takes numerical values that describe the
outcomes of the random sampling process.

• Therefore, we can examine the probability distribution of a statistic using what we’ve learned so far.



Statistical Inference
Statistical inference: the process of using data obtained from a sample to make estimates and test hypotheses about the characteristics of a population.

[Diagram: Population (unknown parameters, hypothesis) → collect data from a representative sample → calculate statistics → make an inference about the population based on the sample]



Point Estimation
• Point estimation is a form of statistical inference.
 We use the data from the sample to compute a value of a sample statistic
 that sample statistic serves as an estimate of a population parameter.

• Point estimators for population parameters
 x̄ is the point estimator of the population mean, μ
 s is the point estimator of the population standard deviation, σ
 p̄ is the point estimator of the population proportion, p



Point Estimation
Example: St. Andrew’s College

• St. Andrew’s College received 900 applications from prospective students. The application form
contains a variety of information including:
 the individual’s (SAT) score
 and whether or not the individual desires on-campus housing

• At a meeting in a few hours, the Director of Admissions would like to announce the following for the
population of 900 applicants:
 the average SAT score, μ, and
 the proportion of applicants that want to live on campus, p



Point Estimation
Example: St. Andrew’s College
• However, the necessary data on the applicants have not yet been entered in the
college’s computerized database.
• The Director decides to estimate the values of the population parameters of interest
based on sample statistics.
• A sample of 30 applicants is selected using computer-generated random numbers.
The following data summaries were obtained:

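$n = 30, \quad \sum x_i = 50{,}520, \quad \sum y_i = 20$

where x_i is the SAT score of sampled applicant i, and y_i = 1 if sampled applicant i wants on-campus housing (0 otherwise).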


Point Estimation
Example: St. Andrew’s College
• The following point estimates were calculated from the sample:
• x̄ as a point estimator for μ

$\bar{x} = \dfrac{\sum x_i}{n} = \dfrac{50{,}520}{30} = 1684$

• s as a point estimator for σ

$s = \sqrt{\dfrac{\sum (x_i - \bar{x})^2}{n - 1}} = \sqrt{\dfrac{210{,}512}{29}} = 85.2$

• p̄ as a point estimator for p

$\bar{p} = \dfrac{\sum y_i}{n} = \dfrac{20}{30} = 0.67$

Note: Different random numbers would have identified a different sample, which would have resulted in different point estimates (sampling error).
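
The three point estimates can be reproduced from the sample summaries alone. This is a minimal sketch; the variable names are illustrative, and the sum of squared deviations (210,512) is taken from the slide rather than recomputed from raw scores.

```python
import math

# Sample summaries quoted in the St. Andrew's example.
n = 30                  # sample size
sum_x = 50_520          # sum of the sampled SAT scores
sum_sq_dev = 210_512    # sum of squared deviations from the sample mean
sum_y = 20              # sampled applicants wanting on-campus housing

x_bar = sum_x / n                      # point estimate of mu
s = math.sqrt(sum_sq_dev / (n - 1))    # point estimate of sigma (divides by n - 1)
p_bar = sum_y / n                      # point estimate of p

print(f"x_bar = {x_bar:.0f}, s = {s:.1f}, p_bar = {p_bar:.2f}")
```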
Point Estimation
Example: St. Andrew’s College
• Once all the data for the 900 applicants were entered in the college’s database, the
values of the population parameters of interest were calculated.
• The population mean SAT score,

$\mu = \dfrac{\sum x_i}{N} = \dfrac{1{,}527{,}300}{900} = 1697$

• The population standard deviation for SAT score,

$\sigma = \sqrt{\dfrac{\sum (x_i - \mu)^2}{N}} = 87.4$

• Population proportion wanting on-campus housing,

$p = \dfrac{\sum y_i}{N} = \dfrac{648}{900} = 0.72$
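
The population formula divides by N, while the sample formula divides by n − 1. The sketch below contrasts the two using Python's `statistics` module; the six scores are hypothetical and are not taken from the St. Andrew's data.

```python
import statistics

# Hypothetical scores, used only to contrast the two formulas.
scores = [1620, 1700, 1580, 1750, 1690, 1710]

pop_sd = statistics.pstdev(scores)     # divides by N: data are the whole population
sample_sd = statistics.stdev(scores)   # divides by n - 1: data are a sample

print(f"population formula: {pop_sd:.2f}")
print(f"sample formula:     {sample_sd:.2f}")
```

On the same data the sample formula always gives the larger value, because it divides the same sum of squares by a smaller number.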
Summary of Point Estimates
Example: St. Andrew’s College
Population Parameter                              | Parameter Value | Point Estimator                               | Point Estimate
μ = Population mean SAT score                     | 1697            | x̄ = Sample mean SAT score                     | 1684
σ = Population std. deviation for SAT score       | 87.4            | s = Sample standard deviation for SAT score   | 85.2
p = Population proportion wanting campus housing  | .72             | p̄ = Sample proportion wanting campus housing  | .67



The Law of Large Numbers
• Q: If x̄ is rarely exactly right and varies from sample to sample, why is it a reasonable
estimate of the population mean μ?

• A: If we keep on taking larger and larger samples, the statistic x̄ is guaranteed to get
closer and closer to the parameter μ.
 As the sample size increases, x̄ gets closer to μ

• The same can be said for p (the population proportion) and p̄ (the sample proportion)
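
The law of large numbers can be watched in a small simulation. This is only a sketch: it assumes SAT scores are Normally distributed with the example's μ = 1697 and σ = 87.4 (the slides do not state the population's shape), and the seed and sample sizes are arbitrary.

```python
import random

random.seed(12)  # fixed seed so the run is repeatable

MU, SIGMA = 1697, 87.4  # population values from the example; Normal shape is an assumption

def sample_mean(n):
    """Mean of n simulated draws from the assumed population."""
    return sum(random.gauss(MU, SIGMA) for _ in range(n)) / n

for n in (10, 100, 10_000, 100_000):
    x_bar = sample_mean(n)
    print(f"n = {n:>7}: x_bar = {x_bar:8.2f}, |x_bar - mu| = {abs(x_bar - MU):.2f}")
```

Any single run can wobble at small n; the gap between x̄ and μ tends to shrink as n grows.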



Sampling Distributions
• The law of large numbers assures us that if we measure enough subjects, the
statistic x̄ will eventually get very close to the unknown parameter μ.

• If we took every one of the possible samples of a certain size, calculated the sample
mean for each, and graphed all of those values, we’d have a sampling distribution.



Sampling Distributions
• The population distribution of a variable is the distribution of values of the variable
among all individuals in the population.
• The sampling distribution of a statistic is the distribution of values taken by the
statistic in all possible samples of the same size from the same population.
• NOTE:
 The population distribution describes the individuals that make up the population.
 A sampling distribution describes how a statistic varies in many samples from the population.

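For a tiny population, "all possible samples of the same size" can be listed outright. The sketch below uses a hypothetical four-value population and samples of size 2 drawn without replacement; neither comes from the slides.

```python
from itertools import combinations
from collections import Counter
from statistics import mean

population = [2, 4, 6, 8]   # hypothetical population, N = 4
n = 2                       # sample size

# Every possible sample of size n (drawn without replacement) and its mean.
sample_means = [mean(sample) for sample in combinations(population, n)]

# The sampling distribution: each possible value of x_bar with its probability.
counts = Counter(sample_means)
for value in sorted(counts):
    print(f"x_bar = {value}: probability {counts[value]}/{len(sample_means)}")

print("population mean:     ", mean(population))
print("mean of sample means:", mean(sample_means))
```

The population distribution here is the four values themselves; the sampling distribution is the list of six possible sample means, whose average matches the population mean.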


Population Distribution vs. Sampling Distribution

Population:
100 blue balls
100 red balls

Define:
p = proportion of red balls

[Figure: population distribution vs. sampling distribution of the proportion of red balls]



The sampling distribution of x̄
• Process of Statistical Inference

 Population with mean μ = ?
 A simple random sample of n elements is selected from the population.
 The sample data provide a value for the sample mean x̄.
 The value of x̄ is used to make inferences about the value of μ.



The sampling distribution of x̄
• The sampling distribution of x̄ is the probability distribution of all possible
values of the sample mean x̄
• Expected Value of x̄

E(x̄) = μ

where μ = the population mean

 When the expected value of the point estimator equals the population
parameter, we say the point estimator is unbiased.



The sampling distribution of x̄
• We will use the following notation to define the standard deviation of
the sampling distribution of x̄

σ_x̄ = the standard deviation of x̄
σ = the standard deviation of the population
n = the sample size
N = the population size



The sampling distribution of x̄
• The standard deviation of x̄

$\sigma_{\bar{x}} = \dfrac{\sigma}{\sqrt{n}}$

 where σ_x̄ is the standard error of the mean

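The σ/√n formula can be checked exactly on a small case. This sketch assumes a hypothetical four-value population and lists every ordered sample of size 2 drawn with replacement, so the draws are independent, which is the setting where σ/√n holds exactly.

```python
from itertools import product
from math import sqrt, isclose
from statistics import mean, pstdev

population = [2, 4, 6, 8]   # hypothetical population
n = 2                       # sample size

sigma = pstdev(population)  # population standard deviation

# All ordered samples of size n drawn with replacement (independent draws).
sample_means = [mean(sample) for sample in product(population, repeat=n)]

se_enumerated = pstdev(sample_means)   # spread of the sampling distribution
se_formula = sigma / sqrt(n)           # standard error from the formula

print(f"enumerated: {se_enumerated:.4f}, formula: {se_formula:.4f}")
print("match:", isclose(se_enumerated, se_formula))
```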


The Central Limit Theorem
• Most population distributions are not Normal.
• What is the shape of the sampling distribution of sample means when
the population distribution isn’t Normal?

• As the sample size increases, the distribution of sample means changes
its shape: it looks less like that of the population and more like a Normal
distribution!



The sampling distribution of x̄
• When the population has a normal distribution, the sampling distribution of x̄ is
normally distributed for any sample size.
• In most applications, the sampling distribution of x̄ can be approximated by a normal
distribution whenever the sample size is 30 or more.
• The sampling distribution of x̄ can be used to provide probability information about
how close the sample mean x̄ is to the population mean μ.



The Central Limit Theorem
• Draw an SRS of size n from any population with mean μ and finite standard
deviation σ. The central limit theorem says that when n is large, the sampling
distribution of the sample mean x̄ is approximately Normal:

$\bar{x} \text{ is approximately } N\!\left(\mu, \dfrac{\sigma}{\sqrt{n}}\right)$
• The central limit theorem allows us to use Normal probability calculations to
answer questions about sample means from many observations even when the
population distribution is not Normal.
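
A quick simulation illustrates the theorem on a clearly non-Normal population. This sketch assumes an exponential population with mean 1 (strongly right-skewed, and not from the slides) and uses skewness as a rough gauge of how far the distribution of sample means is from a symmetric, Normal-like shape; the seed and repetition count are arbitrary.

```python
import random
from statistics import fmean, pstdev

random.seed(7)  # fixed seed so the run is repeatable

def skewness(values):
    """Average cubed z-score: 0 for a symmetric shape, positive for right skew."""
    m, sd = fmean(values), pstdev(values)
    return fmean(((v - m) / sd) ** 3 for v in values)

def simulated_means(n, reps=5_000):
    """Means of `reps` samples of size n from an exponential population (mean 1)."""
    return [fmean(random.expovariate(1.0) for _ in range(n)) for _ in range(reps)]

for n in (1, 5, 30):
    means = simulated_means(n)
    print(f"n = {n:>2}: sd of means = {pstdev(means):.3f}, skewness = {skewness(means):.2f}")
```

The spread of the sample means should track σ/√n (here σ = 1), and the skewness should move toward 0 as n grows.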



The sampling distribution of x̄

$\bar{x} \text{ is approximately } N\!\left(\mu, \dfrac{\sigma}{\sqrt{n}}\right)$

$Z = \dfrac{\bar{x} - \mu}{\sigma / \sqrt{n}}$



The Central Limit Theorem



The sampling distribution of x̄
Example: St. Andrew’s College

$\sigma_{\bar{x}} = \dfrac{\sigma}{\sqrt{n}} = \dfrac{87.4}{\sqrt{30}} = 15.96$

$E(\bar{x}) = 1697$

[Figure: sampling distribution of x̄ for SAT scores, centered at E(x̄) = 1697]
The sampling distribution of x̄
Example: St. Andrew’s College
• What is the probability that a simple random sample of 30 applicants
will provide an estimate of the population mean SAT score that is within
+/-10 of the actual population mean μ?
• In other words, what is the probability that x̄ will be between 1687 and
1707?



The sampling distribution of x̄
Example: St. Andrew’s College
• We know that x̄ is approximately N(1697, 15.96)

• Step 1: Calculate the z-value at the upper endpoint of the interval.

$z = \dfrac{1707 - 1697}{15.96} = 0.63$

• Step 2: Find the area under the curve to the left of the upper endpoint.

$P(z \le 0.63) = .7357$



Sampling Distribution of x̄
Example: St. Andrew’s College
• Cumulative probabilities for Standard Normal Dist.

z .00 .01 .02 .03 .04 .05 .06 .07 .08 .09
. . . . . . . . . . .
.5 .6915 .6950 .6985 .7019 .7054 .7088 .7123 .7157 .7190 .7224
.6 .7257 .7291 .7324 .7357 .7389 .7422 .7454 .7486 .7517 .7549
.7 .7580 .7611 .7642 .7673 .7704 .7734 .7764 .7794 .7823 .7852
.8 .7881 .7910 .7939 .7967 .7995 .8023 .8051 .8078 .8106 .8133
.9 .8159 .8186 .8212 .8238 .8264 .8289 .8315 .8340 .8365 .8389
. . . . . . . . . . .



Sampling Distribution of x̄
Example: St. Andrew’s College

[Figure: sampling distribution of x̄ for SAT scores, σ_x̄ = 15.96; the area to the left of x̄ = 1707 is .7357; axis marks at 1697 and 1707]
The sampling distribution of x̄
Example: St. Andrew’s College
• Step 3: Calculate the z-value at the lower endpoint of the interval.

$z = \dfrac{1687 - 1697}{15.96} = -0.63$

• Step 4: Find the area under the curve to the left of the lower endpoint.

$P(z \le -0.63) = .2643$



The sampling distribution of x̄
Example: St. Andrew’s College

[Figure: sampling distribution of x̄ for SAT scores, σ_x̄ = 15.96; the area to the left of x̄ = 1687 is .2643; axis marks at 1687 and 1697]



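The sampling distribution of x̄
Example: St. Andrew’s College
• Step 5: Calculate the area under the curve between the lower and upper
endpoints of the interval.

$P(-0.63 \le z \le 0.63) = P(z \le 0.63) - P(z \le -0.63) = .7357 - .2643 = .4714$

• The probability that the sample mean SAT score will be between 1687
and 1707 is:

$P(1687 \le \bar{x} \le 1707) = .4714$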
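
The five steps can be sketched with `statistics.NormalDist` in place of the printed z-table. The variable names are illustrative. z is rounded to two decimals to mimic a table lookup; the unrounded calculation is shown alongside and gives a slightly different value.

```python
from math import sqrt
from statistics import NormalDist

mu, sigma, n = 1697, 87.4, 30     # values from the example
lower, upper = 1687, 1707         # interval of interest for x_bar

se = sigma / sqrt(n)              # standard error of the mean

# Round z to two decimals, as a printed z-table lookup would.
z_upper = round((upper - mu) / se, 2)
z_lower = round((lower - mu) / se, 2)

std_normal = NormalDist()         # standard normal: mean 0, sd 1
prob_table = std_normal.cdf(z_upper) - std_normal.cdf(z_lower)

# Same probability without rounding z.
x_bar_dist = NormalDist(mu, se)
prob_exact = x_bar_dist.cdf(upper) - x_bar_dist.cdf(lower)

print(f"se = {se:.2f}, z from {z_lower} to {z_upper}")
print(f"table-style: {prob_table:.4f}, unrounded: {prob_exact:.4f}")
```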



The sampling distribution of x̄
Example: St. Andrew’s College

[Figure: sampling distribution of x̄ for SAT scores, σ_x̄ = 15.96; the area between x̄ = 1687 and x̄ = 1707 is .4714; axis marks at 1687, 1697, and 1707]



The sampling distribution of p̄
• Making Inferences about a Population Proportion

 Population with proportion p = ?
 A simple random sample of n elements is selected from the population.
 The sample data provide a value for the sample proportion p̄.
 The value of p̄ is used to make inferences about the value of p.
The sampling distribution of p̄
• The sampling distribution of p̄ is the probability distribution of all possible
values of the sample proportion p̄
• Expected Value of p̄

E(p̄) = p

where p = the population proportion



The sampling distribution of p̄
• The standard deviation of p̄

$\sigma_{\bar{p}} = \sqrt{\dfrac{p(1 - p)}{n}}$

 where σ_p̄ is the standard error of the proportion



The sampling distribution of p̄

$\bar{p} \text{ is approximately } N\!\left(p, \sqrt{\dfrac{p(1 - p)}{n}}\right)$

$Z = \dfrac{\bar{p} - p}{\sqrt{p(1 - p)/n}}$



The sampling distribution of p̄
• The sampling distribution of p̄ can be approximated by a normal
distribution whenever the sample size is large enough to satisfy the two
conditions:
np > 5 and n(1 – p) > 5

• When these conditions are satisfied, the probability distribution of x in
the sample proportion, p̄ = x/n, can be approximated by a normal distribution
(and, because n is a constant, so can the sampling distribution of p̄).
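
The two conditions are easy to check in code. This is a minimal sketch; the function name `normal_approx_ok` and its `threshold` parameter are illustrative, with the threshold of 5 taken from the conditions above.

```python
def normal_approx_ok(n, p, threshold=5):
    """Return True when both np and n(1 - p) exceed the threshold."""
    return n * p > threshold and n * (1 - p) > threshold

print(normal_approx_ok(30, 0.72))   # np = 21.6 and n(1 - p) = 8.4
print(normal_approx_ok(30, 0.10))   # np = 3.0 fails the first condition
```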



Sampling Distributions
Summary of Central Limit Theorem
Scenario 1: Population distribution of x is normal
• When the population is normally distributed, the sampling distribution of x̄ is also
normally distributed

Scenario 2: Population distribution of x unknown or not normal & sample size at least 30
• According to the CLT, the sampling distribution of x̄ can be approximated by a normal distribution
whenever the sample size is 30 or more

Scenario 3: For proportions, np > 5 & n(1 – p) > 5
• According to the CLT, the sampling distribution of p̄ can be approximated by a normal distribution
whenever the sample size is large enough to satisfy the two conditions np > 5 & n(1 – p) > 5



The sampling distribution of p̄
Example: St. Andrew’s College
• Recall that 72% of the prospective students applying to St. Andrew’s
College desire on-campus housing.

• What is the probability that a simple random sample of 30 applicants
will provide an estimate of the population proportion of applicants
desiring on-campus housing that is within plus or minus .05 of the actual
population proportion?



The sampling distribution of p̄
Example: St. Andrew’s College
• For our example, with n = 30 and p = .72, the normal distribution is an
acceptable approximation because:

np = 30(.72) = 21.6 > 5 and n(1 – p) = 30(.28) = 8.4 > 5



The sampling distribution of p̄
Example: St. Andrew’s College

$\sigma_{\bar{p}} = \sqrt{\dfrac{.72(1 - .72)}{30}} = 0.082$

E(p̄) = .72

[Figure: sampling distribution of p̄, centered at E(p̄) = .72]



The sampling distribution of p̄
Example: St. Andrew’s College
• Step 1: Calculate the z-value at the upper endpoint of the interval.

$z = \dfrac{.77 - .72}{.082} = 0.61$

• Step 2: Find the area under the curve to the left of the upper endpoint.

$P(z \le 0.61) = .7291$



Sampling Distribution of p̄
Example: St. Andrew’s College
• Cumulative probabilities for Standard Normal Dist.

z .00 .01 .02 .03 .04 .05 .06 .07 .08 .09
. . . . . . . . . . .
.5 .6915 .6950 .6985 .7019 .7054 .7088 .7123 .7157 .7190 .7224
.6 .7257 .7291 .7324 .7357 .7389 .7422 .7454 .7486 .7517 .7549
.7 .7580 .7611 .7642 .7673 .7704 .7734 .7764 .7794 .7823 .7852
.8 .7881 .7910 .7939 .7967 .7995 .8023 .8051 .8078 .8106 .8133
.9 .8159 .8186 .8212 .8238 .8264 .8289 .8315 .8340 .8365 .8389
. . . . . . . . . . .



Sampling Distribution of p̄
Example: St. Andrew’s College

[Figure: sampling distribution of p̄, σ_p̄ = 0.082; the area to the left of p̄ = .77 is .7291; axis marks at .72 and .77]



The sampling distribution of p̄
Example: St. Andrew’s College
• Step 3: Calculate the z-value at the lower endpoint of the interval.

$z = \dfrac{.67 - .72}{.082} = -0.61$

• Step 4: Find the area under the curve to the left of the lower endpoint.

$P(z \le -0.61) = .2709$



Sampling Distribution of p̄
Example: St. Andrew’s College

[Figure: sampling distribution of p̄, σ_p̄ = 0.082; the area to the left of p̄ = .67 is .2709; axis marks at .67 and .72]



The sampling distribution of p̄
Example: St. Andrew’s College
• Step 5: Calculate the area under the curve between the lower and upper endpoints
of the interval.

$P(-0.61 \le z \le 0.61) = P(z \le 0.61) - P(z \le -0.61) = .7291 - .2709 = .4582$

• The probability that the sample proportion of applicants wanting on-campus
housing will be within +/-.05 of the actual population proportion:

$P(.67 \le \bar{p} \le .77) = .4582$
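
The same steps for the proportion can be sketched with `statistics.NormalDist`. The variable names are illustrative, and z is rounded to two decimals to mimic the table lookup.

```python
from math import sqrt
from statistics import NormalDist

p, n = 0.72, 30          # values from the example
margin = 0.05            # "within plus or minus .05"

se = sqrt(p * (1 - p) / n)       # standard error of the proportion

# Round z to two decimals, as a printed z-table lookup would.
z = round(margin / se, 2)

std_normal = NormalDist()        # standard normal: mean 0, sd 1
prob = std_normal.cdf(z) - std_normal.cdf(-z)

print(f"se = {se:.3f}, z = +/-{z}, probability = {prob:.4f}")
```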



The sampling distribution of p̄
Example: St. Andrew’s College

[Figure: sampling distribution of p̄; the area between p̄ = .67 and p̄ = .77 is .4582; axis marks at .67, .72, and .77]



Practice Problem 1
Critical Reading 502
Mathematics 515
Writing 494

The College Board reported the following mean scores for the three parts of the Scholastic Aptitude Test (SAT). Assume
that the population standard deviation on each part of the test is . An SRS sample of 90 test takers was drawn.

a) Is it appropriate to assume that the sampling distribution of x̄ is approximately normal if the sample size is 90?

Yes, because n = 90 ≥ 30, and according to the CLT that means the sampling distribution of x̄ is approximately normally distributed.

b) What is the probability a sample of 90 test takers will provide a sample mean test score within 10 points of the
population mean of 502 on the Critical Reading part of the test? Within 10 points means 492 ≤ x̄ ≤ 512.



Practice Problem 1
Critical Reading 502
Mathematics 515
Writing 494

The College Board reported the following mean scores for the three parts of the Scholastic Aptitude Test (SAT). Assume
that the population standard deviation on each part of the test is .

c) What is the probability a sample of 90 test takers will provide a sample mean test score within 10 points of the
population mean of 515 on the Mathematics part of the test? Compare this probability to the value computed in part (b).
Within 10 points means 505 ≤ x̄ ≤ 525.

The probabilities are the same for both the Math and Reading portions of the SAT. This is because the standard error is the same in both cases. The fact
that the means differ does not affect the probability calculations.



Practice Problem 1
Critical Reading 502
Mathematics 515
Writing 494

d) What is the probability a sample of 100 test takers will provide a sample mean test score within 10 points of the population
mean of 494 on the Writing part of the test? Comment on the differences between this probability and the values
computed in parts (b) and (c).

Note that the standard error is smaller because the sample size is larger.
Within 10 points means 484 ≤ x̄ ≤ 504.

The probability is larger here than in parts b) and c) because the larger sample size has made the standard error smaller.
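
Both observations (the mean drops out, and a larger n raises the probability) can be seen in a short sketch. The value `SIGMA = 100` is a placeholder assumption chosen only so the code runs; it is not a value taken from the problem statement as it appears here, and `prob_within` is an illustrative name.

```python
from math import sqrt
from statistics import NormalDist

def prob_within(margin, sigma, n):
    """P(|x_bar - mu| <= margin) under the normal approximation.

    mu is not an argument: it cancels when the endpoints are standardized.
    """
    se = sigma / sqrt(n)      # standard error of the mean
    z = margin / se
    return NormalDist().cdf(z) - NormalDist().cdf(-z)

SIGMA = 100  # placeholder assumption, not a value from the problem

p_n90 = prob_within(10, SIGMA, 90)    # applies to Reading (502) and Math (515) alike
p_n100 = prob_within(10, SIGMA, 100)  # Writing (494), with the larger sample

print(f"n = 90:  {p_n90:.4f}")
print(f"n = 100: {p_n100:.4f}")
```

Whatever σ is used, the n = 100 probability comes out larger than the n = 90 one, because the standard error σ/√n is smaller.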



Practice Problem 2
People end up tossing 12% of what they buy at the grocery store. Assume this is the true population
proportion and that you plan to take a sample survey of 540 grocery shoppers to further investigate their
behavior.
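a) Show that the sampling distribution of p̄, the proportion of groceries thrown out by your sample respondents,
is approximately normally distributed.

np = 540(.12) = 64.8 > 5 and n(1 – p) = 540(.88) = 475.2 > 5

Therefore, the sampling distribution of p̄ can be approximated by a normal distribution with mean p = .12 and standard error $\sigma_{\bar{p}} = \sqrt{.12(1 - .12)/540} \approx .014$.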



Practice Problem 2
People end up tossing 12% of what they buy at the grocery store. Assume this is the true population
proportion and that you plan to take a sample survey of 540 grocery shoppers to further investigate
their behavior.
b) What is the probability that your survey will provide a sample proportion within ±.03 of the
population proportion?

Within ±.03 means .09 ≤ p̄ ≤ .15.
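
One way to carry out the calculation is sketched below with `statistics.NormalDist`. The variable names are illustrative, and z is not rounded here, so the result may differ in the third or fourth decimal from a z-table answer.

```python
from math import sqrt
from statistics import NormalDist

p, n = 0.12, 540     # values from the problem
margin = 0.03        # "within ±.03"

# Conditions for the normal approximation.
print(f"np = {n * p:.1f}, n(1 - p) = {n * (1 - p):.1f}")

se = sqrt(p * (1 - p) / n)            # standard error of the proportion
dist = NormalDist(mu=p, sigma=se)     # approximate distribution of p_bar

prob = dist.cdf(p + margin) - dist.cdf(p - margin)
print(f"se = {se:.4f}, P({p - margin:.2f} <= p_bar <= {p + margin:.2f}) = {prob:.4f}")
```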



Chapter 7 Problems

• Practice Problem Set #9

