Lecture 5
Sampling and
Sampling Distributions
Reading: Newbold et al, Statistics for
Business and Economics - Chapter 6
Confidence Intervals
Reading: Chapter 7.1 - 7.3
Lecture Goals
After completing this Lecture, you should be able to:
Describe a simple random sample and why sampling is
important
Explain the difference between descriptive and
inferential statistics
Define the concept of a sampling distribution
Determine the mean and standard deviation for the
sampling distribution of the sample mean, X̄
Describe the Central Limit Theorem and its importance
Determine the mean and standard deviation for the
sampling distribution of the sample proportion, p̂
Lecture Goals (2)
After completing this lecture, you should
be able to:
Introduction
Descriptive statistics
Collecting, presenting, and describing data
Inferential statistics
Drawing conclusions and/or making decisions
concerning a population based only on
sample data
Inferential Statistics
Sample Population
Inferential Statistics
Drawing conclusions and/or making decisions
concerning a population based on sample results.
Estimation
e.g., Estimate the population mean
weight using the sample mean
weight
Hypothesis Testing
e.g., Use sample evidence to test
the claim (hypothesis) that the
population mean weight is 65kg
Sampling from a Population
Population vs. Sample
Population Sample
Why Sample?
Simple Random Sample
Sampling Distributions
[Figure: sampling distribution of the sample means, with X1, X2, X3, X4 marked]
The sampling distribution of sample means tends towards a normal distribution as n → ∞.
Recap: Sample Mean
$$E[\bar{X}] = \mu$$
Standard Error = Standard Deviation of
Sample Mean
$$\sigma_{\bar{X}} = \frac{\sigma}{\sqrt{n}}$$
Note that the standard error of the mean decreases as
the sample size n increases, and as the population standard deviation σ falls.
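The relationship between sample size and standard error can be checked by simulation. The sketch below is illustrative only: the population (normal with mean 10 and standard deviation 2), the number of repetitions and the helper name `simulated_standard_error` are assumptions, not values from the lecture.

```python
import math
import random
import statistics

def simulated_standard_error(n, mu=10.0, sigma=2.0, reps=4000, seed=0):
    """Standard deviation of `reps` simulated sample means, each from a sample of size n.

    mu and sigma describe a hypothetical normal population chosen for illustration.
    """
    rng = random.Random(seed)
    means = []
    for _ in range(reps):
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        means.append(statistics.fmean(sample))
    return statistics.pstdev(means)

# Compare the simulated standard error with sigma / sqrt(n) for growing n.
for n in (4, 16, 64):
    print(n, round(simulated_standard_error(n), 3), round(2.0 / math.sqrt(n), 3))
```

The simulated value should sit close to σ/√n for each n and shrink as n grows.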
If the Population is not Normal
(continued)
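$$E[\bar{X}] = \mu$$
Variation:
$$\sigma_{\bar{X}} = \frac{\sigma}{\sqrt{n}}$$
The sampling distribution of x̄ becomes normal as n → ∞.
[Figure: sampling distributions of x̄ centred at μ for a smaller sample size and a larger sample size]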
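Sampling Distribution of X̄
Properties
(continued)
$$\sigma_{\bar{X}} = \frac{\sigma}{\sqrt{n}}$$
As n increases, and as σ decreases, σ_X̄ decreases.
[Figure: two sampling distributions of x̄ centred at E[X̄] = μ, one for a larger sample size and smaller population standard deviation, one for a smaller sample size and large population standard deviation]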
Central Limit Theorem
Central Limit Theorem (CLT)
As the sample size gets larger and tends towards
infinity, the sampling distribution of x̄ tends
towards a normal distribution, regardless of how the
population is distributed (provided the population variance is finite).
The population can be very far from normal; x̄ will still be
approximately normally distributed when n is large.
Standard Normal Distribution for
the Sample Means
Z-value for the sampling distribution of X̄:
$$Z = \frac{\bar{X} - \mu}{\sigma_{\bar{X}}} = \frac{\bar{X} - \mu}{\sigma/\sqrt{n}}$$
where: X̄ = sample mean
μ = population mean
σ_X̄ = standard error of the mean
Z is a standardized normal random variable with a mean of 0
and a variance of 1
Central Limit Theorem
(continued)
Let X1, X2, . . . , Xn be a set of n independent random
variables having identical distributions with mean μ and
variance σ², and let X̄ be the mean of these random
variables.
As n becomes large, the central limit theorem states
that the distribution of
$$Z = \frac{\bar{X} - \mu_{\bar{X}}}{\sigma_{\bar{X}}}$$
approaches the standard normal distribution
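A small simulation can make the central limit theorem concrete. The sketch below draws samples from a strongly skewed population (exponential with mean 1 and standard deviation 1, chosen here purely as an illustration), standardizes each sample mean as Z = (X̄ − μ)/(σ/√n), and reports the mean, standard deviation and skewness of the Z values. The function names and the sample sizes 2 and 50 are assumptions.

```python
import math
import random
import statistics

def standardized_means(n, reps=5000, seed=1):
    """Z = (xbar - mu) / (sigma / sqrt(n)) for repeated samples of size n.

    The population is exponential with rate 1 (mean 1, standard deviation 1),
    an illustrative non-normal choice.
    """
    rng = random.Random(seed)
    mu, sigma = 1.0, 1.0
    z_values = []
    for _ in range(reps):
        xbar = sum(rng.expovariate(1.0) for _ in range(n)) / n
        z_values.append((xbar - mu) / (sigma / math.sqrt(n)))
    return z_values

def skewness(values):
    """Sample skewness: average cubed standardized deviation (0 for a symmetric shape)."""
    m = statistics.fmean(values)
    s = statistics.pstdev(values)
    return statistics.fmean([((v - m) / s) ** 3 for v in values])

for n in (2, 50):
    z = standardized_means(n)
    print(n, round(statistics.fmean(z), 3), round(statistics.pstdev(z), 3), round(skewness(z), 3))
```

For both sample sizes the Z values have mean near 0 and standard deviation near 1, but the skewness is much closer to 0 (the value for a normal distribution) at n = 50 than at n = 2.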
How Large is Large Enough?
If the Population is Normal
$$E[\bar{X}] = \mu \quad \text{and} \quad \sigma_{\bar{X}} = \frac{\sigma}{\sqrt{n}}$$
Sampling Distribution Properties
[Figure: Normal Population Distribution and Normal Sampling Distribution, both centred at μ]
$$E[\bar{X}] = \mu, \qquad \sigma_{\bar{X}} = \frac{\sigma}{\sqrt{n}}$$
Example
(continued)
Solution:
Even if the population is not normally distributed
(we are not told - so assume it is not), the
central limit theorem can be used (n > 25)
… so the sampling distribution of x̄ is
approximately normal
… where E(x̄) = 8
… and standard deviation
$$\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}} = \frac{3}{\sqrt{36}} = 0.5$$
Example
(continued)
Solution (continued):
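$$P(7.8 < \bar{X} < 8.2) = P\left(\frac{7.8 - 8}{3/\sqrt{36}} < \frac{\bar{X} - \mu}{\sigma/\sqrt{n}} < \frac{8.2 - 8}{3/\sqrt{36}}\right) = P(-0.4 < Z < 0.4) = 0.3108$$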
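The calculation in this solution can be reproduced numerically. The sketch below uses the values that appear in the solution (μ = 8, σ = 3, n = 36, limits 7.8 and 8.2) and Python's `statistics.NormalDist` in place of a normal table; the variable names are my own.

```python
import math
from statistics import NormalDist

mu, sigma, n = 8.0, 3.0, 36          # values taken from the worked solution
se = sigma / math.sqrt(n)            # standard error of the mean: 3 / 6 = 0.5

# Approximate sampling distribution of the sample mean (by the central limit theorem).
sampling_dist = NormalDist(mu, se)
prob = sampling_dist.cdf(8.2) - sampling_dist.cdf(7.8)

print(round(prob, 4))
```

The printed probability agrees with the table-based answer of 0.3108.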
[Figure: standard normal curve with area α/2 in each tail, below −z_{α/2} and above z_{α/2}]
Then
$$\mu \pm z_{\alpha/2}\,\sigma_{\bar{X}}$$
is the interval that includes X̄ with probability 1 – α.
Acceptance Intervals - Example
Sampling Distributions
  Sampling Distributions of Sample Means
  Sampling Distributions of Sample Proportions
Sampling Distributions of
Sample Proportions
P = the proportion of the population having
some characteristic
Sample proportion (p̂ ) provides an estimate
of P:
0 ≤ p̂ ≤ 1
p̂ has a binomial distribution, but can be approximated
by a normal distribution when nP(1 – P) > 5
Sampling Distribution of p̂
Normal approximation:
[Figure: sampling distribution P(p̂) plotted against p̂, with p̂ running from 0 to 1]
Properties:
$$E(\hat{p}) = P \quad \text{and} \quad \sigma_{\hat{p}} = \sqrt{\frac{P(1-P)}{n}}$$
(where P = population proportion)
Z-Value for Proportions
$$Z = \frac{\hat{p} - P}{\sigma_{\hat{p}}} = \frac{\hat{p} - P}{\sqrt{\dfrac{P(1-P)}{n}}}$$
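The Z-value for a proportion can be computed directly from this formula. The sketch below is a minimal illustration with made-up inputs (P = 0.5, n = 100, and the limits 0.45 and 0.55); they are not the figures of the lecture's own example, and `proportion_z` is a hypothetical helper name.

```python
import math
from statistics import NormalDist

def proportion_z(p_hat, P, n):
    """Z-value of a sample proportion p_hat when the population proportion is P."""
    return (p_hat - P) / math.sqrt(P * (1 - P) / n)

# Hypothetical inputs chosen for illustration.
P, n = 0.5, 100
assert n * P * (1 - P) > 5            # rule of thumb for the normal approximation

z_low = proportion_z(0.45, P, n)
z_high = proportion_z(0.55, P, n)
prob = NormalDist().cdf(z_high) - NormalDist().cdf(z_low)
print(round(z_low, 2), round(z_high, 2), round(prob, 4))
```

Here σ_p̂ = 0.05, so the limits sit one standard error either side of P and the probability is about 0.68.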
Example
Example
(continued)
Example
(continued)
.4251
Standardize
Confidence Intervals
Key Point of the Lecture:
Properties of Point Estimators
An estimator of a population parameter is
a random variable that depends on sample
information. For example, the sample
mean is an estimator of the population
mean.
The value of the estimator provides an estimate of
this unknown parameter of interest, e.g. the
sample mean value is an estimate of the
population mean value.
The estimator is a random variable because it
takes different values for different
samples; a specific value of the estimator
is called an estimate of the population
parameter.
Point and Interval Estimates
A point estimate is a single number;
a confidence interval provides additional
information about the precision (variability) of,
or equivalently our confidence in, the estimate.
[Figure: a confidence interval running from the Lower Confidence Limit to the Upper Confidence Limit, with the Point Estimate between them; the distance between the limits is the width of the confidence interval]
Revision: Point Estimates
Parameter     Population value    Point estimate (sample)
Mean          μ                   x̄
Proportion    P                   p̂
Unbiasedness
A point estimator θ̂ is said to be an
unbiased estimator of the parameter θ if its
expected value is equal to that parameter:
$$E(\hat{\theta}) = \theta$$
Examples:
The sample mean x̄ is an unbiased estimator of μ
Unbiasedness
(continued)
[Figure: sampling distributions of two estimators, θ̂1 and θ̂2, on a θ̂ axis with the parameter θ marked]
Bias
Let θ̂ be an estimator of θ.
$$\text{Bias}(\hat{\theta}) = E(\hat{\theta}) - \theta$$
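Bias can be approximated by averaging an estimator over many simulated samples and subtracting the true parameter. The sketch below assumes a hypothetical normal population (mean 5, standard deviation 2) and compares three estimators: the sample mean, the variance with divisor n, and the variance with divisor n − 1. The variance estimators are brought in only as an illustration of a biased versus an unbiased estimator; `estimate_bias` is a made-up helper name.

```python
import random
import statistics

MU, SIGMA = 5.0, 2.0   # hypothetical population parameters, chosen for illustration

def estimate_bias(estimator, true_value, n, reps=20000, seed=2):
    """Approximate Bias = E(estimator) - true_value by averaging over simulated samples."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(reps):
        sample = [rng.gauss(MU, SIGMA) for _ in range(n)]
        total += estimator(sample)
    return total / reps - true_value

n = 5
bias_mean = estimate_bias(statistics.fmean, MU, n)            # sample mean vs mu
bias_var_n = estimate_bias(statistics.pvariance, SIGMA**2, n) # divisor n
bias_var_n1 = estimate_bias(statistics.variance, SIGMA**2, n) # divisor n - 1

print(round(bias_mean, 3), round(bias_var_n, 3), round(bias_var_n1, 3))
```

The sample mean and the divisor n − 1 variance show bias close to 0, while the divisor n variance is biased downward by about σ²/n = 0.8 in this setup.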
Most Efficient Estimator
Sample
Confidence Level, (1 − α)
(continued)
Suppose confidence level = 95%
Also written (1 − α) = 0.95
A relative frequency interpretation:
From repeated samples of size n, 95% of all the
confidence intervals that can be constructed
will contain the unknown true parameter
A specific interval either will contain or will
not contain the true parameter
General Formula
$$\hat{\theta} \pm ME$$
Point Estimate ± Margin of Error
Confidence Intervals
  σ² Known
  σ² Unknown
Upper confidence limit:
$$UCL = \bar{x} + z_{\alpha/2}\frac{\sigma}{\sqrt{n}}$$
Margin of error:
$$ME = z_{\alpha/2}\frac{\sigma}{\sqrt{n}}$$
Finding z_{α/2}
Consider a 95% confidence interval:
$$1 - \alpha = .95, \qquad \frac{\alpha}{2} = .025 \text{ in each tail}$$
so z_{α/2} = z_{.025} = 1.96
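Instead of reading a normal table, the critical value can be obtained from the inverse of the standard normal cumulative distribution function. A minimal sketch, assuming Python's `statistics.NormalDist`; the helper name `z_critical` is my own.

```python
from statistics import NormalDist

def z_critical(confidence):
    """Return z_{alpha/2}, the value with area alpha/2 in the upper tail,
    where alpha = 1 - confidence."""
    alpha = 1 - confidence
    return NormalDist().inv_cdf(1 - alpha / 2)

for level in (0.90, 0.95, 0.99):
    print(level, round(z_critical(level), 3))
```

For the 95% level this gives approximately 1.96.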
Intervals and Level of Confidence
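Sampling Distribution of the Mean
[Figure: sampling distribution of x̄ centred at μ_x̄ = μ, with area 1 − α in the centre and α/2 in each tail; intervals built around sample means x̄1, x̄2, ... are drawn beneath it]
Confidence Intervals
Intervals extend from
$$LCL = \bar{x} - z_{\alpha/2}\frac{\sigma}{\sqrt{n}}$$
to
$$UCL = \bar{x} + z_{\alpha/2}\frac{\sigma}{\sqrt{n}}$$
100(1 − α)% of intervals constructed contain μ; 100(α)% do not.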
Example
A sample of 11 circuits from a large normal
population has a mean resistance of 2.20
ohms. We know from past testing that the
population standard deviation is 0.35 ohms.
Example
(continued)
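Solution:
$$\bar{x} \pm z_{\alpha/2}\frac{\sigma}{\sqrt{n}} = 2.20 \pm 1.96\left(\frac{0.35}{\sqrt{11}}\right) = 2.20 \pm .2068$$
$$1.9932 < \mu < 2.4068$$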
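The interval for the circuit example can be reproduced in a few lines. The sketch below uses the figures from the example (x̄ = 2.20, σ = 0.35, n = 11) and a 95% confidence level; the function name `z_confidence_interval` is hypothetical.

```python
import math
from statistics import NormalDist

def z_confidence_interval(xbar, sigma, n, confidence=0.95):
    """Confidence interval for the mean when the population sigma is known.

    Returns (lower limit, upper limit, margin of error).
    """
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    me = z * sigma / math.sqrt(n)
    return xbar - me, xbar + me, me

lcl, ucl, me = z_confidence_interval(xbar=2.20, sigma=0.35, n=11)
print(round(me, 4), round(lcl, 4), round(ucl, 4))
```

The margin of error comes out at about 0.2068, giving limits of roughly 1.9932 and 2.4068.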
Interpretation
We are 95% confident that the true mean
resistance is between 1.9932 and 2.4068
ohms
Although the true mean may or may not be
in this interval, 95% of intervals formed in
this manner will contain the true mean
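The relative frequency interpretation can be illustrated by simulation: build many intervals from repeated samples and count how often they contain the true mean. In the sketch below the "true" mean of 2.20 ohms is an assumption made purely so the simulation has a known target (in the example 2.20 is a sample mean), and the population is taken to be normal with σ = 0.35 and n = 11; the function name `coverage` is my own.

```python
import math
import random
import statistics
from statistics import NormalDist

def coverage(mu=2.20, sigma=0.35, n=11, confidence=0.95, reps=10000, seed=3):
    """Share of simulated z-intervals (sigma known) that contain the true mean mu.

    mu is a hypothetical true mean used only for this simulation.
    """
    rng = random.Random(seed)
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    me = z * sigma / math.sqrt(n)
    hits = 0
    for _ in range(reps):
        xbar = statistics.fmean([rng.gauss(mu, sigma) for _ in range(n)])
        if xbar - me <= mu <= xbar + me:
            hits += 1
    return hits / reps

print(coverage())
```

The printed share should be close to 0.95, even though any single interval either contains μ or does not.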
7.3 (σ² Unknown)
Confidence Intervals
  σ² Known
  σ² Unknown
d.f. = n - 1
Student’s t Distribution
Note: t → Z as n increases
t-distributions are bell-shaped and symmetric, but
have ‘fatter’ tails than the normal
[Figure: density curves centred at 0 for the standard normal (t with df = ∞), t (df = 13), and t (df = 5)]
Student’s t Table
Confidence Level | t (10 d.f.) | t (20 d.f.) | t (30 d.f.) | Z
Note: t → Z as n increases
(σ² Unknown)
where t_{n-1,α/2} is the critical value of the t distribution with n-1 d.f.
and an area of α/2 in each tail:
$$P(t_{n-1} > t_{n-1,\alpha/2}) = \alpha/2$$
Margin of Error
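The confidence interval,
$$\bar{x} \pm t_{n-1,\alpha/2}\frac{s}{\sqrt{n}}$$
can also be written as x̄ ± ME, where
$$ME = t_{n-1,\alpha/2}\frac{s}{\sqrt{n}}$$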
Example
A random sample of n = 25 has x̄ = 50 and
s = 8. Form a 95% confidence interval for μ
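One way to work this example without a t table is sketched below. Python's standard library has no Student's t quantile function, so the sketch integrates the t density numerically (Simpson's rule) and finds the critical value by bisection; this stands in for a table lookup and is not the lecture's method. All function names are hypothetical, and the search range of 0 to 100 is an assumption that suits ordinary confidence levels.

```python
import math

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_c = math.lgamma((df + 1) / 2) - math.lgamma(df / 2) - 0.5 * math.log(df * math.pi)
    return math.exp(log_c - (df + 1) / 2 * math.log1p(x * x / df))

def t_cdf(x, df, steps=2000):
    """P(T <= x), via Simpson's rule on [0, |x|] plus symmetry about 0."""
    a = abs(x)
    if a == 0:
        return 0.5
    h = a / steps                      # steps is even, as Simpson's rule requires
    total = t_pdf(0.0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3
    return 0.5 + area if x > 0 else 0.5 - area

def t_critical(df, alpha, lo=0.0, hi=100.0):
    """Value t_{df, alpha/2} with area alpha/2 in the upper tail, by bisection."""
    target = 1 - alpha / 2
    for _ in range(60):
        mid = (lo + hi) / 2
        if t_cdf(mid, df) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

n, xbar, s = 25, 50.0, 8.0             # values from the example
t_val = t_critical(df=n - 1, alpha=0.05)
me = t_val * s / math.sqrt(n)
print(round(t_val, 4), round(xbar - me, 3), round(xbar + me, 3))
```

With 24 degrees of freedom the critical value is about 2.064, so the interval is roughly 50 ± 3.30, that is, about 46.70 to 53.30.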