
Statistics 512 Notes 4

Confidence Intervals Continued


Role of Asymptotic (Large Sample) Approximations in
Statistics: It is often difficult to find the exact finite-sample
sampling distribution of an estimator or statistic, so we
approximate it by its limiting distribution as the sample size
grows.
Review of Limiting Distributions from Probability
Types of Convergence:
Let $X_1, \ldots, X_n$ be a sequence of random variables and let
$X$ be another random variable. Let $F_n$ denote the CDF of
$X_n$ and let $F$ denote the CDF of $X$.

1. $X_n$ converges to $X$ in probability, denoted $X_n \xrightarrow{P} X$,
if for every $\epsilon > 0$, $P(|X_n - X| > \epsilon) \rightarrow 0$ as $n \rightarrow \infty$.

2. $X_n$ converges to $X$ in distribution, denoted $X_n \xrightarrow{D} X$,
if $F_n(t) \rightarrow F(t)$ as $n \rightarrow \infty$ at all $t$ for which $F$ is continuous.
Weak Law of Large Numbers
Let $X_1, \ldots, X_n$ be a sequence of iid random variables having
mean $\mu$ and variance $\sigma^2 < \infty$. Let $\bar{X}_n = \frac{1}{n}\sum_{i=1}^n X_i$. Then
$\bar{X}_n \xrightarrow{P} \mu$.

Interpretation: The distribution of $\bar{X}_n$ becomes more and
more concentrated around $\mu$ as $n$ gets large.

Proof: Using Chebyshev's inequality, for every $\epsilon > 0$,
$$P(|\bar{X}_n - \mu| > \epsilon) \le \frac{\mathrm{Var}(\bar{X}_n)}{\epsilon^2} = \frac{\sigma^2}{n\epsilon^2},$$
which tends to 0 as $n \rightarrow \infty$.
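The concentration of $\bar{X}_n$ around $\mu$ can be seen in a quick simulation. A minimal sketch (the Exponential(1) distribution, the value of $\epsilon$, and the number of replications are illustrative choices, not from the notes):

```python
import random

random.seed(0)
mu = 1.0      # mean of the Exponential(1) distribution
eps = 0.1
reps = 1000   # Monte Carlo replications per sample size

def freq_far(n):
    """Estimate P(|Xbar_n - mu| > eps) by simulation."""
    count = 0
    for _ in range(reps):
        xbar = sum(random.expovariate(1.0) for _ in range(n)) / n
        if abs(xbar - mu) > eps:
            count += 1
    return count / reps

# The estimated probability shrinks toward 0 as n grows
for n in [10, 100, 1000]:
    print(n, freq_far(n))
```

The printed frequencies decrease with $n$, matching the weak law.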
Central Limit Theorem
Let $X_1, \ldots, X_n$ be a sequence of iid random variables having
mean $\mu$ and variance $\sigma^2 < \infty$. Then
$$Z_n = \frac{\bar{X}_n - \mu}{\sqrt{\mathrm{Var}(\bar{X}_n)}} = \frac{\sqrt{n}(\bar{X}_n - \mu)}{\sigma} \xrightarrow{D} Z,$$
where $Z \sim N(0,1)$. In other words,
$$\lim_{n \rightarrow \infty} P(Z_n \le z) = \Phi(z) = \int_{-\infty}^{z} \frac{1}{\sqrt{2\pi}} e^{-x^2/2}\, dx.$$

Interpretation: Probability statements about $\bar{X}_n$ can be
approximated using a normal distribution. It's the
probability statements that we are approximating, not the
random variable itself.
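The theorem can be illustrated by simulation: standardize many sample means and compare their empirical CDF to $\Phi$. A sketch (the Uniform(0,1) choice of distribution and the values of $n$ and the replication count are illustrative assumptions):

```python
import math
import random

random.seed(1)

def Phi(z):
    """Standard normal CDF, via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

# X_i ~ Uniform(0,1): mu = 1/2, sigma^2 = 1/12
mu, sigma = 0.5, math.sqrt(1.0 / 12.0)
n, reps = 50, 5000

zs = []
for _ in range(reps):
    xbar = sum(random.random() for _ in range(n)) / n
    zs.append(math.sqrt(n) * (xbar - mu) / sigma)

# Empirical P(Z_n <= z) should be close to Phi(z)
for z in [-1.0, 0.0, 1.0]:
    emp = sum(v <= z for v in zs) / reps
    print(z, round(emp, 3), round(Phi(z), 3))
```

Even for $n = 50$, the empirical probabilities track $\Phi(z)$ closely.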
Some useful further convergence properties:
Slutsky's Theorem (Theorem 4.3.5): If
$$X_n \xrightarrow{D} X, \quad A_n \xrightarrow{P} a, \quad B_n \xrightarrow{P} b,$$
then $A_n + B_n X_n \xrightarrow{D} a + bX$.
Continuous Mapping Theorem (Theorem 4.3.4):
Suppose $X_n$ converges to $X$ in distribution and $g$ is a
continuous function on the support of $X$. Then $g(X_n)$
converges to $g(X)$ in distribution.
Application of these convergence properties:
Let $X_1, \ldots, X_n$ be a sequence of iid random variables having
mean $\mu$, variance $\sigma^2 < \infty$, and $E(X_i^4) < \infty$. Let
$\bar{X}_n = \frac{1}{n}\sum_{i=1}^n X_i$ and $S_n^2 = \frac{1}{n}\sum_{i=1}^n (X_i - \bar{X}_n)^2$. Then
$$T_n = \frac{\bar{X}_n - \mu}{S_n/\sqrt{n}} \xrightarrow{D} Z,$$
where $Z \sim N(0,1)$.
Proof (only for those interested):
We can write
$$T_n = \frac{\bar{X}_n - \mu}{S_n/\sqrt{n}} = \frac{\sqrt{n}(\bar{X}_n - \mu)}{\sigma} \cdot \frac{\sigma}{S_n}.$$
Using the Central Limit Theorem, which says $\frac{\sqrt{n}(\bar{X}_n - \mu)}{\sigma} \xrightarrow{D} Z$,
and Slutsky's Theorem, to prove that $T_n \xrightarrow{D} Z$, it is sufficient
to prove that $\frac{\sigma}{S_n} \xrightarrow{P} 1$.

We can write
$$S_n^2 = \frac{\sum_{i=1}^n X_i^2}{n} - 2\bar{X}_n\frac{\sum_{i=1}^n X_i}{n} + \bar{X}_n^2 = \frac{\sum_{i=1}^n X_i^2}{n} - \bar{X}_n^2.$$
By the weak law of large numbers (the condition $E(X_i^4) < \infty$
guarantees that the $X_i^2$ have finite variance, so the law applies
to them as well),
$$S_n^2 \xrightarrow{P} E(X_i^2) - [E(X_i)]^2 = \sigma^2,$$
or equivalently $\frac{\sigma^2}{S_n^2} \xrightarrow{P} 1$, and hence, by the continuous
mapping theorem, $\frac{\sigma}{S_n} \xrightarrow{P} 1$.
Back to Confidence Intervals
CI for the mean of an iid sample $X_1, \ldots, X_n$ from an unknown
distribution with finite variance and $E(X_i^4) < \infty$:
By the application of the central limit theorem above,
$$T_n = \frac{\bar{X}_n - \mu}{S_n/\sqrt{n}} \xrightarrow{D} Z.$$
Thus, for large $n$,
$$1 - \alpha \approx P\left(-z_{1-\alpha/2} \le \frac{\bar{X}_n - \mu}{S_n/\sqrt{n}} \le z_{1-\alpha/2}\right)$$
$$= P\left(-z_{1-\alpha/2}\frac{S_n}{\sqrt{n}} \le \bar{X}_n - \mu \le z_{1-\alpha/2}\frac{S_n}{\sqrt{n}}\right)$$
$$= P\left(\bar{X}_n - z_{1-\alpha/2}\frac{S_n}{\sqrt{n}} \le \mu \le \bar{X}_n + z_{1-\alpha/2}\frac{S_n}{\sqrt{n}}\right).$$
Thus,
$$\bar{X}_n \pm z_{1-\alpha/2}\frac{S_n}{\sqrt{n}}$$
is an approximate $(1-\alpha)$ confidence interval.
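This interval is easy to compute directly. A minimal sketch (the function name `mean_ci` is ours; $S_n^2$ uses the divisor $n$, as in these notes, and $z = 1.96$ corresponds to $\alpha = 0.05$):

```python
import math

def mean_ci(xs, z=1.96):
    """Approximate large-sample CI for the mean: xbar +/- z * S_n / sqrt(n)."""
    n = len(xs)
    xbar = sum(xs) / n
    s2 = sum((x - xbar) ** 2 for x in xs) / n   # divisor n, as in the notes
    half = z * math.sqrt(s2 / n)
    return xbar - half, xbar + half
```

Usage: `mean_ci(data)` returns the two endpoints; passing `z=2.576` gives an approximate 99% interval instead.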
How large does $n$ need to be for this to be a good
approximation? Traditionally, textbooks say $n > 30$. We'll
look at some simulation results later in the course.
Application: A food-processing company is considering
marketing a new spice mix for Creole and Cajun cooking.
They took a simple random sample of 200 consumers and
found that 37 would purchase such a product. Find an
approximate 95% confidence interval for p, the true
proportion of buyers.
Let
$$X_i = \begin{cases} 1 & \text{if the } i\text{th consumer would buy the product} \\ 0 & \text{if the } i\text{th consumer would not buy the product.} \end{cases}$$
If the population is large (say 50 times larger than the
sample size), a simple random sample can be regarded as a
sample with replacement. Then a reasonable model is that
$X_1, \ldots, X_{200}$ are iid Bernoulli($p$). We have
$$\bar{X}_n = \frac{\sum_{i=1}^{200} X_i}{200} = \frac{37}{200} = 0.185,$$
$$S_n^2 = \frac{\sum_{i=1}^{200} X_i^2}{200} - \bar{X}_n^2 = 0.185 - (0.185)^2 \approx 0.151.$$
Thus, an approximate 95% confidence interval for $p$ is
$$\bar{X}_n \pm z_{1-\alpha/2}\frac{S_n}{\sqrt{n}} = 0.185 \pm 1.96\sqrt{\frac{0.151}{200}} \approx (0.131,\, 0.239).$$
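The arithmetic in this example can be checked directly; a sketch reproducing the numbers above:

```python
import math

n = 200
xbar = 37 / n                       # 0.185
s2 = xbar - xbar ** 2               # S_n^2 = 0.185 - 0.185^2, about 0.151
half = 1.96 * math.sqrt(s2 / n)     # half-width of the 95% interval
print(round(xbar - half, 3), round(xbar + half, 3))
```

The printed endpoints agree with the interval $(0.131, 0.239)$.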
Note that for an iid Bernoulli($p$) sample, we can write $S_n^2$
in a simple way. In general,
$$S_n^2 = \frac{\sum_{i=1}^n (X_i - \bar{X}_n)^2}{n} = \frac{\sum_{i=1}^n (X_i^2 - 2X_i\bar{X}_n + \bar{X}_n^2)}{n} = \frac{\sum_{i=1}^n X_i^2}{n} - \frac{2\bar{X}_n \cdot n\bar{X}_n}{n} + \frac{n\bar{X}_n^2}{n} = \frac{\sum_{i=1}^n X_i^2}{n} - \bar{X}_n^2.$$
For an iid Bernoulli sample, let $\hat{p}_n = \bar{X}_n$; $\hat{p}_n$ is a natural
point estimator of $p$ for the Bernoulli. Note that for a
Bernoulli sample, $X_i^2 = X_i$. Thus, for a Bernoulli sample,
$$S_n^2 = \hat{p}_n - \hat{p}_n^2 = \hat{p}_n(1 - \hat{p}_n),$$
and an approximate 95% confidence interval for $p$ is
$$\hat{p}_n \pm 1.96\sqrt{\frac{\hat{p}_n(1-\hat{p}_n)}{n}}.$$
Choosing Between Confidence Intervals
Let $X_1, \ldots, X_n$ be iid $N(\mu, \sigma^2)$ where $\sigma^2$ is known.
Suppose we want a 95% confidence interval for $\mu$. Then
for any $a$ and $b$ that satisfy $P(a \le Z \le b) = 0.95$,
$$\left(\bar{X} - b\frac{\sigma}{\sqrt{n}},\; \bar{X} - a\frac{\sigma}{\sqrt{n}}\right)$$
is a 95% confidence interval because:
$$0.95 = P\left(a \le \frac{\bar{X} - \mu}{\sigma/\sqrt{n}} \le b\right)$$
$$= P\left(a\frac{\sigma}{\sqrt{n}} \le \bar{X} - \mu \le b\frac{\sigma}{\sqrt{n}}\right)$$
$$= P\left(\bar{X} - b\frac{\sigma}{\sqrt{n}} \le \mu \le \bar{X} - a\frac{\sigma}{\sqrt{n}}\right).$$
For example, we could choose (1) $a = -1.96$, $b = 1.96$
[$P(Z<a) = 0.025$, $P(Z>b) = 0.025$; the choice we have used
before]; (2) $a = -2.05$, $b = 1.88$ [$P(Z<a) = 0.02$, $P(Z>b) = 0.03$];
(3) $a = -1.75$, $b = 2.33$ [$P(Z<a) = 0.04$, $P(Z>b) = 0.01$].
Which is the best 95% confidence interval?
Reasonable criterion: expected length of the confidence
interval. Among all 95% confidence intervals, choose the
one that is expected to have the smallest length, since it will
be the most informative.
Length of the confidence interval:
$$\left(\bar{X} - a\frac{\sigma}{\sqrt{n}}\right) - \left(\bar{X} - b\frac{\sigma}{\sqrt{n}}\right) = (b-a)\frac{\sigma}{\sqrt{n}},$$
thus we want to choose the confidence interval with the
smallest value of $b-a$.
The value of $b-a$ for the three confidence intervals above is
(1) $a=-1.96$, $b=1.96$, $(b-a)=3.92$; (2) $a=-2.05$, $b=1.88$,
$(b-a)=3.93$; (3) $a=-1.75$, $b=2.33$, $(b-a)=4.08$.
The best 95% confidence interval is (1), with $a=-1.96$,
$b=1.96$. In fact, it can be shown that for this problem the
best choice of $a$ and $b$ is $a=-1.96$, $b=1.96$.
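The coverage and length comparison above can be reproduced numerically using the standard normal CDF; a sketch:

```python
import math

def Phi(z):
    """Standard normal CDF, via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

# The three (a, b) choices from the notes
choices = [(-1.96, 1.96), (-2.05, 1.88), (-1.75, 2.33)]
for a, b in choices:
    coverage = Phi(b) - Phi(a)   # P(a <= Z <= b), should be about 0.95
    print(f"a={a}, b={b}: coverage={coverage:.4f}, b-a={b - a:.2f}")
```

All three choices have coverage near 0.95, and the symmetric pair $(-1.96, 1.96)$ has the smallest $b-a$, hence the shortest interval.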