S = sum of all claims = Σ_i C_i, where C_i is the claim for the i-th customer,
with E[C_i] = 260 and V[C_i] = 800².
We can use the CLT to approximate the distribution of S:
P(S > 2,800,000) ≈ P(Z > 2.5) = 1 − Φ(2.5) = 1 − 0.9938 = 0.0062
(the standardized value 2.5 is consistent with n = 10,000 customers, for which
E[S] = 2,600,000 and SD[S] = 80,000)
Example
Darth Vader wants to measure the distance between the Death
Star and Tatooine. However, due to atmospheric disturbances,
measurements will not yield the exact distance d. As a result,
Vader has decided to make a series of 36 measurements and then
use their average value as an estimate of the actual distance.
Assume that the values of the successive measurements are
independent random variables with a mean of d light years and a
standard deviation of 2 light years.
- Approximate the probability that the estimated value of the
distance will be within 0.5 light-years of d.
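Solution:
X_1, ..., X_36 are the measurements, with E[X_i] = d and V[X_i] = 4.
X̄ = (X_1 + ··· + X_36)/36 is the estimate of the distance.
n = 36 is large enough, meaning that the CLT can be applied:
P(|X̄ − d| ≤ 0.5) = P(|X̄ − d|/(2/√36) ≤ 0.5/(2/√36))
≈ P(|Z| ≤ 1.5) = P(−1.5 ≤ Z ≤ 1.5)
= Φ(1.5) − Φ(−1.5) = Φ(1.5) − (1 − Φ(1.5)) = 2Φ(1.5) − 1
= 2 · 0.9332 − 1 = 0.8664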
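The calculation above can be checked numerically. The following is a minimal sketch in Python (standard library only); the variable names are illustrative and not part of the slides.

```python
from statistics import NormalDist

sigma = 2.0   # standard deviation of one measurement (light years)
n = 36        # number of measurements
tol = 0.5     # allowed estimation error (light years)

# By the CLT, the sample mean is approximately N(d, sigma^2 / n),
# so (mean - d) / (sigma / sqrt(n)) is approximately standard normal.
z = tol / (sigma / n ** 0.5)
prob_within = 2 * NormalDist().cdf(z) - 1   # P(-z <= Z <= z)

print(f"z = {z:.2f}, P(|mean - d| <= {tol}) ~ {prob_within:.4f}")
```

The unknown distance d never enters the computation, because only the deviation of the sample mean from d is standardized.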
Example (continued)
- How many measurements does Vader need in order to be at least
95% certain that his estimate is accurate to within 0.5 light
years?
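Solution:
We now consider n measurements, with n generic, so that X̄ ≈ N(d, 4/n).
We need
0.95 ≤ P(|X̄ − d| ≤ 0.5) = P(|X̄ − d|/(2/√n) ≤ 0.5/(2/√n))
≈ P(−√n/4 ≤ Z ≤ √n/4) = 2Φ(√n/4) − 1
that is, Φ(√n/4) ≥ 1.95/2 = 0.975, so √n/4 ≥ z_{0.025} = 1.96.
Hence √n ≥ 7.84, i.e. n ≥ 61.47: Vader needs at least 62 measurements.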
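As a cross-check of the sample-size calculation, here is a small Python sketch (standard library only). The helper name `coverage` is hypothetical, and the approach simply solves the CLT inequality for n.

```python
from math import ceil, sqrt
from statistics import NormalDist

sigma, tol, conf = 2.0, 0.5, 0.95

def coverage(n):
    """CLT approximation of P(|mean - d| <= tol) with n measurements."""
    return 2 * NormalDist().cdf(tol * sqrt(n) / sigma) - 1

# Closed form: tol * sqrt(n) / sigma >= z, where z is the 0.975 quantile.
z = NormalDist().inv_cdf(1 - (1 - conf) / 2)
n_required = ceil((z * sigma / tol) ** 2)

print(n_required, coverage(n_required - 1), coverage(n_required))
```

Evaluating `coverage` just below and at `n_required` confirms that the requirement is first met at that sample size.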
How large is “large n”?
How large n should be to have a good approximation depends on
the shape of the population distribution. According to the textbook
Normal approximation to the Binomial distribution
One of the first important applications of the CLT was related to
Binomial random variables.
We know that a Binomial random variable X with parameters
(n, p) can be expressed as a sum of n independent random variables
X = E_1 + E_2 + ··· + E_n
with E[E_i] = p and V[E_i] = p(1 − p).
Sample proportion: p̂ = X/n
E[X] = np, V[X] = np(1 − p)
E[p̂] = (1/n) · np = p, V[p̂] = (1/n²) · np(1 − p) = p(1 − p)/n
The expectation of the sample proportion is p = 0.6 for any possible n.
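n = 10:
P(p̂ ≥ 0.55) = P(X ≥ 5.5) = P(X = 6) + P(X = 7) + ··· + P(X = 10)
= Σ_{k=6}^{10} (10 choose k) 0.6^k 0.4^{10−k}
(in R, each term is a dbinom call, e.g. dbinom(6, size = 10, prob = 0.6))
For n = 100:
P(p̂ ≥ 0.55) = P(X ≥ 55) = Σ_{k=55}^{100} (100 choose k) 0.6^k 0.4^{100−k} (in R)
≈ P(Z ≥ (55 − 60)/√(100 · 0.6 · 0.4)) = P(Z ≥ −1.02) = Φ(1.02)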
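The exact sums and the normal approximation can be compared numerically. The sketch below uses Python rather than R, reads the event as p̂ ≥ 0.55, and defines a hypothetical helper `binom_pmf` that plays the role of R's dbinom.

```python
from math import comb, sqrt
from statistics import NormalDist

def binom_pmf(k, n, p):
    """P(X = k) for X ~ Binomial(n, p)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

def binom_upper_tail(k_min, n, p):
    """P(X >= k_min), summed exactly."""
    return sum(binom_pmf(k, n, p) for k in range(k_min, n + 1))

p = 0.6
exact_10 = binom_upper_tail(6, 10, p)     # P(X >= 5.5) = P(X >= 6)
exact_100 = binom_upper_tail(55, 100, p)  # P(X >= 55)

# Plain CLT approximation for n = 100 (no continuity correction)
mu, sd = 100 * p, sqrt(100 * p * (1 - p))
approx_100 = 1 - NormalDist(mu, sd).cdf(55)

print(exact_10, exact_100, approx_100)
```

For n = 10 the exact sum has only five terms, while for n = 100 the normal approximation replaces a 46-term sum with a single evaluation of Φ.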
Continuity correction for Normal approx to Binomial
When using the Normal approximation to the Binomial, note that,
since the Normal is a continuous random variable,
P(X = i) would always be approximated as 0,
even though it is strictly positive (because the Binomial is discrete).
The continuity correction therefore approximates P(X = i) by P(i − 0.5 ≤ X ≤ i + 0.5).
Example
Suppose for a Binomial(n = 100, p = 0.40) you need to
approximate P(35 ≤ X ≤ 40):
P(35 ≤ X ≤ 40) = P(34.5 ≤ X ≤ 40.5)
= P((34.5 − 40)/√24 ≤ (X − np)/√(np(1 − p)) ≤ (40.5 − 40)/√24)
≈ P(−1.12 ≤ Z ≤ 0.10)
= Φ(0.10) − Φ(−1.12)
= Φ(0.10) − (1 − Φ(1.12))
= 0.5398 − (1 − 0.8686) = 0.4084
Other continuity corrections:
P(X = 35) = P(34.5 ≤ X ≤ 35.5)
P(X = 36) = P(35.5 ≤ X ≤ 36.5)
P(40 ≤ X ≤ 60) = P(39.5 ≤ X ≤ 60.5)
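To see what the continuity correction buys, the sketch below (Python, standard library only, illustrative variable names) compares the exact Binomial probability P(35 ≤ X ≤ 40) with the normal approximation computed with and without the correction.

```python
from math import comb, sqrt
from statistics import NormalDist

n, p = 100, 0.40
lo, hi = 35, 40

# Exact Binomial probability P(35 <= X <= 40)
exact = sum(comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(lo, hi + 1))

normal = NormalDist(n * p, sqrt(n * p * (1 - p)))
with_cc = normal.cdf(hi + 0.5) - normal.cdf(lo - 0.5)   # P(34.5 <= X <= 40.5)
without_cc = normal.cdf(hi) - normal.cdf(lo)            # no correction

print(exact, with_cc, without_cc)
```

The corrected value differs slightly from 0.4084 because the code does not round z to two decimals as a table lookup does.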
Summary of sample mean properties
No matter what the population distribution is, denote
µ = the population expectation
σ² = the population variance
then the sample mean X̄ = (X_1 + ··· + X_n)/n will have
- E[X̄] = µ
- V[X̄] = σ²/n
Expectation of the sample variance S²
Remember: S² = (1/(n − 1)) Σ_{i=1}^{n} (X_i − X̄)²
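The standard result is E[S²] = σ², which is why S² divides by n − 1. A minimal sketch, assuming a hypothetical population (a fair six-sided die) and samples of size n = 3 drawn independently, verifies this exactly by enumerating all equally likely samples; it also confirms E[X̄] = µ and V[X̄] = σ²/n.

```python
from fractions import Fraction
from itertools import product

faces = [Fraction(f) for f in range(1, 7)]   # hypothetical population: a fair die
n = 3                                        # sample size

mu = sum(faces) / len(faces)
sigma2 = sum((x - mu) ** 2 for x in faces) / len(faces)

means, variances = [], []
for sample in product(faces, repeat=n):      # all 6**n equally likely samples
    xbar = sum(sample) / n
    s2 = sum((x - xbar) ** 2 for x in sample) / (n - 1)
    means.append(xbar)
    variances.append(s2)

count = len(means)
e_xbar = sum(means) / count                              # E[sample mean]
v_xbar = sum((m - e_xbar) ** 2 for m in means) / count   # V[sample mean]
e_s2 = sum(variances) / count                            # E[S^2]

print(e_xbar, v_xbar, e_s2, mu, sigma2)
```

Exact fractions are used so that the equalities hold without rounding error.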
Sampling from a normal population
When the population is normally distributed,
- We have seen that the sample mean X̄ is normal for all n:
(X̄ − µ)/(σ/√n) is standard normal for all n
- Now we discuss a result that allows us to obtain probabilities
for the sample variance S² = Σ_{i=1}^{n} (X_i − X̄)² / (n − 1):
(n − 1)S²/σ² has a χ²_{n−1} distribution
(chi-squared with n − 1 degrees of freedom)
Note (from yesterday): a sum Σ Z_i² of squared independent standard normal
random variables has a chi-squared distribution.
- Rather counterintuitive, but important: X̄ and S² are
independent
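These facts can be illustrated by simulation. The sketch below uses illustrative parameter values and a fixed seed (all of them assumptions). It checks that (n − 1)S²/σ² has mean close to n − 1 and variance close to 2(n − 1), as a chi-squared variable with n − 1 degrees of freedom should, and that X̄ and S² are essentially uncorrelated. Zero correlation is only a consequence of independence, not a proof of it.

```python
import random

rng = random.Random(0)
mu, sigma, n, reps = 10.0, 2.0, 5, 20000   # illustrative values

xbars, scaled = [], []
for _ in range(reps):
    xs = [rng.gauss(mu, sigma) for _ in range(n)]
    xbar = sum(xs) / n
    s2 = sum((x - xbar) ** 2 for x in xs) / (n - 1)
    xbars.append(xbar)
    scaled.append((n - 1) * s2 / sigma**2)

def mean(v):
    return sum(v) / len(v)

def var(v):
    m = mean(v)
    return sum((x - m) ** 2 for x in v) / (len(v) - 1)

def corr(a, b):
    ma, mb = mean(a), mean(b)
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b)) / (len(a) - 1)
    return cov / (var(a) * var(b)) ** 0.5

chi2_mean, chi2_var = mean(scaled), var(scaled)   # theory: n - 1 and 2(n - 1)
rho = corr(xbars, scaled)                         # theory: 0

print(chi2_mean, chi2_var, rho)
```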
Problem
1. The following data sets come from normal populations whose
standard deviation is specified. In each case, determine the
value of a statistic whose distribution is chi-squared, and tell
how many degrees of freedom this distribution has.
(a) 104, 110, 100, 98, 106; σ = 4
(b) 1.2, 1.6, 2.0, 1.5, 1.3, 1.8; σ = 0.5
(c) 12.4, 14.0, 16.0; σ = 2.4
2. Explain why a chi-squared random variable having n degrees
of freedom will approximately have the distribution of a
normal random variable when n is large.
Hint: Use the central limit theorem.
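Solution:
1. In each case the statistic is (n − 1)S²/σ² = Σ_{i=1}^{n} (X_i − X̄)²/σ²,
which is chi-squared with n − 1 degrees of freedom.
(a) n = 5, so there are 4 degrees of freedom. With x̄ = 103.6,
the observed value of that statistic is
[(104 − 103.6)² + ··· + (106 − 103.6)²]/4² = 91.2/16 = 5.7
2. A chi-squared random variable X with n degrees of freedom can be written as
X = Z_1² + ··· + Z_n², with Z_1, ..., Z_n independent standard normal.
Setting Y_i = Z_i², X = Y_1 + ··· + Y_n is a sum of independent, identically
distributed random variables, so for n large enough we can apply the CLT.
Hence, when n is large enough, a χ²_n random variable X is
approximately normal with some parameters µ and σ².
Spoiler: N(n, 2n)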
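For part 1, the observed values for all three data sets can be computed with a few lines of Python. This is a sketch; the function name `chi2_statistic` is hypothetical and simply evaluates (n − 1)S²/σ² together with its degrees of freedom.

```python
def chi2_statistic(data, sigma):
    """Return ((n - 1) * S^2 / sigma^2, degrees of freedom) for a sample."""
    n = len(data)
    xbar = sum(data) / n
    ss = sum((x - xbar) ** 2 for x in data)   # equals (n - 1) * S^2
    return ss / sigma**2, n - 1

datasets = {
    "a": ([104, 110, 100, 98, 106], 4),
    "b": ([1.2, 1.6, 2.0, 1.5, 1.3, 1.8], 0.5),
    "c": ([12.4, 14.0, 16.0], 2.4),
}
results = {key: chi2_statistic(d, s) for key, (d, s) in datasets.items()}

for key, (value, df) in results.items():
    print(f"({key}) observed value = {value:.4f}, degrees of freedom = {df}")
```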
The t distribution
If we standardize the sample mean using the sample variance (instead
of the population variance), then
(X̄ − µ)/(S/√n) is no longer normal.
It is said to have a t distribution with n − 1 degrees of freedom
(T_{n−1}).
The density function of a t looks similar to a standard normal
density, although it is somewhat more spread out, resulting in its
having “larger tails”.
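The larger tails can be seen in a small simulation. The sketch below assumes a standard normal population, samples of size n = 5, and a fixed seed (all illustrative choices). It estimates how often the studentized mean exceeds 1.96 in absolute value and compares this with the standard normal tail probability.

```python
import random
from statistics import NormalDist

rng = random.Random(1)
mu, sigma, n, reps = 0.0, 1.0, 5, 20000   # illustrative values

def t_statistic(xs, mu):
    """(mean - mu) / (S / sqrt(m)) for one sample of size m."""
    m = len(xs)
    xbar = sum(xs) / m
    s = (sum((x - xbar) ** 2 for x in xs) / (m - 1)) ** 0.5
    return (xbar - mu) / (s / m**0.5)

cut = 1.96
t_values = [t_statistic([rng.gauss(mu, sigma) for _ in range(n)], mu)
            for _ in range(reps)]

tail_t = sum(abs(t) > cut for t in t_values) / reps   # simulated, n - 1 = 4 d.f.
tail_z = 2 * (1 - NormalDist().cdf(cut))              # standard normal, about 0.05

print(tail_t, tail_z)
```

With only 4 degrees of freedom the simulated tail frequency is more than twice the normal value, which is the "spread out" behaviour described above.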
Plot of standard normal and t densities
Quantiles of the t distribution
If T_d is a t random variable with d degrees of freedom, its
100(1 − α) percentile is
t_{d,α} such that P(T_d > t_{d,α}) = α
(same concept as z_α for the standard normal)