Statistics 2 Final Review
1. (40pts) In the 2019-2020 NBA season, flat-earther Kyrie Irving made 199 field goals
out of 416 attempted, while NBA all-star and all-around great guy Kemba Walker
made 350 field goals out of 832 attempted. Let p1 be Kyrie Irving’s true field goal
percentage (i.e. percentage of successful field goals) in 2019-2020 and p2 be Kemba
Walker’s true field goal percentage in 2019-2020. Assume that all attempted field goals
are IID random variables, so each time Kyrie shoots is an IID Bernoulli(p1 ) random
variable, and each time Kemba shoots is an IID Bernoulli(p2 ) random variable, where
a value of 1 corresponds to a made shot and 0 corresponds to a missed shot. Show
all your work for full credit.
(i) (10pts) Provide (i) an estimate of p1 and (ii) a large-sample 99% confidence
interval for p1 .
The estimate is $\hat p_1 = 199/416 \approx 47.8\%$, and $z_{0.005} = 2.57$. The form of the CI is
\[ \hat p_1 \pm z_{\alpha/2}\sqrt{\hat p_1 (1 - \hat p_1)/n_1}. \]
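To make the formula concrete, here is a minimal sketch that plugs the counts from the problem into it. The variable names are illustrative, and the exact normal quantile from `statistics.NormalDist` (about 2.576) is used in place of the table value 2.57, so the endpoints may differ slightly in the last digit from a hand calculation.

```python
from math import sqrt
from statistics import NormalDist

made, attempts = 199, 416            # Kyrie Irving's counts from the problem
p1_hat = made / attempts             # point estimate of p1
z = NormalDist().inv_cdf(0.995)      # z_{0.005}, roughly 2.576
se = sqrt(p1_hat * (1 - p1_hat) / attempts)   # estimated standard error
ci_p1 = (p1_hat - z * se, p1_hat + z * se)    # large-sample 99% CI

print(f"p1_hat = {p1_hat:.4f}")
print(f"99% CI = ({ci_p1[0]:.4f}, {ci_p1[1]:.4f})")
```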
(ii) (10pts) Provide (i) an estimate of p1 −p2 , and (ii) a large-sample 99% confidence
interval for p1 −p2 . Provide a brief (one sentence) interpretation of this confidence
interval.
The estimate is $\hat p_1 - \hat p_2 = 199/416 - 350/832 \approx 5.8\%$. We are 99% confident that the
difference between the shooting percentages lies between -1.9% and 13.5%; under repeated
sampling, a confidence interval constructed in this manner would contain the true
difference p1 − p2 approximately 99% of the time.
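The interval quoted here can be reproduced with a short sketch, assuming the usual unpooled large-sample standard error for a difference of two independent proportions. Variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

x1, n1 = 199, 416    # Kyrie Irving: made, attempted
x2, n2 = 350, 832    # Kemba Walker: made, attempted
p1_hat, p2_hat = x1 / n1, x2 / n2
diff = p1_hat - p2_hat                       # estimate of p1 - p2

# Unpooled standard error, appropriate for a confidence interval
se_diff = sqrt(p1_hat * (1 - p1_hat) / n1 + p2_hat * (1 - p2_hat) / n2)
z = NormalDist().inv_cdf(0.995)              # z_{0.005}
ci_diff = (diff - z * se_diff, diff + z * se_diff)

print(f"estimate = {diff:.4f}")
print(f"99% CI = ({ci_diff[0]:.4f}, {ci_diff[1]:.4f})")
```

Because the interval contains 0, it does not rule out equal shooting percentages at the 99% level.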
(iii) (10pts) Do you expect the confidence interval from part (i) to contain the true
proportion p1 exactly 99% of the time and the confidence interval from part (ii)
to contain the true difference p1 − p2 exactly 99% of the time? Why or why not?
In what sense are these confidence intervals justified?
No, we do not expect them to contain the true proportion exactly 99% of the
time since the confidence intervals are approximate rather than exact. They are
based on an approximation using the central limit theorem, and are justified
“asymptotically” as n → ∞. Since the sample size is relatively large here, the
approximation should be quite good, but not perfect.
(iv) (10pts) Find the p-value for the large-sample test of the null hypothesis H0 :
p1 = p2 versus Ha : p1 ≠ p2 , i.e. a test of the null hypothesis that Kyrie Irving
and Kemba Walker’s percentages were equal in 2019-2020 versus the two-sided
alternative Ha that their percentages were different. Do you reject the null
hypothesis at the α = 0.05 level? Provide a brief (one sentence) interpretation of
the p-value.
The observed test statistic is z ≈ 1.936. Using the z-table, we find z0.0265 ≈ 1.936, so
the two-sided p-value is approximately 2(0.0265) = 0.053. Since the p-value exceeds 0.05,
we fail to reject H0 at the α = 0.05 level. If H0 were true, there would be an
approximately 5.3% chance of observing a result at least as extreme as the one we observed.
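The value 1.936 is consistent with a test statistic built on the pooled proportion under H0. The sketch below assumes that pooled standard error (the solution does not spell out which form it used) and computes the two-sided p-value from the normal CDF rather than a table.

```python
from math import sqrt
from statistics import NormalDist

x1, n1 = 199, 416
x2, n2 = 350, 832
p1_hat, p2_hat = x1 / n1, x2 / n2

# Under H0: p1 = p2, estimate the common proportion by pooling both samples
p_pool = (x1 + x2) / (n1 + n2)
se0 = sqrt(p_pool * (1 - p_pool) * (1 / n1 + 1 / n2))

z_stat = (p1_hat - p2_hat) / se0
p_value = 2 * (1 - NormalDist().cdf(abs(z_stat)))   # two-sided

print(f"z = {z_stat:.3f}, p-value = {p_value:.4f}")
```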
2. (40pts) The maximal acceptable radon level in a house is 4.0 units. An IID random
sample of n1 = 64 radon measurements is taken from a basement during the daytime
and the sample mean is found to be X̄ = 3.8 units, with a sample standard
deviation of σ̂X = 0.8 units. An independent IID random sample of n2 = 49 radon
measurements is taken from the basement during the nighttime and the sample mean
is found to be Ȳ = 4.1 units with a sample standard deviation of σ̂Y = 1.1 units.
Show all your work for full credit.
(i) (10pts) Give a 90% large-sample confidence interval for the average radon level
µX during the daytime.
The 90% CI is
\[ \bar X \pm z_{\alpha/2}\,\hat\sigma_X/\sqrt{n_1} = 3.8 \pm 1.645(0.8)/\sqrt{64} = [3.636, 3.965]. \]
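As a quick check of the arithmetic, the same interval can be computed directly. This is a sketch with illustrative variable names; it uses the exact normal quantile instead of the rounded 1.645.

```python
from math import sqrt
from statistics import NormalDist

x_bar, s_x, n1 = 3.8, 0.8, 64        # daytime sample summary from the problem
z = NormalDist().inv_cdf(0.95)       # z_{0.05}, roughly 1.645
margin = z * s_x / sqrt(n1)
ci_mu_x = (x_bar - margin, x_bar + margin)

print(f"90% CI = ({ci_mu_x[0]:.4f}, {ci_mu_x[1]:.4f})")
```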
(ii) (10pts) Find the p-value for the large-sample test of the null hypothesis H0 :
µX > 4.0 versus the one-sided alternative Ha : µX < 4.0, and provide a brief (one
sentence) interpretation of the result.
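No worked solution is recorded for this part. One way the computation could go, assuming the standard large-sample z-test evaluated at the boundary value µX = 4.0, is sketched below; the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

x_bar, s_x, n1 = 3.8, 0.8, 64
mu_0 = 4.0                                   # boundary of the null hypothesis

z_stat = (x_bar - mu_0) / (s_x / sqrt(n1))   # standardized distance from mu_0
p_value = NormalDist().cdf(z_stat)           # lower-tail area, since Ha: mu_X < 4.0

print(f"z = {z_stat:.2f}, p-value = {p_value:.4f}")
```

Under this assumption the statistic is −2 and the p-value is about 0.023: if the true daytime mean were 4.0, a sample mean this low or lower would occur roughly 2.3% of the time.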
(iii) (10pts) Give a 90% large-sample confidence interval for the difference in the
average radon levels during the day and at night, µX − µY .
The 90% CI is
\[ (\bar X - \bar Y) \pm z_{\alpha/2}\sqrt{\hat\sigma_X^2/n_1 + \hat\sigma_Y^2/n_2} = (3.8 - 4.1) \pm 1.645\sqrt{0.8^2/64 + 1.1^2/49} = [-0.61, 0.0064]. \]
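The endpoints can be verified numerically with a small sketch (illustrative names, exact normal quantile in place of 1.645):

```python
from math import sqrt
from statistics import NormalDist

x_bar, s_x, n1 = 3.8, 0.8, 64        # daytime
y_bar, s_y, n2 = 4.1, 1.1, 49        # nighttime

se_diff = sqrt(s_x**2 / n1 + s_y**2 / n2)    # SE of X-bar minus Y-bar
z = NormalDist().inv_cdf(0.95)               # z_{0.05}
diff = x_bar - y_bar
ci_mu_diff = (diff - z * se_diff, diff + z * se_diff)

print(f"90% CI = ({ci_mu_diff[0]:.4f}, {ci_mu_diff[1]:.4f})")
```

The upper endpoint is only barely above 0, which matches a two-sided p-value just over 0.10 for the test of equal means.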
(iv) (10pts) Find the p-value for the large-sample test of the null hypothesis H0 :
µX = µY versus the two-sided alternative Ha : µX ≠ µY , and provide a brief (one
sentence) interpretation of the result.
Since δ0 = 0, the test statistic is
\[ \frac{\bar X - \bar Y - \delta_0}{\sqrt{\hat\sigma_X^2/n_1 + \hat\sigma_Y^2/n_2}} = \frac{3.8 - 4.1}{\sqrt{0.8^2/64 + 1.1^2/49}} = -1.61, \]
which has absolute value 1.61.
We find z0.0537 = 1.61, so the p-value for the two-sided test is 2(0.0537) ≈ 0.107.
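A short sketch of the same calculation, using the normal CDF instead of a table lookup (variable names are illustrative):

```python
from math import sqrt
from statistics import NormalDist

x_bar, s_x, n1 = 3.8, 0.8, 64
y_bar, s_y, n2 = 4.1, 1.1, 49
delta_0 = 0.0                                 # hypothesized difference under H0

se_diff = sqrt(s_x**2 / n1 + s_y**2 / n2)
z_stat = (x_bar - y_bar - delta_0) / se_diff
p_value = 2 * NormalDist().cdf(-abs(z_stat))  # two-sided: both tails

print(f"z = {z_stat:.2f}, p-value = {p_value:.3f}")
```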
3. (45pts) Suppose that X1 , . . . , Xn are IID observations and that the density function of
the observations, which is known up to the parameter θ > 0, is given by
\[ f_\theta(x) = \frac{1}{2\sqrt{\theta}}\, e^{-|x|/\sqrt{\theta}} \]
for all x ∈ R. Note that $E[X] = 0$, and $\mathrm{Var}(X) = E[X^2] = 2\theta$ (which you do not have
to prove). Show all your work for full credit.
(i) (5pts) Find the likelihood function $L_n(\theta)$ and log-likelihood function $\ell_n(\theta)$.
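We have
\[ L_n(\theta) = \prod_{i=1}^n f_\theta(X_i) = \prod_{i=1}^n \frac{1}{2\sqrt{\theta}}\, e^{-|X_i|/\sqrt{\theta}} = 2^{-n}\,\theta^{-n/2}\, e^{-\sum_{i=1}^n |X_i|/\sqrt{\theta}} \]
and
\[ \ell_n(\theta) = \log L_n(\theta) = -n\log(2) - \frac{n}{2}\log\theta - \sum_{i=1}^n |X_i|/\sqrt{\theta}. \]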
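As a sanity check on the algebra, the closed-form log-likelihood can be compared with a direct sum of log densities. This is a sketch only: the function names and the small data set are made up for illustration.

```python
from math import log, sqrt

def log_density(x, theta):
    """log f_theta(x) for f_theta(x) = exp(-|x|/sqrt(theta)) / (2*sqrt(theta))."""
    return -log(2 * sqrt(theta)) - abs(x) / sqrt(theta)

def log_likelihood(xs, theta):
    """Closed form: -n*log(2) - (n/2)*log(theta) - sum|x_i|/sqrt(theta)."""
    n = len(xs)
    return -n * log(2) - 0.5 * n * log(theta) - sum(abs(x) for x in xs) / sqrt(theta)

xs = [0.5, -1.2, 2.0, -0.3]     # hypothetical observations, illustration only
theta = 1.7                     # arbitrary positive parameter value

direct = sum(log_density(x, theta) for x in xs)
closed = log_likelihood(xs, theta)
print(direct, closed)           # the two values should agree up to rounding
```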
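(ii) (5pts) Show that the maximum likelihood estimator (MLE) of θ is $\left(\frac{1}{n}\sum_{i=1}^n |X_i|\right)^2$.
\[ \frac{\partial^2}{\partial\theta^2}\ell_n(\theta) = \frac{n}{2\theta^2} - \frac{3\sum_{i=1}^n |X_i|}{4\theta^{5/2}}. \]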
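Only the second derivative is recorded for this part. A sketch of the full argument, derived from the log-likelihood in part (i), sets the first derivative to zero and then checks the sign of the second derivative at the critical point:

```latex
\[
\frac{\partial}{\partial\theta}\ell_n(\theta)
  = -\frac{n}{2\theta} + \frac{\sum_{i=1}^n |X_i|}{2\theta^{3/2}} = 0
  \iff \sqrt{\theta} = \frac{1}{n}\sum_{i=1}^n |X_i|
  \iff \hat\theta_n = \left(\frac{1}{n}\sum_{i=1}^n |X_i|\right)^2 .
\]
\[
\left.\frac{\partial^2}{\partial\theta^2}\ell_n(\theta)\right|_{\theta=\hat\theta_n}
  = \frac{n}{2\hat\theta_n^{2}} - \frac{3\, n\sqrt{\hat\theta_n}}{4\hat\theta_n^{5/2}}
  = \frac{n}{2\hat\theta_n^{2}} - \frac{3n}{4\hat\theta_n^{2}}
  = -\frac{n}{4\hat\theta_n^{2}} < 0 .
\]
```

The second line uses $\sum_{i=1}^n |X_i| = n\sqrt{\hat\theta_n}$; the negative sign confirms that the critical point is a maximum.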
(iii) (5pts) Use the continuous mapping theorem (Theorem 9.2) and the fact that
$E[|X|] = \sqrt{\theta}$ (which you do not have to prove) to determine whether the MLE is
consistent.
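By the law of large numbers, $\frac{1}{n}\sum_{i=1}^n |X_i|$ converges in probability to $E[|X|] = \sqrt{\theta}$.
Therefore, since $g(u) = u^2$ is continuous, the continuous mapping theorem gives
\[ \hat\theta_n = \left(\frac{1}{n}\sum_{i=1}^n |X_i|\right)^2 \xrightarrow{P} (E[|X|])^2 = \theta, \]
so the MLE is consistent.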
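Consistency can also be illustrated by simulation. The sketch below relies on the fact that, under this density, |X| is exponentially distributed with mean √θ, so |X| is sampled directly with `random.expovariate`. The value θ = 4, the seed, and the sample sizes are arbitrary choices for illustration.

```python
import random
from math import sqrt

def mle_theta(abs_values):
    """MLE of theta: the squared sample mean of |X_i|."""
    mean_abs = sum(abs_values) / len(abs_values)
    return mean_abs ** 2

random.seed(0)                  # fixed seed so the run is repeatable
theta = 4.0                     # assumed true parameter for the simulation
rate = 1 / sqrt(theta)          # |X| is exponential with mean sqrt(theta)

estimates = {}
for n in (100, 10_000, 200_000):
    abs_sample = [random.expovariate(rate) for _ in range(n)]
    estimates[n] = mle_theta(abs_sample)

for n, est in estimates.items():
    print(f"n = {n:>7}: theta_hat = {est:.4f}")
```

The estimates should settle near 4 as n grows, in line with the convergence in probability shown above.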
(iv) (5pts) Use the fact that $E[|X|] = \sqrt{\theta}$ and $\mathrm{Var}(|X|) = \theta$ (which you do not have
to prove) to find the bias of the MLE. (Hint: $E[Y^2] = \mathrm{Var}(Y) + (E[Y])^2$.)
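We have
\begin{align*}
E\left[\left(\frac{1}{n}\sum_{i=1}^n |X_i|\right)^2\right]
  &= \mathrm{Var}\left(\frac{1}{n}\sum_{i=1}^n |X_i|\right) + \left(E\left[\frac{1}{n}\sum_{i=1}^n |X_i|\right]\right)^2 \\
  &= \frac{1}{n}\mathrm{Var}(|X|) + (E[|X|])^2 \\
  &= \frac{1}{n}\theta + \theta = \frac{n+1}{n}\theta,
\end{align*}
so the bias of the MLE is θ/n.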
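The bias formula can be checked by Monte Carlo for a small sample size, where the bias θ/n is large enough to see. As before, this sketch samples |X| as an exponential variable with mean √θ; the values θ = 4, n = 5, the seed, and the number of replications are arbitrary.

```python
import random
from math import sqrt

random.seed(1)
theta, n, reps = 4.0, 5, 100_000
rate = 1 / sqrt(theta)          # |X| is exponential with mean sqrt(theta)

total = 0.0
for _ in range(reps):
    mean_abs = sum(random.expovariate(rate) for _ in range(n)) / n
    total += mean_abs ** 2      # one realization of the MLE

avg_mle = total / reps                  # Monte Carlo estimate of E[theta_hat]
predicted = (n + 1) / n * theta         # (n+1)/n * theta from the derivation

print(f"simulated mean = {avg_mle:.3f}, predicted = {predicted:.3f}")
```

With these settings the predicted mean is 4.8, that is, θ plus a bias of θ/n = 0.8.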
(vi) (5pts) Show using the factorization theorem that $T_n = \sum_{i=1}^n |X_i|$ is a sufficient
statistic for θ.
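No worked solution is recorded for this part. One possible sketch, starting from the likelihood found in part (i), writes it as a function of the data only through $T_n$ times a factor free of θ:

```latex
\[
L_n(\theta)
  = \underbrace{2^{-n}\,\theta^{-n/2}\exp\!\left(-\frac{T_n}{\sqrt{\theta}}\right)}_{g(T_n,\,\theta)}
    \cdot \underbrace{1}_{h(X_1,\dots,X_n)},
  \qquad T_n = \sum_{i=1}^n |X_i| .
\]
```

Since $g$ depends on the observations only through $T_n$ and $h$ does not involve θ, the factorization theorem would give sufficiency of $T_n$.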
(vii) (5pts) Is either the MLE or the corrected MLE from part (v) the MVUE?
The MLE is not the MVUE because it is biased. The corrected MLE is the
MVUE by the Rao-Blackwell Theorem because it is unbiased and a function of
the minimal sufficient statistic Tn .
(viii) (5pts) What is the MLE of $\sqrt{\theta}$?
By the invariance property of the MLE, the MLE of $\sqrt{\theta}$ is
\[ \sqrt{\hat\theta_n} = \sqrt{\left(\frac{1}{n}\sum_{i=1}^n |X_i|\right)^2} = \frac{1}{n}\sum_{i=1}^n |X_i|. \]
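(ix) (5pts) Use the fact that $\frac{2}{\sqrt{\theta}}\sum_{i=1}^n |X_i| \sim \chi^2(2n)$, that is, follows a chi-squared
distribution with 2n degrees of freedom, to find a (1 − α)-level confidence interval for θ.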
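We have
\begin{align*}
1 - \alpha
  &= P\left(\chi^2_{1-\alpha/2}(2n) \le \frac{2}{\sqrt{\theta}}\sum_{i=1}^n |X_i| \le \chi^2_{\alpha/2}(2n)\right) \\
  &= P\left(\frac{\chi^2_{1-\alpha/2}(2n)}{2\sum_{i=1}^n |X_i|} \le \frac{1}{\sqrt{\theta}} \le \frac{\chi^2_{\alpha/2}(2n)}{2\sum_{i=1}^n |X_i|}\right) \\
  &= P\left(\frac{2\sum_{i=1}^n |X_i|}{\chi^2_{1-\alpha/2}(2n)} \ge \sqrt{\theta} \ge \frac{2\sum_{i=1}^n |X_i|}{\chi^2_{\alpha/2}(2n)}\right) \\
  &= P\left(\left[\frac{2\sum_{i=1}^n |X_i|}{\chi^2_{1-\alpha/2}(2n)}\right]^2 \ge \theta \ge \left[\frac{2\sum_{i=1}^n |X_i|}{\chi^2_{\alpha/2}(2n)}\right]^2\right).
\end{align*}
So our (1 − α)-level CI is
\[ \left(\left(\frac{2\sum_{i=1}^n |X_i|}{\chi^2_{\alpha/2}(2n)}\right)^2,\ \left(\frac{2\sum_{i=1}^n |X_i|}{\chi^2_{1-\alpha/2}(2n)}\right)^2\right). \]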
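The interval can be tried out in a small coverage simulation. This sketch fixes n = 5 and α = 0.05, so the pivot has 10 degrees of freedom, and takes the two critical values from a standard chi-squared table. The helper name `theta_ci`, the true value θ = 4, the seed, and the number of replications are illustrative assumptions, and |X| is again sampled as an exponential variable with mean √θ.

```python
import random
from math import sqrt

# Upper-tail critical values for 2n = 10 degrees of freedom (alpha = 0.05),
# read from a standard chi-squared table.
CHI2_UPPER = 20.483     # chi^2_{alpha/2}(10): area 0.025 to the right
CHI2_LOWER = 3.247      # chi^2_{1-alpha/2}(10): area 0.975 to the right

def theta_ci(abs_sum, chi2_upper=CHI2_UPPER, chi2_lower=CHI2_LOWER):
    """(1 - alpha) CI for theta given abs_sum = sum of |X_i|."""
    return ((2 * abs_sum / chi2_upper) ** 2, (2 * abs_sum / chi2_lower) ** 2)

random.seed(2)
theta, n, reps = 4.0, 5, 20_000
rate = 1 / sqrt(theta)          # |X| is exponential with mean sqrt(theta)

hits = 0
for _ in range(reps):
    abs_sum = sum(random.expovariate(rate) for _ in range(n))
    lower, upper = theta_ci(abs_sum)
    hits += lower <= theta <= upper     # did the interval cover theta?

coverage = hits / reps
print(f"empirical coverage = {coverage:.3f}")
```

Because the pivot is exactly chi-squared rather than asymptotically normal, the empirical coverage should sit close to 95% even for this small n, unlike the large-sample intervals in Problems 1 and 2.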