PSet 7
1. Sample mean & sample variance. Suppose that X1, . . . , Xn are i.i.d. draws from a distribution with mean µ and variance σ². The sample variance S² is a function of the centered values, Xi − X̄.
(a) Prove that X̄ is uncorrelated with Xi − X̄ (for any single i), i.e. Cov(X̄, Xi − X̄) = 0.
(b) We have learned before that if variables A and B are independent, then A is also independent of any function of B, i.e. A ⊥⊥ g(B). However, the same does not hold for correlation/covariance: it’s possible to have Corr(A, B) = 0 but Corr(A, g(B)) ≠ 0. Construct a simple example to show that this can occur. (Your example does not need to be related to the context of the Xi’s / X̄ / etc.)
(c) Let X1, X2 be i.i.d. Bernoulli(p) draws, where 0 < p < 1. Calculate Cov(X̄, S²). Is it zero or nonzero (your answer might depend on p)? Are X̄ and S² independent or not independent (your answer might depend on p)? (A simulation sketch for checking your answer follows below.)
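A minimal R sketch for checking part (c) by simulation, using a hypothetical value p = 0.3 (any 0 < p < 1 works):

    # Monte Carlo estimate of Cov(X-bar, S^2) for two i.i.d. Bernoulli(p) draws.
    set.seed(1)
    p <- 0.3                         # arbitrary illustrative choice
    reps <- 1e5
    x1 <- rbinom(reps, size = 1, prob = p)
    x2 <- rbinom(reps, size = 1, prob = p)
    xbar <- (x1 + x2) / 2
    s2 <- (x1 - x2)^2 / 2            # sample variance when n = 2
    cov(xbar, s2)                    # compare with your closed-form answer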
2. Multiple testing. Using the CLT, we showed in class that if X̄ is the sample average from n data points drawn i.i.d. from a distribution with mean µ and variance σ², then it’s very unlikely that X̄ will be far from the true mean µ.
(a) To make the statement above concrete, suppose the sample size is n = 100, and we have µ = 5 and σ² = 35. Calculate (approximately) the probability that the sample mean has an error of more than 1.5, i.e., P(|X̄ − µ| > 1.5).
(b) Now suppose that we run our study 20 times (each time gathering a sample of size n = 100 from the same population). Assume our 20 runs are independent. We calculate the sample mean for each study, and report X̄ = the largest sample mean from any of our 20 studies. What is the probability that this X̄ has error more than 1.5? (A sketch for checking both parts numerically follows below.)
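A minimal R sketch for checking both parts under the CLT normal approximation, with the values given above (the number of replications, 10^5, is an arbitrary choice):

    # Problem 2 check: part (a) directly via pnorm, part (b) by simulating the
    # 20 studies' sample means under the normal approximation for X-bar.
    set.seed(1)
    n <- 100; mu <- 5; sigma2 <- 35
    se <- sqrt(sigma2 / n)                    # approximate sd of one sample mean
    2 * (1 - pnorm(1.5 / se))                 # (a) P(|X-bar - mu| > 1.5), one study
    reps <- 1e5
    xbar_max <- replicate(reps, max(rnorm(20, mean = mu, sd = se)))
    mean(abs(xbar_max - mu) > 1.5)            # (b) Monte Carlo estimate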
3. Sample mean/variance & confidence intervals. Suppose that X1, . . . , X20 are i.i.d. samples from a N(µ, σ²) distribution, with µ and σ² unknown. Let X̄ = 14.1 and S² = 9.9 be the sample mean and sample variance of our data. Calculate a 95% confidence interval for µ. Your final answer should be numerical, i.e., a numerical value for the lower endpoint and for the upper endpoint, with no notation such as t_{n,α}, etc.
(To obtain probabilities and cutoff values for the t distribution, you can use Table 4 in the back of your book or just search online for “t distribution table”; be aware that the notation t_{.95} in this table is not the same as the critical t value for 95% confidence (you should look at the picture in the book to see what is meant by t_{.95}). Or, if you have R, you can use the commands pt & qt instead of looking at the table.)
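For the final numbers, a minimal sketch of the qt-based computation mentioned above, using the values from this problem:

    # Problem 3: 95% t confidence interval for mu with n = 20, X-bar = 14.1, S^2 = 9.9.
    n <- 20; xbar <- 14.1; s2 <- 9.9
    tcrit <- qt(0.975, df = n - 1)           # two-sided 95% critical value
    xbar + c(-1, 1) * tcrit * sqrt(s2 / n)   # lower and upper endpoints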
4. MLE example. Suppose that we are working with a family of densities
f(x | θ) = (θ + 1) x^θ
supported on x ∈ [0, 1]. The parameter θ must satisfy θ > −1 for this to be a valid density.
(a) Suppose we draw i.i.d. samples X1, . . . , Xn from the density f(x | θ). Calculate the MLE, θ̂.
(b) Calculate the Fisher information I(θ).
(c) Calculate the approximate normal distribution of the MLE θ̂, if the true parameter is θ0 and the
sample size n is large. (The mean and variance of the normal may depend on n and/or θ0 , but
should not depend on any other quantities.)
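A minimal R sketch for checking parts (a)–(c) numerically, assuming a hypothetical true value θ0 = 2. It simulates from f(x | θ0) by inverse-CDF sampling (the CDF is F(x) = x^(θ+1) on [0, 1]) and maximizes the log-likelihood with optimize, for comparison with your closed-form MLE:

    # Problem 4 check: inverse-CDF sampling, then numerical maximum likelihood.
    set.seed(1)
    theta0 <- 2                              # hypothetical true parameter
    n <- 1000
    x <- runif(n)^(1 / (theta0 + 1))         # X = U^(1/(theta+1)) ~ f(x | theta0)
    loglik <- function(theta) n * log(theta + 1) + theta * sum(log(x))
    optimize(loglik, interval = c(-0.99, 20), maximum = TRUE)$maximum

Repeating this over many simulated datasets and histogramming the resulting θ̂’s gives a visual check of the normal approximation in part (c).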
5. Method of moments. Consider the same parametric family as in the problem above,
f(x | θ) = (θ + 1) x^θ
supported on x ∈ [0, 1], for some parameter θ > −1. Above we estimated θ with its MLE θ̂. In this problem we will use the method of moments (MoM) for estimating θ, instead of the MLE.
(a) Assuming a known value of the parameter θ, calculate µ = E(X), where X is a single draw from the density f(x | θ).
(b) Now solve for the MoM estimator θ̂ as a function of the data X1, . . . , Xn.
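A minimal R sketch for checking part (b): it computes µ(θ) by numerical quadrature (so it does not presuppose your answer to part (a)) and inverts µ(θ) = X̄ with uniroot; θ0 = 2 is again a hypothetical choice:

    # Problem 5 check: solve mu(theta) = xbar numerically.
    set.seed(1)
    theta0 <- 2
    x <- runif(1000)^(1 / (theta0 + 1))      # same sampler as in Problem 4
    mu_fn <- function(theta)                 # E(X) under f(x | theta), by quadrature
      integrate(function(t) t * (theta + 1) * t^theta, 0, 1)$value
    uniroot(function(theta) mu_fn(theta) - mean(x), c(-0.99, 50))$root
    # Should land near theta0 and match your closed-form MoM estimate.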
6. Estimating variance for normal data. Suppose that we observe i.i.d. data X1, . . . , Xn ∼ N(0, ν): we know the mean is 0, but we don’t know the variance ν. (We are writing variance = ν, instead of σ², to make it clear that we are working with variance rather than with standard deviation as our parameter.)
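A minimal R line for generating data from this model, useful for testing whatever estimator of ν you derive; n = 50 and ν = 4 are hypothetical choices (note that rnorm is parameterized by the standard deviation, not the variance):

    # Simulate X1, ..., Xn ~ N(0, nu): pass sqrt(nu) as the sd argument.
    set.seed(1)
    n <- 50; nu <- 4
    x <- rnorm(n, mean = 0, sd = sqrt(nu))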