STAT 135 Lab 2 Confidence Intervals, MLE and The Delta Method

This document provides an overview of confidence intervals, maximum likelihood estimation (MLE), the method of moments (MOM), and the delta method. It defines what a confidence interval is and how it is calculated, explains how MLE finds the value of a parameter θ that maximizes the likelihood function, describes how MOM estimates parameters by equating theoretical moments of a distribution to sample moments, and introduces the delta method for approximating the variance of a function of an estimator.


STAT 135 Lab 2

Confidence Intervals, MLE and the Delta Method

February 2, 2015
Confidence Intervals
Confidence intervals

What is a confidence interval?

- A confidence interval is calculated in such a way that the
  interval contains the true value of θ with some specified
  probability (the coverage probability).

What kind of parameters can θ correspond to?

- θ = µ from N(µ, σ²)
- θ = p from Binomial(n, p)

θ typically corresponds to a parameter of a distribution, F, from
which we are sampling:

    Xi ∼ F(θ)
Confidence intervals

- We usually write the coverage probability in the form 1 − α.
- If the coverage probability is 95%, then α = 0.05.
- Let qα be the number such that P(Z < qα) = 1 − α, where Z ∼ N(0, 1).
- By the symmetry of the normal distribution, we also have qα = −q(1−α).
Confidence intervals

qα is the number such that

    P(Z < qα) = 1 − α

where Z ∼ N(0, 1).

Note also that by the symmetry of the normal distribution,

    1 − α = P(Z < qα) = P(−qα/2 < Z < qα/2)

For a 95% CI, we have:

    q0.05/2 = q0.025 = 1.96, because P(−1.96 < Z < 1.96) = 0.95

Confidence intervals

Suppose that our estimate θ̂n of θ asymptotically satisfies

    (θ̂n − θ) / σθ̂n ∼ N(0, 1)

So in all of the equations on the previous slides, we can replace Z
with (θ̂n − θ) / σθ̂n and rearrange so that θ is the subject.
Confidence intervals

Recall that

    1 − α = P(−qα/2 < Z < qα/2)

Given that (θ̂n − θ) / σθ̂n ∼ N(0, 1), we also have the result that

    1 − α = P(−qα/2 < (θ̂n − θ) / σθ̂n < qα/2)

Rearranging to make θ the subject, we have

    1 − α = P(θ̂n − qα/2 σθ̂n < θ < θ̂n + qα/2 σθ̂n)
Confidence intervals

We have that

    1 − α = P(θ̂n − qα/2 σθ̂n < θ < θ̂n + qα/2 σθ̂n)

Recall that if we're looking for a 95% confidence interval (CI), then
we are looking for an interval (a, b) such that P(a < θ < b) = 0.95.

Thus, the 95% CI for θ can be found from

    0.95 = P(θ̂n − q0.025 σθ̂n < θ < θ̂n + q0.025 σθ̂n)

For a general 100(1 − α)% CI, the interval

    [θ̂n − qα/2 σθ̂n, θ̂n + qα/2 σθ̂n]

contains θ with probability 1 − α.
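As an illustration, a hypothetical helper for a normal-approximation CI for a mean (the function and data below are made up; the sample standard deviation estimates σθ̂n):

```python
import math
from statistics import NormalDist, mean, stdev

def normal_ci(sample, alpha=0.05):
    """(1 - alpha) CI for the mean: theta_hat ± q_{alpha/2} * SE,
    with the sample standard deviation estimating sigma."""
    n = len(sample)
    theta_hat = mean(sample)
    se = stdev(sample) / math.sqrt(n)         # estimated standard error
    q = NormalDist().inv_cdf(1 - alpha / 2)   # q_{alpha/2} in the slides' notation
    return (theta_hat - q * se, theta_hat + q * se)

# Toy data: the interval is centered at the sample mean (here 10.05)
lo, hi = normal_ci([9.8, 10.1, 10.4, 9.9, 10.2, 10.0, 9.7, 10.3])
print(lo < 10.05 < hi)  # True
```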


Exercise 1

Confidence intervals - exercise

CI exercises:
1. In R, generate 1000 random samples, x1, x2, ..., x1000, from a
   (continuous) Uniform(5, 15) distribution.
2. From the 1000 numbers you have just generated, draw 100
   simple random samples (without replacement!), X1, ..., X100.
   Repeat this 1000 times, so that we have 1000 samples of size
   100.
3. For each sample of size 100, compute the sample mean, and
   produce a histogram (preferably using ggplot()) of the 1000
   sample means calculated above. What distribution does the
   sample mean (approximately) follow, and why?
4. For each sample, calculate the 95% confidence interval for the
   population mean.
5. Of the 1000 confidence intervals, what proportion of them
   cover the true mean µ = (15 + 5)/2 = 10?
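The exercise asks for R (with ggplot); as a sketch of the same logic, here is a standard-library Python version (the seed is arbitrary; sample sizes and the Uniform(5, 15) population follow the exercise):

```python
import math
import random
from statistics import NormalDist, mean, stdev

random.seed(1)  # for reproducibility

# Step 1: 1000 draws from Uniform(5, 15) -- the "population"
population = [random.uniform(5, 15) for _ in range(1000)]

q = NormalDist().inv_cdf(0.975)  # ~1.96
reps = 1000
covered = 0
for _ in range(reps):
    # Steps 2 and 4: SRS of size 100 without replacement, then a 95% CI
    sample = random.sample(population, 100)
    xbar = mean(sample)
    se = stdev(sample) / math.sqrt(100)
    if xbar - q * se < 10 < xbar + q * se:  # true mean is (15 + 5)/2 = 10
        covered += 1

print(covered / reps)  # a proportion close to 0.95
```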
Maximum likelihood estimation

Maximum likelihood estimation

- Confidence interval for θ: calculate a range of values in
  which the true value of the parameter θ lies with some
  specified probability.
- Maximum likelihood estimator for θ: calculate a single
  value which estimates the true value of θ by maximizing the
  likelihood function with respect to θ,
  i.e. find the value of θ that maximizes the likelihood of
  observing the given data.
Maximum likelihood estimation

What is the likelihood function?

- The likelihood function, lik(θ), is a function of θ which
  corresponds to the probability of observing our sample for
  various values of θ.

How do we find the value of θ that maximizes the likelihood function?
Maximum likelihood estimation

Assume that we have observed i.i.d. random variables X1 , ...., Xn


and that their distribution has density/frequency function fθ .
Suppose that the observed value of Xi is xi for each i = 1, 2, ..., n
How do we write down the likelihood function? The (non-rigorous)
idea:

lik(θ) = P(X1 = x1 , ..., Xn = xn )


= P(X1 = x1 )...P(Xn = xn )
Yn
= fθ (Xi )
i=1

(Note that this proof is not rigorous for continuous variables since
they take on specific values with probability 0)
Maximum likelihood estimation

There are 4 main steps in calculating the MLE, θ̂MLE, of θ:

1. Write down the likelihood function, lik(θ) = ∏_{i=1}^n fθ(Xi).
2. Calculate the log-likelihood function ℓ(θ) = log(lik(θ)).
   (Note: this is because it is often much easier to find the
   maximum of the log-likelihood function than of the likelihood
   function.)
3. Differentiate the log-likelihood function with respect to θ.
4. Set the derivative to 0, and solve for θ.
Maximum likelihood estimation - example

Example: Suppose Xi ∼ Bernoulli(p), so that

    fp(x) = p^x (1 − p)^(1−x)

Step 1: Write down the likelihood function:

    lik(p) = ∏_{i=1}^n fp(Xi)
           = ∏_{i=1}^n p^(Xi) (1 − p)^(1−Xi)
           = p^(∑_{i=1}^n Xi) (1 − p)^(∑_{i=1}^n (1−Xi))
Maximum likelihood estimation - example

Example: Suppose Xi ∼ Bernoulli(p), so that

    fp(x) = p^x (1 − p)^(1−x)

Step 1: lik(p) = p^(∑_{i=1}^n Xi) (1 − p)^(∑_{i=1}^n (1−Xi))

Step 2: Calculate the log-likelihood function:

    ℓ(p) = log(lik(p)) = (∑_{i=1}^n Xi) log(p) + (∑_{i=1}^n (1 − Xi)) log(1 − p)
Maximum likelihood estimation - example

Example: Suppose Xi ∼ Bernoulli(p), so that

    fp(x) = p^x (1 − p)^(1−x)

Step 2: ℓ(p) = (∑_{i=1}^n Xi) log(p) + (∑_{i=1}^n (1 − Xi)) log(1 − p)

Step 3: Differentiate the log-likelihood function with respect
to p:

    dℓ(p)/dp = (∑_{i=1}^n Xi)/p − (∑_{i=1}^n (1 − Xi))/(1 − p)
Maximum likelihood estimation - example

Example: Suppose Xi ∼ Bernoulli(p), so that

    fp(x) = p^x (1 − p)^(1−x)

Step 3: dℓ(p)/dp = (∑_{i=1}^n Xi)/p − (∑_{i=1}^n (1 − Xi))/(1 − p)

Step 4: Set the derivative to 0, and solve for p:

    dℓ(p)/dp = 0  ⟹  p̂MLE = (∑_{i=1}^n Xi)/n = X̄

So the MLE for p where Xi ∼ Bernoulli(p) is just the sample mean.
Method of Moments (MOM)

Method of Moments

- Confidence interval for θ: calculate a range of values in
  which the true value of the parameter θ lies with some
  specified probability.
- Maximum likelihood estimator for θ: calculate a single
  value which estimates the true value of θ by maximizing the
  likelihood function with respect to θ.
- Method of moments estimator for θ: by equating the
  theoretical moments to the empirical (sample) moments,
  derive equations that relate the theoretical moments to θ.
  The equations are then solved for θ.

Suppose X follows some distribution. The kth moment of the
distribution is defined to be

    µk = E[X^k] = gk(θ)

which will be some function of θ.
Method of Moments

MOM works by equating the theoretical moments (which will be a
function of θ) to the empirical moments.

    Moment           Theoretical moment    Empirical moment
    first moment     E[X]                  (1/n) ∑_{i=1}^n Xi
    second moment    E[X²]                 (1/n) ∑_{i=1}^n Xi²
    third moment     E[X³]                 (1/n) ∑_{i=1}^n Xi³
Method of Moments

MOM is perhaps best described by example.
Suppose that X ∼ Bernoulli(p). Then the first moment is given by

    E[X] = 0 × P(X = 0) + 1 × P(X = 1) = p

Moreover, we can estimate E[X] by taking a sample X1, ..., Xn
and calculating the sample mean:

    X̄ = (1/n) ∑_{i=1}^n Xi

We approximate the first theoretical moment, E[X], by the first
empirical moment, X̄, i.e.

    p̂MOM = X̄

which is the same as the MLE estimator! (Note that this is not
always the case...)
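For a two-parameter distribution, two moments are needed. A minimal sketch in Python, assuming a N(µ, σ²) model and made-up data (this example is not from the slides):

```python
from statistics import mean

# Made-up sample (illustration only)
x = [4.2, 5.1, 3.8, 4.9, 5.6, 4.4, 5.0, 4.7]

# Empirical first and second moments
m1 = mean(x)                  # estimates E[X]
m2 = mean(v * v for v in x)   # estimates E[X^2]

# For N(mu, sigma^2): E[X] = mu and E[X^2] = sigma^2 + mu^2,
# so equating theoretical to empirical moments and solving:
mu_mom = m1
sigma2_mom = m2 - m1 * m1

print(mu_mom, sigma2_mom)
```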
Exercise 2

Exercise – Question 43, Chapter 8 (page 320) from John Rice

The file gamma-arrivals contains a set of gamma-ray data
consisting of the times between arrivals (interarrival times) of
3,935 photons (units are seconds).

1. Make a histogram of the interarrival times. Does it appear
   that a gamma distribution would be a plausible model?
2. Fit the parameters by the method of moments and by
   maximum likelihood. How do the estimates compare?
3. Plot the two fitted gamma densities on top of the histogram.
   Do the fits look reasonable?

Hint 1: the gamma density can be written as

    fα,β(x) = (β^α / Γ(α)) x^(α−1) e^(−βx)
Hint 2: the MLE for α has no closed-form solution, so α̂MLE must be
found numerically.
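As a sketch of the method-of-moments part of this exercise, here is a Python version on made-up data (the numbers below are illustrative, NOT the gamma-arrivals file; for the parameterization in Hint 1, E[X] = α/β and Var(X) = α/β² are the standard moment formulas):

```python
from statistics import mean

# Made-up interarrival-like data (NOT the gamma-arrivals file)
x = [0.8, 1.3, 0.2, 2.1, 0.9, 1.7, 0.5, 1.1, 0.4, 1.5]

m1 = mean(x)                  # empirical first moment
m2 = mean(v * v for v in x)   # empirical second moment

# For Gamma(alpha, beta) as in Hint 1: E[X] = alpha/beta and
# Var(X) = m2 - m1^2 = alpha/beta^2. Solving the two equations:
var = m2 - m1 * m1
alpha_mom = m1 * m1 / var
beta_mom = m1 / var

print(alpha_mom, beta_mom)
# The fitted parameters reproduce the empirical mean exactly:
print(abs(alpha_mom / beta_mom - m1) < 1e-9)  # True
```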
The δ-method

The δ-method

Recall that the CLT says

    √n (X̄n − µ) → N(0, σ²)

What if we have some general function g(·)?

    √n (g(X̄n) − g(µ)) → ?
The δ-method

The δ-method tells us that

    √n (g(X̄n) − g(µ)) → N(0, σ² (g′(µ))²)

For a proof of the general case, see
http://en.wikipedia.org/wiki/Delta_method

This method can be used to approximate the variance of a function of
our random variables!
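A simulation sketch in Python (the Uniform(0, 1) population and g(x) = eˣ are arbitrary illustrative choices, not from the slides):

```python
import math
import random
from statistics import mean, pvariance

random.seed(0)

n = 1000      # size of each sample mean
reps = 2000   # number of simulated sample means
mu, sigma2 = 0.5, 1.0 / 12.0  # mean and variance of Uniform(0, 1)
g = math.exp                  # g(x) = e^x, so g'(mu) = e^mu

# Simulate g(X-bar_n) many times
vals = [g(mean(random.uniform(0, 1) for _ in range(n))) for _ in range(reps)]

# Delta method prediction: Var(g(X-bar_n)) ~ sigma^2 * g'(mu)^2 / n
empirical = pvariance(vals)
predicted = sigma2 * math.exp(mu) ** 2 / n
print(empirical / predicted)  # a ratio close to 1
```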
