Statistical Inference
Lecture 09: Bayesian estimation
Yuansi Chen
Spring 2023
Duke University
https://www2.stat.duke.edu/courses/Spring23/sta732.01/
Recap from Lecture 08
Where we are
Goal of Lecture 09
Bayes risk, Bayes estimator
Recall the components of a decision problem
• Data 𝑋
• Model family P = {𝑃𝜃 ∶ 𝜃 ∈ Ω}, a collection of probability
distributions on the sample space
• Loss function 𝐿: 𝐿(𝜃, 𝑑) measures the loss incurred by the
decision 𝑑 when the true parameter is 𝜃
• Risk function 𝑅: 𝑅(𝜃, 𝛿) = 𝔼𝜃 [𝐿(𝜃, 𝛿(𝑋))]
The frequentist motivation of the Bayesian setup
Motivation
It is in general hard to find a uniformly minimum-risk estimator:
oftentimes the risk functions of competing estimators cross, so none
dominates. This difficulty does not arise if performance is measured
by a single number.
Remark
For now, assume Λ(Ω) = 1, i.e. the prior Λ is a probability measure
on Ω. Later we may deal with improper priors.
Bayes risk
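For concreteness, the standard definition, stated in the notation above: the Bayes risk averages the risk function over the prior.

Def. Bayes risk
The Bayes risk of an estimator 𝛿 with respect to a prior Λ is
𝑟(Λ, 𝛿) = ∫Ω 𝑅(𝜃, 𝛿) dΛ(𝜃).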
Bayes estimator
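Correspondingly, the standard notion of optimality with respect to the prior:

Def. Bayes estimator
An estimator 𝛿Λ is Bayes with respect to Λ if it minimizes the Bayes risk, i.e.
𝑟(Λ, 𝛿Λ) = inf𝛿 𝑟(Λ, 𝛿).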
Construct Bayes estimator
Thm 7.1
A Bayes estimator can be obtained by minimizing the posterior risk:
for each 𝑥, choose 𝛿Λ(𝑥) to minimize
𝔼[𝐿(Θ, 𝑑) ∣ 𝑋 = 𝑥]
with respect to 𝑑.
proof of Thm 7.1
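A sketch of the standard argument, using the tower property: for any estimator 𝛿,
𝑟(Λ, 𝛿) = 𝔼[𝐿(Θ, 𝛿(𝑋))] = 𝔼[ 𝔼[𝐿(Θ, 𝛿(𝑋)) ∣ 𝑋] ].
If 𝛿Λ(𝑥) minimizes 𝔼[𝐿(Θ, 𝑑) ∣ 𝑋 = 𝑥] for every 𝑥, then
𝔼[𝐿(Θ, 𝛿Λ(𝑋)) ∣ 𝑋] ≤ 𝔼[𝐿(Θ, 𝛿(𝑋)) ∣ 𝑋] almost surely;
taking expectations gives 𝑟(Λ, 𝛿Λ) ≤ 𝑟(Λ, 𝛿).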
Posterior
Def. Posterior
The conditional distribution of Θ given 𝑋, written as ℒ(Θ ∣ 𝑋), is
called the posterior distribution.
Remark
• Λ is usually interpreted as prior belief about Θ before seeing
the data
• ℒ(Θ ∣ 𝑋) is the belief after seeing the data
Posterior calculation with density
If the model has densities 𝑝𝜃(𝑥) and the prior Λ has density 𝜆(𝜃),
then the posterior density is
𝜆(𝜃 ∣ 𝑥) = 𝜆(𝜃)𝑝𝜃(𝑥) / 𝑞(𝑥),
where 𝑞(𝑥) = ∫ 𝜆(𝜃)𝑝𝜃(𝑥) d𝜃 is the marginal density of 𝑋.
Posterior mean is Bayes estimator for squared error loss
Suppose 𝐿(𝜃, 𝑑) = (𝑔(𝜃) − 𝑑)². Then the Bayes estimator is the
posterior mean 𝛿Λ(𝑥) = 𝔼[𝑔(Θ) ∣ 𝑋 = 𝑥].
proof:
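A sketch via the usual bias–variance expansion: write 𝑚(𝑥) = 𝔼[𝑔(Θ) ∣ 𝑋 = 𝑥]. Then
𝔼[(𝑔(Θ) − 𝑑)² ∣ 𝑋 = 𝑥] = 𝔼[(𝑔(Θ) − 𝑚(𝑥))² ∣ 𝑋 = 𝑥] + (𝑚(𝑥) − 𝑑)²,
because the cross term has conditional mean zero. The right-hand side is minimized
uniquely at 𝑑 = 𝑚(𝑥), so by Thm 7.1 the posterior mean is Bayes.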
Examples
Binomial model with Beta prior
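The standard conjugate computation: if 𝑋 ∣ Θ = 𝜃 ∼ Bin(𝑛, 𝜃) and Θ ∼ Beta(𝛼, 𝛽), then
𝜆(𝜃 ∣ 𝑥) ∝ 𝜃^(𝛼+𝑥−1) (1 − 𝜃)^(𝛽+𝑛−𝑥−1),
so the posterior is Beta(𝛼 + 𝑥, 𝛽 + 𝑛 − 𝑥), and the Bayes estimator under squared error loss is the posterior mean (𝛼 + 𝑥)/(𝛼 + 𝛽 + 𝑛). A minimal numerical sketch of this update (hypothetical values for 𝛼, 𝛽, 𝑛, 𝑥; assumes NumPy and SciPy are available):

# Beta-Binomial conjugacy: posterior is Beta(alpha + x, beta + n - x)
import numpy as np
from scipy import stats

alpha, beta, n, x = 2.0, 3.0, 10, 7        # hypothetical prior and data
posterior = stats.beta(alpha + x, beta + n - x)
print(posterior.mean())                    # (alpha + x)/(alpha + beta + n) = 0.6

# Monte Carlo check: reweight prior draws by the binomial likelihood
thetas = stats.beta(alpha, beta).rvs(size=500_000, random_state=0)
weights = stats.binom.pmf(x, n, thetas)
print(np.sum(weights * thetas) / np.sum(weights))   # approximately 0.6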
Weighted squared error loss
Suppose 𝐿(𝜃, 𝑑) = 𝑤(𝜃)(𝑔(𝜃) − 𝑑)². Find a Bayes estimator.
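A sketch of the standard computation: the posterior risk 𝔼[𝑤(Θ)(𝑔(Θ) − 𝑑)² ∣ 𝑋 = 𝑥] is a convex quadratic in 𝑑; setting its derivative to zero gives the weighted posterior mean
𝛿Λ(𝑥) = 𝔼[𝑤(Θ)𝑔(Θ) ∣ 𝑋 = 𝑥] / 𝔼[𝑤(Θ) ∣ 𝑋 = 𝑥],
provided these conditional expectations are finite.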
Normal mean estimation
𝑋 ∣ Θ = 𝜃 ∼ 𝒩(𝜃, 𝜎²),
Θ ∼ 𝒩(𝜇, 𝜏²).
Find the Bayes estimator of mean under squared error loss
What if we have 𝑛 i.i.d. data points?
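The standard conjugate answer (a sketch, obtained by completing the square in 𝜃 in 𝜆(𝜃)𝑝𝜃(𝑥)):
Θ ∣ 𝑋 = 𝑥 ∼ 𝒩( (𝜏²𝑥 + 𝜎²𝜇)/(𝜎² + 𝜏²), 𝜎²𝜏²/(𝜎² + 𝜏²) ),
so under squared error loss the Bayes estimator is the posterior mean, a precision-weighted average of the observation 𝑥 and the prior mean 𝜇. With 𝑛 i.i.d. observations, the sample mean is sufficient and 𝑋̄ ∣ Θ = 𝜃 ∼ 𝒩(𝜃, 𝜎²/𝑛), so the same formula applies with 𝑥 replaced by 𝑥̄ and 𝜎² by 𝜎²/𝑛; the data term dominates as 𝑛 → ∞.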
Binary classification
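One standard formulation of this example (the particular setup here is an assumption): Θ ∈ {0, 1} and 𝐿(𝜃, 𝑑) = 1{𝑑 ≠ 𝜃} (0–1 loss). The posterior risk of deciding 𝑑 is 𝔼[1{Θ ≠ 𝑑} ∣ 𝑋 = 𝑥] = ℙ(Θ ≠ 𝑑 ∣ 𝑋 = 𝑥), which is minimized by reporting the class with the larger posterior probability:
𝛿Λ(𝑥) = 1{ℙ(Θ = 1 ∣ 𝑋 = 𝑥) ≥ 1/2}.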
Bayes estimators are usually biased
An unbiased estimator under squared error loss is not Bayes (except
degenerately): if 𝛿(𝑋) is unbiased for 𝑔(𝜃) and is Bayes with respect
to some prior Λ, then
𝔼[(𝛿(𝑋) − 𝑔(Θ))²] = 0,
i.e. the Bayes risk is zero.
proof:
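A sketch of the classical argument: compute 𝔼[𝛿(𝑋)𝑔(Θ)] by conditioning two ways. Conditioning on Θ and using unbiasedness,
𝔼[𝛿(𝑋)𝑔(Θ)] = 𝔼[𝑔(Θ) 𝔼[𝛿(𝑋) ∣ Θ]] = 𝔼[𝑔(Θ)²].
Conditioning on 𝑋 and using that a Bayes estimator under squared error is the posterior mean, 𝛿(𝑋) = 𝔼[𝑔(Θ) ∣ 𝑋], so
𝔼[𝛿(𝑋)𝑔(Θ)] = 𝔼[𝛿(𝑋) 𝔼[𝑔(Θ) ∣ 𝑋]] = 𝔼[𝛿(𝑋)²].
Hence 𝔼[(𝛿(𝑋) − 𝑔(Θ))²] = 𝔼[𝛿(𝑋)²] − 2𝔼[𝛿(𝑋)𝑔(Θ)] + 𝔼[𝑔(Θ)²] = 0.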
Bayes estimators are usually admissible
Uniqueness of Bayes estimator under strictly convex loss
If the loss 𝐿(𝜃, 𝑑) is strictly convex in 𝑑 and 𝛿Λ is a Bayes estimator
with finite Bayes risk, then the Bayes estimator 𝛿Λ is unique (a.e. with
respect to 𝑃𝜃 for all 𝜃).
proof: Use the following lemma
Lem. Lehmann–Casella exercise 1.7.26
Let 𝜙 be a strictly convex function over an interval 𝐼. If there exists a
value 𝑎0 ∈ 𝐼 minimizing 𝜙, then 𝑎0 is unique.
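Sketch: if 𝑎0 ≠ 𝑎1 were two minimizers with minimum value 𝑚, strict convexity would give 𝜙((𝑎0 + 𝑎1)/2) < (𝜙(𝑎0) + 𝜙(𝑎1))/2 = 𝑚, a contradiction. The theorem then follows by applying the lemma to 𝜙(𝑑) = 𝔼[𝐿(Θ, 𝑑) ∣ 𝑋 = 𝑥], which is strictly convex in 𝑑, so the posterior-risk minimizer is unique for almost every 𝑥.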
A unique Bayes estimator is admissible
proof:
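A sketch of the standard argument: suppose some 𝛿′ dominates 𝛿Λ, i.e. 𝑅(𝜃, 𝛿′) ≤ 𝑅(𝜃, 𝛿Λ) for all 𝜃, with strict inequality for some 𝜃. Integrating the risks against Λ gives 𝑟(Λ, 𝛿′) ≤ 𝑟(Λ, 𝛿Λ), so 𝛿′ is also Bayes with respect to Λ. By uniqueness, 𝛿′ = 𝛿Λ a.e. 𝑃𝜃 for all 𝜃, hence 𝑅(𝜃, 𝛿′) = 𝑅(𝜃, 𝛿Λ) for every 𝜃, contradicting the strict improvement. Therefore 𝛿Λ is admissible.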
Summary
What is next?
Thank you