
More on Bayesian Methods

Rebecca C. Steorts
Bayesian Methods and Modern Statistics: STA 360/601

Lecture 4

Today’s menu

- Review of notation
- When are Bayesian and frequentist methods the same?
- Example: Normal-Normal
- Posterior predictive inference
- Example
- Credible Intervals
- Example
Notation

$p(x \mid \theta)$ — likelihood
$\pi(\theta)$ — prior
$p(x) = \int p(x \mid \theta)\, \pi(\theta)\, d\theta$ — marginal likelihood
$p(\theta \mid x) = \dfrac{p(x \mid \theta)\, \pi(\theta)}{p(x)}$ — posterior probability
$p(x_{\text{new}} \mid x) = \int p(x_{\text{new}} \mid \theta)\, p(\theta \mid x)\, d\theta$ — predictive probability
Another conjugate example
Suppose

$$X_1, \ldots, X_n \mid \lambda \overset{\text{iid}}{\sim} \text{Poisson}(\lambda), \qquad \lambda \sim \text{Gamma}(\alpha, \beta).$$

Find $p(\lambda \mid X)$.

$$
\begin{aligned}
p(\lambda \mid X) &\propto \left[ \prod_{i=1}^{n} \lambda^{x_i} e^{-\lambda} / x_i! \right] \times \frac{\beta^{\alpha}}{\Gamma(\alpha)}\, \lambda^{\alpha-1} e^{-\beta\lambda} && (1) \\
&\propto \lambda^{n\bar{x}} e^{-n\lambda} \times \lambda^{\alpha-1} e^{-\beta\lambda} && (2) \\
&\propto \lambda^{n\bar{x}+\alpha-1}\, e^{-\lambda(n+\beta)} && (3)
\end{aligned}
$$

Hence

$$\lambda \mid X \sim \text{Gamma}(n\bar{x} + \alpha,\; n + \beta).$$
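As a quick sanity check of this algebra, here is a minimal R sketch (the data and the hyperparameters $\alpha$, $\beta$ below are made up for illustration) comparing the closed-form Gamma posterior with a brute-force grid approximation:

# Sanity check of the Poisson-Gamma posterior on simulated data.
set.seed(1)
alpha <- 2; beta <- 3                  # Gamma(alpha, rate = beta) prior
x <- rpois(50, lambda = 4)             # illustrative data
n <- length(x)

# Grid approximation of the posterior (log scale to avoid underflow)
lambda <- seq(0.01, 10, by = 0.01)
log_unnorm <- sapply(lambda, function(l)
  sum(dpois(x, l, log = TRUE)) + dgamma(l, alpha, rate = beta, log = TRUE))
unnorm <- exp(log_unnorm - max(log_unnorm))
grid_post <- unnorm / (sum(unnorm) * 0.01)

# Closed form from the slide: Gamma(n*xbar + alpha, rate = n + beta)
closed <- dgamma(lambda, sum(x) + alpha, rate = n + beta)
max(abs(grid_post - closed))           # should be near 0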
Normal-Normal

$$X_1, \ldots, X_n \mid \theta \overset{\text{iid}}{\sim} N(\theta, \sigma^2), \qquad \theta \sim N(\mu, \tau^2),$$

where $\sigma^2$ is known. Calculate the distribution of $\theta \mid x_1, \ldots, x_n$.

Using a ton of math and algebra, you can show that

$$
\theta \mid x_1, \ldots, x_n \sim N\left( \frac{\frac{n\bar{x}}{\sigma^2} + \frac{\mu}{\tau^2}}{\frac{n}{\sigma^2} + \frac{1}{\tau^2}},\; \frac{1}{\frac{n}{\sigma^2} + \frac{1}{\tau^2}} \right)
= N\left( \frac{n\bar{x}\tau^2 + \mu\sigma^2}{n\tau^2 + \sigma^2},\; \frac{\sigma^2\tau^2}{n\tau^2 + \sigma^2} \right).
$$
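A small R sketch of these formulas (all numbers invented for illustration); note how the posterior mean moves from the prior mean $\mu$ toward $\bar{x}$ as $n$ grows:

# Posterior parameters for the Normal-Normal model (sigma^2 known)
normal_posterior <- function(xbar, n, sigma2, mu, tau2) {
  prec <- n / sigma2 + 1 / tau2                    # posterior precision
  c(mean = (n * xbar / sigma2 + mu / tau2) / prec,
    var  = 1 / prec)
}

normal_posterior(xbar = 2, n = 5,   sigma2 = 4, mu = 0, tau2 = 1)  # mean between 0 and 2
normal_posterior(xbar = 2, n = 500, sigma2 = 4, mu = 0, tau2 = 1)  # mean near xbar, var near 0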
Two Useful Things to Know

Definition
The reciprocal of the variance is referred to as the precision:

$$\text{Precision} = \frac{1}{\text{Variance}}.$$

Suppose the loss we assume is squared error. Let $\delta(x)$ be an estimator of the true parameter $\theta$. Then

$$
\begin{aligned}
\mathrm{MSE}(\delta(x)) &= \text{Bias}^2 + \text{Variance} && (4) \\
&= \big( \theta - E_\theta[\delta(x)] \big)^2 + E_\theta\!\left[ \big( \delta(x) - E_\theta[\delta(x)] \big)^2 \right] && (5)
\end{aligned}
$$
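Equation (4) is easy to verify by simulation. A minimal R sketch (the estimator $\delta(x) = 0.9\,\bar{x}$ and all constants are made up for illustration):

# Monte Carlo check of MSE = Bias^2 + Variance for a toy biased estimator
set.seed(2)
theta <- 1; n <- 20; reps <- 1e5
delta <- replicate(reps, 0.9 * mean(rnorm(n, theta, 1)))

mse       <- mean((delta - theta)^2)
bias2_var <- (mean(delta) - theta)^2 + var(delta)
c(mse = mse, bias2_plus_var = bias2_var)   # agree up to Monte Carlo error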
Theorem
Let $\delta_n$ be a sequence of estimators of $g(\theta)$ with mean squared error $E(\delta_n - g(\theta))^2$. Let $b_n(\theta)$ be the bias.
(i) If $E[\delta_n - g(\theta)]^2 \to 0$, then $\delta_n$ is consistent for $g(\theta)$.
(ii) Equivalently, $\delta_n$ is consistent if $b_n(\theta) \to 0$ and $\mathrm{Var}(\delta_n) \to 0$ for all $\theta$.
(iii) In particular (and most useful), $\delta_n$ is consistent if it is unbiased for each $n$ and if $\mathrm{Var}(\delta_n) \to 0$ for all $\theta$.

We omit the proof since it requires Chebychev's Inequality along with a bit of probability theory. See Problem 1.8.1 in TPE for the exercise of proving this.
Normal-Normal Revisited

We write the posterior mean and posterior variance out:

$$
E(\theta \mid x) = \frac{\frac{n\bar{x}}{\sigma^2} + \frac{\mu}{\tau^2}}{\frac{n}{\sigma^2} + \frac{1}{\tau^2}}
= \frac{\frac{n\bar{x}}{\sigma^2}}{\frac{n}{\sigma^2} + \frac{1}{\tau^2}} + \frac{\frac{\mu}{\tau^2}}{\frac{n}{\sigma^2} + \frac{1}{\tau^2}},
\qquad
V(\theta \mid x) = \frac{1}{\frac{n}{\sigma^2} + \frac{1}{\tau^2}}.
$$

Can someone give an explanation of what's happening here? How does this contrast with frequentist inference?
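As a hint, here is a small R sketch (illustrative numbers only) showing that the posterior mean is a precision-weighted compromise between $\bar{x}$ and $\mu$, and that as the prior flattens ($\tau^2 \to \infty$) it recovers the frequentist answer $(\bar{x},\, \sigma^2/n)$:

n <- 10; sigma2 <- 4; xbar <- 2; mu <- 0
for (tau2 in c(0.1, 1, 100, 1e6)) {
  prec <- n / sigma2 + 1 / tau2              # data precision + prior precision
  cat(sprintf("tau2 = %g: mean = %.3f, var = %.3f\n",
              tau2, (n * xbar / sigma2 + mu / tau2) / prec, 1 / prec))
}
# mean -> xbar = 2 and var -> sigma2 / n = 0.4 as tau2 grows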
A consistent estimator

Let $\hat{\delta}(x) = E[\theta \mid X]$. Show that the posterior mean is consistent.

First consider the posterior mean as $n \to \infty$:

$$
E(\theta \mid x) = \frac{\frac{1}{n}\frac{n\bar{x}}{\sigma^2} + \frac{1}{n}\frac{\mu}{\tau^2}}{\frac{1}{n}\frac{n}{\sigma^2} + \frac{1}{n}\frac{1}{\tau^2}} \longrightarrow \frac{\bar{x}/\sigma^2}{1/\sigma^2} = \bar{x} \quad \text{as } n \to \infty.
$$

Now consider

$$E[\bar{x}] = E\big[E[\bar{x} \mid \theta]\big] = \theta. \quad \text{(unbiased)}$$

In the case of the posterior variance, divide the numerator and denominator by $n$. Then

$$
V(\theta \mid x) = \frac{\frac{1}{n}}{\frac{1}{n}\frac{n}{\sigma^2} + \frac{1}{n}\frac{1}{\tau^2}} \approx \frac{\sigma^2}{n} \to 0 \quad \text{as } n \to \infty.
$$

Thus, the posterior mean is consistent by our Theorem, part (iii).
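A quick simulation in R makes the limit visible (the true $\theta$ and the prior below are invented for illustration):

# The posterior mean approaches the true theta as n grows
set.seed(3)
theta_true <- 2; sigma2 <- 1; mu <- 0; tau2 <- 0.5
for (n in c(10, 100, 1000, 10000)) {
  xbar <- mean(rnorm(n, theta_true, sqrt(sigma2)))
  post_mean <- (n * xbar / sigma2 + mu / tau2) / (n / sigma2 + 1 / tau2)
  cat(sprintf("n = %5d: posterior mean = %.4f\n", n, post_mean))
}
# output approaches theta_true = 2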
Posterior Predictive Distributions

- We have just seen how estimation can be done in Bayesian analysis.
- Another goal might be prediction.
- That is, given some data $y$ and a new observation $\tilde{y}$, we may wish to find the conditional distribution of $\tilde{y}$ given $y$.
- This distribution is referred to as the posterior predictive distribution.
- That is, our goal is to find $p(\tilde{y} \mid y)$.
Posterior Predictive Distributions
Consider

$$
\begin{aligned}
p(\tilde{y} \mid y) &= \frac{p(\tilde{y}, y)}{p(y)} && (6) \\
&= \frac{\int_\theta p(\tilde{y}, y, \theta)\, d\theta}{p(y)} && (7) \\
&= \frac{\int_\theta p(\tilde{y} \mid y, \theta)\, p(y, \theta)\, d\theta}{p(y)} && (8) \\
&= \int_\theta p(\tilde{y} \mid y, \theta)\, p(\theta \mid y)\, d\theta. && (9)
\end{aligned}
$$

In most contexts, if $\theta$ is given, then $\tilde{y} \mid \theta$ is independent of $y$; i.e., the value of $\theta$ determines the distribution of $\tilde{y}$ without needing to also know $y$. When this is the case, we say that $\tilde{y}$ and $y$ are conditionally independent given $\theta$. Then the above becomes

$$p(\tilde{y} \mid y) = \int_\theta p(\tilde{y} \mid \theta)\, p(\theta \mid y)\, d\theta.$$
Theorem
If $\theta$ is discrete and $\tilde{y}$ and $y$ are conditionally independent given $\theta$, then the posterior predictive distribution is

$$p(\tilde{y} \mid y) = \sum_\theta p(\tilde{y} \mid \theta)\, p(\theta \mid y).$$

If $\theta$ is continuous and $\tilde{y}$ and $y$ are conditionally independent given $\theta$, then the posterior predictive distribution is

$$p(\tilde{y} \mid y) = \int_\theta p(\tilde{y} \mid \theta)\, p(\theta \mid y)\, d\theta.$$
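A minimal numerical illustration of the discrete case in R (the two-point prior and posterior weights below are made up): a coin's bias $\theta$ takes one of two values, and the predictive for one new flip mixes the two likelihoods by their posterior weights.

theta    <- c(0.3, 0.7)                  # possible biases (illustrative)
post_wts <- c(0.2, 0.8)                  # p(theta | y), assumed already computed
p_new_1  <- sum(dbinom(1, 1, theta) * post_wts)  # P(y_tilde = 1 | y)
p_new_1                                  # 0.3 * 0.2 + 0.7 * 0.8 = 0.62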
Negative Binomial Distribution
- We reintroduce the Negative Binomial distribution.
- The binomial distribution counts the number of successes in a fixed number of iid Bernoulli trials.
- Recall, a Bernoulli trial has a fixed success probability $p$.
- Suppose instead that we count the number of Bernoulli trials required to get a fixed number of successes. This formulation leads to the Negative Binomial distribution.
- In a sequence of independent Bernoulli($p$) trials, let $X$ denote the trial at which the $r$th success occurs, where $r$ is a fixed integer.

Then

$$f(x) = \binom{x-1}{r-1}\, p^r (1-p)^{x-r}, \quad x = r, r+1, \ldots$$

and we say $X \sim \text{Negative Binom}(r, p)$.
Negative Binomial Distribution

- There is another useful formulation of the Negative Binomial distribution.
- In many cases, it is defined as $Y$ = number of failures before the $r$th success. This formulation is statistically equivalent to the one given above in terms of $X$ = trial at which the $r$th success occurs, since $Y = X - r$. Then

$$f(y) = \binom{r+y-1}{y}\, p^r (1-p)^y, \quad y = 0, 1, 2, \ldots$$

and we say $Y \sim \text{Negative Binom}(r, p)$.

- When we refer to the Negative Binomial distribution in this class, we will refer to the second one defined unless we indicate otherwise.
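Conveniently, R's built-in dnbinom() uses this same failures-before-the-$r$th-success convention, so it can be used directly for the class's Negative Binomial. A quick check against the formula above:

r <- 3; p <- 0.4; y <- 5
dnbinom(y, size = r, prob = p)            # R's pmf
choose(r + y - 1, y) * p^r * (1 - p)^y    # the formula above; same value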
$$X \mid \lambda \sim \text{Poisson}(\lambda), \qquad \lambda \sim \text{Gamma}(a, b).$$

Assume that $\tilde{X} \mid \lambda \sim \text{Poisson}(\lambda)$ is conditionally independent of $X$ given $\lambda$, and that we have a new observation $\tilde{x}$. Find the posterior predictive distribution, $p(\tilde{x} \mid x)$. Assume that $a$ is an integer. First, we must find $p(\lambda \mid x)$.
Recall

$$
\begin{aligned}
p(\lambda \mid x) &\propto p(x \mid \lambda)\, p(\lambda) \\
&\propto e^{-\lambda} \lambda^{x} \cdot \lambda^{a-1} e^{-\lambda/b} \\
&= \lambda^{x+a-1}\, e^{-\lambda(1 + 1/b)}.
\end{aligned}
$$

Thus $\lambda \mid x \sim \text{Gamma}\big(x + a, \frac{1}{1+1/b}\big)$, i.e., $\lambda \mid x \sim \text{Gamma}\big(x + a, \frac{b}{b+1}\big)$, where the second parameter is a scale (the prior density here is $\propto \lambda^{a-1} e^{-\lambda/b}$). Finish the problem for homework.
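Without giving away the closed form, you can sanity-check your homework answer by Monte Carlo: draw $\lambda$ from the posterior, then $\tilde{x}$ from Poisson($\lambda$). The values of $a$, $b$, and $x$ below are illustrative only.

set.seed(4)
a <- 2; b <- 1; x <- 4
lambda_draws <- rgamma(1e5, shape = x + a, scale = b / (b + 1))
x_tilde      <- rpois(1e5, lambda_draws)
prop.table(table(x_tilde))[as.character(0:5)]   # approximate p(x_tilde | x) at 0..5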
- Suppose that $X$ is the number of pregnant women arriving at a particular hospital to deliver their babies during a given month.
- The discrete count nature of the data plus its natural interpretation as an arrival rate suggest modeling it with a Poisson likelihood.
- To use a Bayesian analysis, we require a prior distribution for $\lambda$ having support on the positive real line. A convenient choice is given by the Gamma distribution, since it's conjugate for the Poisson likelihood.

The model is given by

$$X \mid \lambda \sim \text{Poisson}(\lambda), \qquad \lambda \sim \text{Gamma}(a, b).$$
- We are also told that 42 moms are observed arriving at the particular hospital during December 2007. Using prior study information, we are told $a = 5$ and $b = 6$.
- (We found $a, b$ by working backwards from a prior mean of 30 and a prior variance of 180.)

We would like to find several things in this example:
1. Plot the likelihood, prior, and posterior distributions as functions of $\lambda$ in R (a sketch follows this list).
2. Plot the posterior predictive distribution where the number of pregnant women arriving falls in $[0, 100]$, integer valued.
3. Find the posterior predictive probability that the number of pregnant women arriving is between 40 and 45 (inclusive). Do this for homework.
4. You are expected to have this done by early this week or next week since you have an exam on Thursday, Feb 11 (in class). (This material will not be turned in but could appear on the exam.)
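For task 1, a minimal R sketch under the slide's setup ($a = 5$, $b = 6$ as a scale parameter, one observed month with $x = 42$); by the earlier derivation the posterior is Gamma($x + a$, scale $= b/(b+1)$):

a <- 5; b <- 6; x <- 42
lambda <- seq(0, 100, by = 0.1)

prior <- dgamma(lambda, shape = a, scale = b)
post  <- dgamma(lambda, shape = x + a, scale = b / (b + 1))
like  <- dpois(x, lambda)
like  <- like / sum(like * 0.1)          # normalize over the grid for plotting

plot(lambda, post, type = "l", xlab = expression(lambda), ylab = "density")
lines(lambda, prior, lty = 2)
lines(lambda, like, lty = 3)
legend("topright", c("posterior", "prior", "likelihood (scaled)"), lty = 1:3)

For task 2, the Monte Carlo recipe sketched after the Gamma posterior slide (draw $\lambda$ from the posterior, then a count from Poisson($\lambda$)) carries over directly.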
