
Mathematical Statistics 2 (MS2)

Lecture 2: Sampling from the posterior

Amine Hadji <[email protected]>


Leiden University, February 16, 2022
2 Sampling from the posterior
2.1 Introduction

In this chapter, we will discuss:


• Monte Carlo integration

• Sampling Techniques

• Initiation to MCMC

Introductory problem
Let Y = (Y1, ..., Yn) | θ ∼ N(θ, 1) be a conditionally iid sample, and let θ be a
random variable with a Gamma prior P(θ) = Γ(θ; α, β):

• What is the posterior distribution of θ?

• What is the posterior mean? What is the posterior variance?

• Can we construct a 95%-credible interval for θ?

2.2 Monte Carlo integration

Monte Carlo

Definition 2.1 (Monte Carlo methods)


Monte Carlo methods are a class of computational algorithms that rely
on the Strong Law of Large Numbers to obtain numerical results in:
optimization, numerical integration, and generating draws from a
probability distribution.

Example: the integral \(\int_0^1 x\,dx\) can be approximated by \(\frac{1}{n}\sum_{i=1}^{n} X_i\), where \((X_i)_{i=1}^n \overset{iid}{\sim} U[0,1]\)

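A minimal sketch of this example in Python (NumPy is assumed; the sample size is arbitrary):

```python
import numpy as np

# Monte Carlo approximation of the integral of x over [0, 1] (true value: 0.5).
rng = np.random.default_rng(seed=0)
n = 100_000
x = rng.uniform(0.0, 1.0, size=n)   # (X_i) iid ~ U[0, 1]
estimate = x.mean()                  # (1/n) * sum of X_i
print(estimate)                      # close to 0.5, error of order 1/sqrt(n)
```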

Monte Carlo method

Using a Monte Carlo method to approximate \(\int_a^b f(x)\,dx\) (a can be −∞ and b can be +∞):

1. Verify that the integral is well-defined

2. Find a probability distribution P such that P(x) > 0 for all x ∈ (a, b)

3. Draw \((X_i)_{i=1}^n\) iid from P and approximate the integral by \(\frac{1}{n}\sum_{i=1}^{n} \frac{f(X_i)\,\mathbf{1}_{(a,b)}(X_i)}{P(X_i)}\) for large values of n

By the Central Limit Theorem, the method has a rate of convergence of \(1/\sqrt{n}\)

Warning: If the integral is not well-defined, the method will behave badly!

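A minimal sketch of this recipe, assuming the illustrative choices f(x) = exp(−x²) on (0, ∞) and P = Exp(1); the true value of the integral is √π/2 ≈ 0.886:

```python
import numpy as np

# Monte Carlo integration of f(x) = exp(-x^2) over (0, inf) using P = Exp(1),
# which is positive on the whole integration range; true value sqrt(pi)/2 ≈ 0.8862.
rng = np.random.default_rng(seed=1)
n = 200_000

def f(x):
    return np.exp(-x ** 2)

def p(x):
    return np.exp(-x)                         # Exp(1) pdf on (0, inf)

x = rng.exponential(scale=1.0, size=n)        # X_i iid ~ P; all draws are in (0, inf),
estimate = np.mean(f(x) / p(x))               # so the indicator 1_(0,inf)(X_i) is always 1
print(estimate)                               # close to 0.8862, error of order 1/sqrt(n)
```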

Approximating the posterior mean

We know that the posterior mean is:
\[
\hat{\theta} = E[\theta \mid y] = \int \theta\, P(\theta \mid y)\, d\theta
= \int \theta\, \frac{L(\theta \mid y)\, P(\theta)}{\int L(\vartheta \mid y)\, P(\vartheta)\, d\vartheta}\, d\theta
= \frac{\int \theta\, L(\theta \mid y)\, P(\theta)\, d\theta}{\int L(\theta \mid y)\, P(\theta)\, d\theta}
\approx \frac{\frac{1}{n}\sum_{i=1}^{n} \theta_i\, L(\theta_i \mid y)}{\frac{1}{n}\sum_{i=1}^{n} L(\theta_i \mid y)}
\]

Therefore, we only need to draw \((\theta_i)_{i=1}^n\) from the prior
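A minimal numerical sketch for the introductory Normal-likelihood/Gamma-prior problem; the values of α, β and the simulated data y are illustrative, and the likelihood is handled on the log scale to avoid underflow:

```python
import numpy as np

# Sketch for the introductory problem: Y_1,...,Y_n | theta ~ N(theta, 1) iid and
# theta ~ Gamma(alpha, beta).  alpha, beta and the "observed" data y are illustrative.
rng = np.random.default_rng(seed=2)
alpha, beta = 2.0, 1.0
y = rng.normal(loc=1.5, scale=1.0, size=20)

N = 100_000
theta = rng.gamma(shape=alpha, scale=1.0 / beta, size=N)     # draws from the prior

# log L(theta_i | y), up to a constant that cancels in the ratio below
log_lik = -0.5 * ((y[:, None] - theta[None, :]) ** 2).sum(axis=0)
w = np.exp(log_lik - log_lik.max())                          # likelihood weights, rescaled for stability

post_mean = np.sum(theta * w) / np.sum(w)                    # Monte Carlo estimate of E[theta | y]
print(post_mean)
```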


Approximating a posterior expectation


The previous methods can be used to approximate any expectation E[f(θ) | y] with f : Θ → R measurable.
In particular, we can approximate each of the following (a short sketch follows the list):
• the posterior mean

• the posterior variance

• a posterior probability
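Continuing the illustrative sketch from the previous slide (reusing theta and w defined there), the same weighted prior draws give all three summaries; the threshold in the posterior probability is arbitrary:

```python
# Continuing the sketch above: the same weighted draws give other posterior summaries.
w_norm = w / np.sum(w)                                    # normalised weights

post_mean = np.sum(theta * w_norm)                        # E[theta | y]
post_var = np.sum((theta - post_mean) ** 2 * w_norm)      # Var(theta | y)
post_prob = np.sum(w_norm[theta > 1.0])                   # P(theta > 1 | y)
print(post_mean, post_var, post_prob)
```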

2.3 Sampling methods

Inverse CDF

Lemma 2.2 (Probability integral transform)


Let X be a continuous random variable with cdf F_X; then the random
variable Y := F_X(X) follows a uniform distribution on (0, 1)

Proposition 2.3
Let F be a cdf, and let \(F^{-1}\) be its inverse function
\[
F^{-1}(u) = \inf\{x \mid F(x) \ge u\} \qquad (0 < u < 1).
\]
If U is a uniform random variable on (0, 1), then \(F^{-1}(U)\) has F as its cdf.
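A minimal sketch of Proposition 2.3, using the exponential distribution as an illustrative example (its inverse cdf is available in closed form):

```python
import numpy as np

# Sampling Exp(lam) via the inverse cdf: F(x) = 1 - exp(-lam * x),
# so F^{-1}(u) = -log(1 - u) / lam.
rng = np.random.default_rng(seed=3)
lam = 2.0
u = rng.uniform(0.0, 1.0, size=100_000)
x = -np.log(1.0 - u) / lam        # F^{-1}(U) has cdf F
print(x.mean())                    # close to 1/lam = 0.5
```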


Inverse CDF Method

1. Draw the iid sample \((\theta_i)_{i=1}^N\) from the prior

2. Compute the empirical cdf \(\hat{F}_\theta\)
\[
\hat{F}_\theta(x) = \frac{\frac{1}{N}\sum_{i=1}^{N} \mathbf{1}_{(-\infty,\,x)}(\theta_i)\, L(\theta_i \mid y)}{\frac{1}{N}\sum_{i=1}^{N} L(\theta_i \mid y)}
\]

3. Generate U, a uniform random variable on (0, 1)

4. Compute \(\tilde{\theta} := \hat{F}_\theta^{-1}(U)\)
Using the Law of Large Numbers, we see that \(\hat{F}_\theta\) converges to the posterior cdf as N → ∞
(i.e. when N is large, \(\tilde{\theta}\) approximately follows the posterior P(θ | y))
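A minimal sketch of the method, continuing the Normal/Gamma example from the posterior-mean slide (it reuses rng, theta and the weights w defined there):

```python
# Continuing the Normal/Gamma sketch (reusing rng, theta and the likelihood weights w).
order = np.argsort(theta)                                  # step 2: weighted empirical cdf
theta_sorted = theta[order]
cdf = np.cumsum(w[order]) / np.sum(w)

u = rng.uniform(size=10_000)                               # step 3: uniform draws
idx = np.minimum(np.searchsorted(cdf, u), len(cdf) - 1)    # step 4: generalized inverse of F_hat
posterior_draws = theta_sorted[idx]                        # approximately posterior draws
print(posterior_draws.mean(), posterior_draws.var())
```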


Sequential Importance Resampling (SIR)

Proposition 2.4
Let \((Y_i)_{i=1}^N\) be an iid sample from the distribution P and let Q be a distribution
dominated by P (Q ≪ P). If \((I_k)_{k=1}^n \sim \mathcal{M}\big(1, \frac{w_1}{\sum_{i=1}^N w_i}, ..., \frac{w_N}{\sum_{i=1}^N w_i}\big)\) with
\(w_i = Q(Y_i)/P(Y_i)\), then
\[
(Y_{I_k})_{k=1}^n \xrightarrow{\;N \to \infty\;} Q,
\]
and the random variables \((Y_{I_k})_{k=1}^n\) are asymptotically iid


SIR Algorithm

1. Draw the iid sample \((\theta_i)_{i=1}^N\) from the prior

2. Draw the iid sample \((I_k)_{k=1}^n\) from \(\mathcal{M}(1, w_1, ..., w_N)\) with N ≫ n and
\[
w_i = \frac{L(\theta_i \mid y)}{\sum_{j=1}^{N} L(\theta_j \mid y)}
\]

3. Compute \((\theta_{I_k})_{k=1}^n\)


The SIR algorithm and the ICDF method are completely equivalent
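A minimal sketch of the SIR algorithm, again reusing rng, theta and the weights w from the Normal/Gamma sketch:

```python
# Continuing the same sketch: SIR with the prior draws theta and weights w from before.
n = 5_000                                              # n much smaller than N
probs = w / np.sum(w)                                  # normalised weights w_i
indices = rng.choice(len(theta), size=n, p=probs)      # step 2: I_k ~ M(1, w_1, ..., w_N)
sir_draws = theta[indices]                             # step 3: approximately posterior draws
print(sir_draws.mean(), sir_draws.var())
```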


Accept-Reject Method
Let f, g be two continuous pdfs on X such that there exists M > 0 satisfying
\[
\frac{f(x)}{g(x)} \le M \quad \text{for all } x \in X.
\]
Imagine we want to obtain a sample from the distribution with pdf f using samples from the distribution with pdf g:

1. Generate Y from the distribution with pdf g

2. Generate U, a uniform random variable on (0, 1)

• If U < f(Y)/(M g(Y)), then accept Y as a sample from the distribution with pdf f

• If U ≥ f(Y)/(M g(Y)), then reject Y and start from the beginning
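A minimal self-contained sketch of the accept-reject method, using the illustrative target f = Beta(2, 2) (so f(x) = 6x(1−x) on (0, 1)) and the proposal g = U[0, 1], for which f(x)/g(x) ≤ M with M = 1.5:

```python
import numpy as np

# Accept-reject: sample from f = Beta(2, 2) using proposals from g = U[0, 1].
rng = np.random.default_rng(seed=4)
M = 1.5                                       # bound on f(x)/g(x), attained at x = 0.5

def f(x):
    return 6.0 * x * (1.0 - x)

samples = []
while len(samples) < 10_000:
    y_prop = rng.uniform()                    # 1. Y ~ g
    u = rng.uniform()                         # 2. U ~ U(0, 1)
    if u < f(y_prop) / M:                     # accept with probability f(Y)/(M g(Y)); g(Y) = 1
        samples.append(y_prop)

samples = np.array(samples)
print(samples.mean(), samples.var())           # close to 0.5 and 0.05 for Beta(2, 2)
```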


Accept-Reject Method - Theory

Lemma 2.5
Let f, g be two continuous pdfs on X such that there exists M > 0 satisfying f(x)/g(x) ≤ M for all x ∈ X.
Let Y be a random variable with pdf g and U a uniform random variable on (0, 1); then
\[
P\Big(U \le \frac{f(Y)}{M\, g(Y)}\Big) = \frac{1}{M}
\]


Accept-Reject Method - Theory

Proposition 2.6
Let f, g be two continuous pdfs on X such that there exists M > 0 satisfying f(x)/g(x) ≤ M for all x ∈ X.
Let Y be a random variable with pdf g and U a uniform random variable on (0, 1); then
\[
P\Big(Y \le y \;\Big|\; U \le \frac{f(Y)}{M\, g(Y)}\Big) = F(y) := \int_{-\infty}^{y} f(t)\, dt.
\]


Accept-Reject Sample

1. Compute \(M := \max_{\theta \in \Theta} L(\theta \mid y)\)

2. Generate θ̃ from the prior

3. Generate U, a uniform random variable on (0, 1)

• If U < L(θ̃ | y)/M, then accept θ̃ as a sample from the posterior

• If U ≥ L(θ̃ | y)/M, then reject θ̃ and start from the beginning
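A minimal sketch of this sampler, continuing the Normal/Gamma example (reusing rng, alpha, beta and the data y from the earlier sketch); the N(θ, 1) likelihood is maximised at θ = mean(y), which is assumed here to lie in the prior support:

```python
# Continuing the Normal/Gamma sketch (reusing rng, alpha, beta and y).
def log_lik_at(t):
    # log L(t | y) up to a constant; the constant cancels in the ratio L(t | y) / M
    return -0.5 * np.sum((y - t) ** 2)

log_M = log_lik_at(y.mean())            # 1. M = max_theta L(theta | y), attained at theta = mean(y)

accepted = []
while len(accepted) < 5_000:
    t = rng.gamma(shape=alpha, scale=1.0 / beta)       # 2. candidate from the prior
    u = rng.uniform()                                  # 3. uniform draw
    if np.log(u) < log_lik_at(t) - log_M:              # accept if U < L(t | y) / M
        accepted.append(t)

accepted = np.array(accepted)
print(accepted.mean(), accepted.var())
```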

2.4 MCMC

Reminder - Markov chain

Definition 2.7 (Markov chain)


A discrete-time Markov chain is a sequence of random variables
X0 , X1 , X2 , ... (i.e. a stochastic process) with the Markov property:

P(Xn+1 ∈ B | X1 = x1 , X2 = x2 , ..., Xn = xn ) = P(Xn+1 ∈ B | Xn = xn ),

if both conditional probabilities are well defined (i.e. if P(X1 = x1, ..., Xn = xn) > 0).

Definition 2.8 (Time-homogeneity)


A Markov chain is said to be time-homogeneous if

P(Xn+1 ∈ B | Xn = x) = P(X1 ∈ B | X0 = x)

for all n ∈ N.


Reminder - Markov chain

Definition 2.9 (Transition kernel)


A time-homogeneous Markov chain is entirely defined by its transition
kernel Q
Q(x, B) = P(Xn+1 ∈ B | Xn = x)
for all n ∈ N.

Definition 2.10 (Stationary distribution)


A probability distribution Π is called stationary for the transition kernel Q
if Xn ∼ Π implies that Xn+1 ∼ Π for all n ∈ N, i.e.
\[
\int Q(x, B)\, d\Pi(x) = \Pi(B) \quad \text{for every measurable set } B.
\]
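A tiny discrete illustration of Definition 2.10 (the transition matrix and candidate distribution below are illustrative):

```python
import numpy as np

# For a 2-state chain with row-stochastic transition matrix Q (rows = current state,
# columns = next state), pi is stationary if pi @ Q = pi.
Q = np.array([[0.9, 0.1],
              [0.3, 0.7]])
pi = np.array([0.75, 0.25])       # candidate stationary distribution
print(pi @ Q)                      # equals pi, so pi is stationary for Q
```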


Markov chain Monte Carlo (MCMC)

Definition 2.11 (MCMC)


Markov chain Monte Carlo methods comprise a class of algorithms for
sampling from a probability distribution by constructing a Markov chain
that has the desired distribution as its equilibrium distribution.
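MCMC algorithms themselves are not developed in this lecture; purely as an illustration of the definition, here is a minimal random-walk Metropolis sketch (an illustrative addition, not part of the slides) targeting the Normal/Gamma posterior from the introduction, reusing rng, alpha, beta and y from the earlier sketches:

```python
# Random-walk Metropolis targeting the Normal/Gamma posterior.  Each step depends only on
# the current state, so the draws form a Markov chain whose stationary distribution is
# the posterior (reuses rng, alpha, beta and y from the earlier sketches).
def log_post(t):
    if t <= 0:
        return -np.inf                                  # outside the support of the Gamma prior
    # log posterior up to a constant: log-likelihood + log Gamma(alpha, beta) prior density
    return -0.5 * np.sum((y - t) ** 2) + (alpha - 1.0) * np.log(t) - beta * t

chain = np.empty(20_000)
current = 1.0
for i in range(chain.size):
    proposal = current + rng.normal(scale=0.3)          # symmetric random-walk proposal
    if np.log(rng.uniform()) < log_post(proposal) - log_post(current):
        current = proposal                              # accept; otherwise keep the current state
    chain[i] = current

print(chain[5_000:].mean())                             # posterior mean after discarding burn-in
```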

