
Introduction to

Bayesian Time Series Econometrics

Joshua Chan

Australian National University


5 July 2013
Instructor

Name: Joshua Chan (Josh or Dr. Chan)

Website:

http://people.anu.edu.au/joshua.chan/

Email:
[email protected]
Overview of the Workshop

Purpose
◦ prepare you for DSGE lectures
◦ go over basic Bayesian time series models and computations
◦ include MATLAB tutorials

Topics covered
◦ linear regression, autoregressive moving average models
◦ vector autoregressive (VAR) models
◦ state space models
◦ Bayesian model comparison
Computation techniques
◦ Monte Carlo simulation
◦ Markov chain Monte Carlo: Gibbs sampling,
Metropolis-Hastings algorithm

Logistics
◦ 8 hours of lecture (9am to 1pm, two days)
◦ 6 hours of computer lab/exercises (2pm to 5pm, two days)
Plan for Today

Four 50-minute Sessions


◦ Intro to Bayesian Econometrics
◦ 1-parameter normal model (Monte Carlo integration)
◦ 2-parameter normal model (Gibbs sampler)
◦ linear regression with iid errors
◦ linear regression with moving average errors (MH algorithm)
Recommended Readings
Textbooks
◦ Koop (2003) Bayesian Econometrics, Wiley
◦ Koop, Poirier and Tobias (2007) Bayesian Econometric
Methods, Cambridge University Press
◦ Geweke (2005) Contemporary Bayesian Econometrics and
Statistics, Wiley

Survey papers
◦ Del Negro and Schorfheide (2011) “Bayesian
Macroeconometrics,” in Geweke, Koop and van Dijk (eds):
The Oxford Handbook of Bayesian Econometrics, Oxford
University Press
◦ Koop and Korobilis (2010) “Bayesian Multivariate Time Series
Methods for Empirical Macroeconomics,” Foundations and
Trends in Econometrics, 3(4): 267-358
Introduction

Frequentist inference
◦ frequentist interpretation of probabilities—an event’s
probability as the limit of its relative frequency in a large
number of trials
◦ θ is thought to be an unknown, but fixed quantity
◦ knowledge of θ is obtained from an observed sample
y1 , . . . , yn (only)
◦ usually summarized by the likelihood function

L(θ; y) = f (y | θ)
Bayesian inference
◦ probability is a (subjective) measure of the “degree of belief”
of the individual assessing the event
◦ θ is a random quantity and has a distribution f (θ) called the
prior distribution
◦ knowledge of θ comes from an observed sample y1 , . . . , yn and
the prior distribution
◦ the goal of the analysis is to obtain the posterior distribution
f (θ | y) using Bayes’ Theorem

f(θ | y) = f(θ) f(y | θ) / m(y) ∝ f(θ) f(y | θ)

where m(y) = ∫ f(θ) f(y | θ) dθ is called the marginal likelihood
◦ then compute E(θ | y), Cov(θ | y), P(θj > 0)
1-parameter Normal Model

Suppose we take n measurements y1 , . . . , yn of an unknown


quantity µ where the magnitude of measurement error is known

From a small-scale study µ is estimated to be around µ0 (but with


substantial uncertainty)

One model is

(yi | µ) ∼ N(µ, σ²), i = 1, . . . , n,
µ ∼ N(µ0, σ0²),

where σ², µ0, and σ0² are known


Recall that for x ∼ N(a, b²), the density of x is

f(x) = (2πb²)^(−1/2) exp{ −(x − a)²/(2b²) } ∝ exp{ −(1/2) [ (1/b²) x² − 2x (a/b²) ] }

What do we learn from the new measurements?

All the information is summarized by the posterior distribution

f(µ | y) = f(µ) f(y | µ) / m(y),

where y = (y1, . . . , yn)′
Computation: Overview

Immediate goal: derive f (µ | y)

Ignore any constants in f (µ | y) not involving µ

Then try to figure out what density f (µ | y) is


Computation: Details

Now do the computations...


f(µ | y) ∝ f(µ) f(y | µ)
        ∝ exp{ −(µ − µ0)²/(2σ0²) } × Π_{i=1}^n exp{ −(yi − µ)²/(2σ²) }
        ∝ exp{ −(1/2) [ (µ² − 2µµ0)/σ0² + (−2µ Σ_i yi + nµ²)/σ² ] }
        ∝ exp{ −(1/2) [ (1/σ0² + n/σ²) µ² − 2µ (µ0/σ0² + nȳ/σ²) ] }

The exponent is quadratic in µ, hence (µ | y) ∼ N(µ̂, Dµ)
Computation: Details

Now compare

f(µ | y) ∝ exp{ −(1/2) [ (1/σ0² + n/σ²) µ² − 2µ (µ0/σ0² + nȳ/σ²) ] }

with

f(µ | y) ∝ exp{ −(1/2) [ (1/Dµ) µ² − 2µ (µ̂/Dµ) ] }

We have

Dµ = (1/σ0² + n/σ²)⁻¹,   µ̂ = Dµ (µ0/σ0² + nȳ/σ²)
Posterior Mean Interpretation

In particular, the posterior mean is

E(µ | y) = µ̂ = (1/σ0² + n/σ²)⁻¹ (µ0/σ0² + nȳ/σ²)
         = [ (1/σ0²) / (1/σ0² + n/σ²) ] µ0 + [ (1/(σ²/n)) / (1/σ0² + n/σ²) ] ȳ,

a weighted average of the prior mean µ0 and the MLE ȳ


Numerical Examples

Suppose µ0 = 10, n = 3, ȳ = 20 and σ 2 = 1

Case 1: σ02 = 0.1; Case 2: σ02 = 10

[Figure: prior and posterior densities of µ. Left panel: Case 1 (σ0² = 0.1); right panel: Case 2 (σ0² = 10).]
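The figure can be reproduced in a few lines of MATLAB. This is only a sketch based on the posterior formulas above; the variable names are mine, and normpdf requires the Statistics Toolbox.

% sketch: prior vs posterior densities of mu for the two cases
mu0 = 10; n = 3; ybar = 20; sig2 = 1;
mugrid = linspace(0,25,500)';
for sig02 = [0.1 10]                        % Case 1 and Case 2
    Dmu   = 1/(1/sig02 + n/sig2);           % posterior variance
    muhat = Dmu*(mu0/sig02 + n*ybar/sig2);  % posterior mean
    prior = normpdf(mugrid, mu0, sqrt(sig02));
    post  = normpdf(mugrid, muhat, sqrt(Dmu));
    figure; plot(mugrid, prior, '--', mugrid, post, '-');
    legend('prior','posterior');
end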
Quantities of Interest

Since (µ | y) ∼ N(µ̂, Dµ), can easily compute Var(µ | y), P(µ > 0 | y), a 95% credible set for µ, etc.
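For instance, a sketch of these analytic computations for Case 2 (normcdf and norminv are Statistics Toolbox functions; the variable names are mine):

% sketch: analytic posterior quantities for Case 2
mu0 = 10; sig02 = 10; n = 3; ybar = 20; sig2 = 1;
Dmu   = 1/(1/sig02 + n/sig2);
muhat = Dmu*(mu0/sig02 + n*ybar/sig2);
prob_pos = 1 - normcdf(0, muhat, sqrt(Dmu));        % P(mu > 0 | y)
ci95     = norminv([.025 .975], muhat, sqrt(Dmu));  % 95% credible set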

But suppose we wish to obtain E(g (µ) | y) < ∞ for some function
g

Can estimate that using Monte Carlo integration


Monte Carlo Integration

First note that

E(g(µ) | y) = ∫ g(µ) f(µ | y) dµ

(cannot be computed analytically in general)

Monte Carlo integration: generate R draws µ^(1), . . . , µ^(R) from f(µ | y), and compute

ĝ = (1/R) Σ_{r=1}^R g(µ^(r))

By the weak law of large numbers, ĝ converges in probability to E(g(µ) | y) < ∞
Numerical Examples (Continued)

Consider Case 2. Then µ̂ = 19.67 and Dµ = 0.3226

Suppose g(x) = log |x|

Recall that if Z ∼ N(0, 1), then Y = a + bZ follows the N(a, b²) distribution


MATLAB Code

mu0 = 10; sig02 = 10; n=3; ybar = 20; sig2 = 1;


Dmu = 1/(1/sig02 + n/sig2);
muhat = Dmu*(mu0/sig02 + n*ybar/sig2);
R = 10000;
mu = muhat + sqrt(Dmu)*randn(R,1);
ghat = mean(log(abs(mu)));
2-parameter Normal Model

Now we extend the previous model so that σ 2 is unknown

The model is again

(yi | µ) ∼ N(µ, σ 2 ), i = 1, . . . , n,

where both µ and σ 2 are unknown

The same prior for µ : N(µ0 , σ02 )

Need a prior for σ 2 : InvGamma(ν0 , S0 )


Inverse-Gamma Distribution

A random variable Z is said to have an inverse-gamma distribution


with shape parameter α > 0 and scale parameter β > 0 if its pdf is
given by
f(z; α, β) = (β^α / Γ(α)) z^(−(α+1)) e^(−β/z)

We write Z ∼ InvGamma(α, β)

Moments: EZ = β/(α − 1) for α > 1, Var(Z) = β²/((α − 1)²(α − 2)) for α > 2
Sampling from the Inverse-Gamma Distribution

Sample X ∼ Gamma(α, β) (shape α, rate β) and set Z = 1/X

Exercise: Show that the pdf of Z is

f(z; α, β) = (β^α / Γ(α)) z^(−(α+1)) e^(−β/z)

where the pdf of X is

f(x; α, β) = (β^α / Γ(α)) x^(α−1) e^(−βx)
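In MATLAB this is one line. Note that gamrnd uses the shape/scale parameterization, so a rate of β corresponds to a scale of 1/β; this is the same construction used in the Gibbs samplers below. A small sketch with illustrative values:

% sketch: draw R inverse-gamma variates with shape alpha and rate beta
alpha = 3; beta = .5; R = 10000;        % illustrative values
Z = 1./gamrnd(alpha, 1/beta, R, 1);     % gamrnd takes shape and scale
mean(Z)                                 % should be close to beta/(alpha-1) = 0.25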
The Joint Posterior Distribution

By Bayes’ Theorem, the joint posterior is given by

f(µ, σ² | y) ∝ f(µ, σ², y)
            ∝ f(µ) f(σ²) f(y | µ, σ²)
            ∝ exp{ −(µ − µ0)²/(2σ0²) } × (σ²)^(−(ν0+1)) e^(−S0/σ²) × (σ²)^(−n/2) exp{ −(1/(2σ²)) Σ_{i=1}^n (yi − µ)² }

We wish to obtain E(µ | y), Var(σ² | y), or the quantile µq such that P(µ > µq | y) = q

But those quantities cannot be obtained analytically


Monte Carlo Simulation

Idea: Use Monte Carlo methods. Specifically, obtain draws from f(µ, σ² | y), say, µ^(1), . . . , µ^(R) and σ^2(1), . . . , σ^2(R)

Then compute

(1/R) Σ_{r=1}^R µ^(r),    (1/R) Σ_{r=1}^R (σ^2(r) − σ̄²)²,    µ_(qR) (the qR-th smallest µ draw)

One method to do that is the Gibbs Sampler
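Given the stored draws, these estimates take a few lines of MATLAB; a sketch (the variable names mu_draws and sig2_draws are mine):

% sketch: posterior summaries from stored draws
mu_mean  = mean(mu_draws);                            % estimate of E(mu | y)
sig2_var = mean((sig2_draws - mean(sig2_draws)).^2);  % estimate of Var(sig2 | y)
q = .95; R = length(mu_draws);
sorted_mu = sort(mu_draws);
mu_q = sorted_mu(ceil(q*R));                          % estimate of the q-quantile of (mu | y)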


Gibbs Sampler: Overview

Suppose we want to sample from f(Θ) = f(θ1, . . . , θn) (the target distribution)

The Gibbs sampler constructs a Markov chain Θ^(1), Θ^(2), . . . such that the distribution of Θ^(r) converges to f

Use the full conditional distributions f(θi | θ1, . . . , θi−1, θi+1, . . . , θn) as the transition kernels
Gibbs Sampler: Details

Starting from an initial state Θ(0) , repeat the following steps from
r = 1 to R:
1. Given the current state Θ(r ) = Θ, generate Y = (Y1 , . . . , Yn )
as follows:
1.1 Draw Y1 ∼ f (y1 | θ2 , . . . , θn ).
1.2 Draw Yi ∼ f (yi | Y1 , . . . , Yi −1 , θi +1 , . . . , θn ), i = 2, . . . , n − 1.
1.3 Draw Yn ∼ f (yn | Y1 , . . . , Yn−1 ).
2. Set Θ(r +1) = Y.
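As a toy illustration of the algorithm (not from the slides), here is a minimal Gibbs sampler for a bivariate normal target with correlation rho, alternating between the two full conditionals:

% sketch: Gibbs sampler for a bivariate N(0, [1 rho; rho 1]) target
rho = 0.8; R = 10000;
theta = zeros(R,2); th1 = 0; th2 = 0;      % initial state
for r = 1:R
    th1 = rho*th2 + sqrt(1-rho^2)*randn;   % draw theta1 | theta2
    th2 = rho*th1 + sqrt(1-rho^2)*randn;   % draw theta2 | theta1
    theta(r,:) = [th1 th2];
end
corr(theta)                                % should be close to [1 rho; rho 1]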
Common Misconceptions
The Markov chain Θ^(1), Θ^(2), . . . does not converge to a fixed point in R^k

Rather, the distribution of Θ(r ) converges to the target distribution

An example:

[Figure: trace plot of 10,000 draws of a Markov chain; the draws fluctuate between roughly 2.85 and 3.2]
Histograms of the previous Markov chain:

[Figure: two histograms of the draws, both concentrated between roughly 2.9 and 3.1]
2-parameter Normal Model (Continued)

To construct a Gibbs sampler to sample from f (µ, σ 2 | y), we need


to derive f (µ | y, σ 2 ) and f (σ 2 | y, µ)

(1) f(σ² | y, µ):

f(σ² | y, µ) ∝ f(µ, σ² | y)
            ∝ (σ²)^(−(ν0+1)) e^(−S0/σ²) (σ²)^(−n/2) exp{ −(1/(2σ²)) Σ_{i=1}^n (yi − µ)² }
            ∝ (σ²)^(−(ν0+n/2+1)) exp{ −[S0 + Σ_{i=1}^n (yi − µ)²/2] / σ² }

A known distribution?
Compare

f(σ² | y, µ) ∝ (σ²)^(−(ν0+n/2+1)) exp{ −[S0 + Σ_{i=1}^n (yi − µ)²/2] / σ² }

with

f(σ²; α, β) ∝ (σ²)^(−(α+1)) e^(−β/σ²)

Hence,

(σ² | y, µ) ∼ InvGamma( ν0 + n/2, S0 + Σ_{i=1}^n (yi − µ)²/2 )
When σ² is given (i.e., known), the model reduces to the 1-parameter normal model

(2) f(µ | y, σ²):

f(µ | y, σ²) ∝ f(µ) f(y | µ, σ²)
            ∝ exp{ −(1/2) [ (1/σ0² + n/σ²) µ² − 2µ (µ0/σ0² + nȳ/σ²) ] }

Hence, (µ | y, σ²) ∼ N(µ̂, Dµ), where

Dµ = (1/σ0² + n/σ²)⁻¹,   µ̂ = Dµ (µ0/σ0² + nȳ/σ²)
Gibbs Sampler for the 2-parameter Normal Model

Pick some initial values µ^(0) = a0 and σ^2(0) = b0 > 0. Then, repeat the following steps from r = 1 to R:
1. Draw σ^2(r) ∼ f(σ² | y, µ^(r−1)) (inverse-gamma).
2. Draw µ^(r) ∼ f(µ | y, σ^2(r)) (normal).

Usually discard the first m draws as “burn-in”

Then, given µ^(1), . . . , µ^(R) and σ^2(1), . . . , σ^2(R), one can compute, e.g.,

(1/R) Σ_{r=1}^R µ^(r)

as an estimate of E(µ | y)
MATLAB Code

% norm_2para.m
nloop = 10000; burnin = 1000;
n = 500; mu = 3; sig2 = .5;
y = mu + sqrt(sig2)*randn(n,1);

% prior
mu0 = 0; sig20 = 100;
nu0 = 3; S0 = .5;

% initialize the Markov chain


mu = 0; sig2 = 1;
store_theta = zeros(nloop,2);
MATLAB Code

for loop=1:nloop
% sample mu
Dmu = 1/(1/sig20 + n/sig2);
muhat = Dmu*(mu0/sig20 + sum(y)/sig2);
mu = muhat + sqrt(Dmu)*randn;

% sample sig2
sig2 = 1/gamrnd(nu0+n/2,1/(S0+sum((y-mu).^2)/2));

% store the parameters


store_theta(loop,:) = [mu sig2];
end
store_theta = store_theta(burnin+1:end,:);
thetahat = mean(store_theta);
Linear Regression Model

Consider the linear regression model:

yt = β0 + xt,1 β1 + · · · + xt,k βk + ǫt , ǫt ∼ N(0, σ 2 ),

where xt = (1, xt,1 , . . . , xt,k )′ is a vector of regressors and


β = (β0 , β1 , . . . , βk )′ is the associated vector of regression
coefficients

(The 2-parameter normal model is a special case with only an intercept and β0 = µ)
Linear Regression Model: Matrix Form

Hence,

[ y1 ]   [ 1  x1,1  · · ·  x1,k ] [ β0 ]   [ ǫ1 ]
[ y2 ] = [ 1  x2,1  · · ·  x2,k ] [ β1 ] + [ ǫ2 ]
[ ⋮  ]   [ ⋮    ⋮            ⋮  ] [ ⋮  ]   [ ⋮  ]
[ yT ]   [ 1  xT,1  · · ·  xT,k ] [ βk ]   [ ǫT ]

Or equivalently,

y = Xβ + ǫ,   ǫ ∼ N(0, σ²IT),

where IT is the T × T identity matrix


Two Useful Results

(1) An affine transformation (linear transformation followed by a


translation) of normal random variables is a (multivariate) normal
random variable

(2) Suppose U has an expectation vector µU and covariance


matrix ΣU . Let V = AU. Then

µV = AµU , ΣV = AΣU A′ .
Since y is an affine transformation of ǫ ∼ N(0, σ²IT), y has a normal distribution

Moreover, given β and σ 2 , the expectation and covariance matrix


of y are Xβ and σ 2 IT

Therefore, we have

(y | β, σ 2 ) ∼ N(Xβ, σ 2 IT )
Linear Regression Model: Likelihood

Since
(y | β, σ 2 ) ∼ N(Xβ, σ 2 IT ),
the likelihood function is given by:
f(y | β, σ²) = |2πσ²IT|^(−1/2) exp{ −(1/2) (y − Xβ)′(σ²IT)⁻¹(y − Xβ) }
             = (2πσ²)^(−T/2) exp{ −(1/(2σ²)) (y − Xβ)′(y − Xβ) }

Recall that for an n × n matrix A and c ∈ R, |cA| = c^n |A|
Priors and Gibbs Sampler

Independent priors for β and σ 2 :

β ∼ N(β 0 , Vβ ), σ 2 ∼ InvGamma(ν0 , S0 )

Use Gibbs sampler to estimate the model

We need to derive (1) (σ 2 | y, β) and (2) (β | y, σ 2 )


Sample (σ 2 | y, β)

(1) f(σ² | y, β):

f(σ² | y, β) ∝ (σ²)^(−(ν0+1)) e^(−S0/σ²) (σ²)^(−T/2) exp{ −(1/(2σ²)) (y − Xβ)′(y − Xβ) }
            ∝ (σ²)^(−(ν0+T/2+1)) exp{ −[S0 + (y − Xβ)′(y − Xβ)/2] / σ² }

Hence,

(σ² | y, β) ∼ InvGamma( ν0 + T/2, S0 + (y − Xβ)′(y − Xβ)/2 )
Recall (AB)′ = B′A′, and hence

(y − Xβ)′(y − Xβ) = y′y − y′Xβ − β′X′y + β′X′Xβ
                  = β′X′Xβ − 2β′X′y + y′y,

since β′X′y is a scalar, so that β′X′y = (β′X′y)′ = y′Xβ
Sample (β | y, σ 2)

(2) f (β | y, σ 2 ):

f(β | y, σ²) ∝ f(β) f(y | β, σ²)
            ∝ exp{ −(1/2) (β − β0)′Vβ⁻¹(β − β0) } exp{ −(1/(2σ²)) (y − Xβ)′(y − Xβ) }
            ∝ exp{ −(1/2) (β′Vβ⁻¹β − 2β′Vβ⁻¹β0) } exp{ −(1/(2σ²)) (β′X′Xβ − 2β′X′y) }
            ∝ exp{ −(1/2) [ β′(Vβ⁻¹ + X′X/σ²)β − 2β′(Vβ⁻¹β0 + X′y/σ²) ] }

The exponent is quadratic in β. Hence,

(β | y, σ²) ∼ N(β̂, Dβ)
Now compare

f(β | y, σ²) ∝ exp{ −(1/2) [ β′(Vβ⁻¹ + X′X/σ²)β − 2β′(Vβ⁻¹β0 + X′y/σ²) ] }

with

f(β | y, σ²) ∝ exp{ −(1/2) (β′Dβ⁻¹β − 2β′Dβ⁻¹β̂) }

We have

Dβ = (Vβ⁻¹ + X′X/σ²)⁻¹,   β̂ = Dβ (Vβ⁻¹β0 + X′y/σ²)
Sampling from the Multivariate Normal Distribution

To generate R independent draws from N(µ, Σ) of dimension n,


carry out the following steps:
1. Compute the lower Cholesky factorization Σ = BB′ .
2. Generate Z = (Z1 , . . . , Zn )′ by drawing Z1 , . . . , Zn ∼ N(0, 1).
3. Output U = µ + BZ.
4. Repeat Steps 2 and 3 independently R times.
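A minimal MATLAB sketch of these steps (mu and Sig are illustrative placeholders for the mean vector and covariance matrix):

% sketch: R independent draws from N(mu, Sig) via the lower Cholesky factor
mu = [1; 2]; Sig = [2 .5; .5 1]; R = 10000;     % illustrative values
B = chol(Sig, 'lower');                         % Sig = B*B'
U = repmat(mu, 1, R) + B*randn(length(mu), R);  % each column is one draw
mean(U, 2)                                      % should be close to mu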
Gibbs Sampler for the Linear Regression Model

Pick some initial values β (0) = a0 and σ 2(0) = b0 > 0. Then,


repeat the following steps from r = 1 to R:
1. Draw σ 2(r ) ∼ f (σ 2 | y, β (r −1) ) (inverse-gamma).
2. Draw β (r ) ∼ f (β | y, σ 2(r ) ) (multivariate normal).
MATLAB Code

% linreg.m
nloop = 10000; burnin = 1000;
n = 500; beta = [1 5]’; sig2 = .5;
X = [ones(n,1) 1+randn(n,1)];
y = X*beta + sqrt(sig2)*randn(n,1);

% prior
beta0 = [0 0]’; invVbeta0 = speye(2)/100;
nu0 = 3; S0 = .5;

% initialize the Markov chain


beta = (X’*X)\(X’*y); sig2 = 1;
store_theta = zeros(nloop,3);
MATLAB Code
for loop=1:nloop
% sample beta
Dbeta = (invVbeta0 + X’*X/sig2)\speye(2);
betahat = Dbeta*(invVbeta0*beta0 + X’*y/sig2);
C = chol(Dbeta,’lower’);
beta = betahat + C*randn(2,1);

% sample sig2
e = y-X*beta;
sig2 = 1/gamrnd(nu0+n/2,1/(S0+e’*e/2));

% store the parameters


store_theta(loop,:) = [beta’ sig2];
end
store_theta = store_theta(burnin+1:end,:);
thetahat = mean(store_theta);
Linear Regression Model with Moving Average Errors

Consider again the linear regression model:

yt = x′t β + ǫt ,

but with MA(1) error:

ǫt = ut + ψut−1 ,

where u0 = 0 and u1, . . . , uT ∼ N(0, σ²)

(The previous linear regression model is a special case with ψ = 0)


Priors and Estimation

Independent priors for β, σ 2 , and ψ

β ∼ N(β 0 , Vβ ), σ 2 ∼ InvGamma(ν0 , S0 ), ψ ∼ U(−1, 1),

To construct a Gibbs sampler, we need to derive


1. (σ 2 | y, β, ψ),
2. (β | y, σ 2 , ψ),
3. (ψ | y, β, σ 2 ).

Turns out Steps 1 and 2 can be done as before, but f (ψ | y, β, σ 2 )


is not a standard density
Estimation: Overview

We will first derive f (σ 2 | y, β, ψ) (inverse-gamma) and


f (β | y, σ 2 , ψ) (multivariate normal)

Introduce the Metropolis-Hastings algorithm to handle Step 3

(The resulting sampler is sometimes called


Metropolis-within-Gibbs)
Likelihood: Derivation

To derive the likelihood function, rewrite the MA(1) model into


the matrix form (Chan, 2013)

ǫ = Hψ u,

where u = (u1 , . . . , uT )′ ∼ N(0, σ 2 IT ), and


 
       [ 1   0   0   · · ·   0 ]
       [ ψ   1   0   · · ·   0 ]
Hψ =   [ 0   ψ   1   · · ·   0 ]
       [ ⋮        ⋱    ⋱     ⋮ ]
       [ 0   0   · · ·   ψ   1 ]

is a sparse T × T matrix that contains only 2T − 1 non-zero elements
Hence,

y = Xβ + ǫ = Xβ + Hψ u, u ∼ N(0, σ 2 IT )

By a change of variable, we have

(y | β, σ 2 , ψ) ∼ N(Xβ, σ 2 Hψ H′ψ ),

Note that Hψ is a lower triangular matrix with ones on the main


diagonal. Hence, |Hψ | = 1
Likelihood: Computation

The likelihood function can be written as

f(y | β, σ², ψ) = |2πσ²Hψ H′ψ|^(−1/2) exp{ −(1/2) (y − Xβ)′(σ²Hψ H′ψ)⁻¹(y − Xβ) }
               = (2πσ²)^(−T/2) exp{ −(1/(2σ²)) (y − Xβ)′(Hψ H′ψ)⁻¹(y − Xβ) }

The likelihood function can be evaluated quickly

To compute (Hψ H′ψ)⁻¹(y − Xβ), one need not obtain (Hψ H′ψ)⁻¹ explicitly, which is a very time-consuming operation
Instead, solve the system

(Hψ H′ψ )x = (y − Xβ)

for x

This can be done in MATLAB using the backslash operator \

To evaluate the likelihood function for a general MA(q) model, one only needs to redefine the matrix Hψ appropriately
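For example, here is a sketch (not from the slides) of Hψ for an MA(2) error, built the same way as the MA(1) matrix in the code below, with illustrative placeholder data and parameters:

% sketch: H for an MA(2) error and the solve step that avoids inv(H*H')
T = 100; psi1 = .5; psi2 = .2;                               % illustrative values
Hpsi = speye(T) + psi1*sparse(2:T,1:T-1,ones(T-1,1),T,T) ...
                + psi2*sparse(3:T,1:T-2,ones(T-2,1),T,T);
y = randn(T,1); X = [ones(T,1) randn(T,1)]; beta = [1; .5];  % placeholder data
x = (Hpsi*Hpsi')\(y - X*beta);                               % solves (H*H')x = y - X*beta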
Sample (σ 2 | y, β, ψ)

(1) f(σ² | y, β, ψ):

f(σ² | y, β, ψ) ∝ (σ²)^(−(ν0+1)) e^(−S0/σ²) (σ²)^(−T/2) exp{ −(1/(2σ²)) (y − Xβ)′(Hψ H′ψ)⁻¹(y − Xβ) }
               ∝ (σ²)^(−(ν0+T/2+1)) exp{ −[S0 + (y − Xβ)′(Hψ H′ψ)⁻¹(y − Xβ)/2] / σ² }

Hence,

(σ² | y, β, ψ) ∼ InvGamma(ν0 + T/2, S),

where S = S0 + (y − Xβ)′(Hψ H′ψ)⁻¹(y − Xβ)/2
Sample (β | y, σ 2, ψ)

(2) f(β | y, σ², ψ):

f(β | y, σ², ψ) ∝ exp{ −(1/2) (β − β0)′Vβ⁻¹(β − β0) } exp{ −(1/(2σ²)) (y − Xβ)′(Hψ H′ψ)⁻¹(y − Xβ) }
               ∝ exp{ −(1/2) (β′Vβ⁻¹β − 2β′Vβ⁻¹β0) } exp{ −(1/(2σ²)) (β′X′(Hψ H′ψ)⁻¹Xβ − 2β′X′(Hψ H′ψ)⁻¹y) }
               ∝ exp{ −(1/2) [ β′(Vβ⁻¹ + X′(Hψ H′ψ)⁻¹X/σ²)β − 2β′(Vβ⁻¹β0 + X′(Hψ H′ψ)⁻¹y/σ²) ] }

Again, the exponent is quadratic in β, and we have

(β | y, σ², ψ) ∼ N(β̂, Dβ),

where

Dβ = (Vβ⁻¹ + X′(Hψ H′ψ)⁻¹X/σ²)⁻¹,   β̂ = Dβ (Vβ⁻¹β0 + X′(Hψ H′ψ)⁻¹y/σ²)
Sample (ψ | y, β, σ 2)

(3) f(ψ | y, β, σ²):

f(ψ | y, β, σ²) ∝ f(ψ) f(y | β, σ², ψ)
               ∝ exp{ −(1/(2σ²)) (y − Xβ)′(Hψ H′ψ)⁻¹(y − Xβ) }

for −1 < ψ < 1

The density f (ψ | y, β, σ 2 ) is non-standard

We don’t know how to sample from f (ψ | y, β, σ 2 ), but it can be


evaluated quickly (up to a normalization constant)
Metropolis-Hastings Algorithm: Overview

Suppose ψ is the current draw. Obtain a candidate draw ψ c , and


decide (probabilistically) whether to accept it or not

The acceptance probability is computed so that the detailed


balance equations are satisfied

As a result, the limiting distribution of the Markov chain


constructed is guaranteed to converge to the target

See Chib and Greenberg (1995)


Metropolis-Hastings Algorithm: Implementation

Two variants: random walk sampler and independent sampler

Random walk sampler:


1. the candidate is obtained via ψ c = ψ + v , where v ∼ g for
some distribution g (usually g is symmetric around 0)
2. accept ψc with probability

   min{ 1, f(ψc | y, β, σ²) / f(ψ | y, β, σ²) }

3. if ψc is not accepted, stay at ψ

Some comments:
◦ easy to implement
◦ g is typically chosen to be a normal or t distribution with 0 mean
◦ might need some fine-tuning, but works for low-dimensional
problems
◦ could be very inefficient for high-dimensional problems
Independent sampler:
1. the candidate is obtained via ψ c ∼ q(ψ) (called proposal
density)
2. accept ψc with probability

   min{ 1, [f(ψc | y, β, σ²) q(ψ)] / [f(ψ | y, β, σ²) q(ψc)] }

3. if ψc is not accepted, stay at ψ


Some comments:
◦ requires more from user/more difficult to implement
◦ choose q(ψ) as “close” as possible to f (ψ c | y, β, σ 2 )
◦ (if q(ψ) = f (ψ | y, β, σ 2 ), then acceptance prob. = 1 and it
reduces to the Gibbs sampler)
◦ could be very efficient or very inefficient (depends on the
choice of q(ψ))
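No independence-sampler code appears in the slides; as a sketch, one MH step for ψ with a N(psi0, tau2) proposal (psi0 and tau2 are illustrative choices) could look like the following, reusing the ppsi function defined below and the current values of y, X, beta, sig2 and psi:

% sketch: one independence MH step for psi
psi0 = 0; tau2 = .25;                       % illustrative proposal parameters
psic = psi0 + sqrt(tau2)*randn;             % candidate drawn from q, not from psi
if psic > -1 && psic < 1                    % target is zero outside (-1,1)
    logq_c   = -.5*(psic - psi0)^2/tau2;    % log q(psic) up to a constant
    logq_old = -.5*(psi  - psi0)^2/tau2;    % log q(psi)  up to a constant
    alp = ppsi(psic,y,X,beta,sig2) - ppsi(psi,y,X,beta,sig2) ...
          + logq_old - logq_c;              % log acceptance ratio
    if exp(alp) > rand
        psi = psic;
    end
end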
Sample (ψ | y, β, σ 2)

Recall

f(ψ | y, β, σ²) ∝ exp{ −(1/(2σ²)) (y − Xβ)′(Hψ H′ψ)⁻¹(y − Xβ) },   −1 < ψ < 1

Use a random walk sampler with ψc = ψ + v, where v ∼ N(0, 0.1)


MATLAB Code: Evaluating log f (ψ | y, β, σ 2)

% ppsi.m
function llike = ppsi(psi,y,X,beta,sig2)
n = length(y);
Hpsi = speye(n) + ...
psi*sparse(2:n,1:n-1,ones(n-1,1),n,n);
R = Hpsi*Hpsi’;
llike = -.5/sig2*(y-X*beta)’*(R\(y-X*beta));
end
MATLAB Code

% linreg_ma1_RW.m
nloop = 11000; burnin = 1000;
randn(’seed’, 314159);
psi = -.5; beta = [4 .6]’; sig2 = .5;
n = 100; y0 = 10; y = zeros(n,1);
u = sqrt(sig2)*randn(n,1);
for i=1:n
if i==1
y(i) = beta(1) + beta(2)*y0 + u(i);
else
y(i) = beta(1) + beta(2)*y(i-1) ...
+ u(i) + psi*u(i-1);
end
end
X = [ones(n,1) [y0; y(1:end-1)]];
MATLAB Code

% prior
beta0 = [0 0]’; invVbeta0 = 1/100*speye(2);
nu0 = 3; S0 = 1;
% initialize the Markov chain
psi = .1; beta = (X’*X)\(X’*y); sig2 = 1;
store_theta = zeros(nloop,4); count = 0;
MATLAB Code

for loop=1:nloop
% sample beta
Hpsi = speye(n) + ...
psi*sparse(2:n,1:n-1,ones(n-1,1),n,n);
R = Hpsi*Hpsi’;
Dbeta = (invVbeta0 + X’*(R\X)/sig2)\speye(2);
betahat = Dbeta*(invVbeta0*beta0+X’*(R\y)/sig2);
C = chol(Dbeta,’lower’);
beta = betahat + C*randn(2,1);
% sample sig2
e = y-X*beta;
sig2 = 1/gamrnd(nu0+n/2,1/(S0+e’*(R\e)/2));
MATLAB Code

% sample psi
psic = psi + sqrt(.1)*randn;
if psic<1 && psic >-1
alp = ppsi(psic,y,X,beta,sig2) ...
- ppsi(psi,y,X,beta,sig2);
if exp(alp)>rand
psi = psic;
count = count+1;
end
end
% store the parameters
store_theta(loop,:) = [beta’ sig2 psi];
end
store_theta = store_theta(burnin+1:end,:);
thetahat = mean(store_theta);
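The counter count tracks how many candidates were accepted, so count/nloop gives the acceptance rate of the random walk step; if it is very high or very low, the proposal variance (0.1 here) is the natural thing to tune, in line with the earlier comment that the random walk sampler might need some fine-tuning. For example:

acc_rate = count/nloop;   % acceptance rate of the MH step for psi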
