
Introduction to Bayesian Estimation

Wouter J. Den Haan


London School of Economics
© 2011 by Wouter J. Den Haan

May 31, 2015

Overview

Maximum Likelihood
A very useful tool: the Kalman filter
Estimating DSGEs
Maximum Likelihood & DSGEs
formulating the likelihood
singularity when #shocks < #observables
Bayesian estimation
Tools:
Metropolis-Hastings

Standard Maximum Likelihood problem

Theory:

$$y_t = a_0 + a_1 x_t + \epsilon_t, \qquad \epsilon_t \sim N(0, \sigma^2)$$

$x_t$: exogenous

Data: $\{y_t, x_t\}_{t=1}^{T}$

ML estimator

$$\max_{a_0, a_1, \sigma} \prod_{t=1}^{T} p(\epsilon_t)$$

where

$$\epsilon_t = y_t - a_0 - a_1 x_t, \qquad
p(\epsilon_t) = \frac{1}{\sqrt{2\pi\sigma^2}} \exp\left( -\frac{\epsilon_t^2}{2\sigma^2} \right)$$

ML estimator

$$\max_{a_0, a_1, \sigma} \prod_{t=1}^{T} \frac{1}{\sqrt{2\pi\sigma^2}} \exp\left( -\frac{(y_t - a_0 - a_1 x_t)^2}{2\sigma^2} \right)$$
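
A minimal numerical sketch of this ML problem (not part of the original slides): it maximizes the Gaussian log-likelihood with scipy; the simulated data and starting values are purely illustrative.

```python
import numpy as np
from scipy.optimize import minimize

def neg_log_likelihood(params, y, x):
    """Negative Gaussian log-likelihood of y_t = a0 + a1*x_t + eps_t."""
    a0, a1, log_sigma = params                 # optimize log(sigma) to keep sigma > 0
    sigma2 = np.exp(log_sigma) ** 2
    eps = y - a0 - a1 * x                      # residuals epsilon_t
    return 0.5 * np.sum(np.log(2 * np.pi * sigma2) + eps**2 / sigma2)

# simulated data, only to make the example runnable
rng = np.random.default_rng(0)
x = rng.normal(size=200)
y = 1.0 + 0.5 * x + 0.3 * rng.normal(size=200)

result = minimize(neg_log_likelihood, x0=[0.0, 0.0, 0.0], args=(y, x))
a0_hat, a1_hat, sigma_hat = result.x[0], result.x[1], np.exp(result.x[2])
```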

Rudolph E. Kalman

born in Budapest, Hungary, on May 19, 1930



Kalman filter

Linear projection
Linear projection with orthogonal regressors
Kalman filter

The slides for the Kalman filter are based on Ljungqvist and Sargent's textbook

Linear projection

$y$: $n_y \times 1$ vector of random variables

$x$: $n_x \times 1$ vector of random variables

First and second moments exist:

$$Ey = \mu_y, \quad \tilde{y} = y - \mu_y, \quad E\tilde{y}\tilde{y}' = \Sigma_{yy}$$
$$Ex = \mu_x, \quad \tilde{x} = x - \mu_x, \quad E\tilde{x}\tilde{x}' = \Sigma_{xx}$$
$$E\tilde{y}\tilde{x}' = \Sigma_{yx}$$

Definition of linear projection

The linear projection of $y$ on $x$ is the function

$$\hat{E}[y|x] = a + Bx,$$

where $a$ and $B$ are chosen to minimize

$$E\,\mathrm{trace}\left[ (y - a - Bx)(y - a - Bx)' \right]$$

Formula for linear projection

The linear projection of $y$ on $x$ is given by

$$\hat{E}[y|x] = \mu_y + \Sigma_{yx}\Sigma_{xx}^{-1}(x - \mu_x)$$
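
A small numerical sketch of this formula (not from the slides), computing the projection from sample moments; all names are illustrative.

```python
import numpy as np

def linear_projection(y, x):
    """E_hat[y|x] = mu_y + Sigma_yx Sigma_xx^{-1} (x - mu_x), built from sample moments.
    y: (T, ny) draws, x: (T, nx) draws. Returns a function of a new x."""
    mu_y, mu_x = y.mean(axis=0), x.mean(axis=0)
    yc, xc = y - mu_y, x - mu_x
    Sigma_yx = yc.T @ xc / len(y)
    Sigma_xx = xc.T @ xc / len(x)
    B = Sigma_yx @ np.linalg.inv(Sigma_xx)
    return lambda x_new: mu_y + B @ (x_new - mu_x)

# usage: project a 2-vector y on a 3-vector x
rng = np.random.default_rng(1)
x = rng.normal(size=(500, 3))
y = x @ rng.normal(size=(3, 2)) + 0.1 * rng.normal(size=(500, 2))
proj = linear_projection(y, x)
print(proj(np.zeros(3)))
```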

Difference with linear regression problem

True model:

$$y = Bx + Dz + \epsilon,$$
$$Ex = Ez = E\epsilon = 0, \quad E[\epsilon|x, z] = 0, \quad E[z|x] \neq 0$$

$B$: measures the effect of $x$ on $y$ keeping all else, including $z$, constant.

Particular regression model:

$$y = Bx + u$$

Difference with linear regression problem

Comments:

Least-squares estimate $\neq B$

Projection:
$$\hat{E}[y|x] = \hat{B}x = Bx + D\,\hat{E}[z|x]$$

The projection is well defined

The linear projection can include more than the direct effect

Message:

You can always define the linear projection;
you don't have to worry about the properties of the error term.

Linear Projection with orthogonal regressors

$x = [x_1, x_2]$ and suppose that $\Sigma_{x_1 x_2} = 0$

$x_1$ and $x_2$ could be vectors

$$\begin{aligned}
\hat{E}[y|x] &= \mu_y + \Sigma_{yx}\Sigma_{xx}^{-1}(x - \mu_x) \\
&= \mu_y + \begin{bmatrix} \Sigma_{yx_1} & \Sigma_{yx_2} \end{bmatrix}
\begin{bmatrix} \Sigma_{x_1 x_1}^{-1} & 0 \\ 0 & \Sigma_{x_2 x_2}^{-1} \end{bmatrix}(x - \mu_x) \\
&= \mu_y + \Sigma_{yx_1}\Sigma_{x_1 x_1}^{-1}(x_1 - \mu_{x_1}) + \Sigma_{yx_2}\Sigma_{x_2 x_2}^{-1}(x_2 - \mu_{x_2})
\end{aligned}$$

Thus

$$\hat{E}[y|x] = \hat{E}[y|x_1] + \hat{E}[y|x_2] - \mu_y \qquad (1)$$

Time Series Model

$$x_{t+1} = A x_t + G w_{1,t+1}$$
$$y_t = C x_t + w_{2,t}$$
$$E w_{1,t} = E w_{2,t} = 0$$
$$E\begin{bmatrix} w_{1,t+1} \\ w_{2,t} \end{bmatrix}\begin{bmatrix} w_{1,t+1} \\ w_{2,t} \end{bmatrix}' = \begin{bmatrix} V_1 & V_3 \\ V_3' & V_2 \end{bmatrix}$$

Time Series Model

$y_t$ is observed, but $x_t$ is not

the coefficients are known (could even be time-varying)

Initial condition:
$x_1$ is a random variable (mean $\hat{x}_1$ & covariance matrix $\Sigma_1$)
(it is not unusual that $\hat{x}_1$ is simply set equal to $E x_1$)

$w_{1,t+1}$ and $w_{2,t}$ are serially uncorrelated and orthogonal to $x_1$

Objective

The objective is to calculate

$$\hat{E}_t x_{t+1} \equiv \hat{E}\left[ x_{t+1} \mid y_t, y_{t-1}, \ldots, y_1, \hat{x}_1 \right]
= \hat{E}\left[ x_{t+1} \mid Y_t, \hat{x}_1 \right]$$

where $\hat{x}_1$ is an initial estimate of $x_1$

Trick: get a recursive formulation

Orthogonalization of the information set

Let
$$\tilde{y}_t = y_t - \hat{E}\left[ y_t \mid y_{t-1}, y_{t-2}, \ldots, y_1, \hat{x}_1 \right]$$
$$\tilde{Y}_t = \{\tilde{y}_t, \tilde{y}_{t-1}, \ldots, \tilde{y}_1\}$$

space spanned by $\{\hat{x}_1, \tilde{Y}_t\}$ = space spanned by $\{\hat{x}_1, Y_t\}$

That is, anything that can be expressed as a linear combination of elements in $\{\hat{x}_1, \tilde{Y}_t\}$ can be expressed as a linear combination of elements in $\{\hat{x}_1, Y_t\}$.

Orthogonalization of the information set

Then
$$\hat{E}\left[ y_{t+1} \mid Y_t, \hat{x}_1 \right] = \hat{E}\left[ y_{t+1} \mid \tilde{Y}_t, \hat{x}_1 \right] = C\hat{E}\left[ x_{t+1} \mid \tilde{Y}_t, \hat{x}_1 \right] \qquad (2)$$

Derivation of the Kalman filter

From (1) we get

$$\hat{E}\left[ x_{t+1} \mid \tilde{Y}_t, \hat{x}_1 \right] = \hat{E}\left[ x_{t+1} \mid \tilde{y}_t \right] + \hat{E}\left[ x_{t+1} \mid \tilde{Y}_{t-1}, \hat{x}_1 \right] - E x_{t+1} \qquad (3)$$

The first term in (3) is a standard linear projection:

$$\begin{aligned}
\hat{E}\left[ x_{t+1} \mid \tilde{y}_t \right] &= E x_{t+1} + \mathrm{cov}(x_{t+1}, \tilde{y}_t)\left[ \mathrm{cov}(\tilde{y}_t, \tilde{y}_t) \right]^{-1}(\tilde{y}_t - E\tilde{y}_t) \\
&= E x_{t+1} + \mathrm{cov}(x_{t+1}, \tilde{y}_t)\left[ \mathrm{cov}(\tilde{y}_t, \tilde{y}_t) \right]^{-1}\tilde{y}_t
\end{aligned}$$

Some algebra
Similar to the definition of $\tilde{y}_t$, let
$$\tilde{x}_{t+1} = x_{t+1} - \hat{E}\left[ x_{t+1} \mid y_t, y_{t-1}, \ldots, y_1, \hat{x}_1 \right] = x_{t+1} - \hat{E}_t x_{t+1}$$

Let $\Sigma_{\tilde{x},t} = E\tilde{x}_t\tilde{x}_t'$. Then
$$\mathrm{cov}(x_{t+1}, \tilde{y}_t) = A\Sigma_{\tilde{x},t}C' + GV_3$$
$$\mathrm{cov}(\tilde{y}_t, \tilde{y}_t) = C\Sigma_{\tilde{x},t}C' + V_2$$

To go from the unconditional covariance, $\mathrm{cov}(\cdot)$, to the conditional $\Sigma_{\tilde{x},t}$ requires some algebra (see the appendix of Ljungqvist-Sargent for details)

Using the derived expressions

$$\begin{aligned}
\hat{E}\left[ x_{t+1} \mid \tilde{y}_t \right] &= E x_{t+1} + \mathrm{cov}(x_{t+1}, \tilde{y}_t)\left[ \mathrm{cov}(\tilde{y}_t, \tilde{y}_t) \right]^{-1}\tilde{y}_t \\
&= E x_{t+1} + \left( A\Sigma_{\tilde{x},t}C' + GV_3 \right)\left( C\Sigma_{\tilde{x},t}C' + V_2 \right)^{-1}\tilde{y}_t \qquad (4)
\end{aligned}$$

Derivation of the Kalman filter

Now get an expression for the second term in (3).

From $x_{t+1} = Ax_t + Gw_{1,t+1}$ we get

$$\hat{E}\left[ x_{t+1} \mid \tilde{Y}_{t-1}, \hat{x}_1 \right] = A\hat{E}\left[ x_t \mid \tilde{Y}_{t-1}, \hat{x}_1 \right] = A\hat{E}_{t-1} x_t \qquad (5)$$

Using (4) and (5) in (3) gives the recursive expression

$$\hat{E}_t x_{t+1} = A\hat{E}_{t-1} x_t + K_t \tilde{y}_t$$

where

$$K_t = \left( A\Sigma_{\tilde{x},t}C' + GV_3 \right)\left( C\Sigma_{\tilde{x},t}C' + V_2 \right)^{-1}$$
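
A compact NumPy sketch of this recursion (not part of the original slides), using the $A$, $C$, $G$, $V_1$, $V_2$, $V_3$ notation above; `x_hat` stands for $\hat{E}_{t-1}x_t$ and `Sigma` for $\Sigma_{\tilde{x},t}$. The covariance update anticipates the formula given a few slides below.

```python
import numpy as np

def kalman_step(x_hat, Sigma, y_obs, A, C, G, V1, V2, V3):
    """One recursion: E_t x_{t+1} = A E_{t-1} x_t + K_t * y_tilde_t."""
    y_tilde = y_obs - C @ x_hat                          # prediction error y~_t
    S = C @ Sigma @ C.T + V2                             # cov(y~_t, y~_t)
    K = (A @ Sigma @ C.T + G @ V3) @ np.linalg.inv(S)    # Kalman gain K_t
    x_hat_next = A @ x_hat + K @ y_tilde                 # E_t x_{t+1}
    Sigma_next = A @ Sigma @ A.T + G @ V1 @ G.T - K @ (A @ Sigma @ C.T + G @ V3).T
    return x_hat_next, Sigma_next, y_tilde, S
```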

Prediction for observable

From
$$y_{t+1} = Cx_{t+1} + w_{2,t+1}$$
we get
$$\hat{E}\left[ y_{t+1} \mid Y_t, \hat{x}_1 \right] = C\hat{E}_t x_{t+1}$$
Thus
$$\tilde{y}_{t+1} = y_{t+1} - C\hat{E}_t x_{t+1}$$

Updating the covariance matrix

We still need an equation to update $\Sigma_{\tilde{x},t}$. This is actually not that hard. The result is

$$\Sigma_{\tilde{x},t+1} = A\Sigma_{\tilde{x},t}A' + GV_1G' - K_t\left( A\Sigma_{\tilde{x},t}C' + GV_3 \right)'$$

This expression is deterministic and does not depend on particular realizations. That is, the precision only depends on the coefficients of the time series model.

Applications of the Kalman filter

signal extraction problems
GPS, computer vision applications, missiles
prediction
simple alternative to calculating inverse policy functions (see below)

Estimating DSGE models

Forget the Kalman filter for now; we will not use it for a while
What is next?
Specify the neoclassical model that will be used as an example
Specify the linearized version
Specify the estimation problem
Maximum Likelihood estimation
Explain why the Kalman filter is useful
Bayesian estimation
MCMC, a necessary tool for Bayesian estimation

Neoclassical growth model


First-order conditions

$$c_t^{-\nu} = E_t\left[ \beta c_{t+1}^{-\nu}\left( \alpha z_{t+1} k_t^{\alpha - 1} + 1 - \delta \right) \right]$$

$$c_t + k_t = z_t k_{t-1}^{\alpha} + (1 - \delta) k_{t-1}$$

$$z_t = (1 - \rho) + \rho z_{t-1} + \epsilon_t$$

$$\epsilon_t \sim N(0, \sigma^2)$$

$$\theta = \{\alpha, \beta, \nu, \delta, \rho, \sigma\}$$

Policy functions

The FOCs are not of the form

$$y_t = a_0 + a_1 x_t + \epsilon_t, \quad \epsilon_t \sim N(0, \sigma^2)$$

But the policy functions are similar:

$$k_t = g(k_{t-1}, z_t; \theta)$$
$$c_t = h(k_{t-1}, z_t; \theta)$$
$$z_t = (1 - \rho) + \rho z_{t-1} + \epsilon_t$$

Policy functions

Problems:
functional form of policy functions not known
they are nonlinear

Solution to both problems:


use linearized approximations around steady state and treat
these as the truth

Steady state

steady state: the solution when

there is no uncertainty, i.e., $\sigma = 0$
no transition is left

Steady state

no uncertainty $\implies$ no $E_t[\cdot]$ in the equations
no transition $\implies$ $z_t = z_{t-1}$ and $c_t = c_{t+1}$

$$z = (1 - \rho) + \rho z \implies z = 1$$

$$c^{-\nu} = \beta c^{-\nu}\left( \alpha k^{\alpha - 1} + 1 - \delta \right) \implies k = \left( \frac{\alpha\beta}{1 - \beta(1 - \delta)} \right)^{1/(1-\alpha)}$$

budget constraint $\implies$ $c = k^{\alpha} - \delta k$
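
A quick numerical check of these steady-state formulas (the parameter values are illustrative, not from the slides):

```python
# Illustrative parameter values, not taken from the slides
alpha, beta, delta = 0.36, 0.99, 0.025

z_ss = 1.0
k_ss = (alpha * beta / (1.0 - beta * (1.0 - delta))) ** (1.0 / (1.0 - alpha))
c_ss = k_ss ** alpha - delta * k_ss      # from the budget constraint with z = 1
```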

Back to FOCs

The FOC can be written as

$$\left( z_t k_{t-1}^{\alpha} + (1 - \delta) k_{t-1} - k_t \right)^{-\nu}
= E_t\left[ \beta\left( z_{t+1} k_t^{\alpha} + (1 - \delta) k_t - k_{t+1} \right)^{-\nu}\left( \alpha z_{t+1} k_t^{\alpha - 1} + 1 - \delta \right) \right]$$

or

$$E_t\left[ F(\hat{k}_{t-1}, \hat{k}_t, \hat{k}_{t+1}, \hat{z}_t, \hat{z}_{t+1}; \theta) \right] = 0$$

where
$$\hat{k}_t = k_t - k, \qquad \hat{z}_t = z_t - z$$

linearized policy functions

Getting the linearized policy functions correct is in general doable, but not trivial
I just give a rough idea for this simple example

linearized policy functions

$$E_t\left[ F(\hat{k}_{t-1}, \hat{k}_t, \hat{k}_{t+1}, \hat{z}_t, \hat{z}_{t+1}; \theta) \right] = 0$$

$$\implies E_t\left[ \hat{k}_{t+1} + \phi_1\hat{k}_t + \phi_2\hat{k}_{t-1} + \phi_3\hat{z}_t + \phi_4\hat{z}_{t+1} \right] = 0$$

$$\implies E_t\left[ \hat{k}_{t+1} + \phi_1\hat{k}_t + \phi_2\hat{k}_{t-1} + \tilde{\phi}_3\hat{z}_t \right] = 0, \quad \text{where } \tilde{\phi}_3 = \phi_3 + \rho\phi_4$$

The coefficients are known functions of $\theta$

linearized policy functions

Conjecture that the solution is of the form

$$\hat{k}_t = a_{k,k}\hat{k}_{t-1} + a_{k,z}\hat{z}_t$$

Now we just have to solve for $a_{k,k}$ and $a_{k,z}$

linearized policy functions

Plugging the conjecture into the linearized Euler equation gives

$$0 = E_t\left[ a_{k,k}\hat{k}_t + a_{k,z}\hat{z}_{t+1} + \phi_1\hat{k}_t + \phi_2\hat{k}_{t-1} + \tilde{\phi}_3\hat{z}_t \right]$$

$$0 = a_{k,k}\left( a_{k,k}\hat{k}_{t-1} + a_{k,z}\hat{z}_t \right) + \rho a_{k,z}\hat{z}_t
+ \phi_1\left( a_{k,k}\hat{k}_{t-1} + a_{k,z}\hat{z}_t \right) + \phi_2\hat{k}_{t-1} + \tilde{\phi}_3\hat{z}_t$$

linearized policy functions

This has to hold for all $\hat{k}_{t-1}$ and $\hat{z}_t$ $\implies$

$$a_{k,k}^2 + \phi_1 a_{k,k} + \phi_2 = 0$$
and
$$a_{k,k}a_{k,z} + \rho a_{k,z} + \phi_1 a_{k,z} + \tilde{\phi}_3 = 0$$

Concavity implies that only one solution for $a_{k,k}$ is less than 1
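
A small sketch (not from the slides) of this solution step: pick the stable root of the quadratic and then back out $a_{k,z}$. The $\phi$ values are hypothetical placeholders; in practice they are computed from $\theta$.

```python
import numpy as np

# Hypothetical phi coefficients; in practice they are known functions of theta
phi1, phi2, phi3_tilde, rho = -2.3, 1.05, -0.5, 0.95

# a_{k,k}: the stable root of a^2 + phi1*a + phi2 = 0
roots = np.roots([1.0, phi1, phi2])
a_kk = roots[np.abs(roots) < 1.0][0]          # keep the root that is less than 1

# a_{k,z} from a_kk*a_kz + rho*a_kz + phi1*a_kz + phi3_tilde = 0
a_kz = -phi3_tilde / (a_kk + rho + phi1)
```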

Linearized solution

$$k_t = k + a_{k,k}(k_{t-1} - k) + a_{k,z}(z_t - z)$$
$$z_t = (1 - \rho) + \rho z_{t-1} + \epsilon_t$$
$$\epsilon_t \sim N(0, \sigma^2)$$
$$z_0 \sim N\left( 1, \sigma^2/(1 - \rho^2) \right)$$
$k_0$ is given

$a_{k,k}$, $a_{k,z}$, and $k$ are known functions of the structural parameters
$\implies$ better notation would be $a_{k,k}(\theta)$, $a_{k,z}(\theta)$, and $k(\theta)$
Consumption has been substituted out
The approximation error is ignored; the linearized model is treated as the true model with $\theta$ as the parameters

Linearized solution & approximation error

The approximation error is ignored

This is fine for simple models with only aggregate risk
But never forget these are approximations
In particular, $a_{k,k}(\theta)$ and $a_{k,z}(\theta)$ do not depend on $\sigma$; this is called certainty equivalence

Estimation problem

Given data for capital, $\{k_t\}_{t=0}^{T}$, estimate the set of coefficients

$$\theta = [\alpha, \beta, \nu, \delta, \rho, \sigma, z_0]$$

There is no data on productivity, $z_t$.
If you had data on $z_t$ $\implies$ Likelihood = 0 for sure
More on this below.

Formulation of the Likelihood

Let $Y_T$ be the complete sample

$$L(Y_T | \theta) = p(z_0)\prod_{t=1}^{T} p(z_t | z_{t-1})$$

$p(z_t | z_{t-1})$ corresponds with the probability of a particular value for $\epsilon_t$

Formulation of the Likelihood

Basic idea:

Given a value for $\theta$ and given the data set, $Y_T$, you can calculate the implied values for $\epsilon_t$

We know the distribution of $\epsilon_t$ $\implies$

We can calculate the probability (likelihood) of $\{\epsilon_1, \ldots, \epsilon_T\}$

Formulation of the Likelihood

$$k_t = k + a_{k,k}(k_{t-1} - k) + a_{k,z}(z_t - z)$$

$\implies$

$$z_t = \frac{a_{k,z}\,z - k + a_{k,k}\,k}{a_{k,z}} - \frac{a_{k,k}}{a_{k,z}}\,k_{t-1} + \frac{1}{a_{k,z}}\,k_t$$

$$z_t = b_0 + b_1 k_{t-1} + b_2 k_t$$

$$\epsilon_t = z_t - (1 - \rho) - \rho z_{t-1}$$

Formulation of the Likelihood

$\epsilon_t$ is obtained by inverting the policy function

For larger systems, this inversion is not as easy to implement.
Below, we show an alternative.

Formulation of the Likelihood

A bit more explicit:

Take a value for $\theta$
Given $k_0$ and $k_1$ you can calculate $z_1$
Given $z_0$ you can calculate $\epsilon_1$
Continuing, you can calculate $\epsilon_t$ $\forall t$
To make explicit the dependence of $\epsilon_t$ on $\theta$, write $\epsilon_t(\theta)$
The Likelihood can thus be written as

$$\prod_{t=1}^{T}\frac{1}{\sqrt{2\pi\sigma^2}}\exp\left\{ -\frac{\left( \epsilon_t(\theta) \right)^2}{2\sigma^2} \right\}$$
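
A sketch (not part of the slides) of this recipe: back out $\epsilon_t(\theta)$ by inverting the linearized policy function and sum the log densities. The inputs `k_bar`, `a_kk`, `a_kz`, `rho`, `sigma` are assumed to have been computed from $\theta$ beforehand.

```python
import numpy as np

def log_likelihood(k_data, z0, k_bar, a_kk, a_kz, rho, sigma):
    """Log-likelihood of {k_t} obtained by inverting the linearized policy function.
    k_bar, a_kk, a_kz are assumed to be k(theta), a_{k,k}(theta), a_{k,z}(theta)."""
    # invert k_t = k + a_kk (k_{t-1} - k) + a_kz (z_t - z), with z = 1
    z = 1.0 + (k_data[1:] - k_bar - a_kk * (k_data[:-1] - k_bar)) / a_kz
    z_lag = np.concatenate(([z0], z[:-1]))
    eps = z - (1.0 - rho) - rho * z_lag                  # implied epsilon_t(theta)
    return np.sum(-0.5 * np.log(2 * np.pi * sigma**2) - eps**2 / (2 * sigma**2))
```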

Too few unobservables & singularities

Above we assumed that there was no data on $z_t$

Suppose you had data on $z_t$

There are two cases to consider

Data not exactly generated by this model (most likely case)
$\implies$ Likelihood = 0 for any value of $\theta$
Data exactly generated by this model
$\implies$ Likelihood = 1 for the true value of $\theta$ and
$\implies$ Likelihood = 0 for any other value of $\theta$

Too few unobservables & singularities

$$k_t = k + a_{k,k}(k_{t-1} - k) + a_{k,z}(z_t - z)$$

Using the values for 4 periods, you can pin down $k$, $z$, $a_{k,k}$, and $a_{k,z}$.
What about values for additional periods?

Data generated by the model (unlikely of course)
$\implies$ additional observations will fit this equation too
Data not generated by the model
$\implies$ additional observations will not fit this equation
$\implies$ Likelihood = zero

Too few unobservables & singularities

Can't I simply add an error term?

$$k_t = k + a_{k,k}(k_{t-1} - k) + a_{k,z}(z_t - z) + u_t$$

Answer: NO, not in general

Why not? It is OK in a standard regression

Too few unobservables & singularities

Why is the answer NO in general?

1. $u_t$ represents other shocks, such as preference shocks
$\implies$ its presence is likely to affect $k$, $a_{k,k}$, and $a_{k,z}$
2. $u_t$ represents measurement error
$\implies$ you are fine from an econometric standpoint
$\implies$ but is the residual only measurement error?

What if you also observe consumption?

Suppose you observe $k_t$ and $c_t$, but not $z_t$

$$k_t = k + a_{k,k}(k_{t-1} - k) + a_{k,z}(z_t - z)$$
$$c_t = c + a_{c,k}(k_{t-1} - k) + a_{c,z}(z_t - z)$$

Recall that the coefficients are functions of $\theta$

Given a value of $\theta$ you can solve for $z_t$ from the top equation
Given a value of $\theta$ you can solve for $z_t$ from the bottom equation
With real-world data you will get inconsistent answers.

Unobservables and avoiding singularities

General rule:

For every observable you need at least one unobservable shock


Letting them be measurement errors is hard to defend
The last statement does not mean that you cannot also add
measurement errors

Using the Kalman filter

$$x_{t+1} = Ax_t + Gw_{1,t+1} \qquad (6)$$
$$y_t = Cx_t + w_{2,t} \qquad (7)$$

(6) describes the equations of the model;
$x_t$ consists of the "true" values of state variables like capital and productivity.
(7) relates the observables, $y_t$, to the "true" values

Example

consumption and capital are observed with error:

$$\bar{c}_t = c_t + u_{c,t}$$
$$\bar{k}_t = k_t + u_{k,t}$$

$z_t$ is unobservable

$$x_t' = [k_{t-1} - k, \; z_t - z], \qquad w_{1,t+1} = \epsilon_{t+1}$$
$$y_t' = [\bar{k}_{t-1} - k, \; \bar{c}_t - c]$$

Example

(6) gives the policy function for $k_t$ and the law of motion for $z_t$:

$$\begin{bmatrix} k_t - k \\ z_{t+1} - z \end{bmatrix} =
\begin{bmatrix} a_{k,k} & a_{k,z} \\ 0 & \rho \end{bmatrix}
\begin{bmatrix} k_{t-1} - k \\ z_t - z \end{bmatrix} +
\begin{bmatrix} 0 \\ \epsilon_{t+1} \end{bmatrix}$$

Equation (7) is equal to

$$\begin{bmatrix} \bar{k}_{t-1} - k \\ \bar{c}_t - c \end{bmatrix} =
\begin{bmatrix} 1 & 0 \\ a_{c,k} & a_{c,z} \end{bmatrix}
\begin{bmatrix} k_{t-1} - k \\ z_t - z \end{bmatrix} +
\begin{bmatrix} u_{k,t} \\ u_{c,t} \end{bmatrix}$$
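
A sketch (not from the slides) of how these matrices could be assembled for use with the `kalman_step` routine above; every argument is an illustrative input, and in practice they are functions of $\theta$.

```python
import numpy as np

def state_space_matrices(a_kk, a_kz, a_ck, a_cz, rho, sig_eps, sig_uk, sig_uc):
    """State space for the example: x_t = [k_{t-1}-k, z_t-z], y_t = [kbar_{t-1}-k, cbar_t-c].
    Returned in the order expected by kalman_step: A, C, G, V1, V2, V3."""
    A = np.array([[a_kk, a_kz],
                  [0.0,  rho ]])
    G = np.array([[0.0],
                  [1.0]])
    C = np.array([[1.0,  0.0],
                  [a_ck, a_cz]])
    V1 = np.array([[sig_eps**2]])              # variance of the structural shock
    V2 = np.diag([sig_uk**2, sig_uc**2])       # measurement-error variances
    V3 = np.zeros((1, 2))                      # structural and measurement shocks uncorrelated
    return A, C, G, V1, V2, V3
```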

Back to the Likelihood

$y_t$ consists of $\bar{k}_t$ and $\bar{c}_t$, and the model is given by (6) and (7).

From the Kalman filter we get $\tilde{y}_t$ and $\Sigma_{\tilde{y},t}$:

$$\hat{E}\left[ x_t \mid Y_{t-1}, \hat{x}_1 \right] = A\hat{E}\left[ x_{t-1} \mid Y_{t-2}, \hat{x}_1 \right] + K_{t-1}\tilde{y}_{t-1}$$
$$\hat{E}\left[ y_t \mid Y_{t-1}, \hat{x}_1 \right] = C\hat{E}\left[ x_t \mid Y_{t-1}, \hat{x}_1 \right]$$
$$\tilde{y}_t = y_t - \hat{E}\left[ y_t \mid Y_{t-1}, \hat{x}_1 \right]$$

$$\Sigma_{\tilde{x},t+1} = A\Sigma_{\tilde{x},t}A' + GV_1G' - K_t\left( A\Sigma_{\tilde{x},t}C' + GV_3 \right)'$$
$$\Sigma_{\tilde{y},t} = C\Sigma_{\tilde{x},t}C' + V_2$$

Back to the Likelihood

$\tilde{y}_{t+1}$ is normally distributed because

this is a linear model and the underlying shocks are normal
the Kalman filter generates $\tilde{y}_{t+1}$ and $\Sigma_{\tilde{y},t+1}$ (given $\theta$ and the observables, $Y_T$)
given normality, we can calculate the likelihood of $\{\tilde{y}_{t+1}\}$

Kalman Filter versus inversion

with measurement error

you have to use the Kalman filter

without measurement error

you could back out the shocks using the inverse of the policy function
but you could also use the Kalman filter
Dynare always uses the Kalman filter
the hardest part of the Kalman filter is calculating the inverse of $C\Sigma_{\tilde{x},t}C' + V_2$, and this is typically not a difficult inversion.

Log-Likelihood

$$\ln L(Y_T | \theta) = -\frac{1}{2}\left[ n_x\ln(2\pi) + \ln\left|\Sigma_{\tilde{x},0}\right| + \tilde{x}_0'\Sigma_{\tilde{x},0}^{-1}\tilde{x}_0 \right]
-\frac{1}{2}\left[ T n_y\ln(2\pi) + \sum_{t=1}^{T}\left( \ln\left|\Sigma_{\tilde{y},t}\right| + \tilde{y}_t'\Sigma_{\tilde{y},t}^{-1}\tilde{y}_t \right) \right]$$

$n_y$: dimension of $\tilde{y}_t$
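
A sketch (not from the slides) of the prediction-error part of this log-likelihood; the initial-state term is omitted for brevity, and the inputs follow the state-space notation above.

```python
import numpy as np

def kalman_log_likelihood(y_data, x_hat0, Sigma0, A, C, G, V1, V2, V3):
    """Prediction-error decomposition of ln L(Y_T | theta); initial-state term omitted."""
    ny = y_data.shape[1]
    x_hat, Sigma, loglik = np.asarray(x_hat0, float), np.asarray(Sigma0, float), 0.0
    for y_obs in y_data:
        y_tilde = y_obs - C @ x_hat                         # prediction error
        S = C @ Sigma @ C.T + V2                            # Sigma_{y~,t}
        loglik -= 0.5 * (ny * np.log(2 * np.pi) + np.log(np.linalg.det(S))
                         + y_tilde @ np.linalg.solve(S, y_tilde))
        K = (A @ Sigma @ C.T + G @ V3) @ np.linalg.inv(S)   # Kalman gain
        x_hat = A @ x_hat + K @ y_tilde
        Sigma = A @ Sigma @ A.T + G @ V1 @ G.T - K @ (A @ Sigma @ C.T + G @ V3).T
    return loglik
```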

For the neo-classical growth model

Start with $\hat{x}_1 = [k_0, z_0]$, $y_1 = k_0$, and $\Sigma_1$

Calculate

$$\tilde{y}_1 = y_1 - \hat{E}\left[ y_1 \mid \hat{x}_1 \right] = y_1 - C\hat{x}_1$$

Calculate $\hat{E}\left[ x_2 \mid y_1, \hat{x}_1 \right]$ using

$$\hat{E}_t x_{t+1} = A\hat{E}_{t-1} x_t + K_t\tilde{y}_t$$

where

$$K_t = \left( A\Sigma_{\tilde{x},t}C' + GV_3 \right)\left( C\Sigma_{\tilde{x},t}C' + V_2 \right)^{-1}$$

For the neo-classical growth model

Calculate

$$\tilde{y}_2 = y_2 - \hat{E}\left[ y_2 \mid y_1, \hat{x}_1 \right] = y_2 - C\hat{E}\left[ x_2 \mid y_1, \hat{x}_1 \right]$$

etc.

Bayesian Estimation

Conceptually, things are not that different

Bayesian econometrics combines
the likelihood, i.e., the data, with
the prior

You can think of the prior as additional data

Posterior

The joint density of parameters and data is equal to

$$P(Y_T, \theta) = L(Y_T | \theta)P(\theta) \quad\text{or}\quad P(Y_T, \theta) = P(\theta | Y_T)P(Y_T)$$

Posterior
From this we can get Bayes' rule:
$$P(\theta | Y_T) = \frac{L(Y_T | \theta)P(\theta)}{P(Y_T)}$$

Reverend Thomas Bayes (1702-1761)

Posterior

For the distribution of $\theta$, $P(Y_T)$ is just a constant.

Therefore we focus on

$$L(Y_T | \theta)P(\theta) \;\propto\; \frac{L(Y_T | \theta)P(\theta)}{P(Y_T)} = P(\theta | Y_T)$$

One can always make $L(Y_T | \theta)P(\theta)$ a proper density by scaling it so that it integrates to 1

Evaluating the posterior


Calculating the posterior for a given value of $\theta$ is not problematic.
But we are interested in objects of the following form:

$$E\left[ g(\theta) \mid Y_T \right] = \frac{\int g(\theta)P(\theta|Y_T)\,d\theta}{\int P(\theta|Y_T)\,d\theta}$$

Examples
to calculate the mean of $\theta$, let $g(\theta) = \theta$
to calculate the probability that $\theta$ lies in a set $\Theta_0$,
let $g(\theta) = 1$ if $\theta \in \Theta_0$ and
let $g(\theta) = 0$ otherwise
to calculate the posterior of the $j$th element of $\theta$,
$g(\theta) = \theta_j$
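
Once posterior draws are available (for example from the MCMC methods below), these objects are just sample averages. A hedged illustration with placeholder draws:

```python
import numpy as np

# placeholder posterior draws of a 3-dimensional theta (stand-in for real MCMC output)
theta_draws = np.random.default_rng(2).normal(size=(5000, 3))

posterior_mean = theta_draws.mean(axis=0)         # g(theta) = theta
prob_region = np.mean(theta_draws[:, 0] > 0.5)    # g = indicator of an illustrative region
marginal_j = theta_draws[:, 1]                    # draws of the j-th element (here j = 2)
```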

Evaluating the posterior

Even the likelihood can typically only be evaluated numerically

Numerical techniques are also needed to evaluate the posterior

Evaluating the posterior

Standard Monte Carlo integration techniques cannot be used

Reason: we cannot draw random numbers directly from $P(\theta|Y_T)$
being able to calculate $P(\theta|Y_T)$ is not enough to create a random number generator with that distribution

Standard tool: Markov Chain Monte Carlo (MCMC)

Metropolis & Metropolis-Hastings

Metropolis & Metropolis-Hastings are particular versions of the MCMC algorithm

Idea:
travel through the state space of $\theta$
weigh the outcomes appropriately

Metropolis & Metropolis-Hastings

Start with an initial value, $\theta_0$

discard the beginning of the sample, the burn-in phase, to ensure the choice of $\theta_0$ does not matter

Metropolis & Metropolis-Hastings

Subsequent values, $\theta_{i+1}$, are obtained as follows

Draw $\theta^*$ using the "stand-in" density $f(\theta | \theta_i, \gamma_f)$

$\gamma_f$ contains the parameters of $f(\cdot)$

$\theta^*$ is a candidate for $\theta_{i+1}$

$\theta_{i+1} = \theta^*$ with probability $q(\theta^* | \theta_i)$
$\theta_{i+1} = \theta_i$ with probability $1 - q(\theta^* | \theta_i)$

Metropolis & Metropolis-Hastings

Properties of $f(\cdot)$:

$f(\cdot)$ should have fat tails relative to the posterior

that is, $f(\cdot)$ should "cover" $P(\theta|Y_T)$

Metropolis (used in Dynare)

$$q(\theta^* | \theta_i) = \min\left[ 1, \frac{P(\theta^* | Y_T)}{P(\theta_i | Y_T)} \right]$$

$P(\theta^* | Y_T) \geq P(\theta_i | Y_T)$ $\implies$
always include the candidate as the new element
$P(\theta^* | Y_T) < P(\theta_i | Y_T)$ $\implies$
not always included; the lower $P(\theta^* | Y_T)$, the lower the chance it is included

Metropolis-Hastings

$$q(\theta^* | \theta_i) = \min\left[ 1, \frac{P(\theta^* | Y_T)/f(\theta^* | \theta_i, \gamma_f)}{P(\theta_i | Y_T)/f(\theta_i | \theta^*, \gamma_f)} \right]$$

$P(\theta^* | Y_T)/f(\theta^* | \theta_i, \gamma_f)$ high:
probability of $\theta^*$ high & $\theta^*$ should be included with high probability
$P(\theta_i | Y_T)/f(\theta_i | \theta^*, \gamma_f)$ low $\implies$
you should move away from this value $\implies$ $q$ should be high
If $f(\cdot)$ is symmetric (as with a random walk), then the $f(\cdot)$ terms drop out and MH is M.

Choices for f(.)

Random walk MH:

$$\theta^* = \theta_i + \epsilon \quad \text{with } E[\epsilon] = 0$$

and, for example,

$$\epsilon \sim N(0, \sigma_f^2)$$

Independence sampler:

$$f(\theta^* | \theta_i, \gamma_f) = f(\theta^* | \gamma_f)$$
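
A minimal random-walk Metropolis sketch (not from the slides): `log_posterior(theta)` is assumed to return $\ln[L(Y_T|\theta)P(\theta)]$ up to a constant, and `step_cov` is the (illustrative) covariance of the proposal step.

```python
import numpy as np

def random_walk_metropolis(log_posterior, theta0, step_cov, n_draws, burn_in=1000, seed=0):
    """Random-walk Metropolis: theta* = theta_i + eps, eps ~ N(0, step_cov)."""
    rng = np.random.default_rng(seed)
    theta, logp = np.asarray(theta0, float), log_posterior(theta0)
    draws = []
    for i in range(n_draws + burn_in):
        candidate = theta + rng.multivariate_normal(np.zeros(len(theta)), step_cov)
        logp_cand = log_posterior(candidate)
        # accept with prob min(1, P(cand|Y)/P(theta|Y)); f(.) drops out because it is symmetric
        if np.log(rng.uniform()) < logp_cand - logp:
            theta, logp = candidate, logp_cand
        if i >= burn_in:                      # discard the burn-in phase
            draws.append(theta.copy())
    return np.array(draws)
```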

Couple more points

Is the singularity issue different with Bayesian statistics?

Choosing the prior
Gibbs sampler

The singularity problem again

What happens in practice?

lots of observations are available

practitioners don't want to exclude data $\implies$

add "structural" shocks

The singularity problem again

Problem with adding additional shocks

measurement error shocks
not credible that this is the reason for the gap between model and data
structural shocks
good reason, but wrong structural shocks $\implies$ misspecified model

Possible solution to singularity problem?

Today's posterior is tomorrow's prior

Possible solution to singularity problem?

Suppose you want the following:


use 2 observables and
only 1 structural shock

Possible solution to singularity problem?

1. Start with the first prior: $P_1(\theta)$

2. Use the first observable $Y_1^T$ to form the first posterior

$$F_1(\theta) = L(Y_1^T | \theta)P_1(\theta)$$

3. Let the second prior be the first posterior: $P_2(\theta) = F_1(\theta)$

4. Use the second observable $Y_2^T$ to form the second posterior

$$F_2(\theta) = L(Y_2^T | \theta)P_2(\theta)$$

Final answer:

$$F_2(\theta) = L(Y_2^T | \theta)P_2(\theta) = L(Y_2^T | \theta)L(Y_1^T | \theta)P_1(\theta)$$

Obviously:

$$F_2(\theta) = L(Y_2^T | \theta)L(Y_1^T | \theta)P_1(\theta) = L(Y_1^T | \theta)L(Y_2^T | \theta)P_1(\theta)$$

Thus, it does not matter which variable you use first

Properties of the final posterior

The final posterior could very well have multiple modes

this indicates where different variables prefer the parameters to be

This is only informative, not a disadvantage

Have we solved the singularity problem?

Problems of the approach:
The procedure avoids the singularity problem by not considering the joint implications of the two observables
The procedure misses some structural shock/misspecification

Key question:
Is this worse than adding bogus shocks?

How to choose prior

1. Without analyzing the data, sit down and think

problem in macro: we keep on using the same data
so is this science or data mining?

2. Don't change the prior depending on the results


Uninformative prior

$P(\theta) = 1 \;\; \forall\, \theta \in R$ $\implies$ posterior = likelihood

$P(\theta) = 1/(b - a)$ if $\theta \in [a, b]$ is not uninformative
Which one is the least informative prior?

$P(\theta) = 1/(b - a)$ if $\theta \in [a, b]$
$P(\ln\theta) = 1/(\ln b - \ln a)$ if $\ln\theta \in [\ln a, \ln b]$

The objective of Jeffreys' prior is to ensure that the prior is invariant to such reparameterizations

How to choose (not so) informative priors


Let the prior inherit the invariance structure of the problem:

1. location parameter: If $X$ is distributed as $f(x - \theta)$, then $Y = X + c$ has the same distribution but a different location. If the prior has to inherit this property, then it should be uniform.

2. scale parameter: If $X$ is distributed as $(1/\theta)f(x/\theta)$, then $Y = cX$ has the same distribution as $X$ except for a different scale parameter. If the prior has to inherit this property, then it should be of the form
$$P(\theta) = 1/\theta$$

Both are improper priors.

That is, they do not integrate to a finite number.

Not so informative priors

Let the prior be consistent with "total confusion"

3. probability parameter: If $\theta$ is a probability, $\theta \in [0, 1]$, then the prior distribution

$$P(\theta) = \frac{1}{\theta(1 - \theta)}$$

represents total confusion. The idea is that the elements of the prior correspond to different beliefs, and if everybody is given a new piece of info, the cross-section of beliefs would not change.

See the notes by Smith

Gibbs sampler

Objective: Obtain $T$ observations from $p(x_1, \ldots, x_J)$.

Procedure:

1. Start with an initial observation $X^{(0)}$.
2. Draw the period-$t$ observation, $X^{(t)}$, using the following iterative scheme:
draw $x_j^{(t)}$ from the conditional distribution
$$p\left( x_j \,\middle|\, x_1^{(t)}, \ldots, x_{j-1}^{(t)}, x_{j+1}^{(t-1)}, \ldots, x_J^{(t-1)} \right)$$
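
A short sketch of this scheme (not from the slides) for a bivariate standard normal with correlation `rho`, where both conditionals are known in closed form.

```python
import numpy as np

def gibbs_bivariate_normal(rho, n_draws, seed=0):
    """Gibbs sampler: each x_j^(t) is drawn from its conditional given the latest other value."""
    rng = np.random.default_rng(seed)
    x1, x2 = 0.0, 0.0                        # initial observation X^(0)
    cond_sd = np.sqrt(1.0 - rho**2)
    draws = np.empty((n_draws, 2))
    for t in range(n_draws):
        x1 = rng.normal(rho * x2, cond_sd)   # p(x1 | x2^(t-1))
        x2 = rng.normal(rho * x1, cond_sd)   # p(x2 | x1^(t))
        draws[t] = x1, x2
    return draws
```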

Gibbs sampler versus MCMC

The Gibbs sampler does not require a stand-in distribution

The Gibbs sampler still requires the ability to draw from the conditionals
$\implies$ not useful for estimating DSGE models

References

Chib, S. and E. Greenberg, 1995, "Understanding the Metropolis-Hastings Algorithm," The American Statistician.
  describes the basics

Ljungqvist, L. and T.J. Sargent, 2004, Recursive Macroeconomic Theory.
  source for the description of the Kalman filter

Roberts, G.O. and J.S. Rosenthal, 2004, "General state space Markov chains and MCMC algorithms," Probability Surveys.
  a more advanced article describing formal properties

References

Smith, G.P., "Expressing Prior Ignorance of a Probability Parameter," notes, University of Missouri.
  http://www.stats.org.uk/priors/noninformative/Smith.pdf
  on noninformative priors

Syversveen, A.R., 1998, "Noninformative Bayesian Priors: Interpretation and Problems with Construction and Applications."
  http://www.stats.org.uk/priors/noninformative/Syversveen1998.pdf
  on noninformative priors
