Dynamics


Part 6: Dynamic models October 7, 2008

Part VI

Dynamic Econometric Models


Reading: G[19,20], DM[13].

1 Introduction to dynamic models

Many models in economics relate to dynamic behavior:

• Disequilibrium and incomplete adjustment.

• Capital and resource stock utilization and depreciation.

• Whether or not our initial theory suggests dynamics, time series data often hint at dynamic interactions across time periods.

This large section of the course deals with models in which adjacent observations are related in one way or another. The most prominent application of this form of model is to time-series data, although spatial data can have similar characteristics. We will focus on data with temporal relationships.

1.1 General form of a linear dynamic model

A general representation of such a model might be

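$$y_t = \mu + \sum_{i=1}^{p} \gamma_i y_{t-i} + \sum_{j=0}^{r} \beta_j x_{t-j} + \varepsilon_t,$$

where $\varepsilon_t$ might be related to past values of $\varepsilon$, such that

$$\varepsilon_t = \rho_1 \varepsilon_{t-1} + \rho_2 \varepsilon_{t-2} + \cdots + \rho_g \varepsilon_{t-g} + u_t \quad \text{(serial correlation), or}$$

$$\varepsilon_t = u_t + \theta_1 u_{t-1} + \cdots + \theta_q u_{t-q} \quad \text{(moving average error).}$$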

Page 73 — WSU Econometrics II 2007,


c Jonathan Yoder. All rights reserved

We begin by looking closely at univariate time series, where $y_t$ is modeled as a function of only past values of itself. Later we examine Autoregressive Distributed Lag Models with other regressors $x_t, \ldots, x_{t-r}$.

1.2 Necessary conditions for consistency of OLS

1. $E[\varepsilon_t \mid x_{t-s}] = 0 \;\; \forall \; s \ge 0$. This implies that $\varepsilon_t$ contains only new information at $t$: no AR or MA errors. [Note: here $x$ denotes all right-hand-side variables, including lagged dependent variables.]

2. A regularity condition: $\operatorname{plim} \frac{1}{T-s} \sum_{t=s+1}^{T} x_t x_{t-s}' = Q(s)$, finite.

3. (Required for univariate series) $y_t$ is “stable”:

(a) Stationary: for a model $y_t = \mu + \sum_{i=1}^{p} \gamma_i y_{t-i} + \varepsilon_t$, the roots $z$ of the polynomial $1 - \gamma_1 z - \gamma_2 z^2 - \cdots - \gamma_p z^p = 0$ lie outside the unit circle.

(b) Ergodic: events are asymptotically independent; the impact of one on the other weakens with temporal distance.

4. (Required for ARDL models) The regression relationship between $y$ and its regressors $x$ is “stable” over time: the series are “cointegrated” (note that each series $y$, $x$ need not be stationary, just the relationship between them).

5. Asymptotic normality holds with some additional assumptions called the Mann and Wold conditions, so that $b \overset{a}{\sim} N\left[\beta, \sigma^2 Q(s)^{-1}\right]$.




To summarize: for OLS to be consistent,

• with one or more lags of the dependent variable on the right-hand side, errors must contain only new information (no AR or MA processes).

• You need either stationary series, or a stationary relationship among nonstationary series.


When one or more of these conditions does not hold, a number of approaches are used for consistent parameter estimation, including pre-estimation data differencing, instrumental variables methods, and/or conditional Maximum Likelihood or Nonlinear Least Squares.

2 Univariate time series models

Reading: G[20], DM[13], K[18]

2.1 Lag operators

The lag operator $L$ is defined such that

$$L x_t = x_{t-1}$$

$$L(L x_t) = L^2 x_t = x_{t-2}$$

$$L^p x_t = x_{t-p}$$

$$L^q (L^p x_t) = L^{p+q} x_t = x_{t-p-q}$$

In some cases we want to work in changes, or differences of data across observations:

$$(1 - L)x_t = x_t - x_{t-1} = \Delta x_t \quad \text{(first difference)}$$

$$(1 - L)(1 - L)x_t = (1 - L)(x_t - x_{t-1}) = (x_t - x_{t-1}) - (x_{t-1} - x_{t-2}) = \Delta^2 x_t \quad \text{(second difference)}$$

Now consider the infinite series (which converges when $|a| < 1$)

$$A(L) = 1 + aL + (aL)^2 + (aL)^3 + \cdots = \sum_{i=0}^{\infty} (aL)^i = \frac{1}{1 - aL}.$$


This implies, for example, that

$$\frac{x_t}{1 - aL} = \sum_{i=0}^{\infty} (aL)^i x_t = \sum_{i=0}^{\infty} a^i x_{t-i}.$$

This is an infinite series of lags on $x_t$, but with only one associated parameter, $a$.
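The lag-operator algebra above can be checked numerically. The following is a minimal sketch, assuming a finite sample with all pre-sample values set to zero (an assumption of this example, not something stated in the notes); the function names are illustrative. It applies $1/(1-aL)$ both by recursion and by the explicit sum, and uses `np.diff` for $(1-L)x_t$.

```python
import numpy as np

def geometric_lag_filter(x, a):
    """Apply 1/(1 - aL) to x via z_t = x_t + a*z_{t-1}, with z before the sample = 0."""
    z = np.empty(len(x), dtype=float)
    prev = 0.0
    for t, xt in enumerate(x):
        prev = xt + a * prev
        z[t] = prev
    return z

def geometric_lag_sum(x, a):
    """Same filter via the explicit sum over i of a^i * x_{t-i}, truncated at the sample start."""
    x = np.asarray(x, dtype=float)
    return np.array([sum(a**i * x[t - i] for i in range(t + 1))
                     for t in range(len(x))])

x = np.array([1.0, 2.0, 0.0, -1.0, 3.0])
z_rec = geometric_lag_filter(x, 0.5)   # recursion form
z_sum = geometric_lag_sum(x, 0.5)      # summation form
dx = np.diff(x)                        # (1 - L) x_t, first differences
```

Both forms give the same series, which is the point of the identity: one parameter $a$ generates the whole lag distribution.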

1. This section focuses on the univariate model

$$C(L)y_t = \mu + R(L)\varepsilon_t,$$

which is called an Autoregressive Moving Average model, or ARMA(p,q), where p is the order of $C(L)$ and q is the order of $R(L)$.

2. An Autoregressive Integrated Moving Average, or ARIMA(p,d,q), model corresponds to data that must be differenced d times to ensure stationarity, which is a necessary condition for univariate estimation.

3. When a moving average disturbance process exists, nonlinear least squares or Maximum Likelihood conditional on the initial observations is a consistent estimator of the ARIMA parameters.

4. We will define autocorrelation functions and partial autocorrelation functions. The empirical counterparts to these will help us to identify the dynamic characteristics of a data series.

2.2 Stationarity and Invertibility

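$$C(L)y_t = \mu + R(L)\varepsilon_t,$$

where $C(L)$ represents an AR process in $y$ and $R(L)$ represents a moving average process in $\varepsilon$, such as

$$R(L)\varepsilon_t = (1 - \theta_1 L - \theta_2 L^2)\varepsilon_t = \varepsilon_t - \theta_1 \varepsilon_{t-1} - \theta_2 \varepsilon_{t-2}.$$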


For estimation of ARIMA models, we require two things:

1. (Weak) stationarity of $y$, for a finite and stable asymptotic variance:

(a) $E[y_t]$ is independent of $t$.

(b) $\operatorname{Var}[y_t]$ is finite and independent of $t$.

(c) $\operatorname{Cov}[y_t, y_s]$ is a function of $t - s$, not of $t$.

2. Weak stationarity is satisfied if the root(s) $z$ of $C(z) = 0$ lie outside the unit circle. For example,

(a) AR(1) process in $y$: the characteristic equation is $C(z) = 1 - \gamma z = 0$. For stationarity, the root $z = 1/\gamma$ must be larger than one in absolute value, that is, $|\gamma| < 1$.

(b) AR(2) in $y$: $C(z) = 1 - \gamma_1 z - \gamma_2 z^2 = 0$. The series is covariance stationary if $|\gamma_2| < 1$, $\gamma_1 + \gamma_2 < 1$, and $\gamma_2 - \gamma_1 < 1$.

3. Invertibility of $R(L)$. Invertibility requires the root(s) of $R(z) = 0$ to lie outside the unit circle (same requirement as above for $C(z)$).

4. We need invertibility so that we can define the disturbances as an autoregressive process of $y$:

$$\varepsilon_t = D(L)y_t - \frac{\mu}{R(L)}, \quad \text{where } D(L) = \frac{C(L)}{R(L)}.$$
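The root condition can be checked numerically. Below is a minimal sketch using `numpy.roots`; the function names are illustrative, and the same check applies to $R(z)$ for invertibility if the MA coefficients are passed instead.

```python
import numpy as np

def ar_roots(gammas):
    """Roots z of 1 - g1*z - g2*z^2 - ... - gp*z^p = 0."""
    gammas = list(gammas)
    # np.roots expects coefficients ordered from the highest power down to the constant.
    coeffs = [-g for g in gammas[::-1]] + [1.0]
    return np.roots(coeffs)

def is_stationary(gammas):
    """True if every root lies strictly outside the unit circle."""
    return bool(np.all(np.abs(ar_roots(gammas)) > 1.0))

def ar2_triangle(g1, g2):
    """The three AR(2) inequalities stated in the text."""
    return abs(g2) < 1 and g1 + g2 < 1 and g2 - g1 < 1

checks = {
    "AR(1), g=0.5": is_stationary([0.5]),
    "AR(1), g=1.0 (unit root)": is_stationary([1.0]),
    "AR(2), (0.5, 0.3)": is_stationary([0.5, 0.3]),
    "AR(2), (0.5, 0.6)": is_stationary([0.5, 0.6]),
}
```

For the AR(2) cases the root check and the three inequalities agree, as they should.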

2.3 Nonstationarity and Integrated series

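Consider the following time series specifications:

Random walk: $y_t = \mu + \varepsilon_t$, $\varepsilon_t = \varepsilon_{t-1} + u_t$, or equivalently $y_t = y_{t-1} + u_t$

Random walk with drift: $y_t = \mu + y_{t-1} + u_t$

Trend stationary: $y_t = \mu + \beta t + u_t$

Each of these can be characterized as $(1 - L)y_t = \alpha + \varepsilon_t$, where $\varepsilon_t$ is a stationary disturbance (white noise in the two random walk cases).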


For example, consider the trend stationary case. Take first differences:

$$\Delta y_t = (1 - L)y_t = (\mu + \beta t + u_t) - (\mu + \beta(t - 1) + u_{t-1}) = \beta + \varepsilon_t,$$

where $\alpha = \beta$ and $\varepsilon_t = \Delta u_t = u_t - u_{t-1}$ is stationary, but it is a non-invertible moving average error, MA(1), rather than white noise. In any of these cases, the root of the characteristic equation for $(1 - L)$ equals one (a unit root), so $y$ is a nonstationary series.

2.4 Consequences of nonstationarity

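Non-constant variances: consider the random walk, starting from period 1.

$$y_1 = y_0 + u_1$$

$$y_2 = (y_0 + u_1) + u_2$$

$$\vdots$$

$$y_t = y_0 + \sum_{s=1}^{t} u_s.$$

$$E[y_t] = y_0 + \sum_{s=1}^{t} E[u_s] = y_0; \qquad \operatorname{Var}[y_t] = \sum_{s=1}^{t} \sigma_u^2 = t\sigma_u^2.$$

The variance therefore grows linearly with $t$.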

Spurious inference: suppose $y_t = \gamma y_{t-1} + \varepsilon_t$, and $\gamma$ does in fact equal 1. It has been shown that:

1. In finite samples, $\hat{\gamma}_{OLS}$ is biased downward (away from 1).

2. $\hat{\gamma}_{OLS}$ converges to its probability limit faster than in the stationary case and does not have the usual limiting distribution, so we reject the null hypothesis $\gamma = 1$ too often with standard $t$ and $z$ tests.


2.4.1 Integrated processes and differencing

• Take a random walk with drift, and substitute lags back to infinity, to get

$$y_t = \mu + y_{t-1} + \varepsilon_t = \sum_{i=0}^{\infty} (\mu + \varepsilon_{t-i}).$$

The expected value and variance grow without bound as $t$ grows.

• Now take the first difference:

$$y_t - y_{t-1} = \sum_{i=0}^{\infty} (\mu + \varepsilon_{t-i}) - \sum_{i=1}^{\infty} (\mu + \varepsilon_{t-i})$$

$$\Delta y_t = \mu + \varepsilon_t,$$

which is a white noise process around the constant $\mu$.

• Note that a unit root implies that past levels of $y$ do not provide any information for explaining the change in $y$ at $t$.

• The series $y_t$ is integrated of order one, or I(1), because the first difference of $y_t$ is stationary.

• A series that becomes stationary after d differences is I(d): integrated of order d.

• For an individual (univariate) time series, the solution to the estimation problems associated with a unit root is to difference the data until it is I(0). Then proceed with estimation.

• For two or more related series in a regression, differencing of the data is not necessary for consistent parameter estimation if the series are cointegrated. More on this later.

Now consider how to test for nonstationarity of a single series.

Now consider how to test for nonstationarity of a single series.


2.4.2 Dickey-Fuller stationarity tests

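Consider a univariate autoregressive relationship that nests the possibility of a random walk, a random walk with drift, and a trend stationary series:

$$(1 - \gamma L)(y_t - \alpha - \beta t) = \varepsilon_t$$

$$(1 - \gamma L)y_t - (1 - \gamma)\alpha - (1 - \gamma L)\beta t = \varepsilon_t$$

Note that $(1 - \gamma L)\beta t = \beta(t - \gamma(t - 1)) = \beta(1 - \gamma)t + \beta\gamma$, so

$$y_t = [\alpha(1 - \gamma) + \beta\gamma] + \beta(1 - \gamma)t + \gamma y_{t-1} + \varepsilon_t.$$

A more convenient form is generated by subtracting $y_{t-1}$ from both sides:

$$\Delta y_t = [\alpha(1 - \gamma) + \beta\gamma] + \beta(1 - \gamma)t + (\gamma - 1)y_{t-1} + \varepsilon_t.$$

Define $\gamma^* = \gamma - 1$, so that $1 - \gamma = -\gamma^*$ and $\gamma = \gamma^* + 1$:

$$\Delta y_t = [-\alpha\gamma^* + \beta(\gamma^* + 1)] - \beta\gamma^* t + \gamma^* y_{t-1} + \varepsilon_t.$$

Given a unit root, $\gamma^* = 0$, and the above equation collapses to

$$\Delta y_t = \beta + \varepsilon_t,$$

which is stationary as a first difference.

The three Dickey-Fuller test regressions are:

Constant and trend: $\Delta y_t = \mu + \beta t + \gamma^* y_{t-1} + \varepsilon_t$

Constant only: $\Delta y_t = \mu + \gamma^* y_{t-1} + \varepsilon_t$

No constant, no trend: $\Delta y_t = \gamma^* y_{t-1} + \varepsilon_t$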


Figure 1: DF and ADF critical Values

• The null hypothesis is that $\gamma^* = 0$, or that there IS a unit root.

• The alternative is one-sided, $\gamma^* < 0$ (stationarity), so the parameter space of interest is $\gamma^* \le 0$.

• The test statistic is calculated as the usual t-statistic, $\hat{\gamma}^* / se(\hat{\gamma}^*)$, but it does not have a standard t distribution. The statistic is compared to specific distributions developed by Dickey and Fuller and later improved by MacKinnon.

• Table 20.5 (Greene) provides three sets of critical values. If you KNOW your model doesn't include a constant and/or a trend, then the appropriate critical values will provide a more powerful test (reduce the chances of a type 2 error: failure to reject $H_0$ when $H_0$ is false).

• If $|\hat{\gamma}^* / se(\hat{\gamma}^*)| < |\text{critical value}|$, then fail to reject the null, so difference or de-trend the data. Otherwise proceed with estimation on the original (nondifferenced) data.
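As an illustration of the mechanics, here is a minimal sketch that computes the Dickey-Fuller t-ratio by OLS with NumPy. The function name and the simulated series are assumptions of this example. It returns only the statistic: the critical values must still come from the Dickey-Fuller or MacKinnon tables, not from the standard t table.

```python
import numpy as np

def dickey_fuller_stat(y, trend=False):
    """t-ratio on y_{t-1} in a regression of the first difference on a constant,
    an optional linear trend, and y_{t-1}."""
    y = np.asarray(y, dtype=float)
    dy = np.diff(y)                    # first differences, t = 2..T
    ylag = y[:-1]                      # y_{t-1}
    cols = [np.ones_like(ylag)]
    if trend:
        cols.append(np.arange(1, len(ylag) + 1, dtype=float))
    cols.append(ylag)
    X = np.column_stack(cols)
    b, *_ = np.linalg.lstsq(X, dy, rcond=None)
    resid = dy - X @ b
    s2 = resid @ resid / (len(dy) - X.shape[1])
    cov = s2 * np.linalg.inv(X.T @ X)
    return b[-1] / np.sqrt(cov[-1, -1])  # gamma*-hat / se(gamma*-hat)

# Simulated illustration: a random walk versus a stationary AR(1) with gamma = 0.5.
rng = np.random.default_rng(0)
u = rng.standard_normal(500)
y_rw = np.cumsum(u)
y_ar = np.empty(500)
prev = 0.0
for t in range(500):
    prev = 0.5 * prev + u[t]
    y_ar[t] = prev

stat_rw = dickey_fuller_stat(y_rw)
stat_ar = dickey_fuller_stat(y_ar)
```

The stationary series should produce a far more negative statistic than the random walk, which is what leads to rejection of the unit-root null.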


2.4.3 DF tests for AR(p) processes

Same concept, but add lagged differences (out to $p - 1$) on the right-hand side:

$$\Delta y_t = \mu + \beta t + \gamma^* y_{t-1} + \phi_1 \Delta y_{t-1} + \phi_2 \Delta y_{t-2} + \cdots + \phi_{p-1} \Delta y_{t-(p-1)} + \varepsilon_t,$$

where $\Delta y_{t-i}$ is the $i$th lag of the first difference of the dependent variable. The Augmented Dickey-Fuller (ADF) test is carried out by testing $\beta = \gamma^* = 0$. Compare to the ADF critical values.

Once we have a stationary series, we can begin to examine the data series model more closely and (ultimately) consistently estimate the parameters of a model.

2.5 Autocovariances and Autocorrelations

Our goal is to characterize the relationship between $y_t$ and lags of $y_t$. The estimated counterparts to various autocorrelation functions help us do that, which will in turn provide guidance for how to estimate the dynamic structure of a time series.

Notation:

• $\lambda_k = \operatorname{Cov}[y_t, y_{t-k}]$ is the autocovariance coefficient between $y_t$ and $y_{t-k}$. $\lambda_0 \equiv \operatorname{Var}[y_t]$.

• The autocovariance function is the (possibly infinite) series of covariances $\lambda_0, \lambda_1, \lambda_2, \ldots$.

• The autocorrelation coefficient is $\rho_k = \dfrac{\operatorname{Cov}[y_t, y_{t-k}]}{\sqrt{\operatorname{Var}(y_t)}\sqrt{\operatorname{Var}(y_{t-k})}} = \dfrac{\lambda_k}{\lambda_0}$.

• The autocorrelation function (ACF) is the autocovariance function divided through by $\lambda_0$.


For estimation we will work mainly with the ACF and the Partial Autocorrelation function (PACF), to be defined later, but we need the autocovariances to calculate the ACFs.

2.5.1 Variance of y

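In general, $C(L)y_t = \mu + R(L)\varepsilon_t$, so

$$y_t = \frac{\mu}{C(1)} + A(L)\varepsilon_t, \quad \text{where } A(L) = R(L)/C(L),$$

$$y_t = \frac{\mu}{C(1)} + \sum_{i=0}^{\infty} \alpha_i \varepsilon_{t-i}.$$

Then

$$\lambda_0 = \operatorname{Var}[y_t] = \sum_{i=0}^{\infty} \alpha_i^2 \sigma_\varepsilon^2.$$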

2.5.2 Example: ACF for AR(2)

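For $y_t = \gamma_1 y_{t-1} + \gamma_2 y_{t-2} + \varepsilon_t$, the ACF is a function of $\gamma_1$, $\gamma_2$, and $\sigma_\varepsilon^2$. Start by multiplying both sides by $y_t$ and taking expectations:

$$\lambda_0 = E[y_t y_t] = E[y_t(\gamma_1 y_{t-1} + \gamma_2 y_{t-2} + \varepsilon_t)] = \gamma_1 \lambda_1 + \gamma_2 \lambda_2 + E[y_t \varepsilon_t].$$

$$E[y_t \varepsilon_t] = E[(\gamma_1 y_{t-1} + \gamma_2 y_{t-2} + \varepsilon_t)\varepsilon_t] = E[\varepsilon_t \varepsilon_t] = \sigma_\varepsilon^2, \text{ so}$$

$$\lambda_0 = \gamma_1 \lambda_1 + \gamma_2 \lambda_2 + \sigma_\varepsilon^2.$$

Similarly,

$$\lambda_1 = E[y_t y_{t-1}] = E[y_{t-1}(\gamma_1 y_{t-1} + \gamma_2 y_{t-2} + \varepsilon_t)] = \gamma_1 \lambda_0 + \gamma_2 \lambda_1$$

$$\lambda_2 = E[y_t y_{t-2}] = E[y_{t-2}(\gamma_1 y_{t-1} + \gamma_2 y_{t-2} + \varepsilon_t)] = \gamma_1 \lambda_1 + \gamma_2 \lambda_0$$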


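Summary of three equations in three unknowns $(\lambda_0, \lambda_1, \lambda_2)$:

$$\lambda_0 = \gamma_1 \lambda_1 + \gamma_2 \lambda_2 + \sigma_\varepsilon^2$$

$$\lambda_1 = \gamma_1 \lambda_0 + \gamma_2 \lambda_1$$

$$\lambda_2 = \gamma_1 \lambda_1 + \gamma_2 \lambda_0$$

Solve for $\lambda_0$ to get

$$\lambda_0 = \sigma_y^2 = \frac{\sigma_\varepsilon^2}{1 - \gamma_1^2 (1 + \gamma_2)(1 - \gamma_2)^{-1} - \gamma_2^2},$$

then plug this into the formulas for $\lambda_1$ and $\lambda_2$ to get the autocovariances. The autocorrelation coefficients for the first two lags are

$$\rho_1 = \frac{\lambda_1}{\lambda_0} = \gamma_1 + \gamma_2 \rho_1 \implies \rho_1 = \frac{\gamma_1}{1 - \gamma_2}$$

$$\rho_2 = \frac{\lambda_2}{\lambda_0} = \gamma_1 \rho_1 + \gamma_2 \implies \rho_2 = \frac{\gamma_1^2}{1 - \gamma_2} + \gamma_2$$

Generally, for $k \ge 2$,

$$\lambda_k = E[y_t y_{t-k}] = E[y_{t-k}(\gamma_1 y_{t-1} + \gamma_2 y_{t-2} + \varepsilon_t)] = \gamma_1 \lambda_{k-1} + \gamma_2 \lambda_{k-2},$$

and the ACF for an AR(2) is $\rho_k = \gamma_1 \rho_{k-1} + \gamma_2 \rho_{k-2}$.

Exercise: What's the ACF for an AR(1)?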
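The three-equation system can also be solved numerically, which gives a quick check on the algebra. This is a minimal sketch; the function names and the parameter values $\gamma_1 = 0.5$, $\gamma_2 = 0.3$ are assumptions chosen for illustration.

```python
import numpy as np

def ar2_autocovariances(g1, g2, sigma2=1.0):
    """Solve the three AR(2) equations for (lambda_0, lambda_1, lambda_2)."""
    # Rows are the three equations rearranged with all unknowns on the left:
    #   lam0 - g1*lam1 - g2*lam2 = sigma2
    #  -g1*lam0 + (1 - g2)*lam1  = 0
    #  -g2*lam0 - g1*lam1 + lam2 = 0
    A = np.array([[1.0, -g1, -g2],
                  [-g1, 1.0 - g2, 0.0],
                  [-g2, -g1, 1.0]])
    b = np.array([sigma2, 0.0, 0.0])
    return np.linalg.solve(A, b)

def ar2_acf(g1, g2, nlags):
    """rho_0..rho_nlags from rho_k = g1*rho_{k-1} + g2*rho_{k-2}."""
    rho = np.empty(nlags + 1)
    rho[0] = 1.0
    if nlags >= 1:
        rho[1] = g1 / (1.0 - g2)
    for k in range(2, nlags + 1):
        rho[k] = g1 * rho[k - 1] + g2 * rho[k - 2]
    return rho

lam = ar2_autocovariances(0.5, 0.3)
rho = ar2_acf(0.5, 0.3, nlags=10)
```

The ratios `lam[1] / lam[0]` and `lam[2] / lam[0]` reproduce $\rho_1$ and $\rho_2$ from the closed forms, and the recursion shows the ACF dampening toward zero.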

2.5.3 ACF for MA(q)

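$$y_t = \varepsilon_t - \theta_1 \varepsilon_{t-1} - \cdots - \theta_q \varepsilon_{t-q}$$

$$\lambda_0 = E[y_t^2] = \sigma_\varepsilon^2 \left(1 + \sum_{i=1}^{q} \theta_i^2\right)$$

$$\lambda_k = \operatorname{Cov}[y_t, y_{t-k}] = \sigma_\varepsilon^2 \left(-\theta_k + \sum_{i=k+1}^{q} \theta_{i-k}\theta_i\right) \text{ for } 1 \le k \le q, \qquad \lambda_k = 0 \text{ for } k > q.$$

For $k = 1$ this is $\lambda_1 = \sigma_\varepsilon^2 \left(-\theta_1 + \sum_{i=2}^{q} \theta_{i-1}\theta_i\right)$.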


NOTE: the ACF for an MA(q) drops to zero abruptly after lag q.

Exercises: Derive the ACF for an MA(1), then for an ARMA(1,1): $(1 - \gamma L)y_t = (1 - \theta L)\varepsilon_t$.
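The MA(q) autocovariances are sums of products of the moving average coefficients, which is easy to verify numerically. A minimal sketch, assuming the sign convention $y_t = \varepsilon_t - \theta_1\varepsilon_{t-1} - \cdots - \theta_q\varepsilon_{t-q}$ used above; the function name is illustrative.

```python
import numpy as np

def ma_autocovariances(thetas, sigma2=1.0):
    """lambda_0..lambda_q for y_t = eps_t - theta_1*eps_{t-1} - ... - theta_q*eps_{t-q}."""
    # c[i] is the coefficient on eps_{t-i}: c_0 = 1, c_i = -theta_i.
    c = np.concatenate(([1.0], -np.asarray(thetas, dtype=float)))
    q = len(c) - 1
    # lambda_k = sigma2 * sum over i = k..q of c_i * c_{i-k}; zero beyond lag q.
    return np.array([sigma2 * np.sum(c[k:] * c[:q + 1 - k]) for k in range(q + 1)])

lam_ma1 = ma_autocovariances([0.5])        # MA(1) with theta = 0.5
rho1_ma1 = lam_ma1[1] / lam_ma1[0]         # equals -theta / (1 + theta^2)
lam_ma2 = ma_autocovariances([0.5, 0.2])   # MA(2)
```

Beyond lag q no disturbance is shared between $y_t$ and $y_{t-k}$, which is why the ACF cuts off.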

2.6 Partial Autocorrelation coefficients

The autocorrelation $\rho_k$ is the gross correlation between $y_t$ and $y_{t-k}$. The partial autocorrelation, $\rho_k^*$, is the simple linear correlation between $y_t$ and $y_{t-k}$ after accounting for the effects of the intervening lags.

E.g., for an AR(1), $y_t = \gamma y_{t-1} + \varepsilon_t$, the correlation coefficient at lag 2 is $\rho_2 = \operatorname{corr}(y_t, y_{t-2}) = \gamma^2$. If we remove the effect of $y_{t-1}$ on $y_t$, then $y_{t-2}$ has no remaining effect on $y_t$, so $\rho_2^* = 0$.

$\rho_k^*$ can be calculated as the $k$th parameter estimate in an AR(k) regression:

$$y_t = \rho_1^* y_{t-1} + \varepsilon_t$$

$$y_t = \beta_1 y_{t-1} + \rho_2^* y_{t-2} + \varepsilon_t$$

$$y_t = \beta_1 y_{t-1} + \beta_2 y_{t-2} + \rho_3^* y_{t-3} + \varepsilon_t$$

etc.

• $\rho_1 = \rho_1^*$ for any process.

• For an ARMA(p,0) process, $\rho_k^* = 0$ for $k > p$.


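PACF for ARMA(0,q): $y_t = \mu + R(L)\varepsilon_t$, and with invertibility $R(L)^{-1}(y_t - \mu) = \varepsilon_t$. This implies:

$$y_t = \frac{\mu}{R(1)} + \sum_{i=1}^{\infty} \pi_i y_{t-i} + \varepsilon_t,$$

which is an AR(∞). (Because $\mu$ is a constant, $\mu / R(L) = \mu / R(1)$.) Given invertibility, the $\pi_i$ will tend to dampen as $i$ becomes larger.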

2.7 Sample counterpart to the ACF, PACF

The sample ACF is often called the autocorrelogram (or simply the correlogram), and is calculated just as any standard correlation coefficient. Given that $y$ is measured in deviations from its mean $\bar{y}$,

$$r_k = \frac{\sum_{t=k+1}^{T} y_t y_{t-k}}{\sum_{t=1}^{T} y_t^2}.$$

The sample PACF, or the partial autocorrelogram, could be estimated based on the progressive regressions above, but usually it is estimated in the following way: regress $y_t$ and $y_{t-k}$ (separately) on $[1, y_{t-1}, \ldots, y_{t-(k-1)}]$, and calculate the residuals $y_t^*$ and $y_{t-k}^*$ from these regressions. Then

$$r_k^* = \frac{\sum_{t=k+1}^{T} y_t^* y_{t-k}^*}{\sum_{t=k+1}^{T} (y_{t-k}^*)^2}.$$
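Both sample functions are short to compute directly. The sketch below follows the two formulas above using NumPy least squares; the function names are illustrative. By the Frisch-Waugh result, the residual-based $r_k^*$ equals the coefficient on $y_{t-k}$ in a regression of $y_t$ on a constant and lags 1 through k over the same observations, which ties it back to the progressive regressions.

```python
import numpy as np

def sample_acf(y, nlags):
    """r_0..r_nlags, with y taken in deviations from its sample mean."""
    d = np.asarray(y, dtype=float) - np.mean(y)
    denom = d @ d
    return np.array([1.0] + [d[k:] @ d[:-k] / denom for k in range(1, nlags + 1)])

def _residuals(target, X):
    """Residuals from an OLS regression of target on the columns of X."""
    b, *_ = np.linalg.lstsq(X, target, rcond=None)
    return target - X @ b

def sample_pacf(y, k):
    """r*_k: partial out a constant and lags 1..k-1 from both y_t and y_{t-k}."""
    y = np.asarray(y, dtype=float)
    T = len(y)
    y_t = y[k:]                 # y_t for the usable observations
    y_lag_k = y[:T - k]         # y_{t-k}, aligned with y_t
    X = np.column_stack([np.ones(T - k)] +
                        [y[k - j:T - j] for j in range(1, k)])  # intervening lags
    e_t = _residuals(y_t, X)
    e_k = _residuals(y_lag_k, X)
    return (e_t @ e_k) / (e_k @ e_k)

r = sample_acf([1, 2, 3, 4, 5], nlags=2)
```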

Seasonality in ARMA models: The basics are a straightforward extension of basic ARMA processes. We will not discuss this in class, but you should understand how to implement seasonal differences and seasonal lags in an ARIMA framework. Refer to Greene section 18.3.5 (very short).


2.7.1 Summary of the relationship between ACF and PACF

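Process       ACF                          PACF

AR(p)         Infinite (dampens)           Finite (zero after lag p)

MA(q)         Finite (zero after lag q)    Infinite (dampens)

ARMA(p,q)     Infinite (dampens)           Infinite (dampens)

Example: AR(1), MA(1)

          AR(1): $y_t = \gamma y_{t-1} + \varepsilon_t$          MA(1): $y_t = \varepsilon_t - \theta\varepsilon_{t-1}$

ACF       $\rho_k = \gamma^k$                                    $\rho_1 = -\theta/(1 + \theta^2)$; $\rho_k = 0$ for $k > 1$

PACF      $\rho_1^* = \gamma$; $\rho_k^* = 0$ for $k > 1$        $\rho_k^*$ dampens geometrically, on the order of $\theta^k$ in magnitude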

2.8 Modeling procedure for ARIMA models

The Wold Decomposition theorem states that every zero-mean covariance stationary series $C(L)y_t = \mu + R(L)\varepsilon_t$ can be represented as

$$y_t = \sum_{i=1}^{p} \gamma_i y_{t-i} + \sum_{i=0}^{\infty} \pi_i \varepsilon_{t-i}.$$

We cannot estimate the infinite series of $\pi_i$, so we compromise and choose an AR(p) and an MA process with finite q to best fit the data.

The Box-Jenkins approach to ARIMA(p,d,q) modeling is often broken down into 3 steps:

• Identification, which refers to identification of the best lag structure (p,d,q) for the series.

1. First determine the need to difference or de-trend using Dickey-Fuller tests or related tests; then difference the data for stationarity.

2. Given a stationary series, we use estimated autocorrelation functions and tests for white noise (i.i.d.) errors to determine the appropriate lag structure.


• Estimation, which refers to the process of estimating the parameters based on either OLS or, if necessary, conditional NLS or ML methods.

• Forecasting based on the estimated parameters and their estimated variances.

NOTE: this class will only scratch the surface of ARIMA modeling.

Take STAT 516 for a more complete treatment.

2.8.1 Identification and testing for white noise

The goal of the Box-Jenkins approach is to specify the most parsimonious model that provides you with white noise errors.

Once you have a stationary series, you can test for non-zero ACFs and PACFs. Sample ACFs and PACFs will be approximately $N(0, 1/T)$ under the null hypothesis of white noise. A test statistic for the joint test of whether all elements of the ACF (PACF) are zero up to lag p is the Ljung-Box Q statistic:

$$Q' = T(T + 2) \sum_{k=1}^{p} \frac{r_k^2}{T - k},$$

which is asymptotically $\chi^2(p)$ under the null hypothesis of white noise.

Process of model identification for a stationary series:

1. Generate the autocorrelogram and partial autocorrelogram for the series. If, for any lag, the Q statistic leads to rejection of the null hypothesis of white noise in the ACF and/or PACF, then add AR or MA model components based on the structure of the ACF and PACF.

2. Test for white noise in the residuals of the ARIMA model you have specified. If tests indicate white noise for all lags in the correlograms, then stop. Otherwise, respecify the model and check the residuals again for white noise.

3. The Akaike and Schwarz information criteria can also be used to help select a model specification.
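The Ljung-Box statistic is simple to compute from the sample autocorrelations. A minimal sketch follows, with an illustrative function name and a tiny made-up series so the arithmetic can be followed by hand; the resulting value would be compared with a $\chi^2(p)$ critical value from a table.

```python
import numpy as np

def ljung_box_q(y, p):
    """Ljung-Box Q' = T(T+2) * sum_{k=1}^{p} r_k^2 / (T - k)."""
    d = np.asarray(y, dtype=float) - np.mean(y)   # deviations from the mean
    T = len(d)
    denom = d @ d
    lags = np.arange(1, p + 1)
    r = np.array([d[k:] @ d[:-k] / denom for k in lags])  # sample autocorrelations
    return T * (T + 2) * np.sum(r**2 / (T - lags))

# Hand-checkable case: y = 1..5 has r_1 = 0.4 and r_2 = -0.1.
q1 = ljung_box_q([1, 2, 3, 4, 5], p=1)   # 5 * 7 * 0.16 / 4
q2 = ljung_box_q([1, 2, 3, 4, 5], p=2)
```

In the identification loop the same function would be applied first to the series and then to the residuals of each candidate model.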


2.8.2 Estimation of ARIMA models

The Wold Decomposition theorem states that every zero-mean covariance stationary series $C(L)y_t = \mu + R(L)\varepsilon_t$ can be represented as

$$y_t = \sum_{i=1}^{p} \gamma_i y_{t-i} + \sum_{i=0}^{\infty} \pi_i \varepsilon_{t-i}.$$

If we are able to specify our model such that $\pi_i = 0$ for all $i \ge 1$ (that is, no MA disturbance process), then the model

$$y_t = \sum_{i=1}^{p} \gamma_i y_{t-i} + \varepsilon_t, \quad \varepsilon_t \sim \text{white noise}$$

could be estimated consistently with OLS. If MA errors persist, nonlinear methods are required.

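Consider a simple model with MA(1) errors, $y_t = \mu + u_t$ with $u_t = \varepsilon_t - \theta\varepsilon_{t-1}$, and start at $t = 1$.

$$y_1 = \mu + \varepsilon_1 - \theta\varepsilon_0 \;\rightarrow\; \varepsilon_1 = y_1 - \mu + \theta\varepsilon_0$$

$$y_2 = \mu + \varepsilon_2 - \theta\varepsilon_1 = (1 + \theta)\mu - \theta y_1 - \theta^2 \varepsilon_0 + \varepsilon_2$$

$$\vdots$$

$$y_t = \mu \sum_{s=0}^{t-1} \theta^s - \sum_{s=1}^{t-1} \theta^s y_{t-s} + \varepsilon_t - \theta^t \varepsilon_0.$$

• If $\varepsilon_0 = 0$ this is just a nonlinear regression function of $\mu$ and $\theta$ that explicitly depends on the sample size. Thus, nonlinear least squares is often used.

• In practice, we don't know what the true value of $\varepsilon_0$ is. Often $\varepsilon_0 = 0$ is used, so the model is actually a conditional nonlinear least squares model. More sophisticated means of estimating $\varepsilon_0$ are available too.

With more complicated ARIMA models, the estimated model can become quite complicated.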
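A minimal sketch of conditional least squares for this MA(1) case follows. It relies on the recursion $\varepsilon_t = y_t - \mu + \theta\varepsilon_{t-1}$ with $\varepsilon_0 = 0$. The grid search over $\theta$ with $\mu$ held fixed is a simplification made for this example (a real application would search over both parameters or use an optimizer), and the function names and simulated data are assumptions.

```python
import numpy as np

def ma1_residuals(y, mu, theta, eps0=0.0):
    """Recover eps_t recursively from y_t = mu + eps_t - theta*eps_{t-1}."""
    eps = np.empty(len(y))
    prev = eps0
    for t, yt in enumerate(y):
        prev = yt - mu + theta * prev   # eps_t = y_t - mu + theta*eps_{t-1}
        eps[t] = prev
    return eps

def css_grid_search(y, mu, grid):
    """Pick the theta on the grid that minimizes the conditional sum of squares."""
    sse = [np.sum(ma1_residuals(y, mu, th) ** 2) for th in grid]
    return grid[int(np.argmin(sse))]

# Simulated data with known parameters and eps_0 = 0 (illustration only).
rng = np.random.default_rng(0)
true_eps = rng.standard_normal(2000)
mu_true, theta_true = 1.0, 0.5
y = mu_true + true_eps - theta_true * np.concatenate(([0.0], true_eps[:-1]))

theta_hat = css_grid_search(y, mu_true, np.linspace(-0.9, 0.9, 181))
```

The search is restricted to $|\theta| < 1$, the invertible region, because the recursion for $\varepsilon_t$ is explosive otherwise.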


2.8.3 Forecasting with ARIMA models

You will see some forecasting methods based on Kalman filters in the next section.

We will not cover them here.


3 Autoregressive distributed lag models

Reading: G[19], DM[13], K[18]

3.1 Distributed Lags

Distributed lags deal with the current and lagged effects of an independent variable

on the dependent variable. That is:

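$$y_t = \alpha + \beta_0 x_t + \beta_1 x_{t-1} + \beta_2 x_{t-2} + \cdots + e_t = \alpha + \sum_{i=0}^{\infty} \beta_i x_{t-i} + e_t$$

The effect of $x$ on $y$ is distributed over time:

• The immediate effect is $\beta_0$ (AKA impact multiplier);

• The long-run effect over all future periods is $\sum_{i=0}^{\infty} \beta_i$ (AKA equilibrium multiplier);

• The mean lag is $\dfrac{\sum_{i=0}^{\infty} i\beta_i}{\sum_{j=0}^{\infty} \beta_j} = \sum_{i=0}^{\infty} i w_i$, where $w_i = \beta_i / \sum_{j=0}^{\infty} \beta_j$ is the share of the long-run effect that occurs at lag $i$.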
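The three summary measures are simple functions of the lag coefficients. A minimal sketch for a finite set of coefficients; the function name and the example values are made up for illustration.

```python
import numpy as np

def lag_summary(betas):
    """Return (impact multiplier, long-run multiplier, mean lag) for lag coefficients beta_0, beta_1, ..."""
    betas = np.asarray(betas, dtype=float)
    long_run = betas.sum()                       # sum of all beta_i
    weights = betas / long_run                   # w_i = beta_i / sum_j beta_j
    mean_lag = np.sum(np.arange(len(betas)) * weights)
    return betas[0], long_run, mean_lag

impact, long_run, mean_lag = lag_summary([0.5, 0.3, 0.2])
```

Here half of the total effect arrives immediately, and the effect takes 0.7 periods on average to arrive.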

The problem with the above model: an infinite number of coefficients. Two

feasible approaches are:

• Assume βi = 0 for i > some finite number.

• Assume that βi can be written as a function of a finite number of parameters

for all i = 1 to ∞.

3.2 Finite distributed lags

Consider the model yt = α + β0 xt + . . . + βp xt−p + εt . No restrictions are placed on

the coefficients of the current and lagged values of x, but we need to decide on p.


t-tests are usually not good for selecting lag length because lagged values of $x$ are likely to be highly correlated with current values, i.e. t-tests will have low power.

Two better approaches, both based on the assumption that you know some upper bound $P$ for the lag length:

• Choose the lag length $p \le P$ that maximizes $\bar{R}^2$ or minimizes the Akaike Information Criterion, $\text{AIC} = \ln\left(\dfrac{e'e}{T}\right) + \dfrac{2p}{T}$.

• Start with high $P$ and do F-tests for joint significance of the $\beta_i$. Successively drop lags. Stop dropping lags as soon as $H_0: \beta_i = 0 \;\forall\, i$ is rejected.

Both methods tend to “overfit” (leave too many lags in), so stringent significance levels (small $\alpha$, e.g. $\alpha = .01$) should be used for the F-test.
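The AIC approach can be sketched as a loop over candidate lag lengths. In this example every candidate model is estimated on the same observations (the first $P$ are dropped) so that the criteria are comparable; that choice, the function names, and the simulated data are assumptions of the sketch.

```python
import numpy as np

def lag_matrix(x, p, P):
    """Columns x_t, x_{t-1}, ..., x_{t-p} for observations t = P, ..., T-1."""
    T = len(x)
    return np.column_stack([x[P - j:T - j] for j in range(p + 1)])

def select_lag_by_aic(y, x, P):
    """Return the p in 0..P that minimizes ln(e'e/T) + 2p/T."""
    T_eff = len(y) - P
    best_p, best_aic = None, np.inf
    for p in range(P + 1):
        X = np.column_stack([np.ones(T_eff), lag_matrix(x, p, P)])
        b, *_ = np.linalg.lstsq(X, y[P:], rcond=None)
        e = y[P:] - X @ b
        aic = np.log(e @ e / T_eff) + 2 * p / T_eff
        if aic < best_aic:
            best_p, best_aic = p, aic
    return best_p

# Simulated data whose true lag length is 2 (illustration only).
rng = np.random.default_rng(0)
x = rng.standard_normal(500)
noise = rng.standard_normal(500)
y = np.zeros(500)
y[2:] = 1.0 + 1.0 * x[2:] + 0.8 * x[1:-1] + 0.5 * x[:-2] + noise[2:]

p_hat = select_lag_by_aic(y, x, P=6)
```

Consistent with the overfitting remark above, the selected lag length may exceed the true one, but with a signal this strong it should not fall short of it.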

3.3 Geometric Lag models

Two models, the Adaptive Expectations Model and the Partial Adjustment

Model have been used a great deal in the literature. They are two specific models

that imply a specific form of infinite distributed lag effects called Geometric lags.

3.3.1 Partial Adjustment Model

Suppose the current value of the independent variable determines the desired value or goal for the dependent variable:

$$y_t^* = \alpha + \beta x_t + \varepsilon_t,$$

but only a fixed fraction of the desired adjustment is accomplished in one period (it takes time to build factories, restock diminished inventories, change institutional structure). The partial adjustment function is:

$$y_t - y_{t-1} = (1 - \lambda)(y_t^* - y_{t-1}); \quad |\lambda| < 1.$$

Rearrange:

$$y_t^* = (1 - \lambda)^{-1}(1 - \lambda L)y_t$$

Replace $y_t^*$ in the first equation with the right-hand side above and rearrange:

$$(1 - \lambda L)y_t = (1 - \lambda)(\alpha + \beta x_t + \varepsilon_t)$$

$$y_t = \alpha(1 - \lambda) + \beta(1 - \lambda)x_t + \lambda y_{t-1} + (1 - \lambda)\varepsilon_t = \tilde{\alpha} + \tilde{\beta} x_t + \lambda y_{t-1} + \tilde{\varepsilon}_t$$

The model is intrinsically linear in the parameters, and the disturbances are uncorrelated if $\varepsilon$ is uncorrelated. OLS is consistent and efficient.

3.3.2 Adaptive Expectations model

An adaptive expectations model is based on a maintained hypothesis about how expectations change. Example: input decisions (supply decisions) that are based on expected future prices.

$$y_t = \alpha + \beta x_{t+1}^* + \delta w_t + \varepsilon_t$$

$$x_{t+1}^* = \lambda x_t^* + (1 - \lambda)x_t$$

$x_t^*$ is the expected value of $x_t$ evaluated at time $t - 1$, and $0 < \lambda < 1$. The second equation can be written as $x_{t+1}^* - x_t^* = (1 - \lambda)(x_t - x_t^*)$: the change in expectations is proportional to the difference between the actual value of $x$ in period $t$ and last period's expectation about $x_t$.

1. Rearrange the second equation to get $x_{t+1}^* = \dfrac{1 - \lambda}{1 - \lambda L} x_t$.

2. Substitute $x_{t+1}^*$ out of the first equation to get

$$y_t = \alpha + \beta \frac{1 - \lambda}{1 - \lambda L} x_t + \delta w_t + \varepsilon_t$$

$$= \alpha + \beta(1 - \lambda) \sum_{i=0}^{\infty} \lambda^i x_{t-i} + \delta w_t + \varepsilon_t$$

$$= \alpha + \gamma z_t(\lambda) + \delta w_t + \varepsilon_t, \quad \gamma = \beta(1 - \lambda), \quad z_t(\lambda) = \sum_{i=0}^{\infty} \lambda^i x_{t-i}.$$

This is in distributed lag form. Estimation proceeds recursively (as discussed in Greene p. 568). Briefly,

1. Construct a variable $z_t(\lambda)$ that satisfies $z_t(\lambda) = x_t + \lambda z_{t-1}(\lambda)$. Use $z_1(\lambda) = x_1/(1 - \lambda)$ as the starting value.

2. Pick a set of values of $\lambda$ in (0,1), calculate $z(\lambda)$ for each, and include it as one of the variables in separate OLS regressions.

3. Choose the $\hat{\lambda}$ that minimizes SSE.

4. Use computer search and/or optimization routines to do this.

Note that the disturbances satisfy the CLRM assumptions, and if they are i.i.d. normal, this recursive process (which minimizes SSE) is also Maximum Likelihood.
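The search over $\lambda$ can be sketched as a grid of OLS regressions. In this example the start-up value is $z_1 = x_1/(1-\lambda)$, which treats pre-sample $x$ as constant at $x_1$ (an assumption of the sketch). The function names and the noise-free test data are illustrative only.

```python
import numpy as np

def geometric_z(x, lam):
    """z_t(lam) = x_t + lam * z_{t-1}(lam), started at z_1 = x_1 / (1 - lam)."""
    z = np.empty(len(x))
    z[0] = x[0] / (1.0 - lam)   # assumes pre-sample x was constant at x_1
    for t in range(1, len(x)):
        z[t] = x[t] + lam * z[t - 1]
    return z

def fit_geometric_lag(y, x, w, grid):
    """For each lam on the grid, regress y on [1, z(lam), w]; keep the minimum-SSE fit."""
    best = None
    for lam in grid:
        X = np.column_stack([np.ones(len(x)), geometric_z(x, lam), w])
        b, *_ = np.linalg.lstsq(X, y, rcond=None)
        sse = np.sum((y - X @ b) ** 2)
        if best is None or sse < best[0]:
            best = (sse, lam, b)
    return best[1], best[2]     # lam_hat and (alpha, gamma, delta) estimates

# Noise-free data generated with lam = 0.6, beta = 1.5 (so gamma = 0.6), delta = 0.7.
rng = np.random.default_rng(0)
x = rng.standard_normal(100)
w = rng.standard_normal(100)
y = 2.0 + 1.5 * (1 - 0.6) * geometric_z(x, 0.6) + 0.7 * w

lam_hat, coefs = fit_geometric_lag(y, x, w, np.linspace(0.05, 0.95, 19))
beta_hat = coefs[1] / (1.0 - lam_hat)   # recover beta from gamma = beta * (1 - lam)
```

With real data the SSE profile over the grid is worth plotting, and the grid can then be refined around its minimum.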

The autoregressive form is

$$y_t = \alpha + \beta \frac{1 - \lambda}{1 - \lambda L} x_t + \delta w_t + \varepsilon_t$$

$$(1 - \lambda L)y_t = \alpha(1 - \lambda) + \beta(1 - \lambda)x_t + \delta(1 - \lambda L)w_t + (1 - \lambda L)\varepsilon_t$$

$$y_t = \tilde{\alpha} + \tilde{\beta} x_t + \delta w_t - \lambda\delta w_{t-1} + \lambda y_{t-1} + u_t,$$

where $\tilde{\alpha} = \alpha(1 - \lambda)$, $\tilde{\beta} = \beta(1 - \lambda)$, and $u_t = \varepsilon_t - \lambda\varepsilon_{t-1}$ is a moving average error. Rather than the recursive approach discussed above, you could also use an Instrumental Variables approach (replace $y_{t-1}$ with an appropriate instrument) to ensure consistency.

3.4 Autoregressive Distributed Lag Models (ARDL)

The previous models are restrictive:

• The geometric lag is very restrictive regarding the relative impact of different lagged values of $x$.

• Unrestricted lags truncate and eat up degrees of freedom.

The ARDL is a more general form that can accommodate and approximate a huge array of functional forms. An ARDL(p, r) is defined as

$$y_t = \mu + \sum_{i=1}^{p} \gamma_i y_{t-i} + \sum_{j=0}^{r} \beta_j x_{t-j} + \varepsilon_t, \quad \varepsilon_t \sim \text{i.i.d.} \;\forall\, t$$

$$C(L)y_t = \mu + B(L)x_t + \varepsilon_t, \text{ where}$$

$$C(L) = 1 - \gamma_1 L - \gamma_2 L^2 - \cdots - \gamma_p L^p \text{ and}$$

$$B(L) = \beta_0 + \beta_1 L + \beta_2 L^2 + \cdots + \beta_r L^r$$

3.4.1 Estimation

Consider the simplest model with a lagged dependent variable:

$$y_t = \alpha y_{t-1} + \varepsilon_t$$

$$y_t = \frac{\varepsilon_t}{1 - \alpha L} = \sum_{i=0}^{\infty} \alpha^i \varepsilon_{t-i}$$

$$y_{t-1} = \sum_{i=1}^{\infty} \alpha^{i-1} \varepsilon_{t-i} \quad \text{(note index } i \text{ starts at 1)}$$

So $y_{t-1}$ is a function of $\varepsilon_{t-1}$ and all previous disturbances. The CLRM assumption $E[\varepsilon' X] = 0$ holds if $\varepsilon$ is i.i.d. (in this case $X$ is lagged $y$), because $y_{t-1}$ is not correlated with $\varepsilon_t$. We can therefore consistently estimate $\alpha$. However, if $\varepsilon_t$ is a function of past disturbances, then $\operatorname{Cov}[y_{t-1}, \varepsilon_t] \ne 0$, and OLS is biased and inconsistent.

Show for yourself that $\operatorname{Cov}[y_{t-1}, \varepsilon_t] \ne 0$ for a model with one lagged dependent variable on the right and an AR(1) disturbance.
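A small simulation illustrates the inconsistency numerically (it does not replace the analytical exercise). In this sketch the model is $y_t = \alpha y_{t-1} + \varepsilon_t$ with $\varepsilon_t = \rho\varepsilon_{t-1} + u_t$. Under that setup $y$ follows an AR(2) with coefficients $\alpha + \rho$ and $-\alpha\rho$, so the OLS slope converges to the first autocorrelation $(\alpha + \rho)/(1 + \alpha\rho)$ rather than to $\alpha$. The function names, parameter values, and sample size are assumptions of the example.

```python
import numpy as np

def simulate(alpha, rho, T, seed=0):
    """Simulate y_t = alpha*y_{t-1} + eps_t with eps_t = rho*eps_{t-1} + u_t."""
    rng = np.random.default_rng(seed)
    u = rng.standard_normal(T)
    y = np.empty(T)
    eps_prev, y_prev = 0.0, 0.0
    for t in range(T):
        eps_prev = rho * eps_prev + u[t]
        y_prev = alpha * y_prev + eps_prev
        y[t] = y_prev
    return y

def ols_slope(y):
    """OLS slope from regressing y_t on y_{t-1} (no constant; the series has mean zero)."""
    ylag, ycur = y[:-1], y[1:]
    return (ylag @ ycur) / (ylag @ ylag)

a_iid = ols_slope(simulate(alpha=0.5, rho=0.0, T=50000))   # white noise errors
a_ar1 = ols_slope(simulate(alpha=0.5, rho=0.5, T=50000))   # AR(1) errors
plim_ar1 = (0.5 + 0.5) / (1 + 0.5 * 0.5)                   # 0.8 under this setup
```

With white noise errors the estimate is close to 0.5. With AR(1) errors it settles near 0.8, and more data does not fix it.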

3.4.2 Summary stats for effect of x on y

The equilibrium multiplier (long-run effect of a change in $x$) in the ARDL model generally is

$$\text{Long-run multiplier} = \sum_{i=0}^{\infty} \alpha_i = A(1) = \frac{B(1)}{C(1)} = \frac{\sum_{j=0}^{r} \beta_j}{1 - \sum_{i=1}^{p} \gamma_i},$$

where $A(L) = \dfrac{B(L)}{C(L)}$. Assuming no shocks (disturbances) and assuming stationarity, the long-run relationship among the variables in a regression is

$$\bar{y} = \frac{\mu}{C(1)} + \frac{B_1(1)}{C(1)}\bar{X}_1 + \frac{B_2(1)}{C(1)}\bar{X}_2 + \cdots + \frac{B_k(1)}{C(1)}\bar{X}_k,$$

where $\bar{y}$ and $\bar{X}_i$ are constant values of $y$ and $X_i$.

The mean lag is $\left.\dfrac{A'(L)}{A(L)}\right|_{L=1}$.

3.4.3 Calculating the lag coefficients

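Consider an ARDL(2,1):

$$y_t = \tilde{\mu} + \frac{\beta_0 + \beta_1 L}{1 - \gamma_1 L - \gamma_2 L^2} x_t + \tilde{\varepsilon}_t$$

$$= \tilde{\mu} + \frac{B(L)}{C(L)} x_t + \tilde{\varepsilon}_t$$

$$= \tilde{\mu} + A(L)x_t + \tilde{\varepsilon}_t$$

$$= \tilde{\mu} + \sum_{i=0}^{\infty} \alpha_i x_{t-i} + \tilde{\varepsilon}_t$$

$\alpha_i$ is the direct effect of $x_{t-i}$ on $y_t$: the coefficient on $L^i$ in $A(L)$. Suppose we want to calculate $\alpha_i$.

$$A(L)C(L) = B(L)$$

$$(\alpha_0 + \alpha_1 L + \alpha_2 L^2 + \cdots)(1 - \gamma_1 L - \gamma_2 L^2) = \beta_0 + \beta_1 L$$

Expanding this over a subset of $A(L)$,

$$(\alpha_0 - \alpha_0\gamma_1 L - \alpha_0\gamma_2 L^2) + (\alpha_1 L - \alpha_1\gamma_1 L^2 - \alpha_1\gamma_2 L^3) + (\alpha_2 L^2 - \alpha_2\gamma_1 L^3 - \alpha_2\gamma_2 L^4) + \cdots = \beta_0 + \beta_1 L$$

Now, collect terms for each lag length, respectively:

$$L^0: \quad \alpha_0 = \beta_0$$

$$L^1: \quad -\alpha_0\gamma_1 + \alpha_1 = \beta_1$$

$$L^2: \quad -\alpha_0\gamma_2 - \alpha_1\gamma_1 + \alpha_2 = 0$$

$$L^3: \quad -\alpha_1\gamma_2 - \alpha_2\gamma_1 + \alpha_3 = 0$$

Rearranging each line respectively gives $\alpha_i$ as a function of the estimable parameters $\beta_i$ and $\gamma_i$.

$$\alpha_0 = \beta_0$$

$$\alpha_1 = \beta_1 + \alpha_0\gamma_1 = \beta_1 + \beta_0\gamma_1$$

$$\alpha_2 = \alpha_0\gamma_2 + \alpha_1\gamma_1 = \beta_0\gamma_2 + (\beta_1 + \beta_0\gamma_1)\gamma_1$$

$$\alpha_3 = \alpha_1\gamma_2 + \alpha_2\gamma_1 = \text{etc.}$$

$$\alpha_j = \gamma_2\alpha_{j-2} + \gamma_1\alpha_{j-1} \quad \text{for } j \ge 2 \text{ with an ARDL(2,1).}$$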
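The term-collection step generalizes to a simple recursion, $\alpha_j = \beta_j + \sum_i \gamma_i \alpha_{j-i}$ with $\beta_j = 0$ beyond lag $r$. A minimal sketch, with an illustrative function name and made-up parameter values:

```python
import numpy as np

def ardl_lag_coefficients(gammas, betas, n):
    """First n coefficients alpha_0..alpha_{n-1} of A(L) = B(L)/C(L)."""
    alphas = np.zeros(n)
    for j in range(n):
        b_j = betas[j] if j < len(betas) else 0.0           # beta_j = 0 beyond lag r
        ar_part = sum(gammas[i] * alphas[j - 1 - i]          # gamma_{i+1} * alpha_{j-(i+1)}
                      for i in range(len(gammas)) if j - 1 - i >= 0)
        alphas[j] = b_j + ar_part
    return alphas

# ARDL(2,1) with gamma = (0.5, 0.2) and beta = (1.0, 0.5).
alphas = ardl_lag_coefficients([0.5, 0.2], [1.0, 0.5], n=200)
long_run = (1.0 + 0.5) / (1.0 - 0.5 - 0.2)   # B(1)/C(1)
```

Summing the computed lag coefficients approximates the long-run multiplier $B(1)/C(1)$, which links this calculation to the summary statistics above.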


3.4.4 Forecasting with ARDL

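Consider an ARDL(2,1):

$$y_{T+1} \mid y_T = \gamma_1 y_T + \gamma_2 y_{T-1} + \beta_0 x_{T+1} + \beta_1 x_T + \varepsilon_{T+1} = \boldsymbol{\gamma}' \mathbf{x}_{T+1} + \varepsilon_{T+1},$$

where $\mathbf{x}_{T+1} = [\, y_T \;\; y_{T-1} \;\; x_{T+1} \;\; x_T \,]'$ and $\boldsymbol{\gamma}$ collects the corresponding coefficients.

Because $E[\varepsilon_{T+1}] = 0$, $\hat{y}_{T+1} \mid y_T$ is a consistent estimator of $y_{T+1} \mid y_T$. With forecast error $e_1 = y_{T+1} - \hat{y}_{T+1}$,

$$\operatorname{Var}[e_1 \mid T] = E[e_1^2] = \mathbf{x}_{T+1}' \sigma^2 (\mathbf{X}'\mathbf{X})^{-1} \mathbf{x}_{T+1} + \sigma^2 \quad \text{[be able to show this]}$$

$$\widehat{\operatorname{Var}}[e_1 \mid T] = \mathbf{x}_{T+1}' s^2 (\mathbf{X}'\mathbf{X})^{-1} \mathbf{x}_{T+1} + s^2.$$

A forecast interval for $y_{T+1}$ is $\hat{y}_{T+1} \pm t_{\alpha/2} \sqrt{\widehat{\operatorname{Var}}[e_1 \mid T]}$.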

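Kalman Filter simplifies extended forecasting

Assume that $\varepsilon_{T+1}$ is the only source of uncertainty.

$$\begin{bmatrix} \hat{y}_{T+1} \\ y_T \\ y_{T-1} \\ \vdots \end{bmatrix} = \begin{bmatrix} \hat{\mu}_{T+1} \\ 0 \\ 0 \\ \vdots \end{bmatrix} + \begin{bmatrix} \gamma_1 & \gamma_2 & \gamma_3 & \cdots & \gamma_{p-1} & \gamma_p \\ 1 & 0 & 0 & \cdots & 0 & 0 \\ 0 & 1 & 0 & \cdots & 0 & 0 \\ \vdots & & \ddots & & & \vdots \\ 0 & 0 & 0 & \cdots & 1 & 0 \end{bmatrix} \begin{bmatrix} y_T \\ y_{T-1} \\ y_{T-2} \\ \vdots \\ y_{T-p+1} \end{bmatrix} + \begin{bmatrix} \hat{\varepsilon}_{T+1} \\ 0 \\ 0 \\ \vdots \end{bmatrix}$$

$$\hat{\mathbf{y}}_{T+1} = \hat{\boldsymbol{\mu}}_{T+1} + \mathbf{C}\mathbf{y}_T + \hat{\boldsymbol{\varepsilon}}_{T+1}$$

where $\hat{\mu}_{T+1} = \mu + \beta_0 x_{T+1} + \cdots + \beta_r x_{T+1-r}$ is known with certainty (so forecasts are conditional on $x_{T+1}$).

$$\operatorname{Cov}[\hat{\boldsymbol{\varepsilon}}_{T+1}] = E[(\hat{\mathbf{y}}_{T+1} - \mathbf{y}_{T+1})(\hat{\mathbf{y}}_{T+1} - \mathbf{y}_{T+1})'] = \begin{bmatrix} \sigma^2 & 0 & \cdots \\ 0 & 0 & \cdots \\ \vdots & \vdots & \ddots \end{bmatrix} = \sigma^2 \mathbf{j}\mathbf{j}',$$

where $\mathbf{j} = [\, 1 \;\; 0 \;\; 0 \;\; \cdots \,]'$ (so that $\mathbf{j}\mathbf{j}'$ is a $p \times p$ matrix) and $\operatorname{Var}[\varepsilon_{T+1}] = \operatorname{Cov}_{11}[\hat{\boldsymbol{\varepsilon}}_{T+1}] = \sigma^2$.

Note: The forecast errors $\hat{\varepsilon}_{T+i}$ are included above for intuition about the forecast variance. When calculating the point estimates $\hat{y}_{T+i}$, set $\hat{\varepsilon}_{T+i}$ to its expected value of zero.

For T+2:

$$\hat{\mathbf{y}}_{T+2} = \hat{\boldsymbol{\mu}}_{T+2} + \mathbf{C}\hat{\mathbf{y}}_{T+1} + \hat{\boldsymbol{\varepsilon}}_{T+2}$$

$$= \hat{\boldsymbol{\mu}}_{T+2} + \mathbf{C}(\hat{\boldsymbol{\mu}}_{T+1} + \mathbf{C}\mathbf{y}_T + \hat{\boldsymbol{\varepsilon}}_{T+1}) + \hat{\boldsymbol{\varepsilon}}_{T+2}$$

$$= \hat{\boldsymbol{\mu}}_{T+2} + \mathbf{C}\hat{\boldsymbol{\mu}}_{T+1} + \mathbf{C}^2\mathbf{y}_T + (\mathbf{C}\hat{\boldsymbol{\varepsilon}}_{T+1} + \hat{\boldsymbol{\varepsilon}}_{T+2})$$

$$\operatorname{Cov}[\mathbf{C}\hat{\boldsymbol{\varepsilon}}_{T+1} + \hat{\boldsymbol{\varepsilon}}_{T+2}] = \sigma^2(\mathbf{C}\mathbf{j}\mathbf{j}'\mathbf{C}' + \mathbf{j}\mathbf{j}')$$

and $\operatorname{Var}[\hat{y}_{T+2}]$ is the upper left element of $\sigma^2(\mathbf{C}\mathbf{j}\mathbf{j}'\mathbf{C}' + \mathbf{j}\mathbf{j}')$.

For F periods out (normalize T to T = 0):

$$\hat{\mathbf{y}}_F = \mathbf{C}^F \mathbf{y}_0 + \sum_{f=1}^{F} \mathbf{C}^{f-1} [\hat{\boldsymbol{\mu}}_{F-(f-1)} + \hat{\boldsymbol{\varepsilon}}_{F-(f-1)}]$$

$$\operatorname{Var}[\hat{\mathbf{y}}_F] = \sigma^2 \left[ \mathbf{j}\mathbf{j}' + \sum_{i=1}^{F-1} \mathbf{C}^i \mathbf{j}\mathbf{j}' (\mathbf{C}^i)' \right],$$

whose upper left element is the forecast variance of $\hat{y}_F$.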


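Example: ARDL(2,1):

$$\begin{bmatrix} \hat{y}_{T+1} \\ y_T \end{bmatrix} = \begin{bmatrix} \hat{\mu}_{T+1} \\ 0 \end{bmatrix} + \begin{bmatrix} \hat{\gamma}_1 & \hat{\gamma}_2 \\ 1 & 0 \end{bmatrix} \begin{bmatrix} y_T \\ y_{T-1} \end{bmatrix}$$

where $\hat{\mu}_{T+1} = \hat{\mu} + \hat{\beta}_0 x_{T+1} + \hat{\beta}_1 x_T$. Remember, for calculating forecasts, $\hat{\varepsilon}_{T+1} = 0$.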
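The companion-matrix recursion is straightforward to implement. The sketch below is a minimal illustration with made-up parameter values; it treats the coefficients and the $\hat{\mu}_{T+h}$ terms as known, so the only uncertainty is from future disturbances, as assumed above. The function names are illustrative.

```python
import numpy as np

def companion(gammas):
    """p x p matrix C: first row gamma_1..gamma_p, identity on the subdiagonal."""
    p = len(gammas)
    C = np.zeros((p, p))
    C[0, :] = gammas
    if p > 1:
        C[1:, :-1] = np.eye(p - 1)
    return C

def forecast(gammas, y_state, mu_hats, sigma2):
    """Point forecasts and forecast variances for horizons 1..len(mu_hats).

    y_state = [y_T, y_{T-1}, ...] (length p); mu_hats[h-1] is mu-hat for period T+h.
    """
    C = companion(gammas)
    j = np.zeros(len(gammas))
    j[0] = 1.0
    state = np.asarray(y_state, dtype=float).copy()
    impulse = j.copy()          # holds C^i j, starting with i = 0
    var = 0.0
    points, variances = [], []
    for mu in mu_hats:
        state = mu * j + C @ state            # future disturbances set to zero
        var += sigma2 * impulse[0] ** 2       # upper-left element of C^i jj' (C^i)'
        impulse = C @ impulse
        points.append(state[0])
        variances.append(var)
    return np.array(points), np.array(variances)

points, variances = forecast([0.5, 0.2], y_state=[2.0, 1.0],
                             mu_hats=[1.0, 1.0, 1.0], sigma2=1.0)
```

The forecast variance accumulates with the horizon: $\sigma^2$ at one step, $\sigma^2(1 + \gamma_1^2)$ at two steps, and so on.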

3.4.5 Common Factor restrictions

A model with AR(1) errors,

$$y_t = \beta x_t + v_t; \quad v_t = \rho v_{t-1} + \varepsilon_t,$$

can be written as

$$y_t = \rho y_{t-1} + \beta x_t - \rho\beta x_{t-1} + \varepsilon_t,$$

which is an ARDL(1,1) with a restriction on the coefficient on $x_{t-1}$.

AR(p) as a restricted ARDL(p,p): Let $\varepsilon_t$ be an i.i.d. disturbance.

$$y_t = \beta x_t + v_t,$$

$$\text{where } v_t = \rho_1 v_{t-1} + \cdots + \rho_p v_{t-p} + \varepsilon_t \;\Rightarrow\; R(L)v_t = \varepsilon_t, \quad R(L) = 1 - \rho_1 L - \cdots - \rho_p L^p.$$

$$y_t = \beta x_t + \frac{\varepsilon_t}{R(L)}$$

$$R(L)y_t = \beta R(L)x_t + \varepsilon_t$$

$$C(L)y_t = \beta B(L)x_t + \varepsilon_t \quad \text{for } C(L) = B(L) = R(L).$$

Implications

1. Any AR(p) disturbance in a static model can be interpreted as a restricted version of an ARDL(p,p).

2. Finding an AR(p) error process in your regression results can be an indication of an unaccounted-for ARDL process (i.e. a misspecified model).

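E.g. an ARDL(2,2) model is

$$y_t = \gamma_1 y_{t-1} + \gamma_2 y_{t-2} + \beta_0 x_t + \beta_1 x_{t-1} + \beta_2 x_{t-2} + \varepsilon_t$$

Test for an AR(2) as a restricted ARDL(2,2) by testing the joint restriction

$$f(b) = \begin{bmatrix} \beta_1 + \gamma_1\beta_0 \\ \beta_2 + \gamma_2\beta_0 \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \end{bmatrix}$$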

CFRs using characteristic roots. A more flexible and general method of test-

ing the specification of ARDL models is based on the roots of the Lag operator

polynomials.

C(L) = (1 − γ1 L − γ2 L2 ) = (1 − λ1 L)(1 − λ2 L)

B(L) = β0 (1 − β1 L − β2 L2 ) = β0 (1 − τ1 L)(1 − τ2 L)

where λi and τi are characteristic roots (note, we just arbitrarily changed the signs of β1 , β2 ). Then the ARDL(2,2) can be written as

(1 − λ1 L)(1 − λ2 L)yt = β0 (1 − τ1 L)(1 − τ2 L)xt + εt , or

yt = (λ1 + λ2 )yt−1 − (λ1 λ2 )yt−2 + β0 xt − β0 (τ1 + τ2 )xt−1 + β0 (τ1 τ2 )xt−2 + εt .
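The factorization can be checked numerically. The sketch below is illustrative only: the coefficient values and the helper name `lag_poly_factors` are hypothetical. It recovers λ and τ from the polynomial coefficients and looks for a shared root.

```python
import numpy as np

def lag_poly_factors(c1, c2):
    """Return (l1, l2) such that 1 - c1*L - c2*L^2 = (1 - l1*L)(1 - l2*L).

    The l's solve z^2 - c1*z - c2 = 0, so l1 + l2 = c1 and l1*l2 = -c2.
    """
    return np.roots([1.0, -c1, -c2])

# Hypothetical coefficients chosen so that the polynomials share the root 0.5
gamma1, gamma2 = 0.7, -0.1      # C(L) = 1 - 0.7 L + 0.10 L^2
beta1, beta2 = 0.8, -0.15       # B(L)/beta0 = 1 - 0.8 L + 0.15 L^2

lam = lag_poly_factors(gamma1, gamma2)
tau = lag_poly_factors(beta1, beta2)

# A common factor exists when some lambda_i equals some tau_j (up to tolerance)
common = [l for l in lam for t in tau if abs(l - t) < 1e-8]
print(np.sort(lam.real), np.sort(tau.real), common)
```

In estimated models the roots will never match exactly, so a formal test of the restriction is still needed; this only illustrates the algebra.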


Now restrict λ1 = τ1 = ρ (the lag operator polynomials have a “common factor”).

The model becomes an AR(1):

(1 − ρL)(1 − λ2 L)yt = (1 − ρL)(1 − τ2 L)β0 xt + εt

(1 − λ2 L)yt = (1 − τ2 L)β0 xt + ut

where $u_t = \frac{\varepsilon_t}{1 - \rho L} = \rho u_{t-1} + \varepsilon_t$, an AR(1) error process.

Implications for estimation

1. The ARDL(2,2) has a white noise error, can be estimated consistently with

OLS.

2. The restricted model has a lagged dep. var. and an AR(1) error: OLS is inconsistent.

Two possible approaches:

1. Estimate the unrestricted version (ARDL(p,r)); inefficient because the restric-

tion is not imposed, but consistent.

2. Use IV with an instrument for the lagged dep. vars. on the right-hand side; for example, ŷt−1 from a regression of yt on xt , ..., xt−p .

Question: How would you test for autocorrelated errors in an ARDL(p,r) model?

3.4.6 Error Correction Models (ECM)

ARDL(1,1): yt = µ + γyt−1 + β0 xt + β1 xt−1 + εt

subtract yt−1 : ∆yt = µ + (γ − 1)yt−1 + β0 xt + β1 xt−1 + εt

add,subt. β0 xt−1 : ∆yt = µ + (γ − 1)yt−1 + β0 ∆xt + (β0 + β1 )xt−1 + εt


 
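Then multiply $(\beta_0 + \beta_1) x_{t-1}$ by $\frac{\gamma - 1}{\gamma - 1}$ to get

\[ \Delta y_t = \mu + \beta_0 \Delta x_t + (\gamma - 1)(y_{t-1} - \theta x_{t-1}) + \varepsilon_t, \quad \text{where } \theta = \frac{\beta_0 + \beta_1}{1 - \gamma} = \frac{B(1)}{C(1)}, \]

and $\frac{B(1)}{C(1)}$ is the long-run multiplier we saw a while back. This is called an error correction model, or more precisely, the error correction form of the ARDL(1,1) model. One more step:

\[ \Delta y_t = \beta_0 \Delta x_t + \tilde{\gamma} [y_{t-1} - (\tilde{\mu} + \theta x_{t-1})] + \varepsilon_t \]

where $\tilde{\mu} = \frac{\mu}{1 - \gamma} = -\frac{\mu}{\gamma - 1}$ and $\tilde{\gamma} = (\gamma - 1)$. ∆yt is comprised of two components (plus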

disturbance): a short run shock from ∆xt and a reversion toward equilibrium, or

equilibrium-error correction. To see this, note that in equilibrium yt = yt−1 = ȳ,

and xt = xt−1 = x̄, so ∆yt = 0 and ∆xt = 0. Then the ECM is

0 = γ̃[yt−1 − (µ̃ + θxt−1 )], so

ȳ = µ̃ + θx̄

Therefore, $y_{t-1} - (\tilde{\mu} + \theta x_{t-1})$ represents the deviation from the equilibrium relationship $y = \tilde{\mu} + \theta x$, and $\tilde{\gamma} = (\gamma - 1)$ is the marginal impact of this deviation on $\Delta y_t$.

Estimation: Assuming stationarity of y, all parameters of the ECM can be calculated based on estimates from the original ARDL(1,1) model. Alternatively, all parameters of the ARDL(1,1) model can be calculated with the parameters from the alternative specification

∆yt = α0 + α1 ∆xt + α2 yt−1 + α3 xt−1 + εt


The results will be identical. Covariances can be calculated using the Delta method

if necessary. You could also estimate the ECM model parameters directly via non-

linear least squares.
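The equivalence between the ARDL(1,1) regression and the alternative specification can be illustrated with simulated data. This is a sketch under assumed (hypothetical) true parameter values. The mapping used is α0 = µ, α1 = β0, α2 = γ − 1, α3 = β0 + β1, which follows from the derivation above.

```python
import numpy as np

rng = np.random.default_rng(0)
T = 500
mu, gamma, beta0, beta1 = 1.0, 0.6, 0.5, 0.3   # hypothetical true values

x = rng.normal(size=T)
y = np.zeros(T)
for t in range(1, T):
    y[t] = mu + gamma * y[t - 1] + beta0 * x[t] + beta1 * x[t - 1] + rng.normal()

ones = np.ones(T - 1)

# ARDL(1,1): y_t on [1, y_{t-1}, x_t, x_{t-1}]
X_ardl = np.column_stack([ones, y[:-1], x[1:], x[:-1]])
b_ardl, *_ = np.linalg.lstsq(X_ardl, y[1:], rcond=None)

# Alternative specification: dy_t on [1, dx_t, y_{t-1}, x_{t-1}]
X_alt = np.column_stack([ones, np.diff(x), y[:-1], x[:-1]])
a, *_ = np.linalg.lstsq(X_alt, np.diff(y), rcond=None)

m_hat, g_hat, b0_hat, b1_hat = b_ardl
implied = np.array([m_hat, b0_hat, g_hat - 1.0, b0_hat + b1_hat])

theta_from_ardl = (b0_hat + b1_hat) / (1.0 - g_hat)   # long-run multiplier
theta_from_alt = -a[3] / a[2]
print(a, implied, theta_from_ardl, theta_from_alt)
```

The two sets of estimates agree up to floating-point error because the second regression is an exact linear reparameterization of the first.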

3.5 Cointegration

• The problem: in general, a linear combination of variables will usually be

integrated to the highest order of the variables.

• If any of the variables are I(d) with d > 0, then many or all parameter estimates and associated t statistics may be biased. Consider the regression $y_t = x_t' \beta + \varepsilon_t$. If x and y are integrated of different orders, then $\varepsilon_t = y_t - x_t' \beta$ will not be stationary.

• The exception: if two or more of the series are integrated of the same order (drifting or trending at the same rate), then we may be able to find a linear combination of the variables that is I(0).

• If so, we can consistently estimate parameters and use standard inference

statistics without having to difference or de-trend the variables.

3.5.1 Example: trending variables

Consider two trending random variables: y1t = 3t + ut and y2t = t + vt , where vt

and ut are uncorrelated white noise errors. Both y1 and y2 are I(1), because their

first difference is stationary. Now consider the error process from a relationship


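between y1 and y2:

\[ y_{1t} = \alpha y_{2t} + \varepsilon_t \]

\[ \varepsilon_t = y_{1t} - \alpha y_{2t} = \begin{bmatrix} 1 & -\alpha \end{bmatrix} \begin{bmatrix} y_{1t} \\ y_{2t} \end{bmatrix} \]

\[ = (3t + u_t) - \alpha (t + v_t) \]

\[ = (3 - \alpha) t + (u_t - \alpha v_t) \]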

• This is a linear combination of two I(1) variables, and so would in most cases be I(1), and the variance of εt would explode as t increases (i.e. not stationary).

• However, if α = 3, then εt is I(0), i.e. stationary, implying that y1 and y2 are cointegrated: they are integrated of the same order, and a linear combination of them is I(0).

• $\begin{bmatrix} 1 & -\alpha \end{bmatrix} = \begin{bmatrix} 1 & -3 \end{bmatrix}$ (or any multiple of it) is called a cointegrating vector of y1t and y2t.
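The trending example above can be simulated directly. The sketch below (seed, sample size, and the helper `residual_trend` are hypothetical choices) measures the trend left in εt = y1t − αy2t for two values of α.

```python
import numpy as np

rng = np.random.default_rng(1)
T = 200
t = np.arange(1, T + 1)
y1 = 3 * t + rng.normal(size=T)   # y1t = 3t + ut
y2 = t + rng.normal(size=T)       # y2t = t + vt

def residual_trend(alpha):
    """Slope from regressing eps_t = y1t - alpha*y2t on a constant and t."""
    eps = y1 - alpha * y2
    slope, _intercept = np.polyfit(t, eps, 1)
    return slope

slope_coint = residual_trend(3.0)   # expected near 3 - 3 = 0
slope_other = residual_trend(2.0)   # expected near 3 - 2 = 1
print(slope_coint, slope_other)
```

Only the combination with α = 3 removes the trend, which is what makes [1 −3] a cointegrating vector in this example.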

3.5.2 Error Correction form and cointegration

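The ARDL(1,1) model

\[ y_t = \alpha' w_t + \gamma y_{t-1} + \beta_0 x_t + \beta_1 x_{t-1} + \varepsilon_t \]

can be written as

\[ \Delta y_t = \alpha' w_t + \beta_0 \Delta x_t + \gamma^* (y_{t-1} - \theta x_{t-1}) + \varepsilon_t \]

\[ \Delta y_t = \alpha' w_t + \beta_0 \Delta x_t + \gamma^* z_t + \varepsilon_t \]

where $z_t = y_{t-1} - \theta x_{t-1}$, $\theta = -(\beta_0 + \beta_1)/\gamma^*$, and $\gamma^* = (\gamma - 1)$.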


• If y and x are I(1), and wt are I(0), then ε is I(0) if zt = yt−1 − θxt−1 is

I(0).

• Because θ is a function of the ARDL(1,1) parameters, a cointegrating rela-

tionship between the unrestricted ARDL parameters must hold for ε to be

stationary.

• If such a relationship DOES hold, then ε will be stationary and we can es-

timate both the ARDL(1,1) form and the ECM form in a standard fashion

(OLS, NLS, with standard sampling distributions for the parameter estimates)

WITHOUT having to difference all the data.

• If a cointegrating relationship does NOT hold, then the disturbance process is not covariance stationary, and therefore the parameter estimates do not have standard sampling distributions.

Note that when there is a cointegrating relationship, the regression above:

\[ \Delta y_t = \alpha' w_t + \beta_0 \Delta x_t + \gamma^* z_t + \varepsilon_t \]

is a regression of the I(0) variable ∆yt on other I(0) variables.

Generally: If an ARDL(p,r) can be reparameterized as an ECM model with I(0) variables, then the parameters on those I(0) variables can be estimated consistently with OLS applied to the original ARDL(p,r) model, and the t-statistics on these parameter estimates are asymptotically standard normal. If in a reformulated (ECM) regression only a subset of the parameters are associated with I(0) variables, this subset of parameter estimates has standard sampling distributions. The others don't.

The next question: How do we know if a cointegrating relationship exists be-

tween the two variables?


3.5.3 Testing for cointegration

There are three approaches to testing for cointegrating vectors: single-equation approaches and a multiple-equation approach. We begin with the single-equation approach and discuss the multiple-equation approach in the context of VARs.

Single-equation cointegration test. If two (or more) series are cointegrated, then the errors in a regression of one on the others will be a disturbance series that is I(0). This is the basis of the Engle-Granger cointegration test. Proceed as follows:

• Calculate a DF test statistic based on the residuals from your hypothesized regression. That is, run the regression $\Delta \hat{\varepsilon}_t = \gamma^* \hat{\varepsilon}_{t-1} + v_t$, and calculate $\hat{\gamma}^* / \mathrm{se}(\hat{\gamma}^*)$.

• This test statistic does not have the same distribution as the usual DF test

statistic. You need to compare it to a different set of critical values developed

by Davidson and MacKinnon (1993) (Not shown in Greene).

• If a unit root in the residuals is not rejected, then there is no evidence of cointegration, and inference relating to model parameters based on that model is suspect.

• Differencing of one thing or another is likely called for.
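The two steps of the test can be sketched as follows. This is a minimal illustration on simulated data: the function name `engle_granger_stat`, the seed, and the data-generating values are hypothetical, the DF regression has no augmentation lags, and the code only computes the statistic. It must still be compared with the appropriate Engle-Granger critical values, which are not reproduced here.

```python
import numpy as np

def engle_granger_stat(y, x):
    """Step 1: OLS of y on a constant and x.
    Step 2: DF regression  d(e_t) = gamma * e_{t-1} + v_t  on the residuals.
    Returns gamma_hat / se(gamma_hat).
    """
    X = np.column_stack([np.ones_like(x), x])
    b, *_ = np.linalg.lstsq(X, y, rcond=None)
    e = y - X @ b                         # residuals from the hypothesized regression
    de, e_lag = np.diff(e), e[:-1]
    gamma_hat = (e_lag @ de) / (e_lag @ e_lag)
    resid = de - gamma_hat * e_lag
    s2 = (resid @ resid) / (len(de) - 1)
    se = np.sqrt(s2 / (e_lag @ e_lag))
    return gamma_hat / se

rng = np.random.default_rng(2)
x = np.cumsum(rng.normal(size=500))              # an I(1) random walk
y_coint = 2.0 + 0.5 * x + rng.normal(size=500)   # cointegrated with x by construction
y_indep = np.cumsum(rng.normal(size=500))        # an unrelated random walk

stat_coint = engle_granger_stat(y_coint, x)
stat_indep = engle_granger_stat(y_indep, x)
print(stat_coint, stat_indep)
```

The statistic for the cointegrated pair is far more negative than for the unrelated pair, which is the pattern the test looks for.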

3.6 Vector Autoregression, VAR

A VAR can be thought of as a reduced form for a system of dynamic equations.

Usefulness of the VAR framework:

• Forecasting

• Testing Granger Causality

• characterizing the time path of effects of shocks (impulse response).


Figure 2: Engle-Granger critical Values

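For two endogenous variables and known lag-length of p = 2, the VAR is a two-equation model structured as:

\[ y_{1t} = \mu_1 + \delta_{111} y_{1,t-1} + \delta_{112} y_{2,t-1} + \delta_{211} y_{1,t-2} + \delta_{212} y_{2,t-2} + \varepsilon_{1t} \]

\[ y_{2t} = \mu_2 + \delta_{121} y_{1,t-1} + \delta_{122} y_{2,t-1} + \delta_{221} y_{1,t-2} + \delta_{222} y_{2,t-2} + \varepsilon_{2t} \]

or

\[ \begin{bmatrix} y_{1t} \\ y_{2t} \end{bmatrix} = \begin{bmatrix} \mu_1 \\ \mu_2 \end{bmatrix} + \begin{bmatrix} \delta_{111} & \delta_{112} \\ \delta_{121} & \delta_{122} \end{bmatrix} \begin{bmatrix} y_{1,t-1} \\ y_{2,t-1} \end{bmatrix} + \begin{bmatrix} \delta_{211} & \delta_{212} \\ \delta_{221} & \delta_{222} \end{bmatrix} \begin{bmatrix} y_{1,t-2} \\ y_{2,t-2} \end{bmatrix} + \begin{bmatrix} \varepsilon_{1t} \\ \varepsilon_{2t} \end{bmatrix} \]

or

\[ y_t = \mu + \Gamma_1 y_{t-1} + \Gamma_2 y_{t-2} + \varepsilon_t \]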


where $\delta_{jml}$ is the coefficient for the jth lag in the mth equation on the lth endogenous variable.

Estimation: VARs are systems of regression equations with interrelated errors, so SUR seems appropriate. However, because every equation has the same regressors and there are no cross-equation restrictions, SUR is mathematically equivalent to OLS equation by equation.
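Equation-by-equation OLS for a VAR(p) can be sketched in a few lines. The helper `fit_var`, the seed, and the true coefficient matrix below are hypothetical and serve only to illustrate the estimator on simulated data.

```python
import numpy as np

def fit_var(Y, p):
    """OLS equation by equation for a VAR(p). Y is a (T x m) array.

    Returns mu (length m) and a list [Gamma_1, ..., Gamma_p] of (m x m) matrices,
    where Gamma_j[k, l] is the coefficient in equation k on lag j of variable l.
    """
    T, m = Y.shape
    # Common regressor matrix for every equation: [1, y_{t-1}', ..., y_{t-p}']
    X = np.column_stack([np.ones(T - p)] + [Y[p - i:T - i] for i in range(1, p + 1)])
    B, *_ = np.linalg.lstsq(X, Y[p:], rcond=None)   # one column of B per equation
    mu = B[0]
    Gammas = [B[1 + (i - 1) * m: 1 + i * m].T for i in range(1, p + 1)]
    return mu, Gammas

# Simulate a stable VAR(1) with hypothetical coefficients and re-estimate it
rng = np.random.default_rng(3)
Gamma_true = np.array([[0.5, 0.1],
                       [0.2, 0.3]])
T = 2000
Y = np.zeros((T, 2))
for t in range(1, T):
    Y[t] = Gamma_true @ Y[t - 1] + rng.normal(size=2)

mu_hat, Gammas_hat = fit_var(Y, p=1)
Gamma_hat = Gammas_hat[0]
print(Gamma_hat)
```

A single least-squares call with a matrix of dependent variables solves each equation separately, which is why no system estimator is needed here.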

3.6.1 Granger Causality

yt = γyt−1 + βxt−1 + εt

• If β ≠ 0 then x Granger-causes y in the regression above.

• Generally, if xt−1 adds information to yt in addition to that added by yt−1 , x

“Granger causes” y.

• Granger causality of x on y is absent when f (yt |yt−1 , xt−1 , xt−2 , · · · ) = f (yt |yt−1 );

lagged values of x add no additional information.

• This is a statistical relationship — it does not imply causation in any sense

more general than this.

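Example 19.8 (Greene), but extended to a VAR(2,2) here: increased oil prices have preceded all but one recession since WWII. Let $y_t = \begin{bmatrix} \mathrm{GNP}_t & \mathrm{POIL}_t \end{bmatrix}'$, where POIL is the oil price.

\[ \begin{bmatrix} \mathrm{GNP}_t \\ \mathrm{POIL}_t \end{bmatrix} = \begin{bmatrix} \mu_1 \\ \mu_2 \end{bmatrix} + \begin{bmatrix} \alpha_1 & \alpha_2 \\ \beta_1 & \beta_2 \end{bmatrix} \begin{bmatrix} \mathrm{GNP}_{t-1} \\ \mathrm{POIL}_{t-1} \end{bmatrix} + \begin{bmatrix} \alpha_3 & \alpha_4 \\ \beta_3 & \beta_4 \end{bmatrix} \begin{bmatrix} \mathrm{GNP}_{t-2} \\ \mathrm{POIL}_{t-2} \end{bmatrix} + \begin{bmatrix} \varepsilon_{1t} \\ \varepsilon_{2t} \end{bmatrix} \]

If α2 = α4 = 0 then changes in oil prices do not “Granger cause” changes in GNP; otherwise they do.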

Testing for GC: H0 : α2 = α4 = 0. We can use a likelihood ratio test based on the restricted and unrestricted regressions of the first equation (GNP) alone; there is no need to estimate the second equation for this test. Under H0 the test statistic is asymptotically distributed $\chi^2(J = 2)$.
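A sketch of this likelihood ratio test on simulated data follows. The series, seed, and function names are hypothetical (they are not the GNP and oil price data). The statistic is n·ln(SSR_r / SSR_u) from the restricted and unrestricted regressions of one equation.

```python
import numpy as np

def ssr(X, y):
    """Sum of squared OLS residuals."""
    b, *_ = np.linalg.lstsq(X, y, rcond=None)
    e = y - X @ b
    return e @ e

def granger_lr(y, x, p=2):
    """LR statistic for H0: the p lags of x have zero coefficients in the y equation."""
    T = len(y)
    lags_y = [y[p - i:T - i] for i in range(1, p + 1)]
    lags_x = [x[p - i:T - i] for i in range(1, p + 1)]
    const = np.ones(T - p)
    X_r = np.column_stack([const] + lags_y)            # restricted: own lags only
    X_u = np.column_stack([const] + lags_y + lags_x)   # unrestricted: adds lags of x
    n = T - p
    return n * np.log(ssr(X_r, y[p:]) / ssr(X_u, y[p:]))

# Simulated example in which x Granger-causes y by construction
rng = np.random.default_rng(4)
T = 400
x = np.zeros(T)
y = np.zeros(T)
for t in range(1, T):
    x[t] = 0.5 * x[t - 1] + rng.normal()
    y[t] = 0.3 * y[t - 1] + 0.5 * x[t - 1] + rng.normal()

lr_x_to_y = granger_lr(y, x, p=2)
CRIT_5PCT = 5.991   # 5% critical value of a chi-square with 2 degrees of freedom
print(lr_x_to_y, lr_x_to_y > CRIT_5PCT)
```

Only the y equation is estimated, matching the remark that the second equation is not needed for this test.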

3.6.2 Impulse Response Functions

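Impulse response functions track the effect of a single shock (from one or more disturbance terms εi ) on the time path of y, relative to its equilibrium values, after the shock.

\[ \begin{bmatrix} y_{1t} \\ y_{2t} \end{bmatrix} = \begin{bmatrix} \mu_1 \\ \mu_2 \end{bmatrix} + \begin{bmatrix} \delta_{111} & \delta_{112} \\ \delta_{121} & \delta_{122} \end{bmatrix} \begin{bmatrix} y_{1,t-1} \\ y_{2,t-1} \end{bmatrix} + \begin{bmatrix} \delta_{211} & \delta_{212} \\ \delta_{221} & \delta_{222} \end{bmatrix} \begin{bmatrix} y_{1,t-2} \\ y_{2,t-2} \end{bmatrix} + \begin{bmatrix} \varepsilon_{1t} \\ \varepsilon_{2t} \end{bmatrix} \]

or, in general,

\[ \underset{(m \times 1)}{y_t} = \underset{(m \times 1)}{\mu} + \underset{(m \times m)}{\Gamma_1} \, \underset{(m \times 1)}{y_{t-1}} + \cdots + \underset{(m \times m)}{\Gamma_p} \, \underset{(m \times 1)}{y_{t-p}} + \underset{(m \times 1)}{v_t} \]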

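For forecasting we can use the same Kalman filter arrangement as with the ARDL model before: Recast the general model $y_t = \mu + \sum_{i=1}^{p} \Gamma_i y_{t-i} + v_t$ as

\[ \begin{bmatrix} y_t \\ y_{t-1} \\ \vdots \\ y_{t-p+1} \end{bmatrix} = \begin{bmatrix} \mu \\ 0 \\ \vdots \\ 0 \end{bmatrix} + \begin{bmatrix} \Gamma_1 & \Gamma_2 & \cdots & \Gamma_p \\ I & 0 & \cdots & 0 \\ \vdots & \ddots & \cdots & 0 \\ 0 & \cdots & I & 0 \end{bmatrix} \begin{bmatrix} y_{t-1} \\ y_{t-2} \\ \vdots \\ y_{t-p} \end{bmatrix} + \begin{bmatrix} v_t \\ 0 \\ \vdots \\ 0 \end{bmatrix} \]

or

\[ \tilde{y}_t = \tilde{\mu} + \Gamma(L) \tilde{y}_t + v_t, \]

with $\Gamma(L) = \Gamma L$, where Γ is the stacked coefficient matrix above.

In the original (unstacked) notation with p = 2, let $\Gamma(L) = \Gamma_1 L + \Gamma_2 L^2$, so that $y_t = \mu + \Gamma_1 y_{t-1} + \Gamma_2 y_{t-2} + v_t$ can be written as

\[ y_t = \mu + \Gamma(L) y_t + v_t. \]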


Assuming a stable system (and leaving out the tildes),

\[ [I - \Gamma(L)] y_t = \mu + v_t \]

\[ y_t = [I - \Gamma(L)]^{-1} (\mu + v_t) \]

\[ = [I - \Gamma(L)]^{-1} \mu + \sum_{i=0}^{\infty} \Gamma^i v_{t-i} \]

\[ = \bar{y} + \sum_{i=0}^{\infty} \Gamma^i v_{t-i} \]

\[ = \bar{y} + [I - \Gamma(L)]^{-1} v_t \]

Note: for y to be stationary, we need $[I - \Gamma(L)]$ to be invertible. For this, all eigenvalues of Γ must be less than one in absolute value (i.e. in modulus), whether the eigenvalues are real or complex. Note, the modulus of a complex number $h + vi$ is $R = \sqrt{h^2 + v^2}$.

What we are interested in is how a one-time shock flows through to the $y_{i,t+j}$. In general, a set of impulse response functions and its covariance matrix are calculated as

\[ \hat{y}_{T+s} = \bar{y} + \Gamma^s v_T \]

\[ \hat{\Sigma}_{T+s} = \sum_{i=0}^{s-1} \Gamma^i \Omega (\Gamma')^i \]

where $\Gamma^i$ is Γ to the ith power.

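Example: Suppose a first order VAR with µ = 0 in both equations.

\[ \begin{bmatrix} y_{1t} \\ y_{2t} \end{bmatrix} = \begin{bmatrix} 0.008 & 0.461 \\ 0.232 & 0.297 \end{bmatrix} \begin{bmatrix} y_{1,t-1} \\ y_{2,t-1} \end{bmatrix} + \begin{bmatrix} v_{1t} \\ v_{2t} \end{bmatrix}; \quad \Omega = \mathrm{Cov}[v_{1t}, v_{2t}] = \begin{bmatrix} 1 & .5 \\ .5 & 2 \end{bmatrix} \]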


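Now, suppose a one unit change in v2t at t=0, such that v20 = 1. Then

\[ \begin{bmatrix} y_{10} \\ y_{20} \end{bmatrix} = \begin{bmatrix} 0 \\ 1 \end{bmatrix} \]

\[ \begin{bmatrix} y_{11} \\ y_{21} \end{bmatrix} = \begin{bmatrix} 0.008 & 0.461 \\ 0.232 & 0.297 \end{bmatrix} \begin{bmatrix} y_{10} \\ y_{20} \end{bmatrix} = \begin{bmatrix} 0.461 \\ 0.297 \end{bmatrix} \]

\[ \begin{bmatrix} y_{12} \\ y_{22} \end{bmatrix} = \begin{bmatrix} 0.008 & 0.461 \\ 0.232 & 0.297 \end{bmatrix} \begin{bmatrix} y_{11} \\ y_{21} \end{bmatrix} = \begin{bmatrix} 0.008 & 0.461 \\ 0.232 & 0.297 \end{bmatrix}^2 \begin{bmatrix} y_{10} \\ y_{20} \end{bmatrix} = \begin{bmatrix} 0.141 \\ 0.195 \end{bmatrix} \]

The covariance estimate for the two-period ahead impulse response is

\[ \hat{\Sigma}_2 = \begin{bmatrix} 1 & .5 \\ .5 & 2 \end{bmatrix} + \begin{bmatrix} 0.008 & 0.461 \\ 0.232 & 0.297 \end{bmatrix} \begin{bmatrix} 1 & .5 \\ .5 & 2 \end{bmatrix} \begin{bmatrix} 0.008 & 0.461 \\ 0.232 & 0.297 \end{bmatrix}' \]

which you can use for estimating a confidence interval.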
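The numbers in this example can be reproduced with a short script. The matrices are those given in the example; the function names `impulse_response` and `response_cov` are hypothetical helpers written for this sketch.

```python
import numpy as np

Gamma = np.array([[0.008, 0.461],
                  [0.232, 0.297]])
Omega = np.array([[1.0, 0.5],
                  [0.5, 2.0]])

def impulse_response(Gamma, shock, steps):
    """Return the path [Gamma^0 shock, Gamma^1 shock, ..., Gamma^steps shock]."""
    path = [np.asarray(shock, dtype=float)]
    for _ in range(steps):
        path.append(Gamma @ path[-1])
    return np.array(path)

def response_cov(Gamma, Omega, s):
    """Compute sum_{i=0}^{s-1} Gamma^i Omega (Gamma')^i."""
    total = np.zeros_like(Omega)
    term = Omega.copy()
    for _ in range(s):
        total += term
        term = Gamma @ term @ Gamma.T
    return total

irf = impulse_response(Gamma, [0.0, 1.0], steps=2)   # unit shock to v2 at t = 0
Sigma2 = response_cov(Gamma, Omega, s=2)             # Omega + Gamma Omega Gamma'
stable = bool(np.all(np.abs(np.linalg.eigvals(Gamma)) < 1))

print(np.round(irf, 3))
print(np.round(Sigma2, 3), stable)
```

The eigenvalue check confirms that this system is stable, so the responses die out as the horizon grows.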


3.6.3 Estimation of nonstationary cointegrated variables with VAR

VAR can be applied to a set of variables that are nonstationary and cointegrated (see Davidson and MacKinnon section 14.5).

Consider the VAR with g endogenous variables Yt :

\[ Y_t = X_t B + \sum_{i=1}^{p+1} Y_{t-i} \Phi_i + U_t \]

where Yt are I(1), Xt are assumed I(0) deterministic variables, and B and Φi are matrices of parameters to be estimated. This VAR can be reparameterized as

\[ \Delta Y_t = X_t B + Y_{t-1} \Pi + \sum_{i=1}^{p} \Delta Y_{t-i} \Gamma_i + U_t \]

where $\Gamma_p = -\Phi_{p+1}$, $\Gamma_i = \Gamma_{i+1} - \Phi_{i+1}$ for $i = 1, \ldots, p-1$, and $\Pi = \sum_{i=1}^{p+1} \Phi_i - I_g$. This is the multivariate analogue of an augmented Dickey-Fuller test regression.
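The reparameterization can be verified numerically. In the sketch below the Φ matrices and lagged values are random, hypothetical inputs, the deterministic term and the disturbance are left out, and Y is treated as a row vector as in the equations above. The helper `vecm_from_var` uses the closed form Γi = −(Φi+1 + ... + Φp+1), which is equivalent to the recursion Γp = −Φp+1, Γi = Γi+1 − Φi+1.

```python
import numpy as np

def vecm_from_var(Phis):
    """Map VAR matrices [Phi_1, ..., Phi_{p+1}] to (Pi, [Gamma_1, ..., Gamma_p])."""
    g = Phis[0].shape[0]
    Pi = sum(Phis) - np.eye(g)
    # Phis[i] holds Phi_{i+1}, so Phis[i:] sums Phi_{i+1} through Phi_{p+1}
    Gammas = [-sum(Phis[i:]) for i in range(1, len(Phis))]
    return Pi, Gammas

rng = np.random.default_rng(5)
g = 2
Phis = [rng.normal(scale=0.3, size=(g, g)) for _ in range(3)]   # p + 1 = 3
Pi, Gammas = vecm_from_var(Phis)

# Arbitrary lagged values Y_{t-1}, Y_{t-2}, Y_{t-3} as (1 x g) row vectors
lags = [rng.normal(size=(1, g)) for _ in range(3)]
Y_t = sum(l @ P for l, P in zip(lags, Phis))

lhs = Y_t - lags[0]                                   # Delta Y_t from the levels VAR
rhs = lags[0] @ Pi + sum((lags[i] - lags[i + 1]) @ Gammas[i] for i in range(2))
print(np.allclose(lhs, rhs), np.linalg.matrix_rank(Pi))
```

For randomly drawn Φ the matrix Π is generically full rank; reduced rank, and hence cointegration, requires restrictions on the Φ matrices.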

If the original variables Y are I(1) and the deterministic variables X are I(0), then the rank r ≤ g of Π is equal to the number of cointegrating vectors. If r = 0, there are no cointegrating vectors. r = g implies that all Y are stationary.

Note that the estimated value of Π will always be full rank (unless there is perfect collinearity in the data to begin with, but then you wouldn’t be able to run a regression in the first place). The question is: can we test whether our estimates suggest one or more cointegrating vectors?

3.6.4 VARs and cointegration tests

Cointegration tests become potentially more complicated because for g variables (Greene uses M ) in a VAR with I(1) variables, there can be up to g − 1 cointegrating vectors.

The Johansen test seems to be the most popular method for testing for cointegrating vectors in VARs. We will not go into detail about estimation, but there is a brief discussion in Greene p. 656-657 and in Davidson and MacKinnon.

1. We need to determine the number r of linearly independent cointegrating vectors embedded in Π. This is done with successive tests of Ho : there are r or fewer cointegrating vectors, versus Ha : there are more than r cointegrating vectors (up to g).

2. For each r starting with zero, a trace statistic or max statistic is calculated, depending on the approach. These statistics have nonstandard asymptotic distributions that depend on g − r, so compare them with the tabulated critical values. Big statistic ⇒ reject Ho , and move on to next r. When the statistic < critical value, do not reject the null.

Note that r > 1 implies more than one possible long-run relationship repre-

sented by a number of possible parameterizations. This is similar to the case of

an overidentified structural equation. Indeed, a VAR is in effect a reduced form of

a dynamic structural equation. To identify which cointegrating relationship holds

requires out-of-sample structural information (as in structural models themselves).

3.6.5 Structural VARs

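A VAR $y_t = \mu + \Gamma y_{t-1} + v_t$ can be seen as a reduced form of the structural model

\[ \Theta y_t = \alpha + \Phi y_{t-1} + \varepsilon_t \]

where $\Gamma = \Theta^{-1} \Phi$, $\mu = \Theta^{-1} \alpha$, $v_t = \Theta^{-1} \varepsilon_t$, and $\mathrm{Cov}[v_t] = [\Theta^{-1}] \Sigma [\Theta^{-1}]'$ with $\Sigma = \mathrm{Cov}[\varepsilon_t]$.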

Thus, we are simply back to simultaneous equation systems, but with the issues of

dynamics and simultaneity combined.

Example: Suppose that

\[ \Theta = \begin{bmatrix} 1 & -\theta_{12} \\ -\theta_{21} & 1 \end{bmatrix}. \]


Then we have a dynamic simultaneous equations problem, with all the lagged

dep. vars. being predetermined and therefore, for our purposes, exogenous.
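The link between the structural and reduced forms can be sketched by premultiplying the structural model by the inverse of Θ. All numerical values below are hypothetical and chosen only for illustration; Θ has the two-equation form shown in the example.

```python
import numpy as np

theta12, theta21 = 0.4, 0.2                 # hypothetical contemporaneous effects
Theta = np.array([[1.0, -theta12],
                  [-theta21, 1.0]])
alpha = np.array([1.0, 0.5])                # hypothetical structural intercepts
Phi = np.array([[0.5, 0.0],
                [0.1, 0.3]])                # hypothetical structural lag coefficients
Sigma = np.diag([1.0, 2.0])                 # hypothetical Cov of structural errors

Theta_inv = np.linalg.inv(Theta)
mu = Theta_inv @ alpha                      # reduced-form intercepts
Gamma = Theta_inv @ Phi                     # reduced-form lag coefficients
Omega = Theta_inv @ Sigma @ Theta_inv.T     # Cov of reduced-form errors v_t

print(mu, Gamma, Omega, sep="\n")
```

Even with uncorrelated structural errors, the reduced-form errors are correlated whenever Θ has nonzero off-diagonal elements. Going the other way, from (µ, Γ, Ω) back to the structural parameters, requires identifying restrictions.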

Hsiao (1997) shows that if you have nonstationarity but cointegrating relation-

ships in your model, then 2SLS and 3SLS can proceed as usual to address endo-

geneity.
