All Tutorials Dynamic Econometrics
Week 1
Theory exercises
1. (W1V2) Consider a stochastic process X_t. We defined a general linear filter as
τ_t = \sum_{j=−∞}^{∞} a_j X_{t+j}.
Find the coefficients a_j for the exponential smoothing filter
τ_t = b X_t + (1 − b)τ_{t−1},  τ_1 = X_1.
We conclude that a_j = 0 for j < −(t − 1) and j > 0. For j = −(t − 1), we have a_{−(t−1)} = (1 − b)^{t−1}, and for −(t − 2) ≤ j ≤ 0, a_j = b(1 − b)^{−j}. The last expression is why the filter is referred to as "exponential smoothing": the weight on past observations decreases exponentially the further the observation lies in the past. Also note that it is a one-sided filter, and that it does not use observations after time t.
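As a quick numerical check of these weights, the following sketch (with an arbitrary simulated series and an illustrative smoothing parameter b = 0.3) compares the recursion with the explicit filter weights derived above:

    import numpy as np

    rng = np.random.default_rng(0)
    b, t = 0.3, 8                      # smoothing parameter and the time index we inspect
    x = rng.normal(size=t)             # X_1, ..., X_t stored as x[0], ..., x[t-1]

    # Recursive definition: tau_1 = X_1, tau_s = b*X_s + (1-b)*tau_{s-1}
    tau = x[0]
    for s in range(1, t):
        tau = b * x[s] + (1 - b) * tau

    # Explicit weights: a_{-(t-1)} = (1-b)^{t-1}, a_j = b(1-b)^{-j} for -(t-2) <= j <= 0
    weights = np.array([(1 - b) ** (t - 1)] + [b * (1 - b) ** k for k in range(t - 2, -1, -1)])
    tau_filter = weights @ x

    print(tau, tau_filter)             # the two numbers coincide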
2. (W1V3) Write in lag operator notation
Y_t = 0.1Y_{t−1} + 0.3Y_{t−3} + ε_t,
Y_t = 0.1Y_{t−1} − 0.3Y_{t−2} + ε_t,
Y_t = 0.5Y_{t−2} + ε_t.
The lag operator L is such that LY_t = Y_{t−1} and L^k Y_t = Y_{t−k}. Using these properties, we see that
(1 − 0.1L − 0.3L^3)Y_t = ε_t,
(1 − 0.1L + 0.3L^2)Y_t = ε_t,
(1 − 0.5L^2)Y_t = ε_t.
3. (W1V3) Consider a stochastic process Y_t. Define ∆ = 1 − L, where L is the lag operator we discussed in the lecture. Show that ∆^2 ≠ 1 − L^2.
You can repeatedly apply the operator ∆ and show that this is not the same as applying the operator 1 − L^2, i.e.
∆^2 Y_t = (1 − L)(1 − L)Y_t = (1 − 2L + L^2)Y_t = Y_t − 2Y_{t−1} + Y_{t−2},
whereas (1 − L^2)Y_t = Y_t − Y_{t−2}.
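A one-line numerical illustration of the difference (the series y_t = t^2 is only an example):

    import numpy as np

    y = np.arange(10.0) ** 2           # any series will do; here y_t = t^2

    d2y = np.diff(y, n=2)              # Delta^2 y_t = y_t - 2 y_{t-1} + y_{t-2}
    l2y = y[2:] - y[:-2]               # (1 - L^2) y_t = y_t - y_{t-2}

    print(d2y[:4])                     # [2. 2. 2. 2.]  (second difference of t^2 is constant)
    print(l2y[:4])                     # [4. 8. 12. 16.]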
Week 2
Theory exercises
1. (W2V0) Show that for a covariance stationary process, the autocovariance function satisfies γ_j = γ_{−j}.
This is almost immediate from the definition of covariance stationarity, which tells us that
γ_{j,t} = γ_j.   (1)
The definition of the jth autocovariance is γ_{j,t} = E[(Y_t − µ)(Y_{t−j} − µ)]. Hence,
γ_j = γ_{j,t} = E[(Y_t − µ)(Y_{t−j} − µ)] = E[(Y_{t−j} − µ)(Y_t − µ)] = γ_{−j,t−j} = γ_{−j}.
2. (W2V0) Consider the random walk process Y_t = \sum_{j=1}^{t} ε_j. Assume that E[ε_t | Y_{t−1}] = 0 for t ≥ 1. What is E[Y_t | Y_{t−1}]? Is this different from E[Y_t]?
The unconditional expectation can be calculated using linearity of the expectation operator,
E[Y_t] = \sum_{j=1}^{t} E[ε_j] = \sum_{j=1}^{t} E[E[ε_j | Y_{j−1}]] = 0.
To obtain the conditional expectation E[Y_t | Y_{t−1}], we write the random walk process as the following recursion
Y_t = Y_{t−1} + ε_t,  Y_1 = ε_1.   (5)
Taking the conditional expectation left and right, we have
E[Y_t | Y_{t−1}] = Y_{t−1} + E[ε_t | Y_{t−1}] = Y_{t−1}.
We see that the unconditional and conditional expectation are different. E[Y_t | Y_{t−1}] = Y_{t−1} is a random variable, while E[Y_t] = 0, which is a constant.
(a) Recursive substitution of the random walk gives
Y_t = Y_{t−j} + \sum_{s=t−j+1}^{t} ε_s.   (10)
(b) Consider the process
X_t = \frac{1}{\sqrt{t}} Y_t.
Is this process covariance stationary? Hint: for the calculation of the autocovariance, you can use the result from (a).
This is closely related to the nonstationarity of the random walk process. We know (see lecture slides) that for a random walk process,
E[Y_t] = 0,  var[Y_t] = tσ^2.   (11)
Since for any constant a, we have E[aX] = aE[X] and var(aX) = a^2 var(X), it follows that
E[X_t] = 0,  var[X_t] = σ^2.   (12)
For the autocovariance, using the result from (a),
γ_{j,t} = \frac{1}{\sqrt{t(t−j)}} E[Y_{t−j}^2] + \frac{1}{\sqrt{t(t−j)}} \sum_{s=t−j+1}^{t} E[Y_{t−j} ε_s] = \frac{t−j}{\sqrt{t(t−j)}} σ^2 + 0.   (14)
The first term uses the fact that the variance of the random walk process at time t − j is (t − j)σ^2. For the second term, we use that Y_{t−j} = \sum_{k=1}^{t−j} ε_k to see that E[Y_{t−j} ε_s] = \sum_{k=1}^{t−j} E[ε_k ε_s], where k < s. Since ε_t is WN, E[ε_k ε_s] = 0 if k ≠ s. This shows that the second term of (14) is equal to zero. Since the first term of (14) depends on t, the process is not covariance stationary.
4. (W2V0) Suppose Y_t = X_t + η_t with η_t a Gaussian white noise process with variance σ_η^2, and where X_t is a covariance stationary stochastic process with autocovariance function γ_j^X. You can assume that cov(X_t, η_s) = 0 for all (t, s).
Since η_t is GWN, E[Y_t] = E[X_t]. Then, for the autocovariances, γ_0^Y = γ_0^X + σ_η^2 and γ_j^Y = γ_j^X for j ≠ 0, since η_t is uncorrelated over time and uncorrelated with X_s.
(b) Write down the long run variance of X_t and relate it to the long-run variance of Y_t.
The long-run variance is defined as
γ_{LR}^X = γ_0^X + 2 \sum_{j=1}^{∞} γ_j^X.   (16)
Using the autocovariances of Y_t derived above,
γ_{LR}^Y = γ_0^Y + 2 \sum_{j=1}^{∞} γ_j^Y = γ_{LR}^X + σ_η^2.
The unconditional mean is E[Y_t] = µ. For j > 1, γ_j = 0.
The characteristic equation is
1 + θ_1 z = 0,
which has the root z = −1/θ_1. The process is invertible if the root is outside the unit circle, so if |θ_1| < 1.
You can use the MA(1) process as given in the question to show that
γ_0 = σ^2(1 + θ_1^2),  γ_1 = σ^2 θ_1,  γ_j = 0 for j ≥ 2.
(c) Suppose the root of the characteristic equation of the MA(1) process is inside the unit circle. Find an equivalent representation to the MA(1) process above (in the sense of having the same autocovariance function), which has its root outside the unit circle.
Define an MA(1) process Y_t = µ + ε̃_t + θ̃_1 ε̃_{t−1} where ε̃_t has variance σ̃^2. Define θ̃_1 = 1/θ_1 and σ̃^2 = σ^2 θ_1^2. Then,
γ̃_0 = σ̃^2(1 + θ̃_1^2) = σ^2 θ_1^2 (1 + 1/θ_1^2) = σ^2(1 + θ_1^2) = γ_0,
γ̃_1 = σ̃^2 θ̃_1 = σ^2 θ_1 = γ_1,
so the autocovariance function is the same. Since we have an MA(1) process, the fact that the root of the characteristic equation in the original parametrization is inside the unit circle implies that |θ_1| > 1. Since θ̃_1 = 1/θ_1, we have |θ̃_1| < 1 and the root of the characteristic equation after reparametrization is outside of the unit circle.
The long-run variance of the MA(1) process is
γ_{LR} = σ^2(1 + θ_1)^2.
You can verify that for σ̃^2 = σ^2 θ_1^2 and θ̃_1 = 1/θ_1, we have
γ̃_{LR} = σ̃^2(1 + θ̃_1)^2 = σ^2 θ_1^2(1 + 1/θ_1)^2 = σ^2(1 + θ_1)^2 = γ_{LR},
so the long-run variance is also unchanged.
8. (W2V2) Let ε_t be WN with variance σ^2 = 1, and consider the MA(2) process Y_t = ε_t + 2.4ε_{t−1} + 0.8ε_{t−2}. Then
E[Y_t] = 0,
γ_0 = var[Y_t] = 1 + 2.4^2 + 0.8^2,
γ_1 = 2.4 · (1 + 0.8),
γ_2 = 0.8,
γ_j = 0 if j > 2.
The characteristic equation 1 + θ_1 z + θ_2 z^2 = 0 here becomes
1 + 2.4z + 0.8z^2 = 0,
which has roots z = −0.5 and z = −2.5; since one root is inside the unit circle, the process is not invertible.
(d) Find an observationally equivalent representation of the MA process that has roots
outside the unit circle. Use that the variance of this observationally equivalent
MA(2) process is σ̃ 2 = 4.
The autocovariance function of an MA(2) process with parameters (σ̃^2, θ̃_1, θ̃_2) is
γ_0 = σ̃^2(1 + θ̃_1^2 + θ̃_2^2),
γ_1 = σ̃^2 θ̃_1 (1 + θ̃_2),   (22)
γ_2 = σ̃^2 θ̃_2.
These are three equations for three unknowns. Using that σ̃^2 = 4, we find θ̃_1 = 0.9 and θ̃_2 = 0.2. The characteristic equation is
1 + 0.9z + 0.2z^2 = 0,
which has solutions z = −2 and z = −2.5. These solutions are both outside the unit circle, and hence the MA(2) process in terms of σ̃^2, θ̃_1, and θ̃_2 is invertible.
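The following numpy sketch verifies both claims for the numbers above: the two parametrizations share the same (γ_0, γ_1, γ_2), and only the reparametrized polynomial has all roots outside the unit circle:

    import numpy as np

    def ma2_autocov(sigma2, t1, t2):
        # Autocovariances of Y_t = eps_t + t1*eps_{t-1} + t2*eps_{t-2}, Var(eps_t) = sigma2
        return np.array([sigma2 * (1 + t1**2 + t2**2),
                         sigma2 * t1 * (1 + t2),
                         sigma2 * t2])

    print(ma2_autocov(1.0, 2.4, 0.8))      # original parametrization: [7.4  4.32 0.8]
    print(ma2_autocov(4.0, 0.9, 0.2))      # reparametrized:           [7.4  4.32 0.8]

    # Roots of 1 + theta1*z + theta2*z^2 = 0 (np.roots wants the highest power first)
    print(np.roots([0.8, 2.4, 1.0]))       # roots -2.5 and -0.5: one inside the unit circle
    print(np.roots([0.2, 0.9, 1.0]))       # roots -2.5 and -2.0: both outside, invertible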
9. (W2V2) Show that the long-run variance of an MA(q) process is equal to γ_{LR} = σ^2 (\sum_{j=0}^{q} θ_j)^2.
The MA(q) process is
Y_t = µ + \sum_{j=0}^{q} θ_j ε_{t−j},
so that
E[Y_t] = µ,
γ_0 = E[(Y_t − µ)^2] = σ^2 \sum_{j=0}^{q} θ_j^2,
γ_i = E[(Y_t − µ)(Y_{t−i} − µ)] = σ^2 \sum_{j=0}^{q−i} θ_j θ_{j+i}  for i ≤ q.
We now use that a square of a sum can be rewritten in terms of a sum of squares and a sum of cross-products, as
(\sum_{j=0}^{q} θ_j)^2 = \sum_{j=0}^{q} θ_j^2 + \sum_{k ≠ m} θ_k θ_m.   (23)
To get all possible combinations where k ≠ m, consider first all combinations where there is a difference of 1 between the indices, so {θ_0θ_1, θ_1θ_0, θ_1θ_2, θ_2θ_1, . . .}. Note that all of these occur twice. You can write this as 2 \sum_{j=0}^{q−1} θ_j θ_{j+1}. Similarly, if the difference between the indices is 2, then you get the sum 2 \sum_{j=0}^{q−2} θ_j θ_{j+2}. You can continue this for differences of {3, 4, . . . , q}. Summing all these possibilities gives the second term in (23). We can therefore write
γ_{LR} = γ_0 + 2 \sum_{i=1}^{q} γ_i = σ^2 (\sum_{j=0}^{q} θ_j)^2.   (24)
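A small numerical check of this identity, with illustrative (arbitrary) MA coefficients:

    import numpy as np

    sigma2 = 1.3
    theta = np.array([1.0, 0.4, -0.25, 0.1])     # theta_0 = 1, illustrative values, q = 3

    # gamma_i = sigma2 * sum_j theta_j * theta_{j+i}, i = 0, ..., q
    gammas = [sigma2 * np.sum(theta[: len(theta) - i] * theta[i:]) for i in range(len(theta))]

    lr_from_gammas = gammas[0] + 2 * sum(gammas[1:])
    lr_closed_form = sigma2 * theta.sum() ** 2

    print(lr_from_gammas, lr_closed_form)        # identical up to rounding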
(a) What is the IRF (impulse response function) when a unit shock occurs at time t, for both the invertible and the non-invertible process?
Consider the non-invertible process. Suppose that the MA(1) process is at its steady state (µ) at t − 1. Then, at time t a shock ε_t = 1 hits. No more shocks occur after. The process Y_t then satisfies
Y_{t−1} = µ,  Y_t = µ + ε_t,  Y_{t+1} = µ + θ_1 ε_t,  Y_{t+2} = µ.
Consider the invertible process. Suppose that the MA(1) process is at its steady state (µ) at t − 1. Then, at time t a shock ε̃_t = 1 hits. No more shocks occur after. The process Y_t then satisfies
Y_{t−1} = µ,  Y_t = µ + ε̃_t,  Y_{t+1} = µ + θ̃_1 ε̃_t,  Y_{t+2} = µ.
11. (W2V3) Consider the second order difference equation yt = φ1 yt−1 + φ2 yt−2 + wt .
(a) Write this in matrix notation as a first order vector difference equation.
(b) Suppose φ1 = 0.6 and φ2 = −0.08. Is yt stable?
(a). In matrix notation,
[y_t; y_{t−1}] = [φ_1, φ_2; 1, 0] [y_{t−1}; y_{t−2}] + [w_t; 0].
(b). It was discussed in the lectures that stability requires the [1,1] element of F^j to go to zero as j → ∞, where F = [φ_1, φ_2; 1, 0]. This happens if the eigenvalues of F are smaller than one in absolute value. The eigenvalues satisfy
det(F − λI) = det[φ_1 − λ, φ_2; 1, −λ] = λ^2 − φ_1 λ − φ_2 = 0.
This is solved by
λ = \frac{φ_1 ± \sqrt{φ_1^2 + 4φ_2}}{2},
so that λ_1 = 0.4 and λ_2 = 0.2. Indeed, the eigenvalues are smaller than one in absolute value, so that y_t is stable.
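The eigenvalue computation can be checked directly:

    import numpy as np

    phi1, phi2 = 0.6, -0.08
    F = np.array([[phi1, phi2],
                  [1.0,  0.0 ]])

    eigvals = np.linalg.eigvals(F)
    print(eigvals)                         # 0.4 and 0.2
    print(np.all(np.abs(eigvals) < 1))     # True -> y_t is stable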
12. (W2V4) The Wold decomposition theorem shows that under general conditions the process Y_t can be written as an MA(∞) process. The shocks are then white noise, and in fact (this was not mentioned in the lectures) also satisfy
13. (W2V4) In the lectures, we derive the population moments of the AR(1) process via
the MA(∞) representation. Derive the mean and variance of a stable AR(1) process
Yt = c + φ1 Yt−1 + εt by invoking (1) stability, (2) covariance-stationarity, and (3)
εt = Yt − E[Yt |It−1 ], where It−1 = {Yt−1 , Yt−2 , . . .}.
Taking the expectation on the left and right-hand side of the AR(1) process, we get
E[Y_t] = c + φ_1 E[Y_{t−1}].
By covariance stationarity, E[Y_t] = E[Y_{t−1}] = µ, so µ = c/(1 − φ_1). For the variance, write Y_t − µ = φ_1(Y_{t−1} − µ) + ε_t and square both sides. The cross-product is zero via property (3) in the question and the answer to question 2. Also, by covariance stationarity E[(Y_t − µ)^2] = var[Y_t] = var[Y_{t−1}] = E[(Y_{t−1} − µ)^2], and hence
var[Y_t] = φ_1^2 var[Y_t] + σ^2  ⇒  var[Y_t] = \frac{σ^2}{1 − φ_1^2}.   (27)
14. (W2V4) Suppose a stable AR(1) is initialized by Y_0. What assumptions do you need to make on Y_0 to guarantee that E[Y_0] = E[Y_t] and var(Y_0) = var(Y_t)?
We first use recursive substitution to obtain
Y_t = c \sum_{i=0}^{t−1} φ_1^i + φ_1^t Y_0 + \sum_{i=0}^{t−1} φ_1^i ε_{t−i}.
Hence E[Y_t] = c(1 − φ_1^t)/(1 − φ_1) + φ_1^t E[Y_0], which equals E[Y_0] for all t if E[Y_0] = c/(1 − φ_1). If, in addition, Y_0 is uncorrelated with {ε_1, . . . , ε_t}, then var(Y_t) = φ_1^{2t} var(Y_0) + σ^2(1 − φ_1^{2t})/(1 − φ_1^2), which equals var(Y_0) for all t if var(Y_0) = σ^2/(1 − φ_1^2).
σ_t^2 = α_0 + α_1 η_{t−1}^2
(a) Define u_t = η_t^2 − σ_t^2. Show that E[u_t] = 0.
η_t^2 = u_t + σ_t^2 = u_t + α_0 + α_1 η_{t−1}^2.   (29)
16. (W2V5) Suppose εt is WN with variance σ 2 . Is the following AR(2) process stable?
Yt = 0.8Yt−1 − 0.3Yt−2 + εt .
For stability, the roots of the characteristic equation should be outside the unit circle. The characteristic equation of this AR(2) process is
1 − 0.8z + 0.3z^2 = 0,
which has the complex roots z = (0.8 ± i\sqrt{0.56})/0.6 ≈ 1.33 ± 1.25i, with modulus \sqrt{1/0.3} ≈ 1.83 > 1, so the process is stable. The MA(∞) coefficients follow from the recursion ψ_j = 0.8ψ_{j−1} − 0.3ψ_{j−2} with ψ_0 = 1, so the first three coefficients in the MA representation are {1, 0.8, (0.8^2 − 0.3)}.
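A short check of the roots and of the MA(∞) recursion ψ_j = 0.8ψ_{j−1} − 0.3ψ_{j−2}:

    import numpy as np

    phi1, phi2 = 0.8, -0.3

    # Stability: roots of 1 - phi1*z - phi2*z^2 = 0 (np.roots wants the highest power first)
    roots = np.roots([-phi2, -phi1, 1.0])
    print(np.abs(roots))                   # both moduli ~1.83 > 1, so the AR(2) is stable

    # MA(infinity) coefficients from psi_j = phi1*psi_{j-1} + phi2*psi_{j-2}
    psi = [1.0, phi1]
    for _ in range(8):
        psi.append(phi1 * psi[-1] + phi2 * psi[-2])
    print(psi[:3])                         # approximately [1.0, 0.8, 0.34]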
Week 3
Theory exercises
1. (W3V1) Write the ARMA(1,1) model in MA(∞) form and find the MA coefficients.
The ARMA(1,1) model is given by
(1 − φ_1 L)Y_t = c + (1 + θ_1 L)ε_t.
To get rid of the AR part, multiply this from the left by (ψ_0 + ψ_1 L + ψ_2 L^2 + ψ_3 L^3 + . . .). We should have
(ψ_0 + ψ_1 L + ψ_2 L^2 + ψ_3 L^3 + . . .)(1 − φ_1 L) = 1,
which gives ψ_j = φ_1^j. Applying this polynomial to both sides,
Y_t = \frac{c}{1 − φ_1} + (1 + φ_1 L + φ_1^2 L^2 + . . .)(1 + θ_1 L)ε_t = \frac{c}{1 − φ_1} + ε_t + \sum_{i=1}^{∞} ξ_i ε_{t−i},
where ξ_i = φ_1^{i−1}(φ_1 + θ_1).
2. (W3V1) Write the ARMA(1,1) model in AR(∞) form and find the AR coefficients.
The ARMA(1,1) model is given by
(1 − φ_1 L)Y_t = c + (1 + θ_1 L)ε_t.
To get rid of the MA part, multiply this from the left by (ψ_0 + ψ_1 L + ψ_2 L^2 + ψ_3 L^3 + . . .). We should have
(ψ_0 + ψ_1 L + ψ_2 L^2 + ψ_3 L^3 + . . .)(1 + θ_1 L) = 1,
which gives ψ_j = (−θ_1)^j. Applying this polynomial to both sides,
(1 − (φ_1 + θ_1)L + θ_1(φ_1 + θ_1)L^2 − θ_1^2(φ_1 + θ_1)L^3 + . . .)Y_t = \frac{c}{1 + θ_1} + ε_t,
so Y_t = \frac{c}{1 + θ_1} + (φ_1 + θ_1)\sum_{j=1}^{∞} (−θ_1)^{j−1} Y_{t−j} + ε_t.
3. (W3V1) Show that an ARMA(1,1) model with θ_1 = −φ_1 is white noise.
From the answers above, we see that if θ_1 = −φ_1, we obtain Y_t = c + ε_t, so this is white noise (plus a drift term c).
4. (W3V1) Specify the order (p and q) of the following ARMA processes and determine
whether they are stable and/or invertible.
Yt + 0.19Yt−1 − 0.45Yt−2 = εt ,
Yt + 1.99Yt−1 + 0.88Yt−2 = εt + 0.2εt−1 + 0.8εt−2 ,
Yt + 0.6Yt−2 = εt + 1.2εt−1 .
These are: an AR(2) model, an ARMA(2,2) model, and an ARMA(2,1) model.
For stability, we need the roots of the characteristic equation of the AR part to be outside of the unit circle.
Process 1: 1 + 0.19z − 0.45z^2 = 0. Roots: [1.72, −1.29], both outside the unit circle, so stable.
Process 2: 1 + 1.99z + 0.88z^2 = 0. Roots: [−0.75, −1.51]; one root is inside the unit circle, so not stable.
Process 3: 1 + 0.6z^2 = 0. Roots: [±1.29i], both outside the unit circle, so stable.
For invertibility, the roots of the characteristic equation of the MA part have to be outside the unit circle.
Process 1: No MA part, so trivially invertible.
Process 2: 1 + 0.2z + 0.8z^2 = 0. Roots: [−0.125 + 1.111i, −0.125 − 1.111i], modulus ≈ 1.12 > 1, so invertible.
Process 3: 1 + 1.2z = 0. Root: [−0.8333], inside the unit circle, so not invertible.
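All of these roots can be verified with np.roots (polynomials are entered with the highest power first):

    import numpy as np

    # Characteristic polynomials written as 1 + a1*z + a2*z^2
    ar_parts = {1: [-0.45, 0.19, 1.0], 2: [0.88, 1.99, 1.0], 3: [0.6, 0.0, 1.0]}
    ma_parts = {2: [0.8, 0.2, 1.0], 3: [1.2, 1.0]}

    for name, poly in ar_parts.items():
        print("AR part of process", name, np.abs(np.roots(poly)))
    for name, poly in ma_parts.items():
        print("MA part of process", name, np.abs(np.roots(poly)))
    # Stable/invertible whenever all printed moduli exceed 1.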
5. (W3V1) Suppose Y_t follows an ARMA(2,1) process. Derive the first four coefficients in the AR(∞) representation of this process.
The ARMA(2,1) process is written as
(1 − φ_1 L − φ_2 L^2)Y_t = (1 + θ_1 L)ε_t,
and the AR(∞) representation as
(1 − π_1 L − π_2 L^2 − . . .)Y_t = ε_t.
We can now find the coefficients π_i by multiplying the second equation by (1 + θ_1 L) and matching the left-hand side to that of the first equation. We find
L^1:  θ_1 − π_1 = −φ_1,
L^2:  −θ_1 π_1 − π_2 = −φ_2,
L^3:  −π_3 − π_2 θ_1 = 0,
L^4:  −π_4 − π_3 θ_1 = 0,
so π_1 = φ_1 + θ_1, π_2 = φ_2 − θ_1 π_1, π_3 = −θ_1 π_2, and π_4 = −θ_1 π_3.
6. (W3V1) Consider the process
Y t = X t + εt ,
Xt = φ1 Xt−1 + ηt ,
where εt and ηt are strictly white noise with variance σε2 and ση2 respectively. Also, εt
is independent of ηt .
For the case φ_1 = 1 (so that X_t is a random walk), we have
∆Y_t = Y_t − Y_{t−1} = X_t − X_{t−1} + ε_t − ε_{t−1} = η_t + ε_t − ε_{t−1}.
The autocovariances of ∆Y_t are therefore γ_0^{∆Y} = σ_η^2 + 2σ_ε^2, γ_1^{∆Y} = −σ_ε^2, and γ_j^{∆Y} = 0 for j ≥ 2, which is the autocovariance structure of an MA(1). For an MA(1) process with parameters (θ_1, σ^2) we have
γ_0 = σ^2(1 + θ_1^2),  γ_1 = σ^2 θ_1.
Matching the two gives
σ^2 θ_1 = −σ_ε^2,   (1)
σ^2(1 + θ_1^2) = σ_η^2 + 2σ_ε^2.   (2)
Adding twice the first equation to the second, we get
σ^2(1 + θ_1)^2 = σ_η^2.
This gives
θ_1 = −1 + \sqrt{σ_η^2/σ^2}.
Substituting this into (1) and solving the resulting quadratic equation in σ gives
|σ| = \frac{\sqrt{σ_η^2} + \sqrt{σ_η^2 + 4σ_ε^2}}{2}
(since we are only interested in the positive solution). Finally, using (1) again,
θ_1 = −σ_ε^2/σ^2 = − \frac{4σ_ε^2}{(\sqrt{σ_η^2} + \sqrt{σ_η^2 + 4σ_ε^2})^2}.
The MA(1) process with these parameters θ_1 and σ^2 has the same autocovariance function as ∆Y_t.
7. (W3V1) Suppose Yt = φ1 Yt−1 + εt + θ1 εt−1 . Now define the process Zt = Yt + Xt with
Xt white noise with variance σx2 .
and autocovariance
γ_1 = θ_1 σ^2 − φ_1 σ_x^2.   (3)
Suppose we write the right-hand side as η_t + θ̃_1 η_{t−1} where E[η_t] = 0 and E[η_{t−1}^2] = σ̃^2. This would have variance γ_0 = (1 + θ̃_1^2)σ̃^2, and γ_1 = θ̃_1 σ̃^2. Note that γ_0 and γ_1 should be the same as (2) and (3). You can solve for σ̃^2 and θ̃_1 in terms of γ_0 and γ_1 to get a genuine MA(1) representation of the right-hand side of (1).
In conclusion, we find that Zt also follows an ARMA(1,1) model.
(b) Suppose you know the parameters of the model for Zt , can you uniquely determine
the parameters of the model for Yt ?
The process for Zt can be described by three parameters (AR, MA and error
variance). However on the right hand side of Zt = Yt + Xt there are in total
four parameters (AR, MA and two error variance parameters). It is therefore
impossible to retrieve these parameters from the parameters describing Zt .
8. (W3V3) Suppose
Yt = φ1 Yt−1 + β1 Xt + εt ,
where Yt was initialized in the infinite past and |φ1 | < 1.
vt = φ1 vt−1 + εt .
By recursive substitution of Y_{t−j}, we have
Y_t = φ_1 Y_{t−1} + β_1 X_t + ε_t
    = φ_1(φ_1 Y_{t−2} + β_1 X_{t−1} + ε_{t−1}) + β_1 X_t + ε_t
    = . . .
    = φ_1^{∞} Y_{−∞} + β_1 \sum_{j=0}^{∞} φ_1^j X_{t−j} + \sum_{j=0}^{∞} φ_1^j ε_{t−j}.
Since |φ_1| < 1, the first term can be safely ignored. Since v_t = φ_1 v_{t−1} + ε_t, we have that v_t = \sum_{j=0}^{∞} φ_1^j ε_{t−j}. In total we have the requested result.
In the long run, setting Y_t = Y_{t−1} = Ỹ and X_t = X̃ (and dropping the error term),
Ỹ = φ_1 Ỹ + β_1 X̃,
so
∂Ỹ/∂X̃ = \frac{β_1}{1 − φ_1}.   (5)
Why is this called the long-run effect? We see that the immediate impact of a unit change in X_t on Y_t is equal to β_1. In the next period, this effect is still φ_1 β_1. Now think of Y_t measuring GDP growth. Then the effect on growth is diminishing over time, but the effect on the level of GDP is the sum of all the growth effects, i.e. \sum_{i=0}^{∞} β_1 φ_1^i = \frac{β_1}{1 − φ_1}.
9. (W3V4) Consider the AR(2) process Yt = φ1 Yt−1 +φ2 Yt−2 +εt for t = 2, . . . , T . Suppose
that εt is Gaussian white noise with variance σ 2 .
(a) Write down the density of Y_t conditional on Y_{t−1} and Y_{t−2}.
Since we assume ε_t to have a normal distribution, we know that
f_{Y_t|Y_{t−1},Y_{t−2}}(y_t|y_{t−1}, y_{t−2}) = \frac{1}{\sqrt{2πσ^2}} \exp\left(−\frac{(y_t − φ_1 y_{t−1} − φ_2 y_{t−2})^2}{2σ^2}\right).
(b) Rewrite the density fY3 ,Y2 ,Y1 ,Y0 (y3 , y2 , y1 , y0 ) as the product of two conditional den-
sities as in the previous subquestion, and the joint density of Y1 and Y0 .
Note that
fY2 ,Y1 ,Y0 (y2 , y1 , y0 ) = fY2 |Y1 ,Y0 (y2 |y1 , y0 )fY1 ,Y0 (y1 , y0 )
And also
fY3 ,Y2 ,Y1 ,Y0 (y3 , y2 , y1 , y0 ) = fY3 |Y2 ,Y1 ,Y0 (y3 |y2 , y1 , y0 )fY2 ,Y1 ,Y0 (y2 , y1 , y0 )
= fY3 |Y2 ,Y1 (y3 |y2 , y1 )fY2 |Y1 ,Y0 (y2 |y1 , y0 )fY1 ,Y0 (y1 , y0 )
(c) Write down the log-likelihood ℓ(φ_1, φ_2, σ^2) using a similar conditioning approach as shown in the lectures for the AR(1) model.
Continuing in the same fashion as in the previous subquestion, we have that
f_{Y_T,Y_{T−1},...,Y_0}(y_T, y_{T−1}, . . . , y_0) = \left[\prod_{t=2}^{T} f_{Y_t|Y_{t−1},Y_{t−2}}(y_t|y_{t−1}, y_{t−2})\right] f_{Y_1,Y_0}(y_1, y_0).
Taking logs,
ℓ(φ_1, φ_2, σ^2) = −\frac{T−1}{2}\log(2πσ^2) − \frac{1}{2σ^2}\sum_{t=2}^{T}(y_t − φ_1 y_{t−1} − φ_2 y_{t−2})^2 + \log f_{Y_1,Y_0}(y_1, y_0).
(d) Write down the concentrated log-likelihood ℓ(φ_1, φ_2, σ̂^2(φ_1, φ_2)), where σ̂^2(φ_1, φ_2) = \frac{1}{T−1}\sum_{t=2}^{T}(y_t − φ_1 y_{t−1} − φ_2 y_{t−2})^2.
Substituting σ̂^2(φ_1, φ_2) into the log-likelihood gives
ℓ(φ_1, φ_2, σ̂^2(φ_1, φ_2)) = −\frac{T−1}{2}\log(2πσ̂^2(φ_1, φ_2)) − \frac{T−1}{2} + \log f_{Y_1,Y_0}(y_1, y_0).
If we ignore log f_{Y_1,Y_0}(y_1, y_0), then maximizing the concentrated log-likelihood is equivalent to minimizing the sum of squared errors \frac{1}{T−1}\sum_{t=2}^{T}(y_t − φ_1 y_{t−1} − φ_2 y_{t−2})^2.
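A minimal sketch of this concentrated conditional log-likelihood (the simulated parameter values below are purely illustrative):

    import numpy as np

    def concentrated_loglik(phi1, phi2, y):
        """Conditional (on the first two observations) concentrated log-likelihood of an AR(2)."""
        e = y[2:] - phi1 * y[1:-1] - phi2 * y[:-2]      # residuals for t = 2, ..., T
        sigma2_hat = np.mean(e ** 2)                     # sigma^2 concentrated out
        n = len(e)
        return -0.5 * n * (np.log(2 * np.pi * sigma2_hat) + 1.0)

    rng = np.random.default_rng(1)
    phi1, phi2, T = 0.5, 0.2, 500
    y = np.zeros(T)
    for t in range(2, T):
        y[t] = phi1 * y[t - 1] + phi2 * y[t - 2] + rng.normal()

    print(concentrated_loglik(0.5, 0.2, y) > concentrated_loglik(0.0, 0.0, y))   # typically True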
(f) Suppose Y_t is covariance stationary. The joint density of the first two observations is given by a multivariate normal distribution. Define y_I = (y_0, y_1)'. Then
f_{Y_0,Y_1}(y_0, y_1) = \frac{1}{\sqrt{(2π)^2 |Σ|}} \exp\left(−\frac{1}{2}(y_I − µ)'Σ^{−1}(y_I − µ)\right),
where
var(Y_t) = \frac{σ^2}{1 − φ_1^2 − φ_2^2 − 2φ_1^2 φ_2/(1 − φ_2)} = \frac{(1 − φ_2)σ^2}{(1 + φ_2)[(1 − φ_2)^2 − φ_1^2]}.
(Of course, this result can also be obtained using the Yule-Walker equations.) In total, we now have
Σ = \frac{σ^2}{(1 + φ_2)[(1 − φ_2)^2 − φ_1^2]} [1 − φ_2, φ_1; φ_1, 1 − φ_2].
10. (W3V4) Consider the MA(2) process Yt = εt +θ1 εt−1 +θ2 εt−2 for t = 2, . . . , T . Suppose
that εt is Gaussian white noise with variance σ 2 .
(a) What conditioning step would you take to estimate the parameters using condi-
tional maximum likelihood?
Note that in an MA(2) process
Suppose that we know that ε0 = ε1 = 0, that is, we set the first two shocks equal
to their expected value. This is the essential conditioning step.
We have
f_{Y_2|ε_1=0,ε_0=0}(y_2) = \frac{1}{\sqrt{2πσ^2}} \exp\left(−\frac{1}{2}\frac{y_2^2}{σ^2}\right),
f_{Y_3|Y_2,ε_1=0,ε_0=0}(y_3) = f_{Y_3|ε_2,ε_1=0,ε_0=0}(y_3) = f_{Y_3|ε_2,ε_1=0}(y_3) = \frac{1}{\sqrt{2πσ^2}} \exp\left(−\frac{1}{2}\frac{(y_3 − θ_1 ε_2)^2}{σ^2}\right),
f_{Y_4|Y_3,Y_2,ε_1=0,ε_0=0}(y_4) = f_{Y_4|ε_3,ε_2,ε_1=0,ε_0=0}(y_4) = f_{Y_4|ε_3,ε_2}(y_4) = \frac{1}{\sqrt{2πσ^2}} \exp\left(−\frac{1}{2}\frac{(y_4 − θ_1 ε_3 − θ_2 ε_2)^2}{σ^2}\right),
and in general
f_{Y_t|Y_{t−1},Y_{t−2},...,Y_2,ε_1=0,ε_0=0}(y_t) = f_{Y_t|ε_{t−1},ε_{t−2}}(y_t).
The conditional likelihood is given by
f(y_T, . . . , y_2 | ε_1 = 0, ε_0 = 0) = \prod_{t=2}^{T} \frac{1}{\sqrt{2πσ^2}} \exp\left(−\frac{ε_t^2}{2σ^2}\right),
where the shocks are computed recursively as ε_t = y_t − θ_1 ε_{t−1} − θ_2 ε_{t−2}, starting from ε_0 = ε_1 = 0.
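A minimal implementation sketch of this conditional likelihood, backing out the shocks recursively from ε_0 = ε_1 = 0 (the simulated parameter values are only an example):

    import numpy as np

    def conditional_loglik_ma2(theta1, theta2, sigma2, y):
        """Log-likelihood of an MA(2) conditional on eps_0 = eps_1 = 0."""
        eps_lag1, eps_lag2 = 0.0, 0.0                    # the conditioning step
        loglik = 0.0
        for yt in y:
            eps_t = yt - theta1 * eps_lag1 - theta2 * eps_lag2   # back out the shock
            loglik += -0.5 * np.log(2 * np.pi * sigma2) - 0.5 * eps_t ** 2 / sigma2
            eps_lag1, eps_lag2 = eps_t, eps_lag1
        return loglik

    rng = np.random.default_rng(2)
    e = rng.normal(size=200)
    y = e[2:] + 0.5 * e[1:-1] + 0.3 * e[:-2]             # simulated MA(2), theta = (0.5, 0.3)
    print(conditional_loglik_ma2(0.5, 0.3, 1.0, y))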
Week 4
1. (W4V1) Show that if X_T →p c_X and Z_T →p c_Z, where c_X and c_Z are constants, then
X_T + Z_T →p c_X + c_Z,
X_T Z_T →p c_X c_Z.
Since Z_T →p c_Z, with c_Z a constant, and the fact that convergence in probability implies convergence in distribution, we also have Z_T →d c_Z. Applying Slutsky's theorem, this shows that
X_T + Z_T →d c_X + c_Z,
X_T Z_T →d c_X c_Z.
Since convergence in distribution to a constant implies convergence in probability, we have the desired result.
2. (W4V1) Let S_T denote the sample average of {Y_1, . . . , Y_T}. Derive lim_{T→∞} T · var(S_T) if:
(a) Y_t is a white noise process with variance σ^2.
For all questions, we know that if Y_t has absolutely summable autocovariances, then
lim_{T→∞} T var(ȳ_T) = γ_{LR} = γ_0 + 2 \sum_{j=1}^{∞} γ_j.
If Y_t is white noise, then the process is uncorrelated over time, and hence lim_{T→∞} T var(ȳ_T) = σ^2.
(c) Y_t is an AR(1) process.
If Y_t follows an AR(1) process, then γ_j = φ_1^j γ_0, and hence
lim_{T→∞} T var(ȳ_T) = \frac{σ^2}{1 − φ_1^2}\left(1 + 2\sum_{j=1}^{∞} φ_1^j\right) = \frac{σ^2}{1 − φ_1^2}\left(1 + \frac{2}{1 − φ_1} − 2\right) = \frac{σ^2}{(1 − φ_1)^2}.
The variance of φ̂_1 is given by the [2, 2] element of this matrix, so we have avar(φ̂_1) = 1 − φ_1^2.
4. (W4V2) Suppose {Yt } is a mean zero, independent sequence and E[|Yt |] < ∞. Show
that {Yt } is a martingale difference sequence with respect to the information set It =
{Yt , Yt−1 , . . .}, i.e. E[Yt |It−1 ] = 0.
{Y_t} is an MDS if E[Y_t|I_{t−1}] = 0. Since Y_t is independent of Y_{t−1}, . . . , Y_0, we have E[Y_t|I_{t−1}] = E[Y_t] = 0, where the last equality uses that the sequence has mean zero. We conclude Y_t is an MDS.
5. (W4V2) Suppose {Y_t} is a martingale difference sequence with respect to the information set I_t = {Y_t, Y_{t−1}, . . .}. Show that E[Y_{t+m}|I_t] = 0 for m > 0.
Note first that
E[X|Y, Z] = 0 ⇒ E[E[X|Y, Z]|Z] = E[X|Z] = 0.   (1)
Since Y_t is an MDS, we have (by definition) E[Y_{t+m}|I_{t+m−1}] = 0. Define C_{t+1}^{t+m−1} = I_{t+m−1} \ I_t, i.e. C_{t+1}^{t+m−1} is the information that has accumulated between time t + 1 and t + m − 1. Then we know that
E[Y_{t+m}|I_{t+m−1}] = E[Y_{t+m}|I_t, C_{t+1}^{t+m−1}] = 0,
and applying (1) with Z = I_t gives E[Y_{t+m}|I_t] = 0.
6. (W4V2) Suppose Yt = εt εt−1 , and εt is some stochastic process. Suppose you can show
that Yt is an MDS with respect to the information set It−1 = {εt−1 , εt−2 , . . .}. Is Yt a
martingale difference sequence with respect to the information set Jt = {Yt , Yt−1 , . . .}?
Evidently, E[|Y_t|] < ∞ and Y_t ∈ J_t. The only thing left to show is that E[Y_t|J_{t−1}] = 0, where J_{t−1} = {Y_{t−1}, Y_{t−2}, . . .}. Note now that J_{t−1} contains less information than I_{t−1} = {ε_{t−1}, ε_{t−2}, . . .} (each Y_s is a function of ε_s and ε_{s−1}), so by the law of iterated expectations E[Y_t|J_{t−1}] = E[E[Y_t|I_{t−1}]|J_{t−1}] = 0. Hence Y_t is also an MDS with respect to J_t.
8. (W4V3) The usual estimator for the sample mean is µ̂_T = \frac{1}{T}\sum_{t=1}^{T} y_t. Suppose the underlying process Y_t is such that the sample mean is a consistent estimator for the population mean µ, i.e. µ̂_T →p µ. Show that the alternative estimator µ̄_T = \frac{1}{T−k}\sum_{t=1}^{T} y_t, with k > 0, is
(a) consistent if k = 3
Write
µ̄_T = \frac{T}{T−k} µ̂_T = \frac{1}{1 − k/T} µ̂_T = c_T µ̂_T.
Since k/T → 0, we have c_T →p 1. Also, it is given that µ̂_T →p µ. Now µ̄_T = g(µ̂_T, c_T) with g(·) a continuous function of its arguments. By the continuous mapping theorem, we then have g(µ̂_T, c_T) →p g(µ, 1), i.e. µ̄_T →p µ. We conclude that µ̄_T is a consistent estimator for µ.
(b) inconsistent if k = T/2
In this case, we have c_T →p 2. By the same argument as above, g(µ̂_T, c_T) →p g(µ, 2), i.e. µ̄_T →p 2µ. So in this case µ̄_T is an inconsistent estimator for µ.
We will show in steps that the probability limit of φ̂_1 is θ_1/(1 + θ_1^2).
(a) Show that
\frac{1}{T−1}\sum_{t=2}^{T} y_t y_{t−1} = σ^2 θ_1 + \frac{1}{T−1}\sum_{t=2}^{T}(η_{1t} + η_{2t} + η_{3t} + η_{4t}),
with
η_{1t} = θ_1(ε_{t−1}^2 − σ^2),
η_{2t} = ε_t ε_{t−1},
η_{3t} = θ_1^2 ε_{t−1} ε_{t−2},
η_{4t} = θ_1 ε_t ε_{t−2}.
(b) Show that η1t , η2t , η3t , η4t are martingale difference sequences with respect to
their respective information sets It = {ηit , ηit−1 , . . .}.
(c) Show that the variance of ηit is bounded for i = 1, . . . , 4.
(d) Invoke the WLLN for mixingales to find the probability limit of the numerator.
(e) Find a decomposition of the denominator analogous to the one for the numerator
provided above to find the probability limit of the denominator.
(f) Use Slutsky's theorem to show that φ̂_1 →p θ_1/(1 + θ_1^2).
First analyze the numerator. Rewrite this as
\frac{1}{T−1}\sum_{t=2}^{T} y_t y_{t−1} = \frac{1}{T−1}\sum_{t=2}^{T}(ε_t + θ_1 ε_{t−1}) y_{t−1}
= θ_1 \frac{1}{T−1}\sum_{t=2}^{T} ε_{t−1} y_{t−1} + \frac{1}{T−1}\sum_{t=2}^{T} ε_t y_{t−1}
= θ_1 \frac{1}{T−1}\sum_{t=2}^{T} ε_{t−1}^2 + \frac{1}{T−1}\sum_{t=2}^{T} ε_t ε_{t−1} + θ_1^2 \frac{1}{T−1}\sum_{t=2}^{T} ε_{t−1} ε_{t−2} + θ_1 \frac{1}{T−1}\sum_{t=2}^{T} ε_t ε_{t−2}
= σ^2 θ_1 + \frac{1}{T−1}\sum_{t=2}^{T}(η_{1t} + η_{2t} + η_{3t} + η_{4t}),
with
η_{1t} = θ_1(ε_{t−1}^2 − σ^2),  η_{2t} = ε_t ε_{t−1},  η_{3t} = θ_1^2 ε_{t−1} ε_{t−2},  η_{4t} = θ_1 ε_t ε_{t−2}.
The trick is now (1) to show that {η_{1t}}, {η_{2t}}, {η_{3t}}, {η_{4t}} are martingale difference sequences, (2) to check that E[|η_{it}|^2] < ∞ (the condition in the WLLN for mixingales with r = 2), and (3) to invoke the WLLN.
Step 1: MDS. Define the information set I_t = {ε_t, ε_{t−1}, . . .}. Note that E[|ε_t ε_{t−1}|] ≤ \sqrt{E[ε_t^2]E[ε_{t−1}^2]} < ∞, ε_t ε_{t−1} ∈ I_t and E[ε_t ε_{t−1}|I_{t−1}] = 0 since ε_t is i.i.d. Hence {η_{2t}} is an MDS with respect to I_t, and {η_{3t}}, which is the same product shifted back one period (scaled by θ_1^2), is an MDS with respect to I_{t−1}.
Also, E[|ε_t ε_{t−2}|] < ∞, ε_t ε_{t−2} ∈ I_t and E[ε_t ε_{t−2}|I_{t−1}] = 0, so that {η_{4t}} is an MDS w.r.t. I_t.
Finally, E[|ε_t^2 − σ^2|] ≤ E[ε_t^2] + σ^2 < ∞, ε_t^2 ∈ I_t and E[ε_t^2 − σ^2|I_{t−1}] = 0, and hence {η_{1t}}, which involves ε_{t−1}^2 − σ^2, is an MDS w.r.t. I_{t−1}.
Step 2: check the condition for the WLLN. Take r = 2. Using Cauchy-Schwarz, we have E[η_{2t}^2] = E[(ε_t ε_{t−1})^2] ≤ \sqrt{E[ε_t^4]E[ε_{t−1}^4]} < ∞ by the assumption that the fourth moment of ε_t is finite. The argument for η_{3t} and η_{4t} is completely analogous. With regard to η_{1t}, we have E[η_{1t}^2] = θ_1^2 E[(ε_{t−1}^2 − σ^2)^2] = θ_1^2[E[ε_{t−1}^4] − σ^4] < ∞.
Step 3: invoke the WLLN. Since the condition for the WLLN to hold is satisfied for all four sums, they all converge to zero in probability, and the only term that remains is σ^2 θ_1.
Write the denominator as
\frac{1}{T−1}\sum_{t=2}^{T} y_{t−1}^2 = σ^2(1 + θ_1^2) + \frac{1}{T−1}\sum_{t=2}^{T} η_{t−1},  with  η_t = y_t^2 − σ^2(1 + θ_1^2).
Rewrite
η_t = (ε_t^2 − σ^2) + θ_1^2(ε_{t−1}^2 − σ^2) + 2θ_1 ε_t ε_{t−1},
which is a sum of three terms of the same type as analyzed for the numerator.
Step 2: check the condition for the WLLN. You can follow exactly the same argument as in Step 2 above.
Step 3: invoke the WLLN. Since the condition for the WLLN to hold is satisfied for all three sums, they all converge to zero in probability, and the only term that remains is σ^2(1 + θ_1^2).
Since we have established the probability limits of the numerator and denominator, one can invoke Slutsky's theorem to find that
φ̂_1 →p \frac{θ_1}{1 + θ_1^2}.
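As a simulation sketch of this result (θ_1 = 0.6 and the sample size are illustrative choices):

    import numpy as np

    rng = np.random.default_rng(3)
    theta1, T = 0.6, 200_000

    eps = rng.normal(size=T + 1)
    y = eps[1:] + theta1 * eps[:-1]                # MA(1) with sigma^2 = 1

    phi1_hat = np.sum(y[1:] * y[:-1]) / np.sum(y[:-1] ** 2)
    print(phi1_hat, theta1 / (1 + theta1 ** 2))    # both close to 0.44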
Suppose we are worried that ε_t = ρ_1 ε_{t−1} + u_t, where u_t is an i.i.d. process with zero mean. Show that this would imply that Y_t follows an AR(p+1) process. Also, show that testing whether ρ_1 = 0 is equivalent to testing whether φ_{p+1} = 0 in this AR(p+1) model.
Write
Y_{t−1} = µ + \sum_{j=1}^{p} φ_j Y_{t−1−j} + ε_{t−1}.
Now consider
Y_t − ρ_1 Y_{t−1} = µ(1 − ρ_1) + \sum_{j=1}^{p} φ_j(Y_{t−j} − ρ_1 Y_{t−1−j}) + (ε_t − ρ_1 ε_{t−1}),
so that
Y_t = µ(1 − ρ_1) + (φ_1 + ρ_1)Y_{t−1} + \sum_{j=2}^{p}(φ_j − ρ_1 φ_{j−1})Y_{t−j} − ρ_1 φ_p Y_{t−p−1} + u_t.
This is an AR(p+1) process with φ_{p+1} = −ρ_1 φ_p. Provided φ_p ≠ 0, φ_{p+1} = 0 if and only if ρ_1 = 0, so testing ρ_1 = 0 is equivalent to testing φ_{p+1} = 0.
Week 5
1. Consider the AR(2) model
Y_t = c + φ_1 Y_{t−1} + φ_2 Y_{t−2} + ε_t.
(a) What is the optimal h-step ahead forecast of the AR(2) model given that you know (c, φ_1, φ_2, σ^2), for h = 1, 2, 3?
We know that the optimal forecast is the conditional mean E[Y_{t+h}|I_t]. We have
Ŷ_{t+1|t} = c + φ_1 Y_t + φ_2 Y_{t−1},
Ŷ_{t+2|t} = c + φ_1 Ŷ_{t+1|t} + φ_2 Y_t,
Ŷ_{t+3|t} = c + φ_1 Ŷ_{t+2|t} + φ_2 Ŷ_{t+1|t}.
(b) What is the MSE for the h-step ahead forecast for h = 1, 2, 3?
Define
e_{t+h} = Y_{t+h} − E[Y_{t+h}|I_t].
We have for the MSE at horizon h = 1, 2, 3 the following:
e_{t+1} = ε_{t+1}, so E[e_{t+1}^2] = σ^2,
e_{t+2} = ε_{t+2} + φ_1 ε_{t+1}, so E[e_{t+2}^2] = σ^2(1 + φ_1^2),
e_{t+3} = ε_{t+3} + φ_1 ε_{t+2} + (φ_1^2 + φ_2)ε_{t+1}, so E[e_{t+3}^2] = σ^2(1 + φ_1^2 + (φ_1^2 + φ_2)^2).
Then
\frac{Y_{t+3} − Ŷ_{t+3}}{\sqrt{E[e_{t+3}^2]}} →d N(0, 1),
where we have used the expression for the MSE of Ŷ_{t+3} obtained in the previous subquestion.
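The iterated forecasts and MSEs can be computed with a few lines (all parameter values below are illustrative):

    c, phi1, phi2, sigma2 = 0.2, 0.5, 0.3, 1.0     # illustrative parameter values
    y_t, y_tm1 = 1.0, 0.4                          # last two observed values

    # Iterated (optimal) forecasts
    f1 = c + phi1 * y_t + phi2 * y_tm1
    f2 = c + phi1 * f1 + phi2 * y_t
    f3 = c + phi1 * f2 + phi2 * f1
    print(f1, f2, f3)

    # MSEs from the MA coefficients psi_1 = phi1, psi_2 = phi1^2 + phi2
    mse = [sigma2,
           sigma2 * (1 + phi1 ** 2),
           sigma2 * (1 + phi1 ** 2 + (phi1 ** 2 + phi2) ** 2)]
    print(mse)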
(b) We need to forecast y_{T+1}. Consider the estimator
φ̂_1 = \frac{\sum_{t=2}^{T} Y_t Y_{t−1}}{\sum_{t=2}^{T} Y_{t−1}^2}.
We now construct a forecast as Ŷ_{t+2} = φ̂_1^2 Y_t. Show that φ̂_1^2 →p φ_1^2.
The fact that φ̂_1 →p φ_1 was proven last week. Since φ̂_1^2 is a continuous function of φ̂_1, the result follows from the continuous mapping theorem.
(c) Now instead of using an iterated forecast, we try to relate Y_t and Y_{t+2} by pretending that the process is Y_t = φ_2 Y_{t−2} + u_t. Say we estimate φ_2 by least squares, i.e.
φ̂_2 = \frac{\sum_{t=3}^{T} Y_t Y_{t−2}}{\sum_{t=3}^{T} Y_{t−2}^2}.
We now construct a forecast as Ŷ_{t+2} = φ̂_2 Y_t. Show that φ̂_2 →p φ_1^2. You may assume that \frac{1}{T−3}\sum_{t=3}^{T} Y_{t−2}^2 →p γ_0 < ∞.
Note that
Y_t = φ_1^2 Y_{t−2} + ε_t + φ_1 ε_{t−1}.
Substituting this expression for Y_t into the estimator, we see that
φ̂_2 = φ_1^2 + \frac{\frac{1}{T−2}\sum_{t=3}^{T} ε_t Y_{t−2}}{\frac{1}{T−2}\sum_{t=3}^{T} Y_{t−2}^2} + φ_1 \frac{\frac{1}{T−2}\sum_{t=3}^{T} ε_{t−1} Y_{t−2}}{\frac{1}{T−2}\sum_{t=3}^{T} Y_{t−2}^2}.
Define I_t = {ε_t, Y_{t−1}, ε_{t−1}, Y_{t−2}, . . .}. Since ε_t is i.i.d. and Y_{t−2} only depends on ε_{t−2}, ε_{t−3}, . . ., we have E[ε_t Y_{t−2}|I_{t−1}] = 0. Also
E[|ε_t Y_{t−2}|] ≤ \sqrt{E[ε_t^2]E[Y_{t−2}^2]} < ∞.
We can now invoke a law of large numbers for MDS sequences to show that \frac{1}{T−2}\sum_{t=3}^{T} ε_t Y_{t−2} →p 0. A very similar argument shows that \frac{1}{T−2}\sum_{t=3}^{T} ε_{t−1} Y_{t−2} →p 0. Since in the expression for φ̂_2 the denominator \frac{1}{T−2}\sum_{t=3}^{T} Y_{t−2}^2 for both terms converges in probability to γ_0 < ∞, it follows from Slutsky's theorem (the version where everything converges in probability, see last week's exercises) that φ̂_2 →p φ_1^2.
Write
Y_t = (1 + θ_1 L)ε_t.
We have shown before that when |θ_1| < 1, the MA(1) process is invertible and
(1 + θ_1 L)^{−1} Y_t = ε_t.
This is equivalent to
Y_t = \sum_{j=1}^{∞} (−1)^{j+1} θ_1^j Y_{t−j} + ε_t.
Increasing the time index to t + 1, truncating the sum at j = t, and taking the expectation conditional on I_t, we have
Ŷ_{t+1} = \sum_{j=1}^{t} (−1)^{j+1} θ_1^j Y_{t+1−j}.
Multiplying the AR(2) process by Y_{t−1} and taking expectations (all expectations involving ε_t are zero),
E[Y_t Y_{t−1}] = φ_1 E[Y_{t−1}^2] + φ_2 E[Y_{t−1} Y_{t−2}].
By covariance stationarity, E[Y_{t−1}Y_{t−2}] = E[Y_t Y_{t−1}], so
E[Y_t Y_{t−1}] = \frac{φ_1}{1 − φ_2} E[Y_{t−1}^2].
This shows that φ̂_1 →p \frac{φ_1}{1 − φ_2}.
(b) Suppose we construct a one-step ahead forecast as Ŷ_{T+1} = \frac{φ_1}{1 − φ_2} Y_T. Show that the MSE of this forecast is at least as high as the MSE when forecasting using the conditional mean of Y_{T+1}.
The MSE is given by
E[(Y_{T+1} − \frac{φ_1}{1−φ_2} Y_T)^2] = E[(ε_{T+1} − \frac{φ_1 φ_2}{1 − φ_2} Y_T + φ_2 Y_{T−1})^2] = σ^2 + E[(−\frac{φ_1 φ_2}{1 − φ_2} Y_T + φ_2 Y_{T−1})^2] ≥ σ^2.
(c) Now suppose we need to make a two-step ahead forecast. The researcher is still using his AR(1) model. Show that his iterated forecast is Ŷ_{T+2}^I = \frac{φ_1^2}{(1−φ_2)^2} Y_T.
If the researcher thinks the model is an AR(1) model, then according to him the optimal forecast is
Ŷ_{T+2} = φ̃_1^2 Y_T,
where he will substitute his estimate for φ_1 for φ̃_1, i.e. φ̃_1 = \frac{φ_1}{1−φ_2}.
(d) Alternatively, the researcher can consider a direct forecast Ŷ_{T+2}^D = φ̂_D Y_T where
φ̂_D = \frac{\sum_{t=3}^{T} Y_t Y_{t−2}}{\sum_{t=3}^{T} Y_{t−2}^2}.
Show that φ̂_D →p \frac{φ_1^2}{1−φ_2} + φ_2. Hint: use the result from (a), and use that
\frac{\sum_{t=3}^{T} ε_t Y_{t−2}}{\sum_{t=3}^{T} Y_{t−2}^2} →p 0.
To get the direct forecast, substitute Y_t = φ_1 Y_{t−1} + φ_2 Y_{t−2} + ε_t into the numerator of φ̂_D:
φ̂_D = φ_1 \frac{\sum_{t=3}^{T} Y_{t−1} Y_{t−2}}{\sum_{t=3}^{T} Y_{t−2}^2} + φ_2 + \frac{\sum_{t=3}^{T} ε_t Y_{t−2}}{\sum_{t=3}^{T} Y_{t−2}^2}.
From question (a), and noting that asymptotically it does not matter whether we start the sums at t = 2 or t = 3, the first term converges in probability to φ_1 · \frac{φ_1}{1−φ_2}, the last term converges to zero, and we then have the desired result.
5. (W5V2) Consider the ARDL(1,1) model Yt = φ1 Yt−1 + β1 Xt + εt , with E[εt ] = 0, and
E[ε2t ] = σ 2 . Suppose that both Xt and Yt are CS.
(c) Suppose now that X_{t+1} is not known. You consider a direct approach, i.e. you forecast
Ŷ_{t+1} = φ_1 Y_t + β̃_1 X_t.
Write down the forecast error.
We now have
e_{t+1} = Y_{t+1} − Ŷ_{t+1} = φ_1 Y_t + β_1 X_{t+1} + ε_{t+1} − φ_1 Y_t − β̃_1 X_t = β_1 X_{t+1} − β̃_1 X_t + ε_{t+1}.
(d) Write down the MSE in terms of the variance and first-order autocovariance of X_t.
The MSE is
E[e_{t+1}^2] = σ^2 + β_1^2 E[X_{t+1}^2] + β̃_1^2 E[X_t^2] − 2β_1 β̃_1 E[X_{t+1} X_t].
Under the AR(1) process, we have
E[X_t^2] = \frac{σ_X^2}{1 − ρ_1^2},  E[X_t X_{t−1}] = ρ_1 \frac{σ_X^2}{1 − ρ_1^2}.
Hence,
E[e_{t+1}^2] = σ^2 + (β_1^2 + β̃_1^2)E[X_t^2] − 2β_1 β̃_1 E[X_t X_{t+1}] = σ^2 + E[X_t^2](β_1^2 + β̃_1^2 − 2ρ_1 β_1 β̃_1).
(f) Find the MSE corresponding to the optimal value of β̃_1. Does the value of φ_1^X (= ρ_1) matter?
Substituting the optimal value β̃_1 = ρ_1 β_1 into the MSE, we have
E[e_{t+1}^2] = σ^2 + β_1^2(1 − ρ_1^2)\frac{σ_X^2}{1 − ρ_1^2} = σ^2 + σ_X^2 β_1^2,
which does not depend on ρ_1.
(a) Suppose γ_{LR} = 1. What would be your conclusion when testing the null hypothesis of equal predictive accuracy when the loss function is mean squared error? Assume you are testing at a significance level of 5%.
The Diebold-Mariano test statistic would be
DM = \frac{16 − 4}{1} = 12 > 1.96,   (2)
and you would reject at any reasonable significance level.
(b) What would be your conclusion when the loss function is based on mean absolute error?
Although the MAE is not given for model 2, we know that
\frac{1}{T}\sum_{t=1}^{T} |e_{t+1,2}| ≤ \sqrt{\frac{1}{T}\sum_{t=1}^{T} e_{t+1,2}^2} = 2.   (3)
So we know that
DM ≥ \frac{4 − 2}{1} = 2 > 1.96,   (4)
so we would reject the null of equal predictive accuracy at the 5% level.
(c) Suppose T = 80, you estimate the long run variance using the Bartlett kernel with ℓ_T = ⌊4(T/100)^{2/9}⌋, and accidentally the estimated autocovariance function is γ̂_j = 1/(j + 1)^2. Would you reject the null hypothesis of equal predictive accuracy based on the squared error loss?
We would estimate ℓ_T = ⌊4(80/100)^{2/9}⌋ = 3, so that the Bartlett estimate of the long-run variance is
γ̂_{LR} = γ̂_0 + 2\sum_{j=1}^{3}\left(1 − \frac{j}{ℓ_T + 1}\right)γ̂_j = 1 + 2\left(\frac{3}{4}·\frac{1}{4} + \frac{1}{2}·\frac{1}{9} + \frac{1}{4}·\frac{1}{16}\right) ≈ 1.52.
The DM statistic is then (16 − 4)/1.52 ≈ 7.9 > 1.96, so we would still reject the null of equal predictive accuracy.
(d) Does it change the outcome of the test if we would know for sure that γ_j = 1/(j + 1)^2?
The long run variance is
γ_{LR} = γ_0 + 2\sum_{j=1}^{∞} γ_j = −γ_0 + 2\sum_{j=0}^{∞} γ_j = −1 + 2\sum_{j=1}^{∞}\frac{1}{j^2} = −1 + 2\frac{π^2}{6} = 2.2899.   (7)
The DM statistic is
DM = \frac{16 − 4}{2.2899} = 5.24 > 1.96.   (8)
Using the true (population) long run variance does not change the outcome of the test.
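A numerical sketch of parts (c) and (d); note that the Bartlett weights 1 − j/(ℓ_T + 1) are an assumption about the exact kernel formula used in the course:

    import numpy as np

    T = 80
    ell = int(np.floor(4 * (T / 100) ** (2 / 9)))          # bandwidth: ell = 3 here
    gamma_hat = np.array([1 / (j + 1) ** 2 for j in range(ell + 1)])

    # Bartlett weights 1 - j/(ell+1) (assumed Newey-West convention)
    w = 1 - np.arange(1, ell + 1) / (ell + 1)
    gamma_lr_hat = gamma_hat[0] + 2 * np.sum(w * gamma_hat[1:])

    gamma_lr_true = -1 + np.pi ** 2 / 3                    # exact value from (7)

    print(ell, gamma_lr_hat, gamma_lr_true)                # 3, ~1.52, ~2.29
    print(12 / gamma_lr_hat, 12 / gamma_lr_true)           # DM statistics as in (2) and (8)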
7. (W5V5) Suppose Y_t = c + φ_1 Y_{t−1} + ε_t, where |φ_1| < 1. You consider two forecasts, one based on the unconditional mean of Y_t, and one that assumes Y_t follows a random walk. Specifically, we have Y_{t+1,1} = \frac{c}{1−φ_1} and Y_{t+1,2} = Y_t. Show that a forecast combination Y_{t+1,C} = ωY_{t+1,1} + (1 − ω)Y_{t+1,2} with |ω| < 1 exists that has the same MSE as the optimal forecast, i.e. the conditional mean Y_{t+1} = c + φ_1 Y_t.
The combined forecast is
Y_{t+1,C} = ω\frac{c}{1 − φ_1} + (1 − ω)Y_t.
First, set the intercept equal to that of the optimal forecast, i.e. c = ω\frac{c}{1−φ_1}. Solving gives ω = 1 − φ_1. This also implies that 1 − ω = φ_1. Hence, with ω = 1 − φ_1, we get
Y_{t+1,C} = c + φ_1 Y_t,
which is the optimal forecast itself. This is quite nice: using two misspecified models, forecast combination can nevertheless help you to achieve the optimal MSE (disregarding parameter uncertainty).
Week 6
1. (W6V1) Suppose that Y_t is an AR(1) process with a structural break in the intercept, so
Y_t = c_t + φ_1 Y_{t−1} + ε_t,   (1)
where ε_t ∼ WN(0, σ^2) and c_t = c_1 for t < T_b, c_t = c_2 for t ≥ T_b.
For t < T_b the process is simply an AR(1) process, so we can iterate it backwards until Y_0:
Y_t = c_1 + φ_1 Y_{t−1} + ε_t = c_1 \sum_{i=0}^{t−1} φ_1^i + φ_1^t Y_0 + \sum_{i=0}^{t−1} φ_1^i ε_{t−i}.
Taking expectations, and using that E[Y_0] = c_1/(1 − φ_1),
E_t[Y_t] = c_1 \frac{1 − φ_1^t}{1 − φ_1} + φ_1^t \frac{c_1}{1 − φ_1} = \frac{c_1}{1 − φ_1}.
(b) Show that for t ≥ T_b,
E_t[Y_t] = \frac{c_2}{1 − φ_1}(1 − φ_1^{t−T_b+1}) + \frac{c_1}{1 − φ_1}φ_1^{t−T_b+1}.
Iterating backwards,
Y_t = c_2 + φ_1 Y_{t−1} + ε_t = c_2 \sum_{i=0}^{t−T_b} φ_1^i + c_1 \sum_{i=t−T_b+1}^{t−1} φ_1^i + φ_1^t Y_0 + \sum_{i=0}^{t−1} φ_1^i ε_{t−i}.
To get the indices right, a nice check is to take t = T_b. In this case, there should only be one term involving c_2 (so the upper limit on the first sum is correct in that regard) and t − 1 terms involving c_1. Taking the expectation, we get
E_t[Y_t] = c_2 \frac{1 − φ_1^{t−T_b+1}}{1 − φ_1} + c_1 φ_1^{t−T_b+1}\frac{1 − φ_1^{T_b−1}}{1 − φ_1} + φ_1^t \frac{c_1}{1 − φ_1}
= \frac{c_2}{1 − φ_1}(1 − φ_1^{t−T_b+1}) + \frac{c_1}{1 − φ_1}φ_1^{t−T_b+1}.
Beautiful!
2. (W6V1) Suppose that Y_t = c_t + φ_1 Y_{t−1} + ε_t, where ε_t is i.i.d. with finite fourth moment and where
c_t = 0 if t < T_b, and c_t = c otherwise.
This follows by substituting the given DGP and using that c_t = c for t ≥ T_b.
(b) Assume that \frac{1}{T−1}\sum_{t=2}^{T} Y_{t−1}^2 →p a for some constant a. Argue that the last term is o_p(1).
The argument is the same as we have seen in Week 4. Since εt is i.i.d. with
finite fourth moment and Yt−1 only depends on {εt−1 , εt−2 , . . .}, the numerator
converges in probability to 0 by the weak law of large numbers for martingale
difference sequences (if you are asked to show something like this on the exam,
you need to go over all the steps, unless it’s explicitly indicated that an informal
argument suffices). Since the denominator converges in probability to a constant,
the last term converges to zero in probability by the continuous mapping theorem.
(c) The results in Question 1 show that it takes some time for E[Y_{t−1}] to transition to \frac{c}{1−φ_1} after the break occurred. A similar effect is observed for E[Y_{t−1}^2]. However, if both T_b and T − T_b are sufficiently large (so as T → ∞, the ratio T_b/T → η for some fixed fraction η), and φ_1 is not too close to 1, we can safely ignore this transitioning phase when calculating the probability limit of the second term in equation (3).
What would you then argue is the probability limit of the numerator? What about the denominator?
We expect the numerator to converge to its expectation. If we ignore the transitioning phase, for the numerator the expectation is approximately (1 − η)\frac{c}{1−φ_1}.
For the denominator, the expectation is γ_0 = \frac{σ^2}{1−φ_1^2} for pre-break observations and γ_0 + µ^2 = \frac{σ^2}{1−φ_1^2} + \frac{c^2}{(1−φ_1)^2} for post-break observations. We expect that the denominator is converging to
\frac{σ^2}{1 − φ_1^2} + (1 − η)\frac{c^2}{(1 − φ_1)^2}.
(e) What would be the outcome of an (augmented) Dickey Fuller test if one ignores
(large) structural breaks?
The test would not be able to reject the null of a unit root.
3. (W6V2) Consider an AR(1) process (with intercept equal to zero) with an innovation outlier at the end of the sample. Calculate the mean squared forecast error E[(Y^*_{t+1} − Ŷ_{t+1})^2] when Ŷ_{t+1} is
4. (W6V2) Consider an AR(1) process (with intercept equal to zero) with an additive outlier at time t. Calculate the mean squared forecast error E[(Y^*_{t+1} − Ŷ_{t+1})^2] when Ŷ_{t+1} is
(c) When does an outlier become 'harmful' in the sense that one is better off using the second, rather than the first forecast above?
Equating the two MSEs, we get that you prefer the first forecast if
σ^2 + φ_1^2 ζ^2 ≤ σ^2(1 + φ_1^2),
i.e. if ζ^2 ≤ σ^2. So as soon as the outlier is only slightly larger than one standard deviation of the noise, we already prefer the second (two-step ahead) forecast.
5. Consider the two processes
Y_t = c + Y_{t−1} + ε_t,
X_t = δ · t + ε_t,
where ε_t ∼ WN(0, σ^2). Assume that c ≠ 0 and δ ≠ 0. The first process is a unit root process, which is also called a process with a stochastic trend. The second process has a deterministic trend.
(a) Show that neither process is covariance stationary.
Solution: For the unit root process, we have seen in the lecture slides for week 1 that Var(Y_t) = σ^2 · t (note: you need to show this). This depends on t and hence the process cannot be covariance stationary. For the deterministic trend process, we even have that E[X_t] = δ · t, which again depends on t.
(b) Suggest a transformation for both processes such that the resulting process is stationary. Reflect on the differences in the autocovariances.
For the stochastic trend, take first differences: ∆Y_t = c + ε_t, which is white noise around a constant. For the deterministic trend, subtracting the trend gives X_t − δ·t = ε_t, which is white noise. (First differencing X_t would also remove the trend, but yields ∆X_t = δ + ε_t − ε_{t−1}, a non-invertible MA(1), so it introduces autocorrelation at lag one that the original ε_t does not have.)
6. (W6V3) Suppose that Y_t = δ · t^2 + ε_t, where ε_t ∼ WN(0, σ^2). However, a researcher thinks that the model is Y_t = δ · t + ε_t. Suppose that the researcher estimates δ by least squares based on the linear trend model (while the actual data has a quadratic trend). Denote the estimator by δ̂.
(a) Show that δ̂/T →p \frac{3}{4}δ. You can use that \sum_{t=1}^{T} t = \frac{T(T+1)}{2}, \sum_{t=1}^{T} t^2 = \frac{T(T+1)(2T+1)}{6}, and \sum_{t=1}^{T} t^3 = \frac{T^2(T+1)^2}{4}.
Solution: Let x_t = t. Then \sum_{t=1}^{T} x_t^2 = \sum_{t=1}^{T} t^2 = \frac{T(T+1)(2T+1)}{6}. Also, \sum_{t=1}^{T} x_t Y_t = δ\sum_{t=1}^{T} t^3 + \sum_{t=1}^{T} t ε_t = δ\frac{T^2(T+1)^2}{4} + \sum_{t=1}^{T} t ε_t. As such, the OLS estimator is
δ̂ = \frac{3}{2}δ\frac{T^2(T+1)^2}{T(T+1)(2T+1)} + \frac{6}{T(T+1)(2T+1)}\sum_{t=1}^{T} t ε_t.
Notice that the variance of the summation in the second term is Var(\sum_{t=1}^{T} t ε_t) = σ^2\sum_{t=1}^{T} t^2 = σ^2\frac{T(T+1)(2T+1)}{6}, so the second term has variance \frac{6σ^2}{T(T+1)(2T+1)} → 0 and is o_p(1). Dividing by T therefore gives
δ̂/T = \frac{3}{2}δ\frac{T+1}{2T+1} + o_p(1) →p \frac{3}{4}δ.
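A quick check of the limit in (a); the noise term is omitted because, as argued above, it is o_p(1) after dividing by T (δ and T are illustrative):

    import numpy as np

    delta, T = 2.0, 10_000
    t = np.arange(1.0, T + 1)
    y = delta * t ** 2                             # the actual (quadratic) trend, noise omitted

    delta_hat = np.sum(t * y) / np.sum(t ** 2)     # OLS slope of the misspecified linear trend
    print(delta_hat / T, 0.75 * delta)             # ~1.5001 vs 1.5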
(b) Calculate the MSE when forecasting h steps ahead with the misspecified model and using δ̂/T = \frac{3}{4}δ.
Solution: The researcher thinks that Y_{T+h} = δ(T + h) + ε_{T+h} and will forecast Ŷ_{T+h} = δ̂(T + h). This gives Ŷ_{T+h} = \frac{3}{4}δT(T + h). The MSE is
E[(Ŷ_{T+h} − Y_{T+h})^2] = σ^2 + \left(\frac{3}{4}δT(T + h) − δ(T + h)^2\right)^2 = σ^2 + \frac{δ^2}{16}(T + h)^2(T + 4h)^2.
Note it grows as T^4.
Week 7
1. Consider the VMA(1) process
Y_t = c + ε_t + Ψ_1 ε_{t−1},
where ε_t follows a (vector-valued) white noise process with covariance matrix Σ. Calculate the population mean, variance matrix and autocovariance matrix.
The population mean is obtained by applying the expectation operator left and right and noting that E[ε_t] = 0 by the definition of a WN process. We obtain E[Y_t] = c.
The population variance is
Γ_0 = E[(Y_t − c)(Y_t − c)'] = Σ + Ψ_1 Σ Ψ_1',
and the autocovariance matrices are Γ_1 = Ψ_1 Σ, Γ_{−1} = Σ Ψ_1', and Γ_k = 0 for |k| > 1.
2. Consider the VAR(1) process
Y_t = c + Φ_1 Y_{t−1} + ε_t,
where ε_t follows a (vector-valued) white noise process with covariance matrix Σ. Show that this can be written as
Y_t = µ + \sum_{j=0}^{∞} Ψ_j ε_{t−j}.
Write the VAR(1) process in lag polynomial notation, multiply from the left with an infinite MA lag polynomial, and then match the coefficients to get the desired result:
(Ψ_0 + Ψ_1 L + Ψ_2 L^2 + . . .)(I − Φ_1 L) = I  ⇒  Ψ_0 = I, Ψ_j = Φ_1 Ψ_{j−1} = Φ_1^j,
so that Y_t = µ + \sum_{j=0}^{∞} Φ_1^j ε_{t−j} with µ = (I − Φ_1)^{−1} c.
3. (W7V2) Using the VMA(∞) representation from the previous question, show that for the VAR(1) model, Γ_k = Φ_1 Γ_{k−1} for k = 1, 2, . . ..
E[(Y_t − µ)(Y_{t−k} − µ)'] = E\left[\sum_{j=0}^{∞}\sum_{l=0}^{∞} Ψ_j ε_{t−j} ε_{t−k−l}' Ψ_l'\right] = \sum_{l=0}^{∞} Ψ_{k+l} Σ Ψ_l' = \sum_{l=0}^{∞} Φ_1^{k+l} Σ (Φ_1^l)' = Φ_1 \sum_{l=0}^{∞} Φ_1^{k−1+l} Σ (Φ_1^l)' = Φ_1 Γ_{k−1}.
4. (W7V2) Show how to obtain γj for AR(2) using the companion form.
The companion form of an AR(2) model is
[Y_t; Y_{t−1}] = [φ_1, φ_2; 1, 0] [Y_{t−1}; Y_{t−2}] + [ε_t; 0].
To obtain γ_j, multiply with (Y_{t−j}, Y_{t−j−1}) and take expectations. Notice that all expectations involving ε_t are zero:
[γ_j, γ_{j+1}; γ_{j−1}, γ_j] = [φ_1, φ_2; 1, 0] [γ_{j−1}, γ_j; γ_{j−2}, γ_{j−1}].
5. Consider the VAR(2) process
Y_t = c + Φ_1 Y_{t−1} + Φ_2 Y_{t−2} + ε_t,
where ε_t follows a (vector-valued) white noise process with mean zero and covariance matrix Σ.
(a) Calculate the unconditional mean of Y_t.
Taking expectations and invoking covariance stationarity, we have
µ = E[Y_t] = (I − Φ_1 − Φ_2)^{−1} c.
Substituting c = (I − Φ_1 − Φ_2)µ, the process in deviations from the mean is
Y_t − µ = (I − Φ_1 − Φ_2)µ − µ + Φ_1 Y_{t−1} + Φ_2 Y_{t−2} + ε_t = Φ_1(Y_{t−1} − µ) + Φ_2(Y_{t−2} − µ) + ε_t.
(c) Write down the VAR(2) model in companion form.
The VAR(2) in companion form is
[Y_t − µ; Y_{t−1} − µ] = [Φ_1, Φ_2; I, O] [Y_{t−1} − µ; Y_{t−2} − µ] + [ε_t; 0].
(d) From Exercise 3, you know that Γ_{c,k} = Φ_{c,1} Γ_{c,k−1}. Use this to show that Γ_1 = Φ_1 Γ_0 + Φ_2 Γ_1'.
Multiplying the companion form from the right by (Y_{t−1}' − µ', Y_{t−2}' − µ') and taking expectations, we get
[Γ_1, Γ_2; Γ_0, Γ_1] = [Φ_1, Φ_2; I, O] [Γ_0, Γ_1; Γ_1', Γ_0],
and reading off the (1,1) block gives Γ_1 = Φ_1 Γ_0 + Φ_2 Γ_1'.
(b) Calculate the first two values of the orthogonalized IRF.
Let us find P, the Cholesky factor of Σ:
P = [0.3, 0; 0, 0.2].
The first two elements of the OIRF are
∂Y_t/∂u_t' = Ψ_0 P = [0.3, 0; 0, 0.2],  ∂Y_t/∂u_{t−1}' = Ψ_1 P = [0.15, 0.02; 0.12, 0.10].   (2)
Even though the matrix Σ is diagonal, the OIRF is still different from the IRF.
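A sketch of this computation; Ψ_1 below is the matrix implied by the numbers in the solution (Ψ_1 P with P = diag(0.3, 0.2)), and Σ = diag(0.09, 0.04):

    import numpy as np

    Sigma = np.diag([0.09, 0.04])                 # diagonal error covariance matrix
    Psi0  = np.eye(2)
    Psi1  = np.array([[0.5, 0.1],                 # VMA(1) coefficient matrix implied by
                      [0.4, 0.5]])                # the numbers in the solution above

    P = np.linalg.cholesky(Sigma)                 # lower triangular, here diag(0.3, 0.2)
    print(Psi0 @ P)                               # first OIRF matrix
    print(Psi1 @ P)                               # second OIRF matrix: [[0.15, 0.02], [0.12, 0.10]]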
7. (W7V3) Consider the following SVAR(1) where Y_t is a [3 × 1] vector:
A_0 Y_t = d + A_1 Y_{t−1} + u_t.
Assume that E[u_t u_t'] is a diagonal matrix. Also consider the corresponding reduced form VAR(1) process
Y_t = c + Φ_1 Y_{t−1} + ε_t.
How many restrictions do you need to impose on the structural VAR coefficients to be able to identify the structural coefficients from the reduced form VAR coefficients?
The structural model has 3 intercept parameters, 3 × 3 parameters relating Y_t to Y_{t−1}, 6 parameters for contemporaneous relations, and 3 variance parameters, so in total 21 parameters. The reduced-form model has 3 intercept parameters, 3 × 3 parameters relating Y_t to Y_{t−1}, 0 parameters for contemporaneous relations, and 6 parameters from the variance-covariance matrix, so in total 18 parameters. Therefore, we need to impose 3 restrictions on the structural VAR.
8. (W7V3) Consider the following structural VAR process derived from two ARDL(1,2) models as in the lectures:
Y_t = α [φ_{1,1}^{(s)} + γ_{1,2}^{(s)}φ_{2,1}^{(s)},  φ_{1,2}^{(s)} + γ_{1,2}^{(s)}φ_{2,2}^{(s)};  φ_{2,1}^{(s)} + γ_{2,1}^{(s)}φ_{1,1}^{(s)},  φ_{2,2}^{(s)} + γ_{2,1}^{(s)}φ_{1,2}^{(s)}] Y_{t−1} + α [1, γ_{1,2}^{(s)}; γ_{2,1}^{(s)}, 1] [ε_{1,t}^{s}; ε_{2,t}^{s}],
where α = (1 − γ_{1,2}^{(s)}γ_{2,1}^{(s)})^{−1}. Also, consider the reduced form corresponding to this process,
Y_t = [φ_{1,1}, φ_{1,2}; φ_{2,1}, φ_{2,2}] Y_{t−1} + [ε_{1,t}; ε_{2,t}].
(a) Suppose you know φ_{1,2} = 0. What does this tell you about φ_{1,2}^{(s)}?
If φ_{1,2} = 0, then we have that α(φ_{1,2}^{(s)} + γ_{1,2}^{(s)}φ_{2,2}^{(s)}) = 0. Since α cannot be equal to zero, this implies that φ_{1,2}^{(s)} + γ_{1,2}^{(s)}φ_{2,2}^{(s)} = 0, which tells you that φ_{1,2}^{(s)} = −γ_{1,2}^{(s)}φ_{2,2}^{(s)}. However, since we have no information on γ_{1,2}^{(s)} and φ_{2,2}^{(s)}, the restriction that φ_{1,2} = 0 does not tell us anything about φ_{1,2}^{(s)}. This means that even though you might not find a reduced form relation between Y_{1,t} and Y_{2,t−1}, this does not mean that there is no structural relation between Y_{1,t} and Y_{2,t−1}.
(b) Suppose you know that φ_{1,2}^{(s)} = 0. What does this tell you about φ_{1,2}?
The answer is similar to that of the previous exercise. If φ_{1,2}^{(s)} = 0, you know that φ_{1,2} = αγ_{1,2}^{(s)}φ_{2,2}^{(s)}. However, without further restrictions on the structural coefficients you cannot say anything about φ_{1,2}. So this means that even though there is no structural relation between Y_{1,t} and Y_{2,t−1}, there might be a reduced form relation between the two.
(c) Assume D = E[ε_t^{(s)}(ε_t^{(s)})'] is diagonal, with diagonal elements d_{11} and d_{22}. Denote Σ = E[ε_t ε_t']. Suppose that there is no contemporaneous effect of Y_2 on Y_1, i.e. γ_{1,2}^{(s)} = 0.
i. Find d_{11}, d_{22}, and γ_{2,1}^{(s)} in terms of reduced form parameters by comparing the covariance matrix of the structural errors to that of the reduced form errors.
If γ_{1,2}^{(s)} = 0, then α = 1. Also,
Σ = [1, 0; γ_{2,1}^{(s)}, 1] [d_{11}, 0; 0, d_{22}] [1, γ_{2,1}^{(s)}; 0, 1] = [d_{11}, γ_{2,1}^{(s)}d_{11}; γ_{2,1}^{(s)}d_{11}, (γ_{2,1}^{(s)})^2 d_{11} + d_{22}].
We see that
d_{11} = [Σ]_{1,1},
γ_{2,1}^{(s)}d_{11} = [Σ]_{1,2}  →  γ_{2,1}^{(s)} = [Σ]_{1,2}/[Σ]_{1,1},
(γ_{2,1}^{(s)})^2 d_{11} + d_{22} = [Σ]_{2,2}  →  d_{22} = [Σ]_{2,2} − [Σ]_{1,2}^2/[Σ]_{1,1}.
ii. Find the remaining structural coefficients in terms of the reduced form coefficients.
Since γ_{1,2}^{(s)} = 0, we immediately have φ_{1,2}^{(s)} = φ_{1,2} and φ_{1,1}^{(s)} = φ_{1,1}. Using this and the solution of the previous question, we have that
φ_{2,1}^{(s)} = φ_{2,1} − ([Σ]_{1,2}/[Σ]_{1,1})φ_{1,1},
φ_{2,2}^{(s)} = φ_{2,2} − ([Σ]_{1,2}/[Σ]_{1,1})φ_{1,2}.
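A small sketch of this identification step, using illustrative reduced-form values for Σ and Φ_1 (these numbers are not from the exercise):

    import numpy as np

    Sigma = np.array([[1.0, 0.3],
                      [0.3, 0.5]])
    Phi = np.array([[0.5, 0.1],
                    [0.2, 0.4]])

    # Identification under gamma^(s)_{1,2} = 0 (no contemporaneous effect of Y2 on Y1)
    d11       = Sigma[0, 0]
    gamma21_s = Sigma[0, 1] / Sigma[0, 0]
    d22       = Sigma[1, 1] - Sigma[0, 1] ** 2 / Sigma[0, 0]

    Phi_s = np.array([[Phi[0, 0],                         Phi[0, 1]],
                      [Phi[1, 0] - gamma21_s * Phi[0, 0], Phi[1, 1] - gamma21_s * Phi[0, 1]]])
    print(d11, d22, gamma21_s)
    print(Phi_s)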
Define m_t = E[Y_{t+h}|I_t].
(b) Show that the 1-step ahead MSE under this forecast is trace(Σ), with Σ the covariance matrix of the errors in the VAR(p) process.
The VAR(p) process is
Y_t = c + \sum_{i=1}^{p} Φ_i Y_{t−i} + ε_t.
We see that Ŷ_{t+1} = E[Y_{t+1}|I_t] = c + \sum_{i=1}^{p} Φ_i Y_{t+1−i}, and hence the forecast error is e_{t+1} = Y_{t+1} − Ŷ_{t+1} = ε_{t+1}. From the definition of the MSE given in the question, we then have that the MSE is
E[e_{t+1}'e_{t+1}] = E[ε_{t+1}'ε_{t+1}] = E[trace(ε_{t+1}ε_{t+1}')] = trace(E[ε_{t+1}ε_{t+1}']) = trace(Σ).