Handout 4 Cointegration
Cointegration
Reading:
1 Definition
The variables $y_t$ and $x_t$ are said to be cointegrated of order $d, b$, denoted $y_t, x_t \sim CI(d, b)$, if
1. the variables $y_t$ and $x_t$ are both I(d), that is, they need $d$ differences to induce stationarity;
2. there exists a linear combination $y_t - \beta x_t$ that is integrated of order $d - b$, with $b > 0$.
The coefficient $\beta$ is called the cointegrating vector (the cointegrating parameter in the case where there are just two variables). In economics the case generally considered is $d = b = 1$, so that the linear combination is I(0). This is an important result because an arbitrary linear combination of I(1) series will itself be I(1) (unless the series are cointegrated).
NOTE:
You cannot find cointegration between an I(0) (stationary) dependent variable and I(1) (non-stationary) explanatory variables. Similarly, there cannot be cointegration between an I(1) dependent variable and I(0) explanatory variables. Both of these regressions would be unbalanced (in that the variables are not all integrated of the same order).
For the series to be cointegrated they must have a common I(1) (stochastic trend) component, such that a linear combination of the series eliminates this common stochastic trend.
Consider, for example,
$$x_t = \sum_{j=1}^{t} \xi_j + \varepsilon_t = z_t + \varepsilon_t$$
If we also have:
$$y_t = \beta z_t + \varsigma_t$$
then $y_t$ has the same I(1) stochastic trend component $z_t$ as well as a distinct stationary term, $\varsigma_t$. There is a linear combination of $y_t$ and $x_t$ that eliminates the common stochastic trend, $z_t$, and this is written as:
$$y_t - \beta x_t = (\beta z_t + \varsigma_t) - \beta (z_t + \varepsilon_t) = \varsigma_t - \beta \varepsilon_t$$
The right-hand side term $\varsigma_t - \beta \varepsilon_t$ is a linear combination of two I(0) series and is therefore I(0) (any linear combination of stationary I(0) series must itself be stationary). This means that while $y_t$ and $x_t$ are both I(1), there is a linear combination $y_t - \beta x_t$ which is I(0), and so the two series are cointegrated.
EC226 (Term 2: Handout 4) 1 DEFINITION
2 Spurious vs Cointegrating Relationships
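Consider a variable, $y_t$:
$$y_t = y_{t-1} + \varepsilon_t \quad \text{where } \varepsilon_t \sim (0, \sigma_1^2) \;\Rightarrow\; y_t = y_0 + \sum_{j=1}^{t} \varepsilon_j$$
and a variable, $x_t$:
$$x_t = x_{t-1} + \varsigma_t \quad \text{where } \varsigma_t \sim (0, \sigma_2^2) \;\Rightarrow\; x_t = x_0 + \sum_{j=1}^{t} \varsigma_j$$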
In this case both $y_t$ and $x_t$ are random walk processes without drift (and so are I(1) series). If we also assume $E(\varepsilon_t \varsigma_t) = 0$, then there is nothing linking the two series together. Despite this, $y_t$ and $x_t$ both trend over time (because of the stochastic trend $\sum_{j=1}^{t} \varepsilon_j$ in $y_t$ and $\sum_{j=1}^{t} \varsigma_j$ in $x_t$).
Therefore if we run a regression between the two variables:
yt = α + βxt + εt (1)
then while we would expect to only reject H0 : β = 0 around 5% of the time (for a 5% significance
level), when we are dealing with non-stationary series we can reject the null hypothesis up to 77% of
the time. If yt and xt were nonstationary processes with a non-zero drift parameter these rejection
rates would be even greater.
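The over-rejection described above can be checked with a small Monte Carlo experiment. The following is a minimal sketch, not part of the original handout: the sample size (100), the number of replications (500), the seed and the helper names `random_walk`, `slope_t_ratio` and `rejection_rate` are all assumptions chosen for illustration. It generates pairs of independent driftless random walks, regresses one on the other, and records how often a conventional two-sided 5% t-test rejects $H_0: \beta = 0$.

```python
import math
import random


def random_walk(n, rng):
    """Driftless random walk: cumulative sum of standard normal shocks."""
    level, path = 0.0, []
    for _ in range(n):
        level += rng.gauss(0.0, 1.0)
        path.append(level)
    return path


def slope_t_ratio(y, x):
    """OLS of y on a constant and x; returns the t-ratio of the slope."""
    n = len(y)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    beta = sxy / sxx
    alpha = y_bar - beta * x_bar
    rss = sum((yi - alpha - beta * xi) ** 2 for xi, yi in zip(x, y))
    se_beta = math.sqrt(rss / (n - 2) / sxx)
    return beta / se_beta


def rejection_rate(n_obs=100, n_reps=500, crit=1.96, seed=0):
    """Share of replications in which |t| exceeds the 5% critical value."""
    rng = random.Random(seed)
    rejections = 0
    for _ in range(n_reps):
        y = random_walk(n_obs, rng)  # independent of x by construction
        x = random_walk(n_obs, rng)
        if abs(slope_t_ratio(y, x)) > crit:
            rejections += 1
    return rejections / n_reps


rate = rejection_rate()
print(f"Share of samples rejecting beta = 0: {rate:.2f}")
```

If the t-test behaved as it does with stationary data, the printed share would be close to 0.05. With independent random walks it should be far higher, which is the spurious regression problem.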
Consequently, we need to distinguish between spurious and cointegrating relationships.
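STEP ONE:
Estimate the long-run equation, either as the static regression
$$y_t = \delta_0 + \delta_1 x_t + \varepsilon_t \tag{2}$$
by OLS, or by estimating a dynamic equation, for example
$$y_t = \gamma_0 + \gamma_1 x_t + \gamma_2 x_{t-1} + \gamma_3 x_{t-2} + \gamma_4 y_{t-1} + \gamma_5 y_{t-2} + u_t$$
by OLS, and solving that equation for the long-run equation, that is:
$$y_t = \frac{\gamma_0}{1 - \gamma_4 - \gamma_5} + \frac{\gamma_1 + \gamma_2 + \gamma_3}{1 - \gamma_4 - \gamma_5}\, x_t + \underbrace{\frac{u_t}{1 - \gamma_4 - \gamma_5}}_{e_t} \tag{3}$$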
Either equation (2) or equation (3) is the long-run equation. The residuals from equation (2), denoted as:
$$e_t = y_t - \hat{\delta}_0 - \hat{\delta}_1 x_t \tag{4}$$
where $\hat{\delta}_j$, $j = 0, 1$, is the OLS estimator of $\delta_j$, or the residuals from equation (3), denoted as:
$$e_t = y_t - \frac{\hat{\gamma}_0}{1 - \hat{\gamma}_4 - \hat{\gamma}_5} - \frac{\hat{\gamma}_1 + \hat{\gamma}_2 + \hat{\gamma}_3}{1 - \hat{\gamma}_4 - \hat{\gamma}_5}\, x_t \tag{5}$$
are measures of "disequilibrium" between $y_t$ and $x_t$, and cointegration requires that these residuals (which can be referred to as a "disequilibrium" term) are stationary. Therefore a test for cointegration revolves around whether the residuals $e_t$ in equation (4) or equation (5) are stationary.
The stationarity of the residuals, $e_t$, is determined using an ADF test of the form
$$\Delta e_t = \mu + \gamma e_{t-1} + \sum_{j=1}^{p} \delta_j \Delta e_{t-j} + \eta_t$$
where we are testing $H_0: \gamma = 0$ (the residuals are non-stationary and therefore the regression is spurious) versus $H_1: \gamma < 0$ (the residuals are stationary and therefore the regression is a cointegrating regression).
The critical values are taken from MacKinnon (1991) and have to be corrected for the number of non-stationary variables estimated in the long-run equation, $n$. The reason for this is that OLS by definition finds the parameters which minimise the RSS (and hence finds the residuals which look most stationary), which biases the results towards finding a cointegrating equation. Consequently, we adjust the critical values for $n$, the number of non-stationary variables in the long-run equation.
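The residual-based test can be sketched on simulated data as follows. This is an illustration under stated assumptions rather than the handout's own procedure: the data are generated from the common-trend example with an assumed $\beta = 2$, the helper `ols_fit` is hypothetical, the test regression omits the lagged differences ($p = 0$) to keep the code short, and the 5% critical value of -3.34 is the approximate $n = 2$ value quoted in this handout.

```python
import math
import random


def ols_fit(y, x):
    """OLS of y on a constant and x.

    Returns (intercept, slope, slope t-ratio, residuals).
    """
    n = len(y)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    resid = [yi - intercept - slope * xi for xi, yi in zip(x, y)]
    rss = sum(r * r for r in resid)
    t_ratio = slope / math.sqrt(rss / (n - 2) / sxx)
    return intercept, slope, t_ratio, resid


# Simulate two series that share one stochastic trend z_t (assumed beta = 2).
rng = random.Random(1)
T, beta = 200, 2.0
z = 0.0
x, y = [], []
for _ in range(T):
    z += rng.gauss(0.0, 1.0)                  # common I(1) trend
    x.append(z + rng.gauss(0.0, 1.0))         # x_t = z_t + stationary noise
    y.append(beta * z + rng.gauss(0.0, 1.0))  # y_t = beta * z_t + stationary noise

# Long-run (static) regression and its residuals e_t.
delta0_hat, delta1_hat, _, e = ols_fit(y, x)

# Dickey-Fuller regression on the residuals: de_t = mu + gamma * e_{t-1} + eta_t.
d_e = [e[t] - e[t - 1] for t in range(1, T)]
e_lag = e[:-1]
_, gamma_hat, df_stat, _ = ols_fit(d_e, e_lag)

CRITICAL_5PCT = -3.34  # approximate n = 2 value quoted in this handout
cointegrated = df_stat < CRITICAL_5PCT
print(f"delta1_hat = {delta1_hat:.3f}, DF statistic = {df_stat:.2f}, "
      f"reject H0 (cointegration): {cointegrated}")
```

The key point is that the t-ratio on $\gamma$ is compared with the adjusted critical value, not with the standard Dickey-Fuller or normal tables.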
STEP TWO:
If there is a cointegrating equation then you should estimate an Error Correction Model (ECM) of the form:
$$\Delta y_t = \phi_0 + \sum_{j=1} \phi_j \Delta y_{t-j} + \sum_{h=1} \theta_h \Delta x_{t-h} + \alpha e_{t-1} + \eta_t \tag{6}$$
by OLS, where we assume $\eta_t$ is a well-behaved error term. As this equation has only I(0) variables (remember the long-run residuals $e_{t-1}$ are stationary and all other variables are in first differences), standard hypothesis testing using t-ratios and diagnostic testing of the error term is appropriate. Note that $\alpha < 0$, as this is an error correcting model, such that if:
$$e_{t-1} > 0 \;\Rightarrow\; y_{t-1} > \hat{\delta}_0 + \hat{\delta}_1 x_{t-1}$$
then $y_{t-1}$ is above its equilibrium value. In which case $\alpha e_{t-1} < 0 \Rightarrow \Delta y_t < 0$, that is, the dependent variable falls.
Similarly, if
$$e_{t-1} < 0 \;\Rightarrow\; y_{t-1} < \hat{\delta}_0 + \hat{\delta}_1 x_{t-1}$$
then $y_{t-1}$ is below its equilibrium value. In which case $\alpha e_{t-1} > 0 \Rightarrow \Delta y_t > 0$, that is, the dependent variable rises. In general, the dependent variable is adjusting (correcting) for disequilibrium (errors) in the previous period. Because this equation has only I(0) variables, it should be consistent with the CLRM assumptions, and this can be tested as the distribution of all statistics is well-behaved.
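To see what a given adjustment coefficient implies, the error-correction mechanism can be traced out numerically. The sketch below is illustrative only and rests on strong simplifying assumptions: $x_t$ is held fixed, there are no short-run dynamics and no new shocks, so that $\Delta e_t = \Delta y_t = \alpha e_{t-1}$. The value $\alpha = -0.31$ and the initial disequilibrium of 1 are assumed for illustration.

```python
import math

alpha = -0.31   # assumed adjustment coefficient (must lie between -1 and 0 here)
gap = 1.0       # assumed initial disequilibrium e_0 (y above equilibrium)

path = [gap]
for _ in range(8):
    # With x fixed and no shocks, the change in y equals alpha * e_{t-1},
    # so the disequilibrium shrinks by the factor (1 + alpha) each period.
    gap = gap + alpha * gap
    path.append(gap)

# Number of periods needed for half of the initial disequilibrium to be removed.
half_life = math.log(0.5) / math.log(1.0 + alpha)

print([round(g, 3) for g in path])
print(f"half-life in periods: {half_life:.2f}")
```

Under these assumptions a coefficient of about -0.3 removes roughly one-third of the remaining gap each period, so half of any disequilibrium is gone in just under two periods.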
This equation looks a bit like the ECM in equation (6), apart from (and very importantly) we do
not have the error correction term, et−1 , but we have still transformed all of the variables to be I(0).
If you have a combination of I(1) and I(0) variables in your regression and you do not have cointe-
gration between the I(1) variables, then any short-run regression you estimate between the variables
must involve only the I(0) variables (thereby ensuring the equation is balanced), and so we may have:
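$$\Delta y_t = \phi_0 + \sum_{j=1} \phi_j \Delta y_{t-j} + \sum_{h=1} \theta_h \Delta x_{1,t-h} + \sum_{h=1} \kappa_h x_{2,t-h} + \eta_t$$
if
$$y_t \sim I(1), \quad x_{1t} \sim I(1), \quad x_{2t} \sim I(0)$$
Alternatively,
$$y_t = \phi_0 + \sum_{j=1} \phi_j y_{t-j} + \sum_{h=1} \theta_h \Delta x_{1,t-h} + \sum_{h=1} \kappa_h \Delta x_{2,t-h} + \eta_t$$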
4. endogeneity
However, we do not have to worry about these issues as much with cointegration analysis and estimating our long-run equations by OLS. The reason for this is that the OLS estimators of the long-run coefficients are said to be "super-consistent".
The idea of super-consistency is that the OLS estimator approaches the true coefficient very quickly as the number of observations $T$ increases. Consider the OLS estimator $\hat{\delta}_1$ that enters equation (4):
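$$\hat{\delta}_1 = \frac{\sum_t (x_t - \bar{x})(y_t - \bar{y})}{\sum_t (x_t - \bar{x})^2} = \delta_1 + \frac{\sum_t (x_t - \bar{x})\varepsilon_t}{\sum_t (x_t - \bar{x})^2} \;\Rightarrow\; \hat{\delta}_1 - \delta_1 = \frac{\sum_t (x_t - \bar{x})\varepsilon_t / T}{\sum_t (x_t - \bar{x})^2 / T} \tag{7}$$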
Now $\sum_t (x_t - \bar{x})\varepsilon_t / T$ is the covariance between a stationary I(0) series, $\varepsilon_t$, and the nonstationary I(1) series, $x_t$, and this term will not increase over time. In contrast, $\sum_t (x_t - \bar{x})^2 / T$ is the variance of the I(1) series, $x_t$, and this increases with $T$. Consequently, as $T \to \infty$ the denominator becomes very large relative to the numerator, and therefore the term
$$\frac{\sum_t (x_t - \bar{x})\varepsilon_t / T}{\sum_t (x_t - \bar{x})^2 / T} \to 0$$
quickly, meaning the bias disappears very quickly. This will always be the case provided the terms entering into $\varepsilon_t$ are I(0) terms.
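The speed of convergence can be illustrated by simulation. The following is a rough sketch under assumed settings (a true $\delta_1 = 1$, standard normal errors, 300 replications, and the hypothetical helpers `slope_error` and `mean_abs_error`): it estimates the long-run slope when $x_t$ is a random walk and reports the average absolute estimation error for several sample sizes.

```python
import random


def slope_error(n_obs, rng, delta1=1.0):
    """Absolute error of the OLS slope when x_t is a random walk."""
    level = 0.0
    x, y = [], []
    for _ in range(n_obs):
        level += rng.gauss(0.0, 1.0)                   # x_t is I(1)
        x.append(level)
        y.append(delta1 * level + rng.gauss(0.0, 1.0))  # I(0) error term
    x_bar, y_bar = sum(x) / n_obs, sum(y) / n_obs
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    return abs(sxy / sxx - delta1)


def mean_abs_error(n_obs, n_reps=300, seed=0):
    """Average absolute slope error over n_reps simulated samples."""
    rng = random.Random(seed)
    return sum(slope_error(n_obs, rng) for _ in range(n_reps)) / n_reps


errors = {n: mean_abs_error(n) for n in (50, 200, 800)}
for n, err in errors.items():
    print(f"T = {n:4d}: mean |delta1_hat - delta1| = {err:.4f}")
```

With stationary regressors the error would shrink roughly in proportion to $1/\sqrt{T}$, so quadrupling the sample would about halve it. Here it should fall noticeably faster, which is what "super-consistent" refers to.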
For example, consider the case where there is an omitted relevant variable (which is expressed as an incorrectly specified dynamic equation) and there is a true (T) equation and an estimated (F) equation:
$$y_t = \beta_0 + \beta_1 x_{t-1} + \varepsilon_t \tag{T}$$
$$y_t = \delta_0 + \delta_1 x_t - \delta_1 (x_t - x_{t-1}) + \varepsilon_t \tag{F}$$
Here the omitted variable is $z_t = -\delta_1 (x_t - x_{t-1})$, which is an I(0) process. Therefore,
$$\begin{aligned}
E(\hat{\delta}_1) &= \beta_1 - \delta_1 \frac{cov(x_t, z_t)}{var(x_t)} \\
&= \beta_1 - \delta_1 \frac{cov(x_t, \Delta x_t)}{var(x_t)} \\
&= \beta_1 - \delta_1 \frac{cov(x_t, \Delta x_t)/T}{var(x_t)/T}
\end{aligned} \tag{8}$$
The numerator of the bias term in equation (8) will not increase with $T$, as it is the covariance between an I(1) and an I(0) series, while the denominator will increase with $T$; therefore, asymptotically the bias term goes to zero. The same argument works for measurement error, incorrect functional form, and endogeneity bias, which are all likely to be I(0) biases.
NOTE:
The t-ratios from estimating equation (2) cannot be interpreted, as this long-run equation will have
serial correlation (due to incorrectly specified dynamics) as well as omitted variable problems. The
traditional diagnostic tests from (2) are largely unimportant as the only important question is the
stationarity or otherwise of the residuals.
A Example of Cointegration Analysis
For all series there is some evidence of potential non-stationarity, and as a result we undertake unit root tests on each of the series using model C with 2 lags, i.e.
$$\Delta y_t = \alpha + \mu t + \gamma y_{t-1} + \sum_{j=1}^{2} \delta_j \Delta y_{t-j} + \varepsilon_t$$
with $H_0: \gamma = 0$; $H_1: \gamma < 0$.
There is a very significant and positive relationship between the two, but if we undertake a unit root
test on the residuals (denoted, res y2):
The critical value comes from MacKinnon’s tables with n = 2 (see Appendix 2) and at the 5%
significance level this is approximately -3.34 and we are unable to reject H0 . This is not surprising as
the stochastic trend in y2t cannot be cancelled by anything in the stationary series, y1t , which does
not have a stochastic trend – by definition.
Now we look for cointegration between y1t and y2t .
There is a very significant and positive relationship between the two, but if we undertake a unit root
test on the residuals (denoted, res y1):
The critical value comes from MacKinnon’s tables with n = 2 and at the 5% significance level this
is approximately -3.34. Therefore we are able to reject H0 . However, this test makes no sense as
the dependent variable y1t is stationary and therefore any residuals from this equation MUST be
stationary – but this has nothing to do with cointegration. Now we look for cointegration between
y3t and y2t .
There is no real relationship between the two series and if we undertake a unit root test on the
residuals (denoted, res y3):
The critical value comes from MacKinnon’s tables with n = 2 and at the 5% significance level this is approximately -3.34. We are unable to reject H0, so there is no cointegrating equation and hence no cointegration. However, there is no evidence of a spurious regression either, as the coefficient in the long-run equation is insignificant.
Now we look for cointegration between y4t and y2t .
There is a very significant and positive relationship between the two, but if we undertake a unit root
test on the residuals (denoted, res y4)
The critical value comes from MacKinnon’s tables with n = 2 and at the 5% significance level this
is approximately -3.34. Therefore we are able to reject H0 and this implies we have a cointegrating
equation between the two series.
Given a cointegrating equation, we now look at building an ECM and the form of the ECM should
be:
The model might be over-parameterised as the coefficients of the lag 2 variables are insignificantly
different from zero. However, the term on the lagged residuals (res y4) is -0.31 suggesting that around
one-third of the disequilibrium is corrected for within one quarter. Diagnostic testing of the residuals
gives:
B Critical Values for the ADF Test
$$C(p) = \phi_\infty + \phi_1 T^{-1} + \phi_2 T^{-2}$$
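The response-surface formula above is straightforward to evaluate for a given sample size. In the sketch below the function name `mackinnon_cv` is hypothetical, and the coefficients are the values commonly quoted for the 5% level with $n = 2$ and a constant but no trend; they are given from memory as an assumption and should be checked against the table in this appendix before use.

```python
def mackinnon_cv(T, phi_inf, phi_1, phi_2):
    """Finite-sample critical value C(p) = phi_inf + phi_1 / T + phi_2 / T**2."""
    return phi_inf + phi_1 / T + phi_2 / T ** 2


# Assumed coefficients: 5% level, n = 2, constant and no trend.
# Verify these against the published table before relying on them.
PHI_5PCT_N2 = (-3.3377, -5.967, -8.98)

cv_100 = mackinnon_cv(100, *PHI_5PCT_N2)
cv_large = mackinnon_cv(10_000, *PHI_5PCT_N2)
print(f"T = 100:    {cv_100:.4f}")
print(f"T = 10000:  {cv_large:.4f}")
```

As $T$ grows the two correction terms vanish and the critical value approaches $\phi_\infty$, consistent with the approximate value of -3.34 used in the examples.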
C Stata Commands for Cointegration
D R Commands for Cointegration
References
Dougherty, C. (2016). Introduction to Econometrics. OUP Catalogue. Oxford University Press.
MacKinnon, J. (1991). Critical values for cointegration tests. In Engle, R. F. and Granger, C. W. J., editors, Long-Run Economic Relationships, chapter 13. Oxford University Press, Oxford.
Stock, J. and Watson, M. W. (2003). Introduction to Econometrics. Prentice Hall, New York.