Econometrics For Finance Ch6
Page 79
A given series can be either stationary or non-stationary. The main difference between these
series is the degree of persistence of shocks.
A stochastic process is said to be stationary if its mean and variance are constant over time
and the covariance between values at two time periods depends only on the distance, gap, or
lag between the two periods and not on the actual time at which the covariance is computed.
Such a stochastic process is known as weakly stationary, or covariance stationary. Such a
time series will tend to return to its mean (called mean reversion), and fluctuations around
this mean (measured by its variance) will have broadly constant amplitude.
A time series is strictly stationary if all the moments of its probability distribution are
invariant over time. If, however, the stationary process is normal, the weakly stationary
stochastic process is also strictly stationary, for the normal stochastic process is fully
specified by its two moments, the mean and the variance.
To explain weak stationarity, let Yt be a stochastic time series with these properties:
Mean: E(Yt) = µ
Variance: var (Yt) = E(Yt − µ)2 = σ2
Covariance: γk = Cov (Yt, Yt-k) = Cov (Yt, Yt+k) = E[(Yt − µ) (Yt+k − µ)]
As the covariance (autocovariances) are not independent of the units in which the variables
are measured, it is common to standardize it by defining autocorrelations ρk as
ρk = Corr(Yt, Yt−k) = Cov(Yt, Yt−k) / Var(Yt)
Note that ρ0 = 1, while − 1 ≤ ρk ≤ 1.
The correlation of a series with its own lagged values is called autocorrelation or serial
correlation.
The autocorrelations considered as a function of k are referred to as the autocorrelation
function (ACF).
From the ACF we can infer the extent to which one value of the process is correlated with
previous values and thus the length and strength of the memory of the process. It indicates
how long (and how strongly) a shock in the process (εt) affects the values of Yt.
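The ACF described above can be estimated directly from data. The sketch below is a minimal numpy illustration; the simulated AR(1) series and the seed are hypothetical, not any series from the text:

```python
import numpy as np

def sample_acf(y, max_lag):
    """Sample ACF: rho_k = gamma_k / gamma_0, computed around the sample mean."""
    y = np.asarray(y, dtype=float)
    d = y - y.mean()
    gamma0 = d @ d / len(y)
    return np.array([(d[k:] @ d[:len(y) - k] / len(y)) / gamma0
                     for k in range(max_lag + 1)])

# Simulated AR(1) with coefficient 0.8: the ACF should decay roughly as 0.8**k,
# reflecting a long (but fading) memory of past shocks.
rng = np.random.default_rng(0)
n = 2000
y = np.zeros(n)
eps = rng.standard_normal(n)
for t in range(1, n):
    y[t] = 0.8 * y[t - 1] + eps[t]

rho = sample_acf(y, 5)
print(rho.round(2))  # rho[0] is exactly 1; rho[1] should be near 0.8
```

By construction ρ0 = 1 and each ρk lies in [−1, 1], matching the properties stated above.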
A shock in an MA(q) process affects Yt for q + 1 periods only, while a shock in an AR(p)
process affects all future observations with a decreasing effect.
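This difference in memory can be checked by simulation. The sketch below (a hypothetical numpy example, not data from the text) compares an MA(1), whose ACF cuts off after lag q = 1, with an AR(1), whose ACF decays gradually:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 20000
eps = rng.standard_normal(n)

# MA(1): Y_t = eps_t + 0.7 * eps_{t-1}. A shock influences only q + 1 = 2
# consecutive observations, so autocorrelations beyond lag 1 are (near) zero.
ma = eps[1:] + 0.7 * eps[:-1]

# AR(1): Y_t = 0.7 * Y_{t-1} + eps_t. A shock affects all later observations
# with geometrically shrinking weight, so the ACF decays but never cuts off.
ar = np.zeros(n)
for t in range(1, n):
    ar[t] = 0.7 * ar[t - 1] + eps[t]

def acf(y, k):
    d = y - y.mean()
    return (d[k:] @ d[:len(d) - k]) / (d @ d)

print([round(acf(ma, k), 2) for k in (1, 2, 3)])  # large at lag 1, then near 0
print([round(acf(ar, k), 2) for k in (1, 2, 3)])  # gradual geometric decay
```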
Why are stationary time series so important? Because if a time series is non-stationary, we
can study its behaviour only for the time period under consideration. Each set of time series
data will therefore be for a particular episode. As a consequence, it is not possible to
generalize it to other time periods. Therefore, for the purpose of forecasting, such
(nonstationary) time series may be of little practical value. Besides, the classical t tests, F
tests, etc. are based on the assumption of stationarity.
If the two series are truly unrelated, the R² from regressing Y on X should tend to zero.
Yule showed, however, that (spurious) correlation could persist in nonstationary time series
even if the sample is very large.
A spurious regression can often be detected by regressing the first differences of Yt
(= ∆Yt) on the first differences of Xt (= ∆Xt): if the levels relationship is spurious, the
R² of this differenced regression is practically zero. One way to guard against spurious
regression is to find out whether the time series are cointegrated.
The usual statistical results do not hold for spurious regression when all the regressors are
I(1) and not cointegrated.
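The spurious regression problem is easy to reproduce. In the hypothetical sketch below, two independent random walks are regressed on each other; the levels regression can show a sizeable R² purely by chance, while the regression in first differences has R² near zero, as described above:

```python
import numpy as np

rng = np.random.default_rng(42)
n = 500
# Two independent random walks: there is no true relationship between them.
y = np.cumsum(rng.standard_normal(n))
x = np.cumsum(rng.standard_normal(n))

def r_squared(y, x):
    """R^2 of an OLS regression of y on a constant and x."""
    X = np.column_stack([np.ones(len(x)), x])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    tss = (y - y.mean()) @ (y - y.mean())
    return 1.0 - (resid @ resid) / tss

r2_levels = r_squared(y, x)                    # can look "significant" by chance
r2_diffs = r_squared(np.diff(y), np.diff(x))   # practically zero
print(round(r2_levels, 3), round(r2_diffs, 3))
```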
If we take first differences of the random walk without drift, (Yt − Yt−1) = ∆Yt = ut, it
becomes stationary. Hence, a RWM without drift is a difference stationary process, and we
call the RWM without drift integrated of order 1.
Random walk with drift: If β1 ≠ 0, β2 = 0, β3 = 1, we get Yt = β1 + Yt−1 + ut, which is a
random walk with drift and is therefore non-stationary. If we write it as (Yt − Yt−1) = ∆Yt
= β1 + ut, Yt will exhibit a positive (β1 > 0) or negative (β1 < 0) trend. Such a
trend is called a stochastic trend. This, too, is a difference stationary process,
because the non-stationarity in Yt can be eliminated by taking first differences of the
time series.
If a non-stationary time series has to be differenced d times to make it stationary, that
time series is said to be integrated of order d. A time series Yt integrated of order d is
denoted as Yt ∼ I(d).
If a time series Yt is stationary to begin with (i.e., it does not require any differencing), it
is said to be integrated of order zero, denoted by Yt ∼ I(0). Most economic time series are
generally I(1).
An I(0) series fluctuates around its mean with a finite variance that does not depend on
time, while an I(1) series wanders widely.
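The contrast between I(0) and I(1) behaviour, and the effect of differencing once, can be sketched as follows (hypothetical simulated data):

```python
import numpy as np

rng = np.random.default_rng(7)
# An I(1) random walk wanders widely: its sample variance grows with time.
walk = np.cumsum(rng.standard_normal(5000))
# Differencing once yields an I(0) series: mean near zero, finite variance.
dwalk = np.diff(walk)

print(round(np.var(walk), 1), round(np.var(dwalk), 2))  # very large vs roughly 1
print(round(dwalk.mean(), 3))                           # close to zero
```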
Subtracting Yt−1 from both sides of Yt = ρYt−1 + ut gives ∆Yt = δYt−1 + ut, where δ = (ρ − 1).
The null hypothesis now becomes δ = 0. If δ = 0, then ρ = 1, that is we have a unit
root.
It may be noted that if δ = 0, ∆Yt = (Yt – Yt-1) = ut and since ut is a white noise
error term, it is stationary.
If δ is zero, we conclude that Yt is nonstationary. But if it is negative, we
conclude that Yt is stationary.
Which test should we use to find out whether the estimated coefficient of Yt−1 is zero?
Under the null hypothesis that δ = 0, the t value of the estimated coefficient of Yt−1
does not follow the t distribution even in large samples; that is, it does not have an
asymptotic normal distribution. Hence, the t test cannot be used.
Dickey and Fuller have shown that under the null hypothesis that δ = 0, the
estimated t value of the coefficient of Yt-1 follows the τ (tau) statistic. These
authors have computed the critical values of the tau statistic on the basis of Monte
Carlo simulations.
Dickey–Fuller (DF) test
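The DF regression and its τ statistic can be computed by hand. The sketch below (plain numpy; the simulated series, seed, and sample size are hypothetical) regresses ∆Yt on a constant and Yt−1 and reports the t ratio of δ, which must be compared with DF critical values rather than Student's t:

```python
import numpy as np

def df_tau(y):
    """DF regression with constant: dY_t = b0 + delta * Y_{t-1} + u_t.
    Returns tau, the t ratio of delta-hat."""
    y = np.asarray(y, dtype=float)
    dy, ylag = np.diff(y), y[:-1]
    X = np.column_stack([np.ones(len(ylag)), ylag])
    beta, *_ = np.linalg.lstsq(X, dy, rcond=None)
    resid = dy - X @ beta
    s2 = (resid @ resid) / (len(dy) - X.shape[1])
    se = np.sqrt(s2 * np.linalg.inv(X.T @ X).diagonal())
    return beta[1] / se[1]

rng = np.random.default_rng(3)
n = 500
tau_rw = df_tau(np.cumsum(rng.standard_normal(n)))  # unit root: tau not very negative

eps = rng.standard_normal(n)
ar = np.zeros(n)
for t in range(1, n):
    ar[t] = 0.5 * ar[t - 1] + eps[t]
tau_st = df_tau(ar)  # stationary: tau far below the with-constant critical values
print(round(tau_rw, 2), round(tau_st, 2))
```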
We rule out the first model because the coefficient of GDPt−1, that is δ, is positive,
implying that ρ > 1.
E.g. The U.S. GDP time series
∆GDPt = 0.00576 GDPt−1
τ = (5.7980)
This model can be ruled out: here δ > 0, implying ρ > 1, so the GDP time series would be
explosive.
∆GDPt = 28.2054 − 0.00136 GDPt−1
τ = (1.1576) (−0.2191), ρ = 0.9986
Since τ = −0.2191 is less negative than the with-constant critical values, we do not reject
the unit root: the GDP time series is not stationary.
∆GDPt = 190.3857 + 1.4776 t − 0.0603 GDPt−1
τ = (1.8389) (1.6109) (−1.6252), ρ = 0.9397
Again, τ = −1.6252 is less negative than the constant-and-trend critical values, so the GDP
time series is not stationary.
Critical τ values:
                       1%        5%        10%
No constant          −2.5897   −1.9439   −1.6177
With constant        −3.5064   −2.8947   −2.5842
Constant and trend   −4.0661   −3.4614   −3.1567
The augmented Dickey–Fuller (ADF) test adds lagged differences of the dependent variable
to the DF regression: ∆Yt = β1 + β2t + δYt−1 + Σ αi ∆Yt−i + εt, where εt is a pure white
noise error term. The number of lagged difference terms must be enough to make the error
term serially uncorrelated.
In ADF we still test whether δ = 0 and the ADF test follows the same asymptotic
distribution as the DF statistic, so the same critical values can be used.
∆GDPt = 234.9729 + 1.8921 t − 0.0786 GDPt−1 + 0.3557 ∆GDPt−1
τ = (2.3833) (2.1522) (−2.2152) (3.4647)
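As a sketch of the ADF mechanics (hypothetical numpy code with simulated series, not the GDP data above), the regression simply augments the DF setup with lagged differences of Yt; τ is still the t ratio of δ:

```python
import numpy as np

def adf_tau(y, lags=1):
    """ADF regression with constant and trend:
    dY_t = b0 + b1*t + delta*Y_{t-1} + sum_i a_i * dY_{t-i} + e_t."""
    y = np.asarray(y, dtype=float)
    dy = np.diff(y)
    rows, target = [], []
    for t in range(lags, len(dy)):
        rows.append([1.0, float(t), y[t]] +          # constant, trend, Y_{t-1}
                    [dy[t - i] for i in range(1, lags + 1)])  # lagged differences
        target.append(dy[t])
    X, target = np.array(rows), np.array(target)
    beta, *_ = np.linalg.lstsq(X, target, rcond=None)
    resid = target - X @ beta
    s2 = (resid @ resid) / (len(target) - X.shape[1])
    se = np.sqrt(s2 * np.linalg.inv(X.T @ X).diagonal())
    return beta[2] / se[2]  # tau for delta; the same DF critical values apply

rng = np.random.default_rng(4)
n = 500
eps = rng.standard_normal(n)
ar = np.zeros(n)
for t in range(1, n):
    ar[t] = 0.5 * ar[t - 1] + eps[t]
tau_ar = adf_tau(ar)              # far below -4.0661: reject the unit root
tau_rw = adf_tau(np.cumsum(eps))  # typically much less negative: do not reject
print(round(tau_ar, 2), round(tau_rw, 2))
```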
The GDP series is still non-stationary.
In econometric modelling, the relationship between the dependent variable and the
explanatory variables is specified either as a static relationship or as a dynamic
relationship.
A static relationship defines the dependent variable as a function of a set of explanatory
variables at the same point in time. This form of relation is also called “the long run”
relationship.
A dynamic relation involves the non-contemporaneous relationship between the
variables. This relationship defines “the short run” relationship.
VAR
According to Sims, if there is true simultaneity among a set of variables, they should all
be treated on an equal footing; there should not be any a priori distinction between
endogenous and exogenous variables. It is in this spirit that Sims developed his VAR
model.
It is a truly simultaneous system in that all variables are regarded as endogenous.
The term autoregressive is due to the appearance of the lagged value of the dependent
variable on the right-hand side and the term vector is due to the fact that we are dealing
with a vector of two (or more) variables.
In VAR modeling the value of a variable is expressed as a linear function of the past, or
lagged, values of that variable and all other variables included in the model.
If each equation contains the same number of lagged variables in the system, it can be
estimated by OLS.
Yt = δ0 + Σ(i=1 to k) αi Yt−i + Σ(i=1 to k) βi Xt−i + ut
Xt = δ1 + Σ(i=1 to k) θi Xt−i + Σ(i=1 to k) γi Yt−i + vt
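Since every equation has the same regressors, the system can be estimated equation by equation with OLS, as stated above. A minimal numpy sketch (simulated bivariate VAR(1) data; the coefficient matrix and seed are hypothetical):

```python
import numpy as np

def var_ols(data, k):
    """OLS estimation of a VAR(k). data: (T, m) array of observations.
    Returns an (m, 1 + k*m) array; row j holds the intercept and lag
    coefficients of the equation for variable j."""
    T, m = data.shape
    # Regressor matrix: [1, Y_{t-1} (all m vars), ..., Y_{t-k} (all m vars)]
    X = np.column_stack([np.ones(T - k)] +
                        [data[k - i:T - i] for i in range(1, k + 1)])
    coefs, *_ = np.linalg.lstsq(X, data[k:], rcond=None)
    return coefs.T

# Simulate a stable bivariate VAR(1): Y_t = A @ Y_{t-1} + shock.
rng = np.random.default_rng(5)
A = np.array([[0.5, 0.2],
              [0.1, 0.4]])
T = 5000
Y = np.zeros((T, 2))
for t in range(1, T):
    Y[t] = A @ Y[t - 1] + rng.standard_normal(2)

B = var_ols(Y, 1)
print(B.round(2))  # each row: [intercept, lag-1 coefficients] near [0, row of A]
```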
One way of deciding this question is to use a criterion such as the Akaike or Schwarz
information criterion and choose the model that gives the lowest value of the criterion
(smallest prediction errors). Some trial and error is inevitable.
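The criterion comparison can be sketched for a univariate AR model (a hypothetical stand-in for lag selection in a VAR; the AR(2) data are simulated):

```python
import numpy as np

def ar_aic(y, k):
    """AIC of an AR(k) fitted by OLS. For simplicity each k uses its own
    maximal estimation sample, which is adequate for a rough comparison."""
    y = np.asarray(y, dtype=float)
    X = np.column_stack([np.ones(len(y) - k)] +
                        [y[k - i:len(y) - i] for i in range(1, k + 1)])
    beta, *_ = np.linalg.lstsq(X, y[k:], rcond=None)
    resid = y[k:] - X @ beta
    n = len(resid)
    return n * np.log(resid @ resid / n) + 2 * (k + 1)

rng = np.random.default_rng(9)
n = 2000
y = np.zeros(n)
eps = rng.standard_normal(n)
for t in range(2, n):
    y[t] = 0.5 * y[t - 1] + 0.3 * y[t - 2] + eps[t]

aics = {k: ar_aic(y, k) for k in range(1, 6)}
best = min(aics, key=aics.get)
print(best)  # for this AR(2) process the criterion typically selects k = 2
```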
Engle-Granger Test
Note that the EG test runs a static regression.
Be aware that the issue of efficient estimation of parameters in cointegrating relationships
is quite a different issue from the issue of testing for cointegration.
Assume personal consumption expenditure (PCE) and personal disposable income (PDI)
are individually I(1) variables and we regress PCE on PDI.
PCEt = β1 + β2 PDIt + ut
a) Estimate the error term: ut = PCEt − β1 − β2 PDIt
b) Perform a unit root test on the error term: ut = ρ ut−1 + εt
The null hypothesis in the Engle-Granger procedure is no-cointegration and the alternative is
cointegration. If the test shows that ut is stationary [or I(0)], it means that the linear
combination of PCE and PDI is stationary. If you take consumption and income as two I(1)
variables, savings defined as (income − consumption) could be I(0) and the initial equation is
meaningful. In this case we say that the two variables are cointegrated. If PCE and PDI are
not cointegrated, any linear combination of them will be non-stationary and, therefore, the ut
will also be non-stationary.
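The two-step procedure can be sketched end to end. The data below are simulated stand-ins for PCE and PDI (a random-walk "income" plus a stationary deviation), not the actual series from the text:

```python
import numpy as np

rng = np.random.default_rng(11)
n = 1000
# PDI simulated as I(1); PCE tied to it with a stationary error,
# so the pair is cointegrated by construction.
pdi = np.cumsum(rng.standard_normal(n)) + 100
pce = 5.0 + 0.9 * pdi + rng.standard_normal(n)

# Step (a): static regression of PCE on PDI, keep the residuals u_t.
X = np.column_stack([np.ones(n), pdi])
beta, *_ = np.linalg.lstsq(X, pce, rcond=None)
u = pce - X @ beta

# Step (b): DF-type test on u_t (no constant, since OLS residuals have mean 0):
# regress du_t on u_{t-1} and form the t ratio of the slope.
du, ulag = np.diff(u), u[:-1]
delta = (ulag @ du) / (ulag @ ulag)
resid = du - delta * ulag
se = np.sqrt((resid @ resid) / (len(du) - 1) / (ulag @ ulag))
tau = delta / se
print(round(beta[1], 2), round(tau, 2))  # slope near 0.9; tau strongly negative
```

A τ more negative than the Engle–Granger critical value rejects no-cointegration; note that these critical values differ from the ordinary DF ones because ut is itself estimated.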
τ = (−7.4808) (119.8712) (test statistics of the estimated intercept and slope from the
regression of PCE on PDI)
Since PCE and PDI are individually non-stationary, there is the possibility that this
regression is spurious.
Unit root (DF) test on the residuals of this regression: τ = (−3.7791)
The Engle–Granger 1% critical τ value is −2.5899, and since −3.7791 is more negative than
this, the residuals from the regression of PCE on PDI are I(0). Thus, this regression is
not spurious; we call it the static or long-run consumption function, and 0.9672 represents
the long-run, or equilibrium, marginal propensity to consume (MPC).
We just showed that PCE and PDI are cointegrated; that is, there is a long-term
relationship between the two. Of course, in the short run there may be disequilibrium.
The Granger representation theorem states that if two variables Y and X are
cointegrated, then the relationship between the two can be expressed as an ECM.
∆PCEt = α0 + α1 ∆PDIt + α2 ut−1 + εt, where ut−1 = PCEt−1 − β1 − β2 PDIt−1
This ECM equation states that ∆PCE depends on ∆PDI and also on the lagged equilibrium
error term. If the latter is nonzero, the model is out of equilibrium. Suppose ∆PDI is
zero and ut−1 is negative (i.e., PCE is below its equilibrium value). Then α2ut−1 will be
positive (since α2 is expected to be negative), causing ∆PCEt to be positive and leading
PCEt to rise in period t.
∆PCEt = 11.6918 + 0.2906 ∆PDIt − 0.0867 ut−1
t = (5.3249) (4.1717) (−1.6003)
Statistically, the coefficient on the equilibrium error term is not significantly different
from zero, suggesting that PCE adjusts to changes in PDI within the same time period. One
can interpret 0.2906 as the short-run marginal propensity to consume (MPC).
The error correction mechanism (ECM) developed by Engle and Granger is a means of
reconciling the short-run behavior of an economic variable with its long-run behavior.
The ECM links the long-run equilibrium relationship implied by cointegration with the
short-run dynamic adjustment mechanism that describes how the variables react when
they move out of long-run equilibrium.
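The full two-step ECM estimation can be sketched with simulated data (a hypothetical cointegrated pair with a built-in adjustment speed of −0.2, not the PCE/PDI series above):

```python
import numpy as np

rng = np.random.default_rng(13)
n = 2000
# Cointegrated pair: PCE adjusts toward 0.9 * PDI at speed -0.2 per period.
pdi = np.cumsum(rng.standard_normal(n)) + 100
pce = np.empty(n)
pce[0] = 0.9 * pdi[0]
for t in range(1, n):
    gap = pce[t - 1] - 0.9 * pdi[t - 1]          # lagged equilibrium error
    pce[t] = (pce[t - 1] + 0.3 * (pdi[t] - pdi[t - 1])
              - 0.2 * gap + 0.2 * rng.standard_normal())

# Step 1: static (long-run) regression, keep the residuals u_t.
X = np.column_stack([np.ones(n), pdi])
b, *_ = np.linalg.lstsq(X, pce, rcond=None)
u = pce - X @ b

# Step 2: ECM regression  dPCE_t = a0 + a1 * dPDI_t + a2 * u_{t-1} + e_t.
dpce, dpdi = np.diff(pce), np.diff(pdi)
Z = np.column_stack([np.ones(n - 1), dpdi, u[:-1]])
a, *_ = np.linalg.lstsq(Z, dpce, rcond=None)
print(a.round(2))  # a1 near 0.3 (short-run MPC), a2 near -0.2 (adjustment speed)
```

The estimated α2 recovers the speed at which the short-run dynamics pull the variables back toward the long-run equilibrium, which is exactly the reconciliation the ECM is meant to provide.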