
MULTIVARIATE TIME SERIES ANALYSIS

MRISHO HAMISI
@SUALISA
Email: [email protected]
Mobile: 0717 275 661
0683 947 378

VAR & VECM
VECTOR AUTOREGRESSIVE MODEL
Road map
i. What is VAR?
ii. Motivation for the VAR model
iii. Estimation procedure of a VAR model
iv. Worked example 1 using Stata
v. Diagnostic tests for a VAR model
vi. How to test Granger causality
vii. VECM
viii. Worked example 2
What is VAR?
Autoregressive implies the presence of lagged values of the dependent
variable on the RHS.
A Vector Autoregression (VAR) model is an extension of the univariate
autoregression model to multivariate time series data.
Vector connotes that you have a system of equations.
A VAR model is a multi-equation system in which all the variables are treated
as endogenous (dependent).
Endogeneity means none of them is determined outside the system (exogenous).
The model is constructed only if all the variables involved are integrated of
order 1.
If the variables are cointegrated, we estimate both the long-run model (VECM)
and the short-run model (VAR).
If there is no long-run relationship, we estimate only a VAR.
The model requires that all the variables take the same number of lags.
Below is an example of a VAR model specification.
What is VAR?
Let us consider GDP, exports and imports.
These three variables are sometimes thought to be endogenous in the
following sense:
• GDP growth could lead to increased exports
• Exports could in turn lead to increased imports
• Imports, especially of capital goods, could enhance production, leading
to increased GDP
However, the situations stated might vary from country to country.
One could therefore regard the three variables as belonging to a system of
simultaneous equations such that
• gdp_t = α + a11·gdp_{t-1} + … + ak1·gdp_{t-k} + b11·exp_{t-1} + … + bk1·exp_{t-k} + c11·imp_{t-1} + … + ck1·imp_{t-k} + ε_{1t}
k denotes the number of lags taken by each of the variables.
What is VAR?
The stated equation simply means that current GDP is a function of its own
lags, lags of exports and lags of imports.
a11 is the coefficient of lag 1 of gdp in equation 1, b11 is the coefficient
of lag 1 of exports in equation 1, and c11 is the coefficient of lag 1 of
imports in equation 1.
Similarly, the exports equation could be given as
• exp_t = β + a12·gdp_{t-1} + … + ak2·gdp_{t-k} + b12·exp_{t-1} + … + bk2·exp_{t-k} + c12·imp_{t-1} + … + ck2·imp_{t-k} + ε_{2t}
Likewise, the imports equation would be given as
• imp_t = γ + a13·gdp_{t-1} + … + ak3·gdp_{t-k} + b13·exp_{t-1} + … + bk3·exp_{t-k} + c13·imp_{t-1} + … + ck3·imp_{t-k} + ε_{3t}
What is VAR?
As you can see, all the variables are specified in levels.
The stochastic error terms in the model are often called impulses,
innovations or shocks.
They show the extent and direction of the shock that would propagate if one
of the variables were disturbed.
The choice of the optimal number of lags for all the variables is made using
the AIC and BIC criteria.
 Specifying too many lags could make the model suffer from
multicollinearity and statistically insignificant coefficients.
Equally, too few lags could make the model suffer from specification
errors.
Each equation in the system is estimated by OLS.
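Equation-by-equation OLS is all that VAR estimation requires. A minimal sketch in Python (hypothetical, noise-free data generated from a known one-lag equation, so OLS recovers the intercept and lag coefficient exactly):

```python
# Stripped-down illustration of estimating one autoregressive equation by OLS.
# Hypothetical data generated exactly as y_t = 2 + 0.5*y_{t-1} (no noise).

def ols_ar1(y):
    """Fit y_t = a + b*y_{t-1} by closed-form simple-regression OLS."""
    x = y[:-1]          # lagged values y_{t-1}
    z = y[1:]           # current values y_t
    n = len(x)
    mx = sum(x) / n
    mz = sum(z) / n
    b = sum((xi - mx) * (zi - mz) for xi, zi in zip(x, z)) / \
        sum((xi - mx) ** 2 for xi in x)
    a = mz - b * mx
    return a, b

# Generate a deterministic series from the known equation.
y = [10.0]
for _ in range(20):
    y.append(2 + 0.5 * y[-1])

a, b = ols_ar1(y)
print(round(a, 6), round(b, 6))   # recovers roughly (2.0, 0.5)
```

A real VAR equation regresses on several lags of every variable, but the principle is the same: each equation is a separate least-squares fit.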
Motivation for VAR Model
• Estimating relationships among variables when there is no cointegration
relationship among them
• Simulating a shock to the system and tracing its effects on the
endogenous variables
• Suited to time-series data that are autoregressive in nature (serially correlated)
• VAR model is one of the most successful and flexible models for the
analysis of multivariate time series.
• Especially useful for describing the dynamic behavior of economic
and financial time series.
• Useful for forecasting
Steps in performing VAR in STATA
1. Lag selection of Variables
• As noted in the above equations, the variables are interrelated through
lagged values of other variables. However, it is unclear at how many lags
the variables are interrelated.
• Therefore, to begin VAR, it is first necessary to identify the number of
lags at which the variables are interconnected.
2. Stationarity
The second step is to check for stationarity of the data, i.e. stationarity
of GDP, exports and imports.
Steps in performing VAR in STATA
3. Test for Co-integration
• Co-integration arises when two or more non-stationary variables enter a
regression, yet the residuals estimated from that regression turn out to be
stationary. That is, a combination of two or more non-stationary series may
be stationary. This is called co-integration.
• The implication of co-integration is that the variables have a long-term
causal link: in the long run, the variables converge towards an equilibrium
value. The equilibrium relationship is steady, with constant mean and
variance, i.e. 'stationary'.
• Therefore, before initiating VAR, find out whether the present model
contains any co-integration (equilibrium) relationship. Co-integration
indicates a long-term association between two or more non-stationary
variables.
Steps in performing VAR in STATA
4. If Co-integration is not present = We apply VAR.
• VAR technique where variables are endogenous and dependent on
lagged values of other variables.
5. If co-integration is present = apply Vector Error Correction Model
(VECM).
• VECM model takes into account the long term and short term
causality dynamics. It also offers a possibility to apply VAR to
integrated multivariate time series.
6. VECM / VAR diagnostic, tests and forecasting
• Based on the constructed VECM model, review the assumptions of
autocorrelation and normality, and then proceed to forecast.
Estimation Procedure of the VAR model
(i) Specify the model correctly
(ii) Set stata to recognize data as time series
(iii)Do stationarity tests to ensure that all the variables are I(1)
(iv)Determine the optimal number of lags using the syntax
varsoc variable1 variable2 variable3
(v) Then choose the maximum number of lags based on the AIC or BIC
value
(vi) Check for an evidence of cointegration following the procedures
outlined in the next slide
Checking for cointegration in a VAR model
Type the following:
• vecrank variable1 variable2 variable3 …., trend(constant) lags(4) max
Alternatively
• Go to statistics
• Multivariate time series
• Cointegrating rank of a vector error-correction model
• Put all the variables in the 'Dependent variables' option
• Put the value of the maximum number of lags identified before
• Then click OK
Results will appear with the null hypothesis of no cointegration at all
possible ranks.
Reject or fail to reject the hypotheses on the basis of the trace statistic
and the maximum-eigenvalue statistic.
Estimation Procedure of the VAR model
(vii) Estimate long run relationship if there is cointegration, otherwise
just estimate VAR
(viii) Estimate the model using the syntax var variable1 variable2,
lags(1/n) where n is the maximum number of lags identified before
(ix) Perform diagnostic tests using the following syntax:
• (i) varlmar - for autocorrelation
• (ii) varnorm, jbera - for normality
• (iii) varstable, graph - for stability
Setting the time variable
• Firstly set the time variable for the new time series data. Since the
data of 1980-2018 is in yearly format, use the below command:
tsset year

. tsset year
time variable: year, 1980 to 2018
delta: 1 unit
Test stationarity of the data
[Figure: time-series plots of percapita, inflation1 and inflation2 against
year, 1980-2020.]
Augmented Dickey-Fuller (ADF) Test: A test for stationarity
• For stationarity testing, the hypotheses are:
H0: The time series is non-stationary, against
H1: The time series is stationary
If a series is non-stationary in levels, it is differenced (d = 1) and the
ADF test is applied to the differenced series. First-order differencing
(d = 1) means we generate a differenced series from the current value and
the immediately preceding one: ΔY_t = Y_t - Y_{t-1}.
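The first-difference transformation described here is a one-liner; a minimal sketch with made-up numbers:

```python
# First differencing: delta_y[t] = y[t] - y[t-1], which shortens the
# series by one observation.
def first_difference(y):
    return [y[t] - y[t - 1] for t in range(1, len(y))]

y = [100.0, 103.0, 101.0, 106.0]   # hypothetical level series
dy = first_difference(y)
print(dy)   # [3.0, -2.0, 5.0]
```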
Note:
• A time series is “stationary” if all of its statistical properties—mean,
variance, autocorrelations, etc.—are constant in time. Thus, it has no
trend, no heteroscedasticity.
Augmented Dickey–Fuller unit root test.
• In our final evaluation, all the variables became stationary after first
differencing. Hence they are integrated of order one, I(1). Since all the
series are non-stationary in levels, one can estimate an econometric model
only if they are co-integrated. Thus, co-integration tests can be applied
to all variables.

Variable     Level (constant and trend)   First difference (constant and trend)   Order of integration
Inflation1   -1.385 (0.5897)              -7.178 (0.0000)                         I(1)
Inflation2   -2.019 (0.2782)              -11.154 (0.0000)                        I(1)
Percapita     0.845 (0.9923)              -5.215 (0.000)                          I(1)
Lag selection and cointegration test in VAR
• To start with lag selection parameters in STATA, follow the steps
below:
1.Click on ‘Statistics’ on the result window
2.Choose ‘Multi-variate Time Series’
3.Click on ‘VAR Diagnostic and Test’
4.Select ‘Lag-order selection statistics’.
Lag selection and cointegration test in VAR
Example one
Consider the variables inflation1, inflation2 and percapita
• We had seen before that they are non-stationary
• All three are I(1): they become stationary when differenced once
• Let us find the maximum number of lags by typing
• varsoc inflation1 inflation2 percapita
• The results suggest that the maximum number of lags is 4
Worked example1 using STATA
. varsoc inflation1 inflation2 percapita

Selection-order criteria
Sample: 1984 - 2017 Number of obs = 34

lag LL LR df p FPE AIC HQIC SBIC

0 -468.649 2.2e+08 27.7441 27.79 27.8787


1 -378.64 180.02 9 0.000 1.9e+06 22.9788 23.1625 23.5175*
2 -371.972 13.336 9 0.148 2.2e+06 23.116 23.4375 24.0588
3 -354.578 34.787 9 0.000 1.4e+06 22.6223 23.0816 23.969
4 -341.216 26.725* 9 0.002 1.2e+06* 22.3656* 22.9627* 24.1165

Endogenous: inflation1 inflation2 percapita


Exogenous: _cons
Identify the number of lags
• The results table shows the number of lags in the first column, followed
by the selection criteria: Final Prediction Error (FPE), Akaike Information
Criterion (AIC), Hannan-Quinn Information Criterion (HQIC), and Schwarz
Bayesian Information Criterion (SBIC). STATA computes these four criteria
as well as a sequence of likelihood-ratio tests.
• To identify the number of lags, select the values marked with an
asterisk (*).
• The lag chosen by FPE is 4. Following the same rule, the lag chosen by
AIC is also 4, by HQIC it is 4, and by SBIC it is 1. To select the optimal
lag for the VAR, follow the majority.
• Therefore, the number of lags selected for the present case is 4.
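The criteria in the varsoc table can be reproduced by hand. A sketch, assuming the standard per-observation formulas, using the lag-4 row above (LL = -341.2159, N = 34, and 39 free parameters: 3 equations x (3 variables x 4 lags + a constant)):

```python
import math

def info_criteria(ll, n_obs, n_params):
    """Per-observation information criteria (the form varsoc reports)."""
    base = -2 * ll / n_obs
    aic = base + 2 * n_params / n_obs
    hqic = base + 2 * math.log(math.log(n_obs)) * n_params / n_obs
    sbic = base + math.log(n_obs) * n_params / n_obs
    return aic, hqic, sbic

# Lag-4 row of the varsoc output: LL = -341.2159, N = 34, 39 parameters.
aic, hqic, sbic = info_criteria(-341.2159, 34, 39)
print(round(aic, 4), round(hqic, 4), round(sbic, 4))  # 22.3656 22.9627 24.1165
```

The computed values match the lag-4 row of the output, which is why that row carries the asterisks for AIC and HQIC.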
Johansen cointegration test
• Johansen cointegration test, also known as the eigenvalue test or
trace test, is a likelihood ratio test. There are two tests under
Johansen cointegration; maximum eigenvalue test, and trace test. For
both test statistics, the initial Johansen test is a test of the null
hypothesis of no cointegration against the alternative of
cointegration.
• The null hypothesis for this test differs in the case of differing ranks.
For clarity, the Johansen cointegration test is performed for
variables inflation1, inflation2 and percapita. Follow these steps to
start (figure below):
1.Click on ‘Statistics’ on ‘Result’ window
2.Select ‘Multivariate Time-series’
3.Select ‘Co-integrating rank of a VECM’.
Worked example1 using STATA
To test for a cointegrating relationship, type the following:
• vecrank inflation1 inflation2 percapita, trend(constant) lags(4) max
Alternatively
• Go to statistics
• Multivariate time series
• Cointegrating rank of a vector error-correction model
• Put all the variables in the 'Dependent variables' option
• Put the number of lags as identified before
• Click reporting and tick "Report maximum-eigenvalue statistic"
• Then click OK
The results, as indicated in the next slides, show that there is no
cointegrating equation.
• In the 'Dependent variables' option, select the three time-series
variables inflation1, inflation2 and percapita. Since co-integration
analysis considers the non-stationary variables themselves when checking
for causality, take inflation1, inflation2 and percapita rather than
their first differences. Then select the number of lags. The
lag-selection analysis was conducted previously, so the number of lags
here is 4.
• After selecting the lag, click on the 'Reporting' tab of the vecrank
window and tick 'Report maximum-eigenvalue statistic'
(figure below). Click on 'OK'.
. vecrank inflation1 inflation2 percapita, trend(constant) lags(4) max

Johansen tests for cointegration


Trend: constant Number of obs = 34
Sample: 1984 - 2017 Lags = 4

5%
maximum trace critical
rank parms LL eigenvalue statistic value
0 30 -349.71115 . 16.9904* 29.68
1 35 -344.49732 0.26413 6.5628 15.41
2 38 -341.23226 0.17474 0.0326 3.76
3 39 -341.21594 0.00096

5%
maximum max critical
rank parms LL eigenvalue statistic value
0 30 -349.71115 . 10.4277 20.97
1 35 -344.49732 0.26413 6.5301 14.07
2 38 -341.23226 0.17474 0.0326 3.76
3 39 -341.21594 0.00096
• The result of the Johansen co-integration test can be interpreted in
parts. Focus on three columns: maximum rank, the trace (or max)
statistic, and the critical values.
Starting from maximum rank zero, the null and alternative hypotheses are
as follows:
• Null hypothesis: there is no cointegration
• Alternative hypothesis: there is cointegration
• As the output above shows, at maximum rank zero the trace statistic
(16.9904) does not exceed its critical value (29.68). Therefore the null
hypothesis cannot be rejected. This suggests that the time-series
variables inflation1, inflation2 and percapita are not cointegrated.
• Similarly, for the max statistic, the value 10.4277 does not exceed the
critical value of 20.97, suggesting the same result: the null hypothesis
cannot be rejected. Thus, at maximum rank 0, inflation1, inflation2 and
percapita are not cointegrated.
• Following the above results, apply an unrestricted VAR to the time
series inflation1, inflation2 and percapita.
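The sequential decision rule behind vecrank can be sketched as a few lines of code; the statistics and 5% critical values below are the ones reported in the two worked examples of this deck:

```python
# Sequential trace-test decision rule: starting at rank 0, stop at the
# first rank whose trace statistic is below its 5% critical value.
def select_rank(trace_stats, critical_values):
    for rank, (stat, cv) in enumerate(zip(trace_stats, critical_values)):
        if stat < cv:
            return rank        # fail to reject: this is the cointegrating rank
    return len(trace_stats)    # all nulls rejected

# Example 1 (inflation1, inflation2, percapita): rank 0 -> no cointegration.
rank1 = select_rank([16.9904, 6.5628, 0.0326], [29.68, 15.41, 3.76])
print(rank1)   # 0

# Example 2 (lnpdi, lnpce, lngdp): the first two nulls are rejected.
rank2 = select_rank([52.6342, 17.9708, 2.6469], [29.68, 15.41, 3.76])
print(rank2)   # 2
```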
Worked example1 using STATA
 Therefore we proceed to estimate a VAR model which gives a short
run relationship among the variables by typing
• var inflation1 inflation2 percapita, lags(1/4)
Alternatively
• Go to statistics
• Multivariate time series
• Vector Autoregressive (VAR)
• Enter inflation1 inflation2 percapita in dependent window
• Supply number of lags
• OK
• Results will be given showing significance level for the three
equations as well as the coefficients for all the lags
• To start with the unrestricted VAR model in STATA, follow:
1.Click on ‘Statistics’
2.Select ‘Multivariate Time Series’
3.Select ‘VAR’
4.Enter inflation1 inflation2 percapita in dependent window
. var inflation1 inflation2 percapita, lags(1/4)

Vector autoregression

Sample: 1984 - 2017                     Number of obs   =        34
Log likelihood = -341.2159              AIC             =  22.36564
FPE            =   1172642              HQIC            =  22.96272
Det(Sigma_ml)  =  104599.5              SBIC            =  24.11647

Equation       Parms      RMSE     R-sq      chi2     P>chi2
inflation1        13    3.45405   0.9427  558.9825   0.0000
inflation2        13    4.76478   0.8994   304.095   0.0000
percapita         13    70.0311   0.9664  977.5961   0.0000

                   Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]

inflation1
  inflation1
         L1.   -.0005647    .235757    -0.00   0.998    -.4626399    .4615105
         L2.   -.9771132   .2449968    -3.99   0.000    -1.457298   -.4969282
         L3.    .4578849   .2776247     1.65   0.099    -.0862494    1.002019
         L4.    .2912956   .1617939     1.80   0.072    -.0258145    .6084057
  inflation2
         L1.    1.003069   .1697522     5.91   0.000     .6703611    1.335778
         L2.    .5787744   .2586183     2.24   0.025     .0718919    1.085657
         L3.    .1605202   .2415478     0.66   0.506    -.3129048    .6339452
         L4.   -.5490827   .1812371    -3.03   0.002     -.904301   -.1938645
  percapita
         L1.    .0020406   .0085515     0.24   0.811      -.01472    .0188012
         L2.    -.002563   .0114023    -0.22   0.822    -.0249111     .019785
         L3.     .007987    .011908     0.67   0.502    -.0153522    .0313262
         L4.   -.0075004   .0108729    -0.69   0.490    -.0288108      .01381
       _cons   -.2622933   1.949264    -0.13   0.893     -4.08278    3.558193

inflation2
  inflation1
         L1.    .5628474   .3252214     1.73   0.084    -.0745748     1.20027
         L2.   -1.042368   .3379676    -3.08   0.002    -1.704772   -.3799636
         L3.    .3663707   .3829769     0.96   0.339    -.3842503    1.116992
         L4.   -.1109212    .223191    -0.50   0.619    -.5483675     .326525
  inflation2
         L1.    .4645426   .2341694     1.98   0.047      .005579    .9235061
         L2.     .490062   .3567581     1.37   0.170     -.209171    1.189295
         L3.    .4284189   .3332097     1.29   0.199    -.2246602    1.081498
         L4.   -.2543364   .2500125    -1.02   0.309     -.744352    .2356791
  percapita
         L1.   -.0139952   .0117966    -1.19   0.235    -.0371161    .0091257
         L2.   -.0120753   .0157292    -0.77   0.443    -.0429039    .0187533
         L3.      .06579   .0164268     4.01   0.000     .0335941    .0979859
         L4.   -.0425098   .0149989    -2.83   0.005     -.071907   -.0131126
       _cons    1.858573   2.688965     0.69   0.489    -3.411701    7.128848

percapita
  inflation1
         L1.   -3.259989   4.779993    -0.68   0.495     -12.6286    6.108625
         L2.   -2.504575   4.967332    -0.50   0.614    -12.24037    7.231216
         L3.     -12.364   5.628864    -2.20   0.028    -23.39637   -1.331633
         L4.   -.9344438   3.280384    -0.28   0.776    -7.363878    5.494991
  inflation2
         L1.    .2488198   3.441741     0.07   0.942    -6.496869    6.994508
         L2.    4.990442   5.243508     0.95   0.341    -5.286644    15.26753
         L3.     7.54439   4.897402     1.54   0.123    -2.054342    17.14312
         L4.     3.35987   3.674599     0.91   0.361     -3.84221    10.56195
  percapita
         L1.      .92678   .1733822     5.35   0.000     .5869572    1.266603
         L2.     .160259   .2311818     0.69   0.488    -.2928491     .613367
         L3.    .0124965   .2414351     0.05   0.959    -.4607076    .4857007
         L4.   -.1467818    .220448    -0.67   0.506     -.578852    .2852883
       _cons    84.70345   39.52149     2.14   0.032     7.242761    162.1641

.
1. Lags 1, 2 and 4 of inflation2 are identified as having a significant
effect on inflation1.
• Lag 2 of inflation1 is identified as having a significant effect on
inflation2.
• Lags 3 and 4 of percapita are identified as having a significant effect
on inflation2.
• Lag 3 of inflation1 is identified as having a significant effect on
percapita.
2. The R-squared for the inflation1 equation is above 90%, verifying the
goodness of fit.
3. The log likelihood (-341.22) matches the value at the selected lag
length, further indicating consistency.
4. The constant in the percapita equation is also significant, with a
p-value of 0.032.
• Therefore, the overall result reveals the following outcome:
1. The Johansen cointegration test indicates that there is no cointegration
among the time series inflation1, inflation2 and percapita.
Diagnostic test for VAR model
Type the following to diagnose the VAR model:

• (i) varlmar - for autocorrelation

• (ii) varnorm, jbera - for normality

• (iii) varstable, graph - for stability


. varlmar, mlag(2)

Lagrange-multiplier test

lag chi2 df Prob > chi2

1 17.5338 9 0.04098
2 17.0882 9 0.04735

H0: no autocorrelation at lag order

The null hypothesis is that there is no autocorrelation at the given lag
order. At lags 1 and 2 the p-values are significant (below 0.05),
indicating the presence of autocorrelation, so we reject the null
hypothesis. Hence the VAR model shows evidence of autocorrelation in the
residuals.
. varnorm, jbera

Jarque-Bera test

Equation chi2 df Prob > chi2

inflation1 0.076 2 0.96266


inflation2 0.445 2 0.80070
percapita 0.702 2 0.70383
ALL 1.223 6 0.97573

The null hypothesis states that the residuals of the variables are
normally distributed. Since the p values for all variables are
insignificant, the null hypothesis cannot be rejected. Therefore the
residuals of these variables are normally distributed.
Stability of the model (varstable)
Eigenvalue stability condition

Eigenvalue Modulus

1.013855 1.01385
-.9307207 .930721
.8779007 + .1772527i .895616
.8779007 - .1772527i .895616
-.3818256 + .7763177i .865136
-.3818256 - .7763177i .865136
-.4661664 + .6289068i .782838
-.4661664 - .6289068i .782838
.1605716 + .7062344i .724258
.1605716 - .7062344i .724258
.4633316 + .1671969i .492576
.4633316 - .1671969i .492576

At least one eigenvalue is at least 1.0.


VAR does not satisfy stability condition.
varstable
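The stability check can be reproduced by hand: the modulus of a complex eigenvalue a + bi is sqrt(a² + b²), and the VAR is stable only if every modulus lies strictly inside the unit circle. A sketch using the first few eigenvalues from the table above:

```python
# Modulus of each eigenvalue; the VAR is stable only if all moduli are < 1.
eigenvalues = [complex(1.013855, 0), complex(-0.9307207, 0),
               complex(0.8779007, 0.1772527), complex(0.8779007, -0.1772527)]

moduli = [abs(e) for e in eigenvalues]        # abs() of a complex = modulus
stable = all(m < 1 for m in moduli)
print(round(moduli[2], 6), stable)            # 0.895616 False
```

The first eigenvalue has modulus 1.013855 ≥ 1, which is exactly why Stata reports that the VAR does not satisfy the stability condition.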
Granger Causality - Chicken or Egg?
• This causality test can also be used to explain which comes first: the
chicken or the egg. More specifically, the test can be used to test
whether the existence of eggs causes the existence of chickens, or vice
versa.
• Thurman and Fisher (1988) did this study using yearly data on chicken
and egg production in the US from 1930 to 1983.
• The results:
1. Egg causes the chicken.
2. There is no evidence that chicken causes egg.
Granger Causality - Remarks
• Granger causality is not the same as what we usually mean by causality.
• Even if x1 does not cause x2, it may still help to predict x2, and thus
Granger-causes x2, if changes in x1 precede those of x2 for some reason
(usually because of a third variable missing from the model).
• Example: a dragonfly flies much lower before a rain storm, due to the
lower air pressure. We know that dragonflies do not cause a rain storm,
but their behaviour does help to predict one, and thus Granger-causes a
rain storm.
• When x1 does not Granger-cause x2, we say that x2 is strongly exogenous
with respect to x1.
Granger causality
• Granger causality refers to a scenario where one time-series variable
can cause the occurrence of another time-series variable.
• Suppose y is one time-series variable and x is another.
• If we run a VAR model, then x would Granger-cause y if its lags can
influence the occurrence of y.
• We ascertain that x Granger-causes y by fitting y as a function of its
own lags without lags of x, then comparing this fitted model against a
model fitted with the x-lags included.
• When the model that includes the x-lags is significantly different from
the model with y-lags only, that provides evidence of Granger causality;
otherwise there is no Granger causality.
• However, an automatic test for Granger causality after running a VAR
model is obtained simply by typing
• vargranger
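vargranger reports Wald chi-square tests; the restricted-versus-unrestricted comparison described above is often written as an F-test on the residual sums of squares. A sketch with purely hypothetical RSS values (the numbers are illustrative, not taken from the output):

```python
def granger_f(rss_restricted, rss_unrestricted, q, n, k):
    """F-statistic comparing the model without x-lags (restricted) to the
    model with x-lags (unrestricted); q = number of excluded lags,
    n = observations, k = parameters in the unrestricted model."""
    return ((rss_restricted - rss_unrestricted) / q) / \
           (rss_unrestricted / (n - k))

# Hypothetical residual sums of squares, purely for illustration:
# dropping 4 x-lags raises the RSS from 80 to 120 with n = 34, k = 13.
f = granger_f(rss_restricted=120.0, rss_unrestricted=80.0, q=4, n=34, k=13)
print(round(f, 3))   # 2.625
```

A large F (relative to the F(q, n-k) critical value) means the x-lags add significant explanatory power, i.e. x Granger-causes y.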
Granger causality test
• For executing the Granger causality test in STATA, follow these steps:
1.Go to ‘Statistics’.
2.Click on ’Multivariate time series’.
3.Select ‘VAR diagnostics and tests’.
• Choose 'Granger causality tests', use the active var or svar results,
and click on 'OK'.
The null hypothesis for Granger causality test is:
• First equation: Lagged values of inflation2 and percapita do not cause
inflation1
• Second equation: Lagged values of inflation1 and percapita do not cause
inflation2.
• Third equation: Lagged values of inflation1 and inflation2 do not cause
percapita.
Granger causality
. vargranger

Granger causality Wald tests

Equation Excluded chi2 df Prob > chi2

inflation1 inflation2 61.074 4 0.000


inflation1 percapita .69904 4 0.951
inflation1 ALL 77.565 8 0.000

inflation2 inflation1 32.277 4 0.000


inflation2 percapita 16.743 4 0.002
inflation2 ALL 61.966 8 0.000

percapita inflation1 5.5376 4 0.236


percapita inflation2 3.1676 4 0.530
percapita ALL 13.881 8 0.085

.
Granger causality
• First row
The first row of the output above shows that lagged values of inflation2
cause inflation1, as the p-value is less than 0.05. However, because of
the p-value (0.951 > 0.05), lagged values of percapita do not cause
inflation1, so that null cannot be rejected. The direction of causality
is therefore from inflation2 to inflation1.
• Second row
The results in the second row show that lagged values of both inflation1
and percapita cause inflation2. Since the p-values for both variables are
less than 0.05, reject the null hypothesis 'lagged values of inflation1
and percapita do not cause inflation2' at the 5% level of significance.
The direction of causality is from both inflation1 and percapita to
inflation2.
Granger causality
• Third row
In the third row, lagged values of inflation1 and inflation2 do not cause
percapita (since the p-values > 0.05).
Therefore the pattern of Granger causality is as follows:
1. inflation1 and inflation2 show bidirectional Granger causality.
2. There is unidirectional causality from percapita to inflation2.
Impulse Response Function (IRF)
When fitting a VAR model, apart from being interested in Granger
causality, you could also be interested in how a disturbance to one
variable in the system influences the other variables.
The variable disturbed is known as the impulse variable, whereas the
influenced variables are known as the response variables.
The function that relates a response variable to an impulse variable is
called the impulse response function (IRF).
We can command STATA to produce a graph of such impulse response
functions and then interpret them.
To do that, type the following:
• varbasic inflation1 inflation2 percapita, lags(1/4) step(8) irf
• irf graph irf
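Mechanically, an IRF traces out powers of the coefficient matrix. A simplified sketch for a bivariate VAR(1) with a hypothetical coefficient matrix (orthogonalization of the error covariance, which varbasic performs, is ignored here):

```python
# IRFs for a bivariate VAR(1) y_t = A y_{t-1} + e_t: the response at
# horizon h to a unit shock in variable j is column j of A**h.
def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def irf(a, horizon):
    out, power = [], [[1.0, 0.0], [0.0, 1.0]]   # identity: impact response
    for _ in range(horizon + 1):
        out.append(power)
        power = matmul(a, power)
    return out

A = [[0.5, 0.1], [0.2, 0.4]]     # hypothetical coefficient matrix
responses = irf(A, 2)
print(responses[2][0][0])        # response of variable 1 to its own shock at h=2
```

Because this A has all eigenvalues inside the unit circle, the responses die out as the horizon grows, which is what a stable VAR's IRF plot shows.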
Impulse Response Function (IRF)
VAR & Impulse Response Function
[Figure: impulse-response graphs produced by varbasic.]
Interpretation of a one-SD shock to percapita
• Response of Inflation1
A one-SD shock (innovation) to percapita temporarily decreases Inflation1.
This response gradually returns until the third period, when it hits its
steady-state value. Beyond period three, Inflation1 rises above its
steady-state value and remains in the positive region.
• Response of Inflation2
A one-SD shock (innovation) to percapita initially increases Inflation2.
This positive response gradually declines until the third period, when it
hits its steady-state value. Beyond period three, Inflation2 falls below
its steady-state value.
Applications of VAR
• Analysis of the system's response to different shocks/impacts.
• Model-based forecasting. In general, a VAR encompasses the correlation
information in the observed data and uses this information to forecast
future movements or changes in the variables of interest.
• In economics, VAR is used to forecast macroeconomic variables such as
GDP, money supply, and unemployment.
• In finance, to predict spot and futures prices of securities, and
foreign exchange rates across markets.
• In accounting, to predict accounting variables such as sales, earnings,
and accruals.
• In marketing, VAR can be used to evaluate the impact of different
factors on consumer behavior and to forecast its future change.
Criticisms of the VAR
• Many argue that the VAR approach is lacking in theory.
• There is much debate on how the lag lengths should be determined.
• It is possible to end up with a model including numerous explanatory
variables, with different signs, which has implications for degrees of
freedom.
• Many of the parameters will be insignificant; this affects the
efficiency of the regression.
• There is always a potential for multicollinearity with many lags of the
same variable.
Stationarity and VARs
• Should a VAR include only stationary variables to be valid?
• Sims argues that even if the variables are not stationary, they should
not be first-differenced.
• However, others argue that a better approach is a multivariate test for
cointegration, then using first-differenced variables and the error
correction term.
• They would argue that "the purpose of VAR estimation is purely to
examine the relationships between the variables, and that differencing
will throw information on any long-run relationships between the series
away" (according to Chris Brooks (2014), in his book Introductory
Econometrics for Finance).
Conclusion
VARs have a number of important uses, particularly causality tests and
forecasting.
• To assess the effects of any shock to the system, we need to use
impulse response functions and variance decomposition.
• VECMs are an alternative, as they allow first-differenced variables and
an error correction term.
• The VAR has a number of weaknesses, most importantly its lack of
theoretical foundations.
VECM
Vector Error Correction Model (VECM)
• Cointegration indicates the presence of causality between two time
series, but it does not detect the direction of the causal relationship.
• According to Engle and Granger (1987), the presence of cointegration
among the variables implies unidirectional or bidirectional Granger
causality among those variables.
• Further, they demonstrate that cointegrated variables can be specified
by an Error Correction Mechanism (henceforth ECM) that can be estimated
by applying standard methods and diagnostic tests.
Consider the dataset (Vecm Var.dta): macroeconomic data for the United
States, 1970-I to 1991-IV, which contains the following variables:
• GDP (gross domestic product), billions of 1987 dollars
• PDI (personal disposable income), billions of 1987 dollars
• PCE (personal consumption expenditure), billions of 1987 dollars
• Profits (corporate profits after tax), billions of dollars
• Dividends (net corporate dividend payments), billions of dollars
1. pdi_t = α + a11·pdi_{t-1} + … + ak1·pdi_{t-k} + b11·pce_{t-1} + … + bk1·pce_{t-k} + c11·gdp_{t-1} + … + ck1·gdp_{t-k} + ε_{1t}
2. pce_t = β + a12·pdi_{t-1} + … + ak2·pdi_{t-k} + b12·pce_{t-1} + … + bk2·pce_{t-k} + c12·gdp_{t-1} + … + ck2·gdp_{t-k} + ε_{2t}
3. gdp_t = γ + a13·pdi_{t-1} + … + ak3·pdi_{t-k} + b13·pce_{t-1} + … + bk3·pce_{t-k} + c13·gdp_{t-1} + … + ck3·gdp_{t-k} + ε_{3t}
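The error-correction mechanism behind a VECM can be sketched numerically: each period, y moves by a fraction (the adjustment speed alpha) of last period's deviation from the long-run equilibrium y* = beta·x, so y converges back to equilibrium. All coefficients and values here are hypothetical, and short-run lag terms and noise are omitted:

```python
# Error-correction sketch: delta_y_t = alpha * (y_{t-1} - beta * x_{t-1}).
def ecm_path(y0, x, alpha, beta, periods):
    path = [y0]
    for _ in range(periods):
        error = path[-1] - beta * x            # last period's disequilibrium
        path.append(path[-1] + alpha * error)  # partial correction toward y*
    return path

# y starts 4 units above the equilibrium beta*x = 10 and halves the gap
# each period (alpha = -0.5).
path = ecm_path(y0=14.0, x=10.0, alpha=-0.5, beta=1.0, periods=3)
print(path)   # [14.0, 12.0, 11.0, 10.5]
```

A negative, significant alpha is exactly what the VECM estimates check for: it is the evidence that deviations from the cointegrating relationship are corrected over time.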
Setting the time variable
• Firstly set the time variable for the new time series data. Since the
data of 1970Q1-1991Q4 is in quarterly format, use the below
command:
tsset date

. tsset date
time variable: date, 1970q1 to 1991q4
delta: 1 quarter
Lag selection and cointegration test in VAR
Example two
Consider the variables lnpdi, lnpce and lngdp
• We had seen before that they are non-stationary
• All three are I(1): they become stationary when differenced once
• Let us find the maximum number of lags by typing
• varsoc lnpdi lnpce lngdp, maxlag(8)
• The results suggest that the maximum number of lags is 2
Worked example2 using STATA
. varsoc lnpdi lnpce lngdp, maxlag(8)

Selection-order criteria
Sample: 1972q1 - 1991q4 Number of obs = 80

lag LL LR df p FPE AIC HQIC SBIC

0 508.747 6.5e-10 -12.6437 -12.6079 -12.5544


1 849.659 681.82 9 0.000 1.6e-13 -20.9415 -20.7982 -20.5842*
2 864.426 29.533* 9 0.001 1.4e-13* -21.0856* -20.8349* -20.4604
3 872.256 15.661 9 0.074 1.4e-13 -21.0564 -20.6983 -20.1631
4 878.646 12.779 9 0.173 1.5e-13 -20.9911 -20.5256 -19.8299
5 882.521 7.7502 9 0.559 1.8e-13 -20.863 -20.29 -19.4338
6 888.64 12.239 9 0.200 1.9e-13 -20.791 -20.1105 -19.0938
7 893.773 10.266 9 0.329 2.2e-13 -20.6943 -19.9064 -18.7292
8 900.78 14.014 9 0.122 2.3e-13 -20.6445 -19.7492 -18.4113

Endogenous: lnpdi lnpce lngdp


Exogenous: _cons
Identify the number of lags
• The results table shows the number of lags in the first column, followed
by the selection criteria: Final Prediction Error (FPE), Akaike Information
Criterion (AIC), Hannan-Quinn Information Criterion (HQIC), and Schwarz
Bayesian Information Criterion (SBIC). STATA computes these four criteria
as well as a sequence of likelihood-ratio tests.
• To identify the number of lags, select the values marked with an
asterisk (*).
• The lag chosen by FPE is 2. Following the same rule, the lag chosen by
AIC is also 2, by HQIC it is 2, and by SBIC it is 1. To select the optimal
lag for the VAR, follow the majority.
• Therefore, the number of lags selected for the present case is 2.
Worked example2 using STATA
To test for a cointegrating relationship, type the following:
• vecrank lnpdi lnpce lngdp, trend(constant) max
• Alternatively
• Go to statistics
• Multivariate time series
• Cointegrating rank of a vector error-correction model
• Put all the variables in the 'Dependent variables' option
• Put the number of lags as identified before
• Click reporting and tick "Report maximum-eigenvalue statistic"
• Then click OK
The results, as indicated in the next slides, show that there is
cointegration among the variables.
• In the 'Dependent variables' option, select the three time-series
variables lnpdi, lnpce and lngdp.
• Since co-integration analysis considers the non-stationary variables
themselves when checking for causality, take lnpdi, lnpce and lngdp
rather than their first differences. Then select the number of lags.
The lag-selection analysis was conducted previously, so the number of
lags here is 2.
• After selecting the lag, click on the 'Reporting' tab of the vecrank
window and tick 'Report maximum-eigenvalue statistic'
(figure below). Click on 'OK'.
. vecrank lnpdi lnpce lngdp, trend(constant) max

Johansen tests for cointegration


Trend: constant Number of obs = 86
Sample: 1970q3 - 1991q4 Lags = 2

5%
maximum trace critical
rank parms LL eigenvalue statistic value
0 12 903.3552 . 52.6342 29.68
1 17 920.68692 0.33173 17.9708 15.41
2 20 928.34886 0.16321 2.6469* 3.76
3 21 929.67232 0.03031

5%
maximum max critical
rank parms LL eigenvalue statistic value
0 12 903.3552 . 34.6635 20.97
1 17 920.68692 0.33173 15.3239 14.07
2 20 928.34886 0.16321 2.6469 3.76
3 21 929.67232 0.03031
• The result of the Johansen cointegration test can be interpreted in
parts. Focus on three columns: maximum rank, the trace (or max)
statistic, and the critical values.
Starting from maximum rank zero, the null and alternative
hypotheses are as follows:
• Null Hypothesis: There is no cointegration
• Alternative Hypothesis: There is cointegration
• As the output above shows, at maximum rank zero the trace statistic
(52.6342) exceeds its critical value (29.68), so the null hypothesis is
rejected: the time series lnpdi, lnpce and lngdp are cointegrated. At
rank 1 the trace statistic (17.9708) again exceeds its critical value
(15.41), while at rank 2 it does not (2.6469 < 3.76), so the trace test
selects a cointegrating rank of 2 (marked with an asterisk).
• Similarly, for the max statistic, the value 34.6635 at rank zero exceeds
the critical value of 20.97, again rejecting the null hypothesis of no
cointegration. Thus lnpdi, lnpce and lngdp are cointegrated.
• Following these results, apply VECM to the time series lnpdi, lnpce
and lngdp; we do not apply an unrestricted VAR to cointegrated
multivariate time series.
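The sequential logic of the Johansen trace test can be sketched as follows; an illustrative Python snippet (not Stata), using the trace statistics and 5% critical values from the vecrank output above:

```python
# Sequential Johansen rank decision: starting at rank 0, reject
# H0 ("at most r cointegrating equations") whenever the trace
# statistic exceeds its 5% critical value; the first rank at which
# H0 is not rejected is the selected cointegrating rank.
trace_stats = [52.6342, 17.9708, 2.6469]   # ranks 0, 1, 2
critical_5pct = [29.68, 15.41, 3.76]

rank = 0
for stat, crit in zip(trace_stats, critical_5pct):
    if stat > crit:
        rank += 1   # reject H0: there are more cointegrating equations
    else:
        break       # fail to reject: stop here
print(f"Selected cointegrating rank: {rank}")
# Selected cointegrating rank: 2
```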
Worked example2 using STATA
 Therefore we proceed to estimate a VECM, which gives the long-run
and short-run relationships among the variables, by typing
• vec lnpdi lnpce lngdp, trend(constant)
• Alternatively
• Go to Statistics
• Multivariate time series
• Vector error-correction model (VECM)
• Enter lnpdi lnpce lngdp in the 'Dependent variables' window
• Supply the number of lags (2, as identified before)
• OK
• Results will be given showing the significance level of each equation
as well as the coefficients of all the lags
. vec lnpdi lnpce lngdp, trend(constant)

Vector error-correction model

Sample: 1970q3 - 1991q4                    Number of obs =        86
                                           AIC           = -21.01597
Log likelihood = 920.6869                  HQIC          = -20.82072
Det(Sigma_ml)  = 1.01e-13                  SBIC          = -20.53081

Equation    Parms    RMSE      R-sq     chi2       P>chi2
D_lnpdi     5        .009769   0.4158   57.65747   0.0000
D_lnpce     5        .006467   0.6181   131.1235   0.0000
D_lngdp     5        .007506   0.6066   124.9157   0.0000

                 Coef.      Std. Err.   z       P>|z|   [95% Conf. Interval]
D_lnpdi
  _ce1  L1.     -.0670597   .035392    -1.89    0.058   -.1364268   .0023074
  lnpdi LD.     -.1443889   .119515    -1.21    0.227   -.3786339   .0898561
  lnpce LD.      .440211    .2477072    1.78    0.076   -.0452862   .9257081
  lngdp LD.     -.0997838   .1606322   -0.62    0.534   -.414617    .2150495
  _cons          .0021371   .0017396    1.23    0.219   -.0012725   .0055467
D_lnpce
  _ce1  L1.     -.1210753   .0234306   -5.17    0.000   -.1669985  -.0751521
  lnpdi LD.      .1441779   .0791226    1.82    0.068   -.0108996   .2992555
  lnpce LD.     -.2819061   .1639899   -1.72    0.086   -.6033204   .0395082
  lngdp LD.      .101286    .1063435    0.95    0.341   -.1071434   .3097155
  _cons          .0019773   .0011517    1.72    0.086   -.00028     .0042345
D_lngdp
  _ce1  L1.     -.1563081   .0271924   -5.75    0.000   -.2096041  -.103012
  lnpdi LD.      .2012824   .0918256    2.19    0.028    .0213076   .3812573
  lnpce LD.     -.0028113   .1903181   -0.01    0.988   -.3758279   .3702053
  lngdp LD.      .0769994   .1234167    0.62    0.533   -.164893    .3188917
  _cons         -.0024484   .0013366   -1.83    0.067   -.0050681   .0001712
Interpretation
• In the D_lnpdi equation, the lagged differences of lnpce and lngdp do
not have a significant effect on lnpdi.
• In the D_lnpce equation, lag 1 of lnpdi and of lngdp does not have a
significant effect on lnpce.
• In the D_lngdp equation, only the lag of lnpdi has a significant effect
on lngdp.
• The constants in the lnpdi, lnpce and lngdp equations are not
significant, since their p-values > 0.05.
Cointegrating equations

Equation Parms chi2 P>chi2

_ce1 2 1223.217 0.0000

Identification: beta is exactly identified

Johansen normalization restriction imposed

beta Coef. Std. Err. z P>|z| [95% Conf. Interval]

_ce1
lnpdi 1 . . . . .
lnpce -3.631147 .4496589 -8.08 0.000 -4.512462 -2.749832
lngdp 3.074722 .5080554 6.05 0.000 2.078951 4.070492
_cons -4.919437 . . . . .
• lnpdi is positioned as the dependent (normalized) variable.
• Note: the signs of the coefficients are reversed in the long run.
• Interpretation: in the long run, lnpce has a positive impact while
lngdp has a negative impact on lnpdi. The coefficients are statistically
significant at the 1% level.
• Conclusion: lnpce and lngdp have asymmetric (opposite-signed)
effects on lnpdi in the long run, on average, ceteris paribus.
𝐸𝐶𝑇𝑡−1 = 1.000𝑙𝑛𝑝𝑑𝑖𝑡−1 − 3.631 𝑙𝑛𝑝𝑐𝑒𝑡−1 + 3.075𝑙𝑛𝑔𝑑𝑝𝑡−1 − 4.919
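Why the long-run signs are the reverse of those in beta can be seen by rearranging the cointegrating relation; a small Python sketch using the reported beta coefficients (the evaluation point lnpce = 8.1, lngdp = 8.5 is purely illustrative):

```python
# In equilibrium the cointegrating relation equals zero:
#   lnpdi - 3.631*lnpce + 3.075*lngdp - 4.919 = 0
# Solving for lnpdi flips the signs of the other coefficients:
#   lnpdi = 4.919 + 3.631*lnpce - 3.075*lngdp
beta = {"lnpdi": 1.0, "lnpce": -3.631147, "lngdp": 3.074722, "_cons": -4.919437}

def lnpdi_longrun(lnpce, lngdp):
    """Equilibrium lnpdi implied by the normalized cointegrating vector."""
    return -(beta["lnpce"] * lnpce + beta["lngdp"] * lngdp + beta["_cons"])

print(round(lnpdi_longrun(8.1, 8.5), 4))
# 8.1966
```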
Lnpdi as a target variable
Δ𝑙𝑛𝑝𝑑𝑖𝑡
= 0.002 − 0.144 Δ𝑙𝑛𝑝𝑑𝑖𝑡−1 + 0.440 Δ𝑙𝑛𝑝𝑐𝑒𝑡−1 − 0.100 Δ𝑙𝑛𝑔𝑑𝑝𝑡−1
− 0.067𝐸𝐶𝑇𝑡−1
• Interpretation of the ECT coefficient
The adjustment coefficient (−0.067) is statistically significant at the 10%
level, suggesting that the previous quarter's deviations from long-run
equilibrium are corrected within the current quarter at a convergence
speed of 6.7%.
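A derived quantity that may help fix ideas (it is not reported by Stata): the half-life of a deviation from equilibrium implied by the adjustment coefficient.

```python
import math

# With ECT coefficient alpha = -0.067, each quarter 6.7% of the
# remaining disequilibrium is corrected, so a deviation decays as
# (1 + alpha)**t. The half-life solves (1 + alpha)**t = 0.5.
alpha = -0.067
half_life = math.log(0.5) / math.log(1 + alpha)
print(f"Half-life: {half_life:.1f} quarters")
# Half-life: 10.0 quarters
```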
. test ([D_lnpdi]: LD.lnpce)

( 1) [D_lnpdi]LD.lnpce = 0

chi2( 1) = 3.16
Prob > chi2 = 0.0755

This means that pce does not Granger-cause pdi in the short run, because
the probability of chi-square (0.0755) is higher than 0.05.
. test ([D_lnpdi]: LD.lngdp)

( 1) [D_lnpdi]LD.lngdp = 0

chi2( 1) = 0.39
Prob > chi2 = 0.5345

This means that gdp does not Granger-cause pdi, because the probability of
chi-square is higher than 0.05.
Therefore neither regressor (pce nor gdp) has a short-run causal effect on pdi.
. test ([D_lnpce]: LD.lnpdi)

( 1) [D_lnpce]LD.lnpdi = 0

chi2( 1) = 3.32
Prob > chi2 = 0.0684

This means that pdi does not Granger-cause pce, because the probability
of chi-square is higher than 0.05.
. test ([D_lnpce]: LD.lngdp)

( 1) [D_lnpce]LD.lngdp = 0

chi2( 1) = 0.91
Prob > chi2 = 0.3409

This means that gdp does not Granger-cause pce, because the probability
of chi-square is higher than 0.05.
Therefore neither regressor (pdi nor gdp) has a short-run causal effect
on pce.
. test ([D_lngdp]: LD.lnpdi)

( 1) [D_lngdp]LD.lnpdi = 0

chi2( 1) = 4.80
Prob > chi2 = 0.0284
This means that pdi has a short-run causal effect on gdp, because the
probability of chi-square is less than 0.05.
. test ([D_lngdp]: LD.lnpce)

( 1) [D_lngdp]LD.lnpce = 0

chi2( 1) = 0.00
Prob > chi2 = 0.9882
This means that pce does not Granger-cause gdp, because the probability
of chi-square is higher than 0.05.
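The six Wald tests above can be summarized in one pass; an illustrative Python snippet with the chi-square p-values copied from the test output (pairs read as (cause, effect)):

```python
# Reject "no short-run Granger causality" when p < 0.05.
pvalues = {
    ("lnpce", "lnpdi"): 0.0755,
    ("lngdp", "lnpdi"): 0.5345,
    ("lnpdi", "lnpce"): 0.0684,
    ("lngdp", "lnpce"): 0.3409,
    ("lnpdi", "lngdp"): 0.0284,
    ("lnpce", "lngdp"): 0.9882,
}

causal = [(src, dst) for (src, dst), p in pvalues.items() if p < 0.05]
print(causal)
# [('lnpdi', 'lngdp')] -- only pdi Granger-causes gdp in the short run
```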
Diagnostic tests for the VECM model
Type the following to diagnose the VECM model:

• (i) veclmar – for autocorrelation

• (ii) vecnorm, jbera – normality

• (iii) vecstable, graph – stability


. veclmar

Lagrange-multiplier test

lag chi2 df Prob > chi2

1 10.7541 9 0.29294
2 14.1067 9 0.11858

H0: no autocorrelation at lag order

The null hypothesis states that no autocorrelation is present at the
given lag order. At lags 1 and 2 the p-values are insignificant,
indicating no autocorrelation; therefore we fail to reject the null
hypothesis. Hence the VECM model is free of the problem of
autocorrelation.
. vecnorm, jbera

Jarque-Bera test

Equation chi2 df Prob > chi2

D_lnpdi 3.100 2 0.21227


D_lnpce 7.927 2 0.01900
D_lngdp 2.547 2 0.27989
ALL 13.574 6 0.03478

The null hypothesis of the Jarque-Bera test states that the residuals
are normally distributed. Since the p-values for D_lnpdi and D_lngdp
are insignificant, the null hypothesis cannot be rejected for these
equations: their residuals are normally distributed. Only D_lnpce
(p = 0.019) carries the problem of non-normality.
Stability of the model (vecstable)
Eigenvalue stability condition

Eigenvalue Modulus

1 1
1 1
.7185436 .718544
-.3805513 .380551
.2295313 .229531
-.02484063 .024841

The VECM specification imposes 2 unit moduli.

The remaining eigenvalues all have modulus well below one, so the
stability condition is satisfied.

vecstable, graph
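The stability condition can be checked numerically; an illustrative Python snippet using the moduli from the vecstable output (with K = 3 variables and r = 1 cointegrating equation, the VECM imposes K − r = 2 unit moduli):

```python
# A VECM is stable if, apart from the imposed unit moduli, all
# eigenvalue moduli lie strictly inside the unit circle.
moduli = [1.0, 1.0, 0.718544, 0.380551, 0.229531, 0.024841]

unit = [m for m in moduli if abs(m - 1.0) < 1e-8]
rest = [m for m in moduli if abs(m - 1.0) >= 1e-8]

assert len(unit) == 2          # the 2 unit moduli imposed by the VECM
stable = all(m < 1.0 for m in rest)
print("Stable:", stable)
# Stable: True
```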
Worked example2 using STATA
The command will display both long-run and short-run model results.
First check for a long-run relationship:
To show a long-run relationship, the ce1 (and/or ce2) coefficient must show a
significant p-value and a negative sign.
In the given results there is evidence of long-run causality from exports
and gdp to imports, because the coefficient of ce1 is negative and
significant.
However, there is no evidence of a long-run relationship in any other
equation.
On the short-run relationship:
1. There is evidence that imports are affected
• Positively by lags one, two and three of …. itself
• Negatively by lags one, two and three of ….
• Lag two of … influences …… negatively
Forecasting using VECM in STATA
• Further, to forecast the values of lnpdi, lnpce and
lngdp using VECM results, follow these steps as shown in the figure below:
• Click on ‘Statistics’ on the main bar.
• Select ‘Multivariate Time Series’.
• Select ‘VEC/VAR Forecast’.
• Click on ‘Compute Forecasts’
The ‘fcast’ window will appear. Choose a prefix (in this case, “ieg”).
Then select the forecast horizon (the number of periods ahead); here
‘5’ was entered (fcast compute ieg, step(5)).
The window does not reveal the results of the forecast. Rather, they
appear in the Data Editor window as newly created variables. The tables
below show the forecasts for this case.
date vecmlnpdi vecmlnpdi_SE vecmlnpdi_LB vecmlnpdi_UB
1992q1 8.1774574 0.00976897 8.1583105 8.1966042
1992q2 8.1837663 0.01337976 8.1575424 8.2099901
1992q3 8.1895301 0.01591662 8.1583341 8.2207261
1992q4 8.1957359 0.01801411 8.1604289 8.2310429
1993q1 8.2019 0.01986493 8.1629655 8.2408346
1993q2 8.2081615 0.0215851 8.1658554 8.2504675
1993q3 8.2144422 0.02320661 8.1689581 8.2599263
1993q4 8.2207555 0.02475154 8.1722434 8.2692677
date vecmlnpce vecmlnpce_SE vecmlnpce_LB vecmlnpce_UB
1992q1 8.0984521 0.00646737 8.0857763 8.1111279
1992q2 8.1039697 0.00928079 8.0857797 8.1221598
1992q3 8.1102541 0.01218008 8.0863816 8.1341266
1992q4 8.1165465 0.01495945 8.0872265 8.1458665
1993q1 8.1229987 0.01777797 8.0881545 8.1578429
1993q2 8.129495 0.02053414 8.0892488 8.1697412
1993q3 8.136047 0.02320737 8.0905614 8.1815326
1993q4 8.1426293 0.02577049 8.0921201 8.1931386
date vecmlngdp vecmlngdp_SE vecmlngdp_LB vecmlngdp_UB
1992q1 8.4925923 0.00750569 8.4778814 8.5073031
1992q2 8.4966629 0.01053492 8.4760149 8.517311
1992q3 8.5016442 0.01249011 8.4771641 8.5261244
1992q4 8.5068557 0.0143793 8.4786728 8.5350386
1993q1 8.5122706 0.0165743 8.4797856 8.5447556
1993q2 8.5177885 0.01900599 8.4805375 8.5550396
1993q3 8.5233903 0.02156182 8.4811299 8.5656507
1993q4 8.5290471 0.02413898 8.4817356 8.5763587
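The lower and upper bounds in the tables are 95% confidence limits around the point forecast (point ± z·SE, with z ≈ 1.96); a quick Python check using the first lnpdi forecast row (1992q1):

```python
# Verify vecmlnpdi_LB / vecmlnpdi_UB for 1992q1 from the point
# forecast and its standard error.
point, se = 8.1774574, 0.00976897
z = 1.959964          # 97.5th percentile of the standard normal
lb, ub = point - z * se, point + z * se
print(f"{lb:.5f} {ub:.5f}")
# 8.15831 8.19660 (matches the table: 8.1583105, 8.1966042)
```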
the end
