
BE372 - Coursework Assignment 24/25.

Question 1
The Excel spreadsheet "AssignmentQ1data.xls" contains data for the returns of FTSE 100
and its 5-minute realized variance from 04-01-2000 to 29-12-2017. This question draws
heavily from material across different weeks but the code that you need to use is very similar
to the ones used in our computer classes. You need to do the following tasks:

a) Import all variables above into the software of your choice and provide a summary
of the FTSE 100 returns using descriptive statistics and graphical analysis. Your
summary should contain information on any visible patterns (if any) in the data and
your next steps.
SOLUTION: Descriptive statistics

RETURN
Mean -6.26E-05
Median 0.000266
Maximum 0.093872
Minimum -0.088661
Std. Dev. 0.011567
Skewness -0.197665
Kurtosis 9.357238

Jarque-Bera 7669.565
Probability 0

Sum -0.284043
Sum Sq. Dev. 0.606911

Observations 4537

Central Tendency:
1. Mean (-6.26E-05):
The average return is close to zero (-0.0000626), suggesting that returns hover around
zero with a slight negative drift over the sample.
2. Median (0.000266):
The median return is slightly positive, indicating that at least half of the observed returns
are greater than or equal to 0.000266.

Dispersion:
3. Maximum (0.093872) & Minimum (-0.088661):
The returns range from -0.088661 to 0.093872, showing a wide variation in returns.
4. Standard Deviation (0.011567):
This indicates the returns are dispersed around the mean by approximately 0.011567. This
suggests moderate volatility in the returns.

Shape of the Distribution:


5. Skewness (-0.197665):
The returns are slightly negatively skewed, meaning the distribution has a tail slightly
longer on the left side. This implies there are marginally more extreme negative returns
than extreme positive ones.
6. Kurtosis (9.357238):
A kurtosis greater than 3 indicates the returns have heavier tails and a sharper peak
compared to a normal distribution. This suggests the returns are leptokurtic, with a higher
likelihood of extreme values.

Normality Test:
7. Jarque-Bera Statistic (7669.565) & Probability (0):
The Jarque-Bera statistic is significantly large, and the probability value is 0, indicating
the null hypothesis of normality is rejected. The returns are not normally distributed.

Additional Metrics:
8. Sum (-0.284043):
Over the 4537 observations, the cumulative return is slightly negative (-0.284043),
reflecting the slight negativity in the average return.
9. Sum of Squared Deviations (0.606911):
This reflects the total squared deviations of returns from their mean, emphasizing the
magnitude of fluctuations in the data.
10. Number of Observations (4537):
A robust sample size provides confidence in the reliability of these descriptive statistics.

Overall Interpretation:
 The data suggests a slightly negative average return, with moderate volatility.
 The distribution is not normal, showing negative skewness and leptokurtosis, which
implies higher probabilities of extreme negative and positive returns.
 Risk managers and analysts should consider the non-normal nature and potential for
extreme outcomes when analyzing this dataset.

GRAPHICAL REPRESENTATION:

Visible Patterns in the Data:


From the time series plot of returns:
1. Volatility Clustering: There are visible clusters of high volatility followed by periods of
relatively low volatility. For example, between observations around 2000–2500, there is a
noticeable spike in return magnitude.
2. Stationarity: The data appears to fluctuate around zero without a clear trend, suggesting
stationarity. However, this needs to be statistically tested.
3. Spikes in Returns: There are several sharp upward and downward movements in returns,
likely indicating rare but extreme events or market shocks.
4. Mean-Reverting Behavior: The returns appear to revert to zero over time after
deviations, consistent with stationary financial return series.
Next Steps for Analysis:
1. Stationarity Testing:
o Conduct the Augmented Dickey-Fuller (ADF) test or Kwiatkowski-Phillips-
Schmidt-Shin (KPSS) test to confirm whether the returns are stationary.
2. Volatility Modeling:
o Analyze volatility clustering further by fitting a GARCH (Generalized
Autoregressive Conditional Heteroskedasticity) model.
3. Extreme Events Analysis:
o Identify and study the events causing the extreme returns, such as spikes in
observations around 2000-2500.
4. Autocorrelation Analysis:
o Compute the Autocorrelation Function (ACF) and Partial Autocorrelation
Function (PACF) to check for serial correlation in returns, which could suggest
potential predictability or model selection.
5. Normality Testing:
o Use further tools such as histograms, Q-Q plots, and statistical tests to confirm the
non-normality observed in the descriptive statistics.
6. Outlier Detection:
o Quantify and analyze the extreme spikes in the return data to see if they follow
expected distributions (e.g., power-law tails).
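
A minimal sketch of how these summary statistics and the plot could be reproduced in Python with pandas and SciPy (the original analysis was run in EViews; the file name follows the assignment text and the column name RETURN follows the EViews output, so both are assumptions to adjust to the actual sheet layout):

    import pandas as pd
    import matplotlib.pyplot as plt
    from scipy import stats

    df = pd.read_excel("AssignmentQ1data.xls")   # assignment data file
    ret = df["RETURN"].dropna()                  # assumed column name

    print(ret.describe())                                  # mean, std, min, max
    print("skewness:", stats.skew(ret))
    print("kurtosis:", stats.kurtosis(ret, fisher=False))  # raw kurtosis; normal = 3
    jb_stat, jb_p = stats.jarque_bera(ret)
    print(f"Jarque-Bera: {jb_stat:.2f} (p = {jb_p:.4f})")

    ret.plot(title="FTSE 100 returns")           # inspect for volatility clustering
    plt.show()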
b) Explain how one should check for stationarity. Then check for stationarity and, if
needed, make your variables stationary. Explain the steps you would take to achieve
this.
SOLUTION:

Null Hypothesis: RETURN has a unit root
Exogenous: Constant
Lag Length: 2 (Automatic - based on SIC, maxlag=31)

                                          t-Statistic    Prob.*
Augmented Dickey-Fuller test statistic     -43.58485     0.0000
Test critical values:    1% level           -3.431609
                         5% level           -2.861981
                         10% level          -2.567048

*MacKinnon (1996) one-sided p-values.

Augmented Dickey-Fuller Test Equation
Dependent Variable: D(RETURN); Method: Least Squares
Sample (adjusted): 4 4537; Included observations: 4534 after adjustments

Variable        Coefficient   Std. Error   t-Statistic   Prob.
RETURN(-1)       -1.171142     0.026870    -43.58485     0.0000
D(RETURN(-1))     0.123581     0.021377      5.780948    0.0000
D(RETURN(-2))     0.060631     0.014804      4.095579    0.0000
C                -5.67E-05     0.000171     -0.331778    0.7401

R-squared 0.524293; Adjusted R-squared 0.523978; S.E. of regression 0.011503;
Sum squared resid 0.599374; Log likelihood 13813.63; F-statistic 1664.223 (Prob. 0.0000);
Mean dependent var 4.87E-06; S.D. dependent var 0.016672;
Akaike -6.091588; Schwarz -6.085925; Hannan-Quinn -6.089593; Durbin-Watson 1.995949.
Steps to Check for Stationarity and Making Variables Stationary
1. Checking for Stationarity
Stationarity of a time series means its statistical properties (mean, variance, and
autocorrelation) remain constant over time. To check for stationarity, follow these steps:

a) Visual Inspection
 Plot the data: Examine the raw time series plot (like the one above). Non-
stationary data often shows trends, seasonality, or changing variance.
 In this data: The mean appears constant, but volatility clustering may indicate
conditional heteroskedasticity.

b) Statistical Tests
 Use statistical tests to confirm stationarity, such as:
o Augmented Dickey-Fuller (ADF) Test:
 Null Hypothesis: The series has a unit root (non-stationary).
 Alternative Hypothesis: The series is stationary.
 Result from the ADF test above:
The t-statistic (-43.58485) is far below the critical values at the 1%, 5%, and
10% levels, and the p-value is 0.0000, strongly rejecting the null hypothesis.
Conclusion: The series is stationary, and no further transformation is
necessary for this dataset.
o KPSS (Kwiatkowski-Phillips-Schmidt-Shin) Test (optional complement to
ADF):
 Null Hypothesis: The series is stationary.
 High test statistic or low p-value indicates non-stationarity.

c) Autocorrelation Analysis
 Plot the Autocorrelation Function (ACF) and Partial Autocorrelation Function (PACF):
o A slow decay in the ACF suggests non-stationarity.
o The return series shows no such slow decay, consistent with the ADF result.

2. Making Non-Stationary Data Stationary (if Needed)


If the ADF test failed to reject the null hypothesis (i.e., the data is non-stationary), take these
steps:

a) Differencing
 Apply first-order differencing: compute the change between consecutive observations,
ΔXt = Xt − Xt−1. This removes trends in the data. Apply further differencing (e.g.,
second-order) if required.

b) Log Transformation
 Apply a logarithmic transformation to stabilize variance in data with an exponential
trend: Yt=ln(Xt)

c) Detrending
 Use linear regression to model and remove deterministic trends. Subtract the fitted trend
line from the original series.

d) Testing Stationarity Again


 After transformations (e.g., differencing, detrending), repeat the ADF or KPSS test to
confirm stationarity.
Conclusion:
Based on the ADF results above, the return series is already stationary and no further
transformation is needed. If other series exhibit non-stationarity, the methods above can
be used to achieve stationarity before proceeding with modeling or analysis.
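
A minimal sketch of the ADF check (and the fallback differencing step) with statsmodels, assuming ret is the return series loaded in part (a); EViews' SIC criterion corresponds to BIC here:

    from statsmodels.tsa.stattools import adfuller

    stat, pval, usedlag, nobs, crit, _ = adfuller(ret, autolag="BIC")
    print(f"ADF statistic: {stat:.4f}, p-value: {pval:.4f}, lags used: {usedlag}")
    print("critical values:", crit)

    if pval > 0.05:                      # cannot reject a unit root: difference and retest
        d_ret = ret.diff().dropna()
        print(adfuller(d_ret, autolag="BIC")[:2])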

c) Explain the differences between recursive and rolling window pseudo-out-of-sample
forecasting experiments. Then proceed with an out-of-sample (recursive) forecasting
exercise for the returns of FTSE 100, for forecasting horizons h = {1, 7, 14, 28} for the
last 100 observations using the following models:

i) AR(1), AR(2), MA(1), MA(2), ARMA(1,1), ARMA(2,2) and an ARMA(p, q) of your choice

SOLUTION:
In a recursive (expanding window) scheme, the model is re-estimated before each forecast on a
sample that grows by one observation at a time, so every forecast uses all the data available up
to that point. In a rolling window scheme, the estimation sample has a fixed length and slides
forward: each time a new observation enters, the oldest one is dropped. The recursive scheme
uses more information and is more efficient when the model parameters are stable; the rolling
scheme discards old data and adapts more quickly when parameters drift over time. A sketch of
the recursive loop is given below.
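
A hedged sketch of the recursive loop for one model/horizon pair using statsmodels (the full exercise loops over all seven models and h in {1, 7, 14, 28}; the ARMA order shown is illustrative):

    import numpy as np
    from statsmodels.tsa.arima.model import ARIMA

    h, n_oos = 1, 100                         # horizon and out-of-sample size
    y = ret.to_numpy()
    forecasts, actuals = [], []

    for i in range(n_oos):
        end = len(y) - n_oos + i              # expanding (recursive) estimation window
        fit = ARIMA(y[:end], order=(1, 0, 1)).fit()   # ARMA(1,1) on data up to 'end'
        fc = fit.forecast(steps=h)[-1]        # h-step-ahead point forecast
        if end + h - 1 < len(y):
            forecasts.append(fc)
            actuals.append(y[end + h - 1])

    err = np.array(actuals) - np.array(forecasts)
    print("MAE:", np.mean(np.abs(err)), "RMSFE:", np.sqrt(np.mean(err ** 2)))

For the rolling-window variant, y[:end] would be replaced with y[end - window:end] for a fixed window length.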

Dependent Variable: RETURN
Method: ARMA Maximum Likelihood (OPG - BHHH)
Sample: 1 4537; Included observations: 4537
Convergence achieved after 58 iterations
Coefficient covariance computed using outer product of gradients

Variable    Coefficient   Std. Error   t-Statistic   Prob.
C            -6.09E-05     0.000157     -0.38797     0.6981
AR(1)         0.628215     0.279898      2.24444     0.0249
AR(2)        -0.10658      0.208038     -0.51229     0.6085
MA(1)        -0.67411      0.281269     -2.39666     0.0166
MA(2)         0.070914     0.224333      0.31611     0.7519
SIGMASQ       0.000133     1.43E-06     92.99224     0.0000

R-squared 0.007716; Adjusted R-squared 0.006621; S.E. of regression 0.011529;
Sum squared resid 0.602228; Log likelihood 13813.49; F-statistic 7.046607 (Prob. 0.000001);
Mean dependent var -6.26E-05; S.D. dependent var 0.011567;
Akaike -6.08662; Schwarz -6.07813; Hannan-Quinn -6.08363; Durbin-Watson 1.997478.

Inverted AR Roots:  .31+.09i   .31-.09i
Inverted MA Roots:  0.54   0.13

FINDINGS:
1. AR (Autoregressive) Terms:
o AR(1) (Coefficient: 0.628): This term is statistically significant with a t-statistic
of 2.24 and a p-value of 0.0249 (< 0.05). It indicates that the first lag of the return
series has a positive and significant effect on current returns.
o AR(2) (Coefficient: -0.107): The second lag is not significant, with a p-value of
0.6085. It has a small and negative effect, suggesting no strong dependency on the
second lag.
2. MA (Moving Average) Terms:
o MA(1) (Coefficient: -0.674): This term is significant, with a t-statistic of -2.40
and a p-value of 0.0166 (< 0.05). It suggests that the first lag of the error term
influences the current returns negatively.
o MA(2) (Coefficient: 0.071): This term is not significant, with a p-value of
0.7519. It implies little contribution to explaining variations in returns.
3. Constant (C):
o The constant term is not significant (p = 0.6981), meaning there is no substantial
mean shift in the series.
4. Variance of Errors (SIGMASQ):
o The variance of the residuals is very small (1.33×10^-4) and highly significant.
This supports the model's good fit to the observed volatility.

Model Diagnostics
1. Goodness-of-Fit:
o R-squared = 0.0077: Indicates the model explains a small fraction (0.77%) of the
variation in returns.
o Adjusted R-squared = 0.0066: Corrected for model complexity, confirming
limited explanatory power.
2. Residual Analysis:
o Durbin-Watson Statistic (1.997): Very close to 2, indicating no significant
autocorrelation in residuals.
o Sum of Squared Residuals (0.602): Shows a small magnitude of errors in the
model fit.
3. Information Criteria:
o Akaike Information Criterion (AIC = -6.087), Schwarz Criterion (SC = -
6.078), and Hannan-Quinn Criterion (HQ = -6.084): These metrics are used to
compare model performance. Lower values indicate better model fit.
4. F-statistic:
o F-statistic = 7.05 (p-value < 0.0001): The overall model is statistically
significant, suggesting that the combination of AR and MA terms explains some
variation in returns.

Inverted Roots Analysis


1. Inverted AR Roots: 0.31 ± 0.09i, indicating that the AR process is stationary (the roots lie
within the unit circle).
2. Inverted MA Roots: 0.54 and 0.13, confirming invertibility (roots within the unit circle).

Insights and Recommendations


 AR(1) and MA(1) terms are the most significant contributors, suggesting a simple
ARMA(1,1) model might suffice.
 Non-significant AR(2) and MA(2) terms could potentially be excluded in simpler model
specifications.
 The low R-squared indicates that the model captures only a small portion of the
variability in returns, implying the need for additional explanatory variables (e.g.,
exogenous predictors) or nonlinear modeling approaches.

Comparison Between ARMA(1,1) and ARMA(2,2) Models


To compare ARMA(1,1) and ARMA(2,2) models, we evaluate their coefficients, goodness-of-fit
metrics, and statistical significance.
1. ARMA(1,1) Model Summary
The ARMA(1,1) model contains one autoregressive term (AR(1)) and one moving average term
(MA(1)).

2. ARMA(2,2) Model Summary

The ARMA(2,2) model includes two autoregressive terms (AR(1), AR(2)) and two moving
average terms (MA(1), MA(2)).

Diagnostics
 Akaike Information Criterion (AIC): -6.087
 Schwarz Criterion (SC): -6.078
 Hannan-Quinn Criterion (HQ): -6.084
 Durbin-Watson Statistic: 1.997 (no significant autocorrelation in residuals).
Comparison

Metric               ARMA(1,1)          ARMA(2,2)          Observation
AR(1) Coefficient    0.628 (p=0.025)    0.628 (p=0.025)    Significant in both models
AR(2) Coefficient    N/A                -0.107 (p=0.609)   AR(2) not significant in ARMA(2,2)
MA(1) Coefficient    -0.674 (p=0.017)   -0.674 (p=0.017)   Significant in both models
MA(2) Coefficient    N/A                0.071 (p=0.752)    MA(2) not significant in ARMA(2,2)
Akaike (AIC)         -6.087             -6.087             Identical; no improvement with ARMA(2,2)
Schwarz (SC)         -6.078             -6.078             Identical; no improvement with ARMA(2,2)
Hannan-Quinn (HQ)    -6.084             -6.084             Identical; no improvement with ARMA(2,2)
Durbin-Watson (DW)   1.997              1.997              No residual autocorrelation in either model

Conclusion
 The ARMA(1,1) model is more parsimonious and performs as well as the ARMA(2,2)
model based on information criteria (AIC, SC, HQ).
 The additional terms (AR(2), MA(2)) in ARMA(2,2) are not statistically significant and
do not improve the model fit.
Recommendation
 Proceed with the ARMA(1,1) model, as it is simpler and achieves the same level of
performance as ARMA(2,2).
(ii) Which model would you prefer? To answer this question you need to choose evaluation
criteria. Use MAE and RMSFE and also explain the differences between the two. Further,
state which one should be preferred and when.
SOLUTION:
MAE (Mean Absolute Error):
 Definition: The average absolute difference between observed and predicted values,
MAE = (1/n) Σ |yt − ŷt|.
 Interpretation: Measures how far predictions deviate from the actual values without
considering the direction of the error (over or under-prediction).
 Advantages:
o Easy to interpret.
o Less sensitive to large outliers because it does not square errors.
 Limitations: Does not penalize large errors as heavily as RMSFE.
RMSFE (Root Mean Square Forecast Error):
 Definition: The square root of the average squared difference between observed and
predicted values, RMSFE = sqrt((1/n) Σ (yt − ŷt)²).
 Interpretation: Places a larger penalty on large prediction errors than MAE because it
squares the errors before averaging.
 Advantages:
o Useful when large prediction errors are critical and should be avoided.
o Penalizes larger errors more, making it sensitive to outliers.
 Limitations: More influenced by outliers than MAE.

Key Differences Between MAE and RMSFE

Criterion        MAE                                RMSFE
Error Handling   Treats all errors equally          Penalizes large errors heavily
Sensitivity      Less sensitive to outliers         Highly sensitive to outliers
Focus            Robust for general accuracy        Focuses on minimizing large deviations
Interpretation   Reflects average error magnitude   Emphasizes squared deviations

Model Preference: When to Use Each


1. MAE:
o Preferred when the goal is to minimize average prediction error uniformly.
o Suitable for cases where all errors (small and large) have equal importance.
o Robust against outliers.
2. RMSFE:
o Preferred when large errors are particularly problematic or critical.
o Suitable for cases where large deviations from predictions (e.g., forecasting in
finance or weather) have higher costs.
o Penalizes outliers heavily.

Application to Model Comparison


To decide which model is preferable:
1. Calculate both MAE and RMSFE for the models being compared.
2. Use the following guidelines:
o If MAE is substantially lower for a model, it suggests better overall accuracy.
o If RMSFE is substantially lower for a model, it suggests fewer large forecast
errors.
Final Recommendation
 If precision and overall accuracy are the primary goals, prioritize MAE.
 If minimizing extreme forecast errors is critical, prioritize RMSFE.
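
Both criteria are straightforward to compute; a small sketch (array names are illustrative):

    import numpy as np

    def mae(actual, forecast):
        return np.mean(np.abs(actual - forecast))          # treats all errors equally

    def rmsfe(actual, forecast):
        return np.sqrt(np.mean((actual - forecast) ** 2))  # penalizes large errors more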

(d) Explain how testing for ARCH effects works. State clearly the test and
results. Test for ARCH effects using the software of your choice.
SOLUTION:
Engle's ARCH-LM test checks whether the residual variance is time-varying. The squared
residuals from the mean equation are regressed on a constant and q of their own lags; under
the null hypothesis of no ARCH effects, the lag coefficients are jointly zero and the statistic
Obs*R-squared follows a chi-square distribution with q degrees of freedom (the F-statistic is
the equivalent regression-based version of the same test).

Heteroskedasticity Test: ARCH

F-statistic      226.4698    Prob. F(1,4534)       0.0000
Obs*R-squared    215.7911    Prob. Chi-Square(1)   0.0000

Test Equation
Dependent Variable: RESID^2; Method: Least Squares
Sample (adjusted): 2 4537; Included observations: 4536 after adjustments

Variable       Coefficient   Std. Error   t-Statistic   Prob.
C               0.000104      5.81E-06     17.80391     0.0000
RESID^2(-1)     0.217803      0.014473     15.04891     0.0000

R-squared 0.047573; Adjusted R-squared 0.047363; S.E. of regression 0.00037;
Sum squared resid 0.000619; Log likelihood 29412.83; F-statistic 226.4698 (Prob. 0.0000);
Mean dependent var 0.000132; S.D. dependent var 0.000379;
Akaike -12.9677; Schwarz -12.9649; Hannan-Quinn -12.9667; Durbin-Watson 2.112313.

Both the F-statistic (226.47) and Obs*R-squared (215.79) have p-values of 0.0000, so the null
hypothesis of no ARCH effects is firmly rejected: the residuals display conditional
heteroskedasticity, which motivates the GARCH model estimated next.
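
An equivalent check in Python, as a hedged sketch: statsmodels' ARCH-LM test applied to the residuals of the fitted ARMA model from part (c) (called fit here); the nlags argument, available in recent statsmodels versions, mirrors the single lag used in the EViews test above:

    from statsmodels.stats.diagnostic import het_arch

    lm_stat, lm_pval, f_stat, f_pval = het_arch(fit.resid, nlags=1)
    print(f"LM (Obs*R-squared): {lm_stat:.2f} (p = {lm_pval:.4f})")
    print(f"F-statistic:        {f_stat:.2f} (p = {f_pval:.4f})")
    # p-values near zero reject the null of no ARCH effects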

e) Estimate a GARCH(p, q) model using the best performing ARMA(p, q) model as the
conditional mean. Do an out-of-sample (recursive) forecasting exercise for h = 1, 2, 3,
where you need to produce conditional variance forecasts for the last 100 observations.
Use RMSFE to select the optimal order for the GARCH model. Use both yt² (squared
returns of FTSE 100) and realised variance as proxies to compute RMSFE. Explain
why the choice of the proxy matters.

Dependent Variable: RETURN
Method: ML ARCH - Normal distribution (Marquardt / EViews legacy)
Sample: 1 4537; Included observations: 4537
Convergence achieved after 14 iterations
Presample variance: backcast (parameter = 0.7)
GARCH = C(2) + C(3)*RESID(-1)^2 + C(4)*GARCH(-1)

Variable                       Coefficient   Std. Error   z-Statistic   Prob.
REALISED__VARIANCE__5_MIN_      -3.151126     5.19E-01     -6.066581    0.0000

Variance Equation
C                                1.53E-06     2.26E-07      6.782936    0.0000
RESID(-1)^2                      0.103588     7.15E-03     14.49426     0.0000
GARCH(-1)                        0.884026     0.007753    114.0176      0.0000

R-squared 0.012945; Adjusted R-squared 0.012945; S.E. of regression 0.011492;
Sum squared resid 0.599054; Log likelihood 14757.40; Durbin-Watson 2.084302;
Mean dependent var -6.26E-05; S.D. dependent var 0.011567;
Akaike -6.5036; Schwarz -6.49793; Hannan-Quinn -6.5016.

Why the choice of proxy matters: the true conditional variance is unobservable, so forecast
accuracy must be measured against a proxy. Squared returns are an unbiased but very noisy
proxy, so RMSFE rankings based on them can be unstable; the 5-minute realised variance is a
far less noisy estimate of the same quantity, so RMSFE computed against it gives a more
reliable ranking of candidate GARCH orders.
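
A hedged sketch of the estimation step with the arch package (the EViews run above is the reference; the arch package supports autoregressive conditional means only, so an AR(1) mean stands in for the chosen ARMA specification, and returns are scaled by 100 to help the optimizer):

    from arch import arch_model

    am = arch_model(ret * 100, mean="AR", lags=1, vol="GARCH", p=1, q=1)
    res = am.fit(disp="off")
    print(res.summary())

    fc = res.forecast(horizon=3, reindex=False)
    print(fc.variance)   # 1-, 2- and 3-step-ahead conditional variance forecasts

For the recursive exercise, this fit/forecast step would be repeated inside an expanding-window loop over the last 100 observations, with RMSFE computed against both squared returns and realised variance for each candidate (p, q).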
Question 2
The Excel spreadsheet AssignmentQ2data.xlsx contains data on four time
series, X1, X2, X3 and X4. This question is similar to that posed in Computer
Class 9, but you are now dealing with four variables instead of three.
a) Examine whether these variables are stationary or non-stationary. Explain carefully
which test you would use to do the above. Note that you should allow for a linear trend in
the data, with the lag length determined automatically using the standard settings in
EViews or in MATLAB.
Null Hypothesis: X1 has a unit root
Exogenous: Constant
Lag Length: 1 (Automatic - based on SIC, maxlag=21)

                                          t-Statistic    Prob.*
Augmented Dickey-Fuller test statistic     -1.07902      0.7259
Test critical values:    1% level          -3.43668
                         5% level          -2.86423
                         10% level         -2.56825

*MacKinnon (1996) one-sided p-values.

Augmented Dickey-Fuller Test Equation
Dependent Variable: D(X1); Method: Least Squares
Sample (adjusted): 3 1000; Included observations: 998 after adjustments

Variable     Coefficient   Std. Error   t-Statistic   Prob.
X1(-1)        -0.00334      0.003095     -1.07902     0.2808
D(X1(-1))     -0.25575      0.030644     -8.34576     0.0000
C              0.119818     0.113240      1.058089    0.2903

R-squared 0.067457; Adjusted R-squared 0.065582; S.E. of regression 1.139634;
Sum squared resid 1292.272; Log likelihood -1545.04; F-statistic 35.9872 (Prob. 0.0000);
Mean dependent var 0.003421; S.D. dependent var 1.178948;
Akaike 3.102293; Schwarz 3.117039; Hannan-Quinn 3.107898; Durbin-Watson 2.021223.

Null Hypothesis: X2 has a unit root
Exogenous: Constant
Lag Length: 1 (Automatic - based on SIC, maxlag=21)

                                          t-Statistic    Prob.*
Augmented Dickey-Fuller test statistic     -0.854322     0.8025
Test critical values:    1% level          -3.436683
                         5% level          -2.864225
                         10% level         -2.568251

*MacKinnon (1996) one-sided p-values.

Augmented Dickey-Fuller Test Equation
Dependent Variable: D(X2); Method: Least Squares
Sample (adjusted): 3 1000; Included observations: 998 after adjustments

Variable     Coefficient   Std. Error   t-Statistic   Prob.
X2(-1)        -0.002754     0.003223     -0.854322    0.3931
D(X2(-1))     -0.292813     0.030365     -9.643104    0.0000
C              0.117874     0.130944      0.900192    0.3682

R-squared 0.087217; Adjusted R-squared 0.085382; S.E. of regression 1.146485;
Sum squared resid 1307.856; Log likelihood -1551.026; F-statistic 47.53622 (Prob. 0.0000);
Mean dependent var 0.008264; S.D. dependent var 1.198805;
Akaike 3.114280; Schwarz 3.129027; Hannan-Quinn 3.119885; Durbin-Watson 2.038739.
Null Hypothesis: X3 has a unit root
Exogenous: Constant
Lag Length: 3 (Automatic - based on SIC, maxlag=21)

                                          t-Statistic    Prob.*
Augmented Dickey-Fuller test statistic     -0.670293     0.8519
Test critical values:    1% level          -3.436696
                         5% level          -2.864230
                         10% level         -2.568255

*MacKinnon (1996) one-sided p-values.

Augmented Dickey-Fuller Test Equation
Dependent Variable: D(X3); Method: Least Squares
Sample (adjusted): 5 1000; Included observations: 996 after adjustments

Variable     Coefficient   Std. Error   t-Statistic   Prob.
X3(-1)        -0.003588     0.005353     -0.670293    0.5028
D(X3(-1))     -0.551783     0.031639    -17.43993     0.0000
D(X3(-2))     -0.318445     0.034640     -9.193043    0.0000
D(X3(-3))     -0.160669     0.031429     -5.112164    0.0000
C              0.190852     0.258049      0.739596    0.4597

R-squared 0.241366; Adjusted R-squared 0.238304; S.E. of regression 1.077568;
Sum squared resid 1150.702; Log likelihood -1485.164; F-statistic 78.8239 (Prob. 0.0000);
Mean dependent var 0.009928; S.D. dependent var 1.234678;
Akaike 2.992297; Schwarz 3.016914; Hannan-Quinn 3.001655; Durbin-Watson 2.021870.

Null Hypothesis: X4 has a unit root
Exogenous: Constant
Lag Length: 1 (Automatic - based on SIC, maxlag=21)

                                          t-Statistic    Prob.*
Augmented Dickey-Fuller test statistic     -0.04877      0.9528
Test critical values:    1% level          -3.43668
                         5% level          -2.86423
                         10% level         -2.56825

*MacKinnon (1996) one-sided p-values.

Augmented Dickey-Fuller Test Equation
Dependent Variable: D(X4); Method: Least Squares
Sample (adjusted): 3 1000; Included observations: 998 after adjustments

Variable     Coefficient   Std. Error   t-Statistic   Prob.
X4(-1)        -0.00011      0.002159     -0.04877     0.9611
D(X4(-1))     -0.10071      0.031608     -3.18622     0.0015
C              0.038362     0.124444      0.308268    0.7579

R-squared 0.010155; Adjusted R-squared 0.008166; S.E. of regression 1.057265;
Sum squared resid 1112.22; Log likelihood -1470.17; F-statistic 5.104056 (Prob. 0.006232);
Mean dependent var 0.029684; S.D. dependent var 1.061608;
Akaike 2.952250; Schwarz 2.966996; Hannan-Quinn 2.957855; Durbin-Watson 1.985834.

Stationarity of X1, X2, X3 and X4 in levels:

The interpretation of the stationarity results, based on the Augmented Dickey-Fuller (ADF)
test for the four variables, is as follows:
1. X1: Non-stationary
o ADF test statistic = -1.07902 > critical values at all levels.
o p-value = 0.7259 > 0.05 (fail to reject the null hypothesis).
2. X2: Non-stationary
o ADF test statistic = -0.854322 > critical values at all levels.
o p-value = 0.8025 > 0.05 (fail to reject the null hypothesis).
3. X3: Non-stationary
o ADF test statistic = -0.670293 > critical values at all levels.
o p-value = 0.8519 > 0.05 (fail to reject the null hypothesis).
4. X4: Non-stationary
o ADF test statistic = -0.04877 > critical values at all levels.
o p-value = 0.9528 > 0.05 (fail to reject the null hypothesis).

Conclusion: None of the variables (X1, X2, X3, X4) is stationary in level form.
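
A compact sketch of the same four tests in Python (the file name follows the assignment and the column names X1..X4 follow the EViews output; EViews' SIC corresponds to autolag="BIC" here):

    import pandas as pd
    from statsmodels.tsa.stattools import adfuller

    data = pd.read_excel("AssignmentQ2data.xlsx")
    for col in ["X1", "X2", "X3", "X4"]:
        stat, pval = adfuller(data[col].dropna(), autolag="BIC")[:2]
        print(f"{col}: ADF = {stat:.4f}, p = {pval:.4f}")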

b) Estimate a VAR(1) model for the four variables, again allowing for a linear
trend. As in the previous computer class, you can use EViews' built-in
sequential modified LR test statistic to determine an appropriate order for a
VAR model for these variables.
Vector Autoregression Estimates
Sample (adjusted): 3 1000; Included observations: 998 after adjustments
Standard errors in ( ) & t-statistics in [ ]

                X1           X2           X3           X4
X1(-1)       0.607571     0.506483     0.121763    -0.04553
            (0.03662)    (0.03427)    (0.03429)    (0.03631)
            [ 16.5910]   [ 14.7800]   [ 3.55140]   [-1.25396]

X1(-2)       0.004561    -0.02206      0.033405    -0.08218
            (0.03854)    (0.03606)    (0.03608)    (0.03821)
            [ 0.11836]   [-0.61177]   [ 0.92584]   [-2.15091]

X2(-1)       0.486215     0.395065     0.149743     0.149706
            (0.04061)    (0.03800)    (0.03802)    (0.04026)
            [ 11.9721]   [ 10.3954]   [ 3.93817]   [ 3.71811]

X2(-2)      -0.03616     -0.02712     -0.09818     -0.06538
            (0.03679)    (0.03443)    (0.03445)    (0.03648)
            [-0.98263]   [-0.78753]   [-2.85004]   [-1.79215]

X3(-1)       0.020187     0.017161     0.119518     0.179146
            (0.04077)    (0.03815)    (0.03817)    (0.04042)
            [ 0.49513]   [ 0.44979]   [ 3.13106]   [ 4.43201]

X3(-2)       0.085219     0.021037     0.03806      0.05613
            (0.03917)    (0.03665)    (0.03667)    (0.03883)
            [ 2.17560]   [ 0.57381]   [ 1.03818]   [ 1.44572]

X4(-1)      -0.08598      0.105815     0.236152     0.817485
            (0.03667)    (0.03432)    (0.03433)    (0.03636)
            [-2.34474]   [ 3.08363]   [ 6.87830]   [ 22.4856]

X4(-2)      -0.02044      0.005206    -0.01278      0.104617
            (0.03690)    (0.03453)    (0.03455)    (0.03658)
            [-0.55388]   [ 0.15076]   [-0.36985]   [ 2.85986]

C           -3.23819     -0.1147      20.47747     -5.7501
            (1.28031)    (1.19806)    (1.19869)    (1.26932)
            [-2.52923]   [-0.09574]   [ 17.0833]   [-4.53008]

R-squared        0.992215     0.992708     0.977636     0.99569
Adj. R-squared   0.992152     0.992649     0.977455     0.995655
Sum sq. resids   1059.457     927.7156     928.6789     1041.343
S.E. equation    1.035007     0.968522     0.969024     1.026122
F-statistic      15755.40     16829.46     5404.165     28562.02
Log likelihood   -1445.92     -1379.66     -1380.18     -1437.32
Akaike AIC       2.915671     2.782885     2.783923     2.898427
Schwarz SC       2.959912     2.827125     2.828163     2.942667
Mean dependent   34.68759     39.04265     47.81486     55.58101
S.D. dependent   11.68296     11.29616     6.453686     15.56785

Determinant resid covariance (dof adj.)   0.490934
Determinant resid covariance              0.473463
Log likelihood                           -5291.31
Akaike information criterion             10.67597
Schwarz criterion                        10.85293
Number of coefficients                   36

Estimation of VAR (1) Model


The results of the estimated VAR are provided in the table (note that the reported output
includes two lags of each variable, i.e., a VAR(2), consistent with the order chosen by the
selection procedure). Each variable X1, X2, X3, and X4 is regressed on lagged values of itself
and the other variables.
 Significant coefficients:
o Coefficients with |t| > 1.96 (approximately significant at the 5% level) indicate
notable relationships.
o Example highlights:
 X1(−1): Significant for X1, X2, X3, but not X4.
 X2(−1): Significant for all variables X1, X2, X3, X4.
 X3(−1): Significant for X3, X4.
 X4(−1): Highly significant for X4, X3 and X2.
 Determinant Residual Covariance:
The determinant of the residual covariance is small (0.473463), suggesting the model fits
well overall.
 Fit (R-squared):
o High R2 values for all equations (>0.99), indicating an excellent fit to the data.
o Adjusted R2 values remain very close to the unadjusted ones, further supporting
the goodness-of-fit.
2. Sequential Modified LR Test for VAR Order
The sequential modified LR test is typically used to determine the optimal lag order for the VAR
model. This involves comparing models with progressively higher lag lengths and evaluating
whether the additional lag improves the fit significantly.
Steps:
1. Conduct the LR Test for Lag Length:
o Use EViews or equivalent software with built-in capabilities to compute LR
statistics for lag orders (e.g., 1, 2, ...).
o For each test, compare the LR statistic with the critical value at a 5% significance
level.
2. Select the Optimal Lag:
o Begin with lag 1 and proceed sequentially until adding more lags no longer
improves the model significantly.
o The optimal lag order minimizes information criteria (AIC, SC) and retains
significant relationships in the equations.
3. Interpret Results:
o If lag 1 is determined to be optimal (as likely in this case based on provided AIC
and SC values), proceed with VAR (1).

3. Consideration of Linear Trend


To incorporate a linear trend in the VAR model:
 Include a deterministic trend variable explicitly in the model.
 This would account for non-stationary trends in the data, especially if variables are not
stationary.

Conclusion
Based on the high fit, the t-statistics, and the low determinant residual covariance, the estimated
VAR appears to describe the dynamics effectively. To finalize the order, the sequential modified
LR test can confirm whether one lag suffices or a higher order is needed. Incorporating a linear
trend is advisable if a trend in the data is evident.
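
A hedged sketch of the VAR estimation with statsmodels; note that statsmodels' lag-selection summary reports information criteria (AIC/BIC/FPE/HQIC) rather than EViews' sequential modified LR statistic, and the lag order shown in the fit call is illustrative. trend="ct" adds the constant and linear trend:

    from statsmodels.tsa.api import VAR

    model = VAR(data[["X1", "X2", "X3", "X4"]])
    print(model.select_order(maxlags=8).summary())   # AIC / BIC / FPE / HQIC by lag
    res = model.fit(2, trend="ct")                   # fit with the selected lag order
    print(res.summary())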
c) Perform the Johansen test for cointegration, imposing the lag order
identified above. Allow for a linear trend in the data by allowing for an
intercept in the cointegrating equation and the test VAR. Explain carefully
how many cointegrating relationships you find.

Sample (adjusted): 6 1000; Included observations: 995 after adjustments
Trend assumption: Linear deterministic trend
Series: X1 X2 X3 X4
Lags interval (in first differences): 1 to 4

Unrestricted Cointegration Rank Test (Trace)

Hypothesized No. of CE(s)   Eigenvalue   Trace Statistic   0.05 Critical Value   Prob.**
None *                      0.171631     374.7514          47.85613              0.0001
At most 1 *                 0.166022     187.3963          29.79707              0.0001
At most 2                   0.005992     6.756266          15.49471              0.6061
At most 3                   0.000780     0.776594          3.841466              0.3782

Trace test indicates 2 cointegrating eqn(s) at the 0.05 level
* denotes rejection of the hypothesis at the 0.05 level
**MacKinnon-Haug-Michelis (1999) p-values

Unrestricted Cointegration Rank Test (Maximum Eigenvalue)

Hypothesized No. of CE(s)   Eigenvalue   Max-Eigen Statistic   0.05 Critical Value   Prob.**
None *                      0.171631     187.3551              27.58434              0.0001
At most 1 *                 0.166022     180.6400              21.13162              0.0001
At most 2                   0.005992     5.979672              14.26460              0.6158
At most 3                   0.000780     0.776594              3.841466              0.3782

Max-eigenvalue test indicates 2 cointegrating eqn(s) at the 0.05 level
* denotes rejection of the hypothesis at the 0.05 level
**MacKinnon-Haug-Michelis (1999) p-values

Unrestricted Cointegrating Coefficients (normalized by b'*S11*b=I):

      X1          X2          X3          X4
 1.308984    -2.04585    1.283201     0.03010
 -0.90722    0.478443    2.218595    -0.69324
 0.073212     0.03013    0.029599    -0.08803
 0.027808     0.02071    -0.01008    0.039261

Unrestricted Adjustment Coefficients (alpha):

D(X1)    -0.18737    0.170228    -0.06635    -0.00050
D(X2)    0.271384    -0.13252    -0.05123    -0.00560
D(X3)    -0.12294    -0.30779    -0.04535    -0.00480
D(X4)    -0.02655     0.09708    -0.02897    -0.02568

1 Cointegrating Equation(s):   Log likelihood   -5341.29

Normalized cointegrating coefficients (standard error in parentheses)
X1           X2           X3           X4
1.000000     -1.56293     0.980303     0.022995
             (0.04206)    (0.13718)    (0.03028)

Adjustment coefficients (standard error in parentheses)
D(X1)    -0.24526    (0.04350)
D(X2)    0.355237    (0.04059)
D(X3)    -0.16093    (0.04222)
D(X4)    -0.03475    (0.04282)

2 Cointegrating Equation(s):   Log likelihood   -5250.97

Normalized cointegrating coefficients (standard error in parentheses)
X1           X2           X3           X4
1.000000     0.000000     -4.19009     1.141571
                          (0.05661)    (0.02334)
0.000000     1.000000     -3.30814     0.715692
                          (0.04268)    (0.01760)

Adjustment coefficients (standard error in parentheses)
D(X1)    -0.39970 (0.05221)    0.464773 (0.06888)
D(X2)    0.475459 (0.04892)    -0.61861 (0.06454)
D(X3)    0.118308 (0.04892)    0.104255 (0.06453)
D(X4)    -0.12282 (0.05186)    0.100759 (0.06842)

3 Cointegrating Equation(s):   Log likelihood   -5247.98

Normalized cointegrating coefficients (standard error in parentheses)
X1           X2           X3           X4
1.000000     0.000000     0.000000     -0.71466
                                       (0.25798)
0.000000     1.000000     0.000000     -0.74984
                                       (0.20368)
0.000000     0.000000     1.000000     -0.44301
                                       (0.06159)

Adjustment coefficients (standard error in parentheses)
D(X1)    -0.40456 (0.05216)    0.462774 (0.06874)    0.135271 (0.08385)
D(X2)    0.471709 (0.04890)    -0.62016 (0.06445)    0.052723 (0.07862)
D(X3)    0.114988 (0.04891)    0.102888 (0.06447)    -0.84196 (0.07864)
D(X4)    -0.12494 (0.05190)    0.099886 (0.06840)    0.180458 (0.08343)

Johansen Cointegration Test Results


1. Trace Test:
 Hypotheses tested:
o H0: Number of cointegrating relationships is r, where r can be 0, 1, 2, etc.
o H1: Number of cointegrating relationships is greater than r.

Hypothesized No. of CE(s) Trace Statistic Critical Value (5%) p-value Decision

None 374.7514 47.85613 0.0001 Reject H0

At most 1 187.3963 29.79707 0.0001 Reject H0

At most 2 6.7563 15.49471 0.6061 Fail to reject H0

At most 3 0.7766 3.84147 0.3782 Fail to reject H0


Interpretation: The trace test indicates that there are 2 cointegrating relationships at the 5%
significance level (reject H0 for None and At most 1, fail to reject H0 for At most 2).
2. Maximum Eigenvalue Test:
 Hypotheses tested:
o H0: Number of cointegrating relationships is r.
o H1: Number of cointegrating relationships is r+1.

Hypothesized No. of CE(s) Max-Eigen Statistic Critical Value (5%) p-value Decision

None 187.3551 27.58434 0.0001 Reject H0

At most 1 180.64 21.13162 0.0001 Reject H0

At most 2 5.9797 14.2646 0.6158 Fail to reject H0

At most 3 0.7766 3.84147 0.3782 Fail to reject H0

Interpretation: The maximum eigenvalue test also confirms that there are 2 cointegrating
relationships at the 5% significance level.

3. Cointegrating Relationships
 Number of Cointegrating Equations: Both the Trace Test and Maximum Eigenvalue
Test consistently indicate 2 cointegrating relationships.
 Normalized Cointegrating Coefficients:
For the two identified cointegrating equations (taking the two-equation normalization
from the output above):
1. X1 − 4.19009·X3 + 1.141571·X4 = 0
2. X2 − 3.30814·X3 + 0.715692·X4 = 0
 These coefficients indicate long-term equilibrium relationships among the variables
X1, X2, X3, and X4.
Conclusion:
The Johansen test identifies 2 cointegrating relationships among the variables X1,X2,X3,and
X4 when allowing for a linear deterministic trend. These relationships suggest that there exist
long-term equilibrium dynamics binding the variables together, despite short-term deviations.
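
A minimal sketch of the Johansen test with statsmodels; det_order controls the deterministic terms (det_order=0 gives an unrestricted constant, the closest match to EViews' "linear deterministic trend" case, though the mapping should be checked against your version), and k_ar_diff is the lag order in first differences used above:

    from statsmodels.tsa.vector_ar.vecm import coint_johansen

    joh = coint_johansen(data[["X1", "X2", "X3", "X4"]], det_order=0, k_ar_diff=4)
    print("trace statistics:    ", joh.lr1)   # compare against joh.cvt (90/95/99%)
    print("max-eigen statistics:", joh.lr2)   # compare against joh.cvm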
d) How would your answer to the previous question change if the Johansen
procedure determined more cointegrating relationships?

1. Implication of Additional Cointegrating Relationships


 Higher Cointegration Rank:
A greater number of cointegrating relationships (say 3 or 4) would suggest stronger and
more complex long-term equilibrium relationships among the variables X1,X2,X3, and
X4.
o Example: If there are 3 cointegrating equations, all four variables may be part of
at least one long-term relationship, and some variables might simultaneously
belong to multiple relationships.
 System Stability:
More cointegrating relationships typically indicate that the system has more constraints
binding the variables together, implying greater stability in their long-term dynamics.

2. Changes in Cointegrating Equations


With more cointegrating relationships, the normalized cointegrating coefficients would include
additional equations. Each equation would describe a distinct equilibrium condition linking the
variables. For example:
 Three Cointegrating Equations:
1. X1−a12X2+a13X3+a14X4=0
2. X2−a23X3+a24X4=0
3. X3−a34X4=0
 Four Cointegrating Equations (Full Rank):
If all variables are stationary or fully cointegrated, all deviations would return to
equilibrium, indicating the strongest possible constraints on the system.

3. Adjustment Coefficients
The adjustment coefficients (α) would describe how deviations from the new, larger set of
equilibrium relationships are corrected. With more cointegrating equations, each variable would
respond to a larger number of deviations. For instance:
 If there are 3 cointegrating equations, the adjustment coefficients in the equations for
D(X1), D(X2), D(X3), D(X4) would reflect corrections for deviations from all 3
equations.

4. Interpretation
 The system’s dynamics would appear more interconnected, with more robust long-term
linkages.
 Short-term shocks to any variable would need to correct for deviations from more
equilibrium conditions. This could result in faster or more intricate adjustment processes.

5. Changes in Policy or Practical Insights


 Policy Analysis:
Identifying more cointegrating relationships would highlight additional equilibrium
conditions to maintain, making policy decisions more nuanced.
For example, if three equations are found, policy measures might need to consider
constraints from all three to ensure system stability.
 Forecasting:
With more cointegrating relationships, the model’s ability to forecast long-term trends
improves, as it captures more aspects of the equilibrium dynamics.

Conclusion
If the Johansen procedure determined more cointegrating relationships, it would reveal a more
interconnected system with stronger long-term equilibrium dynamics. The analysis would
expand to include these additional relationships, and the adjustment process for short-term
deviations would involve corrections toward multiple equilibrium conditions, leading to a more
stable and predictable system in the long run.
e) Estimate the Vector Error Correction Model that is implied by the results
from the Johansen test for cointegration in part (c). You need to write out
the estimated long-run relationship(s) and also the equations governing the
short-run dynamics of the system. Discuss your results in depth.

1. Long-Run Relationships
The Johansen test indicated 2 cointegrating relationships among the variables
X1, X2, X3, X4. These are given by the normalized cointegrating coefficients:
Normalized Cointegrating Equations:
1. X1 − 4.19009·X3 + 1.141571·X4 = 0
2. X2 − 3.30814·X3 + 0.715692·X4 = 0
These equations imply:
 The first cointegrating relationship describes a long-term equilibrium tying
X1 to X3 and X4, where deviations from this equilibrium lead to
adjustments in the system.
 The second cointegrating relationship highlights another equilibrium,
driven by X2, X3 and X4.
2. VECM Representation
The VECM captures both:
1. Short-Run Dynamics (via lagged differences of variables).
2. Long-Run Adjustments (via error correction terms derived from
cointegrating relationships).
For n = 4 variables and r = 2 cointegrating relationships, the VECM is:

ΔXt = α β′ X(t−1) + Γ1 ΔX(t−1) + … + Γ(k−1) ΔX(t−k+1) + μ + εt

Where:
 ΔX: First differences of the variables (short-run changes).
 α: Adjustment coefficients, showing how quickly each variable corrects
deviations from long-run equilibrium.
 β: Cointegrating matrix (long-run relationships).
 Γ: Coefficients for lagged differences (short-run dynamics).
 μ: Deterministic trend or intercept.
 ϵt: Error terms.
3. Estimated VECM Equations
Using the results, the estimated short-run dynamics for each variable (ΔX1,
ΔX2, ΔX3, ΔX4) take the general form:

ΔXi,t = αi1·ECT1(t−1) + αi2·ECT2(t−1) + Σj Γ1,ij ΔXj(t−1) + … + μi + εi,t

where ECT1(t−1) and ECT2(t−1) are the error-correction terms formed from the
two cointegrating equations above, and the αij are the adjustment coefficients
reported in the Johansen output (e.g., for ΔX1: α11 = −0.3997 and α12 = 0.464773).
4. Interpretation of Results
Adjustment Coefficients (α):
 The αij values indicate how strongly each variable adjusts to deviations from the
cointegrating relationships:
o A large absolute value of αij suggests that the variable responds
significantly to restore equilibrium when deviations occur.
o For example:
 α11: Adjustment of X1 to the first equilibrium.
 α21: Adjustment of X2 to the first equilibrium.
Short-Run Dynamics (Γi):
 The Γi coefficients capture the effects of lagged differences of variables on
current changes. These coefficients explain short-term interactions between
the variables.
Cointegrating Relationships (β):
 The β (normalized cointegrating coefficients) defines the long-term
equilibria binding the variables.

5. Discussion of Results
 The system has 2 long-run equilibria, indicating strong and stable
relationships among the variables in the long term. These relationships
suggest economic, financial, or physical interdependencies.
 Short-term deviations occur, but the adjustment coefficients show that the
system gradually restores equilibrium. Variables with higher α-values
play a stronger role in the adjustment process.
 The inclusion of short-run dynamics (Γi) ensures that transient changes are
captured, improving the model's accuracy.
6. Practical Implications
 Policymakers or analysts can focus on the cointegrating relationships to
understand the fundamental links among variables.
 The adjustment coefficients help predict how quickly the system reacts to
shocks or deviations from equilibrium.
 The short-run dynamics highlight immediate interdependencies, useful for
forecasting and intervention.
This comprehensive VECM provides insights into both long-term stability and
short-term fluctuations in the system.
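
A hedged sketch of the implied VECM estimation with statsmodels, assuming cointegration rank 2 from the Johansen test and the same lag order in differences; the deterministic option ("co", a constant outside the cointegrating relation) is an assumption to be matched to the EViews specification:

    from statsmodels.tsa.vector_ar.vecm import VECM

    vecm = VECM(data[["X1", "X2", "X3", "X4"]], k_ar_diff=4,
                coint_rank=2, deterministic="co")
    vres = vecm.fit()
    print(vres.beta)    # long-run cointegrating vectors (beta)
    print(vres.alpha)   # adjustment / loading coefficients (alpha)
    print(vres.summary())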
