Advanced Quantitative Methods
Faculty of Finance
MSc Energy, Trade and Finance
MSc Shipping, Trade and Finance
Reading List
1- Data Analysis and Decision Making with Microsoft Excel (with InfoTrac, DecisionTools and Statistic Tools Suite), by S. Christian Albright, Wayne L. Winston and Christopher J. Zappe (2006)
CHAPTER 1
Y_i = β0 + β1 X_{1,i} + β2 X_{2,i} + ε_i,   ε_i ~ iid(0, σ²)

where β1 and β2 are the partial regression coefficients on X_{1i} and X_{2i}, respectively.
Now the underlying assumptions for the validity of OLS in multiple regression are the same as in the two-variable regression; that is,
1- The relationship between Y and Xs is linear,
2- The error term has a zero mean,
3- The error term has a constant variance,
4- Errors corresponding to different observations are independent,
5- The error terms are normally distributed,
6- There is no correlation between error terms and independent variables,
7- Finally, there should not be any correlation between the independent variables (no multicollinearity).
The last assumption is in addition to those for bivariate regression.
ESS = Σ (Y_i − Ŷ_i)²,   where   Ŷ_i = β̂0 + β̂1 X_{1i} + β̂2 X_{2i}

therefore

ESS = Σ (Y_i − β̂0 − β̂1 X_{1i} − β̂2 X_{2i})²

Since Ȳ = β̂0 + β̂1 X̄1 + β̂2 X̄2, rearranging the above equation we can write

ESS = Σ [ (Y_i − Ȳ) − β̂1 (X_{1i} − X̄1) − β̂2 (X_{2i} − X̄2) ]²

and using the notation for deviations of variables from their means, x_i = X_i − X̄ and y_i = Y_i − Ȳ, we can write

ESS = Σ (y_i − β̂1 x_{1i} − β̂2 x_{2i})²

Minimising ESS with respect to β̂1 and β̂2 means

∂ESS/∂β̂1 = −2 Σ x_{1i} (y_i − β̂1 x_{1i} − β̂2 x_{2i}) = 0
∂ESS/∂β̂2 = −2 Σ x_{2i} (y_i − β̂1 x_{1i} − β̂2 x_{2i}) = 0

and solving these two equations yields the estimates β̂1 and β̂2, with the intercept recovered as β̂0 = Ȳ − β̂1 X̄1 − β̂2 X̄2.

If the variables are measured in deviation (zero-mean) form, that is X̄ = 0 and Ȳ = 0, then the total variation in Y can be found as TSS = Y'Y and

R² = 1 − ESS/TSS = 1 − ε̂'ε̂ / (Y'Y) = β̂'X'Xβ̂ / (Y'Y)
However, if Y is not a zero mean variable, the formula for R2 should be modified to
R² = 1 − ESS/TSS = (β̂'X'Xβ̂ − nȲ²) / (y'y)

where y'y = Y'Y − nȲ² and y_i = Y_i − Ȳ.
There is, however, a problem with the above formula for R²: it increases as the number of regressors increases, regardless of whether the additional explanatory variables have any real explanatory power. Therefore, R² should be adjusted for the number of variables included on the right-hand side of the regression. This is done by calculating the R-bar-squared (R̄²), which is R² adjusted for the number of regressors (degrees of freedom), as follows
R̄² = 1 − [ ε̂'ε̂ / (n − k) ] / [ y'y / (n − 1) ] = 1 − (ε̂'ε̂ / y'y) × (n − 1)/(n − k)
To test the hypothesis that all the slope coefficients of the regression

y_i = β1 + β2 X_{2i} + β3 X_{3i} + ... + βk X_{ki} + ε_i

are jointly zero, we use the statistic

F = [ RSS/(k − 1) ] / [ ESS/(n − k) ] ~ F(k−1, n−k)

which can equivalently be written as

F = (RSS/ESS) × (n − k)/(k − 1) = [ R²/(1 − R²) ] × (n − k)/(k − 1)

Note that ESS is the error sum of squares and RSS is the regression sum of squares.
Furthermore, if the hypothesis involves testing only some of the parameters of the regression, and not all of them at the same time, we need to estimate a restricted regression with a reduced number of variables in the following form
y_i = β1 + β2 X_{2,i} + β3 X_{3,i} + ... + β_{k−p} X_{k−p,i} + ε_i

and compare it with the unrestricted model using

F = [ (ESS_R − ESS_UR) / p ] / [ ESS_UR / (n − k) ] ~ F(p, n−k)

or, equivalently,

F = [ (R²_UR − R²_R) / p ] / [ (1 − R²_UR) / (n − k) ] ~ F(p, n−k)

If the computed F statistic exceeds the critical value, the null hypothesis H0: β_{k−p+1} = ... = βk = 0 is rejected in favour of the alternative that at least one of the restricted coefficients is non-zero.
Let us now consider the following example, where we perform an F test on a subset of the variables.
The example models the interest rate (IR, the series R3) as a function of industrial production (IP), money supply growth (GM2) and lagged producer price inflation (GPW), where

GM2_t = (M2_t − M2_{t−1}) / M2_{t−1}   and   GPW_t = (PW_t − PW_{t−1}) / PW_{t−1}

IR_t = f(IP_t, GM2_t, GPW_{t−1})

IR_t = β1 + β2 IP_t + β3 GM2_t + β4 GPW_{t−1} + ε_t,   ε_t ~ N(0, σ²)
Unrestricted model:

Variable    Coefficient   Std. Error   t-Statistic   Prob.
C           1.214078      0.551692     2.200644      0.0283
IP          0.048353      0.005503     8.786636      0.0000
GM2         140.3261      36.03850     3.893784      0.0001
GPW(-1)     104.5884      17.44218     5.996295      0.0000

R-squared            0.216361    Mean dependent var     6.145764
Adjusted R-squared   0.210816    S.D. dependent var     2.792815
S.E. of regression   2.481026    Akaike info criterion  4.664523
Sum squared resid    2609.927    Schwarz criterion      4.702459
Log likelihood       -994.2079   F-statistic            39.02177
Durbin-Watson stat   0.183733    Prob(F-statistic)      0.000000
Restricted model (GM2 and GPW(-1) excluded):

Variable    Coefficient   Std. Error   t-Statistic   Prob.
C           2.816457      0.445743     6.318565      0.0000
IP          0.042703      0.005482     7.789104      0.0000

R-squared            0.124664    Mean dependent var     6.145764
Adjusted R-squared   0.122609    S.D. dependent var     2.792815
S.E. of regression   2.616006    Akaike info criterion  4.765836
Sum squared resid    2915.326    Schwarz criterion      4.784804
Log likelihood       -1017.889   F-statistic            60.67014
Durbin-Watson stat   0.043288    Prob(F-statistic)      0.000000
F = [ (ESS_R − ESS_UR) / p ] / [ ESS_UR / (n − k) ] = [ (2915.326 − 2609.927) / 2 ] / [ 2609.927 / (428 − 4) ] = 24.80

F_crit(2, 428−4), 5% = 3.017

Since 24.80 > 3.017, the null hypothesis that the coefficients of the two restricted variables are jointly zero is rejected,
which means that the two variables excluded from the restricted model do have explanatory power over the dependent variable (interest rates).
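The arithmetic of this restriction test is easy to reproduce; the following is a minimal sketch (not part of the original example) that uses scipy only to obtain the 5% critical value, with the ESS figures taken from the two tables above.

```python
from scipy.stats import f

ess_r, ess_ur = 2915.326, 2609.927   # restricted and unrestricted error sums of squares
n, k, p = 428, 4, 2                  # observations, parameters in unrestricted model, restrictions

f_stat = ((ess_r - ess_ur) / p) / (ess_ur / (n - k))
f_crit = f.ppf(0.95, p, n - k)       # 5% critical value of F(p, n - k)
print(round(f_stat, 2), round(f_crit, 3))   # roughly 24.8 versus 3.02
```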
and the volatility in the market, VOL. The explanatory variables therefore include the laycan period (LC), vessel size (SZ), vessel age (AG and AG²), route dummies (RT) and market volatility (VOL), and the hypothesized freight (charter) rate regression model is

dfr_{i,t} = β0 + β1 LC_{i,t} + β2 SZ_i + β3 AG_i + β4 AG²_i + β5 VOL_t + Σ_{j=1}^{K} γ_j RT_{i,j} + v_i,   v_i ~ iid(0, σ²)
where dfr_{i,t} is the difference between the log of the fixture rate for contract i at time t, fr_{i,t}, and the log of the Baltic benchmark freight rate (Baltic Average 4TC rate) at time t, bf_t = ln(B4TC_t). Dummy variables are used to distinguish between fixtures on different routes; for instance, 9 binary dummy variables are used to distinguish 10 panamax routes. The table below presents the number of fixtures and statistics for the laycan period in each route over the sample period (January 2003 to July 2009, 9,076 fixtures).
[Figure: Panamax fixture rates (FRATE) and the Baltic Panamax 4TC index (BPI_4TC) over the sample period]

Panamax routes, number of fixtures and laycan statistics (days):

Route  Description                        No. of fixtures   %       Median  Min   Max   SD
1      Trans Atlantic Round Voyage        1397              15.4%   3.0     0.0   40.0  4.4
2      Continent to Far East              1033              11.4%   3.0     0.0   38.0  4.5
3      Trans Pacific Round Voyage         3174              35.0%   3.0     0.0   61.0  4.2
4      Far East to Continent              503               5.5%    4.0     0.0   34.0  4.5
5      Mediterranean to Far East          104               1.1%    5.0     0.0   27.0  5.2
6      PG - Indian Ocean to Far East      865               9.5%    4.0     0.0   32.0  5.5
7      Far East to PG - Indian Ocean      376               4.1%    3.0     0.0   30.0  3.9
8      Continent to PG - Indian Ocean     218               2.4%    3.0     0.0   17.0  3.6
9      PG - Indian Ocean to Continent     208               2.3%    4.0     0.0   30.0  5.5
10     Other routes                       1198              13.2%   3.0     0.0   47.0  5.3
       Total fixtures                     9076                      3.0     0.0   61.0  4.6
The estimation results clearly show that all the variables considered, except freight market volatility, are significant in determining freight rates. Also, there are significant differences between freight rate differentials across routes. For instance, front-haul routes (e.g. Continent or Mediterranean to Far East) are at a premium to back-haul routes (e.g. Far East to the Continent or Mediterranean).
dfr_{i,t} = β0 + β1 LC_{i,t} + β2 SZ_i + β3 AG_i + β4 AG²_i + β5 VOL_t + Σ_{j=1}^{K} γ_j RT_{i,j} + v_i,   v_i ~ iid(0, σ²)
Estimation results:

Variable                                   Coeff      P-val
Constant (β0)                              -0.4435    0.000
Laycan, LC (β1)                            0.0035     0.000
Size, SZ (β2)                              0.0060     0.000
Age, AG (β3)                               0.0066     0.000
Age squared, AG² (β4)                      -0.0006    0.000
Volatility, VOL (β5)                       0.0020     0.878
Trans Atlantic Round Voyage (γ1)           0.0368     0.000
Continent to Far East (γ2)                 0.1494     0.000
Trans Pacific Round Voyage (γ3)            -0.0910    0.000
Far East to Continent (γ4)                 -0.1709    0.000
Mediterranean to Far East (γ5)             0.2474     0.000
PG - Indian Ocean to Far East (γ6)         0.0809     0.000
Far East to PG - Indian Ocean (γ7)         -0.0757    0.000
Continent to PG - Indian Ocean (γ8)        0.1900     0.000
PG - Indian Ocean to Continent (γ9)        -0.1504    0.000

R-bar-squared                              0.346
BG test                                    111.44     [0.000]
White test                                 801.53     [0.000]
JB test                                    1.4×10^4   [0.000]
Example: The following cross-section data set contains the hourly wage (WAGE), gender (SEX), years of education (ED), age (AGE), and race/ethnicity dummies (NONWH, HISP) for 206 US workers.

OBS   WAGE       SEX   ED   AGE   NONWH   HISP
1     8.999779   0     10   43    0       0
2     5.499735   0     12   38    0       0
3     3.799996   1     12   22    0       0
4     10.50026   1     12   47    0       0
5     14.99925   0     12   58    0       0
6     8.999779   1     16   49    0       0
7     9.569682   1     12   23    0       0
8     14.99925   0     14   42    0       0
9     11.00005   0     8    56    0       0
10    4.99981    1     12   32    0       0
11    24.97562   0     17   41    0       1
.     .          .     .    .     .       .
.     .          .     .    .     .       .
198   4.99981    1     12   46    0       0
199   14.99925   0     12   60    0       0
200   5.550012   1     11   62    0       0
201   8.999779   0     16   29    1       0
202   24.97562   0     17   54    0       0
203   8.490093   1     12   42    0       0
204   4.99981    1     14   37    0       1
205   22.20017   0     12   44    0       0
206   9.239613   0     16   27    1       0
These variables are then used in a multiple regression in the following form
W_i = β1 + β2 Sex_i + β3 ED_i + β4 Age_i + β5 Nonwh_i + β6 Hisp_i + ε_i
Variable    Coefficient   Std. Error   t-Statistic   Prob.
C           -6.409050     1.895795     -3.380667     0.0009
SEX         -2.761399     0.598422     -4.614464     0.0000
ED          0.992564      0.116158     8.544971      0.0000
AGE         0.116709      0.025227     4.626442      0.0000
NONWH       -1.060823     0.986848     -1.074961     0.2837
HISP        0.238682      1.069428     0.223187      0.8236

R-squared            0.367537    Mean dependent var     9.596389
Adjusted R-squared   0.351725    S.D. dependent var     5.238212
S.E. of regression   4.217573    Akaike info criterion  5.745090
Sum squared resid    3557.584    Schwarz criterion      5.842019
Log likelihood       -585.7443   F-statistic            23.24480
Durbin-Watson stat   1.776918    Prob(F-statistic)      0.000000
The results indicate that sex, education and age are significant variables in determining the wage level, while race and ethnic background are not significant. Furthermore, the coefficient of the sex variable (-2.761) indicates that, on average, the wage level for female workers is $2.761/hour lower than for their male counterparts. Similarly, the coefficient of the education variable (0.993) indicates that, on average, the wage level increases by about $0.99/hour for every additional year of education. The age coefficient indicates that the average wage increases by $0.12/hour for each additional year of age.
It is, however, difficult to justify the last finding (age) economically, as beyond some point an increase in age may be associated with declining performance and salaries. Therefore, this relationship might not be completely linear. In order to test whether there is a nonlinear relationship between age and the salary level of workers, we can modify the model by adding the square of workers' age:
W_i = β1 + β2 Sex_i + β3 ED_i + β4 Age_i + β5 Nonwh_i + β6 Hisp_i + β7 Age²_i + ε_i

Variable    Coefficient   Std. Error   t-Statistic   Prob.
C           -14.79336     3.220386     -4.593660     0.0000
SEX         -2.641083     0.586421     -4.503730     0.0000
ED          0.923222      0.115660     7.982170      0.0000
AGE         0.623939      0.161202     3.870530      0.0001
NONWH       -1.177989     0.965748     -1.219768     0.2240
HISP        0.297989      1.045969     0.284893      0.7760
AGE^2       -0.006308     0.001981     -3.184043     0.0017

R-squared            0.398196    Mean dependent var     9.596389
Adjusted R-squared   0.380051    S.D. dependent var     5.238212
S.E. of regression   4.124402    Akaike info criterion  5.705109
Sum squared resid    3385.128    Schwarz criterion      5.818192
Log likelihood       -580.6262   F-statistic            21.94541
Durbin-Watson stat   1.753024    Prob(F-statistic)      0.000000
The estimated coefficients of Age and Age² suggest that there is a nonlinear relationship between age and workers' wages in the US. The signs of the coefficients indicate that an increase in age raises the salary up to a certain age, after which, for every additional year of age, salaries on average decrease (the quadratic term is -0.0063 $/hour). Furthermore, comparing the R-bar-squared of the second regression (0.38) with that of the first (0.35) indicates that including Age² in the regression has increased its explanatory power.
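The two wage regressions above are EViews estimates; purely as an illustration, the same specifications could be estimated in Python with statsmodels, assuming the data sit in a file with the column names shown in the data table (the file name below is hypothetical).

```python
import pandas as pd
import statsmodels.formula.api as smf

df = pd.read_csv("wages.csv")   # hypothetical file with columns WAGE, SEX, ED, AGE, NONWH, HISP

# Linear specification
lin = smf.ols("WAGE ~ SEX + ED + AGE + NONWH + HISP", data=df).fit()

# Specification with a quadratic age term; I() protects the power inside the formula
quad = smf.ols("WAGE ~ SEX + ED + AGE + I(AGE**2) + NONWH + HISP", data=df).fit()

print(lin.summary())
print(quad.summary())
print(lin.rsquared_adj, quad.rsquared_adj)   # compare the adjusted R-squared of the two models
```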
1.5.2 Slope, irregular and event dummies
In some economic applications, the impact of one explanatory variable on the dependent variable may change after some time due to changes in policy or other factors, which we may or may not know. What we do know is the date at which the influence of the explanatory variable on the dependent variable changed. In order to allow for such a change in the model, we need to allow the slope coefficient (the coefficient of the explanatory variable) to change after a certain time period.
For quarterly data, for example, a set of zero-one seasonal dummy variables can be defined as follows:

Period   D1   D2   D3   D4
1        1    0    0    0
2        0    1    0    0
3        0    0    1    0
4        0    0    0    1
5        1    0    0    0
6        0    1    0    0
7        0    0    1    0
8        0    0    0    1
They can be used in regressions to let the intercept of the regression take different values in different time periods. Therefore, significance of the coefficient of a dummy variable means that the level of the series tends to change at that point in time. Zero-one seasonal dummies enter a regression equation in one of two ways.

A) Estimation with a constant term:

Y_t = α + β2 D2 + β3 D3 + ... + β12 D12 + ε_t     (1)
If a constant term is included in the regression, one of the dummy variables must be dropped in order to avoid the problem of perfect collinearity. In equation (1), the constant term indicates the level of the dependent variable in the season for which the dummy variable is dropped (in this case D1), while the significance of each coefficient shows the change in the level with respect to the base month (January in this case) due to seasonal factors in that month. For example, in February the dependent variable would equal the constant plus β2. Equation (1) can be estimated by OLS. The base month can be chosen, e.g. January or December, according to the form of interpretation required by the researcher. The major disadvantage of this type of seasonal specification is that the model does not tell us anything about the changes for each month (January to December) with respect to the overall mean of the dependent variable.
Example: Seasonality in shipping freight rates
To see how seasonality in time series data is detected, consider the following example on
shipping freight rates. In this case we collected monthly freight rates for panamax vessels for
the period January 1980 to October 1999. We also found the logarithmic changes in monthly
freight rates using the following formula
ln pmxt
and using a series of zero and one dummies we run the following regression
ln pmxt
D2,t
D3,t
12
D12,t +
It can be seen that the mean change in the log of the panamax freight rate (the growth rate) over the sample period, including the January effect, was 0.0586, while there have been significant seasonal increases in freight rates during February, March, September, October and November. Furthermore, the adjusted R-squared suggests that around 12% of the variation in freight rates can be attributed to seasonal factors.
If we try to exclude insignificant variables one by one, starting from the most insignificant
dummy, we finally end up with the following model
Writing the regression out for each of the n observations gives

Y_1 = β0 + β1 X_{11} + β2 X_{21} + ... + βk X_{k1} + ε_1
Y_2 = β0 + β1 X_{12} + β2 X_{22} + ... + βk X_{k2} + ε_2
...
Y_n = β0 + β1 X_{1n} + β2 X_{2n} + ... + βk X_{kn} + ε_n

This means that we can write our parameters and variables in the following matrix form

Y = Xβ + ε

where

Y = [Y_1, Y_2, ..., Y_n]'

X = [ 1  X_{11}  X_{21}  ...  X_{k1}
      1  X_{12}  X_{22}  ...  X_{k2}
      ...
      1  X_{1n}  X_{2n}  ...  X_{kn} ]

β = [β0, β1, ..., βk]'   and   ε = [ε_1, ε_2, ..., ε_n]'
E(εε') = [ E(ε_1²)     E(ε_1 ε_2)  ...  E(ε_1 ε_n)
           E(ε_2 ε_1)  E(ε_2²)     ...  E(ε_2 ε_n)
           ...
           E(ε_n ε_1)  E(ε_n ε_2)  ...  E(ε_n²) ]

which is in fact

Var(ε) = E(εε') = [ Var(ε_1)      Cov(ε_1, ε_2)  ...  Cov(ε_1, ε_n)
                    Cov(ε_2, ε_1) Var(ε_2)       ...  Cov(ε_2, ε_n)
                    ...
                    Cov(ε_n, ε_1) Cov(ε_n, ε_2)  ...  Var(ε_n) ]  = σ² I_n

The above result follows because, according to the underlying assumptions of OLS, Var(ε_i) = E(ε_i²) = σ² and Cov(ε_i, ε_j) = 0 for i ≠ j.
Let us now look at how the parameters can be estimated using matrix notation. Once again, consider that we are minimising the ESS:

ESS = Σ_{i=1}^{n} ε̂_i² = ε̂'ε̂

where ε̂ = Y − Ŷ = Y − Xβ̂; therefore we have

ESS = (Y − Xβ̂)'(Y − Xβ̂) = Y'Y − β̂'X'Y − Y'Xβ̂ + β̂'X'Xβ̂ = Y'Y − 2β̂'X'Y + β̂'X'Xβ̂

The FOC yields

∂ESS/∂β̂ = −2X'Y + 2X'Xβ̂ = 0   ⇒   β̂ = (X'X)⁻¹(X'Y)
The cross-product matrix X'X has an inverse if and only if the matrix X is of full rank; that is, there is no linear relationship between any two columns of the matrix. This condition is known as the no-collinearity condition (as opposed to perfect collinearity or multicollinearity). If there is collinearity between the explanatory variables in X, the matrix has reduced rank and does not have an inverse. Furthermore, X'X should be positive definite for the SOC to hold.
SOC:

∂²ESS / ∂β̂∂β̂' = 2X'X

where

X'X = [ n          ΣX_{1i}        ΣX_{2i}        ...  ΣX_{ki}
        ΣX_{1i}    ΣX_{1i}²       ΣX_{1i}X_{2i}  ...  ΣX_{1i}X_{ki}
        ΣX_{2i}    ΣX_{2i}X_{1i}  ΣX_{2i}²       ...  ΣX_{2i}X_{ki}
        ...
        ΣX_{ki}    ΣX_{ki}X_{1i}  ΣX_{ki}X_{2i}  ...  ΣX_{ki}² ]     (2.1)

For X'X to be positive definite, its leading principal minors must all be positive:

H_1 = n > 0,   H_2 = n ΣX_{1i}² − (ΣX_{1i})² > 0,   and so on until the determinant of X'X itself, |X'X| > 0.
Therefore, the condition for minimum ESS with respect to regression parameters holds.
From the above we can see that

X'ε̂ = X'(Y − Xβ̂)

and since β̂ = (X'X)⁻¹(X'Y),

X'ε̂ = X'Y − X'X(X'X)⁻¹(X'Y) = X'Y − X'Y = 0

that is, the OLS residuals are orthogonal to the regressors.
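As a quick numerical check of β̂ = (X'X)⁻¹X'Y, here is a small NumPy sketch on simulated data (the numbers are illustrative only and not taken from the text).

```python
import numpy as np

rng = np.random.default_rng(0)
n, k = 100, 3
X = np.column_stack([np.ones(n), rng.normal(size=(n, k - 1))])   # constant plus two regressors
beta = np.array([1.0, 0.5, -2.0])
y = X @ beta + rng.normal(scale=0.5, size=n)

beta_hat = np.linalg.solve(X.T @ X, X.T @ y)      # (X'X)^(-1) X'Y without forming the inverse
resid = y - X @ beta_hat
sigma2 = resid @ resid / (n - k)                  # estimate of the error variance
se = np.sqrt(np.diag(sigma2 * np.linalg.inv(X.T @ X)))

print(beta_hat)   # close to the true coefficients
print(se)         # classical OLS standard errors
```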
CHAPTER 2
2.2 Heteroscedasticity
One of the main underlying assumptions of the classical linear regression model (CLRM) is the assumption of homoscedasticity, which requires the disturbance terms appearing in the population regression to have a constant variance,

E(ε_i²) = σ²   for all i

If, instead, the variance differs across observations, E(ε_i²) = σ_i², the errors are said to be heteroscedastic. Consider, for example, a regression of savings on income,

Y_i = β1 + β2 X_i + ε_i
A plot of savings against income shows that as income increases, savings on average also increase. However, it might also be the case that as income increases, not only does the average level of savings increase but the variance of savings increases too. This can be seen graphically in the figure below, where the density of the disturbances is more concentrated for low income levels than for high income levels.

[Figure: Density of savings around the regression line β1 + β2 X at different income levels]
2.2.1 The effect of heteroscedasticity on OLS estimates of the CLRM

We know that when the CLRM is estimated, one of the objectives is to perform hypothesis tests on the coefficients of the regression in order to, for example, investigate the validity of an economic theory. We also know that the standard errors of the estimated coefficients of the following linear regression model

Y_i = β1 + β2 X_{1i} + ε_i

are

SE(β̂1) = sqrt[ Var(β̂1) ] = sqrt[ σ² ΣX_{1i}² / ( n Σ(X_{1i} − X̄)² ) ]

and

SE(β̂2) = sqrt[ Var(β̂2) ] = sqrt[ σ² / Σ(X_{1i} − X̄)² ]

using n ΣX_{1i}² − (ΣX_{1i})² = n Σ(X_{1i} − X̄)². These expressions are derived under the assumption of a constant error variance; when the errors are heteroscedastic they no longer measure the true sampling variability of the estimators,
and as a result the confidence intervals used for hypothesis tests are no longer valid. In other words, conclusions drawn from hypothesis tests on the regression coefficients in the presence of heteroscedasticity are not valid.
In fact, a Monte Carlo study by Davidson and MacKinnon (1993), based on a bivariate regression model, shows that the standard errors of the coefficients, when heteroscedasticity is ignored, are wider than the ones corrected for heteroscedasticity. Their model is based on the following regression

Y_i = β1 + β2 X_i + ε_i,   ε_i ~ N(0, σ² X_i^α)

which shows that the variance of the error terms is not constant and depends on the values of X_i. The power α means that the relationship between the variance of the error terms and the independent variable might be nonlinear. Davidson and MacKinnon assume β1 = β2 = 0 and run a series of simulations with different values of α to show that the standard error of the estimated coefficients depends on the presence of heteroscedasticity. They report the following results.
             Standard error of β̂1            Standard error of β̂2
Value of α   OLS     OLSHET   GLS            OLS     OLSHET   GLS
0.5          0.164   0.134    0.110          0.285   0.277    0.243
1.0          0.142   0.101    0.048          0.246   0.247    0.173
2.0          0.116   0.074    0.0073         0.200   0.220    0.109
3.0          0.100   0.064    0.0013         0.173   0.206    0.056
4.0          0.089   0.059    0.0003         0.154   0.195    0.017
It can be seen clearly that OLS consistently overestimates the standard errors of the estimates compared with the Generalised Least Squares (GLS) method. Even when the OLS standard errors are corrected for heteroscedasticity, the result is not as good as GLS. Therefore, it is better to use GLS in the presence of heteroscedasticity; when such a method cannot be applied, heteroscedasticity-corrected standard errors should be used and reported for hypothesis testing to ensure valid conclusions. We will discuss how standard errors can be corrected for heteroscedasticity, and how the GLS method can be used to estimate models with heteroscedastic errors, after we have explained the tests for detecting heteroscedasticity.
2.2.2 Detecting Heteroscedasticity
In order to correct the standard errors for the effects of heteroscedasticity in the error terms, we first need to investigate whether the errors show signs of heteroscedasticity. A number of methods have been proposed in the literature over the years. In what follows we discuss a few of these tests and show how each one can be performed using examples in EViews.
Graphical method
Sometimes heteroscedasticity in the error terms can easily be detected if they are plotted against another variable. If the error terms show a particular pattern, then heteroscedasticity might be present. The shape of the diagram may also suggest what form of heteroscedasticity is present in the error terms.
However, what we need is a formal statistical test which can be formulated and examined using a distribution and critical values. Most of the tests proposed to detect heteroscedasticity are based on finding a relationship between the variance of the error terms and an explanatory variable.
Goldfeld and Quandt test
Goldfeld and Quandt (1972) suggest that in a linear regression the variance of the error terms might be related to the explanatory variable; that is, in

Y_i = β1 + β2 X_i + ε_i

the error variance takes the form

E(ε_i²) = σ_i² = σ² X_i²
Therefore, if the above is true, then large (small) variances are associated with large (small) values of X. Goldfeld and Quandt (1972) therefore split the sample into two and investigate the equality of the variances of the error terms of the two subsamples, σ1² and σ2². This can be written as
can be written as
H0 :
2
1
2
2
aganist H1 :
2
1
2
2
Hence they suggest following steps to investigate the validity of the above relationship
1- Order all observations according to the values of Xi, starting with the lowest value
of Xi,
2- Choose c central observations; omitting these, divide the remaining sample into two subsamples with (n − c)/2 observations each,
3- Run two separate OLS regressions on the two subsamples, and compute the ESS for each regression, ESS1 and ESS2. Each of these ESS has (n − c)/2 − k degrees of freedom, where k is the number of estimated parameters (e.g. for the bivariate regression k = 2),
4- Compute the ratio of the ESSs as

λ = (ESS2 / df) / (ESS1 / df) ~ F(df, df)

where ESS2 comes from the subsample associated with the larger values of X,
5- Compare the Fobs with Fcrit to test the hypothesis that heteroscedasticity is present
or not.
One should note that this test largely depends on the sample size and the way c, the central
observation(s) is chosen.
Example: The following data set contains consumption (con) and income (inc) for 30 households, together with the same data ranked by income (con_r, inc_r).

inc   con   inc   con      inc_r  con_r   inc_r  con_r
80    55    180   115      80     55      180    115
100   65    225   140      85     70      185    130
85    70    200   120      90     75      190    135
110   80    240   145      100    65      200    120
120   79    185   130      105    74      205    140
115   84    220   152      110    80      210    144
130   98    210   144      115    84      220    152
140   95    245   175      120    79      225    140
125   90    260   180      125    90      230    137
90    75    190   135      130    98      240    145
105   74    205   140      140    95      245    175
160   110   265   178      145    108     250    189
150   113   270   191      150    113     260    180
165   125   230   137      160    110     265    178
145   108   250   189      165    125     270    191
Next we drop 4 middle observations and perform two separate regressions on the remaining
sub-samples.
Cons_i = β1 + β2 Inc_i + ε_i
Subsample 1 (13 observations with the lowest incomes):

Variable   Std. Error   t-Statistic   Prob.
C          8.704924     0.391667      0.7028
INC        0.074366     9.369531      0.0000

Mean dependent var 83.53846    S.D. dependent var 16.80087
Akaike info criterion 6.513306 Schwarz criterion 6.600221
F-statistic 87.78810           Prob(F-statistic) 0.000001
Subsample 2 (13 observations with the highest incomes):

Variable   Std. Error   t-Statistic   Prob.
C          30.64214     -0.914661     0.3800
INC        0.131582     6.035307      0.0001

Mean dependent var 155.8462    S.D. dependent var 23.49768
Akaike info criterion 7.918077 Schwarz criterion 8.004993
F-statistic 36.42493           Prob(F-statistic) 0.000085
λ = (1536.8 / 11) / (377.17 / 11) = 4.07 ~ F(11, 11)

F_crit(11, 11), 5% = 2.8179
Therefore, it can be concluded that there is significant heteroscedasticity in the above model
based on the Goldfeld-Quandt test.
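statsmodels provides a ready-made Goldfeld-Quandt test; the sketch below applies it to the ranked consumption-income data, dropping four central observations as in the manual calculation. The exact F value may differ slightly from 4.07 depending on how the package splits the sample.

```python
import numpy as np
import statsmodels.api as sm
from statsmodels.stats.diagnostic import het_goldfeldquandt

con_r = np.array([55, 70, 75, 65, 74, 80, 84, 79, 90, 98, 95, 108, 113, 110, 125,
                  115, 130, 135, 120, 140, 144, 152, 140, 137, 145, 175, 189, 180, 178, 191])
inc_r = np.array([80, 85, 90, 100, 105, 110, 115, 120, 125, 130, 140, 145, 150, 160, 165,
                  180, 185, 190, 200, 205, 210, 220, 225, 230, 240, 245, 250, 260, 265, 270])

X = sm.add_constant(inc_r)
f_stat, p_val, _ = het_goldfeldquandt(con_r, X, drop=4, alternative="increasing")
print(f_stat, p_val)   # a small p-value indicates error variance increasing with income
```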
White test
An alternative test for heteroscedasticity is proposed by White (1980). The main advantage of
this test is that unlike the Breusch-Pagan-Godfrey test does not depend on the normality of
the error terms, and unlike the Goldfeld-Quandt test does not require playing with the sample.
However, the principle of this test is very similar to the one for Breusch-Pagan-Godfrey. The
steps, which need to be taken in order to perform the White test for heteroscedasticity on the
following regression, are;
Y_i = β1 + β2 X_{2,i} + β3 X_{3,i} + ε_i

1- Estimate the regression by OLS and obtain the residuals ε̂_i,
2- Run the auxiliary regression

ε̂_i² = α1 + α2 X_{2,i} + α3 X_{3,i} + α4 X²_{2,i} + α5 X²_{3,i} + α6 X_{2,i} X_{3,i} + v_i

3- Compute nR² from the auxiliary regression, which is asymptotically distributed as chi-squared with degrees of freedom equal to the number of regressors in the auxiliary regression (excluding the constant):

nR² ~asy χ²(df)
Example: Consider again the consumption-income regression, now estimated over the full sample of 30 observations:

Cons_i = β1 + β2 Inc_i + ε_i

Variable   Std. Error   t-Statistic   Prob.
C          5.231386     1.775879      0.0866
INC        0.028617     22.28718      0.0000

Mean dependent var 119.7333    S.D. dependent var 39.06134
Akaike info criterion 7.336918 Schwarz criterion 7.430332
F-statistic 496.7183           Prob(F-statistic) 0.000000
Once residuals are obtained, we regress the squared residuals on income and squared income
as follows
ε̂_i² = α1 + α2 Inc_i + α3 Inc_i² + v_i
which yields the following results
Dependent Variable: RESID_MAINSQR
Method: Least Squares
Sample: 1 30
Included observations: 30

Variable    Coefficient   Std. Error   t-Statistic   Prob.
C           -12.29621     191.7731     -0.064119     0.9493
INCOME      0.197385      2.368760     0.083329      0.9342
INCOMESQR   0.001700      0.006707     0.253503      0.8018

R-squared            0.177697    Mean dependent var     78.70511
Adjusted R-squared   0.116785    S.D. dependent var     112.5823
S.E. of regression   105.8043    Akaike info criterion  12.25570
Sum squared resid    302252.7    Schwarz criterion      12.39582
Log likelihood       -180.8355   F-statistic            2.917301
Durbin-Watson stat   0.791307    Prob(F-statistic)      0.071274
The White test statistic is nR² = 30 × 0.1777 = 5.33, asymptotically distributed as chi-squared with 2 degrees of freedom (two variables in the auxiliary regression). The 5% critical value is 5.99. Since 5.33 < 5.99, we cannot reject the null of homoscedasticity; that is, based on the White test there is no significant evidence of heteroscedasticity at the 5% level.
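The White test is also available in statsmodels; a hedged sketch, reusing the con_r and inc_r arrays defined in the Goldfeld-Quandt sketch above, is:

```python
import statsmodels.api as sm
from statsmodels.stats.diagnostic import het_white

X = sm.add_constant(inc_r)                 # arrays from the previous sketch
ols_res = sm.OLS(con_r, X).fit()

lm_stat, lm_pval, f_stat, f_pval = het_white(ols_res.resid, X)
print(lm_stat, lm_pval)   # the nR-squared statistic; compare with the chi-squared(2) value 5.99
```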
ARCH test
Sometimes the squared residuals do not show a relationship with any explanatory variable; however, they may depend on their own past values. In other words, the squared residuals might be autocorrelated. This is a different form of heteroscedasticity, called Autoregressive Conditional Heteroscedasticity (ARCH). ARCH effects are quite common in time series analysis and in regressions where the variables are measured over time.

The first test for ARCH was proposed by Engle (1982), who suggests testing the squared residuals for serial dependence (autocorrelation) using the following regression

ε̂_t² = α0 + α1 ε̂²_{t−1} + α2 ε̂²_{t−2} + ... + αk ε̂²_{t−k} + v_t

The existence of ARCH can be tested through the significance of the coefficients of the lagged squared residuals using an F test or a Wald test. Therefore, the null of no ARCH is H0: α1 = α2 = ... = αk = 0.
Example:
We run the regression of consumption on income once again, but this time we use proper time subscripts, as income and consumption are measured over time:

Cons_t = β1 + β2 Inc_t + ε_t

Variable   Std. Error   t-Statistic   Prob.
C          5.231386     1.775879      0.0866
INC        0.028617     22.28718      0.0000

Mean dependent var 119.7333    S.D. dependent var 39.06134
Akaike info criterion 7.336918 Schwarz criterion 7.430332
F-statistic 496.7183           Prob(F-statistic) 0.000000
As usual we save residuals and square them for any tests of heteroscedasticity, and run the
following autoregression.
ε̂_t² = α0 + α1 ε̂²_{t−1} + ... + αk ε̂²_{t−k} + v_t
The result is

Dependent Variable: RESID_MAINSQR
Method: Least Squares
Date: 06/03/02  Time: 16:48
Sample (adjusted): 4 30
Included observations: 27 after adjusting endpoints
Convergence achieved after 3 iterations

Variable   Coefficient   Std. Error   t-Statistic   Prob.
C          82.75053      32.33636     2.559055      0.0175
AR(1)      0.931161      0.211672     4.399066      0.0002
AR(2)      -0.436685     0.305868     -1.427689     0.1668
AR(3)      -0.066562     0.271402     -0.245254     0.8084

R-squared            0.492564    Mean dependent var     82.42946
Adjusted R-squared   0.426377    S.D. dependent var     118.1802
S.E. of regression   89.50718    Akaike info criterion  11.96247
Sum squared resid    184265.3    Schwarz criterion      12.15444
Log likelihood       -157.4933   F-statistic            7.441975
Durbin-Watson stat   1.978088    Prob(F-statistic)      0.001179
The value of the F test and its significance shown in the table suggest that there is an ARCH effect in the consumption-income model. The coefficients of the auxiliary regression and their significance levels suggest that the ARCH effect is of order one or two. The ARCH test can be performed directly from the EViews regression menu by going to View/Residual Tests/ARCH LM Test; you will notice that the result is the same.
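A sketch of the same ARCH LM test with statsmodels (a recent version is assumed, and the residuals are taken from the OLS sketch above):

```python
from statsmodels.stats.diagnostic import het_arch

lm_stat, lm_pval, f_stat, f_pval = het_arch(ols_res.resid, nlags=3)   # ols_res from the sketch above
print(lm_stat, lm_pval, f_stat, f_pval)   # small p-values point to ARCH effects
```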
2.2.3 What to do in the presence of heteroscedasticity
So far we have explained the effects of heteroscedasticity on the estimated parameters and the different tests that can be used to detect it. But what can we do about it? In what follows we discuss a few methods that can be used as remedial measures to overcome the problems associated with models where heteroscedasticity is present.
The method of Generalised Least Squares
The methods of Generalised Least Squares (GLS) and Weighted Least Squares (WLS) are used to estimate a regression when the error terms are heteroscedastic. In this approach, we divide the variables in the regression by the independent variable that causes the heteroscedasticity. For instance, in the following model
Y_i = β1 + β2 X_{2,i} + ... + βk X_{k,i} + ε_i,   ε_i ~ NID(0, σ_i²)

suppose

var(ε_i) = σ_i² = C X²_{2,i}

where C is a constant which relates the variance to the independent variable X²_{2,i}. Using a procedure similar to the one followed in the case of a known variance, we redefine the variables as

Y*_i = Y_i / X_{2,i},   X*_{2,i} = X_{2,i} / X_{2,i} = 1,   ...,   X*_{k,i} = X_{k,i} / X_{2,i}   and   ε*_i = ε_i / X_{2,i} = v_i,   v_i ~ NID(0, C)

so that the transformed regression becomes

Y*_i = β1 (1 / X_{2,i}) + β2 + β3 X*_{3,i} + ... + βk X*_{k,i} + ε*_i,   ε*_i ~ NID(0, C)

It can be seen that the parameters of this equation should be interpreted with care when we need to explain the relationships in the original regression: the variance of ε*_i is constant, β2 is now the intercept of the transformed regression, and β1 measures the effect of (1 / X_{2,i}) on Y*_i.
Note that when σ_i² is known to the researcher, i.e. the form of the heteroscedasticity is clear, the researcher should take it into account when estimating the model. This can easily be done using the method known as Weighted Least Squares (WLS), in which the regression is divided through by σ_i, the square root of the known variance σ_i².
Example: Housing expenditure (Pindyck and Rubinfeld 6.1)
In this cross section study the relationship between housing expenditures and annual incomes
of four groups of families is investigated.
Dependent Variable: EXPENDITURE
Method: Least Squares
Sample: 1 20
Included observations: 20

Variable   Coefficient   Std. Error   t-Statistic   Prob.
C          0.890000      0.204312     4.356086      0.0004
INCOME     0.237200      0.014921     15.89724      0.0000

R-squared            0.933511    Mean dependent var     3.855000
Adjusted R-squared   0.929817    S.D. dependent var     1.408050
S.E. of regression   0.373021    Akaike info criterion  0.960274
Sum squared resid    2.504600    Schwarz criterion      1.059847
Log likelihood       -7.602738   F-statistic            252.7223
Durbin-Watson stat   1.363966    Prob(F-statistic)      0.000000
And when we test for heteroscedasticity using the Eviews Menu we get
White Heteroskedasticity Test:
F-statistic     5.979575   Probability   0.010805
Obs*R-squared   8.259324   Probability   0.016088
This means the residuals are heteroscedastic. Now, assuming the squared residuals are related to the income level, we can weight the variables as described above and rerun the regression in the transformed form. The transformed regression gives the following summary results (both coefficients are significant, with Prob. 0.0000):

Mean dependent var 0.327917      S.D. dependent var 0.051375
Akaike info criterion -4.400404  Schwarz criterion -4.300831
F-statistic 58.72056             Prob(F-statistic) 0.000000

In the above regression the coefficient corresponding to income is in fact the constant term, 0.249, which is quite similar to the 0.237 obtained in the original regression. Also note that you can implement the correction directly in EViews by choosing the WLS option in the regression menu; this will give you the following result.
Dependent Variable: EXPENDITURE
Method: Least Squares
Sample: 1 20
Included observations: 20
Weighting series: INCOME

Variable   Coefficient   Std. Error   t-Statistic   Prob.
C          1.131742      0.440283     2.570486      0.0193
INCOME     0.221935      0.025634     8.657760      0.0000

Weighted Statistics
R-squared            0.974215    Mean dependent var     4.448000
Adjusted R-squared   0.972782    S.D. dependent var     3.158963
S.E. of regression   0.521160    Akaike info criterion  1.629122
Sum squared resid    4.888948    Schwarz criterion      1.728695
Log likelihood       -14.29122   F-statistic            680.0714
Durbin-Watson stat   1.217059    Prob(F-statistic)      0.000000

Unweighted Statistics
R-squared            0.928268    Mean dependent var     3.855000
Adjusted R-squared   0.924283    S.D. dependent var     1.408050
S.E. of regression   0.387450    Sum squared resid      2.702117
Durbin-Watson stat   1.143174
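In Python the same weighted estimation can be sketched with statsmodels WLS, assuming the error standard deviation is proportional to income; the DataFrame and column names below are assumptions, since the housing data themselves are not reproduced here.

```python
import statsmodels.formula.api as smf

# Assumed: DataFrame 'housing' with columns EXPENDITURE and INCOME
wls_res = smf.wls("EXPENDITURE ~ INCOME", data=housing,
                  weights=1.0 / housing["INCOME"] ** 2).fit()
print(wls_res.summary())   # compare the INCOME coefficient with the OLS estimate of 0.2372
```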
An alternative to re-estimating the model is to keep the OLS coefficient estimates but correct their standard errors. White (1980) derived a heteroscedasticity-consistent estimator of the covariance matrix of the OLS estimates,

Σ̂_W = [T / (T − k)] (X'X)⁻¹ ( Σ_{t=1}^{T} ε̂_t² x_t x_t' ) (X'X)⁻¹

where T is the number of observations, k is the number of regressors, and ε̂_t is the least squares residual.
The above expenditure and income model is re-estimated again, but this time we use the
White Heteroscedasticity Corrected covariance to get the following result
Dependent Variable: EXPENDITURE
Method: Least Squares
Sample: 1 20
Included observations: 20
White Heteroskedasticity-Consistent Standard Errors & Covariance

Variable   Coefficient   Std. Error   t-Statistic   Prob.
C          0.890000      0.157499     5.650847      0.0000
INCOME     0.237200      0.016710     14.19495      0.0000

R-squared            0.933511    Mean dependent var     3.855000
Adjusted R-squared   0.929817    S.D. dependent var     1.408050
S.E. of regression   0.373021    Akaike info criterion  0.960274
Sum squared resid    2.504600    Schwarz criterion      1.059847
Log likelihood       -7.602738   F-statistic            252.7223
Durbin-Watson stat   1.363966    Prob(F-statistic)      0.000000
correlated. Alternatively, one can plot the residuals against time and see whether a consistent pattern exists.
[Figure: Residuals with serial correlation — residuals of the R3 regression, 1960 to 1995]
[Figure: Residuals of the RAACB regression, December 1999 to February 2001]
Consider the regression

Y_t = β1 + β2 X_{2,t} + ... + βk X_{k,t} + ε_t,   ε_t ~ (0, σ²)

with

ε_t = ρ ε_{t−1} + v_t

where v_t is a white noise error term. If serial correlation of the first order is present then ρ ≠ 0, and if it is not, ρ = 0.
The Durbin-Watson test involves calculating a test statistic based on the residuals from the OLS regression, defined as

DW = Σ_{t=2}^{T} (ε̂_t − ε̂_{t−1})² / Σ_{t=1}^{T} ε̂_t²

which can be shown to be approximately

DW ≈ 2(1 − ρ̂)

In summary:

Value of DW                  Decision
4 − d_L < DW < 4             Reject H0: evidence of negative first-order autocorrelation
4 − d_U < DW < 4 − d_L       Inconclusive
2 < DW < 4 − d_U             Do not reject H0 (no evidence of negative autocorrelation)
d_U < DW < 2                 Do not reject H0 (no evidence of positive autocorrelation)
d_L < DW < d_U               Inconclusive
0 < DW < d_L                 Reject H0: evidence of positive first-order autocorrelation
Example: Consider again the interest rate regression, where

GM2_t = (M2_t − M2_{t−1}) / M2_{t−1}   and   GPW_t = (PW_t − PW_{t−1}) / PW_{t−1}

IR_t = f(IP_t, GM2_t, GPW_{t−1})

IR_t = β1 + β2 IP_t + β3 GM2_t + β4 GPW_{t−1} + ε_t,   ε_t ~ N(0, σ²)
Dependent Variable: R3
Method: Least Squares
Sample: 1960:01 1995:08
Included observations: 428

Variable   Coefficient   Std. Error   t-Statistic   Prob.
C          1.214078      0.551692     2.200644      0.0283
IP         0.048353      0.005503     8.786636      0.0000
GM2        140.3261      36.03850     3.893784      0.0001
GPW(-1)    104.5884      17.44218     5.996295      0.0000

R-squared            0.216361    Mean dependent var     6.145764
Adjusted R-squared   0.210816    S.D. dependent var     2.792815
S.E. of regression   2.481026    Akaike info criterion  4.664523
Sum squared resid    2609.927    Schwarz criterion      4.702459
Log likelihood       -994.2079   F-statistic            39.02177
Durbin-Watson stat   0.183733    Prob(F-statistic)      0.000000
It can be seen that the value of the DW statistic is quite low (0.183733). This suggests that the error terms are positively correlated. We can see this graphically by plotting the residuals against time: when the residuals are positive they tend to remain positive, and when they are negative they tend to remain negative.
[Figure: Residuals of the R3 regression plotted over time, 1960 to 1995]
This means that the estimates should be corrected for the presence of serial correlation; otherwise not only will inferences be invalid, but there might also be problems with using this model for forecasting.
Ljung-Box test for serial correlation (autocorrelation)
Ljung and Box (1978) propose a test for autocorrelation, which is defined as

LB = T(T + 2) Σ_{k=1}^{p} ρ̂_k² / (T − k) ~ χ²_p
In large samples the LB statistic follows the chi-squared distribution with p degrees of freedom. The LB test is sometimes referred to as the LB-Q statistic rather than LB, and it should not be confused with the Box-Pierce Q statistic. The LB test can be performed on the residuals of a regression directly in EViews.
Consider again the residuals from the interest rate regression estimated above (DW = 0.1837).
The LB-Q test can be performed directly in EViews from the regression menu via View/Residual Tests/Correlogram-Q-statistics, which gives
Sample: 1960:01 1995:08
Included observations: 428

Lag   AC      PAC      Q-Stat    Prob
1     0.907   0.907    354.32    0.000
2     0.856   0.188    670.53    0.000
3     0.818   0.100    959.96    0.000
4     0.773   -0.013   1219.2    0.000
5     0.760   0.168    1470.8    0.000
6     0.732   -0.017   1704.5    0.000
7     0.712   0.048    1925.9    0.000
8     0.713   0.134    2148.5    0.000
9     0.700   0.022    2363.7    0.000
10    0.661   -0.164   2556.0    0.000
11    0.637   0.016    2735.1    0.000
12    0.617   0.047    2903.7    0.000

For example, the Q statistic at lag 2 can be calculated as

LB-Q(2) = T(T + 2) Σ_{k=1}^{2} ρ̂_k² / (T − k) = 428(428 + 2) [ (0.907)²/427 + (0.856)²/426 ] = 184040 (0.0019266 + 0.0017200) = 670.53 ~ χ²_2

which is far greater than the 5% critical value of 5.99, so the null of no autocorrelation up to lag 2 is strongly rejected.
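Both the Durbin-Watson statistic and the Ljung-Box Q statistics are available directly in Python; in the sketch below `resid` stands for the residuals of the interest rate regression and is an assumption.

```python
from statsmodels.stats.stattools import durbin_watson
from statsmodels.stats.diagnostic import acorr_ljungbox

print(durbin_watson(resid))                  # close to 0 under strong positive autocorrelation
print(acorr_ljungbox(resid, lags=[2, 12]))   # Q statistics and p-values at lags 2 and 12
```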
Recall the White heteroscedasticity-consistent covariance estimator,

Σ̂_W = [T / (T − k)] (X'X)⁻¹ ( Σ_{t=1}^{T} ε̂_t² x_t x_t' ) (X'X)⁻¹

Newey and West (1987) have proposed a more general covariance matrix estimator that is consistent in the presence of both heteroskedasticity and autocorrelation of unknown form. The Newey-West estimator is given by

Σ̂_NW = [T / (T − k)] (X'X)⁻¹ Ω̂ (X'X)⁻¹

where

Ω̂ = Σ_{t=1}^{T} ε̂_t² x_t x_t' + Σ_{v=1}^{p} [1 − v/(p + 1)] Σ_{t=v+1}^{T} ( x_t ε̂_t ε̂_{t−v} x'_{t−v} + x_{t−v} ε̂_{t−v} ε̂_t x'_t )
In most econometric packages nowadays, the Newey-West correction for serial correlation and heteroscedasticity can be applied automatically. The variance-covariance matrix estimated using Newey-West is called the heteroscedasticity and autocorrelation consistent (HAC) covariance matrix.
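A sketch of HAC (Newey-West) standard errors in statsmodels, assuming `y` holds the interest rate series and `X` a constant, IP, GM2 and lagged GPW:

```python
import statsmodels.api as sm

ols_hac = sm.OLS(y, X).fit(cov_type="HAC", cov_kwds={"maxlags": 5})
print(ols_hac.summary())   # identical coefficients to plain OLS, Newey-West standard errors
```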
The interest rate regression and its OLS estimates are reproduced below for comparison:

IR_t = β1 + β2 IP_t + β3 GM2_t + β4 GPW_{t−1} + ε_t,   ε_t ~ N(0, σ²)
Dependent Variable: R3
Method: Least Squares
Sample: 1960:01 1995:08, Included observations: 428

Variable   Coefficient   Std. Error   t-Statistic   Prob.
C          1.214078      0.551692     2.200644      0.0283
IP         0.048353      0.005503     8.786636      0.0000
GM2        140.3261      36.03850     3.893784      0.0001
GPW(-1)    104.5884      17.44218     5.996295      0.0000

R-squared            0.216361    Mean dependent var     6.145764
Adjusted R-squared   0.210816    S.D. dependent var     2.792815
S.E. of regression   2.481026    Akaike info criterion  4.664523
Sum squared resid    2609.927    Schwarz criterion      4.702459
Log likelihood       -994.2079   F-statistic            39.02177
Durbin-Watson stat   0.183733    Prob(F-statistic)      0.000000
Re-estimating the same equation with Newey-West HAC standard errors leaves the coefficients, R-squared and the other summary statistics unchanged, but the reported p-values become:

Variable   Prob.
C          0.0862
IP         0.0000
GM2        0.0134
GPW(-1)    0.0003
It can be seen that while the coefficients remain the same as before, the standard errors are now corrected for serial correlation using Newey-West, and the lag truncation used is 5, as indicated in the output.
Variable   Coefficient   OLS Std. Error
C          1.214078      0.551692
IP         0.048353      0.005503
GM2        140.3261      36.03850
GPW(-1)    104.5884      17.44218
The Newey-West corrected standard errors are larger than the uncorrected standard errors in every case, as expected. This means that if the model is not corrected for the presence of serial correlation, the standard errors will be smaller than the true standard errors, and any hypothesis test based on them will be invalid. This also applies to hypothesis tests such as the F test and the Wald test, where the variance-covariance matrix of the coefficients is used to construct the test.
2.4 Normality
The assumption of normality is needed for inference in regression models. Under the assumption of normality the ML estimators are equivalent to the OLS estimators. It is therefore important that this assumption is tested. Different tests for normality of the residuals have been proposed in the literature, among them the Jarque-Bera, Shapiro-Wilk and Kolmogorov-Smirnov tests. These tests are all based on measuring the departure of the residuals from normality using the 3rd and 4th moments of the residuals.
2.4.1 Jarque Bera test
To test normality of a variable (error terms) we can use the test proposed by Jarque and Bera
(1982). Jarque-Bera is a test statistic for testing whether the series is normally distributed.
The test statistic measures the difference of the skewness and kurtosis of the series with those
from the normal distribution. The statistic is computed as:
JB = [(n − k)/6] [ SK² + (KU − 3)²/4 ]

where SK is the skewness, KU is the kurtosis, and k represents the number of estimated coefficients used to create the series.
Under the null hypothesis of a normal distribution, the Jarque-Bera statistic is distributed as chi-squared with 2 degrees of freedom. The reported probability is the probability that a Jarque-Bera statistic exceeds (in absolute value) the observed value under the null: a small probability value leads to rejection of the null hypothesis of a normal distribution.
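A minimal sketch of the Jarque-Bera test on a set of residuals (the `resid` array is an assumption):

```python
from statsmodels.stats.stattools import jarque_bera

jb_stat, jb_pval, skew, kurt = jarque_bera(resid)
print(jb_stat, jb_pval)   # a small p-value rejects the null of normally distributed residuals
```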
CHAPTER 3
3.3 Forecasting methods

Forecasting approaches can broadly be grouped into:
1- Judgmental forecasting
2- Extrapolation or simple time series methods
3- Econometric or causal methods
4- Combination forecasting
We start with time series models. These are simple quantitative extrapolation methods that use the past data of a time series, and sometimes a time trend, to forecast future values of the variable. The idea is that the variable follows some pattern; we try to determine the pattern and project it into the future. There are many extrapolation methods, and here we focus on only a few important and widely used ones.
As mentioned, in time series forecasting we try to determine the existing pattern in the historical data. This can be done in a number of ways. First, we can plot the series and investigate its graphical pattern. Second, we can use statistical analysis, such as correlation coefficients, to examine the dependence of the variable on its past values. We can also check whether there are any seasonal patterns in the time series, model such seasonal movements and then forecast. Let us look at some time series.
Here ε_t ~ iid(0, σ²) is an identically and independently distributed error term with zero mean and constant variance, which is also called white noise. Two simple models of this kind are

Y_t = Y_{t−1} + ε_t,   ε_t ~ iid(0, σ²)          (random walk)

Y_t = α + Y_{t−1} + ε_t,   ε_t ~ iid(0, σ²)      (random walk with drift)
As an example consider the time series of Dow Jones index over the period January 1988 to
March 1992 on a monthly basis.
A time series plot of the variable shows that it seems to be wandering around a constant upward trend. However, this is not enough to show that the Dow Jones Index follows a random walk process. We must assess the autocorrelation function of this variable to gain more insight into whether the Dow Jones is a RW process. It can be seen that the coefficients of autocorrelation are all close to one (0.912, 0.8161 and so on) and significant. This can be regarded as an indication that this variable is close to being a RW process. Furthermore, the correlogram of the series (the plot of the coefficients of autocorrelation) gives a graphical assessment of the nature of the autocorrelation of the series. In this case, it can be seen that the coefficients of autocorrelation die out very slowly, meaning that the series has a long memory.
[Figure: Dow Jones Index, January 1988 to March 1992]
[Figure: Correlogram of the Dow Jones Index (lags 1 to 12)]
On the other hand, when we study the autocorrelation function of the first differences of the Dow Jones index over the sample period, we find that the differenced series fluctuates around a constant mean, which indicates that it is highly mean reverting. Also, looking at the autocorrelation function and correlogram of the returns, we note that the series is not autocorrelated; that is, the observations are independent of each other. This in turn suggests that the return on the Dow Jones index is a random variable, while the Dow Jones index level is a random walk process.
[Figure: First differences of the Dow Jones Index]
[Figure: Correlogram of the first differences of the Dow Jones Index (lags 1 to 12)]
[Figure: Quarterly Reebok sales]
It can be seen that the series is increasing around a constant (linear) trend. Therefore, one can
use a constant trend to model the series. This will result in the following model.
Y_t = α + βt + ε_t,   ε_t ~ iid(0, σ²)

Estimating this linear trend model for the Reebok sales series gives

Y_t = 244.8154 + 16.5304 t
      (8.6206)   (14.3665)

Multiple R 0.9152,  R-squared 0.8377,  Standard error of estimate 90.3844

ANOVA            df    SS              MS              F          p-value
Explained        1     1686121.5545    1686121.5545    206.3964   0.0000
Residual         40    326773.3761     8169.3344

Regression coefficients:
            Coefficient   Std Err    t-value    p-value
Constant    244.8154      28.3989    8.6206     0.0000
Time        16.5304       1.1506     14.3665    0.0000
The fitted values of the linear trend model can be seen in the following figure.
[Figure: Actual and fitted values of the linear trend model for Reebok sales]
Now if we need to forecast quarterly sales for Reebok, we simply project the trend values
into future and hope that they give us a good indication of future Reebok sales.
[Figure: Actual, fitted and forecast values of the linear trend model for Reebok sales]
A linear trend is not always appropriate; when a series grows by a roughly constant percentage each period, an exponential trend model can be used:

Y_t = α e^{βt} u_t,   or, in logs,   ln Y_t = ln α + βt + ε_t,   ε_t ~ iid(0, σ²)

This model suggests that the value of the variable Y_t increases by a certain percentage, approximately 100β%, every period. As an example, consider the quarterly sales data for the computer chip manufacturing firm Intel for the period 1986 to 1996.
[Figure: Quarterly Intel sales, 1986 to 1996]
An autoregressive model relates the current value of the series to its own past values. In the first-order case the series is written as

Y_t = α + φ1 Y_{t−1} + ε_t   or   Y_t = φ1 Y_{t−1} + ε_t

where |φ1| < 1. Note that if φ1 = 1, then we have a random walk series.
Similarly, the order of the autocorrelation may be higher than one. In this case the series can be written as

Y_t = α + φ1 Y_{t−1} + φ2 Y_{t−2} + ... + φp Y_{t−p} + ε_t = α + Σ_{i=1}^{p} φ_i Y_{t−i} + ε_t
As an example consider the monthly stereo sales (in $000) between January 1995 and
December 1998. Using Statpro we produced the autocorrelation function and correlogram for
the series.
[Figure: Autocorrelation function and correlogram of monthly stereo sales, January 1995 to December 1998]
Since the autocorrelation analysis shows that stereo sales in consecutive periods are dependent on each other, we can regress current sales, Y_t, on a constant, α, and one-period lagged sales, Y_{t−1}, to obtain an autoregressive model for sales in the following form

Y_t = α + φ1 Y_{t−1} + ε_t

Estimating the above model using StatPro or EViews results in the following output.
It can be seen that the coefficient of one-period lagged sales is estimated as 0.3495 and it is significant. Therefore, we can write the time series model for stereo sales as

Y_t = 117.8574 + 0.3495 Y_{t−1}
      (4.6533)   (2.5606)
In the next step we can use the above autoregressive model, known as an AR(1) model, to forecast future stereo sales recursively in the following form

Ŷ_{t+1} = 117.8574 + 0.3495 Y_t
Ŷ_{t+2} = 117.8574 + 0.3495 Ŷ_{t+1}
Ŷ_{t+3} = 117.8574 + 0.3495 Ŷ_{t+2}
...
Ŷ_{t+n} = 117.8574 + 0.3495 Ŷ_{t+n−1}
We should note that in order to forecast Y_{t+n} we need to know the value of observation t+n−1, that is Y_{t+n−1}. This means that, in order to forecast Y_{t+n}, we first need to forecast Y_{t+n−1} and then obtain the forecast for Y_{t+n}. Therefore, to forecast Y_{t+n} we need all the values Y_{t+1} to Y_{t+n−1}. If we forecast Y_{t+n} using the forecast values of Y_{t+1} to Y_{t+n−1}, then we have a dynamic forecast for Y_{t+n}. Alternatively, we can forecast sales one period at a time, wait for the actual sales value to be realised, and then forecast one period ahead again. This method is called a static forecast. We will talk more about this later.
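The recursion above is easy to reproduce; the sketch below uses the estimated AR(1) parameters, with a purely hypothetical value for the last observed sale.

```python
alpha, phi = 117.8574, 0.3495
y_prev = 180.0                      # hypothetical last observed sales value

forecasts = []
for _ in range(6):                  # dynamic forecasts: feed each forecast back in
    y_prev = alpha + phi * y_prev
    forecasts.append(round(y_prev, 2))
print(forecasts)                    # converges towards the mean alpha / (1 - phi), about 181.2
```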
[Figure: Actual and forecast stereo sales, January 1995 to October 1999]

3.4 Moving averages
This is a widely used and simple alternative method for forecasting time series, based on the principle of using the average of past values as the prediction of future values. To implement this methodology, we first choose a span, which is the number of lagged periods used in averaging. Therefore, assuming a span of p periods, the forecast of Y for time t+1, Ŷ_{t+1}, is determined as the average of Y_{t−p+1} to Y_t:

Ŷ_{t+1} = (Y_{t−p+1} + ... + Y_t) / p
Ŷ_{t+2} = (Y_{t−p+2} + ... + Y_{t+1}) / p
...
Ŷ_{t+k} = (Y_{t−p+k} + ... + Y_{t+k−1}) / p
Now, if we only have observations up to time t, the forecast of Y_{t+2} should be based on the forecast value of Y_{t+1}, Ŷ^f_{t+1}. Therefore,

Ŷ_{t+2} = (Y_{t−p+2} + ... + Y_t + Ŷ^f_{t+1}) / p
Ŷ_{t+3} = (Y_{t−p+3} + ... + Y_t + Ŷ^f_{t+1} + Ŷ^f_{t+2}) / p

and so on.
Since this method takes the average of the observations over a period, by construction it makes the resulting series smoother than the original series; in fact, this is why the method is called smoothing. Also note that as the span increases, the series becomes smoother. Therefore, if fluctuations in the series are not purely random but part of the underlying pattern, applying a large span reduces the predictive power. Similarly, if the series is not very volatile and changes are mainly random, then a larger span can be used.
Let us look at such forecasts for the Dow index using spans of 3 and 12 periods. The following figures illustrate the actual, fitted and forecast values of the Dow index. It can be noticed that the MA(3) method follows the original series more closely than MA(12); in fact MA(12) over-smooths the series.
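A moving-average forecast of this kind is a one-liner with pandas; `dow` below is assumed to be a Series of monthly Dow index levels.

```python
import pandas as pd

ma3 = dow.rolling(window=3).mean()          # fitted MA(3) series
ma12 = dow.rolling(window=12).mean()        # fitted MA(12) series

# one-step-ahead forecasts: the average of the last 3 (or 12) observations
print(dow.tail(3).mean(), dow.tail(12).mean())
```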
[Figure: MA(3) forecasts for the Dow Index, 1988 to 1993]
[Figure: MA(12) forecasts for the Dow Index, 1988 to 1993]
In many time series, more recent observations carry more information about future values than observations farther in the past. Therefore, taking this characteristic into account, we need to allocate different weights, perhaps exponentially declining weights, when we calculate our moving averages. There are different methods of applying the EWMA (exponential smoothing) idea.
The simplest exponentially weighted forecast is

Ŷ_{t+1} = aY_t + a(1 − a)Y_{t−1} + a(1 − a)²Y_{t−2} + a(1 − a)³Y_{t−3} + ...

where Ŷ_{t+1} is the forecast of Y for the next period based on the information available at time t, that is, the historical values of Y (Y_t, Y_{t−1}, Y_{t−2}, ...). Also note that 0 < a < 1.

It is not difficult to show that the above model can be written simply as

Ŷ_{t+1} = aY_t + (1 − a)Ŷ_t

If a is chosen to be close to 1, the model resembles a RW model; that is, a strong weight is given to very recent values of Y, while small values of a imply that the weights decay very slowly. Let us see how exponential smoothing can be done in EViews.
Consider the Coca-Cola quarterly sales series over the period 1986:1 and 1996:2.
[Figure: Quarterly Coca-Cola sales, 1986Q1 to 1996Q2]
We apply the simple exponential smoothing model to this data set using an estimation period of 1986:1 to 1994:2. Choosing the first option (single smoothing) and allowing the program to choose the best estimate of a, we obtain a new series called SALESSM, which is the fitted series. Plotting the actual and fitted series gives the following graph.
[Figure: Actual Coca-Cola sales (SALES) and single exponential smoothing fit (SALESSM), 1986 to 1999]
Note that for periods after the last observation, i.e. forecasts beyond the end of the sample, the procedure yields the same forecast for all future horizons.
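The Python equivalent of EViews single smoothing is SimpleExpSmoothing; `sales` is assumed to be a quarterly pandas Series of Coca-Cola sales.

```python
from statsmodels.tsa.holtwinters import SimpleExpSmoothing

fit = SimpleExpSmoothing(sales).fit()     # the smoothing parameter is chosen by the optimiser
print(fit.params["smoothing_level"])      # estimated value of a
print(fit.forecast(8))                    # the same (flat) forecast for every future quarter
```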
Holt's method extends simple exponential smoothing with a trend component:

L_t = aY_t + (1 − a)(L_{t−1} + T_{t−1})
T_t = β(L_t − L_{t−1}) + (1 − β)T_{t−1}
Ŷ_{t+k} = L_t + kT_t
Let us see how this works in Excel on the Coca-Cola data, and then we perform the same model in EViews. Note that the formulas in columns G, H and I are simply the above set of formulas. The only issue is the initial values for L1 and T1: the former can simply be set to Y1, while the latter can be estimated using Solver to minimise one of the given criteria, MAE, RMSE or MAPE.
[Figure: Actual and forecast Coca-Cola sales from the Excel implementation of Holt's method]
Now, performing the same Holt's procedure (exponential smoothing with trend) in EViews yields the following plot of actual and fitted sales for Coca-Cola.
[Figure: Actual Coca-Cola sales (SALES) and Holt's method fit (SALESSM), 1986 to 1996]
Once the parameters of Holt's procedure are estimated, it is not difficult to perform a recursive forecast for the period 1996Q3 to 1999Q4 in Excel or EViews.
[Figure: Actual and forecast Coca-Cola sales from Holt's method, with forecasts for 1996Q3 to 1999Q4]

3.5 Seasonality
As we have seen earlier, time series sometimes show a regular pattern over the calendar. This type of regularity in a time series is known as seasonality. Seasonality might be detected in data of any frequency, e.g. semi-annual, quarterly or monthly data, or even in daily data when we consider days of the week, or in intra-day data when we consider hours of the day as regular periods.
Seasonality in a time series can be captured using seasonal dummy variables. These dummy variable models can then be used on their own, or within other more detailed econometric models, to produce forecasts for the time series.
For example, consider the following data, which represent the quarterly sales of Coca-Cola over the period Q1 1986 to Q2 1996. The plot of the sales values indicates a distinct regular movement in the series over the sample period. Also, the graph shows that there is an upward trend in the sales values.
[Figure: Time series chart of quarterly Coca-Cola sales, 1986Q1 to 1996Q2]
There are two methods that can be used to forecast seasonal series. These are:
1- The Holt-Winter exponential smoothing method
2- The seasonal dummy regression model

3.5.1 Holt-Winter model for seasonal time series
The Holt-Winter model for seasonal time series is a little more complex than Holt's model, in that it uses three smoothing parameters, a, β and γ, rather than two. Also, seasonality can be treated in the Holt-Winter method as additive or multiplicative.
The difference between the two treatments is the way seasonal changes are applied to the mean of the series. Suppose we are dealing with a monthly seasonal series. The mean of the series (or the mean for the base season) is 100, and the analysis shows that the seasonal change in March is +30 and in June is -20. In the additive treatment the changes are added to the mean of the series (or the mean of the base season), so the figures for March and June will be 130 and 80, respectively. If, on the other hand, the treatment is multiplicative, then assuming that the seasonal factors for March and June are 1.3 and 0.8, the levels of the series for March and June will be 100×1.3 = 130 and 100×0.8 = 80, respectively.
The additive Holt-Winter exponential smoothing method uses the following set of equations to forecast a seasonal series:

L_t = a(Y_t − S_{t−M}) + (1 − a)(L_{t−1} + T_{t−1})
T_t = β(L_t − L_{t−1}) + (1 − β)T_{t−1}
S_t = γ(Y_t − L_t) + (1 − γ)S_{t−M}
Ŷ_{t+k} = L_t + kT_t + S_{t+k−M}

where L_t is the permanent component (intercept), T_t is the trend component, S_t is the additive seasonal component, and M is the number of seasons (quarterly = 4, monthly = 12, etc.).
The multiplicative Holt-Winter exponential smoothing method uses the following set of equations to forecast a seasonal series:

L_t = a(Y_t / S_{t−M}) + (1 − a)(L_{t−1} + T_{t−1})
T_t = β(L_t − L_{t−1}) + (1 − β)T_{t−1}
S_t = γ(Y_t / L_t) + (1 − γ)S_{t−M}
Ŷ_{t+k} = (L_t + kT_t) S_{t+k−M}
Let us see how this method works in EViews (note that you can also perform it easily using StatPro). We use the same data series, that is, Coca-Cola sales. Using a = 0.2, β = 0.2 and γ = 0.2 as smoothing parameters, we get the following results. Notice that these parameters can be changed to optimise some criterion such as the RMSE; the program can do this automatically, but it usually minimises the criterion in such a way that one or two of the parameters become zero.
Sample: 1986:1 1996:2
Included observations: 42
Method: Holt-Winters Additive Seasonal
Original Series: SALES
Forecast Series: SALESSM_AD

Parameters:   Alpha 0.2000   Beta 0.2000   Gamma 0.2000
Sum of Squared Residuals   1947573.
Root Mean Squared Error    215.3388

End of Period Levels:   Mean 4841.597   Trend 112.4040
Seasonals:   1995:3   211.0728
             1995:4   -190.6499
             1996:1   -354.7695
             1996:2   334.3466
We can see from the results that the seasonal components are approximately Q1 = -354, Q2 = 334, Q3 = 211 and Q4 = -190. We can also plot the actual and fitted values of the Holt-Winter model with additive seasonal components.
Estimating the same model with multiplicative seasonal components gives

Parameters:   Alpha 0.2000   Beta 0.2000   Gamma 0.2000
Sum of Squared Residuals   1539869.
Root Mean Squared Error    191.4773

End of Period Levels:   Mean 4837.382   Trend 111.8119
Seasonals:   1995:3   1.063009
             1995:4   0.950468
             1996:1   0.887070
             1996:2   1.099454
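The Holt-Winter calculations above can be sketched with statsmodels (a recent version is assumed); `sales` again stands for the quarterly Coca-Cola series with a proper DatetimeIndex.

```python
from statsmodels.tsa.holtwinters import ExponentialSmoothing

hw_add = ExponentialSmoothing(sales, trend="add", seasonal="add",
                              seasonal_periods=4).fit(smoothing_level=0.2,
                                                      smoothing_trend=0.2,
                                                      smoothing_seasonal=0.2)
print(hw_add.forecast(14))   # forecasts for 1996Q3 to 1999Q4

# multiplicative seasonality: replace seasonal="add" with seasonal="mul"
```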
3.5.2 Seasonal dummy regression model

The alternative is to regress the series on a time trend and a set of seasonal dummies:

Y_t = β1 + β2 Trend_t + β3 S_{2,t} + β4 S_{3,t} + β5 S_{4,t} + ε_t,   ε_t ~ iid(0, σ²)

Note that we use the dummies S_{2,t}, S_{3,t} and S_{4,t} and we drop S_{1,t}; this is to avoid perfect multicollinearity. For a monthly series, we would use dummies S_{2,t} to S_{12,t}.
Dependent Variable: SALES
Method: Least Squares
Sample: 1986:1 1996:2
Included observations: 42

Variable   Coefficient   Std. Error   t-Statistic   Prob.
C          1169.499      193.6863     6.038108      0.0000
TREND      72.97756      7.590032     9.614922      0.0000
S2         639.1542      77.45190     8.252272      0.0000
S3         505.0406      100.2065     5.039997      0.0000
S4         172.9101      73.16934     2.363149      0.0235

R-squared            0.916479    Mean dependent var     2994.353
Adjusted R-squared   0.907450    S.D. dependent var     977.9309
S.E. of regression   297.5065    Akaike info criterion  14.34009
Sum squared resid    3274874.    Schwarz criterion      14.54696
Log likelihood       -296.1419   F-statistic            101.5009
Durbin-Watson stat   0.440827    Prob(F-statistic)      0.000000
Estimating the above regression shows that the coefficients of all the dummy variables, as well as the trend variable, are significant. Therefore, we can calculate the fitted values and plot them along with the actual values. The graph shows that the fit is quite good, and this is confirmed by the R-bar-squared of the model, which is 90.7%.

[Figure: Actual, fitted and residual values of the seasonal regression model for Coca-Cola sales]
To forecast, we simply plug the future values of the trend and the seasonal dummies into the estimated equation:

Ŷ_{T+k} = β̂1 + β̂2 Trend_{T+k} + β̂3 S_{2,T+k} + β̂4 S_{3,T+k} + β̂5 S_{4,T+k}

EViews does this automatically and produces a graph of the forecasts as well as a series of statistics which we will discuss later.

[Figure: Actual, fitted and forecast values of the seasonal regression model for Coca-Cola sales]
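A sketch of the seasonal dummy regression and its forecasts in statsmodels; the DataFrame layout (SALES, TREND and a QUARTER column) is an assumption, and C(QUARTER) generates the dummies with the first quarter as the base.

```python
import pandas as pd
import statsmodels.formula.api as smf

res = smf.ols("SALES ~ TREND + C(QUARTER)", data=df).fit()   # df assumed to hold the 42 quarters
print(res.summary())

# forecasts: supply future values of the trend and the quarter indicator
future = pd.DataFrame({"TREND": range(43, 51),
                       "QUARTER": [3, 4, 1, 2, 3, 4, 1, 2]})
print(res.predict(future))
```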
CHAPTER 4
[Diagram: timeline showing the estimation period, the ex post forecast period (actual values of the explanatory and dependent variables available) and the ex ante forecast period (beyond the last observation T)]
In some cases, models might include only trend, dummy and seasonal dummy variables. The future values of these explanatory variables are known with certainty; therefore, forecasts produced using these types of models are unconditional too.
4.2.1 Forecast error
Errors associated with econometric forecasts can come from a combination of four distinct sources:
1- The random error terms in the model (the unexplained part of the variation in the dependent variable),
2- The estimation of the regression parameters: even when they are estimated correctly, the estimates are random variables,
3- In the case of a conditional forecast, the error due to the estimation or prediction of the explanatory variable(s),
4- Errors induced by model misspecification (e.g. estimating a linear instead of a nonlinear model).
Let us consider the following model for unconditional forecasting

Y_t = α + βX_t + ε_t,   ε_t ~ NID(0, σ²),   t = 1, 2, ..., T

The forecasting problem can therefore be posed as: given a known value for X_{T+1}, what is the best forecast that can be made for Y_{T+1}?

This can be answered if the values of α and β are known, in which case the appropriate forecast for Y_{T+1} can be written as

Ŷ_{T+1} = E_T(Y_{T+1}) = α + βX_{T+1}

Therefore the error of this forecasting exercise, i.e. the forecast error, can be written as

e_{T+1} = Y_{T+1} − Ŷ_{T+1}
The forecast error from the above model has two important and desirable properties:

1- The mean of the forecast errors is zero, E(e_{T+1}) = E(Y_{T+1} − Ŷ_{T+1}) = E(ε_{T+1}) = 0; this means that the forecast is unbiased.

2- The variance of the forecast error, defined as σ²_f = E[(e_{T+1})²] = E[(ε_{T+1})²] = σ², is the minimum variance among all possible forecasts based on linear models.

Furthermore, since the forecast errors are normally distributed with mean zero and variance σ²_f, the significance of the forecast values can be assessed using

(Y_{T+1} − Ŷ_{T+1}) / σ_f ~ N(0, 1)
Therefore the confidence interval around the point forecast can be constructed as

Prob[ Ŷ_{T+1} − z_{α/2} σ_f ≤ Y_{T+1} ≤ Ŷ_{T+1} + z_{α/2} σ_f ] = 1 − α

or, more compactly, the interval can be reported as Ŷ_{T+1} ± z_{α/2} σ_f.

4.3 Evaluating forecasts
The question after estimating a model and making forecasts is how to evaluate these forecasts. One method is to use the forecast error variance and construct the confidence interval for the forecasts, as mentioned above.
An alternative method is to produce several forecasts over a period and compare the forecast values with the realised values. The objective here is to see how well the forecast values track the actual values of the dependent variable.
A number of criteria have been proposed for such comparisons. Among these are the Mean Absolute Error (MAE), Root Mean Absolute Error (RMAE), Mean Square Error (MSE) and Root Mean Square Error (RMSE).
Consider, for example, the model
Y_t = β_1 + β_2 X_t + ε_t,   ε_t ~ NID(0, σ²),   t = 1, 2, …, 200
estimated over the first 180 observations and used to produce forecasts for the last 20 periods (t = 181, …, 200).
Forecast values   Actual values   Forecast error        Absolute error         Square error
Y^f_181           Y^a_181         Y^a_181 − Y^f_181     |Y^a_181 − Y^f_181|    (Y^a_181 − Y^f_181)²
Y^f_182           Y^a_182         Y^a_182 − Y^f_182     |Y^a_182 − Y^f_182|    (Y^a_182 − Y^f_182)²
...               ...             ...                   ...                    ...
Y^f_199           Y^a_199         Y^a_199 − Y^f_199     |Y^a_199 − Y^f_199|    (Y^a_199 − Y^f_199)²
Y^f_200           Y^a_200         Y^a_200 − Y^f_200     |Y^a_200 − Y^f_200|    (Y^a_200 − Y^f_200)²

MAE  = (1/20) Σ_{i=181}^{200} |Y^a_i − Y^f_i|
MSE  = (1/20) Σ_{i=181}^{200} (Y^a_i − Y^f_i)²
RMAE = [ (1/20) Σ_{i=181}^{200} |Y^a_i − Y^f_i| ]^{1/2}
RMSE = [ (1/20) Σ_{i=181}^{200} (Y^a_i − Y^f_i)² ]^{1/2}
Therefore, in general, the RMAE and RMSE for evaluating M forecasts can be found using the following formulas
RMAE = [ (1/M) Σ_{i=1}^{M} |Y^a_i − Y^f_i| ]^{1/2}
RMSE = [ (1/M) Σ_{i=1}^{M} (Y^a_i − Y^f_i)² ]^{1/2}
These two statistics can be used to compare the forecasting performance of different models proposed to forecast a variable. Obviously the model with the smaller RMSE (RMAE) is preferred in terms of forecasting performance.
Another method suggested in the literature for evaluating the performance of forecasts is the Theil inequality coefficient, or Theil's U statistic, which is defined as
Theil's U = [ (1/M) Σ_{i=1}^{M} (Y^a_i − Y^f_i)² ]^{1/2} / { [ (1/M) Σ_{i=1}^{M} (Y^f_i)² ]^{1/2} + [ (1/M) Σ_{i=1}^{M} (Y^a_i)² ]^{1/2} }
Although the numerator of Theil's U is the same as the RMSE, the denominator is chosen in such a way as to confine the statistic between 0 and 1.
1- When Theil's U = 0, Y^a_i = Y^f_i for every forecast; that is, there is a perfect fit between the actual and forecast values.
2- When Theil's U = 1, the predictions are nowhere close to the actual values.
Therefore, Theil's U is in fact a standardised RMSE and measures the RMSE in relative terms.
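For illustration only (not part of the original notes), the three statistics can be computed as follows; the actual and forecast series here are simulated.

import numpy as np

def forecast_stats(actual, forecast):
    """MAE, RMSE and Theil's U for two arrays of equal length."""
    actual, forecast = np.asarray(actual, float), np.asarray(forecast, float)
    err = actual - forecast
    mae = np.mean(np.abs(err))
    rmse = np.sqrt(np.mean(err ** 2))
    # Theil's U: RMSE divided by the sum of the root mean squares of the two series
    u = rmse / (np.sqrt(np.mean(forecast ** 2)) + np.sqrt(np.mean(actual ** 2)))
    return mae, rmse, u

# hypothetical example with 20 forecasts (t = 181, ..., 200)
rng = np.random.default_rng(1)
actual = rng.normal(10, 2, size=20)
forecast = actual + rng.normal(0, 1, size=20)
print(forecast_stats(actual, forecast))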
The figure below contains the forecast statistics and the plot of the forecasts along with the
95% confidence interval.
[Figure: one-step-ahead forecasts of the wheat futures return (DLWHTFF) against the actual series (DLWHTF), with ±2 S.E. bands; forecast sample 4/12/2000 to 4/04/2001, 52 observations.]
Now consider an alternative model, an ARMA(1,1), which is estimated over exactly the same estimation period and used to forecast the returns on wheat futures over exactly the same forecast period.
[Figure: ARMA(1,1) forecasts of the wheat futures return (DLWHTFF) against the actual series (DLWHTF), with ±2 S.E. bands; forecast sample 4/12/2000 to 4/04/2001, 52 observations. Forecast statistics: Root Mean Squared Error 0.028382, Mean Absolute Error 0.022796, Mean Abs. Percent Error 101.4894, Theil Inequality Coefficient 0.945287 (Bias Proportion 0.005628, Variance Proportion 0.888882, Covariance Proportion 0.105490).]
Based on the above results, i.e. the RMSE and Theil's U statistics, it can be concluded that the ARMA(1,1) model produces better forecasts than the AR(1) model, as the values of both statistics are lower for the ARMA(1,1) model.
[Diagram: the rolling scheme for 2-step-ahead forecasts — the estimation period is moved forward one observation at a time and a 2-step-ahead forecast is produced from each estimation window.]
We then use all the 2-step-ahead forecasts and the actual values to find the forecast errors, MAE, RMSE and Theil's U statistic, just as we did for the one-step-ahead forecasts. These statistics are then compared with those of alternative (competing) models. Three-, four- and longer-step-ahead forecasts are performed and examined in a similar way.
4.3.2 Static versus dynamic forecast
Forecasts can also be classified into static and dynamic forecasts.
In a static forecasting exercise, just as in the one-step-ahead forecasts above, the model is re-estimated each time an observation is added to the estimation period.
In a dynamic forecasting exercise, once the model is estimated over the estimation period, 1-, 2-, 3-, …, m-step-ahead forecasts are calculated. This means that we use forecast values as if they were observed values and try to forecast further into the future.
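A minimal sketch (not from the original notes) of a dynamic forecast, assuming a hypothetical AR(1) model whose own forecasts are fed back in as inputs:

def dynamic_forecast_ar1(y_last, const, phi, steps):
    """1- to m-step-ahead dynamic forecasts from an AR(1): Y_t = const + phi*Y_{t-1} + e_t.
    Each forecast uses the previous *forecast* (not an actual value) as its input."""
    forecasts = []
    y_prev = y_last
    for _ in range(steps):
        y_prev = const + phi * y_prev      # forecast fed back in as the next input
        forecasts.append(y_prev)
    return forecasts

# hypothetical AR(1) estimated over the estimation period: const = 0.5, phi = 0.9
print(dynamic_forecast_ar1(y_last=6.0, const=0.5, phi=0.9, steps=5))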
[Diagram: dynamic forecasting — the model is estimated once over the estimation period ending at T, and 1-step, 2-step, 3-step, …, k-step, …, m-step-ahead forecasts are produced recursively from T.]
[EViews estimation output for the model used in the dynamic forecasting example:

Coefficient   Std. Error   t-Statistic   Prob.
 1.214078      0.551692     2.200644     0.0283
 0.048353      0.005503     8.786636     0.0000
 140.3261      36.03850     3.893784     0.0001
 104.5884      17.44218     5.996295     0.0000

R-squared 0.216361, Adjusted R-squared 0.210816, S.E. of regression 2.481026, Sum squared resid 2609.927, Log likelihood -994.2079, Durbin-Watson stat 0.183733; Mean dependent var 6.145764, S.D. dependent var 2.792815, Akaike info criterion 4.664523, Schwarz criterion 4.702459, F-statistic 39.02177, Prob(F-statistic) 0.000000.]
The model is then forecast (using a dynamic forecast) over the period January 1995 to February 1996, i.e. 14 forecast points, from 1 month ahead to 14 months ahead.
[Figure: dynamic forecasts of the 3-month interest rate (R3F) with ±2 S.E. bands; forecast sample 1995:01–1996:02, 14 observations. Forecast statistics: Root Mean Squared Error 2.504092, Mean Absolute Error 2.470437, Mean Abs. Percent Error 45.88168, Theil Inequality Coefficient 0.187605 (Bias Proportion 0.973300, Variance Proportion 0.000043, Covariance Proportion 0.026657).]
[Figure: actual (R3) and dynamically forecast (R3F) values of the 3-month interest rate, 1994:07–1996:01.]
Theil's U = [ (1/M) Σ_{m=1}^{M} (Y^a_m − Y^f_m)² ]^{1/2} / { [ (1/M) Σ_{m=1}^{M} (Y^f_m)² ]^{1/2} + [ (1/M) Σ_{m=1}^{M} (Y^a_m)² ]^{1/2} }
where Ȳ^f, Ȳ^a, σ_f and σ_a are the means and standard deviations of the forecast and actual values over the forecasting period, respectively, and ρ is the correlation coefficient between the two series, ρ = [1/(σ_f σ_a M)] Σ_m (Y^f_m − Ȳ^f)(Y^a_m − Ȳ^a). Therefore, three proportions of Theil's inequality can be defined as
U^M = (Ȳ^f − Ȳ^a)² / [ (1/M) Σ_m (Y^f_m − Y^a_m)² ]        (bias proportion)
U^S = (σ_f − σ_a)² / [ (1/M) Σ_m (Y^f_m − Y^a_m)² ]        (variance proportion)
U^C = 2(1 − ρ) σ_f σ_a / [ (1/M) Σ_m (Y^f_m − Y^a_m)² ]    (covariance proportion)
The variance proportion U^S indicates the ability of the model to replicate the degree of variability in the dependent variable for which the forecasts are produced. If U^S is found to be large, it means that the actual values fluctuated considerably more than the forecast values, or vice versa. In such situations the model should be revised.
The covariance proportion U^C measures the unsystematic error; i.e., it represents the error remaining after deviations from the average values have been accounted for. Since there are always differences between actual and forecast values, and the two series are never perfectly correlated, this component is the least problematic one.
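For illustration only (a sketch, not part of the original notes), the three proportions can be computed as follows; the series are simulated and the proportions should sum to one.

import numpy as np

def theil_proportions(actual, forecast):
    """Bias (U^M), variance (U^S) and covariance (U^C) proportions of Theil's inequality."""
    a, f = np.asarray(actual, float), np.asarray(forecast, float)
    mse = np.mean((f - a) ** 2)
    sa, sf = a.std(), f.std()                    # population standard deviations
    rho = np.mean((f - f.mean()) * (a - a.mean())) / (sf * sa)
    um = (f.mean() - a.mean()) ** 2 / mse        # bias proportion
    us = (sf - sa) ** 2 / mse                    # variance proportion
    uc = 2 * (1 - rho) * sf * sa / mse           # covariance proportion
    return um, us, uc

rng = np.random.default_rng(2)
a = rng.normal(0, 1, 52)
f = 0.5 * a + rng.normal(0, 0.5, 52)
print(theil_proportions(a, f), sum(theil_proportions(a, f)))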
[Figure: forecasts of the wheat futures return (DLWHTFF) with ±2 S.E. bands; forecast sample 4/12/2000 to 4/04/2001, 52 observations.]
CHAPTER 5
In addition to equations such as dy + ex = 12, we often need to express relationships between variables as inequalities. We use the following mathematical signs to indicate the relationship between variables:
a < b    a is less than b
a > b    a is greater than b
a ≥ b    a is greater than or equal to b
a ≤ b    a is less than or equal to b
Linear mathematical inequalities can also be plotted and sketched on the Cartesian plane. For example, a simple inequality relation between y and x, say y ≥ x, can be shown by first plotting the line y = x and then excluding the area in which the inequality does not hold.
[Figure: the line y = x, with the two regions on either side of the line marked; the area in which the inequality does not hold is shaded.]
Note that the shaded area represents the set of points that do not satisfy the inequality.
In general, to sketch a mathematical inequality such as
dy + ex ≤ f,   dy + ex < f,   dy + ex ≥ f   or   dy + ex > f,
first plot the line representing dy + ex = f, then find which side of the line (area) has to be excluded. For this purpose, two arbitrary points can be chosen, one on either side of the line, to check whether their values satisfy the inequality. The side for which the inequality is not satisfied should then be excluded.
Example 2: Sketch the linear mathematical inequality y + 2x ≥ 8.
[Figure: the line y + 2x = 8, with the regions y + 2x ≥ 8 and y + 2x < 8 marked.]
It is not difficult to extend the above to sketch two or more linear mathematical inequalities simultaneously and obtain the feasible region. The feasible region is defined as the area in which every point satisfies all the inequalities simultaneously.
Example 3: Find the feasible area for the following set of mathematical inequalities:
x ≥ 0,   y ≥ 0,   y + x ≤ 10
[Figures: the regions defined by each inequality and the resulting feasible area for x ≥ 0, y ≥ 0 and y + x ≤ 10.]
Example 4: Find the maximum value of the objective function 2y + x subject to the constraints
y + x ≤ 5,   y − x ≤ 2,   x ≥ 0,   y ≥ 0.
First use the constraints to sketch the feasible region on a graph, and use the inequality conditions to find the individual feasible regions and the final feasible area.
[Figure: the constraint lines y = x + 2, y = −x + 5, x = 0 and y = 0, together with the objective function 2y + x drawn for several different values; their intersection defines the feasible region.]
Assuming an arbitrary value for the objective function, draw the objective function (note that this is not yet the maximised function or value). Then shift the objective function up or down to find the point at which the function is still in the feasible area and takes its maximum value relative to all other points within the feasible area.
Corner point      Objective function (2y + x)
O(0,0)            2y + x = 0
A(0,2)            2y + x = 4
B(1.5, 3.5)       2y + x = 8.5
C(5,0)            2y + x = 5
Note that only at point B within the feasible area does the objective function attain its maximum value.
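The same small LP can also be solved numerically; the sketch below (not part of the original notes) uses scipy.optimize.linprog and should reproduce the corner point found graphically.

from scipy.optimize import linprog

# maximise 2y + x  subject to  y + x <= 5,  y - x <= 2,  x >= 0, y >= 0
# linprog minimises, so we minimise -(x + 2y); variables are ordered (x, y)
c = [-1, -2]
A_ub = [[1, 1],    #  x + y <= 5
        [-1, 1]]   # -x + y <= 2
b_ub = [5, 2]
res = linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=[(0, None), (0, None)], method="highs")
print(res.x, -res.fun)   # expected corner point (1.5, 3.5) with objective value 8.5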
[Figure: the feasible area and its corner points for Example 4.]
Example: Find the maximum value of the objective function 3y + x subject to the following constraints: 5x + 3y ≤ 90, y − 2x ≤ 15, x ≥ 0, y ≥ 0.
1- Draw the constraints and specify the feasible region
[Figure: the constraint lines y − 2x = 15 and 3y + 5x = 90, the objective function 3y + x, and the resulting feasible region.]
2- Draw the objective function and move it around the feasible area to find the maximum value that it can take within the feasible region.
3- Find the points of intersection between the constraints (the corner points of the feasible region) and evaluate the objective function at each of them:
Corner point        Objective function (x + 3y)
O(0,0)              x + 3y = 0
A(0,15)             x + 3y = 45
B(4.09, 23.18)      x + 3y = 73.63
C(18,0)             x + 3y = 18
Note that only at point B within the feasible area does the objective function attain its maximum value.
Example: An air transport company operates two types of aircrafts, A4030 and B6015. The
A4030 has a carrying capacity of 40 passengers and 30 tons of cargo, whereas the B6015 can
carry 60 passengers and 15 tons of cargo. The company has a contract to carry 480
passengers and 180 tons of cargo each day. If each A4030 flight costs 500 and each B6015
flight costs 600, what choice of aircraft would minimise the transportation cost subject to
fulfilment of the contract?
The objective function is
z=500*X + 600*Y
Where X is the number of A4030 flights and Y is the number of B6015 flights per day. The
constraints are
40*X + 60*Y ≥ 480      (number of passengers)
30*X + 15*Y ≥ 180      (amount of cargo)
X ≥ 0 and Y ≥ 0        (non-negativity conditions)
The passenger constraint implies that the number of flights required to carry the minimum number of passengers should satisfy
40*X + 60*Y = 480,   e.g.   A: (X1 = 0, Y1 = 8)   or   B: (X2 = 12, Y2 = 0)
Similarly, the cargo constraint implies that the number of flights required to carry the minimum amount of cargo should satisfy
30*X + 15*Y = 180,   e.g.   C: (X1 = 0, Y1 = 12)   or   D: (X2 = 6, Y2 = 0)
However, these combinations do not lead to the minimum transportation cost. Therefore, an LP solution is required to optimise the cost.
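As a sketch (not part of the original notes), the cost-minimisation problem can also be solved with scipy.optimize.linprog; the result should match the combination identified graphically below.

from scipy.optimize import linprog

# minimise 500X + 600Y subject to 40X + 60Y >= 480 and 30X + 15Y >= 180, X, Y >= 0
# ">=" constraints are rewritten as "<=" by multiplying both sides by -1
c = [500, 600]
A_ub = [[-40, -60],   # -(40X + 60Y) <= -480
        [-30, -15]]   # -(30X + 15Y) <= -180
b_ub = [-480, -180]
res = linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=[(0, None)] * 2, method="highs")
print(res.x, res.fun)  # the cost-minimising schedule and the minimum daily cost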
Drawing the feasible area (the number of B6015 flights against the number of A4030 flights) helps to identify the possible solution to the problem. Thus, we draw the feasible area first.
[Figure: the feasible region defined by the passenger and cargo constraints, plotted as the number of B6015 flights against the number of A4030 flights.]
Now let us evaluate the cost at the point of intersection between the two constraint lines, which lies within the feasible area:
E: (XE = 3, YE = 6)
The total transportation cost is z = 500×3 + 600×6 = 5,100. This combination of scheduling ensures that the terms of the contract are fulfilled and the cost is minimised.
Now suppose instead that each flight of the A4030 aircraft costs 600 and each flight of the B6015 costs 900. The objective function would be
z = 600*X + 900*Y
which is parallel to the line drawn for the passenger constraint. This means that operating any combination of aircraft that lies on the passenger-constraint line will be an optimum solution.
Consider now the objective function z = 2*X + 3*Y subject to the constraints X + 2*Y ≤ 40 (labour hours), 6*X + 5*Y ≤ 150 (moulding material) and X ≥ 0, Y ≥ 0. The optimal solution is X = 14.2857, Y = 12.8571, giving z = 67.14.
Next we examine the case where a unit increase is allowed in the constraint function for
labour and the final revenue is derived using the same procedure as before.
z = 2*X + 3*Y
subject to
X + 2*Y ≤ 41
6*X + 5*Y ≤ 150
and X ≥ 0 and Y ≥ 0
The results indicate that the value of the objective function increases to
z = 2*13.57 + 3*13.71 = 68.28
Therefore the increase in profit as a result of one additional labour hour, i.e. the shadow price of labour, is
SP = 68.28 − 67.14 = 1.14
To find out the shadow price of moulding material, we need to repeat the procedure by
increasing the available moulding material by one unit and keeping the rest constant.
z = 2*X + 3*Y
subject to
X + 2*Y ≤ 40
6*X + 5*Y ≤ 151
and X ≥ 0 and Y ≥ 0
The objective function in a portfolio selection problem is usually to maximise the expected return and/or minimise the risk of the portfolio. The constraints are usually restrictions on the type of investment instruments due to state laws, company policy, maximum permissible risk, etc. These problems are generally addressed using mathematical programming techniques, one of which (perhaps the simplest) is linear programming.
Consider a portfolio manager who is looking at different investment instruments in which to allocate the sum of $100,000 cash, which she has just been given. The senior financial advisor of her company has recommended investing in the biotech industry, the car manufacturing industry or government bonds. These recommendations are based on the analysts' results, in which five investment opportunities have been identified and their annual rates of return projected. These are reported in the following table.
Investment                                  Projected annual return (%)
A – biotech                                 7.3
B – biotech (Fosters Pharmaceutical)        10.3
C – car manufacturing                       6.4
D – car manufacturing                       7.5
E – government bonds                        4.5
The guidelines for investment in the Bud fund management company are as follows:
1- The investment in each industry should not exceed $50,000 at any time;
2- The investment in government bonds should be at least 25% of the investment in the car manufacturing industry;
3- The investment in the high-risk company (Fosters Pharmaceutical) should not be more than 60% of the total investment in that sector (biotech).
What should the portfolio selection of the portfolio manager be in order to maximise the projected return subject to the imposed investment constraints?
In this case a linear programming model can help the portfolio manager to solve the problem and come up with the optimum solution; that is, achieving the highest return for such investment opportunities.
Assuming A, B, C, D and E denote the dollar amounts invested in each of the five opportunities, and using the projected rates of return, the objective function that needs to be maximised is
max z = 0.073A + 0.103B + 0.064C + 0.075D + 0.045E
subject to
A + B + C + D + E = $100,000
A + B ≤ $50,000
C + D ≤ $50,000
E ≥ 0.25(C + D),  or  E − 0.25C − 0.25D ≥ 0
B ≤ 0.6(A + B),  or  −0.60A + 0.40B ≤ 0
Furthermore, there are non-negativity restrictions which should be considered. What are those constraints?
Using Excel Solver, the following results are obtained as a solution to the problem. You need to look at the Excel manual to learn how to use the Solver; at the moment this is outside the scope of this class.
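For readers without Solver, the same LP can be sketched in Python (not part of the original notes); with the constraints as stated above, scipy.optimize.linprog should reproduce the allocation reported in the Solver output below.

from scipy.optimize import linprog

# maximise 0.073A + 0.103B + 0.064C + 0.075D + 0.045E (linprog minimises the negative)
c = [-0.073, -0.103, -0.064, -0.075, -0.045]
A_eq = [[1, 1, 1, 1, 1]]                     # A + B + C + D + E = 100,000
b_eq = [100_000]
A_ub = [[1, 1, 0, 0, 0],                     # A + B <= 50,000
        [0, 0, 1, 1, 0],                     # C + D <= 50,000
        [0, 0, 0.25, 0.25, -1],              # E >= 0.25(C + D)  ->  0.25C + 0.25D - E <= 0
        [-0.60, 0.40, 0, 0, 0]]              # B <= 0.6(A + B)   ->  -0.60A + 0.40B <= 0
b_ub = [50_000, 50_000, 0, 0]
res = linprog(c, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=b_eq,
              bounds=[(0, None)] * 5, method="highs")
print(res.x, -res.fun)   # allocation and projected annual return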
Solver Answer Report

Target Cell (Max)
Name: maximise    Original Value: 8000    Final Value: 8000

Adjustable Cells
Cell      Original Value   Final Value
$B$13     20000            20000
$B$14     30000            30000
$B$15     0                0
$B$16     40000            40000
$B$17     10000            10000

Constraints
Cell      Name                      Cell Value   Formula         Status        Slack
$B$21     Fund available =          100000       $B$21=$C$21     Binding       0
$B$22     Investment on A + B =     50000        $B$22<=$C$22    Binding       0
$B$23     Investment on C + D =     40000        $B$23<=$C$23    Not Binding   10000
$B$24     Gov Bond Constraint       0            $B$24>=$C$24    Binding       0
$B$25     Investment Constraint     0            $B$25<=$C$25    Binding       0
[Solver Sensitivity Report – Adjustable Cells ($B$13–$B$17): reduced costs, the objective coefficients (0.073, 0.103, 0.064, 0.075, 0.045) and their allowable increases and decreases.]
Constraints
Cell      Name                      Final Value   Shadow Price   Constraint R.H. Side   Allowable Increase   Allowable Decrease
$B$21     Fund available =          100000        0.069          100000                 12500                50000
$B$22     Investment on A + B =     50000         0.022          50000                  50000                12500
$B$23     Investment on C + D =     40000         0              50000                  1E+30                10000
$B$24     Gov Bond Constraint       0             -0.024         0                      50000                12500
$B$25     Investment Constraint     0             0.03           0                      20000                30000
Solver Limits Report

Target Cell
Cell: $B$7    Name: maximise    Value: 8000

Adjustable Cells
Cell      Value    Lower Limit   Target Result   Upper Limit   Target Result
$B$13     20000    20000         8000            20000         8000
$B$14     30000    30000         8000            30000         8000
$B$15     0        0             8000            0             8000
$B$16     40000    40000         8000            40000         8000
$B$17     10000    10000         8000            10000         8000
When an objective function is nonlinear, as we saw earlier in QM, the function may have more than one turning point. The number of turning points depends on the degree of the objective function. We also noted that the turning points might be local maxima (minima) or the global maximum (minimum).
Example 1: Florida Power and Light (FPL) faces demand during both peak-load and off-peak-load times. FPL must determine the price per kilowatt-hour (kwh) to charge during both peak and off-peak periods. The daily demand for power during each period (in kwh) is related to price as follows:
Revenue is the sum of the price charged in each period multiplied by the corresponding demand; for the off-peak period, for instance, the revenue term is P0 × (40 − P0 + 0.1Pp), while peak-period demand depends on Pp and on 0.1P0. Expanding the revenue equation we obtain a nonlinear (quadratic) model for revenue, containing terms such as 40P0 − P0².
The above model can be optimised using Excel Solver in the following way. Enter the inputs of the problem in a spreadsheet as shown.
The objective cell (B24) is the target cell for Excel to maximise. The optimisation takes place with respect to the decision variables, which are the peak price, the off-peak price and capacity. These variables are related nonlinearly through the model. The optimisation is then performed subject to the set of constraints given in the constraint box.
The result shows that the maximum profit is obtained when prices are set at $70.31/kwh and $26.53/kwh for the peak and off-peak periods, respectively. This also implies a capacity of 27.5. The maximum profit therefore is $2202.30.
          Stock 1   Stock 2   Stock 3
Mean       0.14      0.11      0.10
SD         0.20      0.15      0.08

Correlation:   Stock 1 and 2 = 0.6,   Stock 1 and 3 = 0.4,   Stock 2 and 3 = 0.7
The company wants to invest in a minimum variance portfolio which has an expected return of at least 0.12.
The following is the summary information: X1, X2 and X3 are the weights invested in stock 1, stock 2 and stock 3, respectively; the total weight must equal 1; and the expected return on the portfolio must be at least 12%.
Select the target cell and the variable cells, and set up the constraints in the Solver. Once everything is in place, optimise the model.
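The same problem can be sketched outside Excel (not part of the original notes) with scipy.optimize.minimize; using the means, SDs and correlations given above, it should reproduce the Solver result shown below.

import numpy as np
from scipy.optimize import minimize

mu = np.array([0.14, 0.11, 0.10])          # expected returns
sd = np.array([0.20, 0.15, 0.08])          # standard deviations
corr = np.array([[1.0, 0.6, 0.4],
                 [0.6, 1.0, 0.7],
                 [0.4, 0.7, 1.0]])
cov = np.outer(sd, sd) * corr              # covariance matrix

def port_var(w):
    return w @ cov @ w                     # portfolio variance

cons = [{"type": "eq",   "fun": lambda w: w.sum() - 1.0},     # weights sum to one
        {"type": "ineq", "fun": lambda w: w @ mu - 0.12}]     # expected return >= 12%
res = minimize(port_var, x0=np.ones(3) / 3, bounds=[(0, 1)] * 3,
               constraints=cons, method="SLSQP")
print(res.x, res.fun)   # weights and minimum portfolio variance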
Solver Answer Report

Target Cell
Cell: $B$26    Name: Portfolio variance    Original Value: 0.0148    Final Value: 0.0148

Adjustable Cells
Cell      Name                          Original Value   Final Value
$B$16     Fractions to invest Stock 1   0.5              0.5
$C$16     Fractions to invest Stock 2   0                0
$D$16     Fractions to invest Stock 3   0.5              0.5

Constraints
Cell      Name                          Cell Value   Formula          Status        Slack
$B$20     Actual (portfolio return)     0.12         $B$20>=$D$20     Binding       0.00
$E$16     Fractions to invest Total     1.0          $E$16=$G$16      Binding       0
$B$16     Fractions to invest Stock 1   0.5          $B$16>=0         Not Binding   0.5
$C$16     Fractions to invest Stock 2   0            $C$16>=0         Binding       0
$D$16     Fractions to invest Stock 3   0.5          $D$16>=0         Not Binding   0.5
CHAPTER 6
Simulation Analysis
6 Simulation Analysis
6.1 Introduction
Simulation can be defined as a computer model which can imitate real life situations taking
into account the uncertainty involved in one or more factors. In fact, when we run a
simulation, we let these factors take random quantities and values, and we monitor and record
the outcome of the model each time. The collection of all outcomes can then be used to
produce the distribution of the outcomes for further analysis as well as decision making.
Simulation models are widely used in operations research, engineering, finance and asset pricing, among other areas of science. In finance in particular, simulation techniques are used in option pricing, Value at Risk (VaR) analysis, capital budgeting, etc.
There are several steps involved in setting up a simulation.
1) Identifying the inputs and outputs of the model
2) Identifying the model - the relationship between inputs and outputs
3) Identifying the process of the inputs - determining which distributions they follow
4) Setting up the model - using a spreadsheet or any other statistical software to relate the input and output variables
5) Simulating the model - running the model several hundred times, with the inputs taking different values according to their specified distributions
6) Recording the outcome of the outputs each time
7) Interpreting the results
As noted, the simulation process requires some input variables to be generated randomly and fed into the model. To generate a variable randomly, we need to use the distribution that we think the variable follows and draw randomly from this distribution. This can be done easily in Excel using the RAND() function.
The random number that Excel generates takes values between 0 and 1. This random number can then be used for drawing observations from a specified distribution.
6.2
To see how to generate a random variable, let us look at the following example. We consider the random variable Y which, we know from historical observation, follows the probability distribution below.
Variable Y      0      1      2      3      4      5      6      7      8      9      10
Probability    0.05   0.06   0.075  0.10   0.125  0.18   0.125  0.10   0.075  0.06   0.05
Cum. prob.     0.05   0.11   0.185  0.285  0.41   0.59   0.715  0.815  0.89   0.95   1.00

[Figure: the probability distribution (pdf) of Y.]
We can use the pdf of the variable to generate (simulate) randomly drawn numbers from the distribution that Y follows.
1-
2-
3-
4-
5- copy and paste the cells A3, B3 and C3 until row 1003. This will give you 1000 random
numbers drawn from the distribution that Y follows.
If we plot the histogram of these 1000 draws we will have the following figure, which more or less resembles the original pdf of Y. Of course this is not identical to the pdf of Y, but if you increase the number of simulations the shape of the generated distribution will get closer and closer to the original pdf.
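The same exercise can be sketched in Python (not part of the original notes); here Y is assumed to take the values 0 to 10, as read from the table above.

import numpy as np

values = np.arange(11)                                  # Y takes the values 0, 1, ..., 10
probs = [0.05, 0.06, 0.075, 0.10, 0.125, 0.18,
         0.125, 0.10, 0.075, 0.06, 0.05]

rng = np.random.default_rng(0)
draws = rng.choice(values, size=1000, p=probs)          # 1,000 random draws from the pdf of Y

# relative frequencies of the simulated draws (they approximate the original pdf)
freq = np.bincount(draws, minlength=11) / len(draws)
print(np.round(freq, 3))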
[Figure: histogram of the 1,000 simulated draws from the distribution of Y.]
Calendars demanded    100    150    200    250    300
Probability           0.30   0.20   0.30   0.15   0.05
To find the best order quantity for Walton Bookstore based on simulation analysis, we construct a simulation spreadsheet and perform the following steps:
1- Using the given information, identify the inputs (fixed and variable) and fill the appropriate cells. These are the cost data and the demand distribution.
2- Enter the decision variable, in this case the order quantity.
Another important part of the output is the table of average profits and the plot of average profit against order quantity. This output shows the sensitivity of the profit to changes in the order quantity. For example, according to our simulation results, an order quantity of 150 yields the maximum average profit of $277.5. However, you should note that each time you run the simulation you may get a different result, and it is the average across all these simulations that you should look at.
[Spreadsheet extract: the demand distribution used as input - 100 (0.30), 150 (0.20), 200 (0.30), 250 (0.15), 300 (0.05).]
Now let us assume that the demand for calendars follows a Normal distribution with mean 175 and SD 60. Walton wants to maximise the expected profit from calendar sales. Using @Risk, we can simulate Walton's profits for order quantities of {100, 150, 200, 250, 300}.
This is done in the following way.
1- First open the @Risk application.
2- Enter the input cells. These are the values for Unit Cost (cell B4), Unit Price (cell B5), Unit Refund (cell B6), the mean and SD of the distribution for demand (cells E5 and E6), and the possible order quantities (cells G4 to G8).
3- Next construct the decision variable, the order quantity, in cell B9: write =RiskSimTable(G4:G8) in that cell. This tells the program to run the simulation for the different order quantities.
4- Next construct the demand variable in cell B14 by writing =RiskNormal(E5,E6) in that cell. This tells the program that the random variable, demand, should be drawn from a normal distribution with mean 175 and SD 60.
5- Revenue in cell C14 can be written as =B5*MIN(B14,B9)
6- Cost in cell D14 is the unit cost times the order quantity, =B4*B9
7- Refund is calculated as =B6*MAX(B9-B14,0)
8- Finally, profit is revenue minus cost plus any refund, =C14-D14+E14
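For readers without @Risk, the same experiment can be sketched in plain Python (not part of the original notes); the unit cost, price and refund values below are hypothetical, since only the demand distribution is given in the text.

import numpy as np

# hypothetical cost data (the actual figures sit in the spreadsheet, not in the text)
unit_cost, unit_price, unit_refund = 7.50, 10.00, 2.50
mean_demand, sd_demand = 175, 60
order_quantities = [100, 150, 200, 250, 300]

rng = np.random.default_rng(0)
n_iter = 10_000
for q in order_quantities:
    demand = rng.normal(mean_demand, sd_demand, n_iter)
    revenue = unit_price * np.minimum(demand, q)    # can only sell what was ordered
    cost = unit_cost * q
    refund = unit_refund * np.maximum(q - demand, 0)
    profit = revenue - cost + refund
    print(q, round(profit.mean(), 2), round(profit.std(), 2))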
We are now set to run the simulation, but before that we need to do one more thing: set the simulation settings. This means telling the program how many simulations to run and how many iterations to perform for each simulation. This can be done by clicking on the simulation settings icon.
[Screenshot: the @Risk toolbar, with buttons to select and define a distribution function for an input variable, test which distribution fits the data, and test the correlation of the data.]
This box has several pages, but the page we are dealing with now is the Iterations page. We set #Simulations to 5, as we have 5 different simulations, one for each order quantity. We can also change #Iterations as required (e.g. 10,000 in this case). This means that for each simulation we have 10,000 iterations.
[Screenshot: the @Risk toolbar buttons used to define the simulation settings, outputs and updates, and to start running the simulation.]
Now we need to add the output cell(s) to @Risk. This is done by selecting the output (resultant) cell(s) and clicking on the "add output" icon. In this case, select cell F14 (profit) and click on the icon.
Finally, once everything is set, click on the "start simulation" button and watch the process of the simulation.
Once the simulation is completed, you will switch to the @Risk window which shows the results. The results window has two parts, an upper window and a lower window. The upper window shows the simulation details (e.g. simulation numbers, #iterations, #inputs, #outputs, runtime, sampling type, etc). The second part of the upper window presents the results for the simulated cell(s). In this case, cell F14 is the output cell, and it can be seen that the summary results show the Min, Max and Mean of profit for the different simulations (order quantities from 100 to 300).
The lower part of the results window contains the full simulation statistics for each simulation, including Min, Max, SD, Var, Skewness, Kurtosis and percentiles. The output is presented for each variable in the model.
We can summarise the results by copying and pasting them into Excel and preparing them for tabular or graphical reports and analysis.
Possible Order Quantities      100       150       200       250       300
                              Sim 1     Sim 2     Sim 3     Sim 4     Sim 5
Mean Profit                  227.23    273.86    211.36     39.73   -190.55
SD of Profit                 $89.85   $198.38   $323.70   $409.64   $442.59

[Figure: mean profit for each simulation (order quantity).]
Discount rate                          8% pa
Daily revenue (charter rate)           15,000 $/day
Daily operating cost                   7,000 $/day
Ship price (initial investment)        20 $m
Annual growth in operating costs       3% pa
Resale (scrap) value in year 10        10 $m
Year    Period   Annual Cost   Annual Revenue   Net Cash Flow   Discounted Cash Flow
2015      0        20.00 (investment)   -          -20.00          -20.00
2016      1        2.555         5.250              2.695            2.488
2017      2        2.632         5.250              2.618            2.231
2018      3        2.711         5.250              2.539            1.998
2019      4        2.792         5.250              2.458            1.785
2020      5        2.876         5.250              2.374            1.592
2021      6        2.962         5.250              2.288            1.416
2022      7        3.051         5.250              2.199            1.256
2023      8        3.142         5.250              2.108            1.111
2024      9        3.237         5.250              2.013            0.980
2025     10        3.334        15.250             11.916            5.354

NPV Handysize = 0.21
At first sight the project seems acceptable as it has a positive NPV. However, as mentioned, there are many factors in this evaluation which are uncertain and could change as the project gets underway. Using the MC simulation technique we can relax the assumption that these factors are fixed and obtain the distribution of the NPV, as well as the probability that the NPV could be negative.
Running 10,000 simulations we can obtain the distribution of the NPV of this project. It reveals that while the mean of the NPV is still about 0.28, there is a significant probability that this project ends up in a loss (a 50% probability of the NPV being negative)!
P_t = P_0 exp[ (μ − 0.5σ²)t + σZ√t ]
where μ is the mean percentage growth rate of the stock, σ is the standard deviation of the growth rate, and Z is a random variable which follows a standard normal distribution, N(0,1).
Assuming growth rate and standard deviation are annualised and expressed in decimals, e.g.
0.06 growth and 0.1 standard deviation per year.
If we have necessary information; i.e. average growth rate (mean return) and SD of a stock
we can simulate the value of the stock using the above formula. For example, consider a
stock's price today is 10 and has a growth of 6% and SD of 10%. Using a simple simulation
technique, we can construct the distribution of stock price in 25 days time (1 month).
The value of the cell for day 0 (cell D8) is the current stock price. The value of the stock for the next period (day 1) in cell D9 is given by the following formula
=D8*EXP((($C$4-0.5*$C$5^2)*(C9/250))+$C$5*RiskNormal(0,1)*SQRT(1/250))
while the values for days 2 to 25 in cells D10 to D33 are just copies of the previous cell. Next we need to set up the simulation parameters as 1000 iterations and one simulation. Then we need to add the simulation output cells (D9 to D33) and finally run the simulation.
Running the simulation will create 1000 price series, 10 of which are shown in the following figure. It can be seen that the stock price fluctuates from day to day; however, there is a slight upward drift in the series. The important point is the distribution of the final prices that we obtain: the mean of this distribution should be approximately 10 × exp(0.06 × 25/250) ≈ 10.06.
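The same price-path simulation can be sketched directly in Python (not part of the original notes), using the stock price formula above with the stated growth rate, SD and 250 trading days per year.

import numpy as np

p0, mu, sigma = 10.0, 0.06, 0.10      # current price, annual growth and SD (in decimals)
days, dt = 25, 1 / 250                # 25 trading days, daily time step

rng = np.random.default_rng(0)
n_paths = 1000
z = rng.standard_normal((n_paths, days))
# daily log increments: (mu - 0.5*sigma^2)*dt + sigma*sqrt(dt)*Z, cumulated over time
log_paths = np.cumsum((mu - 0.5 * sigma ** 2) * dt + sigma * np.sqrt(dt) * z, axis=1)
prices = p0 * np.exp(log_paths)

print(prices[:, -1].mean(), prices[:, -1].std())   # distribution of the day-25 price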
[Figure: 10 of the simulated price paths (Sim 1 to Sim 10) over the 25-day horizon.]
@Risk can produce histograms of the distribution of prices for each day of the simulation period. For example, the following histogram shows the distribution of the prices on the last day of the simulation.
Furthermore, you can obtain the summary statistics and percentiles of the distribution of price movements for each day of the simulation. For example, the table shows that on the last day of the simulation period the stock price has a mean of 10.060 and an SD of 0.312. The percentiles show the cut-off points for the percentage of observations below that price. For example, the 5% percentile for the last day (day 25) has a value of 9.5551. This means that 5% of the observations produced by the simulation exercise fall below 9.5551; it is obtained simply by sorting the observations in ascending order and finding the value of observation 50.
Similarly, the 95% percentile value of 10.5857 for day 25 indicates that 95% of the observations fall below this value. Furthermore, using the percentiles you can build a confidence interval and say that we are 90% confident that the stock price on day 25 will be between 9.5551 and 10.5857.
Name              Price Day23 (cell D31)
Minimum =         8.95781
Maximum =         11.1546
Mean =            10.0554
Std Deviation =   0.30571
Variance =        9.35E-02
Skewness =        0.19571
Kurtosis =        3.0793
Mode =            10.1687
5% Perc =         9.55501
10% Perc =        9.66773
15% Perc =        9.72741
...
85% Perc =        10.3786
90% Perc =        10.4583
95% Perc =        10.5444
The other useful graph that @Risk can produce is the summary of the distributions over the whole simulation period, that is, day 1 to day 25 (cells D9 to D33). This is shown below for this example.
It can be seen that as the simulation horizon increases, the variance of the distribution of the price increases. This is also evident from the table of results.
on a stock which has the same SD as the underlying and a growth rate equal to risk free rate
(interest rate).
Example:
The current price of a non-dividend paying stock is P0 = $13.50. Assuming the annual growth rate and standard deviation of this stock are μ = 12% and σ = 20%, and the risk free rate is r = 7%, calculate the value of a European call option on this stock with strike price X = $14 and 3 months to maturity.
According to the Cox et al. (1979) approach, we need to find the mean of the cash flow for this option and discount it to the present time, t = 0, while assuming the stock price grows at the risk free rate. To find the mean of the cash flow, we simulate the stock price several times for the next 3 months using a growth rate equal to the risk free rate, and find the cash flow as the difference between the stock value at time 0.25 and the strike price. That is:
If the stock price at maturity, T = 0.25, exceeds the exercise price, then Payoff = PT − $14
If the stock price at maturity, T = 0.25, is less than the exercise price, then Payoff = 0
Then we find the mean of these cash flows and discount the mean using an r = 7% discount rate. The value obtained will be the call option price.
Input cells:
Current stock price               $13.50
Exercise (strike) price           $14.00
Growth rate                       12%
Standard deviation                20%
Risk free rate                    7%
Time to maturity (years)          0.25

Output cells:
Stock price in 3 months using the risk free growth rate:
  =C6*EXP((C10-0.5*C9^2)*C11+C9*RiskNormal(0,1)*SQRT(C11))        13.6698
Option cash flow at maturity:     =MAX(0,C16-C7)                   $0.00
DPV of the cash flow:             =C17*EXP(-C10*C11)               $0.00
Option price:                     =C18                             $0.00
We can then run the simulation by choosing cell C18 to be the output cell in @Risk. The simulation exercise shows that the mean discounted present value of the cash flow is about $0.424. This is the call option price. You can check this using the Black-Scholes pricing formula too.
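The same risk-neutral Monte Carlo valuation can be sketched in Python (not part of the original notes), using the inputs from the spreadsheet above; the answer should be close to the Black-Scholes value.

import numpy as np

s0, strike = 13.50, 14.00       # current stock price and exercise price
sigma, r, T = 0.20, 0.07, 0.25  # annual SD, risk free rate and time to maturity

rng = np.random.default_rng(0)
n_sims = 100_000
z = rng.standard_normal(n_sims)
# simulate the stock price at maturity using the risk free rate as the growth rate
s_T = s0 * np.exp((r - 0.5 * sigma ** 2) * T + sigma * np.sqrt(T) * z)
payoff = np.maximum(s_T - strike, 0.0)           # call option cash flow at maturity
call_price = np.exp(-r * T) * payoff.mean()      # discounted mean cash flow
print(round(call_price, 3))                      # close to the Black-Scholes value (~0.42)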
CHAPTER 7
f(x) = [1/(σ√(2π))] exp[ −(x − μ)² / (2σ²) ]
where x, the asset return (cash flow), is defined over the interval −∞ < x < +∞, μ is the mean return (cash flow) and σ² is the variance of the returns (cash flow). A normal pdf with zero mean and variance 1, N(0,1), is shown in the figure below.
[Figure: the standard normal pdf, N(0,1).]
The pdf of cash flow gives a complete representation of all possible random outcomes. In
fact, it tells us about each possible cash flow and its likelihood. Knowing the distribution of
cash flows one can answer questions about the likelihood of outcomes and possible future
cash flows.
Therefore, in order to be able to report the likelihood of a loss, we need some information to begin with. First, we need to know the confidence level (1 − α). Second, we need to know the investment or project horizon (holding period), N. Third, we need to know the distribution of the cash flow of the investment, project or portfolio at time N.
Once we have all this information, the VaR can be expressed as: "we are (1 − α) sure that we will not lose more than V dollars in the next N days". Therefore, the VaR is a function of two parameters, α and N. For example, suppose the distribution of our cash flow after N = 1 day is given as N(0,1) below.
[Figure: the N(0,1) distribution of the cash flow after N = 1 day, with the 5% left tail marked.]
Assuming a confidence level of 95%, we can say that: "we are 95% confident that we will
not lose more than $1.645 by the next trading day". This means that VaR=$1.645.
Similarly, if the confidence level is chosen as (1- )=99% and N=10, and we find that the
distribution of cash flow after 10 trading days is going to be N(2,2) as shown below
[Figure: the N(2,2) distribution of the cash flow after 10 trading days, with the 1% left tail marked.]
then the cut-off point of the 1% left tail lies 2.326 standard deviations below the mean, i.e. 2.326 × √2 = 3.289 below the mean. Therefore, the VaR is −$3.289.
But we always report the VaR as a positive value, although it is a loss. Therefore, we can say
that "we are 99% confident that we will not lose more than $3.289 in ten days" or "our
value at risk in ten days, assuming 99% CL is $3.289".
Note that if our distribution is symmetric, there is no difference if we use the right tail or left
tail of the pdf to calculate and report the VaR. However, if the distribution is not normal or if
it is asymmetric, then we need to calculate the VaR using the left tail and report the figure as
a positive value.
If we assume there are 252 trading days in a year, the annual SD can be found as
σ_annual = σ_daily × √252
Similarly, if we assume there are 25 trading days in one month, we can write
σ_monthly = σ_daily × √25
In the same way, the 10-day SD is obtained as SD_10d = SD_1d × √10. For the IBM position, whose one-day SD is $200,000, this gives
SD_10d = $200,000 × √10 = $632,456
Now, if we want to calculate the VaR of this position over a 10-day horizon, given that the CL = 99% and the returns are normally distributed with mean zero, we simply write
VaR = Z_α × SD_10d = 2.33 × $632,456 = $1,473,621
Note that although the assumption of normality and a zero mean is not exactly true, it serves the purpose in practice, because average daily returns are normally small compared to the SD.
Let us now consider another single-asset portfolio, consisting of $5m of AT&T shares with a daily SD of 1% (approximately 16% per year). Using N = 10 and CL = 99%, we can find the SD as
SD_10d = SD_1d × √10 = $50,000 × √10 = $158,114
and the VaR as
VaR = Z_α × SD_10d = 2.33 × $158,114 = $368,405
Next, consider a portfolio consisting of both the IBM and the AT&T positions. The SD of the portfolio is given by
σ_P = [ σ²_IBM + σ²_AT&T + 2 ρ σ_IBM σ_AT&T ]^{1/2}
Note that the SDs in the formula are dollar values; therefore, we do not include the weights.
To find the 10-day VaR of the portfolio at the 99% CL, we need to calculate the 10-day SD of the portfolio and, using the appropriate Z_α, obtain the VaR. Thus, with a correlation of 0.7 between the two stocks,
σ_P = [ (632,456)² + (158,114)² + 2(0.7)(632,456)(158,114) ]^{1/2} = $751,665
Therefore, our 10-day SD for the portfolio is $751,665. Having obtained the SD of the portfolio, we proceed to calculate the VaR using α = 1%:
VaR = Z_α × SD_10d = 2.33 × $751,665 = $1,751,379
From this value at risk calculation you can see directly the effect of diversification, as the VaR of the portfolio is smaller than the sum of the VaRs of holding each stock individually.
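As a quick numerical sketch (not part of the original notes), the individual and portfolio VaRs above can be reproduced as follows, using the rounded Z value of 2.33.

import numpy as np

sd_ibm_1d, sd_att_1d = 200_000, 50_000    # one-day dollar SDs of the two positions
rho, z99, horizon = 0.7, 2.33, 10

sd_ibm = sd_ibm_1d * np.sqrt(horizon)
sd_att = sd_att_1d * np.sqrt(horizon)
sd_port = np.sqrt(sd_ibm ** 2 + sd_att ** 2 + 2 * rho * sd_ibm * sd_att)

# individual 10-day 99% VaRs and the (smaller, diversified) portfolio VaR
print(round(z99 * sd_ibm), round(z99 * sd_att), round(z99 * sd_port))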
More generally, for a portfolio of n assets we can write
ΔP = Σ_{i=1}^{n} α_i Δx_i
and
σ²_P = Σ_{i=1}^{n} α_i² σ_i² + 2 Σ_{i=1}^{n} Σ_{j>i} ρ_ij α_i α_j σ_i σ_j
where ΔP is the change in the value of the portfolio, Δx_i is the change in the market variable (asset) i, α_i is the weight of asset i in the portfolio, and ρ_ij is the correlation between market variables i and j.
Alternatively, for multi-asset portfolios, one can estimate the VaR for the individual assets and construct a vector of VaRs, V. Then estimate the correlations between the assets, construct the correlation matrix, C, and use the following formula to find the VaR of the portfolio of assets:
VaR_P = (V C V')^{1/2}
where
C = [ 1      ρ_12   …   ρ_1n
      ρ_21   1      …   ρ_2n
      …      …      …   …
      ρ_n1   ρ_n2   …   1    ]
Note that in this case you do not need to use weights, as the dollar values of the VaR for each asset in the portfolio take the weights into account automatically.
As an example consider a portfolio of 4 TD3 FFA positions of 5000t (1m, 2m, 3m and 4m to
maturity) on 17 March 2003, with the following 1% 10-day individual VaRs and correlation
matrix
                       1M        2M        3M        4M
Current Price (WS)     112.5     88.0      67.5      65
Volatility (SD)        113.6%    86.7%     74.4%     61.4%
1% 10-day VaR ($)      39,941    23,866    15,709    12,606

        1       0.724   0.593   0.426
C =     0.724   1       0.826   0.667
        0.593   0.826   1       0.774
        0.426   0.667   0.774   1

VaR_P = (V' C V)^{1/2} = $80,314.8
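The matrix calculation can be sketched in a few lines of Python (not part of the original notes); with the rounded inputs above it gives approximately $80,319, close to the $80,314.8 reported in the text.

import numpy as np

V = np.array([39_941, 23_866, 15_709, 12_606])    # individual 1% 10-day VaRs ($)
C = np.array([[1.000, 0.724, 0.593, 0.426],
              [0.724, 1.000, 0.826, 0.667],
              [0.593, 0.826, 1.000, 0.774],
              [0.426, 0.667, 0.774, 1.000]])

var_p = np.sqrt(V @ C @ V)                        # (V' C V)^(1/2)
print(round(var_p, 1))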
[Figure A]
[Figure B]
Figure C shows the relationship between the VaR, the CL and the HP in one diagram. It can be seen that the VaR surface changes as the CL and HP change.
[Figure C: VaR surface against confidence level (0.90–0.99) and holding period (1–16 days).]
Therefore, a better way of expressing the risk in terms of the probability of loss is to report the Expected Tail Loss (ETL). The ETL is the average of all the losses that we may make, given the α% probability that things go wrong in N days.
[Figure: the standard normal density with the 5% VaR (1.645) and the corresponding ETL (2.06) marked in the left tail.]
In order to calculate the ETL, we must find the average of all the losses in the α = 5% tail. This is done by slicing the tail into 10, 100 or more slices, finding the loss value (VaR) for each slice, and taking the average of these VaRs. Assuming 10 slices, the ETL value is given in the following table.
Confidence level   0.95    0.955   0.96    0.965   0.97    0.975   0.98    0.985   0.99    0.995
Tail VaR           1.645   1.695   1.751   1.812   1.881   1.960   2.054   2.170   2.326   2.576

Average of the tail VaRs (ETL, 10 slices) = 1.987
It can be seen that the accuracy of the ETL depends on the number of slices used in
calculating the average. The following table shows the relationship between number of
slices and ETL.
Number of tail slices   N=10     N=50     N=100    N=500    N=1000   N=5,000
ETL                     1.9870   2.0432   2.0521   2.0602   2.0614   2.0624
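For illustration only (a sketch, not part of the original notes), the slicing calculation above can be reproduced for a standard normal distribution as follows.

import numpy as np
from scipy.stats import norm

def etl_from_slices(alpha=0.05, n_slices=10):
    """Average the tail VaRs of a standard normal, slicing the alpha tail into n_slices."""
    levels = 1 - alpha + alpha * np.arange(n_slices) / n_slices   # 0.95, 0.955, ..., 0.995
    return norm.ppf(levels).mean()

for n in (10, 50, 100, 500, 1000, 5000):
    print(n, round(etl_from_slices(n_slices=n), 4))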
Although in practice the ETL is often used in the way described above, this is still not the most accurate way of calculating it. Because the VaR changes with the probability of loss, one should consider a weighted average ETL, that is, an ETL calculated using the tail probabilities (α's) as weights. Consider the same example.
This is shown in the following table, where the last column contains the weighted VaRs and the weighted average ETL of 1.844.
Confidence level   VaR     α       Weight (α/0.275)   Weighted VaR
0.95               1.645   0.050   0.182              0.299
0.955              1.695   0.045   0.164              0.277
0.96               1.751   0.040   0.145              0.255
0.965              1.812   0.035   0.127              0.231
0.97               1.881   0.030   0.109              0.205
0.975              1.960   0.025   0.091              0.178
0.98               2.054   0.020   0.073              0.149
0.985              2.170   0.015   0.055              0.118
0.99               2.326   0.010   0.036              0.085
0.995              2.576   0.005   0.018              0.047
Sum                        0.275   1.000

Simple average ETL = 1.987        Weighted average ETL = 1.844
The following sections discuss some of these methods, which are broadly classified into parametric and non-parametric approaches to VaR estimation.
7.9.1 Parametric VaR Estimation
The parametric approach in estimating the VaR explicitly assumes that returns follow a
defined parametric distribution, such as the Normal, Student-t or Generalised Error
Distribution, among others. Based on this approach, parametric models are used to estimate
the unconditional and conditional distribution of returns, which is then used to calculate VaR.
These methods are usually preferred and used frequently in estimating VaR because they are
simple to apply and produce relatively accurate VaR estimates (see for example Jorion; 1995
and 2002).
7.9.1.1 Sample Variance & Covariance Method
A simple and straightforward method of calculating VaR is to use the historical (constant) variances of, and covariances between, the return series, and find the difference between the mean and the α% percentile of the distribution of the asset or the portfolio; that is, VaR_{t+1} = Z_α σ_t. Based on this method, forecasts of the variances of returns are usually generated using a rolling window of a specified size, e.g. 1,000 data points. The variance-covariance method is a simple and fast method for estimating the VaR, but it is believed to be efficient only in the short term. The main disadvantage of this method is that it does not take into account the dynamics of the volatility of the underlying asset, as it applies equal weights to past observations in the variance calculation.
7.9.1.2 Exponential Weighted Averages variance or RiskMetrics Method
RiskMetrics uses a weighted average of the estimated volatility and the last price changes at
any point in time to estimate future volatility and VaR.1 The weighting factor, which
determines the decay in the importance of past observations, could be estimated from
historical data. However, usually it is set as constant between 0.9 and 0.98. J.P. Morgan
RiskMetrics, for instance, uses a weighting multiplier of 0.97 which is argued to be the
optimal decay factor in variance calculation. Thus the RiskMetrics exponentially weighted
average variance estimator can be obtained using the following equation.
σ²_{t+1} = λ σ²_t + (1 − λ) r²_t        (7.1)
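As a rough sketch (not part of the original notes), the recursion in (7.1) can be implemented in a few lines; the returns are hypothetical and λ is set to 0.97 as in the text.

import numpy as np

def ewma_variance(returns, lam=0.97, var0=None):
    """Exponentially weighted variance: var_{t+1} = lam*var_t + (1-lam)*r_t^2."""
    returns = np.asarray(returns, float)
    var = np.empty(len(returns) + 1)
    var[0] = returns.var() if var0 is None else var0   # seed with the sample variance
    for t, r in enumerate(returns):
        var[t + 1] = lam * var[t] + (1 - lam) * r ** 2
    return var

rng = np.random.default_rng(0)
r = rng.normal(0, 0.01, 500)                 # hypothetical daily returns
sigma_next = np.sqrt(ewma_variance(r)[-1])   # forecast of tomorrow's volatility
print(sigma_next, 2.33 * sigma_next)         # and the corresponding 1% one-day VaR (in return terms)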
It is obvious that the higher the decay factor, the longer the memory assigned to past observations. In the case of a portfolio of assets, the covariance and correlation of two assets, say X and Y, can be estimated, respectively, as:
1 For more details on this approach the reader is referred to J.P. Morgan's RiskMetrics Technical Manual (1995) as well as Chapter 3 of this book.
σ_{XY,t+1} = λ σ_{XY,t} + (1 − λ) r^X_t r^Y_t        (7.2)
ρ_{XY,t+1} = σ_{XY,t+1} / (σ_{X,t+1} σ_{Y,t+1})        (7.3)
2
1 t 1
2
t 1
(7.4)
their long run mean, then there is a higher likelihood for a price increase (Figure 7.1, Price
Path 2).
Figure 7.1: Mean Reversion and VaR Estimation
[Figure: two simulated price paths (Price Path 1 and Price Path 2) plotted against time, illustrating the reversion of prices towards their long run mean.]
Another advantage of the Monte Carlo simulation is that due to its flexibility it can be used to
estimate VaR of portfolios containing short-dated as well as nonlinear instruments such as
options and option portfolios. In addition, sensitivity analysis can be performed in a simple
way by changing market parameters. For instance, by changing variances and correlations of
a portfolio, we can assess the sensitivity of the portfolio and examine the effect of different
volatility regimes on VaR estimates. However, simulation techniques are highly dependent
on the accuracy and quality of the processes chosen for the behaviour of the underlying asset
prices, and their complexity and computational burden increases with the number of the
assets in the portfolio.
The steps in MCS for VaR calculation include:
Step 1: Specify the dynamics of the underlying processes.
Step 2: Generate N sample paths by sampling changes in asset prices over the
required horizon. A minimum of 10,000 simulations are typically necessary
for satisfactorily accurate estimates
Step 3: Evaluate the portfolio at the end of horizon for each generated sample path.
The N outcomes constitute the distribution of the portfolio values at the
required time horizon.
Step 4: Finally, the VaR can be estimated as the difference between the mean of the simulated distribution and the lower α% percentile of the ordered simulated outcomes at the point in time for which the VaR is considered; see, for instance, Figure 7.2 for the 1% VaR.
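A minimal sketch of these steps (not part of the original notes) for a one-day 5% VaR of a two-asset portfolio, assuming simple correlated normal returns rather than the mean-reverting processes used in the example that follows; the exposures, volatilities and correlation are those of the two FFA routes described below.

import numpy as np

values = np.array([1_620_000, 1_500_000])          # dollar exposures to the two assets
vols = np.array([0.0252, 0.0189])                  # daily volatilities
corr = np.array([[1.0, 0.8], [0.8, 1.0]])
cov = np.outer(vols, vols) * corr

rng = np.random.default_rng(0)
returns = rng.multivariate_normal(np.zeros(2), cov, size=10_000)   # Step 2: sample paths
pnl = returns @ values                                             # Step 3: portfolio outcomes
var_5 = pnl.mean() - np.percentile(pnl, 5)                         # Step 4: mean minus 5% percentile
print(round(var_5))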
[Figure 7.2: the simulated distribution of portfolio values, with the mean, the 1% tail and the 1% VaR marked.]
Consider portfolio A, consisting of two long FFA contracts on two different hypothetical shipping routes (Routes 1 and 2), and portfolio B, which consists of a long position in route 1 and a short position in route 2. The current value for route 1 is $30/mt for 54,000mt of cargo, with a daily volatility of 2.52%, while the current value for route 2 is $10/mt for 150,000mt of cargo, with a daily volatility of 1.89%. The long run mean freight rates for routes 1 and 2 are $25/mt and $8/mt, respectively, while the estimated historical correlation between the two routes is 0.8. In addition, it is assumed that freight rates in both routes are mean reverting, with mean reversion rates of 0.33 and 0.4 for routes one and two, respectively.² The estimated one-day 5% VaR of the two portfolios using the variance-covariance method would be:
Route 1: VaR^{5%}_{1d} = 1.645 × σ_daily × position value = 1.645 × 2.52% × $1.62m = $67,143
Route 2: VaR^{5%}_{1d} = 1.645 × 1.89% × $1.50m = $46,627
and therefore
2 A mean reverting process is defined as a process in which prices tend to revert back to a long run mean. A discrete version of a bivariate mean reverting process can be written as
Δs_{1,t} = [ α_1(μ_1 − s_{1,t}) − ½σ_1² ] Δt + σ_1 ε_{1,t} √Δt
Δs_{2,t} = [ α_2(μ_2 − s_{2,t}) − ½σ_2² ] Δt + σ_2 ε_{2,t} √Δt ,        ε_t ~ N(0, Σ)
where s_{1,t} and s_{2,t} are the logs of the asset prices and Δs_{1,t} and Δs_{2,t} are the log price changes at time t; μ_1 and μ_2 are the log price levels to which the prices of assets 1 and 2 revert over the long run, respectively; α_1 and α_2 are the coefficients of mean reversion, measuring the speed at which prices revert to their mean; σ_1 and σ_2 are the standard deviations of prices; and ε_t is a (2x1) vector of stochastic terms following a bivariate normal distribution with zero mean and variance-covariance matrix
Σ = [ σ_1²      σ_{1,2}
      σ_{1,2}   σ_2²    ]
Portfolio A: VaR^{5%}_{1d} = ( [67,143  46,627] [1  0.8; 0.8  1] [67,143; 46,627] )^{1/2} = $108,127
Portfolio B: VaR^{5%}_{1d} = ( [67,143  −46,627] [1  0.8; 0.8  1] [67,143; −46,627] )^{1/2} = $40,905
Using Monte Carlo simulation, we estimate the 5% VaR for each route, as well as for portfolios A and B, over different time horizons, namely 1, 10, 20 and 40 days. Panel A of Table 7.1 presents the parameters of the underlying routes, whereas Panel B presents the estimated VaRs for both the individual routes and the two portfolios. It can be seen that the VaRs estimated for portfolios A and B through MCS are slightly lower than those estimated using the simple variance-covariance method. This is because the MCS incorporates the assumed mean reversion property of the freight rate processes and, as a result, the estimated portfolio VaRs are marginally less than those estimated through the simple variance-covariance method. Although the difference in VaRs is relatively small for short horizons, it increases as we consider longer VaR periods, consistent with the fact that the impact of mean reversion increases over longer periods; for instance, the estimated 5% 40-day VaR for portfolio A using MCS is $601,538, while the estimated 5% 40-day VaR for portfolio A using the variance-covariance method is $683,854 (= $108,127 × √40).
Table 7.1: VaR calculation using Monte Carlo Simulation

Panel A: Assumptions
Route   Maturity   Rate ($/t)   Contract Size (tonnes)   Position Value              Daily Vol   Mean Reversion Rate   Long Run Mean
1       60-days    30           54,000                   1.620m (=30$ x 54,000t)     2.52%       0.33                  $25/t
2       60-days    10           150,000                  1.500m (=10$ x 150,000t)    1.89%       0.40                  $8/t
Correlation between the two routes: 0.8

Panel B: Estimated 5% VaR
               Route 1      Portfolio A    Portfolio B
1-day VaR      $66,121      $105,428       $41,106
10-day VaR     $201,509     $333,136       $125,630
20-day VaR     $279,375     $451,969       $175,951
40-day VaR     $372,858     $601,538       $245,969