Chapter 8 Solutions
- Frequency: Stock market prices are measured every time there is a trade or
somebody posts a new quote, so the frequency of the data is often very high.
Of these, we can allow for the non-stationarity within the linear (ARIMA)
framework, and we can use whatever frequency of data we like to form the
models, but we cannot hope to capture the other features using a linear model
with Gaussian disturbances.
(b) GARCH models are designed to capture the volatility clustering effects in
the returns (GARCH(1,1) can model the dependence in the squared returns, or
squared residuals), and they can also capture some of the unconditional
leptokurtosis, so that even if the residuals of a linear model of the form given
by the first part of the equation in part (e), the $u_t$'s, are leptokurtic, the
standardised residuals from the GARCH estimation are likely to be less
leptokurtic. Standard GARCH models cannot, however, account for leverage
effects.
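To illustrate the first of these points, here is a minimal simulation sketch in Python (the parameter values are hypothetical, not estimates from this exercise): a GARCH(1,1) process driven by Gaussian innovations produces disturbances that are unconditionally leptokurtic, while the corresponding standardised residuals are approximately normal.

```python
import numpy as np
from scipy.stats import kurtosis

# Simulate a GARCH(1,1) process with Gaussian innovations.
# alpha0, alpha1 and beta are hypothetical values chosen for illustration only.
rng = np.random.default_rng(0)
alpha0, alpha1, beta = 0.05, 0.10, 0.85
T = 100_000
z = rng.standard_normal(T)
u = np.zeros(T)
h = np.zeros(T)
h[0] = alpha0 / (1 - alpha1 - beta)          # start at the unconditional variance
u[0] = np.sqrt(h[0]) * z[0]
for t in range(1, T):
    h[t] = alpha0 + alpha1 * u[t - 1] ** 2 + beta * h[t - 1]
    u[t] = np.sqrt(h[t]) * z[t]

print(kurtosis(u, fisher=True))               # excess kurtosis > 0: fat-tailed disturbances
print(kurtosis(u / np.sqrt(h), fisher=True))  # roughly 0: standardised residuals are ~normal
```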
- How do we decide on q?
GARCH(1,1) goes some way to get around these. The GARCH(1,1) model has
only three parameters in the conditional variance equation, compared to q+1
for the ARCH(q) model, so it is more parsimonious. Since there are fewer
parameters to estimate, the model is also less likely to violate the
non-negativity constraints on the conditional variance.
(d) There are a number of models that you could choose from; the relevant ones
discussed in Chapter 8 include EGARCH, GJR and GARCH-M.
The first two of these are designed to capture leverage effects. These are
asymmetries in the response of volatility to positive or negative returns. The
standard GARCH model cannot capture these, since we are squaring the
lagged error term, and we are therefore losing its sign.
The conditional variance equations for the EGARCH and GJR models are,
respectively,
$$\ln(h_t) = \omega + \beta \ln(h_{t-1}) + \gamma \frac{u_{t-1}}{\sqrt{h_{t-1}}} + \alpha \left[ \frac{|u_{t-1}|}{\sqrt{h_{t-1}}} - \sqrt{\frac{2}{\pi}} \right]$$
and
$$h_t = \alpha_0 + \alpha_1 u_{t-1}^2 + \beta h_{t-1} + \gamma u_{t-1}^2 I_{t-1}, \quad \text{where } I_{t-1} = 1 \text{ if } u_{t-1} < 0 \text{ and } I_{t-1} = 0 \text{ otherwise.}$$
The EGARCH model also has the added benefit that the model is expressed in
terms of the log of $h_t$, so that even if the parameters are negative, the
conditional variance will always be positive. We do not therefore have to
artificially impose non-negativity constraints.
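As a rough illustration (the parameter values below are hypothetical, not the estimates from this exercise), the following Python sketch evaluates one step of each conditional variance equation for a positive and a negative lagged shock of the same size, showing the asymmetric response and the automatic positivity of the EGARCH variance.

```python
import math

# Illustrative (hypothetical) parameter values -- not estimates from the chapter.
alpha0, alpha1, beta, gamma = 0.05, 0.10, 0.85, 0.08    # GJR parameters
omega, b_e, g_e, a_e = -0.1, 0.95, -0.05, 0.15          # EGARCH parameters

def gjr_variance(u_lag, h_lag):
    """GJR: h_t = alpha0 + alpha1*u_{t-1}^2 + beta*h_{t-1} + gamma*u_{t-1}^2*I(u_{t-1}<0)."""
    indicator = 1.0 if u_lag < 0 else 0.0
    return alpha0 + alpha1 * u_lag**2 + beta * h_lag + gamma * u_lag**2 * indicator

def egarch_variance(u_lag, h_lag):
    """EGARCH: ln(h_t) = omega + beta*ln(h_{t-1}) + gamma*u_{t-1}/sqrt(h_{t-1})
                         + alpha*(|u_{t-1}|/sqrt(h_{t-1}) - sqrt(2/pi))."""
    z = u_lag / math.sqrt(h_lag)
    log_h = omega + b_e * math.log(h_lag) + g_e * z + a_e * (abs(z) - math.sqrt(2 / math.pi))
    return math.exp(log_h)   # exponentiating guarantees a positive variance

h_lag = 1.0
for shock in (-1.0, 1.0):
    # a negative shock raises the variance by more than an equal positive shock
    print(shock, gjr_variance(shock, h_lag), egarch_variance(shock, h_lag))
```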
(e) Since $y_t$ are returns, we would expect their mean value (which will be
given by the intercept in the mean equation) to be positive and small. We are
not told the frequency of the data, but suppose that we had a year of daily
returns data; then the intercept would be the average daily percentage return
over the year, which might be, say, 0.05.
(f) Since the model was estimated using maximum likelihood, it does not seem
natural to test this restriction using the F-test via comparisons of residual
sums of squares (and a t-test cannot be used since it is a test involving more
than one coefficient). Thus we should use one of the approaches to hypothesis
testing based on the principles of maximum likelihood (Wald, Lagrange
Multiplier, Likelihood Ratio). The easiest one to use would be the likelihood
ratio test, which would be computed as
$$LR = -2(L_r - L_u) \sim \chi^2(m),$$
where $L_r$ and $L_u$ are the values of the LLF for the restricted and
unrestricted models respectively, and $m$ denotes the number of restrictions,
which in this case is one. If the value of the test statistic is greater than
the $\chi^2(1)$ critical value, reject the null hypothesis that the
restrictions are valid.
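As a sketch of the calculation (the log-likelihood values below are hypothetical placeholders, to be replaced by the LLFs from the actual estimations):

```python
from scipy.stats import chi2

# Hypothetical log-likelihood values -- replace with the LLFs from the estimated models.
L_u = -1242.3   # unrestricted model
L_r = -1245.1   # restricted model (restriction imposed)
m = 1           # number of restrictions

LR = -2.0 * (L_r - L_u)               # likelihood ratio statistic
critical_value = chi2.ppf(0.95, m)    # 5% chi-squared critical value
p_value = 1 - chi2.cdf(LR, m)

print(LR, critical_value, p_value)    # reject the null if LR > critical value
```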
(g) Let $h^f_{1,T}$ be the one-step-ahead forecast for $h$ made at time $T$.
This is easy to calculate since, at time $T$, we know the values of all the
terms on the RHS. Given $h^f_{1,T}$, how do we calculate $h^f_{2,T}$, that is,
the two-step-ahead forecast for $h$ made at time $T$?
Recall that the conditional variance is the expectation of the squared
disturbance: $h_t = E[(u_t)^2]$. Turning this argument around, and applying it
to the problem that we have,
$$E_T[(u_{T+1})^2] = h_{T+1},$$
but we do not know $h_{T+1}$, so we replace it with $h^f_{1,T}$, so that the
two-step-ahead forecast becomes
$$h^f_{2,T} = \alpha_0 + \alpha_1 h^f_{1,T} + \beta h^f_{1,T} = \alpha_0 + (\alpha_1 + \beta)\, h^f_{1,T}.$$
And so on. This is the method we could use to forecast the conditional
variance of yt. If yt were, say, daily returns on the FTSE, we could use these
volatility forecasts as an input in the Black Scholes equation to help determine
the appropriate price of FTSE index options.
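A minimal Python sketch of this recursion, using hypothetical values for $\alpha_0$, $\alpha_1$, $\beta$ and the one-step-ahead forecast $h^f_{1,T}$:

```python
# Iterate the GARCH(1,1) variance forecasts forward.
# alpha0, alpha1, beta and h1 (the one-step-ahead forecast) are hypothetical values.
alpha0, alpha1, beta = 0.05, 0.10, 0.85
h1 = 1.2

def garch_forecasts(alpha0, alpha1, beta, h1, horizon):
    """Return [h^f_{1,T}, h^f_{2,T}, ..., h^f_{horizon,T}] using
    h^f_{s,T} = alpha0 + (alpha1 + beta) * h^f_{s-1,T} for s >= 2."""
    forecasts = [h1]
    for _ in range(horizon - 1):
        forecasts.append(alpha0 + (alpha1 + beta) * forecasts[-1])
    return forecasts

print(garch_forecasts(alpha0, alpha1, beta, h1, horizon=10))
```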
(h) An $s$-step-ahead forecast for the conditional variance could be written
$$h^f_{s,T} = \alpha_0 \sum_{i=0}^{s-2} (\alpha_1 + \beta)^i + (\alpha_1 + \beta)^{s-1} h^f_{1,T}, \qquad s \ge 2.$$
For the new parameter value, the persistence of shocks to the conditional
variance, given by $(\alpha_1 + \beta)$, is $0.1251 + 0.98 = 1.1051$, which is
bigger than 1. It is obvious from the expression above that, in this case, the
conditional variance forecasts will grow without bound as the forecast horizon
increases.
For $(\alpha_1 + \beta) < 1$, the forecasts will converge on the unconditional
variance as the forecast horizon increases. For $(\alpha_1 + \beta) = 1$, known
as "integrated GARCH" or IGARCH, there is a unit root in the conditional
variance, and the forecasts will stay constant as the forecast horizon
increases.
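The closed-form $s$-step-ahead forecast above makes the role of the persistence explicit. The short Python sketch below (with hypothetical values for $\alpha_0$ and $h^f_{1,T}$) shows the forecasts converging when the persistence is below one and growing without bound when it equals 1.1051, as in this case.

```python
# Closed-form s-step-ahead GARCH(1,1) forecast, following from iterating the recursion.
def s_step_forecast(alpha0, persistence, h1, s):
    """h^f_{s,T} = alpha0 * sum_{i=0}^{s-2} persistence^i + persistence^(s-1) * h1."""
    return alpha0 * sum(persistence**i for i in range(s - 1)) + persistence**(s - 1) * h1

alpha0, h1 = 0.05, 1.2                       # hypothetical values for illustration
for persistence in (0.95, 1.1051):           # stationary case vs. the (0.1251 + 0.98) case
    print(persistence, [round(s_step_forecast(alpha0, persistence, h1, s), 3)
                        for s in (1, 5, 20, 100)])
```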
2. (a) Maximum likelihood works by finding the most likely values of the
parameters given the actual data. More specifically, a log-likelihood function
is formed, usually based upon a normality assumption for the disturbance
terms, and the values of the parameters that maximise it are sought.
Maximum likelihood estimation can be employed to find parameter values for
both linear and non-linear models.
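A minimal sketch of the idea in Python (simulated data and a simple linear model, purely for illustration): form the Gaussian log-likelihood and hand its negative to a numerical optimiser.

```python
import numpy as np
from scipy.optimize import minimize

# Maximum likelihood estimation of y_t = beta0 + beta1*x_t + u_t, u_t ~ N(0, sigma^2),
# by numerically maximising the Gaussian log-likelihood (minimising its negative).
rng = np.random.default_rng(0)
x = rng.normal(size=500)
y = 0.5 + 2.0 * x + rng.normal(scale=1.5, size=500)   # simulated data

def neg_log_likelihood(params):
    beta0, beta1, log_sigma = params          # log-parameterise sigma to keep it positive
    sigma = np.exp(log_sigma)
    resid = y - beta0 - beta1 * x
    return -np.sum(-0.5 * np.log(2 * np.pi * sigma**2) - resid**2 / (2 * sigma**2))

result = minimize(neg_log_likelihood, x0=np.array([0.0, 0.0, 0.0]), method="BFGS")
beta0_hat, beta1_hat, sigma_hat = result.x[0], result.x[1], np.exp(result.x[2])
print(beta0_hat, beta1_hat, sigma_hat)        # close to the true values 0.5, 2.0, 1.5
```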
(b) The three hypothesis testing procedures available within the maximum
likelihood approach are Lagrange multiplier (LM), likelihood ratio (LR) and
Wald tests. The differences between them are described in Figure 8.4, and are
not defined again here. The Lagrange multiplier test involves estimation only
under the null hypothesis, the likelihood ratio test involves estimation under
both the null and the alternative hypothesis, while the Wald test involves
estimation only under the alternative. Given this, it should be evident that the
LM test will in many cases be the simplest to compute since the restrictions
implied by the null hypothesis will usually lead to some terms cancelling out
to give a simplified model relative to the unrestricted model.
(c) OLS will give identical parameter estimates for all of the intercept and
slope parameters, but will give a slightly different parameter estimate for the
variance of the disturbances. These are shown in the Appendix to Chapter 8.
The difference in the OLS and maximum likelihood estimators for the
variance of the disturbances can be seen by comparing the divisors of
equations (8A.25) and (8A.26).
(In both cases the quantity being estimated is $\mathrm{Var}(u_t) = \sigma^2$.)
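Concretely, the familiar form of this result (stated here in standard notation rather than quoting equations (8A.25) and (8A.26) directly) is
$$\hat{\sigma}^2_{ML} = \frac{1}{T}\sum_{t=1}^{T}\hat{u}_t^2, \qquad \hat{\sigma}^2_{OLS} = \frac{1}{T-k}\sum_{t=1}^{T}\hat{u}_t^2,$$
where $T$ is the number of observations and $k$ the number of estimated mean-equation parameters, so the maximum likelihood estimator is slightly biased downwards in small samples although the two coincide asymptotically.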
(c) There are of course a large number of competing methods for measuring
and forecasting volatility, and it is worth stating at the outset that no research
has suggested that one method is universally superior to all others, so that
each method has its merits and may work well in certain circumstances.
Historical measures of volatility are just simple average measures – for
example, the standard deviation of daily returns over a 3-year period. As such,
they are the simplest to calculate, but suffer from a number of shortcomings.
First, since the observations are unweighted, historical volatility can be slow
to respond to changing market circumstances, and would not take advantage
of short-term persistence in volatility that could lead to more accurate short-
term forecasts. Second, if there is an extreme event (e.g. a market crash), this
will lead the measured volatility to be high for a number of observations equal
to the measurement sample length. For example, suppose that volatility is
being measured using a 1-year (250-day) sample of returns, which is being
rolled forward one observation at a time to produce a series of 1-step ahead
volatility forecasts. If a market crash occurs on day t, this will increase the
measured level of volatility by the same amount right until day t+250 (i.e. it
will not decay away) and then it will disappear completely from the sample so
that measured volatility will fall abruptly. Exponential weighting of
observations as the EWMA model does, where the weight attached to each
observation in the calculation of volatility declines exponentially as the
observations go further back in time, will resolve both of these issues.
However, if forecasts are produced from an EWMA model, these forecasts will
not converge upon the long-term mean volatility estimate as the prediction
horizon increases.
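A minimal Python sketch of the contrast (simulated, hypothetical data; the decay factor $\lambda = 0.94$ is the value popularised by RiskMetrics): the unweighted historical standard deviation remains inflated by a single extreme observation for as long as it stays in the window, whereas the exponentially weighted estimate lets its influence decay.

```python
import numpy as np

# Contrast an equally weighted "historical" volatility estimate with an
# exponentially weighted (EWMA) one, using simulated data with one extreme observation.
rng = np.random.default_rng(1)
returns = rng.normal(scale=1.0, size=250)    # pretend daily % returns over one year
returns[100] = -15.0                         # an extreme "crash" observation

# Historical volatility: unweighted standard deviation over the whole window.
hist_vol = np.std(returns, ddof=1)

# EWMA variance with decay lambda = 0.94.
lam = 0.94
ewma_var = returns[0] ** 2
for r in returns[1:]:
    ewma_var = lam * ewma_var + (1 - lam) * r ** 2

print(hist_vol, np.sqrt(ewma_var))           # the EWMA estimate has largely "forgotten" the crash
```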
(b) One of two procedures could be used. Either the daily returns data would
be transformed into weekly returns data by adding up the returns over all of
the trading days in each week, or the model would be estimated using the daily
data. Daily forecasts would then be produced up to 10 days (2 trading weeks)
ahead.
In both cases, the models would be estimated, and forecasts made of the
conditional variance and conditional covariance. If daily data were used to
estimate the model, the conditional covariance forecasts for
the 5 trading days in a week would be added together to form a covariance
forecast for that week, and similarly for the variance. If the returns had been
aggregated to the weekly frequency, the forecasts used would simply be 1-step
ahead.
Finally, the conditional covariance forecast for the week would be divided by
the product of the square roots of the conditional variance forecasts to obtain
a correlation forecast.
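A minimal sketch of the aggregation step, with hypothetical daily forecast numbers:

```python
import numpy as np

# Turn hypothetical daily conditional variance and covariance forecasts for the
# next trading week into a weekly correlation forecast, as described above.
var_x = np.array([1.10, 1.08, 1.05, 1.03, 1.02])    # daily variance forecasts, asset x
var_y = np.array([0.90, 0.92, 0.93, 0.94, 0.95])    # daily variance forecasts, asset y
cov_xy = np.array([0.40, 0.41, 0.42, 0.42, 0.43])   # daily covariance forecasts

weekly_var_x = var_x.sum()       # variances and covariances aggregate by summing over the week
weekly_var_y = var_y.sum()
weekly_cov = cov_xy.sum()

weekly_corr = weekly_cov / np.sqrt(weekly_var_x * weekly_var_y)
print(weekly_corr)
```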
(d) The simple historical approach is obviously the simplest to calculate, but
has two main drawbacks. First, it does not weight information: so any
observations within the sample will be given equal weight, while those outside
the sample will automatically be given a weight of zero. Second, any extreme
observations in the sample will have an equal effect until they abruptly drop
out of the measurement period. For example, suppose that one year of daily
data is used to estimate volatility. If the sample is rolled through one day at
a time, an extreme observation such as a crash will continue to affect the
measured volatility with full weight until it drops out of the sample exactly
one year later, at which point measured volatility will fall abruptly.
Finally, implied correlations may at first blush appear to be the best method
for calculating correlation forecasts accurately, for they rely on information
obtained from the market itself. After all, who should know better about future
correlations in the markets than the people who work in those markets?
However, market-based measures of volatility and correlation are sometimes
surprisingly inaccurate, and are also sometimes difficult to obtain. Most
fundamentally, correlation forecasts will only be available where there is an
option traded whose payoffs depend on the prices of two underlying assets.
For all other situations, a market-based correlation forecast will simply not be
available.
[Figure: news impact curve, showing the value of the conditional variance
(vertical axis, 0 to 0.16) against the value of the lagged shock (horizontal
axis, -1 to +1).]
This graph is a bit of an odd one, in the sense that the conditional variance is
always lower for the EGARCH model. This may suggest estimation error in
one of the models. There is some evidence for asymmetries in the case of the
EGARCH model since the value of the conditional variance is 0.1 for a shock
of 1 and 0.12 for a shock of –1.
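For reference, a news impact curve such as the one summarised above can be traced out by holding the lagged conditional variance fixed and varying the lagged shock. The Python sketch below does this for a symmetric GARCH(1,1) and an EGARCH specification with purely illustrative parameter values (not the estimates behind the figure).

```python
import math

# Trace out news impact curves: fix the lagged variance, vary the lagged shock.
# Parameter values are purely illustrative.
alpha0, alpha1, beta = 0.02, 0.10, 0.85                  # GARCH(1,1)
omega, b_e, g_e, a_e = -0.20, 0.95, -0.10, 0.15          # EGARCH

h_lag = 0.05                                             # fixed lagged conditional variance
for u_lag in [x / 10 for x in range(-10, 11)]:           # shocks from -1 to +1
    h_garch = alpha0 + alpha1 * u_lag**2 + beta * h_lag  # symmetric in the shock
    z = u_lag / math.sqrt(h_lag)
    h_egarch = math.exp(omega + b_e * math.log(h_lag)
                        + g_e * z + a_e * (abs(z) - math.sqrt(2 / math.pi)))
    print(round(u_lag, 1), round(h_garch, 4), round(h_egarch, 4))
```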
(b) This is a tricky one. The leverage effect is used to rationalise a finding of
asymmetries in equity returns, but such an argument cannot be applied to
foreign exchange returns, since the concept of a Debt/Equity ratio has no
meaning in that context.
On the other hand, there is equally no reason to suppose that there are no
asymmetries in the case of fx data. The data used here were daily USD_GBP
returns for 1974-1994. It might be the case, for example, that news relating to
one country has a different impact on volatility from equally good or bad news
relating to another. To offer one illustration, it might be the case that the bad
news for the currently weak euro has a bigger impact on volatility than news
about the currently strong dollar. This would lead to asymmetries in the news
impact curve. Finally, it is also worth noting that the asymmetry term in the
EGARCH model is not statistically significant in this case.