Modelling Volatility and Correlation
• Models with nonlinear $g(\cdot)$ are “non-linear in mean”, while those with
nonlinear $\sigma^2(\cdot)$ are “non-linear in variance”.
• Here the dependent variable is the residual series and the independent
variables are the squares, cubes, …, of the fitted values.
• Many other non-linearity tests are available, e.g. the BDS and
bispectrum tests
• BDS is a pure hypothesis test. That is, it has as its null hypothesis that
the data are pure noise (completely random)
• It has been argued to have power to detect a variety of departures from
randomness (linear or non-linear stochastic processes, deterministic
chaos, etc.)
• The BDS test statistic asymptotically follows a standard normal distribution under the null
• The test can also be used as a model diagnostic on the residuals to ‘see
what is left’
• If the proposed model is adequate, the standardised residuals should be
white noise.
‘Introductory Econometrics for Finance’ © Chris Brooks 2013 7
Chaos Theory
• Neural networks are not very popular in finance and suffer from
several problems:
– The coefficient estimates from neural networks do not have any real
theoretical interpretation
– Virtually no diagnostic or specification tests are available for estimated
models
– They can provide excellent fits in-sample to a given set of ‘training’ data,
but typically provide poor out-of-sample forecast accuracy
– This usually arises from the tendency of neural networks to fit closely to
sample-specific data features and ‘noise’, and so they cannot ‘generalise’
– The non-linear estimation of neural network models can be cumbersome
and computationally time-intensive, particularly, for example, if the model
must be estimated repeatedly when rolling through a sample.
• Modelling and forecasting stock market volatility has been the subject
of vast empirical and theoretical investigation
• There are a number of motivations for this line of inquiry:
– Volatility is one of the most important concepts in finance
– Volatility, as measured by the standard deviation or variance of returns, is
often used as a crude measure of the total risk of financial assets
– Many value-at-risk models for measuring market risk require the
estimation or forecast of a volatility parameter
– The volatility of stock market prices also enters directly into the Black–
Scholes formula for deriving the prices of traded options
• We will now examine several volatility models.
• Is the variance of the errors likely to be constant over time? Not for
financial data.
Autoregressive Conditionally Heteroscedastic
(ARCH) Models
• So use a model which does not assume that the variance is constant.
• Recall the definition of the variance of $u_t$:
$$\sigma_t^2 = \mathrm{Var}(u_t \mid u_{t-1}, u_{t-2}, \ldots) = E\big[(u_t - E(u_t))^2 \mid u_{t-1}, u_{t-2}, \ldots\big]$$
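• Instead of calling the variance $\sigma_t^2$, in the literature it is usually called $h_t$,
so the model is
$$y_t = \beta_1 + \beta_2 x_{2t} + \ldots + \beta_k x_{kt} + u_t, \quad u_t \sim N(0, h_t)$$
where $h_t = \alpha_0 + \alpha_1 u_{t-1}^2 + \alpha_2 u_{t-2}^2 + \ldots + \alpha_q u_{t-q}^2$
• Equivalently, for the ARCH(1) case, the errors can be written as
$$u_t = v_t \sigma_t, \quad \sigma_t = \sqrt{\alpha_0 + \alpha_1 u_{t-1}^2}, \quad v_t \sim N(0, 1)$$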
• The two are different ways of expressing exactly the same model. The
first form is easier to understand while the second form is required for
simulating from an ARCH model, for example.
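The second form lends itself directly to simulation. The following is a minimal sketch for the ARCH(1) case; the function name, the parameter values and the zero start-up value for the lagged shock are illustrative assumptions, not part of the slides.

```python
import math
import random

def simulate_arch1(alpha0, alpha1, n, seed=0):
    """Simulate u_t = v_t * sigma_t with sigma_t^2 = alpha0 + alpha1 * u_{t-1}^2."""
    rng = random.Random(seed)
    u_prev = 0.0  # assumed start-up value for the lagged shock
    shocks, variances = [], []
    for _ in range(n):
        sigma2 = alpha0 + alpha1 * u_prev ** 2  # conditional variance h_t
        v = rng.gauss(0.0, 1.0)                 # v_t ~ N(0, 1)
        u = v * math.sqrt(sigma2)               # u_t = v_t * sigma_t
        shocks.append(u)
        variances.append(sigma2)
        u_prev = u
    return shocks, variances

# Hypothetical parameter values, chosen only for illustration
shocks, variances = simulate_arch1(alpha0=0.1, alpha1=0.5, n=1000)
```

Because each conditional variance depends on the previous squared shock, large shocks tend to be followed by further large shocks of either sign.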
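1. First, run any postulated linear regression of the form given in the equation
above, e.g. $y_t = \beta_1 + \beta_2 x_{2t} + \ldots + \beta_k x_{kt} + u_t$,
saving the residuals, $\hat{u}_t$.
2. Then square the residuals, and regress them on q own lags to test for ARCH
of order q, i.e. run the regression
$$\hat{u}_t^2 = \gamma_0 + \gamma_1 \hat{u}_{t-1}^2 + \gamma_2 \hat{u}_{t-2}^2 + \ldots + \gamma_q \hat{u}_{t-q}^2 + v_t$$
where $v_t$ is iid.
Obtain $R^2$ from this regression.
3. The test statistic is $TR^2$ (the number of observations multiplied by the $R^2$
from the last regression), which is distributed as a $\chi^2(q)$ under the null
hypothesis that all q lag coefficients are zero.
4. If the value of the test statistic is greater than the critical value from the
$\chi^2$ distribution, then reject the null hypothesis.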
• Note that the ARCH test is also sometimes applied directly to returns
instead of the residuals from Stage 1 above.
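The test can be sketched in a few lines for the simplest case, q = 1, where the $R^2$ of the auxiliary regression is just the squared correlation between $\hat{u}_t^2$ and $\hat{u}_{t-1}^2$. The function name and the toy residual series are assumptions made for illustration; a general-q version would need a multiple regression.

```python
def arch_lm_test_q1(residuals):
    """Return T * R^2 from regressing u_t^2 on a constant and u_{t-1}^2 (q = 1)."""
    sq = [u * u for u in residuals]
    y, x = sq[1:], sq[:-1]  # squared residuals and their first lag
    n = len(y)              # observations used in the auxiliary regression
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    r_squared = sxy ** 2 / (sxx * syy)  # R^2 of a simple regression
    return n * r_squared

CHI2_1_CRIT_5PCT = 3.84  # 5% critical value of chi-squared(1)

# Toy residuals whose squares are an exact linear function of their own lag
toy_residuals = [0.1, -0.2, 0.4, -0.8, 1.6, -3.2]
stat = arch_lm_test_q1(toy_residuals)
reject_null = stat > CHI2_1_CRIT_5PCT
```

In the toy series each squared residual is exactly four times the previous one, so the auxiliary regression fits perfectly and the null of no ARCH effects is rejected.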
• How do we decide on q?
• The required value of q might be very large
• Non-negativity constraints might be violated.
– When we estimate an ARCH model, we require $\alpha_i > 0 \;\forall\; i = 1, 2, \ldots, q$
(since variance cannot be negative)
• Since the model is no longer of the usual linear form, we cannot use
OLS.
• The method works by finding the most likely values of the parameters
given the actual data.
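1. Specify the appropriate equations for the mean and the variance - e.g. an
AR(1)-GARCH(1,1) model:
$$y_t = \mu + \phi y_{t-1} + u_t, \quad u_t \sim N(0, \sigma_t^2)$$
$$\sigma_t^2 = \alpha_0 + \alpha_1 u_{t-1}^2 + \beta \sigma_{t-1}^2$$
2. Specify the log-likelihood function to maximise:
$$L = -\frac{T}{2}\log(2\pi) - \frac{1}{2}\sum_{t=1}^{T}\log(\sigma_t^2) - \frac{1}{2}\sum_{t=1}^{T}\frac{(y_t - \mu - \phi y_{t-1})^2}{\sigma_t^2}$$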
3. The computer will maximise the function and give parameter values and
their standard errors
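As a rough illustration of step 2, the log-likelihood of an AR(1)-GARCH(1,1) model can be evaluated recursively for a given parameter vector. This is only a sketch: the function name, the conditioning on the first observation and the start-up values for the variance recursion are assumptions, and estimation software will typically make different choices.

```python
import math

def ar1_garch11_loglik(params, y):
    """Gaussian log-likelihood of an AR(1)-GARCH(1,1) model, conditional on y[0]."""
    mu, phi, alpha0, alpha1, beta = params
    resid = [y[t] - mu - phi * y[t - 1] for t in range(1, len(y))]
    sigma2 = sum(u * u for u in resid) / len(resid)  # assumed start-up variance
    u_prev = 0.0                                     # assumed start-up shock
    loglik = 0.0
    for u in resid:
        # Conditional variance recursion for the current period
        sigma2 = alpha0 + alpha1 * u_prev ** 2 + beta * sigma2
        # Contribution of one observation to the log-likelihood
        loglik -= 0.5 * (math.log(2 * math.pi) + math.log(sigma2) + u * u / sigma2)
        u_prev = u
    return loglik

# With a constant unit variance and zero residuals, each term is -0.5 * log(2*pi)
check = ar1_garch11_loglik((0.0, 0.0, 1.0, 0.0, 0.0), [0.0, 0.0, 0.0])
```

A numerical optimiser would call a function of this kind repeatedly, varying the parameters until the value stops increasing.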
Parameter Estimation using Maximum Likelihood
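• Then the joint pdf for all the y’s can be expressed as a product of the
individual density functions
$$f(y_1, y_2, \ldots, y_T \mid \beta_1 + \beta_2 X_t, \sigma^2) = f(y_1 \mid \beta_1 + \beta_2 X_1, \sigma^2)\, f(y_2 \mid \beta_1 + \beta_2 X_2, \sigma^2) \cdots f(y_T \mid \beta_1 + \beta_2 X_T, \sigma^2) = \prod_{t=1}^{T} f(y_t \mid \beta_1 + \beta_2 X_t, \sigma^2) \quad (2)$$
• The likelihood function is
$$LF(\beta_1, \beta_2, \sigma^2) = \frac{1}{\sigma^T (\sqrt{2\pi})^T} \exp\left\{-\frac{1}{2}\sum_{t=1}^{T}\frac{(y_t - \beta_1 - \beta_2 x_t)^2}{\sigma^2}\right\} \quad (4)$$
• We want to differentiate (4) w.r.t. $\beta_1$, $\beta_2$, $\sigma^2$, but (4) is a product containing
T terms.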
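• From (6),
$$\begin{aligned}
\sum (y_t - \hat\beta_1 - \hat\beta_2 x_t) &= 0 \\
\sum y_t - \sum \hat\beta_1 - \hat\beta_2 \sum x_t &= 0 \\
\sum y_t - T\hat\beta_1 - \hat\beta_2 \sum x_t &= 0 \\
\frac{1}{T}\sum y_t - \hat\beta_1 - \hat\beta_2 \frac{1}{T}\sum x_t &= 0 \\
\hat\beta_1 &= \bar{y} - \hat\beta_2 \bar{x} \quad (9)
\end{aligned}$$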
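• From (7),
$$\begin{aligned}
\sum (y_t - \hat\beta_1 - \hat\beta_2 x_t)\, x_t &= 0 \\
\sum y_t x_t - \sum \hat\beta_1 x_t - \sum \hat\beta_2 x_t^2 &= 0 \\
\sum y_t x_t - \hat\beta_1 \sum x_t - \hat\beta_2 \sum x_t^2 &= 0 \\
\hat\beta_2 \sum x_t^2 &= \sum y_t x_t - (\bar{y} - \hat\beta_2 \bar{x}) \sum x_t \\
\hat\beta_2 \sum x_t^2 &= \sum y_t x_t - T\bar{x}\bar{y} + \hat\beta_2 T \bar{x}^2 \\
\hat\beta_2 &= \frac{\sum y_t x_t - T\bar{x}\bar{y}}{\sum x_t^2 - T\bar{x}^2} \quad (10)
\end{aligned}$$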
• From (8),
$$\frac{T}{\hat\sigma^2} = \frac{1}{\hat\sigma^4}\sum (y_t - \hat\beta_1 - \hat\beta_2 x_t)^2$$
• Rearranging,
$$\hat\sigma^2 = \frac{1}{T}\sum (y_t - \hat\beta_1 - \hat\beta_2 x_t)^2 = \frac{1}{T}\sum \hat{u}_t^2 \quad (11)$$
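The closed-form results (9), (10) and (11) are easy to check numerically. The sketch below applies them to a small made-up data set; the function name and the data are illustrative assumptions.

```python
def ml_linear_regression(x, y):
    """ML estimates for y_t = beta1 + beta2 * x_t + u_t with normal errors."""
    T = len(y)
    x_bar, y_bar = sum(x) / T, sum(y) / T
    sum_xy = sum(a * b for a, b in zip(x, y))
    sum_xx = sum(a * a for a in x)
    beta2 = (sum_xy - T * x_bar * y_bar) / (sum_xx - T * x_bar ** 2)      # (10)
    beta1 = y_bar - beta2 * x_bar                                         # (9)
    sigma2 = sum((b - beta1 - beta2 * a) ** 2 for a, b in zip(x, y)) / T  # (11)
    return beta1, beta2, sigma2

# Made-up data for illustration only
b1, b2, s2 = ml_linear_regression([1.0, 2.0, 3.0, 4.0], [3.1, 4.9, 7.2, 8.8])
```

For these four points the estimates are a slope of 1.94, an intercept of 1.15 and a variance of 0.0205. The variance estimator divides by T rather than by T minus the number of parameters, which is what (11) states.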
• Unfortunately, the LLF for a model with time-varying variances,
$$L = -\frac{T}{2}\log(2\pi) - \frac{1}{2}\sum_{t=1}^{T}\log(\sigma_t^2) - \frac{1}{2}\sum_{t=1}^{T}\frac{(y_t - \mu - \phi y_{t-1})^2}{\sigma_t^2},$$
cannot be maximised analytically, except in the simplest of cases. So a numerical
procedure is used to maximise the log-likelihood function. A potential
problem: local optima or multimodalities in the likelihood surface.
• The way we do the optimisation is:
1. Set up LLF.
2. Use regression to get initial guesses for the mean parameters.
3. Choose some initial guesses for the conditional variance parameters.
4. Specify a convergence criterion - either by criterion or by value.
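The steps above can be sketched with a deliberately crude optimiser. The example assumes a zero-mean ARCH(1) model (so there are no mean parameters to initialise), simulated data, and a simple coordinate search that halves its step when no move improves the LLF; all names and settings are illustrative assumptions, and real software would use gradient-based routines instead.

```python
import math
import random

def arch1_loglik(alpha0, alpha1, u):
    """Step 1: Gaussian LLF of zero-mean ARCH(1) errors, conditional on u[0]."""
    ll = 0.0
    for t in range(1, len(u)):
        h = alpha0 + alpha1 * u[t - 1] ** 2
        ll -= 0.5 * (math.log(2 * math.pi) + math.log(h) + u[t] ** 2 / h)
    return ll

def maximise_llf(u, start=(0.5, 0.2), step=0.1, tol=1e-6, max_iter=500):
    """Coordinate search: try +/- step on each parameter, halve step when stuck."""
    params = list(start)  # step 3: initial guesses for the variance parameters
    best = arch1_loglik(params[0], params[1], u)
    for _ in range(max_iter):
        improved = False
        for i in range(2):
            for delta in (step, -step):
                trial = list(params)
                trial[i] += delta
                if trial[0] <= 0.0 or trial[1] < 0.0:  # keep the variance positive
                    continue
                value = arch1_loglik(trial[0], trial[1], u)
                if value > best:
                    params, best, improved = trial, value, True
        if not improved:
            step /= 2.0
            if step < tol:  # step 4: stop once parameter changes are negligible
                break
    return params, best

# Simulated ARCH(1) data with hypothetical true values alpha0 = 0.2, alpha1 = 0.4
rng = random.Random(1)
u, u_prev = [], 0.0
for _ in range(300):
    u_prev = rng.gauss(0.0, 1.0) * math.sqrt(0.2 + 0.4 * u_prev ** 2)
    u.append(u_prev)

start_llf = arch1_loglik(0.5, 0.2, u)
estimates, max_llf = maximise_llf(u)
```

A search of this kind only finds a local optimum near its starting point, which is why the choice of initial guesses matters when the likelihood surface has several modes.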
Non-Normality and Maximum Likelihood
• The sample counterpart is $\hat{v}_t = \dfrac{\hat{u}_t}{\hat{\sigma}_t}$
• Are the v̂t normal? Typically v̂t are still leptokurtic, although less so than
the ût . Is this a problem? Not really, as we can use the ML with a robust
variance/covariance estimator. ML with robust standard errors is called Quasi-
Maximum Likelihood or QML.
Extensions to the Basic GARCH Model
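$$\log(\sigma_t^2) = \omega + \beta \log(\sigma_{t-1}^2) + \gamma \frac{u_{t-1}}{\sqrt{\sigma_{t-1}^2}} + \alpha\left[\frac{|u_{t-1}|}{\sqrt{\sigma_{t-1}^2}} - \sqrt{\frac{2}{\pi}}\right]$$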
The news impact curve plots the next period volatility (ht) that would arise from various
positive and negative values of ut-1, given an estimated model.
[Figure: News Impact Curves for S&P 500 Returns using Coefficients from GARCH and GJR Model Estimates. Horizontal axis: Value of Lagged Shock (-1 to 1); vertical axis: Value of Conditional Variance (0 to 0.14); series: GARCH, GJR.]
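A news impact curve can be traced out by holding the lagged conditional variance fixed and varying the lagged shock. The sketch below does this for a GARCH(1,1) and a GJR model, where the GJR model adds a term $\gamma u_{t-1}^2 I_{t-1}$ with $I_{t-1} = 1$ for negative shocks. The coefficient values are hypothetical and are not the S&P 500 estimates behind the figure.

```python
def nic_garch(u_lag, alpha0, alpha1, beta, sigma2_fixed):
    """Next-period variance from a GARCH(1,1), with lagged variance held fixed."""
    return alpha0 + alpha1 * u_lag ** 2 + beta * sigma2_fixed

def nic_gjr(u_lag, alpha0, alpha1, beta, gamma, sigma2_fixed):
    """Next-period variance from a GJR model: negative shocks get extra weight."""
    indicator = 1.0 if u_lag < 0 else 0.0
    return alpha0 + (alpha1 + gamma * indicator) * u_lag ** 2 + beta * sigma2_fixed

# Hypothetical coefficients, for illustration only
grid = [i / 10 for i in range(-10, 11)]  # lagged shocks from -1 to 1
garch_curve = [nic_garch(u, 0.01, 0.1, 0.8, 0.05) for u in grid]
gjr_curve = [nic_gjr(u, 0.01, 0.1, 0.8, 0.1, 0.05) for u in grid]
```

The GARCH curve is symmetric about zero, whereas the GJR curve is steeper on the negative side whenever the asymmetry coefficient is positive.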
GARCH-in Mean
• GARCH can model the volatility clustering effect since the conditional
variance is autoregressive. Such models can be used to forecast volatility.
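• Let $\sigma^{f\,2}_{1,T}$ be the one step ahead forecast for $\sigma^2$ made at time T. This is
easy to calculate since, at time T, the values of all the terms on the
RHS are known.
• $\sigma^{f\,2}_{1,T}$ would be obtained by taking the conditional expectation of the
first equation at the bottom of slide 36:
$$\sigma^{f\,2}_{1,T} = \alpha_0 + \alpha_1 u_T^2 + \beta \sigma_T^2$$
• Given $\sigma^{f\,2}_{1,T}$, how is $\sigma^{f\,2}_{2,T}$, the 2-step ahead forecast for $\sigma^2$ made at time T,
calculated?
$$\sigma^{f\,2}_{2,T} = \alpha_0 + (\alpha_1 + \beta)\, \sigma^{f\,2}_{1,T}$$
• By similar arguments, the 3-step ahead forecast will be given by
$$\begin{aligned}
\sigma^{f\,2}_{3,T} &= E_T(\alpha_0 + \alpha_1 u_{T+2}^2 + \beta \sigma_{T+2}^2) \\
&= \alpha_0 + (\alpha_1 + \beta)\, \sigma^{f\,2}_{2,T} \\
&= \alpha_0 + (\alpha_1 + \beta)\big[\alpha_0 + (\alpha_1 + \beta)\, \sigma^{f\,2}_{1,T}\big]
\end{aligned}$$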
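The forecast recursion can be written as a short loop. The sketch below assumes a GARCH(1,1) model with known parameters; the function name and the numerical values are hypothetical.

```python
def garch11_forecasts(alpha0, alpha1, beta, u_T, sigma2_T, steps):
    """Return the 1- to `steps`-step-ahead variance forecasts made at time T."""
    # One step ahead: everything on the right-hand side is known at time T
    forecasts = [alpha0 + alpha1 * u_T ** 2 + beta * sigma2_T]
    # Further steps: apply alpha0 + (alpha1 + beta) * previous forecast
    for _ in range(steps - 1):
        forecasts.append(alpha0 + (alpha1 + beta) * forecasts[-1])
    return forecasts

# Hypothetical values: the implied unconditional variance is 0.1 / (1 - 0.9) = 1
f = garch11_forecasts(alpha0=0.1, alpha1=0.1, beta=0.8, u_T=2.0, sigma2_T=1.5, steps=3)
```

With these values the forecasts are 1.7, 1.63 and 1.567, decaying towards the unconditional variance at rate $\alpha_1 + \beta$.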
1. Option pricing
2. Conditional betas
$$\beta_{i,t} = \frac{\sigma_{im,t}}{\sigma_{m,t}^2}$$
• What if the standard deviations and correlation are changing over time?
Use
$$h_t = p_t \frac{\sigma_{s,t}}{\sigma_{F,t}}$$
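Both quantities are simple ratios of conditional second moments, as the following sketch shows. It assumes that $p_t$ denotes the conditional correlation between spot and futures returns; the function names and input values are hypothetical.

```python
import math

def conditional_beta(cov_im, var_m):
    """Time-t beta: conditional covariance with the market over market variance."""
    return cov_im / var_m

def hedge_ratio(corr_sf, var_s, var_f):
    """Time-t hedge ratio: correlation times spot std. dev. over futures std. dev."""
    return corr_sf * math.sqrt(var_s) / math.sqrt(var_f)

# Hypothetical conditional moments for a single period
beta_t = conditional_beta(cov_im=0.0006, var_m=0.0004)
h_t = hedge_ratio(corr_sf=0.9, var_s=0.04, var_f=0.09)
```

Since the correlation equals the covariance divided by the product of the standard deviations, the hedge ratio can equivalently be computed as the conditional covariance divided by the conditional variance of the futures return.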
Testing Non-linear Restrictions or
Testing Hypotheses about Non-linear Models
• Usual t- and F-tests are still valid in non-linear models, but they are
not flexible enough.
$$y_t = \mu + \phi y_{t-1} + u_t, \quad u_t \sim N(0, \sigma_t^2)$$
$$\sigma_t^2 = \alpha_0 + \alpha_1 u_{t-1}^2 + \beta \sigma_{t-1}^2$$
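• We estimate the model imposing the restriction and observe the maximised
LLF falls to 64.54 (from 66.85 for the unrestricted model). Can we accept the restriction?
• LR = -2(64.54 - 66.85) = 4.62.
• The test statistic follows a $\chi^2(1)$ distribution, whose 5% critical value is 3.84.
Since 4.62 > 3.84, reject the null.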
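The arithmetic of the likelihood ratio test can be reproduced directly. The sketch below also computes a p-value for the single-restriction case, using the fact that a $\chi^2(1)$ variable is a squared standard normal; the function name is an assumption.

```python
import math

def likelihood_ratio_test(llf_restricted, llf_unrestricted):
    """Return the LR statistic and its chi-squared(1) p-value (one restriction)."""
    lr = -2.0 * (llf_restricted - llf_unrestricted)
    # For one degree of freedom, P(chi2 > x) = erfc(sqrt(x / 2))
    p_value = math.erfc(math.sqrt(max(lr, 0.0) / 2.0))
    return lr, p_value

lr, p_value = likelihood_ratio_test(llf_restricted=64.54, llf_unrestricted=66.85)
reject_at_5pct = lr > 3.84  # 5% critical value of chi-squared(1)
```

The statistic is 4.62, which exceeds 3.84, so the restriction is rejected at the 5% level, in line with the slide.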
• Denote the maximised value of the LLF by unconstrained ML as $L(\hat\theta)$
and the constrained optimum as $L(\tilde\theta)$. Then we can illustrate the 3 testing
procedures in the following diagram:
Comparison of Testing Procedures under Maximum
Likelihood: Diagrammatic Representation
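[Figure: the log-likelihood $L(\theta)$ plotted against $\theta$. Point A marks the unrestricted maximum $L(\hat\theta)$ at $\hat\theta$; point B marks the restricted value $L(\tilde\theta)$ at $\tilde\theta$.]
• We know at the unrestricted MLE, $L(\hat\theta)$, the slope of the curve is zero.
• But is it “significantly steep” at $L(\tilde\theta)$?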
• Purpose
• To consider the out of sample forecasting performance of GARCH and
EGARCH Models for predicting stock index volatility.
• Data
• Weekly closing prices (Wednesday to Wednesday, and Friday to Friday)
for the S&P100 Index option and the underlying 11 March 83 - 31 Dec. 89
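or
$$\ln(h_t) = \alpha_0 + \beta_1 \ln(h_{t-1}) + \alpha_1\left(\theta \frac{u_{t-1}}{\sqrt{h_{t-1}}} + \gamma\left[\frac{|u_{t-1}|}{\sqrt{h_{t-1}}} - \left(\frac{2}{\pi}\right)^{1/2}\right]\right) \quad (3)$$
where
$R_{Mt}$ denotes the return on the market portfolio
$R_{Ft}$ denotes the risk-free rate
$h_t$ denotes the conditional variance from the GARCH-type models while
$\sigma_t^2$ denotes the implied variance from option prices.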
The Models (cont’d)
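$$\ln(h_t) = \alpha_0 + \beta_1 \ln(h_{t-1}) + \alpha_1\left(\theta \frac{u_{t-1}}{\sqrt{h_{t-1}}} + \gamma\left[\frac{|u_{t-1}|}{\sqrt{h_{t-1}}} - \left(\frac{2}{\pi}\right)^{1/2}\right]\right) + \delta \ln(\sigma_{t-1}^2) \quad (5)$$
• If this second set of restrictions holds, then (4) & (5) collapse to
$$h_t = \alpha_0 + \delta \sigma_{t-1}^2 \quad (4')$$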
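$$R_{Mt} - R_{Ft} = \lambda_0 + \lambda_1 \sqrt{h_t} + u_t \quad (8.78)$$
$$h_t = \alpha_0 + \alpha_1 u_{t-1}^2 + \beta_1 h_{t-1} \quad (8.79)$$
$$h_t = \alpha_0 + \alpha_1 u_{t-1}^2 + \beta_1 h_{t-1} + \delta \sigma_{t-1}^2 \quad (8.81)$$
$$h_t = \alpha_0 + \delta \sigma_{t-1}^2 \quad (8.81')$$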
Equation for
variance
specification  $\lambda_0$   $\lambda_1$   $\alpha_0 \times 10^{-4}$   $\alpha_1$   $\beta_1$   $\delta$   Log-L     $\chi^2$
(8.79)         0.0072    0.071     5.428     0.093     0.854     -         767.321   17.77
               (0.005)   (0.01)    (1.65)    (0.84)    (8.17)
(8.81)         0.0015    0.043     2.065     0.266     -0.068    0.318     776.204   -
               (0.028)   (0.02)    (2.98)    (1.17)    (-0.59)   (3.00)
(8.81')        0.0056    -0.184    0.993     -         -         0.581     764.394   23.62
               (0.001)   (-0.001)  (1.50)                        (2.94)
Notes: t-ratios in parentheses. Log-L denotes the maximised value of the log-likelihood function in
each case. $\chi^2$ denotes the value of the test statistic, which follows a $\chi^2(1)$ in the case of (8.81) restricted
to (8.79), and a $\chi^2(2)$ in the case of (8.81) restricted to (8.81'). Source: Day and Lewis (1992).
Reprinted with the permission of Elsevier Science.
In-sample Likelihood Ratio Test Results:
EGARCH Versus Implied Volatility
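$$R_{Mt} - R_{Ft} = \lambda_0 + \lambda_1 \sqrt{h_t} + u_t \quad (8.78)$$
$$\ln(h_t) = \alpha_0 + \beta_1 \ln(h_{t-1}) + \alpha_1\left(\theta \frac{u_{t-1}}{\sqrt{h_{t-1}}} + \gamma\left[\frac{|u_{t-1}|}{\sqrt{h_{t-1}}} - \left(\frac{2}{\pi}\right)^{1/2}\right]\right) \quad (8.80)$$
$$\ln(h_t) = \alpha_0 + \beta_1 \ln(h_{t-1}) + \alpha_1\left(\theta \frac{u_{t-1}}{\sqrt{h_{t-1}}} + \gamma\left[\frac{|u_{t-1}|}{\sqrt{h_{t-1}}} - \left(\frac{2}{\pi}\right)^{1/2}\right]\right) + \delta \ln(\sigma_{t-1}^2)$$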
• But the models do not represent a true test of the predictive ability of
IV.
• There are 729 data points. They use the first 410 to estimate the
models, and then make a 1-step ahead forecast of the following week’s
volatility.
where $\sigma_{t+1}^2$ is the “actual” value of volatility, and $\sigma_{ft}^2$ is the value forecasted
• The relatively small number of such options that exist limits the
circumstances in which implied covariances can be calculated
• EWMA models also cannot allow for the observed mean reversion in
the volatilities or covariances of asset returns that is particularly
prevalent at lower frequencies.
• In the case of the VECH, the conditional variances and covariances would
each depend upon lagged values of all of the variances and covariances
and on lags of the squares of both error terms and their cross products.
• In matrix form, it would be written
$$\mathrm{VECH}(H_t) = C + A\,\mathrm{VECH}(\Xi_{t-1}\Xi_{t-1}') + B\,\mathrm{VECH}(H_{t-1}), \quad \Xi_t \mid \psi_{t-1} \sim N(0, H_t)$$
• Writing out all of the elements gives the 3 equations as
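$$h_{11t} = c_{11} + a_{11}u_{1t-1}^2 + a_{12}u_{2t-1}^2 + a_{13}u_{1t-1}u_{2t-1} + b_{11}h_{11t-1} + b_{12}h_{22t-1} + b_{13}h_{12t-1}$$
$$h_{22t} = c_{21} + a_{21}u_{1t-1}^2 + a_{22}u_{2t-1}^2 + a_{23}u_{1t-1}u_{2t-1} + b_{21}h_{11t-1} + b_{22}h_{22t-1} + b_{23}h_{12t-1}$$
$$h_{12t} = c_{31} + a_{31}u_{1t-1}^2 + a_{32}u_{2t-1}^2 + a_{33}u_{1t-1}u_{2t-1} + b_{31}h_{11t-1} + b_{32}h_{22t-1} + b_{33}h_{12t-1}$$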
• Such a model would be hard to estimate. The diagonal VECH is much
simpler and is specified, in the 2 variable case, as follows:
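$$h_{11t} = \alpha_0 + \alpha_1 u_{1t-1}^2 + \alpha_2 h_{11t-1}$$
$$h_{22t} = \beta_0 + \beta_1 u_{2t-1}^2 + \beta_2 h_{22t-1}$$
$$h_{12t} = \gamma_0 + \gamma_1 u_{1t-1}u_{2t-1} + \gamma_2 h_{12t-1}$$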
BEKK and Model Estimation for M-GARCH
• Neither the VECH nor the diagonal VECH ensures a positive definite
variance-covariance matrix.
• An alternative approach is the BEKK model (Engle & Kroner, 1995).
• The BEKK Model uses a Quadratic form for the parameter matrices to
ensure a positive definite variance / covariance matrix Ht.
• In matrix form, the BEKK model is $H_t = W'W + A'H_{t-1}A + B'\Xi_{t-1}\Xi_{t-1}'B$
• Model estimation for all classes of multivariate GARCH model is
again performed using maximum likelihood with the following LLF:
$$\ell(\theta) = -\frac{TN}{2}\log 2\pi - \frac{1}{2}\sum_{t=1}^{T}\left(\log|H_t| + \Xi_t' H_t^{-1} \Xi_t\right)$$
where N is the number of variables in the system (assumed 2 above),
$\theta$ is a vector containing all of the parameters, and T is the number of observations.
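To make the LLF concrete, the sketch below evaluates it for a bivariate BEKK model of the form $H_t = W'W + A'H_{t-1}A + B'\Xi_{t-1}\Xi_{t-1}'B$ using plain 2×2 list arithmetic. The helper names, the start-up values and the test inputs are assumptions chosen for illustration; this is not a full estimation routine.

```python
import math

def matmul(a, b):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def transpose(a):
    return [[a[j][i] for j in range(2)] for i in range(2)]

def bekk_loglik(W, A, B, shocks, H0):
    """Gaussian log-likelihood of a bivariate BEKK model (N = 2)."""
    H, e_prev, ll = H0, (0.0, 0.0), 0.0  # assumed start-up values
    WtW = matmul(transpose(W), W)
    for e in shocks:
        outer = [[e_prev[i] * e_prev[j] for j in range(2)] for i in range(2)]
        AHA = matmul(matmul(transpose(A), H), A)
        BEB = matmul(matmul(transpose(B), outer), B)
        H = [[WtW[i][j] + AHA[i][j] + BEB[i][j] for j in range(2)]
             for i in range(2)]
        det = H[0][0] * H[1][1] - H[0][1] * H[1][0]
        # Quadratic form e' H^{-1} e for a symmetric 2x2 matrix
        quad = (H[1][1] * e[0] ** 2 - 2 * H[0][1] * e[0] * e[1]
                + H[0][0] * e[1] ** 2) / det
        ll += -math.log(2 * math.pi) - 0.5 * (math.log(det) + quad)
        e_prev = e
    return ll

identity = [[1.0, 0.0], [0.0, 1.0]]
zeros = [[0.0, 0.0], [0.0, 0.0]]
# With A = B = 0 and W = I, H_t is the identity matrix in every period
ll = bekk_loglik(identity, zeros, zeros, [(0.0, 0.0), (1.0, 1.0)], identity)
```

Because every term in the recursion is a quadratic form, $H_t$ stays positive semi-definite by construction, which is the property the BEKK specification is designed to deliver.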
• The off-diagonal elements of $H_t$, $h_{ij,t}$ ($i \neq j$), are defined indirectly via
the correlations, denoted $\rho_{ij}$:
where:
• S is the unconditional correlation matrix of the vector of standardised
residuals (from the first stage estimation), $u_t = D_t^{-1}\epsilon_t$
• ι is a vector of ones
• Qt is an N × N symmetric positive definite variance-covariance matrix
• ◦ denotes the Hadamard or element-by-element matrix multiplication
procedure
• This specification for the intercept term simplifies estimation and
reduces the number of parameters.
The DCC Model – A Possible Specification
• The model may be estimated in a single stage using ML although this will
be difficult. So Engle advocates a two-stage procedure where each variable
in the system is first modelled separately as a univariate GARCH
• A joint log-likelihood function for this stage could be constructed, which
would simply be the sum (over N) of all of the log-likelihoods for the
individual GARCH models
• In the second stage, the conditional likelihood is maximised with respect
to any unknown parameters in the correlation matrix
• The log-likelihood function for the second stage estimation will be of the
form
• where θ1 and θ2 denote the parameters to be estimated in the 1st and 2nd
stages respectively.
Asymmetric Multivariate GARCH
In Sample
           Unhedged     Naïve Hedge   Symmetric Time            Asymmetric Time
           $\beta = 0$      $\beta = 1$       Varying Hedge             Varying Hedge
                                      $\beta_t = h_{FC,t}/h_{F,t}$    $\beta_t = h_{FC,t}/h_{F,t}$
Return     0.0389       -0.0003       0.0061                    0.0060
           {2.3713}     {-0.0351}     {0.9562}                  {0.9580}
Variance   0.8286       0.1718        0.1240                    0.1211

Out of Sample
           Unhedged     Naïve Hedge   Symmetric Time            Asymmetric Time
           $\beta = 0$      $\beta = 1$       Varying Hedge             Varying Hedge
                                      $\beta_t = h_{FC,t}/h_{F,t}$    $\beta_t = h_{FC,t}/h_{F,t}$
Return     0.0819       -0.0004       0.0120                    0.0140
           {1.4958}     {0.0216}      {0.7761}                  {0.9083}
Variance   1.4972       0.1696        0.1186                    0.1188
[Figure: time-varying hedge ratios over observations 500 to 3000, vertical axis from 0.65 to 0.95; series: Symmetric BEKK, Asymmetric BEKK.]
Conclusions
- OHR is time-varying and less than 1
- M-GARCH OHR provides a better hedge, both in-sample and out-of-sample.
- No role in calculating OHR for asymmetries