

P1.T4. Valuation & Risk Models

Chapter 3. Measuring and Monitoring Volatility

Bionic Turtle FRM Study Notes



Chapter 3. Measuring and Monitoring Volatility

[Introductory non-LO] Relationship between volatility and value at risk (VaR) ..... 5
Explain how asset return distributions tend to deviate from the normal distribution. ..... 9
Explain reasons for fat tails in a return distribution and describe their implications. ..... 9
Distinguish between conditional and unconditional distributions. ..... 10
Describe the implications of regime switching on quantifying volatility. ..... 11
Explain the various approaches for estimating VaR. ..... 11
Compare and contrast parametric and non-parametric approaches for estimating conditional volatility. Calculate conditional volatility using parametric and non-parametric approaches. ..... 12
Explain how implied volatility can be used to predict future volatility. ..... 23
Evaluate implied volatility as a predictor of future volatility and its shortcomings. ..... 23
Apply the exponentially weighted moving average (EWMA) approach and the GARCH(1,1) model to estimate volatility. ..... 25
Explain and apply approaches to estimate long horizon volatility/VaR, and describe the process of mean reversion according to a GARCH(1,1) model. ..... 28
Calculate conditional volatility with and without mean reversion. ..... 30
Describe the impact of mean reversion on long horizon conditional volatility estimation. ..... 31
Describe an example of updating correlation estimates. ..... 32
Chapter summary ..... 34
Questions and answers ..... 35


Chapter 3. Measuring and Monitoring Volatility


 Selected key concepts (and terminology)
 [introductory non-LO] Relationship between volatility and value at risk (VaR)

 Explain how asset return distributions tend to deviate from the normal
distribution.

 Explain reasons for fat tails in a return distribution and describe their implications.

 Distinguish between conditional and unconditional distributions.

 Describe the implications regime switching has on quantifying volatility.

 Evaluate the various approaches for estimating VaR.

 Compare and contrast different parametric and non-parametric approaches for estimating conditional volatility.

 Calculate conditional volatility using parametric and non-parametric approaches.

 Evaluate implied volatility as a predictor of future volatility and its shortcomings.

 Explain long horizon volatility/VaR and the process of mean reversion according
to a GARCH(1,1) model.

 Calculate conditional volatility with and without mean reversion.

 Describe the impact of mean reversion on long horizon conditional volatility estimation.

 Describe an example of updating correlation estimates.


Selected key concepts (and terminology):

 The three value at risk (VaR) approaches are: parametric (aka, analytical), historical
simulation and Monte Carlo simulation. Within parametric approaches, the most
common is the classic linear normal VaR which simply scales (multiplies) volatility.
 Two assumptions in particular are among the convenient but egregious (unrealistic) assumptions
employed in the linear normal approach: first, our tendency to “impose normality” on
returns that are actually non-normal; and, second, our belief that returns are independent
over time (the square root rule for scaling VaR/volatility assumes independence).
 On the fallacy of our tendency to “impose normality:” the fact that we can compute a
standard deviation (or that we only have the first two distributional moments) does not
by itself imply the dataset is normally distributed (of course!)
 Risk varies over time. Models often assume a normal (Gaussian) distribution with
constant volatility from period to period; e.g., Black-Scholes-Merton assumes
constant volatility. But actual returns are non-normal and volatility is typically time-
varying. Therefore, it is hard to use parametric approaches on random returns. It is
hard to find robust distributional assumptions for stochastic asset returns.
 Aside from implied volatility, we are chiefly concerned with the three basic methods
that estimate volatility with historical return data: (simple) historical simulation;
exponentially weighted moving average (EWMA); and GARCH(1,1).
 Historical simulation has many variations (see Dowd) but our initial concern is the
“hybrid approach” which assigns weights according to the EWMA model
 Conditional parameter (e.g., conditional volatility): this is a parameter such as
variance that depends on (is conditional on) circumstances or prior information. A
conditional parameter, by definition, changes over time. This is generally realistic!
 Persistence: In EWMA, persistence is the lambda parameter (λ). In GARCH(1,1), it is
the sum of the alpha (α) and beta (β) parameters. In GARCH(1,1), high persistence
implies slow decay toward the long-run average variance; EWMA does not embed
any long-run average variance.
 EWMA can be viewed as a special case of GARCH(1,1) but where the weight
assigned to the long-run variance (i.e., gamma) is zero. This is not to be confused
with a zero long-run variance; rather, EWMA has no long-run variance. Conversely,
GARCH(1,1) can be viewed as a generalized EWMA.
 In GARCH(1,1), the implied L.R. Variance = ω/(1-[α+β]) = ω/(1-persistence). This is
also called the unconditional variance.
 Two GARCH terms: Autoregressive: Recursive. A parameter (today’s variance) is a
function of itself (yesterday’s variance). Heteroskedastic: Variance changes over time
(homoscedastic = constant variance).
 Leptokurtosis: a heavy-tailed distribution where relatively more observations are
near the middle and in the “fat tails” (kurtosis > 3). The GARCH(1,1) model exhibits
heavy tails in the unconditional distribution of returns, yet the model assumes
conditional returns are normal.


[Introductory non-LO] Relationship between volatility and value at risk (VaR)
The chapter is about volatility but is also indirectly about the most common parametric value
at risk (VaR) which is normal linear VaR. The previous chapter (Chapter 2) and this chapter
(Chapter 3) are closely related. Notice this chapter’s learning objectives include mentions of
VaR: “Evaluate the various approaches for estimating VaR,” and “Explain long horizon
volatility/VaR and the process of mean reversion according to a GARCH(1,1) model.”

In a convenient and relatively simple approach, normal linear VaR simply scales volatility.
For example, if µ = zero and σ = 10.0%, then 95.0% normal VaR = 10.0% × 1.645 = 16.45%.
In this way, our most common parametric VaR approach is just a multiple of volatility and
further often inherits the properties of volatility. As reviewed in the previous chapter, you
should be able to calculate the normal linear value at risk (VaR) for both a single asset
and its extension to a two-asset portfolio.

Single-asset normal value at risk (VaR) is volatility scaled by the deviate (e.g., 1.645)

In the following example, we will make the following assumptions: there are 250 trading days
per year. The current asset value equals $200.00. The asset’s return volatility is 10.0% per
annum. Finally, we will assume the expected return is zero; aka, relative VaR.
Further, any value at risk (VaR) estimate requires two user-based “design decisions:”
 What is the confidence level? In this case, the selected confidence is 95.0%
 What is the desired horizon? In this case, the horizon is one day. Therefore, we will
need to scale volatility (and VaR) from one year to one day.

Given these assumptions and our VaR choices (i.e., confidence and horizon), we calculate:
 Per annum VaR = $200.0 * 10.0% * 1.645 = $32.90. But our horizon is one day.
 One-day VaR = $200 * 10.0% * sqrt(1/250) * 1.645 = $2.08; the volatility is scaled
per the so-called “square root rule”
 Relative VaR (versus absolute VaR): we have here assumed the expected return
(aka, drift) for a single day is zero; alternatively, we have ignored the expected
return. This is common when estimating the one-day VaR. Technically, if we exclude
the expected return, we are estimating a relative VaR (as opposed to an absolute
VaR). It is called a relative VaR because the worst expected loss is measured relative to
the expected (future) value at the end of the horizon.
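These one-asset figures can be checked with a short Python sketch. It uses only the assumptions stated above (a $200.00 asset, 10.0% annual volatility, 250 trading days, 95.0% confidence); the deviate is computed from the standard normal distribution rather than hard-coded as 1.645.

from math import sqrt
from statistics import NormalDist

value = 200.00                      # current asset value
sigma = 0.10                        # annual return volatility
days = 250                          # trading days per year
z = NormalDist().inv_cdf(0.95)      # 95% normal deviate, approximately 1.645

var_annual = value * sigma * z                       # about $32.90 per annum (relative VaR, zero drift)
var_one_day = value * sigma * sqrt(1 / days) * z     # about $2.08, scaled by the square root rule
print(round(var_annual, 2), round(var_one_day, 2))   # 32.9 2.08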


Two-asset value at risk (aka, 2-asset diversified portfolio VaR)

We build on the previous example by considering a portfolio of two assets rather than a
single asset. Notice this requires an additional assumption: what is the correlation between
the two assets? Correlation is almost always denoted by Greek rho, ρ. In this example, we
assume the correlation is zero. The portfolio VaR is an increasing function of correlation; at
perfect correlation, the portfolio VaR will equal the sum of the individual VaRs (which
illustrates diversification: at any correlation less than perfect 100%, the portfolio VaR will be
less than the sum of the individual VaRs). Again, because the horizon is short (i.e., one day)
by convention we can safely assume the expected return is zero; put another way, we are
estimating a relative VaR. Notice that this is essentially an exercise in computing portfolio
volatility: the translation from portfolio volatility to VaR is achieved by the single step of
multiplication. In this instance of 95.0% confidence, we simply multiply by 1.645 in order to
translate a one-day portfolio volatility of $1.41 into a one-day 95.0% VaR of $2.33.

Given these assumptions about the asset and our VaR choices (i.e., confidence and
horizon), we can calculate:
 Per annum portfolio variance = 0.5² × 0.10² + 0.5² × 0.20² + 2 × 0.5 × 0.5 × 0.10 × 0.20 × 0 = 0.01250,
such that per annum portfolio volatility = sqrt(0.01250) = 11.18%
 Per annum (diversified) portfolio VaR = 11.18% * 1.645 = 18.39% such that the
time-scaled one-day portfolio VaR = $200.0 * 18.39% * sqrt(1/250) = $2.33
 There is another approach that utilizes the individual VaRs of $1.04 for Asset A and
$2.08 for Asset B. Where iVaR is Individual VaR, this uses:

portfolio VaR = sqrt(iVaR(A)² + iVaR(B)² + 2 × ρ × iVaR(A) × iVaR(B)). In this case,

portfolio VaR = sqrt(1.04² + 2.08² + 2 × 1.04 × 2.08 × 0) = $2.33
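The two-asset arithmetic can be scripted the same way. The Python sketch below uses only the assumptions given above (equal weights, 10% and 20% annual volatilities, ρ = 0, a $200 portfolio, 250 days, 95% confidence) and reproduces the $2.33 one-day portfolio VaR via both routes.

from math import sqrt
from statistics import NormalDist

value, w_a, w_b = 200.0, 0.5, 0.5        # portfolio value and equal weights
sig_a, sig_b, rho = 0.10, 0.20, 0.0      # annual volatilities and correlation
z = NormalDist().inv_cdf(0.95)           # approximately 1.645
scale = sqrt(1 / 250)                    # one-day scaling per the square root rule

# Route 1: portfolio variance -> portfolio volatility -> one-day 95% VaR
var_p = w_a**2 * sig_a**2 + w_b**2 * sig_b**2 + 2 * w_a * w_b * sig_a * sig_b * rho
var_route1 = value * sqrt(var_p) * z * scale                  # about $2.33

# Route 2: combine the individual one-day VaRs ($1.04 and $2.08)
ivar_a = value * w_a * sig_a * z * scale                      # about $1.04
ivar_b = value * w_b * sig_b * z * scale                      # about $2.08
var_route2 = sqrt(ivar_a**2 + ivar_b**2 + 2 * rho * ivar_a * ivar_b)

print(round(var_route1, 2), round(var_route2, 2))             # 2.33 2.33

Re-running the script with rho = 1.0 returns $3.12, the sum of the two individual VaRs, which previews the next illustration.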


Portfolio volatility and portfolio VaR are an increasing function with correlation

In this two-asset portfolio, a key relationship is the correlation parameter’s influence on the
diversified portfolio VaR: higher correlation implies higher portfolio VaR. Using the same
individual VaR parameters given previously (i.e., an equally-weighted portfolio with
respective volatilities of 10% and 20%), below we show the portfolio volatility and VaR given
a range of assumptions for the correlation parameter, ρ(A,B), from perfectly negative to
perfectly positive:

The corresponding plot (below) shows the portfolio VaR is increasing and non-linear. Notice
at perfect correlation (ρ = 1.0), the portfolio VaR is equal to the sum of the individual VaRs:
at ρ=1.0, portfolio VaR equals $3.12 which is the sum of $1.04 and $2.08. This is
unsurprising because, under this normal linear VaR estimate, VaR is simply a multiple of
(i.e., multiplier on) the standard deviation.


When we scale volatility or VaR over time, the critical time-scaling assumption is the
assumption of i.i.d. (aka, iid), specifically the independence assumption. Despite our use of
the normal distribution, and contrary to a sometimes popular assumption, normality is not
the critical assumption here! When scaling the VaR down from one year to one day (or
up from one day to one year), we typically assume independence. Independence is a
common but unrealistic assumption.

If returns are independent across time, then autocorrelation (aka, serial correlation) is zero.
By the contrapositive, the following is also true: if the autocorrelation is non-zero, then the
returns are not independent across time. Therefore, non-zero autocorrelation (aka, non-
zero serial correlation) is by definition a violation of the i.i.d. assumption and renders
inaccurate the square root rule of scaling VaR. For the FRM, you do not need to be
able to quantify the implications of this i.i.d. violation. Nonetheless, it is illustrated
below so that you can see the directional impact. We assume:
 Conveniently, the daily volatility is around 1.0%
 The first column illustrates the typical approach used to scale the daily volatility to a
10-day VaR: 10-day VaR = 1.0% * 2.33 deviate * sqrt(10/1) = 7.36%

At the bottom, compare the 10-day VaR of 7.36% to the second and third columns:
 When autocorrelation is +0.20, the scaled 10-day VaR is 8.82%
 When autocorrelation is -0.20, the scaled 10-day VaR is 6.13%. Negative
autocorrelation reflects mean reversion and we can see that, when there exists mean
reversion, the i.i.d. scaled VaR of 7.36% will overstate the “adjusted” VaR.
 Note how autocorrelation informs the adjusted standard deviation. Again, the
VaR simply multiplies this adjusted standard deviation by the normal deviate of 2.33.
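Although this adjustment is not required for the exam, the figures above can be reproduced with the standard AR(1) adjustment to the square root rule: if daily returns have first-order autocorrelation ρ, the T-day variance multiplier becomes T + 2·Σ(T−i)·ρ^i rather than T. Below is a Python sketch under that AR(1) assumption; the 99% deviate is the one quoted above as 2.33.

from math import sqrt
from statistics import NormalDist

sigma_daily, T = 0.01, 10            # roughly 1.0% daily volatility, 10-day horizon
z = NormalDist().inv_cdf(0.99)       # 99% normal deviate, quoted above as 2.33

def ten_day_var(rho):
    # Variance multiplier over T days when daily returns follow an AR(1) process
    # with autocorrelation rho; rho = 0 collapses to the plain square root rule (multiplier = T)
    multiplier = T + 2 * sum((T - i) * rho**i for i in range(1, T))
    return sigma_daily * sqrt(multiplier) * z

for rho in (0.0, 0.20, -0.20):
    print(f"rho = {rho:+.2f}: 10-day 99% VaR = {ten_day_var(rho):.2%}")
# prints approximately 7.36%, 8.82% and 6.13%, matching the figures above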


Explain how asset return distributions tend to deviate from the normal distribution.
Compared to a normal (bell-shaped) distribution, actual asset returns tend to be:
 Fat-tailed (a.k.a., heavy-tailed): The likelihood of extreme observations (far from the
mean) is higher than predicted by a normal distribution.
 Skewed: Skewness refers to the degree of asymmetry of a distribution. The normal
distribution is symmetrical, which means that events at the same distance from the
mean are equally likely. In the context of financial returns, the distribution tends to be
left-skewed, assigning more probability to extreme negative events.
 Time-varying (unstable): The parameters (e.g., mean, volatility) of the distribution
are not stable over time. These changes are related to the variability of the market
conditions.

Normal returns                    Actual financial returns
Symmetrical distribution          Skewed distribution
“Normal” tails                    Heavy-tailed (leptokurtosis)
Stable distribution               Time-varying parameters

Explain reasons for fat tails in a return distribution and describe their implications.
A distribution is unconditional if tomorrow’s distribution is the same as today’s distribution,
irrespective of the market and economic conditions. But fat tails could be explained by a
conditional distribution: a distribution that changes over time. Two things can change in a
normal distribution: mean and volatility. Time variations in either mean or volatility could give
rise to fat tails in the unconditional distribution, in spite of the fact that the conditional
distribution is normal (although with different parameters at different points in time).

Therefore, we might plausibly explain fat tails in two ways:


1. Maybe the conditional mean is time-varying; but this is unlikely given the
assumption that markets are efficient.
2. Conditional volatility is time-varying. Most authors assert this to be the more likely
explanation! For instance, significant economic or political events are seen to cause
higher volatility. Time-varying volatility is also observed to be generated by regular,
predictable events.

How can outliers be indications that the volatility varies with time?

We observe that actual financial returns tend to exhibit fat-tails. Here are two possible
explanations:
1. The true distribution is stationary. Therefore, fat-tails reflect the true distribution but
the normal distribution is not appropriate
2. The true distribution changes over time (it is “time-varying”). In this case, outliers can,
in reality, reflect a time-varying volatility. This is considered the more likely
explanation.


Implications of fat tails on analysis of return distributions

Since the tails of an empirical distribution are fatter than those of the normal distribution, this
causes problems in risk measurement. Risk measurement focuses on extreme events, attempting to
quantify the probability and size of severe losses. When we impose normal distribution assumptions
on real-world data, the model fails exactly where we need it to work the most: in the tails of
the distribution. The likely result is an underestimation of the VaR loss.

Distinguish between conditional and unconditional distributions.


An unconditional distribution of say, asset returns, remains the same on any given day
regardless of market or economic conditions; for this reason, it is likely to be unrealistic.
Here is another common perspective of an unconditional distribution: it is the distribution that
applies over the long-run. Put another way, the unconditional distribution is the long-run
average distribution.

A conditional distribution is not always the same: it is different, or conditional on, some
economic or market events or other states. It is measured by parameters such as its
conditional mean, conditional standard deviation (conditional volatility), and conditional skew,
and conditional kurtosis.

Heavy tails appear mostly in the unconditional distribution whereas conditional distribution is
usually normal, although with varying means or volatility at different points in time. This is
because conditional fat tails can be normalized by means of standardizing their respective
conditional parameters. Then their distribution can be represented as a standard normal
variable.

Consider X, a conditional random variable, with a conditional mean of μ(t) and a conditional standard
deviation of σ(t). If X follows a normal distribution, then X ~ N(μ(t), σ(t)²). Otherwise, X can be
standardized, so that it is defined by a normal distribution with a mean of zero and a standard
deviation of one: (X − μ(t)) / σ(t) ~ N(0, 1).

So, instead of assuming asset returns are normally distributed, it makes more sense to
assume that asset returns are conditionally normally distributed. However, caution has to be
applied, since in reality, asset returns are generally non-normal, whether unconditional or
conditional; i.e., fat tails are exhibited in asset returns regardless of the method we apply.


Describe the implications of regime switching on quantifying volatility.
A typical example of a conditional distribution is a regime-switching volatility model: the regime
(state) switches between low and high volatility, but never (or rarely) sits in between. A distribution
is “regime-switching” if it shifts between high- and low-volatility states.

The problem: a risk manager may assume (and measure) an unconditional volatility but the
distribution is actually regime-switching. In this case, the distribution is conditional (i.e., it
depends on conditions) and might be normal but regime-switching; e.g., volatility is 10%
during a low-volatility regime and 20% during a high-volatility regime but during both
regimes, the distribution may be normal. However, the risk manager may incorrectly assume
a single 15% unconditional volatility. In that case, the measured unconditional distribution is likely to
exhibit fat tails because it does not account for the regime switching.

Explain the various approaches for estimating VaR.


Estimation of VaR involves either an (i) historical based or an (ii) implied volatility approach.

Historical-based approach: Determines the shape of the conditional distribution based on historical time-series data. There are several categories of historical-based approaches:
 Parametric approach: Assume that asset returns follow a particular theoretical
distribution that can be described with parameters (mean, variance,…) estimated
based on the dataset with statistical methods. Example: conditional (log) normal
distribution with time-varying volatility (historical data informs the volatility)
 Nonparametric approach: There is no assumption concerning the underlying
theoretical distribution. The historical asset returns are deemed to characterize
completely the distribution of return. Example: historical simulation method.
 Hybrid approach: Combines parametric and non-parametric approaches. Example:
Filtered Historical Simulation (FHS)

Implied volatility-based approach: The volatility is implied by the prices of derivatives (options, …) by means of derivative pricing models such as Black-Scholes. The implied volatility is used as a predictor of future volatility.


Compare to Jorion’s Value at Risk (VaR) typology

Jorion illustrates another view on the same set of approaches (see below). He first
distinguishes between local valuation (which is associated with parametric approaches) and
full valuation (which includes both historical and Monte Carlo simulation). Note also that, in
general, stress testing requires a full valuation (i.e., simulation) approach.

Compare and contrast parametric and non-parametric approaches for estimating conditional volatility. Calculate conditional volatility using parametric and non-parametric approaches.
Below is a visual summary of the three parametric volatility approaches (historical standard
deviation, EWMA, and GARCH) that will be reviewed. The idea is to emphasize that the
difference largely concerns the treatment of weights assigned to historical returns. Many
readers will be familiar with historical standard deviation which is the square root of the
historical variance which, in turn, is the average of the historical series of squared returns
(assuming zero mean return). Implicitly, this typical approach assigns equal weights to each
of the returns in the historical window.


The key difference between parametric and non-parametric approaches concerns the
reliance on a distribution and the role of data:
 A parametric approach uses (historical) data in order to inform distributional
parameters, typically the variance and covariance. These parameters characterize a
distribution; e.g., normal, non-normal, GEV in the tail. Then value at risk (VaR) or
expected shortfall (ES) can be estimated directly from the distribution. In this way, the
parametric approach is betrayed by its reliance on a distributional assumption.
 The classic non-parametric approach is historical simulation. In this approach, VaR
(or ES) is estimated directly from the data without reference to a distribution function.
Simple historical simulation requires zero parameters; it only needs the dataset.


Parametric approaches

Under the parametric methods, volatility is estimated in different models by means of placing
different weights on historic data. Although there are many variations, we are concerned with
the following three basic parametric approaches to the estimation of volatility:
 Historical standard deviation
 Exponential Smoothing (EWMA or RiskMetrics™ Approach), and
 GARCH(1,1)

Historical standard deviation (STDEV; aka, moving average)

In the historical standard deviation method, given a history of an asset's continuously
compounded returns, we take a specific window of the K most recent returns. To obtain the
standard deviation, we take the square root of the average of the squared deviations from the
mean. Here we assume that the conditional mean is zero, so the variance is the equally
weighted average of the squared returns:

σ²(t) = (r²(t−K) + r²(t−K+1) + ⋯ + r²(t−2) + r²(t−1)) / K

This standard deviation is called a moving average (MA) by Jorion. The estimate requires a
window of fixed length; e.g., 30 or 60 trading days. If we observe returns, r(i), over M days,
the volatility estimate is constructed from a moving average (MA). This is an identical
measure:

σ²(t) = (1/M) × Σ[i=1 to M] r²(t−i)

Each day, the estimate is updated by adding the most recent day and dropping the furthest
day. In a simple moving average, all weights on past returns are equal and set to (1/M).
Note raw returns are used instead of returns around the mean (i.e., the expected mean is
assumed zero). This is common in short time intervals, where it makes little difference on the
volatility estimate.

For example, assume the previous four daily returns for a stock are 6% (n-1), 5% (n-2), 4%
(n-3) and 3% (n-4). What is a current volatility estimate, applying the moving average, given
that our short trailing window is only four days (m= 4)? If we square each return, the series is
0.0036, 0.0025, 0.0016 and 0.0009. If we sum this series of squared returns, we get 0.0086.
Divide by 4 (since m=4) and we get 0.00215. That’s the moving average variance, such that
the moving average volatility is about 4.64%. The above example illustrates a key weakness
of the moving average (MA): since all returns weigh equally, the trend does not matter. In the
example above, notice that volatility is trending down, but MA does not reflect in any way this
trend. We could reverse the order of the historical series and the MA estimation would
produce the same result.
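The four-day moving average example can be checked with a few lines of Python (a sketch using the returns given above and an assumed zero mean):

from math import sqrt

returns = [0.06, 0.05, 0.04, 0.03]            # the last four daily returns, most recent first
m = len(returns)                              # trailing window of m = 4 days

variance = sum(r**2 for r in returns) / m     # (0.0036 + 0.0025 + 0.0016 + 0.0009) / 4 = 0.00215
volatility = sqrt(variance)                   # about 4.64%
print(f"{volatility:.2%}")                    # 4.64%

# The MA ignores the order of the returns: reversing the series gives the same estimate
assert sqrt(sum(r**2 for r in reversed(returns)) / m) == volatility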


This standard deviation (STDEV; aka, MA) series is simple but has drawbacks.

 This estimator is particularly sensitive to extreme observations because any extreme


observation becomes larger when squared. So, when returns are non-normal, and, in
particular, fat-tailed, this method may not be suitable.
 Choosing the appropriate window length may result in a trade-off because short
window sizes may be less precise due to estimation error but reflect recent changes
more effectively than the long window sizes.
 The series has a so-called ghosting feature: data points are dropped arbitrarily
due to the length of the window. This results in the periodic appearance of large
decreases in conditional volatility when extreme observations disappear from the
rolling estimation window.
 The series ignores the order of the observations. Older observations may no
longer be relevant, but they receive the same weight. Since the magnitude of recent
changes could be informative in estimating volatility, standard deviation is limited
when compared to an approach where recent data is given more weights.

Exponential Smoothing (EWMA or RiskMetrics™ Approach)

Exponential smoothing is the same method referred to by Hull as exponentially weighted


moving average (EWMA). Exponential smoothing places exponentially declining weights on
historical data, placing more weight on more recent information and less weight on past
information. Two models, namely EWMA and GARCH employ exponential smoothing.

Importantly, EWMA is a special case of GARCH, and here is how we get from GARCH(1,1)
to EWMA. We will review this more closely later, but for the moment realize:

 GARCH(1,1): σ²(t) = a + b·r²(t−1) + c·σ²(t−1)

Then we let a = 0 and (b + c) = 1, such that the above equation simplifies to:

 GARCH(1,1): σ²(t) = c·σ²(t−1) + (1 − c)·r²(t−1)

This is now equivalent to the formula for the exponentially weighted moving average (EWMA):

 EWMA: σ²(t) = λ·σ²(t−1) + (1 − λ)·r²(t−1), because it is typical to use lambda (λ) for the so-called smoothing parameter


RiskMetrics™ Approach (is just a branded EWMA where λ equals 0.94)

In EWMA, the lambda parameter (λ) determines the “decay:” a lambda that is close to one
(high lambda) exhibits slow decay. RiskMetrics™ is a branded form of the exponentially
weighted moving average (EWMA) approach. The optimal (theoretical) lambda varies by
asset class, but the overall optimal parameter used by RiskMetrics™ has been 0.94. In
practice, RiskMetrics™ only uses one decay factor for all series:
 0.94 for daily data
 0.97 for monthly data (month defined as 25 trading days)
Technically, the daily and monthly models are inconsistent. However, they are both easy to
use, they approximate the behavior of actual data quite well, and they are robust to
misspecification.

GARCH (1,1) is the most basic GARCH(p,q) and generalizes EWMA

GARCH (p, q) is a general autoregressive conditional heteroskedastic model:


 Autoregressive (AR): tomorrow’s variance (or volatility) is a regressed function of
today’s variance—it regresses on itself
 Conditional (C): tomorrow’s variance depends—is conditional on—the most recent
variance. An unconditional variance would not depend on today’s variance
 Heteroskedastic (H): variances are not constant; they fluctuate over time

GARCH regresses on lagged or historical terms. The lagged terms are either variance or
squared returns. The generic GARCH (p, q) model regresses on (p) squared returns and (q)
variances. Therefore, GARCH (1, 1) “lags” or regresses on last period’s squared return (i.e.,
just 1 return) and last period’s variance (i.e., just 1 variance).

Please note: GARCH (1, 1) is often represented with Greek parameters (per Hull) but can be
expressed with alternative notation:

Some authors denote GARCH(1,1) as σ²(t) = a + b·r²(t−1) + c·σ²(t−1), but Hull's Greek notation is common:

σ²(t) = ω + α·u²(t−1) + β·σ²(t−1)

where
σ²(t) = conditional variance
σ²(t−1) = previous variance
u²(t−1) or r²(t−1) = previous squared return
ω or a = weighted long-run average variance, where ω = γ × V(L) and γ is the weight of the long-run average variance rate V(L)
α or b = weight of the lagged squared return
β or c = weight of the lagged variance


Persistence, which is given by (α+β), is a feature embedded in the GARCH model.

In the above formulas, persistence is (α + β), or equivalently (1 − γ), since the weights should add to
one. Persistence refers to how quickly (or slowly) the variance reverts or “decays” toward its
long-run average. High persistence equates to slow decay and slow “regression toward the
mean;” low persistence equates to rapid decay and quick “reversion to the mean.”

A persistence of 1.0 implies no mean reversion. A persistence of less than 1.0 implies
“reversion to the mean,” where a lower persistence implies greater reversion to the mean. If
the weights assigned to the lagged variance and lagged squared returns are greater than
one, the model is non-stationary. That is, if (α+β) > 1 the model is non-stationary and,
according to Hull, unstable, in which case, EWMA is preferred.

The three GARCH(1,1) parameters are weights and therefore must sum to one:

In GARCH(1,1), the weights should all add to unity such that α + β + γ = 1. Be careful
about the first term in the GARCH(1,1) equation: omega (ω) = gamma (γ) × average long-run
variance (V(L)). If you are asked for the long-run variance, you may need to divide out the weight in
order to compute the average variance.

The average, unconditional variance in the GARCH(1,1) model is given by:

V(L) = ω / γ = ω / (1 − (α + β))

The FRM likes to ask about this unconditional variance in the GARCH(1,1) model. The
question may give you the GARCH(1,1) model parameters and ask you to solve for the
implied long-run (unconditional) variance. For example, if you are given the GARCH model
σ²(t) = ω + α·u²(t−1) + β·σ²(t−1), then the implied long-run variance is V(L) = ω / (1 − (α + β)).

STDEV vs. GARCH: Advantages and Disadvantages

In practice, variance rates tend to be mean-reverting; therefore, the GARCH (1, 1) model is
theoretically superior (“more appealing than”) to the EWMA model. Remember, that’s the big
difference: GARCH adds the parameter that weighs the long-run average and therefore it
incorporates mean reversion. GARCH (1, 1) is preferred unless the first omega parameter is
negative (which is implied if alpha + beta > 1). In this case, GARCH (1,1) is unstable and
EWMA is preferred.

There are two problems with moving average (MA):

1. Ghosting feature: volatility shocks (sudden increases) are abruptly incorporated into
the MA metric and then, when the trailing window passes, they are abruptly dropped
from the calculation. Due to this, the MA metric will shift in relation to the chosen
window length
2. Trend information is not incorporated


GARCH estimates improve upon these weaknesses in two ways:


1. More recent observations are assigned greater weights. This overcomes ghosting
because a volatility shock will immediately impact the estimate but its influence will
fade gradually as time passes
2. A term is added to incorporate reversion to the mean

Linda Allen says1 about GARCH(1, 1):

 GARCH is both “compact” (i.e., relatively simple) and remarkably accurate for
predicting volatility for in-sample data. However, as the GARCH parameters change
with new information, the forecasts change with them, sometimes due to estimation error,
which induces model risk. So, forecasting out-of-sample data with GARCH
poses a risk. GARCH models predominate in scholarly research. Many variations of
the GARCH model have been attempted, but few have improved on the original.
 The drawback of the GARCH model is its nonlinearity.

STDEV (MA)                                     GARCH
 Ghosting feature                              More recent data assigned greater weights
 Trend information is not incorporated         A term is added to incorporate mean reversion

EWMA vs GARCH(1,1)

Each of GARCH(1, 1), EWMA and RiskMetrics™ are parametric and recursive.

GARCH(1,1) is generally considered superior (and, after all, is the general case of both
EWMA and MA):
 GARCH estimations can provide estimations that are more accurate than MA
 Except Linda Allen warns2: GARCH (1,1) needs more parameters and may pose
greater MODEL RISK (“chases a moving target”) when forecasting out-of-sample

1 Allen, Linda, Understanding Market, Credit, and Operational Risk (Blackwell Publishing Ltd 2004)
2 Allen, Linda, Understanding Market, Credit, and Operational Risk (Blackwell Publishing Ltd 2004). Chapter 2.


Nonparametric methods

We have seen the parametric methods (above). On the other hand, nonparametric methods
estimate VaR directly from the data, without making any assumptions about the entire
distribution of returns. Given the problems of fat tails and skewness that arise when normal
distribution is assumed, nonparametric methods potentially avoid such issues. However, it
should be noted that while parametric methods use data more efficiently, the precision of
nonparametric methods is dependent upon large samples; put simply, it is very difficult to
apply nonparametric methods to small samples.

Non-parametric methods of volatility forecasting include historical simulation and multivariate density estimation (MDE).

Nonparametric methods: Historical simulation

Historical simulation is easy because it does not require estimation of any parameter: we only need
to determine the “lookback window.” The problem is that, for small samples, the extreme
percentiles (e.g., the worst one percent) are less precise. Historical simulation also effectively
throws out useful information because it focuses only on the tails.

“The most prominent and easiest to implement methodology within the
class of nonparametric methods is historical simulation (HS). HS uses the
data directly. The only thing we need to determine up front is the lookback
data directly. The only thing we need to determine up front is the lookback
window. Once the window length is determined, we order returns in
descending order, and go directly to the tail of this ordered vector. For an
estimation window of 100 observations, for example, the fifth lowest return
in a rolling window of the most recent 100 returns is the fifth percentile. The
lowest observation is the first percentile. If we wanted, instead, to use a 250
observations window, the fifth percentile would be somewhere between the
12th and the 13th lowest observations (a detailed discussion follows), and
the first percentile would be somewhere between the second and third
lowest returns.” –Linda Allen 3

Nonparametric methods: Multivariate Density Estimation (MDE)

The key feature of multivariate density estimation is that the weights (assigned to historical
square returns) are not a constant function of time. Rather, the current state—as
parameterized by a state vector—is compared to the historical state: the more similar the
states (current versus historical period), the greater the assigned weight. It, therefore,
estimates the joint probability density function of a set of variables (returns and state of the
economy or term structure or inflation level, etc.). The relative weighting is determined by
the kernel function:
σ²(t) = Σ[i=1 to K] ω(x(t−i)) × u²(t−i)

where ω(∙) is the kernel function and x(t−i) is the vector describing the economic state at time t−i.
That is, instead of weighting the squared returns by time (as EWMA does), MDE weights them by the
proximity of the past state to the current state.

3 Allen, Linda, Understanding Market, Credit, and Operational Risk (Blackwell Publishing Ltd 2004). Chapter 2.


One problem with MDE is that it is data intensive as many data points are required to
estimate the appropriate weights that capture the joint density function of the variables. Also,
the quantity of data needed increases with the number of conditioning variables used in
estimation.

Compare EWMA to MDE:


 Both assign weights to historical squared returns (squared returns = variance
approximation);
 Where EWMA assigns the weight as an exponentially declining function of time (i.e.,
the nearer to today, the greater the weight), MDE assigns the weight based on the
nature of the historical period (i.e., the more similar to the historical state, the greater
the weight)
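To make the contrast concrete, the sketch below applies both weighting schemes to the same five squared returns. Everything in it is a hypothetical illustration (the interest-rate state variable, the Gaussian kernel, the bandwidth, and all of the numbers are assumptions for this sketch, not part of Allen's example):

from math import exp

# Hypothetical inputs for illustration only: five historical returns and a one-dimensional
# "state" variable (an interest rate) observed alongside each return
returns = [0.010, -0.020, 0.004, -0.015, 0.008]    # r(t-1) ... r(t-5)
rates = [0.031, 0.030, 0.025, 0.024, 0.023]        # state at t-1 ... t-5
rate_now, bandwidth, lam = 0.030, 0.003, 0.94      # current state, kernel bandwidth, EWMA lambda

# EWMA: weights decline exponentially with time, regardless of the state
w_time = [(1 - lam) * lam**i for i in range(len(returns))]
total_time = sum(w_time)
w_time = [w / total_time for w in w_time]

# MDE: weights come from a kernel applied to the distance between each past state and
# today's state, so "similar" periods get larger weights regardless of how long ago they occurred
kernel = [exp(-0.5 * ((x - rate_now) / bandwidth) ** 2) for x in rates]
total_k = sum(kernel)
w_state = [k / total_k for k in kernel]

var_ewma = sum(w * r**2 for w, r in zip(w_time, returns))
var_mde = sum(w * r**2 for w, r in zip(w_state, returns))
print(f"EWMA variance: {var_ewma:.6f}  MDE variance: {var_mde:.6f}")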

Hybrid Approach

The hybrid approach combines each of the parametric and nonparametric approaches. It
combines historical simulation (by estimating the percentiles of returns directly) and EWMA
(by using exponentially declining weights on past data). Consider the example below (which
is realized in our learning spreadsheet but replicates Linda Allen’s example4 in her Table
2.3). Notice that given 100 observations (K = 100), under simple HS each return is assigned
a weight of 1.0% = 1/100. Compare this to the Hybrid approach where recent returns are
assigned greater weight; e.g., 3 periods ago is weighted 2.21%. At the same time, the more
distant return that occurred 65 days ago is assigned a weight of only 0.63%.

lambda, λ 0.98
K 100

Initial Date
Periods Simple HS Hybrid (EXP)
Return Ago Weight Cumul Weight Cumul
-3.30% 3 1.00% 1.00% 2.21% 2.21%
-2.90% 2 1.00% 2.00% 2.26% 4.47%
-2.70% 65 1.00% 3.00% 0.63% 5.11%
-2.50% 45 1.00% 4.00% 0.95% 6.05%
-2.40% 5 1.00% 5.00% 2.13% 8.18%
-2.30% 30 1.00% 6.00% 1.28% 9.47%

25 Days Later
Periods Simple HS Hybrid (EXP)
Return Ago Weight Cumul Weight Cumul
-3.30% 28 1.00% 1.00% 1.34% 1.34%
-2.90% 27 1.00% 2.00% 1.36% 2.70%
-2.70% 90 1.00% 3.00% 0.38% 3.08%
-2.50% 70 1.00% 4.00% 0.57% 3.65%
-2.40% 30 1.00% 5.00% 1.28% 4.94%
-2.30% 55 1.00% 6.00% 0.77% 5.71%

4 Allen, Linda, Understanding Market, Credit, and Operational Risk (Blackwell Publishing Ltd 2004). Based on Linda Allen's Table 2.3 but spreadsheet (including error correction) hand built by David Harper.
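The hybrid weights in the tables above can be reproduced directly: a return observed n periods ago receives weight (1 − λ)·λ^(n−1), rescaled by 1/(1 − λ^K) so that the K in-window weights sum to 100%. A short Python check against the "Initial Date" panel (λ = 0.98, K = 100):

lam, K = 0.98, 100
worst = [(-0.033, 3), (-0.029, 2), (-0.027, 65), (-0.025, 45), (-0.024, 5), (-0.023, 30)]

def hybrid_weight(periods_ago, lam=lam, K=K):
    # EWMA-style weight, truncated to the K-observation window and rescaled to sum to 1
    return (1 - lam) * lam ** (periods_ago - 1) / (1 - lam ** K)

cumulative = 0.0
for ret, n in worst:                       # the six worst returns, already sorted ascending
    w = hybrid_weight(n)
    cumulative += w
    print(f"{ret:+.2%}  weight {w:.2%}  cumulative {cumulative:.2%}")
# -3.30%: 2.21% / 2.21%; -2.90%: 2.26% / 4.47%; -2.70%: 0.63% / 5.11%; ... as in the table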


Another (real-life) example

For two companies, GOOG and XOM, we retrieved daily returns over a one-year history (K =
250 days) ending December 14th, 2016. The lambda parameter, λ, is set to 0.980. For
example, the worst return of -5.46% is weighted according to (1-0.98)*0.98^(165-1)/(1-
0.98^250), which is the product of the typical (1-λ)*λ^(n-1) = (1-0.98)*0.98^(165-1) and 1/(1-
0.98^250), a slight upward multiplier that "trims" the total weight of the 250 days (truncated
from an infinite series) from 99.36% up to a full 100.0% based on the 250-day window.
Notice how the largest daily losses for Alphabet (formerly Google) did not occur recently in
the historical window; consequently, the cumulative weighting under the hybrid approach
lags behind the cumulative weight under a simple historical simulation.

Alphabet (Ticker: GOOG) Exxon Mobil Corp (Ticker: XOM)

T 250 T 250
lambda, λ 0.98 lambda, λ 0.98
K 250 K 250

Periods Hybrid (EXP) Periods Hybrid (EXP)


Return Ago Weight Cumul Return Ago Weight Cumul
-5.46% 165 0.07% 0.07% -4.31% 230 0.02% 0.02%
-5.06% 220 0.02% 0.10% -3.54% 96 0.30% 0.32%
-3.87% 121 0.18% 0.28% -3.44% 227 0.02% 0.34%
-3.58% 234 0.02% 0.29% -2.66% 121 0.18% 0.51%
-3.51% 218 0.03% 0.32% -2.51% 68 0.52% 1.03%
-2.94% 24 1.26% 1.58% -2.49% 33 1.05% 2.09%
-2.88% 232 0.02% 1.60% -2.41% 66 0.54% 2.63%
-2.66% 126 0.16% 1.76% -2.25% 221 0.02% 2.65%
-2.64% 219 0.02% 1.79% -2.19% 197 0.04% 2.69%
-2.41% 22 1.32% 3.11% -2.18% 1 2.01% 4.70%


Technical point on interpolation in the hybrid approach (optional)

Returning to Linda Allen’s example (in Table 2.3; see below), the risk measurement question
is “What is the 95.0% value at risk (VaR) implied by this hybrid simulation?” This requires
solving for the worst 5.0% quantile.

lambda, λ 0.98
K 100

Initial Date
Periods Simple HS Hybrid (EXP)
Return Ago Weight Cumul Weight Cumul
-3.30% 3 1.00% 1.00% 2.21% 2.21%
-2.90% 2 1.00% 2.00% 2.26% 4.47%
-2.70% 65 1.00% 3.00% 0.63% 5.11%
-2.50% 45 1.00% 4.00% 0.95% 6.05%
-2.40% 5 1.00% 5.00% 2.13% 8.18%

The solution is illustrated below (this example can be explored in further detail in the
associated learning spreadsheet).

"Mass centered" In blue are merely explanations of


Raw observations specific mass-centered points
Return Weight Cumul Return Weight Final interpolation @ 5th %ile
Observed -3.30% 2.21% 2.21% -3.30% 1.11%
midpoint -3.10% -3.10% 2.21% <<- cumulative weight 2.21% extends to
Observed -2.90% 2.26% 4.47% -2.90% 3.34% midpoint of returns (-3.30, -2.90%)
midpoint -2.80% -2.80% 4.47%
Observed -2.70% 0.63% 5.11% -2.70% 4.79% 5.00% 66.2% 0.066% -2.63%
midpoint -2.60% -2.60% 5.11%
Observed -2.50% 0.95% 6.05% -2.50% 5.58% <<- -2.50% return is centered on 0.95%,
midpoint -2.45% -2.45% 6.05% its cumulative weight = (0.95%/2 ) +
Observed -2.40% 2.13% 8.18% -2.40% 7.12% 5.11% = 5.58%
midpoint -2.35% -2.35% 8.18%
Observed -2.30% 1.28% 9.47% -2.30% 8.82%
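The same interpolation can be scripted. The sketch below assumes the mass-centering convention shown in the table: each observed return carries the cumulative weight up to the middle of its own mass (prior cumulative plus half its weight), each midpoint between adjacent returns carries the full cumulative weight, and the 5.0% quantile is then found by linear interpolation on that grid.

lam, K = 0.98, 100
worst = [(-0.033, 3), (-0.029, 2), (-0.027, 65), (-0.025, 45), (-0.024, 5), (-0.023, 30)]
weights = [(1 - lam) * lam ** (n - 1) / (1 - lam ** K) for _, n in worst]

# Build the "mass-centered" grid of (cumulative probability, return) points
grid, cum = [], 0.0
for i, ((ret, _), w) in enumerate(zip(worst, weights)):
    grid.append((cum + w / 2, ret))                         # observed return, centered on its own mass
    cum += w
    if i + 1 < len(worst):
        grid.append((cum, (ret + worst[i + 1][0]) / 2))     # midpoint carries the full cumulative weight

# Linear interpolation of the return at the 5.0% cumulative probability
p = 0.05
for (p_lo, r_lo), (p_hi, r_hi) in zip(grid, grid[1:]):
    if p_lo <= p <= p_hi:
        var_return = r_lo + (p - p_lo) / (p_hi - p_lo) * (r_hi - r_lo)
        print(f"hybrid 95% VaR: {var_return:.2%}")          # approximately -2.63%
        break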

Nonparametric approaches: summary advantages and disadvantages

                          Advantages                                    Disadvantages
Historical simulation     Easiest to implement (simple, convenient)     Inefficient (much data is not used)
Multivariate density      Very flexible: weights are a function of      Onerous model: weighting scheme,
estimation (MDE)          the state (e.g., economic context such        conditioning variables, number of
                          as interest rates), not constant              observations; data intensive
Hybrid approach           Unlike the HS approach, better                Requires model assumptions; e.g.,
                          incorporates more recent information          number of observations


Explain how implied volatility can be used to predict future volatility.
To impute volatility is to derive volatility (to reverse-engineer it, really) from the
observed market price of the asset. A typical example uses the Black-Scholes option
pricing model to compute the implied volatility of a stock option; i.e., option traders will
average at-the-money implied volatility from traded puts and calls.

Advantages of implied volatility:
 Truly predictive (reflects the market's forward-looking consensus)
 Does not require, nor is restrained by, historical distribution patterns

Shortcomings of implied volatility include:
 Model-dependent
 Options on the same underlying asset may trade at different implied volatilities; e.g., the volatility smile/smirk
 Stochastic volatility; i.e., the model assumes constant volatility, but volatility tends to change over time
 Limited because it requires a traded (set by the market) price

This requires that a market mechanism (e.g., an exchange) can provide a market price for
the option. If a market price can be observed, then instead of solving for the price of an
option, we use an option pricing model (OPM) to reveal the implied (implicit) volatility. We
solve (“goal seek”) for the volatility that matches the model price to the market price:
market price = OPM(implied volatility). Here the implied standard deviation (ISD) is the volatility input into
an option pricing model (OPM). Similarly, implied correlations can be recovered (reverse-
engineered) from options on multiple assets. According to Jorion, ISD is a superior approach
to volatility estimation. He says, “Whenever possible, VAR should use implied parameters.”

Evaluate implied volatility as a predictor of future volatility and its shortcomings.
Some describe the application of historical volatility as similar to “driving by looking in the
rear-view mirror.” Another flaw is the implicit assumption that the past is indicative of the
future (“past is prologue”). The advantage of implied volatility is that it is a forward-
looking, predictive measure: Implied volatility can be imputed from derivative prices using
a specific derivative pricing model. The easiest example is the Black–Scholes implied
volatility imputed from equity option prices. In the presence of multiple implied volatilities for
various option maturities and exercise prices, it is common to take the at-the-money (ATM)
implied volatility from puts and calls and extrapolate an average “implied vol”; this “implied
vol” is derived from the most liquid (ATM) options. Linda Allen illustrates:

“A particularly strong example of the advantage obtained by using implied volatility (in
contrast to historical volatility) as a predictor of future volatility is the GBP currency crisis of
1992. During the summer of 1992, the GBP came under pressure as a result of the
expectation that it should be devalued relative to the European Currency Unit (ECU)
components, the deutschmark (DM) in particular (at the time the strongest currency within
the ECU). During the weeks preceding the final drama of the GBP devaluation, many signals
were present in the public domain … the growing pressure on the GBP manifests itself in
options prices and volatilities. [But] historical volatility is trailing, “unaware” of the
pressure. In this case, the situation is particularly problematic since historical volatility
happens to decline as implied volatility rises. The fall in historical volatility is due to the fact
that movements close to the intervention band are bound to be smaller by the fact of the
intervention bands’ existence and the nature of intervention, thereby dampening the
historical measure of volatility just at the time that a more predictive measure shows
increases in volatility.” – Linda Allen 5

5 Allen, Linda, Understanding Market, Credit, and Operational Risk (Blackwell Publishing Ltd 2004). Chapter 2.


Is implied volatility a superior predictor of future volatility? “It would seem as if the answer
must be affirmative since implied volatility can react immediately to market conditions. As a
predictor of future volatility, this is certainly an important feature.”

Why does implied volatility tend to be greater than historical volatility? According to Linda
Allen6, “empirical results indicate, strongly and consistently, that implied volatility is, on
average, greater than realized volatility.” There are two common explanations.
 Market inefficiency: due to supply and demand forces.
 Rational markets: implied volatility is greater than realized volatility due to stochastic
volatility. “Consider the following facts: (i) volatility is stochastic; (ii) volatility is a
priced source of risk; and (iii) the underlying model (e.g., the Black–Scholes model)
is, hence, mis-specified, assuming constant volatility. The result is that the premium
required by the market for stochastic volatility will manifest itself in the forms we saw
above – implied volatility would be, on average, greater than realized volatility.”

But implied volatility has shortcomings.

 Implied volatility is model-dependent. A mis-specified model can result in an


erroneous forecast.
“Consider the Black–Scholes option-pricing model. This model hinges on a few
assumptions, one of which is that the underlying asset follows a continuous time
lognormal diffusion process. The underlying assumption is that the volatility
parameter is constant from the present time to the maturity of the contract. The
implied volatility is supposedly this parameter. In reality, volatility is not constant over
the life of the options contract. Implied volatility varies through time. Oddly, traders
trade options in “vol” terms, the volatility of the underlying, fully aware that (i) this vol
is implied from a constant volatility model, and (ii) that this very same option will trade
tomorrow at a different vol, which will also be assumed to be constant over the
remaining life of the contract.” –Linda Allen 7
 At any given point in time, options on the same underlying may trade at different vols.
An example is the [volatility] smile effect – deep out of the money (especially) and
deep in the money (to a lesser extent) options trade at a higher volatility than at the
money options.
 The empirically measured volatility is actually stochastic while the forecasts of
implied volatility are a smoother series, which undermines its capacity as a predictor
of future volatility. Nevertheless, implied volatilities on an average are higher than
realized volatility as mentioned earlier.
 Implied volatility measures are available for very few assets or market factors
because of the limited number of options trading on the underlying assets.

6 Allen, Linda, Understanding Market, Credit, and Operational Risk (Blackwell Publishing Ltd 2004). Chapter 2.
7 Allen, Linda, Understanding Market, Credit, and Operational Risk (Blackwell Publishing Ltd 2004). Chapter 2.


Apply the exponentially weighted moving average (EWMA) approach and the GARCH(1,1) model to estimate volatility.
Classic volatility implicitly assigns equal weights to all returns

The usual method to compute the volatility is to take the square root of the variance, where
the variance is simply the average squared return over a selected historical window of n returns8:

σ²(t) = (1/n) × Σ[i=1 to n] r²(t−i), and σ(t) = sqrt(σ²(t))

The key shortcoming is that the same weight (weight = 1/n) is assigned to all returns
regardless of whether they occurred recently or in the distant past. Why might this matter?
Because this approach is slow to adjust to updated market conditions. If conditions suddenly
become volatile, we might prefer an approach that assigns more weight to recent returns.

In order to improve the volatility estimate, several methods exist that weight volatility based
on their recency. The Exponentially Weighted Moving Average (EWMA) and the Generalized
Autoregressive Conditional Heteroskedasticity (GARCH) are examples of volatility models
that assign more weight to recent data than the older observations to update faster with new
information.

Exponentially weighted moving average (EWMA)

EWMA consists of applying exponential smoothing to historical data in order to estimate the
current volatility. A decay parameter λ is defined (0 < λ < 1) such that the (unnormalized)
weight on an observation declines geometrically with its age:
 The weight applied to yesterday's observation is proportional to 1
 The weight applied to the return observed two days ago is proportional to λ
 The weight applied to the data observed n days ago is proportional to λ^(n−1)

The total weight is, therefore, Σ[i=1 to ∞] λ^(i−1). Note that this is an infinite geometric series
that can be simplified, so the total weight becomes:

1 / (1 − λ)

8 This convenient expression assumes the mean return is zero and, strictly speaking, estimates an MLE rather than an unbiased variance.


In order to have a weighting scheme, the total weights should sum to 1.0. Therefore, each
weight is divided by the total of 1/(1 − λ); equivalently, each is multiplied by (1 − λ):
 The weight applied to the past day's observation (the first observation) is (1 − λ)
 The weight applied to the data observed two days ago (the second observation) is λ(1 − λ)
 The weight applied to the data observed n days ago is λ^(n−1)(1 − λ)

The EWMA model to estimate the current volatility based on past returns is given as follows:

σ²(t) = (1 − λ)·r²(t−1) + λ(1 − λ)·r²(t−2) + λ²(1 − λ)·r²(t−3) + ⋯

Similarly, the prior day's variance based on its own prior observations is:

σ²(t−1) = (1 − λ)·r²(t−2) + λ(1 − λ)·r²(t−3) + λ²(1 − λ)·r²(t−4) + ⋯

If we substitute the second equation into the first equation, we obtain a simpler (recursive) version
of the EWMA volatility estimator, and this is the relevant version for our purposes:

σ²(t) = (1 − λ)·r²(t−1) + λ·σ²(t−1)

For example (a common EWMA question): The prior day’s volatility estimate was 3.0%
and the prior day observed return was -2.0%. The decay parameter is 0.940.

First, we compute the prior day's variance 0.030² = 0.00090, and the prior day's squared
return (−0.02)² = 0.00040.

Then, we compute the current variance estimate as a weighted average:

σ²_n = (1 − 0.94) ∗ 0.00040 + 0.94 ∗ 0.00090 = 0.000870

Finally, we take the square root of the variance:

σ_n = √0.000870 = 2.950%

Note that the squared return is effectively a one-day variance. Viewed as such, EWMA is
really just estimating the weighted average of two variances by giving the most recent one-
day “variance” a small weight, in this case only 6.0%.
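The recursive update is easy to reproduce; below is a minimal sketch in Python (not from the
reading) using the same inputs as the example above (λ = 0.94, prior volatility 3.0%, prior
return −2.0%):

import math

def ewma_update(prior_vol: float, prior_return: float, lam: float = 0.94) -> float:
    """Updated EWMA volatility: sigma_n^2 = (1 - lam) * r_{n-1}^2 + lam * sigma_{n-1}^2."""
    variance = (1 - lam) * prior_return ** 2 + lam * prior_vol ** 2
    return math.sqrt(variance)

# Worked example from the text: prior volatility = 3.0%, prior return = -2.0%
print(f"Updated volatility: {ewma_update(0.03, -0.02):.4%}")  # ~2.9496%, i.e., 2.950% rounded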


GARCH volatility model

Another well-known volatility estimation model is the Generalized Autoregressive Conditional
Heteroskedasticity (GARCH) model. The general GARCH(p, q) equation is defined as follows:

σ²_n = ω + α₁r²_{n−1} + ⋯ + α_p r²_{n−p} + β₁σ²_{n−1} + ⋯ + β_q σ²_{n−q}

GARCH(1,1) includes one lagged squared return and one lagged variance:

σ²_n = γV_L + αr²_{n−1} + βσ²_{n−1}

Where V_L represents the long-run average variance rate and γ its weight. The GARCH(1,1)
model must satisfy the following conditions, which require that the weights sum to unity:

α + β < 1
γ = 1 − α − β
α ≥ 0; β ≥ 0

This model is superficially similar to EWMA but relaxes the constraint related to the definition
of the decay factor λ, providing in this way more degrees of freedom to the model and
therefore better in-sample explanatory power. Note that out-of-sample volatility prediction is
not always better and can be worse than in the case of EWMA. In the context of prediction,
out-of-sample performance is the relevant metric.

EWMA can be viewed as a special case of a GARCH(1,1) where γ = 0 (equivalently ω = 0) and
α + β = 1.

GARCH(1,1) long term volatility

The long-run variance rate can be calculated as follows. Usually, the output of the
GARCH(1,1) statistical estimation provides ω = γV_L as the intercept of the autoregression,
such that the model is rewritten:

σ²_n = ω + αr²_{n−1} + βσ²_{n−1}

We know that γ = 1 − α − β, which means that the three parameters γ, α and β must sum to 1.
Therefore, we can calculate the long-run variance rate as:

V_L = ω/γ = ω/(1 − α − β)

For example: Assume that the output of a GARCH(1,1) estimation provides ω = 0.000003,
α = 0.10 and β = 0.880, and therefore we have σ²_n = 0.000003 + 0.10r²_{n−1} + 0.880σ²_{n−1}.
The long-run variance here can be computed as follows:

γ = 1 − 0.10 − 0.88 = 0.020

V_L = ω/γ = 0.000003/0.020 = 0.000150

Hence, the long-run volatility is: √0.000150 = 1.2247%


Let us assume that yesterday's daily volatility estimate was 2.0% and the corresponding
observed daily return was -3.0%.

The estimate of today's variance as per the GARCH(1,1) is:

0.000003 + 0.10 × (−0.03)² + 0.880 × 0.02² = 0.0004450

The corresponding volatility is √0.0004450 = 2.110%, which is higher than the long-run
volatility of 1.2247%, as the estimate is driven by the prior day's return and the prior day's
variance estimate, both of which (in magnitude) exceed the long-run volatility. This suggests
that the current volatility has increased.
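A short sketch in Python (not from the reading) reproduces both calculations, the long-run
variance and the one-step update, using the parameters above:

import math

# GARCH(1,1) parameters from the example above
omega, alpha, beta = 0.000003, 0.10, 0.880

# Long-run (unconditional) variance: V_L = omega / (1 - alpha - beta)
gamma = 1 - alpha - beta          # 0.020
long_run_var = omega / gamma      # 0.000150
print(f"Long-run volatility: {math.sqrt(long_run_var):.4%}")  # ~1.2247%

# One-step update: sigma_n^2 = omega + alpha * r_{n-1}^2 + beta * sigma_{n-1}^2
prior_return, prior_vol = -0.03, 0.02
todays_var = omega + alpha * prior_return ** 2 + beta * prior_vol ** 2
print(f"Today's volatility estimate: {math.sqrt(todays_var):.4%}")  # ~2.1095%, i.e., 2.110% rounded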

Explain and apply approaches to estimate long horizon volatility/VaR, and describe the process of mean reversion according to a GARCH (1,1) model.
The implications of mean reversion in returns and return volatility

To forecast volatility and VaR for horizons longer than a day or a week, the key idea is the
application of the square root rule (SRR). This rule states that variance scales directly with time,
such that volatility scales with the square root of time.

Square root rule

The simplest approach to extending the horizon is to use the “square root rule” where we
extend one-period VaR to J-period VAR by multiplying by the square root of J.

VaR(r_{t,t+J}) = √J × VaR(r_{t,t+1})

For example, if the 1-period VAR is $10, then the 2-period VAR is $14.14 ($10 x square root
of 2) and the 5-period VAR is $22.36 ($10 x square root of 5).

The square root rule (i.e., variance is linear with time) for extending the time horizon only
applies under restrictive i.i.d. assumptions. So, under the two assumptions below, VaR
scales with the square root of time.
1. Random-walk (acceptable)
2. Constant volatility (unlikely)
Thus, the square root rule, while mathematically convenient, doesn’t really work in practice
because it requires that returns are independent and identically distributed (i.i.d.). Here are
two different types of assumption with respect to mean reversion:

If mean reversion…        Then the square root rule…
In returns                Overstates long-run volatility
In return volatility      If current vol. > long-run volatility, overstates
                          If current vol. < long-run volatility, understates


Modeling predictability of variables is often done through time series models accounting for
autoregression, where current estimates are a function of previous values. The
autoregressive process is a stationary process that has a long-run mean, an average level to
which the series tends to revert. This average is often called the “Long Run Mean” (LRM).
For instance, when interest rates are below their LRM they are expected to rise and vice
versa.

Mean reversion has an important effect on long-term volatility.

We assume that volatility is constant under the square root rule (SRR) but volatility is
stochastic, and, in particular, autoregressive. Volatility has a long-run mean – a “steady
state” of uncertainty. When current volatility is above its long-run mean then we can expect a
decline in volatility over the longer horizon. Then estimating the long horizon volatility using
today’s volatility will overstate the true expected long horizon volatility. On the other hand, if
today’s volatility is low, then extrapolating today’s volatility using the square root rule may
understate true long-horizon volatility. It can be shown that under GARCH (1,1), the
expected variance rate t days ahead, E[σ²_{n+t}], is given by:

E[σ²_{n+t}] = V_L + (α + β)^t (σ²_n − V_L)

This formula can be used to calculate the average variance rate over the next T days. The
expected volatility can be estimated as the square root of this average variance rate.

Suppose that our estimate of the current daily variance is 0.00010 (corresponding to a
volatility of 1% per day). Due to mean reversion, however, we expect the average daily
variance rate over the next 25 days to be 0.00018. It makes sense then to base VaR and
expected shortfall estimates for a 25-day time horizon on volatility over the 25 days. This is
equal to 6.7% (= √25 × √0.00018 = 0.067)
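To make the forecast formula concrete, here is a small sketch in Python. The α, β and V_L
values below are hypothetical (GARP's 25-day example states only the current variance of
0.00010 and the average variance of 0.00018, not the underlying parameters); the last line
simply reproduces the 6.7% figure from the stated average variance:

import math

def expected_variance(t: int, current_var: float, long_run_var: float,
                      alpha: float, beta: float) -> float:
    """GARCH(1,1) forecast: E[sigma^2_{n+t}] = V_L + (alpha + beta)**t * (sigma^2_n - V_L)."""
    return long_run_var + (alpha + beta) ** t * (current_var - long_run_var)

# Hypothetical parameters, for illustration only
alpha, beta, long_run_var, current_var = 0.10, 0.85, 0.00018, 0.00010

# Average expected daily variance rate over the next T = 25 days
T = 25
avg_var = sum(expected_variance(t, current_var, long_run_var, alpha, beta)
              for t in range(1, T + 1)) / T
print(f"Average daily variance over {T} days: {avg_var:.6f}")

# Using the average variance rate given in the text (0.00018), the 25-day volatility is:
print(f"25-day volatility: {math.sqrt(25) * math.sqrt(0.00018):.1%}")  # ~6.7%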

For FRM purposes, three definitions of mean reversion are used:


1. Mean reversion in the asset dynamics. The price/return tends towards a long-run
level; e.g., interest rate reverts to 5%, equity log return reverts to +8%
2. Mean reversion in variance. Variance reverts toward a long-run level; e.g., volatility
reverts to a long-run average of 20%. We can also refer to this as negative
autocorrelation, but it's a little trickier. Negative autocorrelation refers to the fact that
a high variance is likely to be followed in time by a low variance. The reason it's tricky
is due to short/long timeframes: the current volatility may be high relative to the long-
run mean, but it may be "sticky" or cluster in the short-term (positive autocorrelation)
yet, in the longer term it may revert to the long-run mean. So, there can be a mix of
(short-term) positive and negative autocorrelation on the way being pulled toward the
long-run mean.
3. Autoregression in the time series. The current estimate (variance) is informed by
(a function of) the previous value; e.g., in GARCH (1,1) and exponentially weighted
moving average (EWMA), the variance is a function of the previous variance.


Calculate conditional volatility with and without mean reversion.


Mean reversion can be modeled in a stationary time series framework referred to as the
analysis of autoregression (AR). An AR process has a long-run mean, an average level to
which the series tends to revert. Consider an AR (1) model, which can be presented as a
regression of a variable on its own lag. This is the simplest form of mean-reverting process,
with only one lag. For example, mean-reverting time series such as the real exchange rate, the
price/dividend or price/earnings ratio, and the inflation rate can be modeled using their own
lags9.

X_{t+1} = a + bX_t + e_{t+1}

The expected value is given as: E_t[X_{t+1}] = a + bX_t

This equation can be rewritten as: E_t[X_{t+1}] = (1 − b) ∗ [a/(1 − b)] + bX_t

This means that the next period's expectation is a weighted sum of today's value, X_t, and
the long-run mean, a/(1 − b). This time series process has a finite long-run mean, subject
to the constraint that the parameter b is less than one.

If b = 1, the long-run mean a/(1 − b) is undefined or infinite (as a is divided by 0).
In this case the process is a random walk (nonstationary) and the next period's
expected value is equal to today's value (plus the drift a).

Only if b < 1 is the process mean-reverting. When X_t is above the long-run mean, it is
expected to decline, and if it is below the long-run mean it is expected to increase in value.
Hence b is known as the "speed of reversion" parameter.

We first subtract X_t from the autoregression formula to obtain the "return" (change) in X:

X_{t+1} − X_t = a + bX_t + e_{t+1} − X_t
             = a + (b − 1)X_t + e_{t+1}

For a two-period return, we require X_{t+2}, which can be found recursively by substituting the
value of X_{t+1} = a + bX_t + e_{t+1} into its own equation:

X_{t+2} = a + bX_{t+1} + e_{t+2}
        = a + b(a + bX_t + e_{t+1}) + e_{t+2} = a + ab + b²X_t + be_{t+1} + e_{t+2}

The two-period return is, therefore:

X_{t+2} − X_t = a + ab + b²X_t + be_{t+1} + e_{t+2} − X_t
              = a(1 + b) + (b² − 1)X_t + be_{t+1} + e_{t+2}

9
Allen, Linda, Understanding Market, Credit, and Operational Risk (Blackwell Publishing Ltd 2004). Chapter 2.


The single-period conditional variance of the rate of change is calculated as:

Var_t(X_{t+1} − X_t) = Var_t(a + bX_t + e_{t+1} − X_t)
                     = Var_t(e_{t+1})
                     = σ²

The volatility of e_{t+1} is denoted by σ.

The two-period variance is calculated as:

Var_t(X_{t+2} − X_t) = Var_t(a(1 + b) + (b² − 1)X_t + be_{t+1} + e_{t+2})
                     = Var_t(be_{t+1} + e_{t+2})
                     = (1 + b²) ∗ σ²

Thus, the single-period variance is σ² and the two-period variance is (1 + b²) ∗ σ².

This means that if the process were a random walk with no mean reversion (implying b = 1),
then we can apply the square root rule and the two-period variance is 2σ². Then we
would get the standard square root volatility result (σ√2). However, when there is mean
reversion (b < 1), the square root rule for obtaining volatility fails: it overstates the true
two-period volatility.

For example, with no mean reversion, the two-period volatility would be σ√2 = 1.41σ.

With mean reversion, when b < 1, say b = 0.9, the two-period volatility is, instead,
σ√(1 + 0.9²) ≈ 1.34σ.
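A minimal sketch in Python (not from the reading) of the same comparison, with σ normalized
to 1:

import math

def two_period_vol(b: float, sigma: float = 1.0) -> float:
    """Two-period volatility of the AR(1) change: sqrt(1 + b^2) * sigma."""
    return math.sqrt(1 + b ** 2) * sigma

print(f"Random walk (b = 1.0):        {two_period_vol(1.0):.3f} x sigma")  # 1.414, the square root rule
print(f"Mean reversion (b = 0.9):     {two_period_vol(0.9):.3f} x sigma")  # 1.345
print(f"Stronger reversion (b = 0.5): {two_period_vol(0.5):.3f} x sigma")  # 1.118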

Describe the impact of mean reversion on long horizon conditional volatility estimation
As we have seen above, mean reversion affects conditional volatility and therefore risk; this is
particularly important in the case of arbitrage strategies.

For instance, in a convergence trade where traders assume explicitly that the spread
between two positions, a long and a short, is mean-reverting (i.e., when b < 1), the long-
horizon risk is smaller than the square root volatility, since σ√(1 + b²) < σ√2.

This may create a sharp difference of opinion about the risk of such a trade. Risk managers
usually assume a null hypothesis of market efficiency, which means that the spread underlying
the convergence trade is treated as a random walk.


Describe an example of updating correlation estimates.


Financial asset correlations are important variables from a risk management perspective.
We can use the exponentially weighted moving average (EWMA) model to update
correlations, in addition to its common application in updating volatility. The same logic
applies, but EWMA will update the covariance in addition to the volatilities in order to inform an
updated correlation. The update relies on the relationship cov(X, Y) = ρ σ_X σ_Y.

The covariance estimate under the EWMA is derived from the following relationship:

cov_n = λ cov_{n−1} + (1 − λ)x_{n−1}y_{n−1}

Where cov_n is the covariance between the returns of assets X and Y for the time period n.
Recall how we derive the updated volatility estimate using EWMA:

σ²_{x,n} = λσ²_{x,n−1} + (1 − λ)x²_{n−1}  →  vol estimate = σ_{x,n} = √σ²_{x,n}

Note that the decay parameter λ should be the same for the volatility model and the
covariance model to ensure consistency in the context of the correlation estimation. Once
we have derived the volatility estimates for each asset and the covariance estimate, we
derive the correlation estimate as follows:

ρ_n = cov_n / (σ_{x,n} σ_{y,n})

For example (GARP’s example in 3.7): GARP’s example10 assumes two assets with
returns X and Y respectively. The daily return volatility of the previous day has been
estimated for each asset: 1.0% for X and 2.0% for Y. The daily returns observed the
previous day were +2.0% for both assets. Yesterday’s correlation coefficient was 0.20. The
decay factor is assumed to be 0.94. The question is the following: What is the updated
estimate of today's correlation between X and Y? The first step is to compute the previous
day's covariance estimate:

cov_{n−1} = σ_{x,n−1} σ_{y,n−1} ρ_{n−1} = 0.01 ∗ 0.02 ∗ 0.20 = 0.00004

Then, we forecast the covariance and the volatilities based on the previous day's figures.

cov_n = λ cov_{n−1} + (1 − λ)x_{n−1}y_{n−1} = 0.94 ∗ 0.00004 + (1 − 0.94) ∗ 0.02 ∗ 0.02 = 0.0000616

σ²_{x,n} = λσ²_{x,n−1} + (1 − λ)x²_{n−1} = 0.94 ∗ 0.01² + (1 − 0.94) ∗ 0.02² = 0.000118, so σ_{x,n} = 1.086%

σ²_{y,n} = λσ²_{y,n−1} + (1 − λ)y²_{n−1} = 0.94 ∗ 0.02² + (1 − 0.94) ∗ 0.02² = 0.000400, so σ_{y,n} = 2.000%

Finally, we update the correlation estimate:

ρ_n = cov_n / (σ_{x,n} σ_{y,n}) = 0.0000616 / (0.01086 ∗ 0.02) ≈ 0.2836
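A sketch in Python (not from the reading) reproduces GARP's numbers; the function below
simply chains the three EWMA updates and the correlation identity:

import math

def ewma_corr_update(vol_x: float, vol_y: float, rho: float,
                     ret_x: float, ret_y: float, lam: float = 0.94) -> float:
    """Update volatilities and covariance with EWMA (same lambda throughout), then the correlation."""
    prior_cov = vol_x * vol_y * rho
    new_cov = lam * prior_cov + (1 - lam) * ret_x * ret_y
    new_vol_x = math.sqrt(lam * vol_x ** 2 + (1 - lam) * ret_x ** 2)
    new_vol_y = math.sqrt(lam * vol_y ** 2 + (1 - lam) * ret_y ** 2)
    return new_cov / (new_vol_x * new_vol_y)

# GARP's example: vols 1.0% and 2.0%, prior correlation 0.20, both returns +2.0%, lambda = 0.94
print(f"{ewma_corr_update(0.01, 0.02, 0.20, 0.02, 0.02):.4f}")  # ~0.2835 (0.2836 with the rounded 1.086% vol)

The same function, called with Hull's inputs from the next example (λ = 0.95, prior volatilities
1.0% and 2.0%, prior correlation 0.60, returns 0.50% and 2.50%), returns approximately 0.6044.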

10 2020 FRM Part I: Valuation & Risk Models, 10th Edition. Pearson Learning, 10/2019. Section 3.7 (Correlation)


For example (Hull’s Example 11.1):11

In John Hull’s example 11.1, he assumes that EWMA’s λ = 0.95 and that the previous day’s
correlation between two variables X and Y was 0.60. The volatility estimates for X and Y on
the previous day (i.e., n – 1) were 1.0% and 2.0%, respectively. Finally, the previous day’s
return (aka, percentage change) in X and Y were, respectively, 0.50% and 2.50%. See
second column below. The first column captures the previous example; the third and fourth
columns capture Hull’s EOC 11.5 and 11.6. The GARCH(1,1) is optional.

cov_{n−1} = σ_{x,n−1} σ_{y,n−1} ρ_{n−1} = 0.010 ∗ 0.020 ∗ 0.60 = 0.000120

The updated covariance is cov_n = 0.95 ∗ 0.000120 + 0.05 ∗ 0.0050 ∗ 0.0250 = 0.00012025.
As seen in the second column, EWMA also updates the volatility estimates to σ(X) =
sqrt(0.00009625) = 0.981%, and σ(Y) = sqrt(0.00041125) = 2.028%. Then:

ρ_n = cov_n / (σ_{x,n} σ_{y,n}) = 0.00012025 / (0.981% ∗ 2.028%) ≈ 0.6044

11 Hull, John C., Risk Management and Financial Institutions (Wiley Finance). Chapter 11, Example 11.1


Chapter Summary
Empirically, the return distribution of most assets is not normal. We observe skewness, fat-
tails, and time-varying parameters in the returns distribution. Normality allows for simple and
elegant solutions to risk measurement, but problems arise when normality is assumed (or
"imposed") and the assumption is breached: non-normal asset returns can distort our risk
measurement. In particular, fat tails cause us to underestimate the risk of severe losses if we
are working under an assumption of normality.

We can explain fat tails by a distribution that changes over time (a conditional distribution).
There are two things that can change in a normal distribution: the mean and the volatility.
Therefore, we can explain fat tails in two ways. The first is a time-varying conditional mean, but
this is unlikely given the assumption that markets are efficient. The second explanation is that
conditional volatility is time-varying; according to this reading's author, the latter is the more
likely explanation.

VaR can be estimated using a variety of approaches, based on historic volatility or implied
volatility. Full-valuation models tend to be non-parametric, whilst local-valuation tends to be
parametric. To be specific, the main historical approaches include: the parametric approach,
where distributional assumptions are imposed; non-parametric estimation, where no
distributional assumptions are imposed but it is assumed that history will inform the future; and
the hybrid approach, where historical data are combined with, for example, a weighting scheme.
In the implied volatility approach, volatility is inferred from option pricing models such as Black-
Scholes.

The main methods for volatility modeling include EWMA, GARCH, MA, and MDE. The
Moving Average is the simplest of these, but suffers from a ghosting feature, whereby one
extreme observation dominates the data until it’s dropped. EWMA is a popular and
widespread method, both because it has the more realistic assumption of assigning greater
weight to recent observations, its ease of implementation, and its relatively non-technical
approach making it easy to explain to management. Multivariate density estimation has the
attractive feature of having its weights informed by the current economic conditions, where
the scenarios with similar economic conditions are given a larger weight. GARCH has grown
in popularity and is generally superior to EWMA. Indeed, EWMA is a special case of
GARCH. GARCH has the advantage of being able to forecast volatility, whereas EWMA
simply gives you the current volatility estimate. GARCH must not be used if the sum of the
weights assigned to the lagged variance and lagged squared return (α + β) is greater than one,
as this implies that the model is non-stationary. This will result in unreliable estimates.

The square root rule states that the process may be scaled, provided it follows a random
walk and the volatility is constant. This can be used to extend a one-period VaR to a J-
period VAR by multiplying by the square root of J. Unfortunately, volatility is generally not
constant so this rule must be applied with caution: in industry, it is generally considered
acceptable to scale up to a week, however, beyond that, the model error makes the estimate
too uncertain.


Questions and Answers


804.1. Linda Allen introduces the following value at risk (VaR) estimation approaches:
 Historical simulation (HS)
 GARCH (1,1)
 Hybrid
 Multivariate density estimation (MDE)
 Historical standard deviation (STDEV)
 Adaptive volatility (AV)
Consider these six summary descriptions:

I. The simplest parametric approach whose weakness is sensitivity to window length and
extreme observations
II. The most convenient and prominent non-parametric approach whose weakness is
inefficient use of data
III. An interpretation of the exponentially weighted moving average (EWMA) that gives the
risk manager a rule that can be used to adapt prior beliefs about volatility in the face of news
IV. A parametric approach that assumes conditional returns are normal but unconditional
tails are heavy; and that returns are not correlated but conditional variance is mean-reverting
V. An approach that weights past squared returns not by time but instead according to the
difference between current and past states of the world
VI. An approach that modifies historical simulation by assigning exponentially declining
weights to past data such that recent (distant) returns are assigned more (less) weight

Which sequence below correctly matches the VaR estimation approach with its summary
description?
a) I = HS, II = AV, III = GARCH, IV = MDE, V = Hybrid, VI = STDEV
b) I = GARCH, II = MDE, III = Hybrid, IV = STDEV, V = HS, VI = AV
c) I = STDEV, II = HS, III = AV, IV = GARCH, V = MDE, VI = Hybrid
d) I = AV, II = Hybrid, III = STDEV, IV = HS, V = GARCH, VI = MDE

804.2. Dennis the Risk Analyst is calculating the 95.0% value at risk (VaR) under the hybrid
approach which is a hybrid between the historical simulation (HS) and exponentially
weighted moving average (EWMA). His historical window is only 90 days and he has set his
smoothing parameter 0.860; that is, λ = 0.860 and K = 90 days. Below are displayed the
(rounded) weights assigned under this approach to the three worst returns in the historical
window (which occurred, respectively, 27, 13 and 15 days ago).

In regard to the weight assigned to the 4th worst return, which was -6.0%, and the consequent
95.0% VaR, which of the following statements is TRUE?
a) The hybrid weight assigned to the 4th worst return (-6.0%) is 1.457% and the 95.0% VaR is
(a worst expected loss of) 6.0%
b) The hybrid weight assigned to the 4th worst return (-6.0%) is 1.457% and the 95.0% VaR is
(a worst expected loss of) 7.0%
c) The hybrid weight assigned to the 4th worst return (-6.0%) is 2.664% and the 95.0% VaR is
(a worst expected loss of) 5.0%
d) The hybrid weight assigned to the 4th worst return (-6.0%) is 4.189% and the 95.0% VaR is
(a worst expected loss of) 5.0%


803.2. The daily standard deviation of a risky asset is 1.40%. If daily returns are
independent, then of course we can use the square root rule (SRR) to scale into a two-day
volatility given by 1.40%*SQRT(2) = 1.980%. However, if the lag-1 autocorrelation of returns
is significantly positive at 0.60, then which is NEAREST to the corresponding scaled two-day
volatility?
a) 1.64%
b) 1.98%
c) 2.33%
d) 2.50%

803.3. Based on daily asset prices over the last month (n = 20 days), Rebecca the Risk
Analyst has calculated the daily volatility as a basic historical standard deviation
(abbreviated STDEV to match Linda Allen's label for the equally weighted average of
squared deviations which "is the simplest and most common way to estimate or predict
future volatility"). As the exhibit below shows, however, she could not quite decide which
mean, µ, to use for the sample standard deviation: the actual sample mean return was -
45.12 basis points; but she has also seen a zero mean assumed (i.e., volatility computed as the
average squared return); and finally, the expected return is actually +20.0 bps. Each produces a
slightly different sample standard deviation. For example, assuming the actual sample mean
produces a daily volatility of 56.04 bps, which matches the Excel function STDEV.S().

Which of the following statements is TRUE about Rebecca's calculations?


a) An advantage of this short 20-day window under the STDEV approach is that it
minimizes the so-called ghosting effect
b) If the returns are especially heavy-tailed (ie, non-normal), this STDEV is a superior
predictor to absolute standard deviation
c) If the actual 45.12 bps decline was uniquely unpredictable, it is reasonable to use the
currently expected positive µ = +20 bps as the conditional mean parameter
d) While the assumption of zero mean is technically acceptable as an MLE (as opposed
to unbiased) estimator of population standard deviation, it should in practice be
avoided because we seek a conditional mean parameter


Answers

804.1. C. TRUE: I = STDEV, II = HS, III = AV, IV = GARCH, V = MDE, VI = Hybrid. These
terms are matched with their summary explanation below, but grouped by parametric/non-
parametric and sub-sorted in the order of Linda Allen's presentation.

Parametric approaches (please note that Allen occasionally blurs the distinction between
volatility and VaR, but these parametric VaR approaches generally produce a volatility that
can be scaled by a confidence deviate to retrieve VaR):
 Historical standard deviation (STDEV): The simplest parametric approach whose
weakness is sensitivity to window length and extreme observations.
 Adaptive volatility (AV): An interpretation of the exponentially weighted moving
average (EWMA) that gives the risk manager a rule that can be used to adapt prior
beliefs about volatility in the face of news
 GARCH (1,1): A parametric approach that assumes conditional returns are normal
but unconditional tails are heavy; and that returns are not correlated but conditional
variance is mean-reverting
Non-parametric approaches
 Historical simulation (HS): The most convenient and prominent non-parametric
approach whose weakness is inefficient use of data
 Multivariate density estimation (MDE): An approach that weights past squared
returns not by time but instead according to the difference between current and past
states of the world
 Hybrid (aka, age-weighted which Dowd has dubbed "semi-parametric"): An approach
that modifies historical simulation by assigning exponentially declining weights to
past data such that recent (distant) returns are assigned more (less) weight

804.2. A. TRUE: The hybrid weight assigned to the 4th worst return (-6.0%) is 1.457%
and the 95.0% VaR is (a worst expected loss of) 6.0%.

The easiest solution is to realize that the lambda parameter reflects the ratio of weights
between successive days. In this way, the weight assigned to the 16th day is simply 1.695%
* 0.86 = 1.4577%. Adding this to 4.264% generates the corresponding cumulative weight of
5.7217% such that the 5.0% quantile falls at this 4th-worst return of -6.0%. However, Linda
Allen writes that "the observation itself can be thought of as a random event with a
probability mass centered where the observation is actually observed.12" Under this technical
approach (which is akin to an interpolation but with the subtle difference that the weights are
spanning the return observations), the -6.0% return corresponds EXACTLY to 4.264% +
1.4577%/2 ≈ 4.99%, which is almost exactly 5.0%. Consequently, the technically exact
approach also returns the loss of 6.0% as the 95.0% confident VaR.
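For reference, the hybrid (age-weighted) weights used in this question can be reproduced
directly; the sketch below (Python) assumes Allen's age-weighting scheme, in which the weight
on the return observed i days ago is (1 − λ)λ^(i−1)/(1 − λ^K), with λ = 0.86 and K = 90:

lam, K = 0.86, 90

def hybrid_weight(i: int) -> float:
    """Weight on the return observed i days ago under the hybrid (age-weighted HS) approach."""
    return (1 - lam) * lam ** (i - 1) / (1 - lam ** K)

# The three worst returns occurred 27, 13 and 15 days ago; the 4th worst occurred 16 days ago
for i in (27, 13, 15, 16):
    print(f"{i:>2} days ago: {hybrid_weight(i):.4%}")
# ~0.277%, ~2.291%, ~1.695% and ~1.458%: the three worst cumulate to ~4.264%,
# and adding the 4th worst return takes the cumulative weight to ~5.72%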

Discuss the two questions above here in the forum:


https://www.bionicturtle.com/forum/threads/p1-t4-804-value-at-risk-var-estimation-approaches-allen-ch-2.13700/

12 Linda Allen, Understanding Market, Credit and Operational Risk: The Value at Risk Approach (Oxford: Blackwell Publishing, 2004)


803.2. D. 2.50%. The 2-period variance is given by σ^2[r(t,t+2)] = σ^2[r(t,t+1)] +
σ^2[r(t+1,t+2)] + 2*COV[r(t,t+1), r(t+1,t+2)].

In this case, 2-day volatility = SQRT(0.0140^2 + 0.0140^2 + 2*0.60*0.0140*0.0140) =
SQRT(0.00062720) = 2.5044%
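The same calculation as a short sketch in Python (not from the source):

import math

sigma, rho = 0.0140, 0.60  # daily volatility and lag-1 autocorrelation of returns

# Two-day variance: var(r1) + var(r2) + 2*cov(r1, r2), where cov = rho * sigma^2
two_day_vol = math.sqrt(sigma ** 2 + sigma ** 2 + 2 * rho * sigma * sigma)
print(f"Two-day volatility: {two_day_vol:.4%}")  # ~2.5044%, versus 1.98% under the i.i.d. square root rule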

803.3. C. True: If the actual 45.12 bps decline was uniquely unpredictable, it is
reasonable to use the currently expected positive µ = +20 bps as the conditional mean
parameter.

In regard to (A), (B) and (D), each is FALSE:


 In regard to false (A), the STDEV approach with a short window maximizes the
ghosting effect!
 In regard to false (B), writes Linda Allen13, "it is worthwhile mentioning that an
alternative procedure of calculating the volatility involves averaging absolute values
of returns, rather than squared returns. This method is considered more robust when
the distribution is non-normal. In fact it is possible to show that while under the
normality assumption STDEV is optimal, when returns are non-normal, and, in
particular, fat tailed, then the absolute squared deviation method may provide a
superior forecast."
 In regard to false (D), zero mean is actually a very acceptable choice (and
assuming zero mean does not imply an MLE estimator!)
Linda Allen14: "Suppose, for example, that we need to estimate the volatility of the stock
market, and we decide to use a window of the most recent 100 trading days. Suppose
further that over the past 100 days the market has declined by 25 percent. This can be
represented as an average decline of 25bp/day (−2,500bp/100days = −25bp/day). Recall
that the econometrician is trying to estimate the conditional mean and volatility that were
known to market participants during the period. Using −25bp/day as µt, the conditional
mean, and then estimating σ2 t, implicitly assumes that market participants knew of the
decline, and that their conditional distribution was centered around minus 25bp/day.

Since we believe that the decline [in her example, the historical sample mean was entirely
unpredictable], imposing our priors by using µt = 0 is a logical alternative. Another approach
is to use the unconditional mean, or an expected change based on some other theory as the
conditional mean parameter. In the case of equities, for instance, we may want to use the
unconditional average return on equities using a longer period – for example 12 percent per
annum, which is the sum of the average risk free rate (approximately 6 percent) plus the
average equity risk premium (6 percent). This translates into an average daily increase in
equity prices of approximately 4.5bp/day. This is a relatively small number that tends to
make little difference in application, but has a sound economic rationale underlying its use."

Discuss the two questions above here in the forum:


https://www.bionicturtle.com/forum/threads/p1-t4-803-quantifying-volatility-in-value-at-risk-var-models-allen-ch-2.13519/

13 Linda Allen, Understanding Market, Credit and Operational Risk: The Value at Risk Approach (Oxford: Blackwell Publishing, 2004)
14 Linda Allen, Understanding Market, Credit and Operational Risk: The Value at Risk Approach (Oxford: Blackwell Publishing, 2004)

