T4-VRM-3-Ch3-Volatility-v3 - Study Notes
[INTRODUCTORY NON-LO] RELATIONSHIP BETWEEN VOLATILITY AND VALUE AT RISK (VAR)
EXPLAIN HOW ASSET RETURN DISTRIBUTIONS TEND TO DEVIATE FROM THE NORMAL DISTRIBUTION.
EXPLAIN REASONS FOR FAT TAILS IN A RETURN DISTRIBUTION AND DESCRIBE THEIR IMPLICATIONS.
DISTINGUISH BETWEEN CONDITIONAL AND UNCONDITIONAL DISTRIBUTIONS.
DESCRIBE THE IMPLICATIONS OF REGIME SWITCHING ON QUANTIFYING VOLATILITY.
EXPLAIN THE VARIOUS APPROACHES FOR ESTIMATING VAR.
COMPARE AND CONTRAST PARAMETRIC AND NON-PARAMETRIC APPROACHES FOR ESTIMATING CONDITIONAL VOLATILITY. CALCULATE CONDITIONAL VOLATILITY USING PARAMETRIC AND NON-PARAMETRIC APPROACHES.
EXPLAIN HOW IMPLIED VOLATILITY CAN BE USED TO PREDICT FUTURE VOLATILITY.
EVALUATE IMPLIED VOLATILITY AS A PREDICTOR OF FUTURE VOLATILITY AND ITS SHORTCOMINGS.
APPLY THE EXPONENTIALLY WEIGHTED MOVING AVERAGE (EWMA) APPROACH AND THE GARCH (1,1) MODEL TO ESTIMATE VOLATILITY.
EXPLAIN AND APPLY APPROACHES TO ESTIMATE LONG HORIZON VOLATILITY/VAR, AND DESCRIBE THE PROCESS OF MEAN REVERSION ACCORDING TO A GARCH (1,1) MODEL.
CALCULATE CONDITIONAL VOLATILITY WITH AND WITHOUT MEAN REVERSION.
DESCRIBE THE IMPACT OF MEAN REVERSION ON LONG HORIZON CONDITIONAL VOLATILITY ESTIMATION.
DESCRIBE AN EXAMPLE OF UPDATING CORRELATION ESTIMATES.
CHAPTER SUMMARY
QUESTIONS AND ANSWERS
Explain how asset return distributions tend to deviate from the normal
distribution.
Explain reasons for fat tails in a return distribution and describe their implications.
Explain long horizon volatility/VaR and the process of mean reversion according
to a GARCH(1,1) model.
The three value at risk (VaR) approaches are: parametric (aka, analytical), historical
simulation and Monte Carlo simulation. Within parametric approaches, the most
common is the classic linear normal VaR which simply scales (multiplies) volatility.
Two assumptions in particular are among the convenient but egregious (unrealistic) assumptions employed in the linear normal approach: first, our tendency to "impose normality" on returns that are actually non-normal; and, second, our belief that returns are independent over time (the square root rule for scaling VaR/volatility assumes independence).
On the fallacy of our tendency to "impose normality:" the fact that we can compute a standard deviation, or that we only have the first two distributional moments, does not by itself imply the dataset is normally distributed (of course!)
Risk varies over time. Models often assume a normal (Gaussian) distribution with constant volatility from period to period; e.g., Black-Scholes-Merton assumes constant volatility. But actual returns are non-normal and volatility is typically time-varying. Consequently, it is hard to apply parametric approaches to actual returns: robust distributional assumptions for stochastic asset returns are difficult to find.
Aside from implied volatility, we are chiefly concerned with the three basic methods
that estimate volatility with historical return data: (simple) historical simulation;
exponentially weighted moving average (EWMA); and GARCH(1,1).
Historical simulation has many variations (see Dowd), but our initial concern is the "hybrid approach," which assigns weights according to the EWMA model.
Conditional parameter (e.g., conditional volatility): this is a parameter such as
variance that depends on (is conditional on) circumstances or prior information. A
conditional parameter, by definition, changes over time. This is generally realistic!
Persistence: In EWMA, persistence is the lambda parameter (λ). In GARCH(1,1), it is
the sum of the alpha (α) and beta (β) parameters. In GARCH(1,1), high persistence
implies slow decay toward the long-run average variance; EWMA does not embed
any long-run average variance.
EWMA can be viewed as a special case of GARCH(1,1) but where the weight
assigned to the long-run variance (i.e., gamma) is zero. This is not to be confused
with a zero long-run variance; rather, EWMA has no long-run variance. Conversely,
GARCH(1,1) can be viewed as a generalized EWMA.
In GARCH(1,1), the implied L.R. Variance = ω/(1-[α+β]) = ω/(1-persistence). This is
also called the unconditional variance.
Two GARCH terms: Autoregressive: Recursive. A parameter (today’s variance) is a
function of itself (yesterday’s variance). Heteroskedastic: Variance changes over time
(homoscedastic = constant variance).
Leptokurtosis: a heavy-tailed distribution where relatively more observations are near the middle and in the "fat tails" (kurtosis > 3). The GARCH(1,1) model exhibits heavy tails in the unconditional distribution of returns, yet the model assumes conditional returns are normal.
In a convenient and relatively simple approach, normal linear VaR simply scales volatility.
For example, if µ = zero and σ = 10.0%, then 95.0% normal VaR = 10.0% × 1.645 = 16.45%.
In this way, our most common parametric VaR approach is just a multiple of volatility and
further often inherits the properties of volatility. As reviewed in the previous chapter, you
should be able to calculate the normal linear value at risk (VaR) for both a single asset
and its extension to a two-asset portfolio.
Single-asset normal value at risk (VaR) is volatility scaled by the deviate (e.g., 1.645)
In the following example, we will make the following assumptions: there are 250 trading days
per year. The current asset value equals $200.00. The asset’s return volatility is 10.0% per
annum. Finally, we will assume the expected return is zero; aka, relative VaR.
Further, any value at risk (VaR) estimate requires two user-based “design decisions:”
What is the confidence level? In this case, the selected confidence is 95.0%
What is the desired horizon? In this case, the horizon is one day. Therefore, we will
need to scale volatility (and VaR) from one year to one day.
Given these assumptions and our VaR choices (i.e., confidence and horizon), we calculate:
Per annum VaR = $200.0 * 10.0% * 1.645 = $32.90. But our horizon is one day.
One-day VaR = $200 * 10.0% * sqrt(1/250) * 1.645 = $2.08; the volatility is scaled
per the so-called “square root rule”
Relative VaR (versus absolute VaR): we have here assumed the expected return
(aka, drift) for a single day is zero; alternatively we have ignored the expected
return. This is common when estimating the one-day VaR. Technically, if we exclude
the expected return, we are estimating a relative VaR (as opposed to an absolute
VaR). It is called a relative VaR because the worst expected loss is measured relative to the expected (future) value at the end of the horizon.
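To make the arithmetic concrete, here is a minimal Python sketch of the same single-asset calculation (the $200 value, 10% volatility, 95% deviate and 250-day year are the example's assumptions):

```python
from math import sqrt

# Single-asset normal (relative) VaR, per the worked example above.
asset_value = 200.00      # current position value, $
sigma_annual = 0.10       # 10.0% per annum return volatility
z_95 = 1.645              # normal deviate for 95.0% confidence
trading_days = 250

var_annual = asset_value * sigma_annual * z_95
sigma_daily = sigma_annual * sqrt(1 / trading_days)      # square root rule
var_daily = asset_value * sigma_daily * z_95

print(f"Per annum 95% VaR: ${var_annual:.2f}")   # ~$32.90
print(f"One-day   95% VaR: ${var_daily:.2f}")    # ~$2.08
```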
We build on the previous example by considering a portfolio of two assets rather than a
single asset. Notice this requires an additional assumption: what is the correlation between
the two assets? Correlation is almost always denoted by Greek rho, ρ. In this example, we
assume the correlation is zero. The portfolio VaR is an increasing function of correlation; at
perfect correlation, the portfolio VaR will equal the sum of the individual VaRs (which
illustrates diversification: at any correlation less than perfect 100%, the portfolio VaR will be
less than the sum of the individual VaRs). Again, because the horizon is short (i.e., one day)
by convention we can safely assume the expected return is zero; put another way, we are
estimating a relative VaR. Notice that this is essentially an exercise in computing portfolio
volatility: the translation from portfolio volatility to VaR is achieved by the single step of
multiplication. In this instance of 95.0% confidence, we simply multiply by 1.645 in order to
translate a one-day portfolio volatility of $1.41 into a one-day 95.0% VaR of $2.33.
Given these assumptions about the asset and our VaR choices (i.e., confidence and
horizon), we can calculate:
Per annum portfolio variance = 0.5² × 0.1² + 0.5² × 0.2² + 2 × 0.5 × 0.5 × 0.1 × 0.2 × 0 = 0.01250,
such that per annum portfolio volatility = sqrt(0.01250) = 11.18%
Per annum (diversified) portfolio VaR = 11.18% * 1.645 = 18.39% such that the
time-scaled one-day portfolio VaR = $200.0 * 18.39% * sqrt(1/250) = $2.33
There is another approach that utilizes the individual VaRs of $1.04 for Asset A and $2.08 for Asset B. Where iVaR denotes the individual VaR, this uses: portfolio VaR = sqrt(iVaR(A)² + iVaR(B)² + 2 × ρ × iVaR(A) × iVaR(B)); with ρ = 0, sqrt($1.04² + $2.08²) ≈ $2.33.
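The sketch below (Python, using the example's assumed weights, volatilities and zero correlation) computes the one-day 95% portfolio VaR both ways, from the portfolio variance and from the individual VaRs; both land at roughly $2.33:

```python
from math import sqrt

# Two-asset, one-day, 95% portfolio VaR computed two equivalent ways.
w_a, w_b = 0.5, 0.5
sig_a, sig_b = 0.10, 0.20      # per annum volatilities
rho = 0.0
value, z_95, days = 200.0, 1.645, 250

# (1) From portfolio variance
var_p = w_a**2 * sig_a**2 + w_b**2 * sig_b**2 + 2 * w_a * w_b * sig_a * sig_b * rho
port_var_1d = value * sqrt(var_p) * z_95 * sqrt(1 / days)    # ~$2.33

# (2) From the individual VaRs (iVaR)
ivar_a = value * w_a * sig_a * z_95 * sqrt(1 / days)          # ~$1.04
ivar_b = value * w_b * sig_b * z_95 * sqrt(1 / days)          # ~$2.08
port_var_2 = sqrt(ivar_a**2 + ivar_b**2 + 2 * rho * ivar_a * ivar_b)

print(round(port_var_1d, 2), round(port_var_2, 2))            # both ≈ 2.33
```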
Portfolio volatility and portfolio VaR are increasing functions of correlation
In this two-asset portfolio, a key relationship is the correlation parameter’s influence on the
diversified portfolio VaR: higher correlation implies higher portfolio VaR. Using the same
individual VaR parameters given previously (i.e., an equally-weighted portfolio with
respective volatilities of 10% and 20%), below we show the portfolio volatility and VaR given
a range of assumptions for the correlation parameter, ρ(A,B), from perfectly negative to
perfectly positive:
The corresponding plot (below) shows the portfolio VaR is increasing and non-linear. Notice
at perfect correlation (ρ = 1.0), the portfolio VaR is equal to the sum of the individual VaRs:
at ρ=1.0, portfolio VaR equals $3.12 which is the sum of $1.04 and $2.08. This is
unsurprising because, under this normal linear VaR estimate, VaR is simply a multiple of
(i.e., multiplier on) the standard deviation.
When we scale volatility or VaR over time, the critical time-scaling assumption is the i.i.d. (aka, iid) assumption, specifically the independence assumption. Despite our use of the normal distribution, and contrary to a sometimes popular belief, normality is not the critical assumption here! When scaling VaR down from one year to one day (or up from one day to one year) we typically assume independence. Independence is a common but unrealistic assumption.
If returns are independent across time, then autocorrelation (aka, serial correlation) is zero.
By the contrapositive, the following is also true: if the autocorrelation is non-zero, then the
returns are not independent across time. Therefore, non-zero autocorrelation (aka, non-
zero serial correlation) is by definition a violation of the i.i.d. assumption and renders
inaccurate the square root rule of scaling VaR. For the FRM, you do not need to be
able to quantify the implications of this i.i.d. violation. Nonetheless, it is illustrated
below so that you can see the directional impact. We assume:
Conveniently, the daily volatility is around 1.0%
The first column illustrates the typical approach used to scale the daily volatility to a
10-day VaR: 10-day VaR = 1.0% * 2.33 deviate * sqrt(10/1) = 7.36%
At the bottom, compare the 10-day VaR of 7.36% to the second and third columns:
When autocorrelation is +0.20, the scaled 10-day VaR is 8.82%
When autocorrelation is -0.20, the scaled 10-day VaR is 6.13%. Negative
autocorrelation reflects mean reversion and we can see that, when there exists mean
reversion, the i.i.d. scaled VaR of 7.36% will overstate the “adjusted” VaR.
Note how autocorrelation informs the adjusted standard deviation. Again, the
VaR simply multiplies this adjusted standard deviation by the normal deviate of 2.33.
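The following sketch reproduces the directional impact approximately. It assumes the autocorrelation decays geometrically across lags, so that the T-day variance is σ²[T + 2Σ(T−j)ρ^j], which is one standard adjustment for scaling under non-zero serial correlation; the small differences versus the 7.36%/8.82%/6.13% figures quoted above are rounding:

```python
from math import sqrt

def scaled_sigma(sigma_daily: float, days: int, rho: float) -> float:
    """T-day volatility when the lag-1 autocorrelation rho decays geometrically.
    Var(sum of T returns) = sigma^2 * [T + 2*sum_{j=1}^{T-1} (T - j)*rho^j].
    With rho = 0 this collapses to the square root rule."""
    adj = days + 2 * sum((days - j) * rho**j for j in range(1, days))
    return sigma_daily * sqrt(adj)

sigma, z_99, T = 0.01, 2.33, 10
for rho in (0.0, 0.20, -0.20):
    print(rho, round(scaled_sigma(sigma, T, rho) * z_99 * 100, 2), "%")
# rho =  0.00 -> ~7.37%  (square root rule)
# rho = +0.20 -> ~8.83%  (trending returns raise the scaled VaR)
# rho = -0.20 -> ~6.14%  (mean reversion lowers the scaled VaR)
```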
How can outliers be indications that the volatility varies with time?
We observe that actual financial returns tend to exhibit fat-tails. Here are two possible
explanations:
1. The true distribution is stationary. Therefore, fat-tails reflect the true distribution but
the normal distribution is not appropriate
2. The true distribution changes over time (it is “time-varying”). In this case, outliers can,
in reality, reflect a time-varying volatility. This is considered the more likely
explanation.
Since the tails of empirical return distributions are fatter than those of the normal distribution, this causes problems in risk measurement. Risk measurement focuses on extreme events, attempting to quantify the probability and size of severe losses. When we impose normal distribution assumptions on real-world data, the assumption fails exactly where we need it to work the most: in the tails of the distribution. This typically results in underestimation of the VaR loss.
A conditional distribution is not always the same: it differs depending on (is conditional on) economic or market events or other states. It is characterized by parameters such as its conditional mean, conditional standard deviation (conditional volatility), conditional skew, and conditional kurtosis.
Heavy tails appear mostly in the unconditional distribution, whereas the conditional distribution is often approximately normal, although with means or volatilities that vary at different points in time. This is because conditional returns can be normalized by standardizing them with their respective conditional parameters; their distribution can then be represented as a standard normal variable.
So, instead of assuming asset returns are normally distributed, it makes more sense to
assume that asset returns are conditionally normally distributed. However, caution has to be
applied, since in reality, asset returns are generally non-normal, whether unconditional or
conditional; i.e., fat tails are exhibited in asset returns regardless of the method we apply.
The problem: a risk manager may assume (and measure) an unconditional volatility, but the distribution is actually regime-switching. In this case, the distribution is conditional (i.e., it depends on conditions) and might be normal within each regime; e.g., volatility is 10% during a low-volatility regime and 20% during a high-volatility regime, but during both regimes the distribution may be normal. The risk manager may incorrectly assume a single 15% unconditional volatility. In this case, the unconditional distribution is likely to exhibit fat tails because it does not account for the regime switching.
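A small simulation illustrates the point. The regimes and probabilities below are hypothetical, not from the reading; the sketch only shows that a 50/50 mixture of two conditionally normal regimes (10% and 20% volatility) produces an unconditional distribution with kurtosis above 3:

```python
import random

# Hypothetical regime-switching sketch: each day's return is conditionally
# normal, but the volatility regime alternates between 10% and 20%.
random.seed(42)
returns = []
for _ in range(100_000):
    sigma = 0.10 if random.random() < 0.5 else 0.20   # two normal regimes
    returns.append(random.gauss(0.0, sigma))

n = len(returns)
mean = sum(returns) / n
var = sum((r - mean) ** 2 for r in returns) / n
kurt = sum((r - mean) ** 4 for r in returns) / n / var**2
print(round(kurt, 2))   # > 3: fat tails, even though each regime is normal
```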
Jorion illustrates another view on the same set of approaches (see below). He first distinguishes between local valuation (which is associated with parametric approaches) and full valuation (which includes both historical and Monte Carlo simulation). Note also that, in general, stress testing requires a full valuation (i.e., simulation) approach.
The key difference between parametric and non-parametric approaches concerns the
reliance on a distribution and the role of data:
A parametric approach uses (historical) data in order to inform distributional
parameters, typically the variance and covariance. These parameters characterize a
distribution; e.g., normal, non-normal, GEV in the tail. Then value at risk (VaR) or
expected shortfall (ES) can be estimated directly from the distribution. In this way, the
parametric approach is characterized by its reliance on a distributional assumption.
The classic non-parametric approach is historical simulation. In this approach, VaR
(or ES) is estimated directly from the data without reference to a distribution function.
Simple historical simulation requires zero parameters; it only needs the dataset.
Parametric approaches
Under the parametric methods, volatility is estimated in different models by means of placing
different weights on historic data. Although there are many variations, we are concerned with
the following three basic parametric approaches to the estimation of volatility:
Historical standard deviation
Exponential Smoothing (EWMA or Risk MetricsTM Approach), and
GARCH(1,1)
$$\sigma_t^2 = \frac{r_{t-m,t-m+1}^2 + \cdots + r_{t-2,t-1}^2 + r_{t-1,t}^2}{m}$$
This standard deviation is called a moving average (MA) by Jorion. The estimate requires a
window of fixed length; e.g., 30 or 60 trading days. If we observe returns, r(i), over M days,
the volatility estimate is constructed from a moving average (MA). This is an identical
measure:
$$\sigma_t^2 = \frac{1}{M}\sum_{i=1}^{M} r_{t-i}^2$$
Each day, the estimate is updated by adding the most recent day and dropping the furthest
day. In a simple moving average, all weights on past returns are equal and set to (1/M).
Note raw returns are used instead of returns around the mean (i.e., the expected mean is
assumed zero). This is common in short time intervals, where it makes little difference on the
volatility estimate.
For example, assume the previous four daily returns for a stock are 6% (n-1), 5% (n-2), 4%
(n-3) and 3% (n-4). What is a current volatility estimate, applying the moving average, given
that our short trailing window is only four days (m= 4)? If we square each return, the series is
0.0036, 0.0025, 0.0016 and 0.0009. If we sum this series of squared returns, we get 0.0086.
Divide by 4 (since m=4) and we get 0.00215. That’s the moving average variance, such that
the moving average volatility is about 4.64%. The above example illustrates a key weakness
of the moving average (MA): since all returns weigh equally, the trend does not matter. In the
example above, notice that volatility is trending down, but MA does not reflect in any way this
trend. We could reverse the order of the historical series and the MA estimation would
produce the same result.
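A minimal sketch of that moving-average calculation (the four returns and the four-day window are the example's assumptions):

```python
from math import sqrt

# Simple moving-average (MA/STDEV) estimate: equal 1/m weights on the last
# m squared returns (zero mean assumed, per the text).
returns = [0.06, 0.05, 0.04, 0.03]        # the example's last four daily returns
m = len(returns)
ma_variance = sum(r**2 for r in returns) / m
print(round(ma_variance, 5), round(sqrt(ma_variance), 4))   # 0.00215, ~0.0464 (4.64%)

# Reversing the order changes nothing: the MA ignores the trend.
assert sum(r**2 for r in reversed(returns)) / m == ma_variance
```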
This standard deviation (STDEV; aka, MA) series is simple but has drawbacks.
Importantly, EWMA is a special case of GARCH and here is how we get from GARCH (1,1)
to EWMA. We will review this more closely later, but for the moment realize:
GARCH(1,1): $\sigma_t^2 = \omega + \alpha r_{t-1}^2 + \beta \sigma_{t-1}^2$
If we set $\omega = 0$, $\alpha = 1-\lambda$ and $\beta = \lambda$, this becomes the formula for the exponentially weighted moving average (EWMA):
EWMA: $\sigma_t^2 = \lambda \sigma_{t-1}^2 + (1-\lambda) r_{t-1}^2$, where it is typical to use lambda (λ) for the so-called smoothing parameter.
In EWMA, the lambda parameter (λ) determines the "decay": a lambda that is close to one
(high lambda) exhibits slow decay. RiskMetrics™ is a branded form of the exponentially
weighted moving average (EWMA) approach. The optimal (theoretical) lambda varies by
asset class, but the overall optimal parameter used by RiskMetrics™ has been 0.94. In
practice, RiskMetrics™ only uses one decay factor for all series:
0.94 for daily data
0.97 for monthly data (month defined as 25 trading days)
Technically, the daily and monthly models are inconsistent. However, they are both easy to
use, they approximate the behavior of actual data quite well, and they are robust to
misspecification.
GARCH regresses on lagged or historical terms. The lagged terms are either variance or
squared returns. The generic GARCH (p, q) model regresses on (p) squared returns and (q)
variances. Therefore, GARCH (1, 1) “lags” or regresses on last period’s squared return (i.e.,
just 1 return) and last period’s variance (i.e., just 1 variance).
Please note: GARCH (1, 1) is often represented with Greek parameters (per Hull) but can be
expressed with alternative notation:
$$\sigma_n^2 = \omega + \alpha u_{n-1}^2 + \beta \sigma_{n-1}^2 = \gamma V_L + \alpha u_{n-1}^2 + \beta \sigma_{n-1}^2$$
where
$\sigma_n^2$ = conditional variance (the updated estimate)
$\sigma_{n-1}^2$ = previous variance
$u_{n-1}^2$ (or $r_{n-1}^2$) = previous squared return
$\omega$ (or $\gamma V_L$) = weighted long-run average variance, where $\omega = \gamma \times V_L$ and $\gamma$ is the weight of the long-run average variance rate
$\alpha$ = weight of the lagged squared return
$\beta$ = weight of the lagged variance
A persistence of 1.0 implies no mean reversion. A persistence of less than 1.0 implies "reversion to the mean," where a lower persistence implies greater reversion to the mean. If the weights assigned to the lagged variance and lagged squared return sum to more than one, the model is non-stationary. That is, if (α+β) > 1 the model is non-stationary and, according to Hull, unstable, in which case EWMA is preferred.
The three GARCH(1,1) parameters are weights and therefore must sum to one:
In GARCH (1,1), the weights should all add to unity such that γ + α + β = 1. Be careful about the first term in the GARCH (1,1) equation: omega (ω) = gamma (γ) × average long-run variance (V_L). If you are asked for the long-run variance, you may need to divide out the weight in order to compute it.
The average, unconditional variance in the GARCH (1,1) model is given by:
$$V_L = \frac{\omega}{\gamma} = \frac{\omega}{1-(\alpha+\beta)}$$
The FRM likes to ask about this unconditional variance in the GARCH(1,1) model. The question may give you the GARCH(1,1) model parameters and ask you to solve for the implied long-run (unconditional) variance. For example, if you are given the fitted model $\sigma_n^2 = \omega + \alpha r_{n-1}^2 + \beta \sigma_{n-1}^2$, then the implied long-run variance is $V_L = \omega/(1 - \alpha - \beta)$.
In practice, variance rates tend to be mean-reverting; therefore, the GARCH (1, 1) model is
theoretically superior (“more appealing than”) to the EWMA model. Remember, that’s the big
difference: GARCH adds the parameter that weighs the long-run average and therefore it
incorporates mean reversion. GARCH (1, 1) is preferred unless the first omega parameter is
negative (which is implied if alpha + beta > 1). In this case, GARCH (1,1) is unstable and
EWMA is preferred.
Recall the two key drawbacks of the simple moving average (STDEV/MA) approach noted above:
1. Ghosting feature: volatility shocks (sudden increases) are abruptly incorporated into the MA metric and then, when the trailing window passes, they are abruptly dropped from the calculation. Due to this, the MA metric will shift in relation to the chosen window length.
2. Trend information is not incorporated.
GARCH is both "compact" (i.e., relatively simple) and remarkably accurate for predicting volatility for in-sample data. However, as the GARCH parameters are re-estimated with new information, the forecasts change with them, sometimes due to estimation error, which induces model risk. So forecasting out-of-sample data with GARCH poses a risk. GARCH models predominate in scholarly research. Many variations of the GARCH model have been attempted, but few have improved on the original. The drawback of the GARCH model is its nonlinearity.
EWMA vs GARCH(1,1)
Each of GARCH(1,1), EWMA and RiskMetrics™ is parametric and recursive.
GARCH(1,1) is generally considered superior (and, after all, is the general case of both
EWMA and MA):
GARCH estimation can provide estimates that are more accurate than MA
Except Linda Allen warns²: GARCH(1,1) needs more parameters and may pose greater MODEL RISK ("chases a moving target") when forecasting out-of-sample
1 Allen, Linda, Understanding Market, Credit, and Operational Risk (Blackwell Publishing Ltd, 2004).
2 Allen, Linda, Understanding Market, Credit, and Operational Risk (Blackwell Publishing Ltd, 2004), Chapter 2.
Nonparametric methods
We have seen the parametric methods (above). On the other hand, nonparametric methods
estimate VaR directly from the data, without making any assumptions about the entire
distribution of returns. Given the problems of fat tails and skewness that arise when normal
distribution is assumed, nonparametric methods potentially avoid such issues. However, it
should be noted that while parametric methods use data more efficiently, the precision of
nonparametric methods is dependent upon large samples; put simply, it is very difficult to
apply nonparametric methods to small samples.
Historical simulation is easy because it does not require estimation of any parameters: we only need to determine the "lookback window." The problem is that, for small samples, the extreme percentiles (e.g., the worst one percent) are estimated less precisely. Historical simulation also uses data inefficiently, effectively throwing away information because the VaR estimate depends only on the tail observations.
The key feature of multivariate density estimation is that the weights (assigned to historical
square returns) are not a constant function of time. Rather, the current state—as
parameterized by a state vector—is compared to the historical state: the more similar the
states (current versus historical period), the greater the assigned weight. It, therefore,
estimates the joint probability density function of a set of variables (returns and state of the
economy or term structure or inflation level, etc.). The relative weighting is determined by
the kernel function:
$$\sigma_t^2 = \sum_{i=1}^{K} \omega(C_{t-i})\, u_{t-i}^2$$
where $\omega(C_{t-i})$ is the weight assigned by the kernel function to the vector of conditioning (state) variables $C_{t-i}$ observed $i$ periods ago.
3 Allen, Linda, Understanding Market, Credit, and Operational Risk (Blackwell Publishing Ltd, 2004), Chapter 2.
One problem with MDE is that it is data intensive as many data points are required to
estimate the appropriate weights that capture the joint density function of the variables. Also,
the quantity of data needed increases with the number of conditioning variables used in
estimation.
Hybrid Approach
The hybrid approach combines each of the parametric and nonparametric approaches. It
combines historical simulation (by estimating the percentiles of returns directly) and EWMA
(by using exponentially declining weights on past data). Consider the example below (which
is realized in our learning spreadsheet but replicates Linda Allen’s example4 in her Table
2.3). Notice that given 100 observations (K = 100), under simple HS each return is assigned
a weight of 1.0% = 1/100. Compare this to the Hybrid approach where recent returns are
assigned greater weight; e.g., 3 periods ago is weighted 2.21%. At the same time, the more
distant return that occurred 65 days ago is assigned a weight of only 0.63%.
lambda, λ = 0.98; K = 100

Initial Date:
Return | Periods Ago | Simple HS Weight | Simple HS Cumul | Hybrid (EXP) Weight | Hybrid Cumul
-3.30% | 3  | 1.00% | 1.00% | 2.21% | 2.21%
-2.90% | 2  | 1.00% | 2.00% | 2.26% | 4.47%
-2.70% | 65 | 1.00% | 3.00% | 0.63% | 5.11%
-2.50% | 45 | 1.00% | 4.00% | 0.95% | 6.05%
-2.40% | 5  | 1.00% | 5.00% | 2.13% | 8.18%
-2.30% | 30 | 1.00% | 6.00% | 1.28% | 9.47%

25 Days Later:
Return | Periods Ago | Simple HS Weight | Simple HS Cumul | Hybrid (EXP) Weight | Hybrid Cumul
-3.30% | 28 | 1.00% | 1.00% | 1.34% | 1.34%
-2.90% | 27 | 1.00% | 2.00% | 1.36% | 2.70%
-2.70% | 90 | 1.00% | 3.00% | 0.38% | 3.08%
-2.50% | 70 | 1.00% | 4.00% | 0.57% | 3.65%
-2.40% | 30 | 1.00% | 5.00% | 1.28% | 4.94%
-2.30% | 55 | 1.00% | 6.00% | 0.77% | 5.71%
4 Allen, Linda, Understanding Market, Credit, and Operational Risk (Blackwell Publishing Ltd, 2004). Based on Linda Allen's Table 2.3, but spreadsheet (including error correction) hand-built by David Harper.
For two companies, GOOG and XOM, we retrieved daily returns over a one-year history (K = 250 days) ending December 14th, 2016. The lambda parameter, λ, is set to 0.980. For example, the worst return of -5.46% is weighted according to (1-0.98)*0.98^(165-1)/(1-0.98^250), which is the product of the typical (1-λ)*λ^(n-1) = (1-0.98)*0.98^(165-1) and 1/(1-0.98^250), a slight upward multiplier that "trims" the total weight of the 250 days (truncated from an infinite series) from 99.36% up to a full 100.0% based on the 250-day window. Notice how the largest daily losses for Alphabet (formerly Google) did not occur recently in the historical window; consequently, the cumulative weighting under the hybrid approach lags behind the cumulative weight under a simple historical simulation.
(Exhibit parameters for both assets: T = 250; lambda, λ = 0.98; K = 250)
Returning to Linda Allen's example (in Table 2.3; see below), the risk measurement question is: "What is the 95.0% value at risk (VaR) implied by this hybrid simulation?" This requires solving for the worst 5.0% quantile.
lambda, λ = 0.98; K = 100

Initial Date:
Return | Periods Ago | Simple HS Weight | Simple HS Cumul | Hybrid (EXP) Weight | Hybrid Cumul
-3.30% | 3  | 1.00% | 1.00% | 2.21% | 2.21%
-2.90% | 2  | 1.00% | 2.00% | 2.26% | 4.47%
-2.70% | 65 | 1.00% | 3.00% | 0.63% | 5.11%
-2.50% | 45 | 1.00% | 4.00% | 0.95% | 6.05%
-2.40% | 5  | 1.00% | 5.00% | 2.13% | 8.18%
The solution is illustrated below (this example can be explored in further detail in the
associated learning spreadsheet).
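A rough sketch of that solution follows. It uses the simple convention that the VaR is the loss at which the cumulative hybrid weight first reaches the 5% tail (Allen's probability-mass/interpolation refinement can shift the figure slightly), with the returns and weights taken from the table above:

```python
# Reading off the hybrid 95% VaR: sort returns from worst to best, accumulate
# hybrid weights, and stop where the cumulative weight reaches the 5% tail.
worst = [(-0.033, 0.0221), (-0.029, 0.0226), (-0.027, 0.0063),
         (-0.025, 0.0095), (-0.024, 0.0213)]          # (return, hybrid weight)

cumulative = 0.0
for ret, weight in worst:
    cumulative += weight
    if cumulative >= 0.05:
        print(f"95% hybrid VaR ≈ {-ret:.1%}")          # loss of ~2.7%
        break
```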
Approach | Advantages | Disadvantages
Historical Simulation | Easiest to implement (simple, convenient) | Inefficient (much data is not used)
Multivariate density estimation | Very flexible: weights are a function of the state (e.g., economic context such as interest rates), not constant | Onerous model: weighting scheme; conditioning variables; number of observations. Data intensive
Hybrid approach | Unlike the HS approach, better incorporates more recent information | Requires model assumptions; e.g., number of observations
This requires that a market mechanism (e.g., an exchange) can provide a market price for the option. If a market price can be observed, then instead of solving for the price of an option, we use an option pricing model (OPM) to reveal the implied (implicit) volatility. We solve ("goal seek") for the volatility that matches the model price to the market price: $c_{market} = f_{OPM}(\sigma_{ISD})$. Here the implied standard deviation (ISD) is the volatility input into an option pricing model (OPM). Similarly, implied correlations can be recovered (reverse-engineered) from options on multiple assets. According to Jorion, ISD is a superior approach to volatility estimation. He says, "Whenever possible, VAR should use implied parameters."
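As an illustration of the "goal seek," the sketch below backs out an implied volatility from a Black-Scholes-Merton call price by bisection. The option inputs (an at-the-money one-year call quoted at $10.45 with a 5% rate) are hypothetical, chosen so the implied volatility comes out near 20%:

```python
from math import log, sqrt, exp, erf

def norm_cdf(x: float) -> float:
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))

def bs_call(S, K, T, r, sigma):
    """Black-Scholes-Merton price of a European call (no dividends)."""
    d1 = (log(S / K) + (r + 0.5 * sigma**2) * T) / (sigma * sqrt(T))
    d2 = d1 - sigma * sqrt(T)
    return S * norm_cdf(d1) - K * exp(-r * T) * norm_cdf(d2)

def implied_vol(market_price, S, K, T, r, lo=1e-4, hi=3.0, tol=1e-8):
    """'Goal seek' by bisection for the sigma that matches model and market price."""
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if bs_call(S, K, T, r, mid) > market_price:
            hi = mid
        else:
            lo = mid
        if hi - lo < tol:
            break
    return 0.5 * (lo + hi)

# Hypothetical inputs: an at-the-money 1-year call quoted at $10.45
print(round(implied_vol(10.45, S=100, K=100, T=1.0, r=0.05), 4))   # ≈ 0.20
```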
“A particularly strong example of the advantage obtained by using implied volatility (in
contrast to historical volatility) as a predictor of future volatility is the GBP currency crisis of
1992. During the summer of 1992, the GBP came under pressure as a result of the
expectation that it should be devalued relative to the European Currency Unit (ECU)
components, the deutschmark (DM) in particular (at the time the strongest currency within
the ECU). During the weeks preceding the final drama of the GBP devaluation, many signals
were present in the public domain … the growing pressure on the GBP manifests itself in
options prices and volatilities. [But] historical volatility is trailing, “unaware” of the
pressure. In this case, the situation is particularly problematic since historical volatility
happens to decline as implied volatility rises. The fall in historical volatility is due to the fact
that movements close to the intervention band are bound to be smaller by the fact of the
intervention bands’ existence and the nature of intervention, thereby dampening the
historical measure of volatility just at the time that a more predictive measure shows
increases in volatility.” – Linda Allen 5
5 Allen, Linda, Understanding Market, Credit, and Operational Risk (Blackwell Publishing Ltd, 2004), Chapter 2.
Is implied volatility a superior predictor of future volatility? “It would seem as if the answer
must be affirmative since implied volatility can react immediately to market conditions. As a
predictor of future volatility, this is certainly an important feature.”
Why does implied volatility tend to be greater than historical volatility? According to Linda
Allen6, “empirical results indicate, strongly and consistently, that implied volatility is, on
average, greater than realized volatility.” There are two common explanations.
Market inefficiency: due to supply and demand forces.
Rational markets: implied volatility is greater than realized volatility due to stochastic
volatility. “Consider the following facts: (i) volatility is stochastic; (ii) volatility is a
priced source of risk; and (iii) the underlying model (e.g., the Black–Scholes model)
is, hence, mis-specified, assuming constant volatility. The result is that the premium
required by the market for stochastic volatility will manifest itself in the forms we saw
above – implied volatility would be, on average, greater than realized volatility.”
6 Allen, Linda, Understanding Market, Credit, and Operational Risk (Blackwell Publishing Ltd, 2004), Chapter 2.
7 Allen, Linda, Understanding Market, Credit, and Operational Risk (Blackwell Publishing Ltd, 2004), Chapter 2.
The usual method to compute the volatility is to take the square root of the variance when
the variance is simply the average squared return over a selected historical window8:
$$\sigma^2 = \frac{1}{n}\sum_{i=1}^{n} r_i^2 \;\;\rightarrow\;\; \sigma = \sqrt{\frac{1}{n}\sum_{i=1}^{n} r_i^2}$$
The key shortcoming is that the same weight (weight = 1/n) is assigned to all returns
regardless of whether they occurred recently or in the distant past. Why might this matter?
Because this approach is slow to adjust to updated market conditions. If conditions suddenly
become volatile, we might prefer an approach that assigns more weight to recent returns.
In order to improve the volatility estimate, several methods exist that weight volatility based
on their recency. The Exponentially Weighted Moving Average (EWMA) and the Generalized
Autoregressive Conditional Heteroskedasticity (GARCH) are examples of volatility models
that assign more weight to recent data than the older observations to update faster with new
information.
Recall the geometric series: $\sum_{i=1}^{\infty}\lambda^{i-1} = \frac{1}{1-\lambda}$
8 This convenient expression assumes the mean return is zero and, strictly speaking, estimates an MLE rather than the unbiased variance.
In order to have a weighting scheme, the total weights should sum to 1.0. Therefore:
$$\sum_{i=1}^{\infty}(1-\lambda)\lambda^{i-1} = (1-\lambda)\times\frac{1}{1-\lambda} = 1$$
The weight applied to the most recent observation (the first observation, one day ago) is $(1-\lambda)$
The weight applied to the data observed two days ago (the second observation) is $(1-\lambda)\lambda$
The weight applied to the data observed n days ago is $(1-\lambda)\lambda^{n-1}$
The EWMA model to estimate the current volatility based on past observations is given as follows:
$$\sigma_n^2 = (1-\lambda)r_{n-1}^2 + (1-\lambda)\lambda\, r_{n-2}^2 + (1-\lambda)\lambda^2\, r_{n-3}^2 + \cdots$$
Similarly, the prior day's volatility estimate based on the observations before it is:
$$\sigma_{n-1}^2 = (1-\lambda)r_{n-2}^2 + (1-\lambda)\lambda\, r_{n-3}^2 + (1-\lambda)\lambda^2\, r_{n-4}^2 + \cdots$$
If we substitute the second equation into the first equation, we obtain a simpler version of the EWMA volatility estimator, and this is the relevant version for our purposes:
$$\sigma_n^2 = (1-\lambda)r_{n-1}^2 + \lambda\sigma_{n-1}^2$$
For example (a common EWMA question): The prior day's volatility estimate was 3.0% and the prior day's observed return was -2.0%. The decay parameter is 0.940.
First, we compute the prior day's variance, 0.030² = 0.00090, and the prior day's squared return, (−0.02)² = 0.00040. Then:
$$\sigma_n = \sqrt{0.94\times 0.00090 + 0.06\times 0.00040} = \sqrt{0.000870} = 2.950\%$$
Note that the squared return is effectively a one-day variance. Viewed as such, EWMA is really just estimating the weighted average of two variances, giving the most recent one-day "variance" a small weight, in this case only 6.0%.
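The same update as a short Python function (the 3.0% prior volatility, -2.0% return, and λ = 0.94 are the example's inputs):

```python
from math import sqrt

def ewma_update(prev_sigma: float, prev_return: float, lam: float = 0.94) -> float:
    """EWMA recursion: sigma_n^2 = lambda*sigma_{n-1}^2 + (1 - lambda)*r_{n-1}^2."""
    variance = lam * prev_sigma**2 + (1 - lam) * prev_return**2
    return sqrt(variance)

# The example above: prior volatility 3.0%, prior return -2.0%, lambda = 0.94
print(f"{ewma_update(0.03, -0.02):.3%}")   # ~2.950%
```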
$$\sigma_n^2 = \gamma V_L + \alpha r_{n-1}^2 + \beta \sigma_{n-1}^2$$
where $V_L$ represents the long-run average variance rate and $\gamma$ its weight. The GARCH(1,1) model must satisfy the following conditions, which require that the weights sum to unity:
$$\alpha + \beta < 1,\qquad \gamma = 1 - \alpha - \beta,\qquad \alpha \ge 0,\;\; \beta \ge 0$$
This model is superficially similar to EWMA but relaxes the constraint implicit in the definition of the decay factor λ, providing more degrees of freedom and therefore better in-sample explanatory power. Note that out-of-sample volatility prediction is not always better and can be worse than in the case of EWMA. In the context of prediction, out-of-sample performance is the relevant metric.
The long-run variance rate can be calculated as follows. Usually, the output of the GARCH(1,1) statistical estimation provides ω = γV_L as the intercept of the autoregression, such that the model is rewritten:
$$\sigma_n^2 = \omega + \alpha r_{n-1}^2 + \beta \sigma_{n-1}^2$$
We know that γ = 1 − α − β, which means that the three parameters γ, α and β must sum to 1. Therefore, we can calculate the long-run variance rate as:
$$V_L = \frac{\omega}{\gamma} = \frac{\omega}{1 - \alpha - \beta}$$
For example: Assume that the output of a GARCH(1,1) estimation provides ω = 0.000003, α = 0.10 and β = 0.880, such that the fitted model is $\sigma_n^2 = 0.000003 + 0.10\,r_{n-1}^2 + 0.880\,\sigma_{n-1}^2$. The long-run variance is then $V_L = 0.000003/(1 - 0.10 - 0.880) = 0.000003/0.020 = 0.000150$, which corresponds to a long-run daily volatility of $\sqrt{0.000150} \approx 1.225\%$.
Let us assume that yesterday's daily volatility estimate was 2.0% and the corresponding observed daily return was -3.0%. The updated variance estimate is $\sigma_n^2 = 0.000003 + 0.10\times(-0.03)^2 + 0.880\times(0.02)^2 = 0.000445$.
The corresponding volatility is $\sqrt{0.000445} = 2.110\%$, which is higher than the long-run volatility because the estimate is driven by the prior day's return and prior day's variance estimate, both of which are above the long-run level. This suggests that the current volatility has increased.
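A short sketch tying the two GARCH(1,1) calculations together, using the example's parameters (ω = 0.000003, α = 0.10, β = 0.88):

```python
from math import sqrt

# GARCH(1,1) long-run variance and one-step volatility update, per the example.
omega, alpha, beta = 0.000003, 0.10, 0.88

# Implied long-run (unconditional) variance and volatility
long_run_var = omega / (1 - alpha - beta)            # 0.00015
print(f"{sqrt(long_run_var):.3%}")                   # ~1.225% long-run daily volatility

# One-step update: yesterday's volatility 2.0%, yesterday's return -3.0%
variance = omega + alpha * (-0.03)**2 + beta * 0.02**2
print(f"{sqrt(variance):.3%}")                       # ~2.110%, above the long-run level
```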
To forecast volatility and VaR for horizons longer than a day or a week, the key idea is the application of the square root rule (SRR). This rule states that variance scales directly with time, such that volatility scales with the square root of time. The simplest approach to extending the horizon is therefore to use the "square root rule," where we extend one-period VaR to J-period VaR by multiplying by the square root of J.
$$\sigma(r_{t,t+J}) = \sqrt{J}\times\sigma(r_{t,t+1})$$
$$J\text{-period VaR} = \sqrt{J}\times 1\text{-period VaR}$$
For example, if the 1-period VAR is $10, then the 2-period VAR is $14.14 ($10 x square root
of 2) and the 5-period VAR is $22.36 ($10 x square root of 5).
The square root rule (i.e., variance is linear with time) for extending the time horizon only
applies under restrictive i.i.d. assumptions. So, under the two assumptions below, VaR
scales with the square root of time.
1. Random-walk (acceptable)
2. Constant volatility (unlikely)
Thus, the square root rule, while mathematically convenient, doesn't really work in practice because it requires that returns are independent and identically distributed (i.i.d.). With respect to mean reversion, two different cases are relevant: mean reversion in the level of the variable (e.g., the return or spread itself) and mean reversion in volatility; both are discussed below.
Modeling predictability of variables is often done through time series models accounting for
autoregression, where current estimates are a function of previous values. The
autoregressive process is a stationary process that has a long-run mean, an average level to
which the series tends to revert. This average is often called the “Long Run Mean” (LRM).
For instance, when interest rates are below their LRM they are expected to rise and vice
versa.
We assume that volatility is constant under the square root rule (SRR) but volatility is
stochastic, and, in particular, autoregressive. Volatility has a long-run mean – a “steady
state” of uncertainty. When current volatility is above its long-run mean then we can expect a
decline in volatility over the longer horizon. Then estimating the long horizon volatility using
today’s volatility will overstate the true expected long horizon volatility. On the other hand, if
today’s volatility is low, then extrapolating today’s volatility using the square root rule may
understate true long-horizon volatility. It can be shown that under GARCH (1,1), the
expected variance on day t is given by:
$$E[\sigma_{n+t}^2] = V_L + (\alpha + \beta)^t\,(\sigma_n^2 - V_L)$$
This formula can be used to calculate the average variance rate over the next T days. The
expected volatility can be estimated as the square root of this average variance rate.
Suppose that our estimate of the current daily variance is 0.00010 (corresponding to a
volatility of 1% per day). Due to mean reversion, however, we expect the average daily
variance rate over the next 25 days to be 0.00018. It makes sense then to base VaR and
expected shortfall estimates for a 25-day time horizon on volatility over the 25 days. This is
equal to 6.7% (= √25 × √0.00018 = 0.067)
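The sketch below illustrates the forward-variance formula and the averaging over a T-day horizon. The GARCH parameters here are hypothetical (the reading quotes only the current variance of 0.00010 and the resulting 25-day average of 0.00018, not the parameters behind them), so the output is illustrative rather than a reproduction of the 6.7% figure:

```python
from math import sqrt

# Forward-variance term structure under GARCH(1,1):
#   E[sigma^2_{n+t}] = V_L + (alpha + beta)^t * (sigma^2_n - V_L)
# Hypothetical parameters (not from the reading):
alpha, beta, omega = 0.10, 0.86, 0.000008
V_L = omega / (1 - alpha - beta)            # long-run variance = 0.0002
sigma2_now = 0.00010                        # current daily variance (1% volatility)
T = 25

forward = [V_L + (alpha + beta) ** t * (sigma2_now - V_L) for t in range(T)]
avg_var = sum(forward) / T                  # average daily variance over T days
print(f"{T}-day volatility ≈ {sqrt(T * avg_var):.1%}")
# ≈ 5.8% with these hypothetical parameters: above the naive square-root-rule
# 5.0% (because variance is expected to revert upward toward V_L) but below
# the 7.1% implied by the long-run variance alone.
```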
$$X_{t+1} = a + bX_t + e_{t+1}$$
This means that the next period's expectation is a weighted sum of today's value, $X_t$, and the long-run mean $a/(1-b)$. This time series process has a finite long-run mean, subject to the constraint that the parameter $b$ is less than one.
If $b = 1$, the long-run mean $a/(1-b)$ is undefined or infinite (as $a$ is divided by 0), so the process is a random walk (nonstationary) and the next period's expected value equals today's value.
Only if $b < 1$ is the process mean-reverting: when $X_t$ is above the long-run mean it is expected to decline, and when it is below the long-run mean it is expected to increase in value. Hence $b$ is known as the "speed of reversion" parameter.
We first subtract $X_t$ from the autoregression formula to obtain the "return," or change in $X$:
$$X_{t+1} - X_t = a + bX_t + e_{t+1} - X_t = a + (b-1)X_t + e_{t+1}$$
For a two-period return, we require $X_{t+2}$, which can be found recursively by substituting $X_{t+1} = a + bX_t + e_{t+1}$ into its own equation:
$$X_{t+2} = a + bX_{t+1} + e_{t+2} = a + b(a + bX_t + e_{t+1}) + e_{t+2} = a + ab + b^2X_t + be_{t+1} + e_{t+2}$$
$$X_{t+2} - X_t = a + ab + b^2X_t + be_{t+1} + e_{t+2} - X_t = a(1+b) + (b^2-1)X_t + be_{t+1} + e_{t+2}$$
9 Allen, Linda, Understanding Market, Credit, and Operational Risk (Blackwell Publishing Ltd, 2004), Chapter 2.
Conditional on $X_t$, the variance of the one-period change is:
$$\text{Var}(X_{t+1} - X_t) = \text{Var}\big(a + (b-1)X_t + e_{t+1}\big) = \text{Var}(e_{t+1}) = \sigma^2$$
The volatility of $e_{t+1}$ is denoted by $\sigma$. For the two-period change:
$$\text{Var}(X_{t+2} - X_t) = \text{Var}\big(a(1+b) + (b^2-1)X_t + be_{t+1} + e_{t+2}\big) = \text{Var}(be_{t+1} + e_{t+2}) = (1 + b^2)\,\sigma^2$$
Thus, the single-period variance is $\sigma^2$ and the two-period variance is $(1 + b^2)\,\sigma^2$.
This means that if the process were a random walk with no mean reversion (implying b = 1), then we could apply the square root rule: the two-period variance would be 2σ² and we would get the standard square-root volatility result (√2·σ). However, when there is mean reversion, the square root rule for obtaining volatility fails.
With mean reversion, when b < 1, say b = 0.9, the two-period volatility is instead √(1 + 0.9²)·σ ≈ 1.345σ, which is less than √2·σ ≈ 1.414σ.
For instance, in a convergence trade where traders explicitly assume that the spread between two positions, a long and a short, is mean-reverting (i.e., b < 1), the long-horizon risk is smaller than the square-root volatility because √(1 + b²)·σ < √2·σ.
This may create a sharp difference of opinions on the risk assessment of a trade by risk
managers. So, the managers usually assume a null hypothesis of market efficiency which
means that the spread underlying the convergence trade is a random walk.
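A minimal sketch of the two-period comparison (σ is normalized to 1; b = 1 is the random walk, b = 0.9 the mean-reverting case from the text):

```python
from math import sqrt

def two_period_vol(sigma: float, b: float) -> float:
    """Two-period volatility of the AR(1) process X_{t+1} = a + b*X_t + e_{t+1}.
    With b = 1 (random walk) this reduces to the square root rule, sqrt(2)*sigma."""
    return sqrt(1 + b**2) * sigma

sigma = 1.0
print(round(two_period_vol(sigma, b=1.0), 3))   # 1.414 = sqrt(2), random walk
print(round(two_period_vol(sigma, b=0.9), 3))   # 1.345 < sqrt(2), mean reversion
```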
The covariance estimate under EWMA is derived from the following relationship:
$$\text{cov}_n(X,Y) = \lambda\,\text{cov}_{n-1}(X,Y) + (1-\lambda)\,x_{n-1}\,y_{n-1}$$
where $\text{cov}_n$ is the covariance between the asset returns $X$ and $Y$ for time period $n$, and $x_{n-1}$, $y_{n-1}$ are the prior-period returns.
Recall how we derive the updated volatility estimate using EWMA:
σ , = λσ , + (1 − λ)r , → vol estimate = σ , = σ,
Note that the decay parameter should be the same for the volatility model and the
covariance model to ensure consistency in the context of the correlation estimation. Once
we have derived the volatility estimates for each asset and the covariance estimate, we
derive the correlation estimate as follows:
$$\rho_n = \frac{\text{cov}_n}{\sigma_{X,n}\,\sigma_{Y,n}}$$
For example (GARP’s example in 3.7): GARP’s example10 assumes two assets with
returns X and Y respectively. The daily return volatility of the previous day has been
estimated for each asset: 1.0% for X and 2.0% for Y. The daily returns observed the
previous day were +2.0% for both assets. Yesterday’s correlation coefficient was 0.20. The
decay factor is assumed to be 0.94. The question is the following: What is the updated
estimate of today’s correlation between X and Y? The first step is to estimate the previous
day estimate of the covariance:
$$\text{cov}_{n-1} = \sigma_{X,n-1}\,\sigma_{Y,n-1}\,\rho_{n-1} = 0.01\times 0.02\times 0.20 = 0.00004$$
Then, we update the covariance and the volatilities based on the previous day's figures:
$$\text{cov}_n = 0.94\times 0.00004 + 0.06\times 0.02\times 0.02 = 0.0000616$$
$$\sigma_{X,n} = \sqrt{0.94\times 0.01^2 + 0.06\times 0.02^2} \approx 0.01086,\qquad \sigma_{Y,n} = \sqrt{0.94\times 0.02^2 + 0.06\times 0.02^2} = 0.02$$
$$\rho_n = \frac{\text{cov}_n}{\sigma_{X,n}\,\sigma_{Y,n}} = \frac{0.0000616}{0.01086\times 0.02} \approx 0.28$$
10 2020 FRM Part I: Valuation & Risk Models, 10th Edition, Pearson Learning, 10/2019. Section 3.7 (Correlation).
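The sketch below reproduces GARP's correlation-updating example end to end (λ = 0.94 and the prior-day inputs are the example's assumptions):

```python
from math import sqrt

# EWMA updates of the two variances and the covariance; their ratio gives
# the updated correlation, per GARP's Section 3.7 example.
lam = 0.94
sig_x, sig_y = 0.01, 0.02        # previous-day volatility estimates
r_x, r_y = 0.02, 0.02            # previous-day returns
rho_prev = 0.20

cov_prev = rho_prev * sig_x * sig_y                          # 0.00004
cov_new = lam * cov_prev + (1 - lam) * r_x * r_y             # 0.0000616
sig_x_new = sqrt(lam * sig_x**2 + (1 - lam) * r_x**2)        # ~0.01086
sig_y_new = sqrt(lam * sig_y**2 + (1 - lam) * r_y**2)        # 0.02
rho_new = cov_new / (sig_x_new * sig_y_new)
print(round(rho_new, 2))                                     # ~0.28
```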
In John Hull’s example 11.1, he assumes that EWMA’s λ = 0.95 and that the previous day’s
correlation between two variables X and Y was 0.60. The volatility estimates for X and Y on
the previous day (i.e., n – 1) were 1.0% and 2.0%, respectively. Finally, the previous day’s
return (aka, percentage change) in X and Y were, respectively, 0.50% and 2.50%. See
second column below. The first column captures the previous example; the third and fourth
columns capture Hull’s EOC 11.5 and 11.6. The GARCH(1,1) is optional.
As seen in the second column, EWMA also updates the volatility estimates to σ(X) =
sqrt(0.00009625) = 0.981%, and σ(Y) = sqrt(0.00041125) = 2.028%. Then:
$$\rho_n = \frac{\text{cov}_n}{\sigma_{X,n}\,\sigma_{Y,n}} = \frac{0.00012025}{0.981\%\times 2.028\%} \approx 0.60$$
11 Hull, John C., Risk Management and Financial Institutions (Wiley Finance), Chapter 11, Example 11.1.
Chapter Summary
Empirically, the return distribution of most assets is not normal: we observe skewness, fat tails, and time-varying parameters in the returns distribution. Normality allows for simple and elegant solutions to risk measurement, but problems arise because normality is so often assumed (i.e., when we "impose normality"). When the normality assumption is breached and asset returns are not normal, this can distort our risk measurement. In particular, fat tails cause us to underestimate the risk of severe losses if we are working from an assumption of normality.
We can explain fat tails by a distribution that changes over time (a conditional distribution). There are two things that can change in a normal distribution: the mean and the volatility. Therefore, we can explain fat tails in two ways. The first is a time-varying conditional mean, but this is unlikely given the assumption that markets are efficient. The second explanation is that conditional volatility is time-varying; according to this reading's author, the latter is the more likely explanation.
VaR can be estimated using a variety of approaches, based on historical volatility or implied volatility. Full-valuation models tend to be non-parametric, whilst local-valuation models tend to be parametric. To be specific, the main historical approaches include: the parametric approach, where distributional assumptions are imposed; non-parametric estimation, where no distributional assumptions are imposed but it is implied that history will inform the future; and a hybrid approach, where historical data is combined with, e.g., a weighting scheme. In the implied volatility approach, volatility is inferred from pricing models such as, e.g., Black-Scholes.
The main methods for volatility modeling include EWMA, GARCH, MA, and MDE. The moving average (MA) is the simplest of these but suffers from a ghosting feature, whereby one extreme observation dominates the estimate until it is dropped from the window. EWMA is a popular and widespread method because of its more realistic assumption of assigning greater weight to recent observations, its ease of implementation, and its relatively non-technical approach, making it easy to explain to management. Multivariate density estimation has the attractive feature of having its weights informed by current economic conditions: historical periods with similar economic conditions are given a larger weight. GARCH has grown in popularity and is generally superior to EWMA; indeed, EWMA is a special case of GARCH. GARCH has the advantage of being able to forecast volatility, whereas EWMA simply gives you the current volatility estimate. GARCH must not be used if the weights assigned to the lagged variance and lagged squared return sum to more than one, as this implies that the model is non-stationary and will result in unreliable estimates.
The square root rule states that the process may be scaled over time, provided it follows a random walk and the volatility is constant. This can be used to extend a one-period VaR to a J-period VaR by multiplying by the square root of J. Unfortunately, volatility is generally not constant, so this rule must be applied with caution: in industry, it is generally considered acceptable to scale up to a week; beyond that, the model error makes the estimate too uncertain.
804.1.
I. The simplest parametric approach whose weakness is sensitivity to window length and extreme observations
II. The most convenient and prominent non-parametric approach whose weakness is
inefficient use of data
III. An interpretation of the exponentially weighted moving average (EWMA) that gives the risk manager a rule that can be used to adapt prior beliefs about volatility in the face of news
IV. A parametric approach that assumes conditional returns are normal but unconditional
tails are heavy; and that returns are not correlated but conditional variance is mean-reverting
V. An approach that weights past squared returns not by time but instead according to the
difference between current and past states of the world
VI. An approach that modifies historical simulation by assigning exponentially declining
weights to past data such that recent (distant) returns are assigned more (less) weight
Which sequence below correctly matches the VaR estimation approach with its summary
description?
a) I = HS, II = AV, III = GARCH, IV = MDE, V = Hybrid, VI = STDEV
b) I = GARCH, II = MDE, III = Hybrid, IV = STDEV, V = HS, VI = AV
c) I = STDEV, II = HS, III = AV, IV = GARCH, V = MDE, VI = Hybrid
d) I = AV, II = Hybrid, III = STDEV, IV = HS, V = GARCH, VI = MDE
804.2. Dennis the Risk Analyst is calculating the 95.0% value at risk (VaR) under the hybrid approach, which is a hybrid between historical simulation (HS) and the exponentially weighted moving average (EWMA). His historical window is only 90 days and he has set his smoothing parameter to 0.860; that is, λ = 0.860 and K = 90 days. Below are displayed the (rounded) weights assigned under this approach to the three worst returns in the historical window (which occurred, respectively, 27, 13 and 15 days ago).
803.2. The daily standard deviation of a risky asset is 1.40%. If daily returns are
independent, then of course we can use the square root rule (SRR) to scale into a two-day
volatility given by 1.40%*SQRT(2) = 1.980%. However, if the lag-1 autocorrelation of returns
is significantly positive at 0.60, then which is NEAREST to the corresponding scaled two-day
volatility?
a) 1.64%
b) 1.98%
c) 2.33%
d) 2.50%
803.3. Based on daily asset prices over the last month (n = 20 days), Rebecca the Risk Analyst has calculated the daily volatility as a basic historical standard deviation (abbreviated STDEV to match Linda Allen's label for the equally weighted average of squared deviations, which "is the simplest and most common way to estimate or predict future volatility"). As the exhibit below shows, however, she could not quite decide which mean, µ, to use for the sample standard deviation: the actual sample mean return was -45.12 basis points; she has also seen a zero mean assumed (i.e., squared raw returns); and finally, the expected return is actually +20.0 bps. Each produces a slightly different sample standard deviation. For example, assuming the actual sample mean produces a daily volatility of 56.04 bps, which matches the Excel function STDEV.S().
Answers
804.1. C. TRUE: I = STDEV, II = HS, III = AV, IV = GARCH, V = MDE, VI = Hybrid. These
terms are matched with their summary explanation below, but grouped by parametric/non-
parametric and sub-sorted in the order of Linda Allen's presentation.
Parametric approaches (please note that Allen occasionally blurs the distinction between
volatility and VaR, but these parametric VaR approaches generally produce a volatility that
can be scaled by a confidence deviate to retrieve VaR):
Historical standard deviation (STDEV): The simplest parametric approach whose
weakness is sensitivity to window length and extreme observations.
Adaptive volatility (AV): An interpretation of the exponentially weighted moving average (EWMA) that gives the risk manager a rule that can be used to adapt prior beliefs about volatility in the face of news
GARCH (1,1): A parametric approach that assumes conditional returns are normal
but unconditional tails are heavy; and that returns are not correlated but conditional
variance is mean-reverting
Non-parametric approaches
Historical simulation (HS): The most convenient and prominent non-parametric
approach whose weakness is inefficient use of data
Multivariate density estimation (MDE): An approach that weights past squared
returns not by time but instead according to the difference between current and past
states of the world
Hybrid (aka, age-weighted which Dowd has dubbed "semi-parametric"): An approach
that modifies historical simulation by assigning exponentially declining weights to
past data such that recent (distant) returns are assigned more (less) weight
804.2. A. TRUE: The hybrid weight assigned to the 4th worst return (-6.0%) is 1.457%
and the 95.0% VaR is (a worst expected loss of) 6.0%.
The easiest solution is to realize that the lambda parameter reflects the ratio of weights
between successive days. In this way, the weight assigned to the 16th day is simply 1.695%
* 0.86 = 1.4577%. Adding this to 4.264% generates the corresponding cumulative weight of
5.7217% such that the 5.0% quantile falls at this 4th-worst return of -6.0%. However, Linda
Allen writes that "the observation itself can be thought of as a random event with a
probability mass centered where the observation is actually observed.12" Under this technical
approach (which is akin to an interpolation but with the subtle difference that the weights are
spanning the return observations), the -6.0% return corresponds EXACTLY to 4.264% + (1.4577%/2) = 4.99%, which is almost exactly 5.0%. Consequently, the technically exact approach also returns a loss of 6.0% as the 95.0% confident VaR.
12 Linda Allen, Understanding Market, Credit and Operational Risk: The Value at Risk Approach
803.3. C. True: If the actual 45.12 bps decline was uniquely unpredictable, it is reasonable to use the currently expected positive µ = +20 bps as the conditional mean parameter.
Per Linda Allen: "Since we believe that the decline [in her example, the historical sample mean was entirely unpredictable], imposing our priors by using µt = 0 is a logical alternative. Another approach is to use the unconditional mean, or an expected change based on some other theory, as the conditional mean parameter. In the case of equities, for instance, we may want to use the unconditional average return on equities using a longer period – for example 12 percent per annum, which is the sum of the average risk free rate (approximately 6 percent) plus the average equity risk premium (6 percent). This translates into an average daily increase in equity prices of approximately 4.5bp/day. This is a relatively small number that tends to make little difference in application, but has a sound economic rationale underlying its use."
13 Linda Allen, Understanding Market, Credit and Operational Risk: The Value at Risk Approach (Oxford: Blackwell Publishing, 2004).
14 Linda Allen, Understanding Market, Credit and Operational Risk: The Value at Risk Approach (Oxford: Blackwell Publishing, 2004).