Developing A Stress Testing Framework
Abstract
The Basel 2 Accord requires regulatory capital to cover stress tests, yet no coherent and objective framework for stress testing portfolios exists. We propose a new methodology for stress testing in the context of market risk models that can incorporate both volatility clustering and heavy tails. Empirical results compare the performance of eight risk models with four possible conditional and unconditional return distributions over different rolling estimation periods. When applied to major currency pairs using daily data spanning more than 20 years we find that stress test results should have little impact on current levels of foreign exchange regulatory capital.
© 2008 Elsevier B.V. All rights reserved.
Keywords: Value-at-Risk models; Stress testing; Market risk; Exchange rates; GARCH
This paper was accepted by Prof. Giorgio Szego while he was the Managing Editor of The Journal of Banking and Finance, and by the past Editorial Board.
* Corresponding author. Tel.: +61 2 9850 7755; fax: +61 2 9850 7281.
E-mail addresses: [email protected] (C. Alexander), [email protected] (E. Sheedy).
0378-4266/$ - see front matter © 2008 Elsevier B.V. All rights reserved.
doi:10.1016/j.jbankfin.2007.12.041
C. Alexander, E. Sheedy / Journal of Banking & Finance 32 (2008) 2220–2236
1. Introduction
A stress test is a risk management tool used to evaluate the potential impact on portfolio values of unlikely, although plausible, events or movements in a set of financial variables (Lopez, 2005). They are designed to explore the tails of the distribution of losses beyond the threshold (typically 99%) used in Value-at-Risk (VaR) analysis. Since the end of 1997 financial institutions using internal VaR models to assess capital adequacy have been required to implement stress testing (see Basel Committee on Banking Supervision, 1996). They provide an input to decisions concerning, amongst other things, hedging, limit setting, portfolio allocations and capital adequacy.
The Basel 2 Accord recommends a more direct link between stress tests and risk capital, i.e. ‘A bank must ensure that it has sufficient capital to ... cover the results of its stress testing’ (Basel Committee on Banking Supervision, 2006, at paragraph 778 (iii), p. 218, emphasis added). As a result leading industry practitioners have called for a re-examination of stress testing methodologies, see Rowe (2005).
A recent survey of stress testing practice (Committee on the Global Financial System, 2005) shows that most stress tests are currently designed around a series of scenarios based either on historical events, hypothetical events, or some combination of the two. These methods have been criticised by Berkowitz (2000) and Greenspan (2000) for their lack of rigour. They are typically conducted without a risk model so the probability of each scenario is unknown, making its importance difficult to evaluate. There is also a distinct possibility that many extreme yet plausible scenarios are not even considered.
Stress tests conducted in the context of a risk model can provide a useful alternative or complement to the current ad hoc methods of stress testing. Several authors have attempted to build such a bridge between stress tests and risk models including Kupiec (1998) who examines cross-market effects resulting from a market shock and Aragones
et al. (2001) who incorporate hypothetical stress events into an Extreme Value Theory (EVT) framework.
Our research begins by asking a more fundamental question: What is the most suitable risk model in which to conduct a stress test? If the model is misspecified, then our approach is vulnerable to a considerable degree of model risk. Hence a significant part of this paper performs extremely thorough backtests, which are designed to reduce the model risk in risk models that are used for stress testing.
The risk model used for stress testing need not necessarily have the same features as that used for daily VaR models; indeed it could be argued that there are advantages in using different models for cross-checking purposes. The empirical performance of VaR models used at commercial banks has been analysed by Berkowitz and O’Brien (2002) and Berkowitz et al. (2006). These studies suggest that banks’ VaR models may be misspecified since they do not fluctuate much while actual volatility of trading revenues is clearly time-varying. This is almost certainly because many banks use simple unconditional models to estimate VaR. The possibility that bank VaR models are misspecified creates further incentive to ensure that an appropriate model is selected for stress testing purposes.
We therefore backtest eight risk models including both conditional and unconditional models and four possible return distributions. We include in our analysis risk models that are already popular in the industry and those that are relatively accessible to financial institutions. Our backtesting methodology is designed with stress testing applications in mind, that is, we assess the ability of risk models to forecast extreme percentiles (up to 99.9%) of returns distributions over relatively short multi-day horizons. We also use assessment criteria based on expected tail loss (ETL) as well as Value-at-Risk (VaR) to ensure that the model performs well for extreme outcomes beyond VaR. Our findings support the use of conditional risk models with non-normal innovations such as the empirical distribution and Student’s t distribution.
We choose major exchange rates as the context for analysis due to their importance in the portfolios of financial institutions in many countries, and because of the availability of high quality data over a lengthy historical period. We investigate daily returns for three of the most traded currency pairs¹: the USD/JPY, GBP/USD and the AUD/USD. The importance of these three currencies for the risk of financial institutions is highlighted in a recent paper on carry trade activity (see Galati et al., 2007).
¹ According to the Bank for International Settlements (2005), Triennial Central Bank Survey: Foreign exchange and derivatives market activity in 2004 (Bank for International Settlements, Basel), these were the three most actively traded currency pairs after the Euro/USD in 2004, and the Euro/USD could not be included in the analysis due to lack of historical data.
Having identified our preferred risk models, we then develop and illustrate a model-based stress testing methodology. The methodology includes specification of an initial shock event and analyses the subsequent market response to that shock using simulation. Alternatively analysts can specify hypothetical shock events and use the risk models only to assess its after effects as volatility increases in response to the shock. The methodology can readily be extended to handle multiple assets/risk factors and to incorporate changing liquidity conditions. Market participants may also use this framework to assess their own response to a market crisis, i.e. immediate versus gradual hedging. The methodology is evaluated by comparing it to current stress testing techniques and regulatory requirements.
The outline of the paper is as follows: Section 2 explains the risk models that will be examined in this study, and the reasons for their selection. Section 3 presents an empirical analysis of these models in currency markets. Section 4 takes the best-performing models and shows how they may be adapted for stress testing purposes. Section 5 evaluates the stress tests and Section 6 summarises and concludes.
2. Risk models
What guidance can the market risk capital assessment literature provide regarding the selection of appropriate risk models for stress testing? Previous research, which typically explores the 99th percentile of outcomes at 1-day horizons, suggests that accurate risk models will capture two key characteristics: volatility clustering and heavy tails.² But stress testing requires exploration of more extreme outcomes and, since immediate hedging may not be practical in a market crisis, longer horizons. Danielsson and De Vries (2000) make the point that the most extreme market moves tend to exhibit reduced dependency between successive daily returns. This suggests that unconditional risk models may be sufficient for our application, provided they have sufficiently heavy tails. We therefore examine a range of unconditional risk modelling approaches, each with its conditional counterpart, to give a total of eight univariate risk models.
² The RiskMetrics Group popularised the Exponentially Weighted Moving Average method for estimating volatility. The heavy tails apparent in financial return distributions led to the popularity of VaR methods based on historical simulation, which employs an empirical distribution (see Dowd and Rowe, 2004; Introduction to Value-at-Risk models. In: Carol Alexander, Elizabeth Sheedy (Eds.), The Professional Risk Manager’s Handbook (PRMIA Publications)). More recently researchers have explored conditional non-normal innovation distributions for VaR modelling, finding them superior to their conditional normal counterparts. See discussion later in Section 2.
Berkowitz (2000) has argued that the distribution of an asset or risk factor during periods of market stress is very different from its usual distribution. Therefore, we have included a two-component normal mixture distribution as a candidate return distribution; here the two component distributions can be interpreted as corresponding to ‘nor-
mal’ and ‘stress’ market conditions. We compare the normal mixture distribution with two more standard and accessible approaches designed to capture extreme outcomes: the Student’s t distribution and a smoothed empirical distribution.
An obvious variation on the latter would be an application of the EVT (Extreme Value Theory – see Longin, 2000; Aragones et al., 2001; Longin, 2005). We choose not to include EVT in the backtesting experiment that follows since excellent results are achieved using simpler, more accessible models. Previous research applying EVT to risk modelling has typically focused on a single period horizon and has assumed independence in returns. Applying EVT in a conditional, multi-day setting is far more demanding and likely to be inaccessible to many financial institutions (although we note that McNeil and Frey (2000) have successfully implemented this approach).
For each model we estimate Value-at-Risk (VaR) and expected tail loss (ETL), i.e. the expected loss conditional on exceeding VaR. We now describe each of the risk models, outlining their merits and explaining the calculation of VaR and ETL for each as a percentage of the current portfolio value.
2.1. Unconditional normal
This approach is included for the purpose of benchmarking the candidate risk models. Here we assume that $e_t = y_t - \bar{y}$, where $y_t$, the daily log returns, are independent and normally distributed:
$$e_t \sim N(0, \sigma^2). \tag{1}$$
Under this assumption the VaR and ETL for a horizon of h days at significance level $\alpha$ are:
$$\mathrm{VaR}_{h,\alpha} = -\Phi^{-1}(\alpha)\,\sigma_h \tag{2}$$
and³:
$$\mathrm{ETL}_{h,\alpha} = \alpha^{-1}\,\varphi\!\left(\Phi^{-1}(\alpha)\right)\sigma_h, \tag{3}$$
where $\Phi$ denotes the standard normal cumulative distribution function, $\varphi$ denotes the standard normal density function and $\sigma_h = \sigma\sqrt{h}$.
³ See McNeil et al., 2005. Quantitative Risk Management (Princeton University Press) at p. 45.
2.2. Unconditional empirical
This method is chosen by virtue of its popularity in the industry. Indeed, a recent survey found that of the 64.9% of firms that disclose their methodology 73% report the use of historical simulation (Perignon and Smith, 2006). It makes no assumption about the distribution of past returns, other than the assumption that returns are independent and identically distributed (i.i.d).⁴ If the empirical return distribution has heavy tails then VaR measured at high confidence levels will be greater than under the assumption of normality. There is no assumption that the distribution of returns is symmetric so VaR and ETL will differ according to whether the portfolio is long or short the asset (in our case the commodity currency expressed in units of the terms currency). For the long asset case, the $\mathrm{VaR}_\alpha$ estimate is the absolute value of the lower $\alpha$ percentile of the empirical return distribution and $\mathrm{ETL}_\alpha$ is the absolute value of the average of all empirical return outcomes below VaR. Risk measures for the short asset case can be calculated using the same method, but first multiplying the return series by −1.
⁴ A referee has pointed out that historical simulation is effective under mild dependence of returns (such as GARCH).
The failings of historical simulation have been well described by Pritsker (2006). Of particular importance to this research is the fact that VaR estimates are sample specific, particularly for high confidence levels. Following Sheather and Marron (1990), Butler and Schachter (1998) and Chen and Tang (2005) we therefore smooth the empirical density by fitting the Epanechnikov kernel to a sample of daily returns. Kernel estimation is designed to fit a smooth curve to a random sample such that it provides the best possible representation of the density of a random variable. This method is explained in Dowd (2002) and involves selecting a kernel (that is, a density function centred on the data point), and a parameter called the bandwidth that is analogous to the bin-width in a histogram. Silverman (1986) describes a range of possible kernels (e.g. Epanechnikov, Gaussian, triangular) but finds that the choice between them is typically not critical. The software used for analysis in this study is Matlab, where kernel estimation is a standard function in the Statistics toolbox. The software automatically selects the optimal bandwidth to minimise the squared difference between the empirical density and the fitted density.
The daily VaR estimate for a long portfolio is the absolute value of the return at the lower $\alpha$ percentile of the kernel density. To calculate ETL we take an average of tail VaRs as recommended by Dowd (2002, p. 45). To calculate VaR and the ETL over an h-day horizon we multiply the daily VaR or ETL by the square root of h, consistent with common industry practice. In addition we have performed an empirical analysis to justify the choice of square-root scaling, described in Appendix A.⁵
⁵ Scaling with the square root of time rule is still only an approximation. A better, though very computationally intensive method, is to simulate paths of length h and determine h-day VaR directly.
2.3. Unconditional Student’s t
Huisman et al. (1998) were among the first to specifically accommodate heavy tails in VaR estimation with the Student’s t distribution. We assume the mean-adjusted returns $e_t$ are distributed as follows:
$$e_t \sim \sigma\left(\nu/(\nu-2)\right)^{-1/2} T_\nu, \tag{4}$$
where $T_\nu$ is the standardised Student’s t distribution with $\nu$ degrees of freedom, zero mean and unit variance and $\sigma$ is the standard deviation of the mean-adjusted returns. To estimate the $\alpha$% h-day VaR we follow the method described by Dowd (2002):
$$\mathrm{VaR}_{h,\alpha} = -T_\nu^{-1}(\alpha)\left((\nu-2)/\nu\right)^{1/2}\sigma_h, \tag{5}$$
where the parameter $\nu$ is estimated using the method of moments. Appendix B derives the ETL:
$$\mathrm{ETL} = \alpha^{-1}(\nu-1)^{-1}\left(\nu - 2 + T_\nu^{-1}(\alpha)^2\right) t_\nu\!\left(T_\nu^{-1}(\alpha)\right)\sigma_h,$$
Both VaR and ETL are scaled using the square root of time rule for the case of horizons beyond 1 day.
2.5. Conditional normal
Here the mean adjusted returns are assumed to be conditionally normally distributed with conditional variance following the symmetric GARCH(1,1) process of Bollerslev (1986)⁷:
$$\sigma_t^2 = c_1 + c_2 e_{t-1}^2 + c_3 \sigma_{t-1}^2, \qquad c_1 \geq 0;\ c_2, c_3$$
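As a rough illustration of the conditional normal model, the GARCH(1,1) variance recursion can be combined with the normal VaR and ETL formulas of Section 2.1, using the conditional volatility in place of the unconditional one. The sketch below is not the authors' implementation: the function names, the start value of the recursion (the sample variance) and the parameter values c1, c2, c3 are assumptions chosen for illustration, whereas in the paper the parameters are estimated from the data.

```python
import math
from statistics import NormalDist


def garch_variance_path(returns, c1, c2, c3):
    """Filter sigma_t^2 = c1 + c2 * e_{t-1}^2 + c3 * sigma_{t-1}^2 through the sample."""
    mean = sum(returns) / len(returns)
    e = [r - mean for r in returns]            # mean-adjusted returns
    # Assumption: start the recursion at the sample variance of e.
    var = sum(x * x for x in e) / len(e)
    path = [var]
    for x in e:
        var = c1 + c2 * x * x + c3 * var
        path.append(var)
    return path                                # last entry is the one-step-ahead forecast


def normal_var_etl(sigma, alpha):
    """Normal VaR and ETL at significance level alpha for volatility sigma."""
    z = NormalDist().inv_cdf(alpha)            # lower alpha quantile (negative for small alpha)
    var = -z * sigma
    etl = NormalDist().pdf(z) / alpha * sigma
    return var, etl


# Illustrative inputs only: made-up returns and hypothetical GARCH parameters.
returns = [0.001, -0.004, 0.012, -0.020, 0.003, 0.007, -0.001]
sigma_next = math.sqrt(garch_variance_path(returns, 1e-6, 0.05, 0.90)[-1])
var_99, etl_99 = normal_var_etl(sigma_next, 0.01)
```

Here `var_99` and `etl_99` are 1-day measures expressed as a fraction of portfolio value; multi-day figures would require either a scaling rule or simulation of the recursion over the horizon.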
tion is the absolute portfolio return calculated at the lower (upper) $\alpha$ percentile and the corresponding ETL is the absolute mean of all simulated returns below (above) the VaR.
2.7. Conditional Student’s t
Several heavy-tailed distributions have been combined with GARCH models in the VaR estimation literature. Venter and de Jongh (2002), for example, use the normal inverse Gaussian distribution. The conditional Student’s t model has arguably been investigated by the greatest number of researchers and is a relatively standard option in statistical packages. Mittnik and Paolella (2000) estimate VaR for East-Asian currencies with an asymmetric generalised t distribution in combination with an APARCH model. Giot and Laurent (2003) apply a similar approach in equity markets and Angelidis et al. (2004) apply the simpler t-EGARCH model for estimating VaR in equity markets. In an analysis of major currencies, So and Yu (2006) find that the t-GARCH method performs well in VaR estimation for major currency returns.
The method used here is identical to that for Section 2.5, but now the innovations are drawn from a Student’s t distribution with $\nu$ degrees of freedom, in order to capture the conditional excess kurtosis present in empirical data.
2.8. Conditional normal mixture
We start by fitting a standard GARCH(1,1) model with normal innovations to historical returns, and then standardise the historical returns as in Section 2.6. Then using the EM algorithm we fit a normal mixture distribution to the standardised returns having two component densities. To calculate VaR at the h-day horizon we simulate (9) over an h-day horizon using innovations drawn from the fitted normal mixture distribution. The VaR for a long (or short) position is the absolute portfolio return calculated at the $\alpha$ (or $1-\alpha$) percentile and the corresponding ETL is the absolute mean of all simulated returns below (above) the VaR.
3. Model selection
In this section we backtest our risk models using the Kupiec (1995) test for coverage, the Christoffersen (1998) test for conditional coverage and a method for backtesting ETL due to McNeil and Frey (2000). Test procedures are explained in Appendix C. The empirical analysis here differs from previous empirical research in several respects, namely:
• Most previous research has analysed only VaR at the 99.0% confidence level. To better reflect the stress testing context of this research we consider confidence levels for VaR and ETL of 99.0%, 99.5% and 99.9%. The accuracy of our risk measures is assessed using over 15,000 out-of-sample returns.
• To account for market liquidity constraints we consider a 3-day risk horizon in addition to the more typical 1-day horizon.
• The choice of estimation window is an important source of model risk.⁹ The Basel regulations insist on an estimation window of at least 1 year (equivalent to around 250 trading days). Thus we consider a range of possible estimation windows (250, 500, 1000 and 2000 days) for estimating parameters, volatility and percentiles.
⁹ See the discussion of sample period in relation to historical simulation in Section 9.2 of Hull (2006). Risk Management and Financial Institutions (Pearson Prentice Hall). The issue of how many data points are needed to estimate the parameters of GARCH reliably is discussed in Straumann (2004) in Section 5.10.
Hence we conduct a rigorous set of backtests that are specifically designed to reduce the model risk in the risk models that are used for stress testing. This is an interesting area that warrants further empirical research. The present paper presents results for currency markets and forthcoming research by the authors examines equity index markets.
We evaluate the risk models in the context of daily returns on the Australian dollar in terms of US dollars (AUD/USD), the Great Britain Pound in terms of US dollars (GBP/USD) and the US dollar in terms of Japanese Yen (USD/JPY).¹⁰ We have 5691 observations on AUD/USD dating from the float of the AUD in December 1983 until June 2006 and 8154 observations from January 1974 until June 2006 for the USD/JPY and GBP/USD.
¹⁰ Data were obtained from (www.rba.gov.au/Statistics/HistoricalExchangeRates/index.html) and (www.federalreserve.gov/releases/h10/Hist/default.htm).
For each series we repeatedly estimate both VaR and ETL for each of the eight risk models. The maximum sample size for estimation of parameters is 2000 days, so the first estimation occurs 2000 days (nearly 8 years) into the data sample. In the case of the USD/JPY and the GBP/USD this allows us to calculate over 6000 estimates of VaR and ETL having a 1-day horizon, or around 2000 non-overlapping estimates of VaR and ETL having a 3-day horizon. Each time VaR and ETL are estimated they are based on revised parameter estimates (or percentiles) using the most recent estimation window of 250, 500, 1000 or 2000 days. At the end of each risk horizon we calculate actual profit and loss for the trading portfolio. An exceedance occurs if the loss is greater than the estimated VaR for that risk horizon. These exceedances are the inputs to the tests of coverage, conditional coverage and ETL referred to above. Tables 1–4 present results for each of the eight risk models using a sample period of 2000 days.
Clearly the assumption of normality of returns cannot be justified, whether in the conditional or unconditional context. Column (i.c.) of Table 1 shows that in almost every case we can comfortably reject the hypothesis that the actual number of exceptions is equal to the expected number of exceptions. The unconditional normal model
Table 1
Backtest with normal innovations (estimation window = 2000 days, see note A)
Columns: (a) currency pair; (b) confidence level 1 − α; then for panel (i) unconditional normal and panel (ii) conditional normal in turn: (c) unconditional coverage VaR p-value, (d) conditional coverage VaR p-value, (e) ETL p-value
1-day horizon
Long AUD/Short USD (%) 99.0 0.00 0.02 0.00 0.00 0.02 0.00
99.5 0.00 0.00 0.00 0.00 0.02 0.00
99.9 0.00 0.02 0.00 0.00 0.02 0.00
Short AUD/Long USD (%) 99.0 1.24 4.24 0.00 2.74 8.30 0.50
99.5 0.00 0.02 0.00 0.42 1.26 5.60
99.9 0.00 0.00 0.00 5.23 14.95 6.00
Long GBP/Short USD (%) 99.0 0.00 0.00 0.00 0.00 0.02 0.00
99.5 0.00 0.00 0.00 0.00 0.02 0.00
99.9 0.00 0.00 0.50 0.00 0.02 2.20
Short GBP/Long USD (%) 99.0 9.52 6.30 0.00 7.37 20.16 0.10
99.5 5.44 1.67 0.00 5.44 11.78 0.00
99.9 0.00 0.00 0.10 0.10 0.43 0.00
Long USD/Short JPY (%) 99.0 0.00 0.00 0.00 0.00 0.00 0.00
99.5 0.00 0.00 0.00 0.00 0.00 0.00
99.9 0.00 0.00 0.00 0.00 0.00 0.00
Short USD/Long JPY (%) 99.0 12.15 17.81 0.00 23.63 48.65 0.00
99.5 0.24 0.18 0.00 0.40 1.10 0.20
99.9 0.00 0.00 0.00 0.00 91.77 0.10
3-day horizon
Long AUD/Short USD (%) 99.0 0.06 0.08 0.30 2.35 5.34 2.00
Short AUD/Long USD (%) 99.0 63.36 75.96 4.50 9.84 24.52 98.10
Long GBP/Short USD (%) 99.0 0.00 0.00 0.00 0.00 0.02 17.90
Short GBP/Long USD (%) 99.0 74.38 46.75 0.60 16.96 27.15 0.90
Long USD/Short JPY (%) 99.0 0.00 0.00 0.00 0.00 0.00 0.00
Short USD/Long JPY (%) 99.0 90.95 81.58 0.70 90.95 81.59 5.60
(a) The trading portfolio used for the analysis. (b) The confidence level at which VaR and ETL are calculated. (c) Test of the null hypothesis that the actual
number of violations is equal to the expected number of violations. (d) Test of the null hypothesis that violations are spread evenly over time (as opposed
to being clustered) and (c) above. (e) Test of the null hypothesis that the ETL does not consistently understate the true potential for losses beyond the VaR.
Note A: Estimation window refers to the number of observations used to estimate VaR/ETL. The sample is rolled over daily, keeping the sample size constant, to generate 15 years of out-of-sample VaR forecasts over both 1-day and 3-day horizons. The backtests are based on the ‘violations’, i.e. the returns that exceeded the prediction of VaR.
performs no better in a test of conditional coverage (column (i.d.)), and performance in measuring ETL is poor. Column (i.e.) shows that in every case we can reject the null hypothesis (with 95% confidence) that ETL does not consistently understate the true potential for losses beyond the VaR. Turning to the conditional normal risk model presented in panel (ii) of Table 1 we observe that while modelling heteroscedasticity improves performance, especially in terms of conditional coverage, the results do not generally support the use of this risk model.
The risk models based on the empirical distribution are much more suited to stress testing, especially in the conditional case. Results for the unconditional empirical model are presented in panel (i) of Table 2. In the majority of cases considered we cannot reject the hypotheses tested at 95% confidence. We note, however, that in some cases there is a tendency for exceptions to be clustered, and the hypothesis with regard to conditional coverage is rejected (at 95%) in 8 out of 24 cases. The conditional empirical risk model in panel (ii) of Table 2 eliminates this problem. Use of the Student’s t distribution can also be justified for stress testing purposes, especially in the conditional t case. Table 3 presents the unconditional t results in panel (i), with reasonable results for most portfolios, the obvious exceptions being long AUD/USD and long USD/JPY. This can be explained by the negative skewness observed in both currency pairs which potentially could be addressed with the use of an asymmetric t distribution. Column (i.d.) also provides some evidence that this risk model is flawed by clustering of exceedances, even in the conditional case. However, the ETL test results are in fact the best of all the risk models considered in this study. In only 1 out of 24 cases can we reject the ETL measure at the 95% confidence level. Even at very low levels of α the measure of ETL under this risk measure conservatively estimates the potential for losses beyond the VaR. Indeed, it could be argued that the conditional t risk model is too conservative in some cases since no exceedances are recorded for two of the portfolios.
In order to conserve space we present abbreviated results for the normal mixture distribution in Table 4. As explained earlier, the normal mixture distribution is intuitively appealing for modelling extreme risk as it allows
Table 2
Backtest with empirical innovations (estimation window = 2000 days, see note A)
Columns: (a) currency pair; (b) confidence level 1 − α; then for panel (i) unconditional empirical and panel (ii) conditional empirical in turn: (c) unconditional coverage VaR p-value, (d) conditional coverage VaR p-value, (e) ETL p-value
1-day horizon
Long AUD/Short USD (%) 99.0 40.90 56.93 14.50 98.68 68.73 93.00
99.5 56.04 74.84 5.80 27.79 52.63 58.70
99.9 27.03 53.94 7.40 9.60 25.02 0.00
Short AUD/Long USD (%) 99.0 50.51 62.43 23.70 51.12 59.83 52.10
99.5 42.14 21.86 38.90 91.60 91.04 74.60
99.9 5.23 0.62 70.30 71.02 93.11 62.20
Long GBP/Short USD (%) 99.0 95.20 8.83 96.00 9.60 16.88 72.20
99.5 74.74 82.76 94.80 9.50 22.93 48.80
99.9 63.06 88.72 12.60 95.06 99.23 35.60
Short GBP/Long USD (%) 99.0 1.80 0.76 73.10 55.67 49.37 86.80
99.5 6.11 3.01 60.40 28.14 50.55 72.40
99.9 15.75 36.77 3.90 63.06 88.72 89.70
Long USD/Short JPY (%) 99.0 57.14 0.26 15.80 84.39 54.32 10.40
99.5 45.39 0.29 22.20 68.97 77.29 13.90
99.9 15.51 1.43 16.70 73.82 93.82 2.10
Short USD/Long JPY (%) 99.0 20.97 9.98 54.00 94.58 89.35 68.00
99.5 61.18 26.08 51.60 48.70 21.83 18.40
99.9 95.06 99.23 28.10 47.66 76.81 21.50
3-day horizon
Long AUD/Short USD (%) 99.0 63.36 75.96 91.20 84.25 85.32 94.50
Short AUD/Long USD (%) 99.0 9.84 24.52 71.50 9.84 24.52 99.00
Long GBP/Short USD (%) 99.0 4.88 3.03 82.20 45.07 42.44 61.20
Short GBP/Long USD (%) 99.0 4.06 11.45 4.60 58.78 66.51 35.60
Long USD/Short JPY (%) 99.0 33.54 0.72 19.40 24.21 36.13 0.20
Short USD/Long JPY (%) 99.0 12.54 28.07 38.30 73.44 79.04 13.50
(a) The trading portfolio used for the analysis. (b) The confidence level at which VaR and ETL are calculated. (c) Test of the null hypothesis that the actual
number of violations is equal to the expected number of violations. (d) Joint test of the null hypothesis that violations are spread evenly over time (as
opposed to being clustered). (e) Test of the null hypothesis that the ETL does not consistently understate the true potential for losses beyond the VaR.
Note A: Estimation window refers to the number of observations used to estimate VaR/ETL. The sample is rolled over daily, keeping the sample size constant, to generate 15 years of out-of-sample VaR forecasts over both 1-day and 3-day horizons. The backtests are based on the ‘violations’, i.e. the returns that exceeded the prediction of VaR.
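Column (c) of the backtest tables reports p-values for the unconditional coverage test attributed to Kupiec (1995). The paper's own procedure is given in its Appendix C, which is not reproduced here, so the following Python sketch is only an illustration of the likelihood-ratio form commonly associated with that test. The function name and the example counts are hypothetical, and the p-value uses the asymptotic chi-square distribution with one degree of freedom.

```python
import math


def kupiec_pvalue(n_obs, n_exceed, alpha):
    """Asymptotic p-value of the unconditional coverage likelihood-ratio test.

    n_obs: number of out-of-sample VaR forecasts
    n_exceed: number of exceedances (losses greater than VaR)
    alpha: expected exceedance probability, e.g. 0.01 for 99% VaR
    """
    def loglik(p):
        # Binomial log-likelihood, skipping terms with a zero count so that
        # 0 * log(0) is treated as 0.
        ll = 0.0
        if n_exceed > 0:
            ll += n_exceed * math.log(p)
        if n_exceed < n_obs:
            ll += (n_obs - n_exceed) * math.log(1.0 - p)
        return ll

    lr = -2.0 * (loglik(alpha) - loglik(n_exceed / n_obs))
    lr = max(lr, 0.0)                      # guard against tiny negative rounding
    # Survival function of a chi-square(1) variable: P(X > lr) = erfc(sqrt(lr / 2)).
    return math.erfc(math.sqrt(lr / 2.0))


# Hypothetical counts: 1000 forecasts of 99% VaR with 10, 15 and 30 exceedances.
p_exact = kupiec_pvalue(1000, 10, 0.01)
p_mild = kupiec_pvalue(1000, 15, 0.01)
p_severe = kupiec_pvalue(1000, 30, 0.01)
```

A small p-value indicates that the observed number of exceedances differs from the expected number by more than sampling variation would suggest, which is the sense in which a model is rejected in column (c).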
Table 3
Backtest with Student’s t innovations (estimation window = 2000 days, see note A)
Columns: (a) currency pair; (b) confidence level 1 − α; then for panel (i) unconditional t and panel (ii) conditional t in turn: (c) unconditional coverage VaR p-value, (d) conditional coverage VaR p-value, (e) ETL p-value
1-day horizon
Long AUD/Short USD (%) 99.0 0.03 0.04 9.70 14.71 11.44 69.30
99.5 1.35 2.41 6.20 30.66 51.33 92.20
99.9 0.06 0.27 87.70 33.46 62.71 73.00
Short AUD/Long USD (%) 99.0 61.28 66.68 10.80 17.45 31.61 99.70
99.5 21.58 16.37 32.00 0.60 2.25 88.40
99.9 27.03 1.20 68.30 NA NA NA
Long GBP/Short USD (%) 99.0 15.34 1.58 95.10 74.41 53.55 96.10
99.5 20.73 35.65 99.70 88.96 85.50 98.70
99.9 35.33 64.83 87.10 15.75 36.77 86.60
Short GBP/Long USD (%) 99.0 5.21 2.38 24.40 7.14 13.50 97.30
99.5 20.34 9.97 24.10 2.21 6.87 84.30
99.9 63.06 88.72 37.30 5.08 14.84 76.20
Long USD/Short JPY (%) 99.0 0.10 0.00 0.10 7.37 4.59 23.30
99.5 0.00 0.00 10.10 5.44 0.00 36.10
99.9 0.10 0.05 56.00 7.87 20.90 54.70
Short USD/Long JPY (%) 99.0 7.14 3.32 56.60 0.51 1.10 100.00
99.5 37.62 17.54 86.80 1.23 4.13 100.00
99.9 35.33 64.83 75.50 NA NA NA
3-day horizon
Long AUD/Short USD (%) 99.0 7.54 11.92 58.10 12.65 23.81 99.40
Short AUD/Long USD (%) 99.0 49.57 73.04 91.80 4.50 13.02 99.80
Long GBP/Short USD (%) 99.0 0.10 0.18 87.30 7.61 15.16 89.30
Short GBP/Long USD (%) 99.0 7.40 18.66 3.80 74.38 74.67 59.50
Long USD/Short JPY (%) 99.0 0.62 0.02 2.40 0.62 0.74 3.90
Short USD/Long JPY (%) 99.0 7.40 18.66 48.80 19.91 39.26 63.00
(a) The trading portfolio used for the analysis. (b) The confidence level at which VaR and ETL are calculated. (c) Test of the null hypothesis that the actual
number of violations is equal to the expected number of violations. (d) Joint test of the null hypothesis that violations are spread evenly over time (as
opposed to being clustered) and (c) above. (e) Test of the null hypothesis that the ETL does not consistently understate the true potential for losses beyond
the VaR.
Note A: Estimation window refers to the number of observations used to estimate VaR/ETL. The sample is rolled over daily, keeping the sample size constant, to generate 15 years of out-of-sample VaR forecasts over both 1-day and 3-day horizons. The backtests are based on the ‘violations’, i.e. the returns that exceeded the prediction of VaR. NA: No violations.
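Column (d) of the backtest tables reports a joint test of correct coverage and of violations being spread evenly over time, following Christoffersen (1998). As a hedged illustration, and not the procedure of the paper's Appendix C, the sketch below uses a first-order Markov likelihood ratio computed on the transitions between consecutive days, with an asymptotic chi-square distribution with two degrees of freedom. The function names and the two example hit sequences are hypothetical.

```python
import math


def _xlogy(x, y):
    """Return x * log(y), with the convention 0 * log(0) = 0."""
    return 0.0 if x == 0 else x * math.log(y)


def christoffersen_cc_pvalue(hits, alpha):
    """Asymptotic p-value of a joint coverage and independence test.

    hits: sequence of 0/1 exceedance indicators in time order
    alpha: expected exceedance probability, e.g. 0.01 for 99% VaR
    """
    n = [[0, 0], [0, 0]]                   # n[i][j]: transitions from state i to state j
    for prev, cur in zip(hits[:-1], hits[1:]):
        n[prev][cur] += 1
    n0 = n[0][0] + n[1][0]                 # non-exceedances after the first day
    n1 = n[0][1] + n[1][1]                 # exceedances after the first day
    # Null: exceedances are independent with probability alpha.
    ll_null = _xlogy(n0, 1.0 - alpha) + _xlogy(n1, alpha)
    # Alternative: exceedance probability depends on the previous day's state.
    ll_alt = 0.0
    for i in (0, 1):
        row = n[i][0] + n[i][1]
        if row > 0:
            pi = n[i][1] / row
            ll_alt += _xlogy(n[i][0], 1.0 - pi) + _xlogy(n[i][1], pi)
    lr = max(-2.0 * (ll_null - ll_alt), 0.0)
    return math.exp(-lr / 2.0)             # chi-square(2) survival function


# Hypothetical sequences with ten exceedances each in 1000 days.
spread = [0] * 1000
for i in range(50, 1000, 100):
    spread[i] = 1                          # isolated exceedances
clustered = [0] * 1000
for i in range(500, 510):
    clustered[i] = 1                       # ten consecutive exceedances
p_spread = christoffersen_cc_pvalue(spread, 0.01)
p_clustered = christoffersen_cc_pvalue(clustered, 0.01)
```

Both sequences have the expected number of exceedances, so an unconditional coverage test would not distinguish them; the joint test penalises the clustered sequence, which mirrors the contrast the text draws between columns (c) and (d).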
Table 4
Backtest with normal mixture innovations (estimation window = 2000 days, see note A)
Columns: (a) currency pair; (b) confidence level 1 − α; then for panel (i) unconditional normal mixture and panel (ii) conditional normal mixture in turn: (c) unconditional coverage VaR p-value, (d) conditional coverage VaR p-value, (e) ETL p-value
1-day horizon
Long AUD/Short USD (%) 99.0 2.74 3.85 0.00 0.03 0.10 0.00
99.5 0.12 0.32 0.00 0.00 0.01 0.00
99.9 0.00 0.02 2.10 0.00 0.02 0.00
Long GBP/Short USD (%) 99.0 4.27 3.25 1.40 0.00 0.02 0.00
99.5 1.61 3.97 13.00 0.00 0.01 0.00
99.9 15.50 35.79 3.30 0.00 0.02 0.40
Long USD/Short JPY (%) 99.0 2.37 0.00 0.00 0.00 0.00 0.00
99.5 0.14 0.00 0.00 0.00 0.02 0.00
99.9 0.00 0.00 0.30 0.00 0.02 0.00
3-day horizon
Long AUD/Short USD (%) 99.0 20.29 35.02 5.30 2.35 5.34 5.50
Long GBP/Short USD (%) 99.0 0.05 0.10 3.70 0.01 0.03 12.70
Long USD/Short JPY (%) 99.0 3.04 0.04 0.00 0.05 0.10 0.00
A For reasons of space, results from only three currency pairs are displayed here. All portfolios performed badly in the unconditional coverage and ETL
tests.
2228 C. Alexander, E. Sheedy / Journal of Banking & Finance 32 (2008) 2220–2236
Table 5
Impact of estimation window on backtest of conditional t
Currency pair (a) | 1 − α (b) | (i) Estimation window = 250 days: (c), (d), (e) | (ii) Estimation window = 2000 days: (c), (d), (e)
Column key: (c) unconditional coverage VaR p-value; (d) conditional coverage VaR p-value; (e) ETL p-value.
1-day horizon
Long AUD/Short USD (%) 99.0 0.21 0.88 0.50 14.71 11.44 69.30
99.5 0.01 0.03 13.80 30.66 51.33 92.20
99.9 1.97 6.45 1.60 33.46 62.71 73.00
Short AUD/Long USD (%) 99.0 98.68 68.73 95.70 17.45 31.61 99.70
99.5 55.84 78.59 97.70 0.60 2.25 88.40
99.9 9.60 25.02 0.00 NA NA NA
Long GBP/Short USD (%) 99.0 23.63 21.65 53.30 74.41 53.55 96.10
99.5 45.39 61.83 31.90 88.96 85.50 98.70
99.9 28.30 55.46 27.10 15.75 36.77 86.60
Short GBP/Long USD (%) 99.0 94.58 89.35 30.20 7.14 13.50 97.30
99.5 68.97 77.29 14.80 2.21 6.87 84.30
99.9 15.51 35.81 38.20 5.08 14.84 76.20
Long USD/Short JPY (%) 99.0 0.00 0.00 0.30 7.37 0.00 23.30
99.5 0.00 0.00 1.10 5.44 0.00 36.10
99.9 0.01 0.01 1.20 7.87 20.90 54.70
Short USD/Long JPY (%) 99.0 7.14 13.67 98.30 0.51 1.10 100.00
99.5 6.11 16.11 93.20 1.23 4.13 100.00
99.9 NA NA NA NA NA NA
3-day horizon
Long AUD/Short USD (%) 99.0 2.35 5.34 9.40 12.65 23.81 99.40
Short AUD/Long USD (%) 99.0 4.50 13.02 80.40 4.50 13.02 99.80
Long GBP/Short USD (%) 99.0 3.04 7.59 82.10 7.61 15.16 89.30
Short GBP/Long USD (%) 99.0 74.38 74.67 24.60 74.38 74.67 59.50
Long USD/Short JPY (%) 99.0 0.62 2.05 0.10 0.62 0.74 3.90
Short USD/Long JPY (%) 99.0 42.22 62.86 28.40 19.91 39.26 63.00
(a) The trading portfolio used for the analysis. (b) The confidence level at which VaR and ETL are calculated. (c) Test of the null hypothesis that the actual
number of violations is equal to the expected number of violations. (d) Test of the null hypothesis that violations are spread evenly over time (as opposed
to being clustered). (e) Test of the null hypothesis that the ETL does not consistently understate the true potential for losses beyond the VaR.
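The coverage tests in notes (c) and (d) are likelihood ratio tests on the sequence of VaR violations, as set out in the paper's appendix. The following is a minimal sketch, assuming the violations are already available as a list of 0/1 flags. The function name `coverage_tests` and the helper `_loglik` are illustrative and not taken from the paper. The chi-squared p-values use closed forms for 1 and 2 degrees of freedom.

```python
import math

def _loglik(p, hits, misses):
    """Log of p**hits * (1 - p)**misses, treating 0 * log(0) as 0."""
    out = 0.0
    if hits:
        out += hits * math.log(p)
    if misses:
        out += misses * math.log(1.0 - p)
    return out

def coverage_tests(violations, p_exp):
    """Return (LR_uc, p_uc, LR_cc, p_cc) for a 0/1 violation sequence.

    violations: list of ints, 1 = the return breached the VaR forecast.
    p_exp: expected violation proportion, e.g. 0.01 for 99% VaR.
    """
    n1 = sum(violations)
    n0 = len(violations) - n1
    p_obs = n1 / (n0 + n1)
    ll_null = _loglik(p_exp, n1, n0)

    # Unconditional coverage: -2 ln LR ~ chi-squared with 1 d.o.f.
    lr_uc = max(0.0, -2.0 * (ll_null - _loglik(p_obs, n1, n0)))

    # Transition counts n[i][j]: state i followed by state j (1 = violation).
    n = [[0, 0], [0, 0]]
    for prev, cur in zip(violations, violations[1:]):
        n[prev][cur] += 1
    from_good = n[0][0] + n[0][1]
    from_viol = n[1][0] + n[1][1]
    p01 = n[0][1] / from_good if from_good else 0.0
    p11 = n[1][1] / from_viol if from_viol else 0.0
    ll_markov = _loglik(p01, n[0][1], n[0][0]) + _loglik(p11, n[1][1], n[1][0])

    # Conditional coverage: -2 ln LR ~ chi-squared with 2 d.o.f.
    lr_cc = max(0.0, -2.0 * (ll_null - ll_markov))

    p_uc = math.erfc(math.sqrt(lr_uc / 2.0))  # chi2(1) survival function
    p_cc = math.exp(-lr_cc / 2.0)             # chi2(2) survival function
    return lr_uc, p_uc, lr_cc, p_cc
```

A sequence with the expected number of violations spread far apart should pass both tests, while the same number of violations bunched together should fail the conditional test; the tables report these p-values in percent.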
Global Financial System (2005) has been to base this on an historical or hypothetical event specified by analysts, management or regulators. An alternative or complementary approach is to consider an extreme outcome as defined under the risk model.

For a long position VaR_α is a loss that should be exceeded in only 100α% of cases. Consequently the model-based stress test equates α to the probability of a market shock occurring on a given day where the size of the shock is VaR_{1,α}.11 A typical value for α is 0.0002, corresponding to a loss that we are 99.98% confident will not be exceeded over one day. The value of α should be fixed at board level to reflect the risk profile of the organisation and, if relevant, its target credit rating.

Under the empirical approach the initial shock e* for the long asset position is simply the α percentile of the empirical distribution using a large sample of data. We recommend using 15 years or more of daily returns. But if we assume a specific return distribution then the initial shock is derived from that distribution. For example under the Student’s t distribution:

\[ e^*_T = t_\nu^{-1}(\alpha)\,\bigl((\nu-2)/\nu\bigr)^{1/2}\,\sigma_T, \]

where ν is the degrees of freedom and σ_T is the equally weighted sample standard deviation using all available data up to time T. The initial shock can also be defined using EVT. Gencay and Selcuk (2004) show that EVT-based VaR estimates are often more accurate for small values of α in emerging markets. Under the EVT the initial shock for a long position is

\[ e^*_T = u - \beta\,\xi^{-1}\left[\left(\frac{n\alpha}{n_u}\right)^{-\xi} - 1\right], \]

where u is the threshold, ξ and β are the tail index and scale parameters of the generalised Pareto distribution, n is the sample size and n_u is the number of observations exceeding the threshold.

Table 6 illustrates these alternative approaches for defining the initial shock event using the three long portfolios. We define the initial shock using all data up to June 2006. We note that the Student’s t distribution always gives the smallest shocks in an absolute sense but the other two models provide similar results. Thus it is difficult to justify the use of the more complex EVT methodology in this case. We hypothesise that this may be due to the fact that the estimated tail index is relatively low for currency data com-

11 For a short position the size of the initial shock to returns is VaR_{1,(1−α)}.
12 In the short asset case the same analysis is performed but path returns are first multiplied by −1.
13 The methodology for unconditional stress tests is available from the authors on request.
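The Student's t and EVT definitions of the initial shock for a long position can be sketched numerically. This is a minimal illustration using only the standard library: the t quantile is found by Simpson integration and bisection rather than a statistics package, and all function names and parameter values (`nu = 6`, `sigma = 0.006`, the GPD inputs) are hypothetical choices for illustration, not estimates from the paper.

```python
import math

def t_pdf(x, nu):
    """Density of the ordinary Student's t with nu degrees of freedom."""
    c = math.gamma((nu + 1) / 2) / (math.sqrt(nu * math.pi) * math.gamma(nu / 2))
    return c * (1 + x * x / nu) ** (-(nu + 1) / 2)

def t_cdf(x, nu, steps=4000):
    """CDF via Simpson's rule on [0, |x|] plus symmetry (steps must be even)."""
    a = abs(x)
    if a == 0.0:
        return 0.5
    h = a / steps
    total = t_pdf(0.0, nu) + t_pdf(a, nu)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, nu)
    area = total * h / 3
    return 0.5 + area if x > 0 else 0.5 - area

def t_quantile(alpha, nu):
    """Lower-tail quantile by bisection; assumes 0 < alpha < 0.5 and a quantile above -100."""
    lo, hi = -100.0, 0.0
    for _ in range(60):
        mid = (lo + hi) / 2
        if t_cdf(mid, nu) < alpha:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

def t_shock(alpha, nu, sigma):
    """Initial shock: t quantile rescaled to unit variance, times the sample std dev."""
    return t_quantile(alpha, nu) * math.sqrt((nu - 2) / nu) * sigma

def evt_shock(u, beta, xi, n, n_u, alpha):
    """Initial shock from a generalised Pareto tail fitted below the threshold u (u < 0)."""
    return u - (beta / xi) * ((n * alpha / n_u) ** (-xi) - 1)

# Hypothetical inputs: daily volatility of 0.6%, 6 degrees of freedom, alpha = 0.0002.
print(t_shock(0.0002, 6, 0.006))
print(evt_shock(u=-0.01, beta=0.005, xi=0.1, n=5000, n_u=250, alpha=0.0002))
```

Both functions return a negative return for a long position. When nα equals n_u the EVT shock collapses to the threshold u itself, which is a quick sanity check on the formula.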
[Figure: Long GBP/USD Portfolio, Conditional Empirical Model. Vertical axis: % portfolio value; horizontal axis: risk horizon in days (1 to 20).]
Fig. 1. Comparison of Basel I regulatory capital and stress test losses.

The size of the initial shock: We set α = 0.0002 (or 0.0005), equivalent to an initial shock that just breaches the 99.98% (or 99.95%) confidence level.
The percentile for calculating the stress loss: We chose R equal to 0.01, equivalent to a one-sided 99% confidence interval for stress test outcomes.
The holding period: This is the number of days that the portfolio is held before it is fully hedged against further losses. We examined holding periods of 1, 2, . . ., 20 days. Note that there is no regulatory prescription regarding the risk horizon of stress tests.

Fig. 1 presents the conditional empirical stress test results for a long GBP/USD currency portfolio. For comparison the VaR-based regulatory capital requirement is depicted by the horizontal line (see explanation in Section 5.5 below). The risk horizon for the stress tests is on the horizontal axis and the 99% stress losses corresponding to an initial shock at the 99.98% and the 99.95% percentiles are shown on the chart. The results indicate that the VaR-based regulatory capital requirement exceeds our stress loss provided the horizon is less than 13 or 19 days (depending on the size of the initial shock). That is, model-based stress tests would not necessitate any increase in regulatory capital provided the horizon for the stress test is less than 13 or 19 days. A similar conclusion is reached for AUD/USD and USD/JPY, although losses from a 10-day stress test would marginally exceed VaR-based regulatory capital for USD/JPY (see Table 7).

What risk horizon should be applied to model-based stress tests? A much shorter risk horizon (say, 3 days) may be sufficient to hedge typical positions in major currencies since even the least liquid of the currency pairs we are examining (the AUD/USD) has average daily turnover of USD90 billion.16 In addition, the ability of institutions to gradually hedge positions (and thereby reduce the average risk horizon) will provide some relief almost immediately. However, the stress test horizon should also take account of reduced market liquidity in a crisis, the size of the position relative to the market and potential delays in

Table 7 displays the stress test results for three of the six portfolios and for risk horizons of 3 and 10 days. Consider, for example, the 3-day stress test performed on the long AUD portfolio with α = 0.0002, presented in panel 1 of Table 7. The assumed initial shock to currency returns varies between models from 4.41% (empirical) to 3.90% (conditional t), but a much larger discrepancy between the risk models is apparent when comparing the overall stress test outcomes over 3-day and 10-day horizons. The conditional models incorporate volatility clustering and therefore have a wider range of outcomes. The conditional empirical model suggests that when a stress event occurs we are 99% confident that losses over 3 days will be no greater than 9.88% of portfolio value. The best possible outcome (with 99% confidence) is a loss of −0.002%, i.e. a very small profit!

We also repeated the analysis with initial shocks calculated using EVT.18 We estimated ξ and β from the same empirical sample and used these parameter estimates to calculate VaR, which in turn was used to determine the initial shock in the stress test. But calculating initial shocks in this way had negligible impact on the final stress test outcomes. Note that in every case the estimated value of the crucial tail index parameter was less than 0.1, indicating that the tails of currency returns distributions are not excessively heavy.

5.3. Model-based stress tests over time

Standard application of the GARCH approach is known to induce VaR estimates that vary significantly over time, a characteristic that can present practical difficulties for financial institutions using VaR as an input to the gearing decision.19 GARCH models have been eschewed by some institutions since GARCH-based VaR estimates can increase suddenly and dramatically following a market shock and it is typically not possible to raise capital in a very short time-frame.

16 Turnover figures were obtained from BIS (2005). In the USD/JPY average daily turnover is USD296 billion, while in GBP/USD average daily turnover is USD245 billion.
17 We know this from personal correspondence with the Market Risk Analytics team at Deutsche Bank, London. Longer horizons are used for less liquid assets, with a typical stress testing horizon being 2 months.
18 Results available from the authors on request.
19 A better application of GARCH-based VaR models may be to the assessment of trading limits. In this context a VaR measure that is sensitive to current market conditions is potentially attractive.
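The three stress test parameters described above (initial shock, stress loss percentile and holding period) can be illustrated with a small simulation. The sketch below assumes a GARCH(1,1) model with normal innovations as a stand-in for a conditional risk model; it is not the paper's conditional empirical or conditional t model, and the function name and all parameter values are hypothetical. The initial shock is applied on day 1, the conditional variance is updated by that shock, and the remaining h − 1 daily returns are simulated.

```python
import math
import random

def conditional_stress_loss(shock, h, omega, a, b, sigma0,
                            q=0.01, n_paths=5000, seed=0):
    """Stress loss at the q percentile of h-day cumulative returns after an initial shock.

    shock:  day-1 return (negative for a long position)
    omega, a, b: GARCH(1,1) constant, reaction and persistence (assumed values)
    sigma0: daily volatility prevailing before the shock
    """
    rng = random.Random(seed)
    totals = []
    for _path in range(n_paths):
        total = shock
        # The shock feeds into next-day variance: this is the volatility clustering.
        var = omega + a * shock ** 2 + b * sigma0 ** 2
        for _day in range(h - 1):
            r = math.sqrt(var) * rng.gauss(0.0, 1.0)
            total += r
            var = omega + a * r ** 2 + b * var
        totals.append(total)
    totals.sort()
    k = max(0, math.ceil(q * n_paths) - 1)  # index of the q percentile outcome
    return -totals[k]                       # loss = -1 times the percentile outcome

# Hypothetical daily parameters: long-run volatility 0.6%, persistence a + b = 0.98.
sigma0 = 0.006
a, b = 0.05, 0.93
omega = (1 - a - b) * sigma0 ** 2
for h in (1, 3, 10):
    print(h, conditional_stress_loss(-0.04, h, omega, a, b, sigma0))
```

With h = 1 the stress loss is simply the size of the initial shock. Longer holding periods add further simulated returns drawn from the elevated post-shock volatility, which is why the stress loss grows with the horizon.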
Table 7
Stress test results
Unconditional empirical Conditional empirical Conditional t
1. Long AUD/Short USD portfolio: maximum historical loss over 3 days = 7.27%, Over 10 days = 12.61%
Long term volatility estimate NA 10.36% 10.36%
VaR-based regulatory capital 17.84% 16.93% 16.05%
99% Stress loss:
h = 3, a = 0.0002 7.07% 9.88% 8.03%
h = 3, a = 0.0005 6.72% 9.07% 7.04%
h = 10, a = 0.0002 10.05% 15.42% 12.46%
h = 10, a = 0.0005 9.70% 14.54% 11.12%
2. Long GBP/Short USD portfolio: maximum historical loss over 3 days = 7.45%, Over 10 days = 15.89%
Long term volatility estimate NA 9.60% 9.60%
VaR-based regulatory capital 15.94% 14.39% 13.97%
99% Stress loss:
h = 3, a = 0.0002 6.12% 8.20% 7.53%
h = 3, a = 0.0005 5.60% 7.15% 6.37%
h = 10, a = 0.0002 8.79% 12.84% 11.74%
h = 10, a = 0.0005 8.26% 11.28% 10.26%
3. Long USD/Short JPY portfolio: maximum historical loss over 3 days = 7.05%, Over 10 days = 12.77%
Long term volatility estimate NA 10.44% 10.44%
VaR-based regulatory capital 18.16% 16.55% 15.12%
99% Stress loss:
h = 3, a = 0.0002 7.82% 10.93% 8.33%
h = 3, a = 0.0005 6.55% 8.47% 7.04%
h = 10, a = 0.0002 10.86% 17.06% 12.34%
h = 10, a = 0.0005 9.58% 13.19% 10.72%
(a) Long term volatility estimate is the sample standard deviation of all daily log returns, expressed in annual terms. (b) Regulatory capital expressed as a
percentage of the portfolio amount, calculated as 3 × VaR_{0.01,10-day} using the relevant risk model. VaR calculations use all available historical data for
estimating parameters and percentiles and, where needed, the long-term volatility estimate. (c) Stress loss refers to −1 times the stress test outcome at the
q percentile over an h-day horizon, expressed as a percentage of initial portfolio value, assuming an initial shock.
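Table 7 can be read in money terms with a few lines of arithmetic. The sketch below takes the panel 2 (Long GBP/Short USD) conditional empirical figures from the table and applies them to a position of 1 million, and also sketches the fuller capital rule quoted in Section 5.5 (the larger of the latest VaR and k times the 60-day average VaR). The function names and the VaR histories passed to them are hypothetical illustrations.

```python
def basel_market_risk_capital(var_history, k=3.0):
    """Larger of the latest 10-day 99% VaR and k times its average over the past 60 days.

    var_history: chronological list of VaR estimates as positive fractions of value.
    k: regulatory multiplier, with a minimum value of 3.
    """
    if k < 3.0:
        raise ValueError("the multiplier k has a minimum value of 3")
    recent = var_history[-60:]
    return max(var_history[-1], k * sum(recent) / len(recent))

def table7_capital(var_10day, k=3.0):
    """Simplified 'typical' capital used in Table 7: k times a single VaR estimate."""
    return k * var_10day

# Table 7, panel 2, conditional empirical model, h = 3, alpha = 0.0002.
position = 1_000_000
stress_loss_pct = 8.20   # 99% stress loss, % of portfolio value
capital_pct = 14.39      # VaR-based regulatory capital, % of portfolio value

stress_loss = position * stress_loss_pct / 100
capital = position * capital_pct / 100
print(f"stress loss: {stress_loss:,.0f}")
print(f"capital:     {capital:,.0f}")
print(f"capital / stress loss: {capital / stress_loss:.2f}")
```

The two money amounts reproduce the £82,000 and £143,900 quoted in Section 5.5, and the ratio shows by how much the capital estimate exceeds the 3-day stress loss for this portfolio.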
We therefore assess the stability of the model-based stress tests over time, and hence their suitability for guiding decisions regarding capital adequacy. Fig. 2 depicts the results of model-based stress tests over time in the case of the Long GBP/Short USD portfolio using the conditional t risk model. The stress analysis is repeated every quarter, using the available data from the start of the sample to the estimation date. The minimum sample size is thus 20 years. We find that the stress loss estimates are in fact quite stable over time. The gradual downtrend in stress loss is linked to a decrease in both long term volatility and the reaction of currencies to market shocks.

[Fig. 2: Long GBP/USD Portfolio, Conditional-t Model. 99% stress test loss with 99.98% initial shock, 10-day horizon. Analysis performed using all data from January 1974 until estimation date. Series plotted: t-GARCH reaction coefficient, long term volatility, 99% worst case stress loss. Vertical axis: 0% to 15%; horizontal axis: estimation date, Jan-94 to Jan-06.]

5.4. Comparing traditional stress tests with model based stress tests

According to the Committee on the Global Financial System (2005) the most commonly used stress tests are based on historical events that would be potentially disastrous for the portfolio being analysed. For the portfolios in this study we have identified the worst recorded currency movements from our data over horizons of 3 and 10 days, as shown in the headers of Table 7. Taking the Long AUD portfolio as an example, the worst recorded 3-day loss for this portfolio is 7.27% which is within the range predicted by both conditional stress tests with 99.98% initial shock. Note, however, that it is not within the range predicted by the unconditional stress test, providing further evidence that conditional risk models are preferable for stress testing purposes.

Are model based stress tests superior to traditional stress tests based on historical extremes? Consider the Long AUD position, where the conditional t risk model assumes
to historical data it relies on a very small sample. In contrast the model based stress test can simulate any number of exchange rate paths which are different from, but consistent with, those experienced historically. A larger sample allows the analyst to form a far more reliable conclusion concerning the likely impact of the stress event.

A further advantage of the model based approach is that it avoids subjective assumptions regarding the implications of the stress event for the portfolio. Using the traditional approach, analysts may overestimate, underestimate or even ignore the potential for increased volatility and therefore further large returns subsequent to the initial shock. The model based approach plots the likely consequences of a shock based on established characteristics of markets that have been verified using historical data over more than 20 years.

The weakness of the model based approach is its vulnerability to model risk. This includes the risk that an inappropriate model is used to describe market dynamics and the uncertainty about the true model. Our backtesting experiment suggests that unconditional historical simulation, currently the most popular VaR methodology in the industry according to Perignon and Smith (2006), is likely to be misspecified and is therefore unsuited for stress testing purposes. We have attempted to mitigate model risks with an extensive, tailored backtesting procedure (Section 3) and by using long data sets for parameter estimation, but there remains the risk that market dynamics may change significantly in the near future. The methodology we have outlined relies on extensive data. While lengthy data sets are available in many markets they are not available in all. In cases where markets have recently been deregulated, or where deregulation is considered likely in the near future, historical data may be available but not relevant. In these cases any stress test based on historical data (whether model based or not) will be problematic.

An example of this type of model risk can be illustrated with reference to the breakdown of the European Monetary System (EMS) in 1992. In September 1992 the GBP came under speculative attack as market participants anticipated the withdrawal of the GBP from the EMS and its subsequent depreciation. The greatest historical depreciation in the GBP/USD occurred at that time, with a loss of 15.89% over a 10-day period. Note that this loss falls outside the bounds predicted by the model-based stress test in Table 7. The model predicted a 99% stress loss over 10 days of only 12.84%, based on an initial shock at the 99.98% percentile. However the sharp depreciation in the GBP had been widely anticipated at the time. In such cases it is prudent to apply a hypothetical initial shock to the stress test model. In this case a hypothetical return of −5.00%, which seems conservative in the circumstances, would have been sufficient to generate a 99% stress loss of 17.40% over 10 days. This example illustrates the need for risk managers to constantly monitor markets for potential structural breaks that even a sophisticated risk model cannot adequately predict. In these cases hypothetical shocks should be used as a complement to model-based shocks.

5.5. Implications of model based stress tests for risk capital

This section assesses the model based stress tests in the light of regulatory capital requirements. According to Basel Committee on Banking Supervision (2006), regulated financial institutions using internal models to assess capital are now required to hold sufficient capital to cover stress tests. This is in addition to the requirement that they have capital sufficient to meet VaR-based criteria. That is, market risk capital requirements are set at:

\[ \max\bigl[\text{Most recent estimate of } \mathrm{VaR}_{0.01,\,10\text{-day}},\; k \times \text{Average } \mathrm{VaR}_{0.01,\,10\text{-day}} \text{ for past 60 days}\bigr], \]

where k is a multiplier determined by local regulators having a minimum value of 3. Note that in the case of interest rate related instruments and equity securities the VaR-based criteria also require capital to reflect the specific risk of the bank trading portfolio.

In Table 7 we calculate the VaR-based capital requirement as 3 × VaR_{0.01,10-day} using all available data for estimating risk model parameters/percentiles and, where necessary, using the long-term standard deviation. This measure, while slightly different from the equation above, is designed to represent a ‘typical’ capital requirement assuming relatively benign market conditions. Although the VaR-based capital requirement could exceed this value, we observe that our capital requirement estimates are always greater than the losses generated by model based stress tests except for very long stress test horizons. For instance, a position of £1 m for a USD investor has 99% stress loss over 3 days (following an initial shock at the 99.98% percentile) of £82,000. But the regulatory capital estimate is roughly 1.75 times this, at £143,900.

6. Summary and conclusions

Stress testing is an established component of daily risk management for portfolios exposed to market risk. Indeed many financial institutions are required by regulators to implement stress tests and their importance is to be extended under the new capital accord. However traditional stress tests can be criticised for being conducted outside the context of a risk model, hence the probability of an extreme outcome is unknown and many extreme yet plausible possibilities are ignored. Many stress tests also fail to incorporate the characteristics that markets are known to exhibit in crisis periods, namely, increased probability of further large movements, increased co-movement between markets, greater implied volatility and reduced liquidity.

The first objective of this paper was to identify the most suitable risk models in which to conduct a stress test. We considered eight relatively well-known and parsimonious risk models, of which four incorporate volatility clustering.
log-log plots that are nearly straight lines with slope 1/2 and we conclude that the square root of time rule is a suitable scaling law for the percentiles of currency returns distributions.

μ = (μ_1, . . ., μ_n) and σ² = (σ_1², . . ., σ_n²) then

\[ \mathrm{ETL}_\alpha(X) = \alpha^{-1}\left[p\,\varphi\bigl(\sigma_1^{-1}G_0^{-1}(\alpha)\bigr)\sigma_1 + (1-p)\,\varphi\bigl(\sigma_2^{-1}G_0^{-1}(\alpha)\bigr)\sigma_2\right] - \left[p\mu_1 + (1-p)\mu_2\right], \qquad (12) \]

where φ is the standard normal density function and G_0 is the distribution function for a mixture of two zero-mean normal distributions with standard deviations σ_1 and σ_2.

Proof. (i) The result follows on showing that the ETL in a standardised Student’s t-distribution with ν degrees of freedom, unit standard deviation and zero mean is given by

\[ \alpha^{-1}(\nu-1)^{-1}\bigl(\nu-2+T_\nu^{-1}(\alpha)^2\bigr)\,t_\nu\bigl(T_\nu^{-1}(\alpha)\bigr). \qquad (13) \]

Write t_ν(x) = A(1 + ax²)^b where

\[ A = ((\nu-2)\pi)^{-1/2}\,\Gamma(\nu/2)^{-1}\,\Gamma((\nu+1)/2); \quad a = (\nu-2)^{-1}; \quad \text{and } b = -(1+\nu)/2. \]

Then substituting y = 1 + ax²:

\[ \int_{-\infty}^{T_\nu^{-1}(\alpha)} x\,t_\nu(x)\,dx = A\int_{-\infty}^{T_\nu^{-1}(\alpha)} x(1+ax^2)^b\,dx = \frac{A}{2a}\int_{\infty}^{Y} y^b\,dy; \qquad Y = 1 + (\nu-2)^{-1}\,T_\nu^{-1}(\alpha)^2. \]

But

\[ \int_{\infty}^{Y} y^b\,dy = \frac{Y^{b+1}}{b+1} = \frac{2Y^{(1-\nu)/2}}{1-\nu} \quad \text{and} \quad A = t_\nu\bigl(T_\nu^{-1}(\alpha)\bigr)\,Y^{(1+\nu)/2} \]

so:

\[ \int_{-\infty}^{T_\nu^{-1}(\alpha)} x\,t_\nu(x)\,dx = \frac{t_\nu\bigl(T_\nu^{-1}(\alpha)\bigr)\,Y^{(1+\nu)/2}}{2(\nu-2)^{-1}} \cdot \frac{2Y^{(1-\nu)/2}}{1-\nu} = -(\nu-1)^{-1}(\nu-2)\,Y\,t_\nu\bigl(T_\nu^{-1}(\alpha)\bigr). \]

The test for unconditional coverage is

\[ LR = \frac{p_{exp}^{\,n_1}(1-p_{exp})^{n_0}}{p_{obs}^{\,n_1}(1-p_{obs})^{n_0}}; \qquad -2\ln LR \sim \chi^2_1, \]

where p_exp is the expected proportion and p_obs is the observed proportion of returns that lie in the prescribed interval of the distribution, n_1 is the number of returns that lie inside the interval (i.e., the number of violations) and n_0 is the number of returns that lie outside the interval (we can call these returns the ‘good’ returns).

The test for conditional coverage is

\[ LR = \frac{p_{exp}^{\,n_1}(1-p_{exp})^{n_0}}{p_{01}^{\,n_{01}}(1-p_{01})^{n_{00}}\,p_{11}^{\,n_{11}}(1-p_{11})^{n_{10}}}; \qquad -2\ln LR \sim \chi^2_2, \]

where n_10 is the number of times a violation is followed by a good return; n_11 is the number of times a violation is followed by another violation; n_01 is the number of times a good return is followed by a violation; and n_00 is the number of times a good return is followed by another good return. Also let

\[ p_{01} = \frac{n_{01}}{n_{00}+n_{01}} \quad \text{and} \quad p_{11} = \frac{n_{11}}{n_{10}+n_{11}}, \]

i.e. p_01 is the proportion of exceedances, given that the last return was a ‘good’ return, and p_11 is the proportion of exceedances, given that the last return was an exceedance.

The ETL test first computes the standardised exceedance residuals, r = σ̂_{t+h}^{-1}(−e_{t+h} − ETL_{h,α}) if e_{t+h} < −VaR_{h,α}, and r = 0 otherwise. The null hypothesis is that these have zero mean, against the alternative that their mean is positive. The distribution of the test statistic t = r̄/est.s.e.(r̄) is
found using the standard bootstrap simulation introduced by Efron and Tibshirani (1993, pp. 224–7).

References

Angelidis, T., Benos, A., Degiannakis, S., 2004. The use of GARCH models in VaR estimation. Statistical Methodology 1, 105–128.
Angelidis, T., Benos, A., 2006. Liquidity adjusted value-at-risk based on the components of the bid–ask spread. Applied Financial Economics 16 (11), 835–851.
Aragones, J., Blanco, C., Dowd, K., 2001. Incorporating stress tests into market risk modeling. Derivatives Quarterly. Spring 2001, 44–49.
Bangia, A., Diebold, F., Schuermann, T., Stroughair, J., 2002. Modeling liquidity risk with implications for traditional market risk measurement and management. In: Levich, R., Figlewski, S. (Eds.), Risk Management: State of the Art. Springer.
Barone-Adesi, G., Bourgoin, F., Giannopoulos, K., 1998. Don’t look back. Risk 11, 100–103.
Basel Committee on Banking Supervision, 1996. Amendment to the Capital Accord to Incorporate Market Risks.
Basel Committee on Banking Supervision, 2006. International Convergence of Capital Measurement and Capital Standards: A Revised Framework Comprehensive Version (Bank for International Settlements, Basel).
Bauwens, L., Laurent, S., Rombouts, J., 2004. Multivariate GARCH models: A Survey, CORE Discussion Paper (Louvain).
Berkowitz, J., 2000. A coherent framework for stress-testing. Journal of Risk 2, 1–11.
Berkowitz, J., Christoffersen, P., Pelletier, D., 2006. Evaluating VaR Models with Desk-level Data. McGill University.
Berkowitz, J., O’Brien, J., 2002. How accurate are value-at-risk models at commercial banks? Journal of Finance, 1093–1111.
Bollerslev, T., 1986. Generalised autoregressive conditional heteroskedasticity. Journal of Econometrics 31, 309–328.
Borio, C., 2000. Market liquidity and stress: Selected issues and policy implications. BIS Quarterly Review, 38–51.
Butler, J.S., Schachter, B., 1998. Estimating VaR with a precision measure by combining kernel estimation with historical simulation. Review of Derivatives Research 1, 371–390.
Chen, S., Tang, C., 2005. Nonparametric inference of value-at-risk for dependent financial returns. Journal of Financial Econometrics 3, 227–255.
Christoffersen, P., 1998. Evaluating interval forecasts. International Economic Review 39, 841–862.
Committee on the Global Financial System, 2005. A survey of stress tests and current practice at major financial institutions.
Danielsson, J., De Vries, C., 2000. Value-at-risk and extreme returns. In: Embrechts, Paul (Ed.), Extremes and Integrated Risk Management. Risk Books, London.
Danielsson, J., Zigrand, J.-P., 2004. On time-scaling of risk and the square-root-of-time rule, EFA 2004 Maastricht Meetings.
Deuskar, P., 2006. Extrapolative Expectations: Implications for Volatility and Liquidity. NYU Stern School of Business, New York.
Dowd, K., 2002. Measuring Market Risk. Wiley Finance.
Dowd, K., Rowe, D., 2004. Introduction to value-at-risk models. In: Alexander, C., Sheedy, E. (Eds.), The Professional Risk Manager’s Handbook. PRMIA Publications.
Efron, B., Tibshirani, R.J., 1993. An Introduction to the Bootstrap. Chapman & Hall, New York.
Engle, R., 2002. Dynamic conditional correlation: A simple class of multivariate generalised autoregressive conditional heteroskedasticity models. Journal of Business & Economic Statistics 20, 339–350.
Galati, G., Heath, A., McGuire, P., 2007. Evidence of carry trade activity. BIS Quarterly Review, 27–41.
Gencay, R., Selcuk, F., 2004. Extreme value theory and VaR: Relative performance in emerging markets. International Journal of Forecasting 20, 287–303.
Giot, P., Laurent, S., 2003. Value-at-risk for long and short trading positions. Journal of Applied Econometrics 18, 641–664.
Greenspan, A., 2000. Greenspan’s plea for stress testing. Risk 5.
Huisman, R., Koedijk, K., Pownall, R., 1998. VaR-x: Fat tails in financial risk management. Journal of Risk 1, 47–61.
Hull, J., 2006. Risk Management and Financial Institutions. Pearson Prentice Hall.
Jarrow, R., Subramanian, A., 1997. Mopping up liquidity. Risk 10, 170–173.
Jorion, P., 2007. Value at Risk. McGraw Hill.
Kole, E., Koedijk, K., Verbeek, M., 2007. Selecting copulas for risk management. Journal of Banking and Finance 31, 2405–2423.
Kupiec, P., 1995. Techniques for verifying the accuracy of risk measurement models. The Journal of Derivatives, 73–84.
Kupiec, P., 1998. Stress testing in a Value-at-Risk framework. Journal of Derivatives 6, 7–24.
Longin, F., 2000. From Value-at-Risk to stress testing: The extreme value approach. Journal of Banking and Finance 24, 1097–1130.
Longin, F., 2005. The choice of the distribution of asset returns: How extreme value theory can help? Journal of Banking and Finance 29, 1017–1035.
Lopez, J., 2005. Stress tests: Useful complements to financial risk models. FRBSF Economic Letter, 119–124.
McNeil, A., Frey, R., 2000. Estimation of tail-related risk measures for heteroscedastic financial time series: An extreme value approach. Journal of Empirical Finance 7, 271–300.
McNeil, A., Frey, R., Embrechts, P., 2005. Quantitative Risk Management. Princeton University Press.
Mittnik, S., Paolella, M., 2000. Conditional density and Value-at-Risk prediction of Asian currency exchange rates. Journal of Forecasting 19, 313–333.
Patton, A., 2006. Copula-Based Models for Financial Time Series. London School of Economics.
Perignon, C., Smith, D., 2006. The Level and Quality of Value-at-Risk Disclosure by Commercial Banks. Simon Fraser University, Vancouver.
Plerou, V., Gopikrishnan, P., Stanley, H.E., 2005. Quantifying fluctuations in market liquidity: Analysis of the bid–ask spread. Physical Review E 71, 046131.
Pritsker, M., 2006. The hidden dangers of historical simulation. Journal of Banking and Finance 30, 561–582.
Rowe, D., 2005. The new market risk challenge. Risk 18 (9), 103.
Bank for International Settlements, 2005. Triennial Central Bank Survey: Foreign Exchange and Derivatives Market Activity in 2004 (Bank for International Settlements, Basel).
Sheather, S., Marron, J.S., 1990. Kernel quantile estimators. Journal of the American Statistical Association 85, 410–416.
Silverman, B.W., 1986. Density Estimation for Statistics and Data Analysis. Chapman and Hall, London.
So, M., Yu, P., 2006. Empirical analysis of GARCH models in value at risk estimation. International Financial Markets, Institutions and Money 16, 180–197.
Straumann, D., 2004. Estimation in Conditionally Heteroscedastic Time Series Models. Lecture Notes in Statistics. Springer-Verlag, Berlin.
Venter, J., de Jongh, P., 2002. Risk estimation using the normal inverse Gaussian distribution. Journal of Risk 4, 1–5.
Vilasuso, J., 2002. Forecasting exchange rate volatility. Economics Letters 76, 59–64.