
6 Multivariate models

6.1 Basics of multivariate modelling

6.2 Normal mixture distributions

6.3 Spherical and elliptical distributions

6.4 Dimension reduction techniques

© QRM Tutorial Section 6


6.1 Basics of multivariate modelling
6.1.1 Random vectors and their distributions
Joint and marginal distributions
Let X = (X1 , . . . , Xd ) : Ω → Rd be a d-dimensional random vector
(representing risk-factor changes, risks, etc.).
The (joint) distribution function (df) F of X is
F (x) = FX (x) = P(X ≤ x) = P(X1 ≤ x1 , . . . , Xd ≤ xd ), x ∈ Rd .
The jth margin Fj of F or jth marginal df Fj of X is
Fj (xj ) = P(Xj ≤ xj )
= P(X1 ≤ ∞, . . . , Xj−1 ≤ ∞, Xj ≤ xj , Xj+1 ≤ ∞, . . . , Xd ≤ ∞)
= F (∞, . . . , ∞, xj , ∞, . . . , ∞), xj ∈ R, j ∈ {1, . . . , d}.
(interpreted as a limit).
© QRM Tutorial Section 6.1
Similarly for k-dimensional margins. Suppose we partition X into
(X1′ , X2′ )′ , where X1 = (X1 , . . . , Xk )′ and X2 = (Xk+1 , . . . , Xd )′ ,
then the marginal distribution function of X1 is
FX1 (x1 ) = P(X1 ≤ x1 ) = F (x1 , . . . , xk , ∞, . . . , ∞).
F is absolutely continuous if
    F(x) = ∫_{-∞}^{x_d} · · · ∫_{-∞}^{x_1} f(z_1, . . . , z_d) dz_1 . . . dz_d = ∫_{(-∞, x]} f(z) dz   (∗)
for some f ≥ 0 known as the (joint) density of X (or F). Similarly, the jth marginal
df Fj is absolutely continuous if Fj(x) = ∫_{-∞}^{x} fj(z) dz for some fj ≥ 0 known as
the density of Xj (or Fj).
In case f exists, Fj(xj) = ∫_{-∞}^{x_j} ∫_{(-∞,∞)^{d-1}} f(z) dz_{-j} dzj = ∫_{-∞}^{x_j} fj(zj) dzj by (∗),
so that Fj is absolutely continuous with density fj(xj) given by
    fj(xj) = ∫_{-∞}^{∞} · · · ∫_{-∞}^{∞} f(z_1, . . . , z_{j-1}, xj, z_{j+1}, . . . , z_d) dz_1 . . . dz_{j-1} dz_{j+1} . . . dz_d
(d − 1-many integrals).
© QRM Tutorial Section 6.1.1
Existence of a joint density ⇒ Existence of marginal densities for all
k-dimensional marginals, 1 ≤ k ≤ d − 1. The converse is false in general
(counter-examples can be constructed with copulas; see Chapter 7).
By replacing integrals by sums, one obtains similar formulas for the
discrete case, in which the notion of densities is replaced by probability
mass functions.
We sometimes work with the survival function F̄ of X,
F̄ (x) = F̄X (x) = P(X > x) = P(X1 > x1 , . . . , Xd > xd ), x ∈ Rd ,
with corresponding jth marginal survival function F̄j
F̄j (xj ) = P(Xj > xj )
= F̄ (−∞, . . . , −∞, xj , −∞, . . . , −∞), xj ∈ R, j ∈ {1, . . . , d}.
Note that F̄(x) ≠ 1 − F(x) in general (unless d = 1), since, by the law of total probability,
    F̄(x1, x2) = P(X1 > x1, X2 > x2) = P(X1 > x1) − P(X1 > x1, X2 ≤ x2)
              = 1 − P(X1 ≤ x1) − (P(X2 ≤ x2) − P(X1 ≤ x1, X2 ≤ x2))
              = 1 − F1(x1) − F2(x2) + F(x1, x2) ≠ 1 − F(x1, x2).
© QRM Tutorial Section 6.1.1
Conditional distributions and independence
A multivariate model for risks X in the form of a joint df, survival
function or density, implicitly describes the dependence of X1 , . . . , Xd .
We can then make statements about conditional probabilities.
As before, consider X = (X1′ , X2′ ) ∼ F . The conditional df of
X2 given X1 = x1 is FX2 |X1 (x2 | x1 ) = P(X2 ≤ x2 | X1 = x1 ) =
E(I{X2 ≤x2 } | X1 = x1 ), where E( · | · ) denotes conditional expectation
(not discussed here).
A useful identity for conditional dfs is
    FX1,X2(x1, x2) = ∫_{(-∞, x1]} FX2|X1(x2 | z) dFX1(z);   (16)
see the appendix for a proof.
◮ If x1 → ∞, then FX2(x2) = ∫_{R^k} FX2|X1(x2 | z) dFX1(z).
◮ If F has a density f, then fX2(x2) = ∫_{R^k} fX2|X1(x2 | z) dFX1(z).

© QRM Tutorial Section 6.1.1


If F has density f and fX1 denotes the density of X1, then
    f(x1, x2) = ∂²F(x1, x2)/(∂x2 ∂x1) = ∂/∂x2 [FX2|X1(x2 | x1) fX1(x1)]   (by (16))
              = fX2|X1(x2 | x1) fX1(x1).
We call
    fX2|X1(x2 | x1) = f(x1, x2) / fX1(x1)
the conditional density of X2 given X1 = x1. In this case, the conditional df FX2|X1(x2 | x1) is given by
    FX2|X1(x2 | x1) = ∫_{-∞}^{x_{k+1}} · · · ∫_{-∞}^{x_d} fX2|X1(z_{k+1}, . . . , z_d | x1) dz_{k+1} . . . dz_d.

X1, X2 are independent if F(x1, x2) = FX1(x1) FX2(x2) for all x1, x2 (if F has density f,
then X1, X2 are independent if f(x1, x2) = fX1(x1) fX2(x2) for all x1, x2). In this case,
fX2|X1(x2 | x1) = fX2(x2).

© QRM Tutorial Section 6.1.1


The components X1, . . . , Xd of X are (mutually) independent if F(x) = ∏_{j=1}^d Fj(xj)
for all x (if F has density f, then X1, . . . , Xd are independent if f(x) = ∏_{j=1}^d fj(xj)
for all x).

Moments and characteristic function


If E|Xj | < ∞, j ∈ {1, . . . , d}, the mean vector of X is defined by
EX = (EX1 , . . . , EXd ).
One can show: X1, . . . , Xd independent ⇒ E(X1 · · · Xd) = ∏_{j=1}^d E(Xj).
If E(Xj2 ) < ∞ for all j, the covariance matrix of X is defined by
cov(X) = E((X − EX)(X − EX)′ ).
If we write Σ = cov(X), its (i, j)th element is
σij = Σij = cov(Xi , Xj ) = E((Xi − EXi )(Xj − EXj ))
= E(Xi Xj ) − E(Xi )E(Xj );
the diagonal elements are σjj = var(Xj ), j ∈ {1, . . . , d}.
© QRM Tutorial Section 6.1.1
X1, X2 independent ⇒ cov(X1, X2) = 0, but the converse does not hold (counter-example:
X1 ∼ U(−1, 1), X2 = X1² ⇒ cov(X1, X2) = E(X1³) − 0 · E(X1²) = 0, although X1 and X2
are clearly dependent).

The cross covariance matrix between two random vectors X, Y is defined


by cov(X, Y ) = E((X − EX)(Y − EY )′ ); note that cov(X, X) =
cov(X).
If E(Xj²) < ∞, j ∈ {1, . . . , d}, the correlation matrix of X is defined by the matrix
corr(X) with (i, j)th element
    corr(Xi, Xj) = cov(Xi, Xj) / √(var(Xi) var(Xj)),   i, j ∈ {1, . . . , d},
which lies in [−1, 1] with corr(Xi, Xj) = ±1 if and only if Xj = aXi + b a.s.
for some a ≠ 0 and b ∈ R.
Some properties of E(·) and cov(·, ·):
1) For all A ∈ Rk×d , b ∈ Rk :
◮ E(AX + b) = AEX + b;
© QRM Tutorial Section 6.1.1
◮ cov(AX + b) = A cov(X)A′ = AΣA′ ; if k = 1 (A = a′ ),

a′ Σa = cov(a′ X) = var(a′ X) ≥ 0, a ∈ Rd , (17)


i.e. covariance matrices are positive semidefinite.
◮ cov(X1 + X2 ) = cov(X1 ) + cov(X2 ) + 2 cov(X1 , X2 )

2) If Σ is a positive definite matrix (i.e. a′ Σa > 0 for all a ∈ Rd \{0}),


one can show that Σ is invertible.
3) A symmetric, positive (semi)definite Σ can be written as
       Σ = AA′   (Cholesky decomposition)   (18)
   for a lower triangular matrix A with Ajj > 0 (Ajj ≥ 0) for all j. A is known as the
   Cholesky factor (and is also denoted by Σ^{1/2}).
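For concreteness, a minimal NumPy sketch of (18); the 2×2 matrix below is an assumed example, not from the slides:

```python
import numpy as np

# Assumed example covariance matrix (symmetric, positive definite)
Sigma = np.array([[1.0, -0.7],
                  [-0.7, 1.0]])

# Lower triangular Cholesky factor A with A A' = Sigma
A = np.linalg.cholesky(Sigma)
assert np.allclose(A @ A.T, Sigma)   # reproduces Sigma
```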
Properties of X can often be shown with the characteristic function
(cf)
φX (t) = E(exp(it′ X)), t ∈ Rd .
X1, . . . , Xd are independent ⇔ φX(t) = ∏_{j=1}^d φXj(tj) for all t.
© QRM Tutorial Section 6.1.1
Proposition 6.1 (Characterization of covariance matrices)
A symmetric matrix Σ is a covariance matrix if and only if it is positive
semidefinite.

Proof.
“⇒” As we have seen in (17), a covariance matrix Σ is positive semidefinite.
“⇐” Let Σ be positive semidefinite with Cholesky factor A. Let X be a random vector
with cov X = Id = diag(1, . . . , 1) (e.g. Xj ∼ N(0, 1) independently). Then
cov(AX) = A cov(X) A′ = AA′ = Σ, i.e. Σ is a covariance matrix (namely that of AX).

6.1.2 Standard estimators of covariance and correlation


Assume X1 , . . . , Xn ∼ F (daily/weekly/monthly/yearly risk-factor
changes) are serially uncorrelated (i.e. multivariate white noise) with
µ := EX1 , Σ := cov X1 and P = corr(X1 ).
© QRM Tutorial Section 6.1.2
Standard estimators of µ, Σ, P are
    X̄ = (1/n) ∑_{i=1}^n Xi   (sample mean),
    S = (1/n) ∑_{i=1}^n (Xi − X̄)(Xi − X̄)′   (sample covariance matrix),
    R = (Rij) with Rij = Sij / √(Sii Sjj)   (sample correlation matrix).
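A minimal NumPy sketch of these three estimators (the simulated input data and all names are illustrative assumptions):

```python
import numpy as np

def standard_estimators(X):
    """X: (n, d) array of serially uncorrelated risk-factor changes."""
    n = X.shape[0]
    xbar = X.mean(axis=0)                 # sample mean
    centred = X - xbar
    S = centred.T @ centred / n           # sample covariance matrix (divisor n)
    sd = np.sqrt(np.diag(S))
    R = S / np.outer(sd, sd)              # sample correlation matrix
    return xbar, S, R

# Assumed example data
rng = np.random.default_rng(1)
X = rng.multivariate_normal([0, 0], [[1, 0.5], [0.5, 2]], size=1000)
xbar, S, R = standard_estimators(X)
```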

Under joint normality (F multivariate normal), X̄, S and R are also the MLEs. S is biased,
but an unbiased version can be obtained by
    Sn = (n/(n − 1)) S.
Clearly, X̄ is unbiased.
Unbiasedness of Sn follows from two observations:

© QRM Tutorial Section 6.1.2


◮ Since the Xi's are uncorrelated,
    cov(X̄) = E((X̄ − µ)(X̄ − µ)′)
            = (1/n²) E( (∑_{j=1}^n (Xj − µ)) (∑_{k=1}^n (Xk − µ))′ )
            = (1/n²) ∑_{j,k=1}^n E((Xj − µ)(Xk − µ)′) = Σ/n   (Xi's uncorrelated).
◮ Note that
    E((Xi − µ)(X̄ − µ)′) = (1/n) ∑_{k=1}^n E((Xk − µ)(X̄ − µ)′)
                          = E( (1/n) ∑_{k=1}^n (Xk − µ)(X̄ − µ)′ )
                          = E((X̄ − µ)(X̄ − µ)′) = cov(X̄) = Σ/n.

© QRM Tutorial Section 6.1.2


This implies that Sn is unbiased since
    E(Sn) = (1/(n − 1)) ∑_{i=1}^n E((Xi − X̄)(Xi − X̄)′)
          = (1/(n − 1)) ∑_{i=1}^n E( ((Xi − µ) − (X̄ − µ)) ((Xi − µ) − (X̄ − µ))′ )
          = (1/(n − 1)) ∑_{i=1}^n (Σ − Σ/n − Σ/n + Σ/n) = (1/(n − 1)) ∑_{i=1}^n (Σ − Σ/n) = Σ.

© QRM Tutorial Section 6.1.2


6.1.3 The multivariate normal distribution
Definition 6.2 (Multivariate normal distribution)
X = (X1, . . . , Xd) has a multivariate normal (or Gaussian) distribution if
    X =_d µ + AZ,   (19)
where Z = (Z1, . . . , Zk) with Zl ∼ N(0, 1) independent, A ∈ R^{d×k}, µ ∈ R^d
(“=_d” denotes equality in distribution).

Typically k = d.
    EX = µ + A EZ = µ
    cov(X) = cov(µ + AZ) = A cov(Z) A′ = AA′ =: Σ

Proposition 6.3 (Cf of the multivariate normal distribution)


Let X be as in (19) and Σ = AA′. Then the cf of X is
    φX(t) = E(exp(it′X)) = exp(it′µ − ½ t′Σt),   t ∈ R^d.

© QRM Tutorial Section 6.1.3


Idea of proof. Using the fact that φZ(t) = exp(−t²/2) for Z ∼ N(0, 1) (see the appendix
for a proof), we obtain, with t̃′ = t′A, that
    φX(t) = E(exp(it′(µ + AZ))) = exp(it′µ) E(exp(it̃′Z))
          = exp(it′µ) ∏_{j=1}^k E(exp(i t̃j Zj))   (independence)
          = exp(it′µ − ½ ∑_{j=1}^k t̃j²)
          = exp(it′µ − ½ t̃′t̃) = exp(it′µ − ½ t′AA′t)
          = exp(it′µ − ½ t′Σt).
We see that the multivariate normal distribution is characterized by µ
and Σ, hence the notation X ∼ Nd (µ, Σ).
Nd (µ, Σ) can be characterized by univariate normal distributions.

© QRM Tutorial Section 6.1.3


Proposition 6.4 (Characterization of Nd (µ, Σ))
X ∼ Nd (µ, Σ) ⇐⇒ a′ X ∼ N(a′ µ, a′ Σa) ∀ a ∈ Rd .

Proof. “⇒” via uniqueness of cfs:
    φ_{a′X}(t) = E(exp(ita′X)) = E(exp(i(ta)′X)) = φX(ta)
               = exp(i(ta)′µ − ½ (ta)′Σ(ta)) = exp(ita′µ − (t²/2) a′Σa).
“⇐” via Corollary A.12.

Consequences:
Margins: X ∼ Nd(µ, Σ) ⇒ Xj ∼ N(µj, Σjj), j ∈ {1, . . . , d} (take a = ej).
Sums: X ∼ Nd(µ, Σ) ⇒ ∑_{j=1}^d Xj ∼ N(∑_{j=1}^d µj, ∑_{i,j=1}^d Σij) (take a = 1).

© QRM Tutorial Section 6.1.3


Proposition 6.5 (Density)
Let X ∼ Nd (µ, Σ) with rank A = k = d (⇒ Σ pos. definite, invertible).
By the density transformation theorem, X can be shown to have density
 
    fX(x) = (1/((2π)^{d/2} √(det Σ))) exp(−½ (x − µ)′Σ⁻¹(x − µ)),   x ∈ R^d.

Consequences:
Sets of the form Sc = {x ∈ Rd : (x − µ)′ Σ−1 (x − µ) = c}, c > 0,
describe points of equal density. Contours of equal density are thus
ellipsoids. Whenever a multivariate density fX (x) depends on x only
through the quadratic form (x − µ)′ Σ−1 (x − µ), it is the density of an
elliptical distribution (see later).
The components of X ∼ Nd (µ, Σ) are mutually independent if and only
if Σ is diagonal, i.e. if and only if the components of X are uncorrelated.

© QRM Tutorial Section 6.1.3


[Figure: perspective plots of the joint density f(x1, x2) and the corresponding contour
lines. Left: Nd(µ, Σ) for µ = (0, 0)′, Σ = ((1, −0.7), (−0.7, 1)). Right: tν(µ, ((ν − 2)/ν)Σ),
ν = 4 (same mean and covariance matrix as on the left-hand side).]
© QRM Tutorial Section 6.1.3
The definition of Nd(µ, Σ) in terms of a stochastic representation (X =_d µ + AZ) directly
justifies the following sampling algorithm.

Algorithm 6.6 (Sampling Nd (µ, Σ))


Let X ∼ Nd (µ, Σ) with Σ symmetric and positive definite.
1) Compute the Cholesky factor A of Σ; see, e.g. Press et al. (1992).
2) Generate Zj ∼ N(0, 1), j ∈ {1, . . . , d}, independently.
3) Return X = µ + AZ, where Z = (Z1, . . . , Zd).
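A minimal NumPy sketch of Algorithm 6.6 (function name and parameter values are illustrative assumptions):

```python
import numpy as np

def rmvnorm(n, mu, Sigma, rng=None):
    """Sample n draws from N_d(mu, Sigma) via X = mu + A Z (Algorithm 6.6)."""
    rng = np.random.default_rng() if rng is None else rng
    mu = np.asarray(mu, dtype=float)
    A = np.linalg.cholesky(np.asarray(Sigma, dtype=float))  # 1) Cholesky factor of Sigma
    Z = rng.standard_normal((n, len(mu)))                   # 2) iid N(0,1) components
    return mu + Z @ A.T                                     # 3) X = mu + A Z, row-wise

# Assumed example parameters
X = rmvnorm(1000, mu=[0.0, 0.0], Sigma=[[1.0, -0.7], [-0.7, 1.0]])
```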

Further useful properties of multivariate normal distributions


Linear combinations
If X ∼ Nd (µ, Σ) and B ∈ Rk×d , b ∈ Rk , then
BX + b = B(µ + AZ) + b = (Bµ + b) + BAZ
∼ Nk (Bµ + b, BA(BA)′ ) = Nk (Bµ + b, BΣB ′ ).
Special case (see var.-cov. method, Proposition 6.4): b′ X ∼ N(b′ µ, b′ Σb).
© QRM Tutorial Section 6.1.3
Marginal dfs
Let X ∼ Nd(µ, Σ) and write X = (X1′, X2′)′, where X1 ∈ R^k, X2 ∈ R^{d−k}, and
µ = (µ1′, µ2′)′, Σ = ((Σ11, Σ12), (Σ21, Σ22)). Then
    X1 ∼ Nk(µ1, Σ11) and X2 ∼ N_{d−k}(µ2, Σ22).

Proof. Choose B = (Ik 0) ∈ R^{k×d} and B = (0 I_{d−k}) ∈ R^{(d−k)×d}, respectively.
Conditional distributions
Let X be as before and Σ be positive definite. One can show that
X2 | X1 = x1 ∼ Nd−k (µ2.1 , Σ22.1 ),
where µ2.1 = µ2 + Σ21 Σ11⁻¹ (x1 − µ1) and Σ22.1 = Σ22 − Σ21 Σ11⁻¹ Σ12.
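A minimal NumPy sketch computing µ2.1 and Σ22.1 for a given partition (the function name and interface are assumptions for illustration):

```python
import numpy as np

def conditional_normal(mu, Sigma, x1, k):
    """Parameters of X2 | X1 = x1 when X ~ N_d(mu, Sigma) and X1 is the first k components."""
    mu, Sigma = np.asarray(mu, float), np.asarray(Sigma, float)
    mu1, mu2 = mu[:k], mu[k:]
    S11, S12 = Sigma[:k, :k], Sigma[:k, k:]
    S21, S22 = Sigma[k:, :k], Sigma[k:, k:]
    S11_inv = np.linalg.inv(S11)
    mu_cond = mu2 + S21 @ S11_inv @ (np.asarray(x1, float) - mu1)   # mu_{2.1}
    Sigma_cond = S22 - S21 @ S11_inv @ S12                          # Sigma_{22.1}
    return mu_cond, Sigma_cond
```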
Quadratic forms
Let X ∼ Nd (µ, Σ) and Σ be positive definite with Cholesky factor A.
Furthermore, let Z = A⁻¹(X − µ). Then Z ∼ Nd(0, Id). Moreover,
    (X − µ)′Σ⁻¹(X − µ) = Z′Z ∼ χ²_d,   (20)
© QRM Tutorial Section 6.1.3
which is useful for (goodness-of-fit) testing of Nd(µ, Σ): we can check whether the squared
Mahalanobis distances Di² = (Xi − X̄)′S⁻¹(Xi − X̄), i ∈ {1, . . . , n}, form a(n approximate)
sample from χ²_d.
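A minimal NumPy/SciPy sketch of this goodness-of-fit check via the squared Mahalanobis distances (using divisor n in S is a choice made here; for large n the difference to n − 1 is negligible):

```python
import numpy as np
from scipy import stats

def mahalanobis_sq(X):
    """Squared Mahalanobis distances D_i^2 = (X_i - Xbar)' S^{-1} (X_i - Xbar)."""
    centred = X - X.mean(axis=0)
    S = centred.T @ centred / X.shape[0]        # sample covariance (divisor n)
    S_inv = np.linalg.inv(S)
    return np.einsum('ij,jk,ik->i', centred, S_inv, centred)

def chi2_qq_points(X):
    """Points for a Q-Q plot of the ordered D_i^2 against chi^2_d quantiles."""
    D2 = np.sort(mahalanobis_sq(X))
    n, d = X.shape
    q = stats.chi2.ppf((np.arange(1, n + 1) - 0.5) / n, df=d)
    return q, D2
```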
Convolutions
Let X ∼ Nd (µ, Σ) and Y ∼ Nd (µ̃, Σ̃) be independent. Via cfs it is
then an exercise to show that
X + Y ∼ Nd (µ + µ̃, Σ + Σ̃).

6.1.4 Testing multivariate normality


For testing univariate normality, all tests of Section 3.1.2 can be applied.
Now consider multivariate normality. By Proposition 6.4,
X1, . . . , Xn iid ∼ Nd(µ, Σ) ⇒ a′X1, . . . , a′Xn iid ∼ N(a′µ, a′Σa).

This can be tested statistically (for some a) with various goodness-of-fit


tests (e.g. Q-Q plots) used for univariate normality (however, for a = ej ,
© QRM Tutorial Section 6.1.4
j ∈ {1, . . . , d}, we would only test normality of the margins, not joint
normality). Alternatively, (20) can be used to test joint normality (see
Mardia’s test below).
Multivariate Shapiro–Wilk
Mardia’s test
◮ According to (20), if X ∼ Nd(µ, Σ) with Σ positive definite, then
  (X − µ)′Σ⁻¹(X − µ) ∼ χ²_d (can approximately be used in a Q-Q plot).
◮ Let Di² = (Xi − X̄)′S⁻¹(Xi − X̄) denote the squared Mahalanobis distances and
  Dij = (Xi − X̄)′S⁻¹(Xj − X̄) the Mahalanobis angles.
◮ Let bd = (1/n²) ∑_{i=1}^n ∑_{j=1}^n Dij³ and kd = (1/n) ∑_{i=1}^n Di⁴. Under the null
  hypothesis one can show that, asymptotically for n → ∞,
      (n/6) bd ∼ χ²_{d(d+1)(d+2)/6}   and   (kd − d(d + 2)) / √(8d(d + 2)/n) ∼ N(0, 1),
  which can be used for testing; see Joenssen and Vogel (2014).
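A minimal NumPy/SciPy sketch of Mardia's statistics and their asymptotic p-values (the upper-tail p-value for the kurtosis statistic is an assumption made here about the direction of the test; the slides do not spell this out):

```python
import numpy as np
from scipy import stats

def mardia_test(X):
    """Mardia's multivariate skewness b_d and kurtosis k_d with asymptotic p-values."""
    n, d = X.shape
    centred = X - X.mean(axis=0)
    S_inv = np.linalg.inv(centred.T @ centred / n)   # sample covariance (divisor n)
    D = centred @ S_inv @ centred.T                  # D[i, j] = Mahalanobis angles D_ij
    b = (D ** 3).sum() / n ** 2                      # b_d = (1/n^2) sum_i sum_j D_ij^3
    k = (np.diag(D) ** 2).sum() / n                  # k_d = (1/n) sum_i D_i^4
    p_skew = stats.chi2.sf(n * b / 6, df=d * (d + 1) * (d + 2) / 6)
    z_kurt = (k - d * (d + 2)) / np.sqrt(8 * d * (d + 2) / n)
    p_kurt = stats.norm.sf(z_kurt)                   # upper tail (excess kurtosis)
    return b, p_skew, k, p_kurt
```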
© QRM Tutorial Section 6.1.4
Example 6.7 (Multivariate (non-)normality of 10 Dow Jones stocks)
We apply Mardia’s test (of multivariate skewness and kurtosis) to
daily/weekly/monthly/quarterly log-returns of 10 (of the 30) Dow Jones
stocks from 1993–2000.
Daily Weekly Monthly Quarterly
n 2020 416 96 32
b10 9.31 9.91 21.10 50.10
p-value 0.00 0.00 0.00 0.02
k10 242.45 177.04 142.65 120.83
p-value 0.00 0.00 0.00 0.44

⇒ Daily/weekly/monthly data shows evidence against joint normality.


For quarterly data, a CLT effect seems to take place (but too little
data to say more) and there is still evidence against joint normality.

We can also compare the Di² data to a χ²₁₀ distribution graphically using a Q-Q plot.
© QRM Tutorial Section 6.1.4
[Figure: Q-Q plots of the ordered Di² data against χ²₁₀ quantiles for (a) daily, (b) weekly,
(c) monthly and (d) quarterly data.]

© QRM Tutorial Section 6.1.4


Example 6.8 (Simulated data vs BMW–Siemens)
Is the BMW–Siemens data (see Section 3.2.2) jointly normal?

[Figure: scatter plots of simulated data from the fitted multivariate normal (left; axes X1, X2)
and of the real risk-factor changes (right; axes BMW, SIEMENS).]

© QRM Tutorial Section 6.1.4


Considering the first margin only:

[Figure: Q-Q plots of margin 1 against N(0,1) quantiles for the simulated data (left) and the
real data (right).]

© QRM Tutorial Section 6.1.4


Considering the second margin only:

[Figure: Q-Q plots of margin 2 against N(0,1) quantiles for the simulated data (left) and the
real data (right).]

© QRM Tutorial Section 6.1.4


Q-Q plot of the simulated (left) and real (right) Di²'s against a χ²₂ distribution:
[Figure: Q-Q plots of the squared Mahalanobis distances against χ²₂ quantiles for the simulated
and the real data.]

© QRM Tutorial Section 6.1.4


Advantages of Nd (µ, Σ)
Distribution is determined by µ and Σ.
Inference is thus “easy”.
Linear combinations are normal (⇒ VaRα and ESα calculations for
portfolios are easy).
Marginal distributions are normal.
Conditional distributions are normal.
Quadratic forms are (theoretically) chi-squared.
Convolutions are normal.
Sampling is straightforward.
Independence and uncorrelatedness are equivalent.

© QRM Tutorial Section 6.1.4


Drawbacks of Nd (µ, Σ) for modelling risk-factor changes
1) Tails of univariate (normal) margins are too thin (generate too few
extreme events).
2) Joint tails are too thin (too few joint extreme events). Nd (µ, Σ) cannot
capture the notion of tail dependence (see Chapters 3 and 7).
3) Strong symmetry known as radial symmetry: X is radially symmetric about µ if
   X − µ =_d µ − X. This is true for Nd(µ, Σ) since Z =_d −Z.
Short outlook:
Normal variance mixtures (or, more generally, elliptical distributions)
can address 1) and 2) while sharing many of the desirable properties of
Nd (µ, Σ).
Normal mean-variance mixtures can also address 3) (but at the expense
of ellipticality and thus tractability in comparison to Nd (µ, Σ)).

© QRM Tutorial Section 6.1.4


6.2 Normal mixture distributions
Idea: Randomize Σ (and possibly µ) with a non-negative rv W .

6.2.1 Normal variance mixtures


Definition 6.9 (Multivariate normal variance mixtures)
The random vector X has a (multivariate) normal variance mixture distribution if
    X =_d µ + √W AZ,
where Z ∼ Nk(0, Ik), W ≥ 0 is a rv independent of Z, A ∈ R^{d×k}, and µ ∈ R^d. µ is called
the location vector and Σ = AA′ the scale (or dispersion) matrix.

Observe that (X | W = w) =_d µ + √w AZ ∼ Nd(µ, wAA′) = Nd(µ, wΣ), or (X | W) ∼ Nd(µ, WΣ).
W can be interpreted as a shock affecting the variances of all risk factors.
© QRM Tutorial Section 6.2
Properties of multivariate normal variance mixtures
Let X = µ + √W AZ and Y = µ + AZ. Assume that rank(A) = d ≤ k and that Σ is positive definite.
If E√W < ∞, then E(X) = µ + E(√W) A E(Z) = µ + 0 = µ = EY (using the independence of W and Z).
If EW < ∞, then
    cov(X) = cov(√W AZ) = E((√W AZ)(√W AZ)′)
           = E(W) · E(AZZ′A′) = E(W) · A E(ZZ′) A′   (by independence)
           = E(W) A Ik A′ = E(W)Σ ≠ Σ (= cov(Y)) in general.
However, if they exist (i.e. if EW < ∞), corr(X) = corr(Y) since
    corr(Xi, Xj) = cov(Xi, Xj) / √(var(Xi) var(Xj)) = E(W)Σij / √(E(W)Σii E(W)Σjj)
                 = Σij / √(Σii Σjj) = corr(Yi, Yj),   i, j ∈ {1, . . . , d}.
© QRM Tutorial Section 6.2.1
Lemma 6.10 (Independence in normal variance mixtures)

Let X = µ + W Id Z with EW < ∞ (uncorrelated normal variance
mixture). Then
Xi and Xj are independent ⇐⇒ W is a.s. constant (i.e. X is normal).

See the appendix for a proof. Intuitively, W affects all components of X


and thus creates dependence (unless it is constant).
Characteristic function: Recall that if Y ∼ Nd(µ, Σ), then φY(t) = exp(it′µ − ½ t′Σt).
The cf of a multivariate normal variance mixture is
    φX(t) = E(exp(it′X)) = E( E(exp(it′X) | W) )
          = E(exp(it′µ − ½ W t′Σt)) = exp(it′µ) E(exp(−W ½ t′Σt)).
This depends on the Laplace–Stieltjes transform F̂W(θ) = E(exp(−θW)) = ∫_0^∞ e^{−θw} dFW(w)
of FW. We thus introduce the notation X ∼ Md(µ, Σ, F̂W) for a d-dimensional multivariate
normal variance mixture.
© QRM Tutorial Section 6.2.1
Density: If Σ is positive definite and P(W = 0) = 0, the density of X is
    fX(x) = ∫_0^∞ f_{X|W}(x | w) dFW(w)
          = ∫_0^∞ (1/√((2πw)^d det(Σ))) exp(−(x − µ)′Σ⁻¹(x − µ)/(2w)) dFW(w).
⇒ It only depends on x through (x − µ)′Σ⁻¹(x − µ).
⇒ Multivariate normal variance mixtures are elliptical distributions.
If Σ is diagonal and EW < ∞, the components of X are uncorrelated (as cov(X) = E(W)Σ)
but not independent unless W is constant a.s. (see the stochastic representation).
Linear combinations: For X ∼ Md (µ, Σ, F̂W ) and Y = BX + b,
where B ∈ Rk×d and b ∈ Rk , we have Y ∼ Mk (Bµ + b, BΣB ′ , F̂W );
this can be shown via cfs. If a ∈ Rd (b = 0, B = a′ ∈ R1×d ),
a′ X ∼ M1 (a′ µ, a′ Σa, F̂W ).

© QRM Tutorial Section 6.2.1


Sampling:

Algorithm 6.11 (Simulation of X = µ + √W AZ ∼ Md(µ, Σ, F̂W))
1) Generate Z ∼ Nd(0, Id).
2) Generate W ∼ FW (with LS transform F̂W), independent of Z.
3) Compute the Cholesky factor A (such that AA′ = Σ).
4) Return X = µ + √W AZ.

Example 6.12 (td(ν, µ, Σ) distribution)
For Step 2), use W ∼ Ig(ν/2, ν/2) (either via W = ν/V for V ∼ χ²_ν or W = 1/V for
V ∼ Γ(ν/2, ν/2); the Γ(α, β) density is f(x) = β^α x^{α−1} e^{−βx}/Γ(α)).
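A minimal NumPy sketch of Algorithm 6.11 specialized to Example 6.12 (function name and parameter values are illustrative assumptions):

```python
import numpy as np

def rmvt(n, nu, mu, Sigma, rng=None):
    """Sample from t_d(nu, mu, Sigma) via X = mu + sqrt(W) A Z with W ~ Ig(nu/2, nu/2)."""
    rng = np.random.default_rng() if rng is None else rng
    mu = np.asarray(mu, dtype=float)
    A = np.linalg.cholesky(np.asarray(Sigma, dtype=float))  # Cholesky factor of Sigma
    Z = rng.standard_normal((n, len(mu)))                   # Z ~ N_d(0, I_d)
    W = nu / rng.chisquare(nu, size=n)                      # W = nu / V, V ~ chi^2_nu
    return mu + np.sqrt(W)[:, None] * (Z @ A.T)

# Assumed example: bivariate t with nu = 4
X = rmvt(1000, nu=4, mu=[0.0, 0.0], Sigma=[[1.0, -0.7], [-0.7, 1.0]])
```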

© QRM Tutorial Section 6.2.1


Examples of multivariate normal variance mixtures
Multivariate normal distribution
W = 1 a.s. (degenerate case)
Two point mixture
    W = w1 with probability p and W = w2 with probability 1 − p,   w1, w2 > 0, w1 ≠ w2.
Can be used to model ordinary and stress regimes; extends to k regimes.


Symmetric generalised hyperbolic distribution
W has a generalised inverse Gaussian distribution (GIG); see MFE (2015,
p. 187).
Multivariate t distribution
W has an inverse gamma distribution W = 1/V for V ∼ Γ(ν/2, ν/2).
◮ E(W) = ν/(ν − 2) ⇒ cov(X) = (ν/(ν − 2)) Σ. For finite variances/correlations, ν > 2 is
  required. For finite mean, ν > 1 is required.
© QRM Tutorial Section 6.2.1
◮ The density of the multivariate t distribution is given by
      fX(x) = (Γ((ν + d)/2) / (Γ(ν/2) (νπ)^{d/2} |Σ|^{1/2})) (1 + (x − µ)′Σ⁻¹(x − µ)/ν)^{−(ν+d)/2},
  where µ ∈ R^d, Σ ∈ R^{d×d} is a positive definite matrix, and ν is the degrees of
  freedom. Notation: X ∼ td(ν, µ, Σ).
◮ td (ν, µ, Σ) has heavier marginal and joint tails than Nd (µ, Σ).
◮ BMW–Siemens data; simulations from fitted Nd (µ, Σ) and td (3, µ, Σ):
[Figure: three scatter plots (axes BMW vs SIE) comparing the real BMW–Siemens data with
simulated samples from the fitted Nd(µ, Σ) and td(3, µ, Σ) models.]

© QRM Tutorial Section 6.2.1


6.2.2 Normal mean-variance mixtures
Radial symmetry implies that all one-dimensional margins of normal
variance mixtures are symmetric.
Often visible in data: joint losses have heavier tails than joint gains.
Idea: Introduce asymmetry by mixing normal distributions with different
means and variances.

X has a (multivariate) normal mean-variance mixture distribution if
    X =_d m(W) + √W AZ,   (21)
where
where
Z ∼ Nk (0, Ik );
W ≥ 0 is a scalar random variable which is independent of Z;
A ∈ Rd×k is a matrix of constants;
m : [0, ∞) → Rd is a measurable function.
© QRM Tutorial Section 6.2.2
Normal mean-variance mixtures add skewness: Let Σ = AA′ and
observe that X | W = w ∼ Nd (m(w), wΣ). In general, they are no
longer elliptical (see later).
Example 6.13
Suppose we have m(W ) = µ + W γ. Since
E(X | W ) = µ + W γ,
cov(X | W ) = W Σ
we have
EX = E(E(X | W )) = µ + E(W )γ if EW < ∞,
cov(X) = E(cov(X | W )) + cov(E(X | W ))
= E(W )Σ + var(W )γγ ′ if E(W 2 ) < ∞.
If W has a GIG distribution, then X follows a generalised hyperbolic
distribution. γ = 0 leads to (elliptical) normal variance mixtures; see
MFE (2015, Sections 6.2.3) for details.
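A minimal NumPy sketch of sampling from (21) with m(W) = µ + Wγ (the mixing distribution passed in via rW and all parameter values are assumptions for illustration):

```python
import numpy as np

def rnmvm(n, mu, gamma, Sigma, rW, rng=None):
    """Sample X = mu + W*gamma + sqrt(W) A Z, a normal mean-variance mixture."""
    rng = np.random.default_rng() if rng is None else rng
    mu, gamma = np.asarray(mu, float), np.asarray(gamma, float)
    A = np.linalg.cholesky(np.asarray(Sigma, float))
    Z = rng.standard_normal((n, len(mu)))
    W = rW(n, rng)                                   # user-supplied sampler for W >= 0
    return mu + W[:, None] * gamma + np.sqrt(W)[:, None] * (Z @ A.T)

# Assumed example with an inverse-gamma W (gamma != 0 introduces skewness)
X = rnmvm(1000, mu=[0.0, 0.0], gamma=[0.5, -0.2],
          Sigma=[[1.0, 0.3], [0.3, 1.0]],
          rW=lambda n, rng: 4 / rng.chisquare(4, size=n))
```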
© QRM Tutorial Section 6.2.2
6.3 Spherical and elliptical distributions
Empirical examples (see MFE (2015, Sections 6.2.4)) show that
1) Md (µ, Σ, F̂W ) (e.g. multivariate t) provide superior models to Nd (µ, Σ)
for daily/weekly stock-return data;
2) the more general skewed normal mean-variance mixture distributions
offer only a modest improvement.
We study elliptical distributions, a generalization of Md (µ, Σ, F̂W ).

6.3.1 Spherical distributions

Definition 6.14 (Spherical distribution)


A random vector Y = (Y1 , . . . , Yd ) has a spherical distribution if for
every orthogonal U ∈ Rd×d (i.e. U ∈ Rd×d with U U ′ = U ′ U = Id )
    Y =_d UY   (i.e. Y is distributionally invariant under rotations and reflections).

© QRM Tutorial Section 6.3


Theorem 6.15 (Characterization of spherical distributions)
Let ‖t‖ = (t1² + · · · + td²)^{1/2}, t ∈ R^d. The following are equivalent:
1) Y is spherical (notation: Y ∼ Sd(ψ) for ψ as below).
2) There exists a characteristic generator ψ : [0, ∞) → R such that
   φY(t) = E(exp(it′Y)) = ψ(‖t‖²) for all t ∈ R^d.
3) For every a ∈ R^d, a′Y =_d ‖a‖Y1 (linear combinations are of the same type).
⇒ Subadditivity of VaRα for jointly elliptical losses

Theorem 6.16 (Stochastic representation)


Y ∼ Sd(ψ) if and only if Y =_d RS for an independent radial part R ≥ 0 and
S ∼ U({x ∈ R^d : ‖x‖ = 1}).

See the appendix for proofs for Theorems 6.15 and 6.16.
If Y has a density fY, it satisfies fY(y) = g(‖y‖²) for a function g : [0, ∞) → [0, ∞)
referred to as the density generator (i.e. fY is constant on spheres); see the appendix
for a proof.
© QRM Tutorial Section 6.3.1
Corollary 6.17
If Y ∼ Sd(ψ) and P(Y = 0) = 0, then (‖Y‖, Y/‖Y‖) =_d (R, S) since
    (‖Y‖, Y/‖Y‖) =_d (‖RS‖, RS/‖RS‖) = (|R|‖S‖, RS/(|R|‖S‖)) = (R, S).

⇒ ‖Y‖ and Y/‖Y‖ are independent (⇒ goodness-of-fit, sampling).

Example 6.18 (Standardized normal variance mixtures)


Y ∼ Md(0, Id, F̂W) is spherical (recall: Y =_d 0 + √W Id Z) since
    φY(t) = E(exp(it′√W Z)) = E( E(exp(i(t√W)′Z) | W) )
          = E(exp(−½ W t′t)) = F̂W(½ t′t) = F̂W(½ ‖t‖²),
so Y ∼ Sd(ψ) by Theorem 6.15 Part 2). We thus have ψ(t) = F̂W(t/2). For Y ∼ Nd(0, Id),
ψ(t) = exp(−t/2). By Corollary 6.17, simulating S ∼ U({x ∈ R^d : ‖x‖ = 1}) can thus be
done via S =_d Y/‖Y‖. Fang et al. (1990, pp. 50) show that ψ generates Sd(ψ) for all
d ∈ N if and only if it is the characteristic generator of a normal variance mixture.
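A minimal NumPy sketch of simulating S via S = Y/‖Y‖ and, building on it, a spherical sample Y = RS with R²/d ∼ F(d, ν) as in Example 6.21 further below (function names and ν are assumptions):

```python
import numpy as np

def runif_sphere(n, d, rng=None):
    """Sample S ~ U({x in R^d : ||x|| = 1}) via S = Y/||Y|| for Y ~ N_d(0, I_d)."""
    rng = np.random.default_rng() if rng is None else rng
    Y = rng.standard_normal((n, d))
    return Y / np.linalg.norm(Y, axis=1, keepdims=True)

def rspherical_t(n, d, nu, rng=None):
    """Spherical sample Y = R*S with R = sqrt(d*V), V ~ F(d, nu) (t-type radial part)."""
    rng = np.random.default_rng() if rng is None else rng
    S = runif_sphere(n, d, rng)
    R = np.sqrt(d * rng.f(d, nu, size=n))
    return R[:, None] * S
```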
© QRM Tutorial Section 6.3.1
Example 6.19 (R, S, cov, corr)
It follows from Y ∼ Nd(0, Id) and R² = ‖Y‖² = Y′Y ∼ χ²_d that
    0 = EY = E(R) E(S)   (Theorem 6.16)   ⇒ ES = 0,
    Id = cov Y = cov(RS) = E(R²) cov S = d cov S   (Theorem 6.16)   ⇒ cov S = Id/d.   (22)
For (spherically distributed) Y ∼ Sd(ψ) with E(R²) < ∞, it follows that
    cov Y = cov(RS) = E(R²) cov S = (E(R²)/d) Id   (Theorem 6.16 and (22))
and thus
    corr Y = (E(R²)/d) Id / √((E(R²)/d)(E(R²)/d)) = Id.
For (elliptically distributed; see below) X = µ + AY with E(R²) < ∞ and Cholesky factor A
of a covariance matrix Σ, we have cov X = (E(R²)/d) Σ and corr X = P (the correlation matrix
corresponding to Σ).
© QRM Tutorial Section 6.3.1
Example 6.20 (t distribution)
For Y ∼ td(ν, 0, Id), R² = Y′Y = W Z′Z for Z ∼ Nd(0, Id) (by Corollary 6.17 and Y =_d √W Z).
Thus
    R²/d = (Z′Z/d) / ((ν/W)/ν) =_d (χ²_d/d) / (χ²_ν/ν) ∼ F(d, ν)
and thus E(R²/d) = ν/(ν − 2).
This, together with Example 6.19, implies that X ∼ td(ν, µ, Σ) has cov X = (ν/(ν − 2)) Σ and
corr X = P (which we already know from Section 6.2.1); note that in the univariate case
X ∼ t(ν, µ, σ²) and var(X) = (ν/(ν − 2)) σ².
We also see that we can use a Q-Q plot of the order statistics of R²/d = ‖Y‖²/d versus the
theoretical quantiles of a (hypothesized) F(d, ν) distribution to check the goodness-of-fit of
the hypothesized t distribution (in any dimension).
See the appendix for the form of the density generator g.
© QRM Tutorial Section 6.3.1
Example 6.21 (Understanding spherical distributions)
n = 500 realizations of S (left) and Y = RS (right) for R =_d √(dV) with V ∼ F(d, ν),
d = 2, ν = 4 (as for the multivariate t distribution with ν = 4).
[Figure: scatter plots of the 500 realizations of S on the unit circle (left; axes S1, S2)
and of Y = RS (right; axes Y1, Y2).]

© QRM Tutorial Section 6.3.1


6.3.2 Elliptical distributions

Definition 6.22 (Elliptical distribution)
A random vector X = (X1, . . . , Xd) has an elliptical distribution if
    X =_d µ + AY   (multivariate affine transformation),
where Y ∼ Sk(ψ), A ∈ R^{d×k} (scale matrix Σ = AA′), and (location vector) µ ∈ R^d.

By Theorem 6.16, an elliptical random vector admits the stochastic representation
X =_d µ + RAS, with R and S as before.
The cf of an elliptical random vector X is
    φX(t) = E(exp(it′X)) = E(exp(it′(µ + AY))) = exp(it′µ) E(exp(i(A′t)′Y)) = exp(it′µ) ψ(t′Σt).
Notation: X ∼ Ed(µ, Σ, ψ) (= Ed(µ, cΣ, ψ(·/c)), c > 0).
If Σ is positive definite with Cholesky factor A, then X ∼ Ed(µ, Σ, ψ) if and only if
Y = A⁻¹(X − µ) ∼ Sd(ψ).
© QRM Tutorial Section 6.3.2
If X ∼ Ed(µ, Σ, ψ) with P(X = µ) = 0, then Y = A⁻¹(X − µ) ∼ Sd(ψ). Corollary 6.17 implies that
    ( √((X − µ)′Σ⁻¹(X − µ)),  A⁻¹(X − µ)/√((X − µ)′Σ⁻¹(X − µ)) ) =_d (R, S),   (23)
which can be used for testing elliptical symmetry.

Normal variance mixture distributions are elliptical (the most useful examples) since
    X =_d µ + √W AZ = µ + √W ‖Z‖ A Z/‖Z‖ = µ + RAS
with R = √W ‖Z‖ and S = Z/‖Z‖. By Corollary 6.17, R and S are indeed independent.

© QRM Tutorial Section 6.3.2


Example 6.23 (Understanding elliptical distributions)
n = 500 realizations of X = RAS (left) and X = µ + RAS (right) for R =_d √(dV) with
V ∼ F(d, ν), d = 2, ν = 4; recycling of the samples from Example 6.21.
[Figure: scatter plots of the 500 realizations of X = RAS (left) and X = µ + RAS (right);
axes X1, X2.]

© QRM Tutorial Section 6.3.2


6.3.3 Properties of elliptical distributions
Density: Let Σ be positive definite and Y ∼ Sd(ψ) have density generator g. The density
transformation theorem implies that X = µ + AY has density
    fX(x) = (1/√(det Σ)) g((x − µ)′Σ⁻¹(x − µ)),
which depends on x only through (x − µ)′Σ⁻¹(x − µ), i.e. is constant on ellipsoids (hence
the name “elliptical”).
Linear combinations: For X ∼ Ed (µ, Σ, ψ), B ∈ Rk×d and b ∈ Rk ,
BX + b ∼ Ek (Bµ + b, BΣB ′ , ψ) (via cfs).
If a ∈ Rd (take b = 0 and B = a′ ∈ R1×d ),
a′ X ∼ E1 (a′ µ, a′ Σa, ψ) (as for N(µ, Σ)). (24)
From a = ej = (0, . . . , 0, 1, 0, . . . , 0) we see that all marginal distribu-
tions are of the same type.
© QRM Tutorial Section 6.3.3
Marginal dfs: As for Nd (µ, Σ), it immediately follows that X =
(X1′ , X2′ )′ ∼ Ed (µ, Σ, ψ) satisfies X1 ∼ Ek (µ1 , Σ11 , ψ) and that X2 ∼
Ed−k (µ2 , Σ22 , ψ); i.e. margins of elliptical distributions are elliptical.
Conditional distributions: One can also show that conditional distri-
butions of elliptical distributions are elliptical; see Embrechts, McNeil,
et al. (2002). For Nd (µ, Σ) the characteristic generator remains the
same.
Quadratic forms: (23) implies that (X − µ)′Σ⁻¹(X − µ) =_d R². If X ∼ Nd(µ, Σ), R² ∼ χ²_d;
and if X ∼ td(ν, µ, Σ), R²/d ∼ F(d, ν).
Convolutions: Let X ∼ Ed (µ, Σ, ψ) and Y ∼ Ed (µ̃, cΣ, ψ̃) be inde-
pendent. Then aX + bY is elliptically distributed for a, b ∈ R, c > 0.
Conditional correlations remain invariant; see Proposition A.13.
Many (but not all) nice properties of Nd (µ, Σ) are preserved. The following
result shows why elliptical distributions are the “Garden of Eden” of QRM.

© QRM Tutorial Section 6.3.3


Proposition 6.24 (Subadditivity of VaR in elliptical models)
Let Li = λi′X, λi ∈ R^d, i ∈ {1, . . . , n}, with X ∼ Ed(µ, Σ, ψ). Then
    VaRα(∑_{i=1}^n Li) ≤ ∑_{i=1}^n VaRα(Li) for all α ∈ [1/2, 1].

Proof. Consider a generic L = λ′X =_d λ′µ + λ′AY for Y ∼ Sk(ψ). By Theorem 6.15 Part 3),
λ′AY =_d ‖λ′A‖Y1, so L =_d λ′µ + ‖λ′A‖Y1 (all Li's are of the same type). By translation
invariance and positive homogeneity,
    VaRα(L) = λ′µ + ‖λ′A‖ VaRα(Y1).   (25)
Applying (25) once to L = ∑_{i=1}^n Li = (∑_{i=1}^n λi)′X and to each L = Li = λi′X,
i ∈ {1, . . . , n}, and using that VaRα(Y1) ≥ 0 for α ∈ [1/2, 1], we obtain
    VaRα(∑_{i=1}^n Li) = ∑_{i=1}^n λi′µ + ‖(∑_{i=1}^n λi)′A‖ VaRα(Y1)
                       ≤ ∑_{i=1}^n λi′µ + (∑_{i=1}^n ‖λi′A‖) VaRα(Y1)
                       = ∑_{i=1}^n (λi′µ + ‖λi′A‖ VaRα(Y1)) = ∑_{i=1}^n VaRα(Li).
For λi = ei, VaRα(∑_{i=1}^n Xi) ≤ ∑_{i=1}^n VaRα(Xi).

© QRM Tutorial Section 6.3.3


6.4 Dimension reduction techniques
6.4.1 Factor models
Explain the variability of X in terms of common factors.

Definition 6.25 (p-factor model)


X follows a p-factor model if
X = a + BF + ε, (26)
where
1) B ∈ Rd×p is a matrix of factor loadings and a ∈ Rd ;
2) F = (F1 , . . . , Fp ) is the random vector of (common) factors with
p < d and Ω := cov(F ), (systematic risk);
3) ε = (ε1 , . . . , εd ) is the random vector of idiosyncratic error terms
with E(ε) = 0, Υ := cov(ε) diag., cov(F , ε) = (0) (idiosync. risk).

© QRM Tutorial Section 6.4


Goals: Identify or estimate Ft , t ∈ {1, . . . , n}, then model the dis-
tribution/dynamics of the (lower-dimensional) factors (instead of Xt ,
t ∈ {1, . . . , n}).
Factor models imply that Σ := cov(X) = BΩB ′ + Υ.
With B ∗ = BΩ1/2 and F ∗ = Ω−1/2 (F − E(F )), we have
X = µ + B ∗ F ∗ + ε,
where µ = E(X). We have Σ = B ∗ (B ∗ )′ +Υ. Conversely, if cov(X) =
BB ′ + Υ for some B ∈ Rd×p with rank(B) = p < d and diagonal
matrix Υ, then X has a factor-model representation for a p-dimensional
F and d-dimensional ε.
For a one-factor/equicorrelation example, see the appendix.

© QRM Tutorial Section 6.4.1


6.4.2 Statistical estimation strategies
Consider Xt = a + BFt + εt , t ∈ {1, . . . , n}. Three types of factor model
are commonly used:
1) Macroeconomic factor models: Here we assume that Ft is observable,
t ∈ {1, . . . , n}. Estimation of B, a is accomplished by time series
regression.
2) Fundamental factor models: Here we assume that the matrix of factor
loadings B is known but the factors Ft are unobserved (and have to be
estimated from Xt , t ∈ {1, . . . , n}, using cross-sectional regression at
each t).
3) Statistical factor models: Here we assume that neither the factors Ft nor the factor
   loadings B are observed (both have to be estimated from Xt, t ∈ {1, . . . , n}). The
   factors can be found with principal component analysis.

© QRM Tutorial Section 6.4.2


6.4.3 Estimating macroeconomic factor models
This is achieved by time series regression.

Univariate regression
Consider the (univariate) time series regression model
Xt,j = aj + b′j Ft + εt,j , t ∈ {1, . . . , n}.

To justify the use of the ordinary least-squares (OLS) method and to derive its statistical
properties, it is usually assumed that, conditional on the factors, the errors
ε1,j, . . . , εn,j form a white noise process (i.e. are identically distributed and serially
uncorrelated).
âj estimates aj , b̂j estimates the jth row of B.
Models can also be estimated simultaneously using multivariate regression;
see MFE (2015).

© QRM Tutorial Section 6.4.3


6.4.4 Estimating fundamental factor models
Consider the cross-sectional regression model Xt = BFt + εt (B known;
Ft to be estimated; cov(ε) = Υ); note that a can be absorbed into Ft .
To obtain precision in estimating Ft , we need d ≫ p.

First estimate Ft via OLS by F̂t^OLS = (B′B)⁻¹B′Xt. This is the best linear unbiased
estimator if ε is homoskedastic. However, it is possible to obtain linear unbiased
estimates with a smaller covariance matrix via generalized least squares (GLS); see the
sketch below.

To this end, estimate Υ by Υ̂ via the diagonal of the sample covariance matrix of the
residuals ε̂t = Xt − B F̂t^OLS, t ∈ {1, . . . , n}.

Then estimate Ft via F̂t^GLS = (B′Υ̂⁻¹B)⁻¹B′Υ̂⁻¹Xt.
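A minimal NumPy sketch of the OLS and GLS steps (the function name, the residual-variance estimate with divisor n, and the array shapes are assumptions for illustration):

```python
import numpy as np

def estimate_factors(X, B):
    """Cross-sectional factor estimation: X is (n, d) with rows X_t, B is (d, p) and known."""
    # OLS: F_t = (B'B)^{-1} B' X_t for every t
    F_ols = X @ B @ np.linalg.inv(B.T @ B)
    # Estimate the diagonal matrix Upsilon from the OLS residuals
    resid = X - F_ols @ B.T
    Ups_inv = np.diag(1.0 / resid.var(axis=0))
    # GLS: F_t = (B' Ups^{-1} B)^{-1} B' Ups^{-1} X_t
    F_gls = X @ Ups_inv @ B @ np.linalg.inv(B.T @ Ups_inv @ B)
    return F_ols, F_gls
```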

© QRM Tutorial Section 6.4.4


6.4.5 Principal component analysis
Goal: Reduce the dimensionality of highly correlated data by finding
a small number of uncorrelated linear combinations which account for
most of the variance in the data; this can be used for finding factors.
Key: Any symmetric A admits a spectral decomposition
A = ΓΛΓ′ ,
where
1) Λ = diag(λ1 , . . . , λd ) is the diagonal matrix of eigenvalues of A
which, w.l.o.g., are ordered so that λ1 ≥ λ2 ≥ · · · ≥ λd ; and
2) Γ is an orthogonal matrix whose columns are eigenvectors of A
standardized to have length 1.
Let Σ = ΓΛΓ′ with λ1 ≥ λ2 ≥ · · · ≥ λd ≥ 0 (positive semidefiniteness
⇒ all eigenvalues ≥ 0) and Y = Γ′ (X − µ) (the so-called principal
component transform). The jth component Yj = γj′ (X − µ) is the jth
principal component of X (where γj is the jth column of Γ).
© QRM Tutorial Section 6.4.5
We have EY = 0 and cov(Y ) = Γ′ ΣΓ = Γ′ ΓΛΓ′ Γ = Λ, so the principal
components are uncorrelated and var(Yj ) = λj , j ∈ {1, . . . , d}. The
principal components are thus ordered by decreasing variance.
One can show:
◮ The first principal component is that standardized linear combination
of X which has maximal variance among all such combinations, i.e.
var(γ1′ X) = max{var(a′ X) : a′ a = 1}.
◮ For j ∈ {2, . . . , d}, the jth principal component is that standardized
linear combination of X which has maximal variance among all such
linear combinations which are orthogonal to (and hence uncorrelated
with) the first j − 1-many linear combinations.
∑_{j=1}^d var(Yj) = ∑_{j=1}^d λj = trace(Σ) = ∑_{j=1}^d var(Xj), so we can interpret
∑_{j=1}^k λj / ∑_{j=1}^d λj as the fraction of total variance explained by the first k
principal components.
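A minimal NumPy sketch of the principal component transform based on the spectral decomposition of the sample covariance matrix (function name and the use of np.cov are choices made here):

```python
import numpy as np

def principal_components(X, k):
    """First k principal components of the rows of X and the explained-variance fraction."""
    mu = X.mean(axis=0)
    Sigma = np.cov(X, rowvar=False)
    eigval, Gamma = np.linalg.eigh(Sigma)        # ascending eigenvalues, orthonormal columns
    order = np.argsort(eigval)[::-1]             # reorder so lambda_1 >= ... >= lambda_d
    eigval, Gamma = eigval[order], Gamma[:, order]
    Y = (X - mu) @ Gamma                         # principal component transform Y = Gamma'(X - mu)
    explained = eigval[:k].sum() / eigval.sum()  # fraction of total variance explained
    return Y[:, :k], explained
```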

© QRM Tutorial Section 6.4.5


Principal components as factors
Inverting the principal component transform Y = Γ′ (X − µ), we have
X = µ + ΓY = µ + Γ1 Y1 + Γ2 Y2 =: µ + Γ1 Y1 + ε
where Y1 ∈ Rk contains the first k principal components. This is
reminiscent of the basic factor model.
Although ε1 , . . . , εd will tend to have small variances, the assumptions
of the factor model are generally violated (since they need not have
a diagonal covariance matrix and need not be uncorrelated with Y1 ).
Nevertheless, principal components are often interpreted as factors.
In principle, the same can be applied to the sample covariance matrix
to obtain the sample principal components; see the appendix.

© QRM Tutorial Section 6.4.5
