Multivariate Extreme Value Theory and D-Norms

Michael Falk
Fakultät für Mathematik und Informatik, Universität Würzburg, Würzburg, Germany

Springer Series in Operations Research and Financial Engineering
Series Editors: Thomas V. Mikosch, Sidney I. Resnick, Stephen M. Robinson
More information about this series at http://www.springer.com/series/3182

This Springer imprint is published by the registered company Springer Nature Switzerland AG.
The registered company address is: Gewerbestrasse 11, 6330 Cham, Switzerland
‘We do not want to calculate, we want to reveal structures.’
Contents

1 D-Norms
  1.1 Norms and D-Norms
  1.2 Examples of D-Norms
  1.3 Takahashi's Characterizations
  1.4 Convexity of the Set of D-Norms
  1.5 When Is an Arbitrary Norm a D-Norm?
  1.6 The Dual D-Norm Function
  1.7 Normed Generators Theorem
  1.8 Metrization of the Space of D-Norms
  1.9 Multiplication of D-Norms
  1.10 The Functional D-Norm
  1.11 D-Norms from a Functional Analysis Perspective
  1.12 D-Norms from a Stochastic Geometry Perspective
References
Index
1 D-Norms
A norm on $\mathbb{R}^d$ is a function $f$ satisfying

$$f(x) = 0 \iff x = 0 \in \mathbb{R}^d, \tag{1.1}$$
$$f(\lambda x) = |\lambda|\, f(x), \tag{1.2}$$
$$f(x + y) \le f(x) + f(y). \tag{1.3}$$

We write $\|x\| = f(x)$, $x \in \mathbb{R}^d$; every norm induces the metric $d(x, y) = \|x - y\|$, $x, y \in \mathbb{R}^d$. The prime example is

$$\|x\|_1 := \sum_{i=1}^d |x_i|, \qquad x = (x_1, \dots, x_d) \in \mathbb{R}^d,$$

and, more generally,

$$\|x\|_p := \left(\sum_{i=1}^d |x_i|^p\right)^{1/p}$$

actually defines a norm for each $p \ge 1$. This is the family of logistic norms. The particular case $p = 2$ is commonly called the Euclidean norm.
Although condition (1.1) and homogeneity (1.2) are obvious, the proof of the corresponding $\Delta$-inequality is a little challenging. The inequality

$$\left(\sum_{i=1}^d |x_i + y_i|^p\right)^{1/p} \le \left(\sum_{i=1}^d |x_i|^p\right)^{1/p} + \left(\sum_{i=1}^d |y_i|^p\right)^{1/p}$$

is Minkowski's inequality.
We see in (1.4) that these inequalities are maintained by the set of D-norms.
Lemma 1.1.2 We have, for $1 \le p \le q \le \infty$ and $x \in \mathbb{R}^d$,

(i) $\|x\|_p \ge \|x\|_q$,
(ii) $\lim_{p\to\infty} \|x\|_p = \|x\|_\infty$.
Proof. (i) This inequality is obvious for $q = \infty$: $\|x\|_\infty \le \left(\sum_{i=1}^d |x_i|^p\right)^{1/p}$. Consider now $1 \le p \le q < \infty$ and choose $x \ne 0 \in \mathbb{R}^d$. Put $S := \|x\|_p$. Then, we have

$$\left\|\frac{x}{S}\right\|_p = 1,$$

and we have to establish

$$\left\|\frac{x}{S}\right\|_q \le 1.$$

From

$$\frac{|x_i|}{S} \in [0, 1]$$

and, thus,

$$\left(\frac{|x_i|}{S}\right)^q \le \left(\frac{|x_i|}{S}\right)^p, \qquad 1 \le i \le d,$$

we obtain

$$\left\|\frac{x}{S}\right\|_q = \left(\sum_{i=1}^d \left(\frac{|x_i|}{S}\right)^q\right)^{1/q} \le \left(\sum_{i=1}^d \left(\frac{|x_i|}{S}\right)^p\right)^{1/q} = \left(\frac{\|x\|_p}{S}\right)^{p/q} = 1^{p/q} = 1,$$

which is (i).
(ii) We have, moreover, for $x \ne 0 \in \mathbb{R}^d$ and $p \in [1, \infty)$,

$$\|x\|_\infty \le \|x\|_p = \|x\|_\infty \left(\sum_{i=1}^d \left(\frac{|x_i|}{\|x\|_\infty}\right)^p\right)^{1/p} \le d^{1/p}\, \|x\|_\infty \to_{p\to\infty} \|x\|_\infty,$$
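Both statements of Lemma 1.1.2 are easy to check numerically. The following Python sketch (the helper names `lp_norm` and `sup_norm` are ours, not from the text) evaluates $\|x\|_p$ on a grid of exponents and confirms the monotone decrease in $p$ as well as the convergence to the sup-norm, using the bound $d^{1/p}\|x\|_\infty$ from the proof:

```python
import math

def lp_norm(x, p):
    """Logistic (p-)norm ||x||_p = (sum |x_i|^p)^(1/p)."""
    return sum(abs(xi) ** p for xi in x) ** (1.0 / p)

def sup_norm(x):
    """Sup-norm ||x||_inf = max |x_i|."""
    return max(abs(xi) for xi in x)

x = (1.0, -2.0, 3.0)
ps = [1, 1.5, 2, 4, 8, 16, 64, 256]
norms = [lp_norm(x, p) for p in ps]

# (i): ||x||_p is non-increasing in p and bounded below by ||x||_inf
assert all(a >= b for a, b in zip(norms, norms[1:]))
assert all(n >= sup_norm(x) for n in norms)

# (ii): ||x||_p -> ||x||_inf; the proof gives ||x||_p <= d^(1/p) * ||x||_inf
assert abs(lp_norm(x, 256) - sup_norm(x)) <= sup_norm(x) * (3 ** (1 / 256) - 1) + 1e-12
```

The last assertion is exactly the sandwich used in part (ii) of the proof, evaluated at $p = 256$ and $d = 3$.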
Let $A$ be a positive definite $d \times d$ matrix. Then,

$$\|x\|_A := \left(x^\top A x\right)^{1/2}, \qquad x \in \mathbb{R}^d,$$

defines a norm on $\mathbb{R}^d$. With

$$A = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix},$$

i.e., the identity matrix, the Euclidean norm results.
Definition of D-Norms

The following result introduces D-norms. Let $Z = (Z_1, \dots, Z_d)$ be a random vector with

$$Z_i \ge 0, \qquad E(Z_i) = 1, \qquad 1 \le i \le d.$$

Then,

$$\|x\|_D := E\left(\max_{1\le i\le d}(|x_i|\, Z_i)\right), \qquad x \in \mathbb{R}^d,$$

defines a norm, and $Z$ is called a generator of $\|\cdot\|_D$. The triangle inequality follows from $|x_i + y_i| \le |x_i| + |y_i|$, i.e.,

$$\|x + y\|_D = E\left(\max_{1\le i\le d}(|x_i + y_i|\, Z_i)\right) \le E\left(\max_{1\le i\le d}(|x_i|\, Z_i)\right) + E\left(\max_{1\le i\le d}(|y_i|\, Z_i)\right) = \|x\|_D + \|y\|_D.$$
Note that there are norms that are not monotone: choose, for example,

$$A = \begin{pmatrix} 1 & \delta \\ \delta & 1 \end{pmatrix}$$

with $\delta \in (-1, 0)$. The matrix $A$ is positive definite, but the norm $\|x\|_A = (x^\top A x)^{1/2} = (x_1^2 + 2\delta x_1 x_2 + x_2^2)^{1/2}$ is not monotone; put, for example, $\delta = -1/2$ and set $x_1 = 1$, $x_2 = 0$, $y_1 = 1$, and $y_2 = 1/2$. Then, $x \le y$, but

$$\|x\|_A = 1 > \|y\|_A = \sqrt{3}/2.$$
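The counterexample can be verified directly; a minimal sketch (the function name `norm_A` is ours):

```python
import math

def norm_A(x1, x2, delta):
    """||x||_A = (x^T A x)^(1/2) for A = [[1, delta], [delta, 1]]."""
    return math.sqrt(x1 * x1 + 2 * delta * x1 * x2 + x2 * x2)

delta = -0.5
x = (1.0, 0.0)          # x <= y componentwise ...
y = (1.0, 0.5)

# ... yet ||x||_A > ||y||_A: the norm is not monotone
assert norm_A(*x, delta) == 1.0
assert abs(norm_A(*y, delta) - math.sqrt(3) / 2) < 1e-12
assert norm_A(*x, delta) > norm_A(*y, delta)

# A is positive definite for delta in (-1, 0): eigenvalues 1 + delta, 1 - delta
assert 1 + delta > 0 and 1 - delta > 0
```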
A generator of $\|\cdot\|_1$ is the random vector $Z$ with $P(Z = d\, e_j) = 1/d$, $1 \le j \le d$, where $e_j$ denotes the $j$-th unit vector in $\mathbb{R}^d$: each component satisfies $E(Z_j) = d\, P(Z_j = d) = 1$, and

$$E\left(\max_{1\le i\le d}(|x_i|\, Z_i)\right) = \sum_{j=1}^d |x_j|\, d\, P(Z_j = d) = \sum_{j=1}^d |x_j| = \|x\|_1.$$
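The discrete generator of $\|\cdot\|_1$ is easy to work with. The sketch below (helper names are ours) computes $E(\max_i(|x_i| Z_i))$ both exactly over the finite support and by seeded Monte Carlo:

```python
import random

def d_norm_l1_exact(x):
    """E(max_i |x_i| Z_i) for the discrete generator with P(Z = d*e_j) = 1/d.

    On the event {Z = d*e_j}, max_i(|x_i| Z_i) = |x_j| * d, so the
    expectation is sum_j (1/d) * |x_j| * d = ||x||_1.
    """
    d = len(x)
    return sum((1.0 / d) * abs(xj) * d for xj in x)

def d_norm_l1_mc(x, n=20000, seed=1):
    """Monte Carlo version: draw the random index J uniformly."""
    rng = random.Random(seed)
    d = len(x)
    total = 0.0
    for _ in range(n):
        j = rng.randrange(d)       # Z = d * e_j with probability 1/d
        total += abs(x[j]) * d     # max_i(|x_i| Z_i) = |x_j| * d
    return total / n

x = (1.0, -2.0, 3.0)
assert abs(d_norm_l1_exact(x) - 6.0) < 1e-9        # = ||x||_1
assert abs(d_norm_l1_mc(x) - 6.0) < 0.15           # MC estimate near ||x||_1
```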
Proposition 1.2.1 Each logistic norm $\|x\|_p = \left(\sum_{i=1}^d |x_i|^p\right)^{1/p}$, $1 \le p \le \infty$, is a D-norm. For $1 < p < \infty$, a generator is given by

$$Z^{(p)} = \left(Z_1^{(p)}, \dots, Z_d^{(p)}\right) := \left(\frac{X_1^{1/p}}{\Gamma(1 - p^{-1})}, \dots, \frac{X_d^{1/p}}{\Gamma(1 - p^{-1})}\right),$$
From Lemma 1.1.2, we know that $\|\cdot\|_p \to_{p\to\infty} \|\cdot\|_\infty$ pointwise. We have, moreover, $\Gamma(1 - p^{-1}) \to_{p\to\infty} \Gamma(1) = 1$ and, consequently, we also have pointwise convergence almost surely (a.s.):

$$Z^{(p)} = \left(Z_1^{(p)}, \dots, Z_d^{(p)}\right) = \left(\frac{X_1^{1/p}}{\Gamma(1 - p^{-1})}, \dots, \frac{X_d^{1/p}}{\Gamma(1 - p^{-1})}\right) \to_{p\to\infty} (1, \dots, 1) \quad \text{a.s.},$$

where the constant vector $Z = (1, \dots, 1) \in \mathbb{R}^d$ is a generator of the sup-norm $\|\cdot\|_\infty$.
by elementary arguments. Abbreviating $\mu := \Gamma(1 - p^{-1})$, so that $Z_i = X_i^{1/p}/\mu$, the representation of the expectation in Lemma 1.2.2 yields

$$E\left(\max_{1\le i\le d}(|x_i|\, Z_i)\right) = \int_0^\infty 1 - P\left(\max_{1\le i\le d}(|x_i|\, Z_i) \le t\right) dt = \int_0^\infty 1 - P\left(Z_i \le \frac{t}{|x_i|},\ 1 \le i \le d\right) dt$$
$$= \int_0^\infty 1 - \prod_{i=1}^d P\left(Z_i \le \frac{t}{|x_i|}\right) dt = \int_0^\infty 1 - \prod_{i=1}^d \exp\left(-\left(\frac{|x_i|}{t\mu}\right)^p\right) dt = \int_0^\infty 1 - \exp\left(-\frac{\sum_{i=1}^d |x_i|^p}{(t\mu)^p}\right) dt.$$

The substitution $t \mapsto t\left(\sum_{i=1}^d |x_i|^p\right)^{1/p}/\mu$ implies that the integral above equals

$$\frac{\left(\sum_{i=1}^d |x_i|^p\right)^{1/p}}{\mu} \int_0^\infty 1 - \exp\left(-\frac{1}{t^p}\right) dt = \frac{\|x\|_p}{E\left(X_1^{1/p}\right)} \int_0^\infty P\left(X_1^{1/p} > t\right) dt = \frac{\|x\|_p}{E\left(X_1^{1/p}\right)}\, E\left(X_1^{1/p}\right) = \|x\|_p.$$
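The derivation can be cross-checked by simulation. The sketch below assumes — consistently with the df $\exp\left(-(t\mu)^{-p}\right)$ appearing in the computation above — that $X_1, \dots, X_d$ are i.i.d. standard Fréchet, sampled as $X = -1/\log U$ with $U$ uniform on $(0,1)$; helper names are ours:

```python
import math
import random

def logistic_norm(x, p):
    return sum(abs(xi) ** p for xi in x) ** (1.0 / p)

def d_norm_logistic_mc(x, p, n=50000, seed=42):
    """Monte Carlo estimate of E(max_i |x_i| Z_i^(p)) with
    Z_i^(p) = X_i^(1/p) / Gamma(1 - 1/p), X_i i.i.d. standard Frechet."""
    rng = random.Random(seed)
    mu = math.gamma(1.0 - 1.0 / p)        # = E(X^(1/p))
    total = 0.0
    for _ in range(n):
        m = 0.0
        for xi in x:
            u = max(rng.random(), 1e-12)              # avoid log(0)
            z = (-1.0 / math.log(u)) ** (1.0 / p) / mu
            m = max(m, abs(xi) * z)
        total += m
    return total / n

x, p = (1.0, -2.0, 3.0), 3
est = d_norm_logistic_mc(x, p)
# the estimate should be close to ||x||_3 = 36^(1/3)
assert abs(est - logistic_norm(x, p)) < 0.2
```

We use $p = 3$ so that the components $Z_i^{(p)}$ have finite variance and the Monte Carlo average converges at the usual rate.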
Proof. To prove Theorem 1.3.1, we only have to show the implication "⇐". Let $(Z_1, \dots, Z_d)$ be a generator of $\|\cdot\|_D$.

(i) Suppose we have $\|y\|_D = \|y\|_1$ for some $y > 0 \in \mathbb{R}^d$, i.e.,

$$E\left(\max_{1\le i\le d}(y_i Z_i)\right) = \sum_{i=1}^d y_i = \sum_{i=1}^d y_i\, E(Z_i) = E\left(\sum_{i=1}^d y_i Z_i\right).$$
This entails

$$E\left(\sum_{i=1}^d y_i Z_i\right) - E\left(\max_{1\le i\le d}(y_i Z_i)\right) = E\Bigg(\underbrace{\sum_{i=1}^d y_i Z_i - \max_{1\le i\le d}(y_i Z_i)}_{\ge 0}\Bigg) = 0$$
$$\Rightarrow\quad \sum_{i=1}^d y_i Z_i - \max_{1\le i\le d}(y_i Z_i) = 0 \quad \text{a.s.}$$
$$\Rightarrow\quad \sum_{i=1}^d y_i Z_i = \max_{1\le i\le d}(y_i Z_i) \quad \text{a.s.}$$
Recall that $y_i > 0$ for all $i$. Hence, $Z_i > 0$ for some $i \in \{1, \dots, d\}$ implies $Z_j = 0$ for all $j \ne i$. Thus, we have, for arbitrary $x \ge 0 \in \mathbb{R}^d$,

$$\sum_{i=1}^d x_i Z_i = \max_{1\le i\le d}(x_i Z_i) \quad \text{a.s.}$$
$$\Rightarrow\quad E\left(\sum_{i=1}^d x_i Z_i\right) = E\left(\max_{1\le i\le d}(x_i Z_i)\right)$$
$$\Rightarrow\quad \|x\|_1 = \|x\|_D.$$
(ii) Suppose $\|(1,\dots,1)\|_D = 1$. Then,

$$\Rightarrow\quad E\left(\max_{1\le i\le d} Z_i\right) = E(Z_j), \qquad 1 \le j \le d,$$
$$\Rightarrow\quad E\Bigg(\underbrace{\max_{1\le i\le d} Z_i - Z_j}_{\ge 0}\Bigg) = 0, \qquad 1 \le j \le d,$$
$$\Rightarrow\quad \max_{1\le i\le d} Z_i - Z_j = 0 \quad \text{a.s.}, \qquad 1 \le j \le d,$$
$$\Rightarrow\quad Z_1 = Z_2 = \dots = Z_d = \max_{1\le i\le d} Z_i \quad \text{a.s.}$$

Hence, for $x \in \mathbb{R}^d$,

$$\|x\|_D = E\left(\max_{1\le i\le d}(|x_i|\, Z_i)\right) = E(\|x\|_\infty\, Z_1) = \|x\|_\infty\, E(Z_1) = \|x\|_\infty.$$
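Both characterizations can be illustrated with generators that have finite support, where the defining expectation is a finite sum and can be evaluated exactly (helper names are ours):

```python
def d_norm_discrete(x, support):
    """Exact E(max_i |x_i| Z_i) for a generator with finite support:
    support is a list of (probability, z-vector) pairs."""
    return sum(p * max(abs(xi) * zi for xi, zi in zip(x, z))
               for p, z in support)

d = 3
gen_sup = [(1.0, (1.0,) * d)]                                   # Z = (1,...,1): sup-norm
gen_l1 = [(1.0 / d, tuple(float(d) if i == j else 0.0 for i in range(d)))
          for j in range(d)]                                    # P(Z = d e_j) = 1/d: ||.||_1

one = (1.0,) * d
# Takahashi (ii): ||(1,...,1)||_D = 1 holds for the sup-norm generator ...
assert d_norm_discrete(one, gen_sup) == 1.0
# ... but not for the ||.||_1 generator, where ||(1,...,1)||_D = d
assert abs(d_norm_discrete(one, gen_l1) - d) < 1e-9

# Takahashi (i): ||y||_D = ||y||_1 for a single y > 0 forces ||.||_D = ||.||_1
y = (1.0, 2.0, 3.0)
assert abs(d_norm_discrete(y, gen_l1) - sum(y)) < 1e-9
```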
Sequences of D-Norms
Theorem 1.3.1 can easily be generalized to sequences of D-norms.
$$\sum_{\substack{i=1 \\ i\ne j}}^d y_i\, E\left(Z_i^{(n)}\, \mathbf{1}_{M_j}\right) \to_{n\to\infty} 0,$$

and, since $y_i > 0$,

$$E\left(Z_i^{(n)}\, \mathbf{1}_{M_j}\right) \to_{n\to\infty} 0 \tag{1.7}$$

for all $i \ne j$. Choose an arbitrary $x \in \mathbb{R}^d$. From inequality (1.4) we know that
$$0 \le \|x\|_1 - \|x\|_{D_n} = E\Bigg(\underbrace{\sum_{i=1}^d |x_i|\, Z_i^{(n)} - \max_{1\le i\le d}\left(|x_i|\, Z_i^{(n)}\right)}_{\ge 0}\Bigg)$$
$$\le E\Bigg(\sum_{j=1}^d \bigg(\sum_{i=1}^d |x_i|\, Z_i^{(n)} - \max_{1\le i\le d}\left(|x_i|\, Z_i^{(n)}\right)\bigg)\, \mathbf{1}_{M_j}\Bigg) = \sum_{j=1}^d E\Bigg(\bigg(\sum_{i=1}^d |x_i|\, Z_i^{(n)} - \max_{1\le i\le d}\left(|x_i|\, Z_i^{(n)}\right)\bigg)\, \mathbf{1}_{M_j}\Bigg)$$
$$\le \sum_{j=1}^d \sum_{\substack{i=1 \\ i\ne j}}^d |x_i|\, E\left(Z_i^{(n)}\, \mathbf{1}_{M_j}\right) \to_{n\to\infty} 0 \quad \text{by (1.7)},$$

which implies

$$\|x\|_{D_n} \to_{n\to\infty} \|x\|_1, \qquad x \in \mathbb{R}^d.$$
(ii) We use inequality (1.4) and obtain

$$0 \le \|x\|_{D_n} - \|x\|_\infty = E\left(\max_{1\le i\le d}\left(|x_i|\, Z_i^{(n)}\right)\right) - \max_{1\le i\le d}|x_i|$$
$$\le \max_{1\le i\le d}|x_i|\, E\left(\max_{1\le i\le d} Z_i^{(n)}\right) - \max_{1\le i\le d}|x_i| = \|x\|_\infty \left(E\left(\max_{1\le i\le d} Z_i^{(n)}\right) - 1\right) = \|x\|_\infty \left(\|(1,\dots,1)\|_{D_n} - 1\right) \to_{n\to\infty} 0.$$
$$= E\left(Z_i^{(n)}\, \mathbf{1}_{\left\{Z_j^{(n)} = \max_{1\le k\le d} Z_k^{(n)}\right\}}\right) \ge 0.$$

Therefore, $E\left(Z_i^{(n)}\, \mathbf{1}_{\left\{Z_j^{(n)} = \max_{1\le k\le d} Z_k^{(n)}\right\}}\right) \to_{n\to\infty} 0$, which is (1.7). We can repeat the steps of the preceding proof and get the desired assertion.
(ii) For our given value of $i$, we have

$$0 \le \|(1,\dots,1)\|_{D_n} - 1 = E\left(\max_{1\le k\le d} Z_k^{(n)} - Z_i^{(n)}\right)$$
$$\le \sum_{j=1}^d E\left(\left(\max_{1\le k\le d} Z_k^{(n)} - Z_i^{(n)}\right)\, \mathbf{1}_{\left\{Z_j^{(n)} = \max_{1\le k\le d} Z_k^{(n)}\right\}}\right)$$
$$= \sum_{j=1}^d E\left(\left(\max\left(Z_i^{(n)}, Z_j^{(n)}\right) - Z_i^{(n)}\right)\, \mathbf{1}_{\left\{Z_j^{(n)} = \max_{1\le k\le d} Z_k^{(n)}\right\}}\right)$$
$$\le \sum_{j=1}^d E\left(\max\left(Z_i^{(n)}, Z_j^{(n)}\right) - Z_i^{(n)}\right) = \sum_{\substack{1\le j\le d \\ j\ne i}} \left(\|e_i + e_j\|_{D_n} - 1\right) \to_{n\to\infty} 0,$$
$$\|x\|_{D_\lambda} := \lambda\, \|x\|_\infty + (1 - \lambda)\, \|x\|_1 = \lambda \max_{1\le i\le d}|x_i| + (1 - \lambda) \sum_{i=1}^d |x_i|. \tag{1.8}$$
$$E\left(\max_{1\le i\le d}\left(x_i Z_i^{(\xi)}\right)\right) = \sum_{j=1}^2 E\left(\max_{1\le i\le d}\left(x_i Z_i^{(j)}\right)\, \mathbf{1}_{\{\xi=j\}}\right)$$
$$= \sum_{j=1}^2 E\left(\max_{1\le i\le d}\left(x_i Z_i^{(j)}\right)\right) E\left(\mathbf{1}_{\{\xi=j\}}\right) = \lambda\, E\left(\max_{1\le i\le d}\left(x_i Z_i^{(1)}\right)\right) + (1 - \lambda)\, E\left(\max_{1\le i\le d}\left(x_i Z_i^{(2)}\right)\right).$$

By putting $x_i = 1$ and $x_j = 0$ for $j \ne i$, we obtain in particular $E\left(Z_i^{(\xi)}\right) = 1$, $1 \le i \le d$. This completes the proof.
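The mixture construction behind (1.8) is straightforward to simulate; in the sketch below (helper names are ours), the Bernoulli variable $\xi$ selects the constant generator of $\|\cdot\|_\infty$ with probability $\lambda$ and the discrete generator of $\|\cdot\|_1$ otherwise:

```python
import random

def sup_norm(x):
    return max(abs(xi) for xi in x)

def l1_norm(x):
    return sum(abs(xi) for xi in x)

def d_lambda_exact(x, lam):
    """The convex-combination D-norm (1.8)."""
    return lam * sup_norm(x) + (1 - lam) * l1_norm(x)

def d_lambda_mc(x, lam, n=20000, seed=3):
    """Mixture generator: with prob lam use Z^(1) = (1,...,1) (sup-norm),
    with prob 1 - lam use the discrete ||.||_1 generator Z^(2) = d e_J."""
    rng = random.Random(seed)
    d = len(x)
    total = 0.0
    for _ in range(n):
        if rng.random() < lam:
            total += sup_norm(x)           # max_i(|x_i| * 1)
        else:
            j = rng.randrange(d)
            total += abs(x[j]) * d         # max_i(|x_i| * d * 1[i = J])
    return total / n

x, lam = (1.0, -2.0, 3.0), 0.4
assert abs(d_lambda_mc(x, lam) - d_lambda_exact(x, lam)) < 0.2
```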
$$Z := Z^{(X)}$$

and

$$E\left(\max_{1\le i\le d}\left(|x_i|\, Z_i^{(X)}\right)\right) = \int_1^\infty E\left(\max_{1\le i\le d}\left(|x_i|\, Z_i^{(X)}\right) \,\Big|\, X = p\right) f(p)\, dp$$
$$= \int_1^\infty E\left(\max_{1\le i\le d}\left(|x_i|\, Z_i^{(p)}\right)\right) f(p)\, dp = \int_1^\infty \|x\|_p\, f(p)\, dp.$$
But this is true for every norm on $\mathbb{R}^2$, as we show next. Suppose that $a \ne b$. Put

$$\alpha := \frac{a_1 b_2 - b_1 b_2}{a_1 a_2 - b_1 b_2}, \qquad \beta := \frac{b_1 a_2 - b_1 b_2}{a_1 a_2 - b_1 b_2},$$
$$\gamma := \frac{a_1 a_2 - a_1 b_2}{a_1 a_2 - b_1 b_2}, \qquad \delta := \frac{a_1 a_2 - b_1 a_2}{a_1 a_2 - b_1 b_2}.$$

Then, $\alpha, \beta, \gamma, \delta \ge 0$, $\alpha + \gamma = 1 = \beta + \delta$,

We thus obtain from Theorem 1.5.1 the following characterization in the bivariate case.
The following lemma entails that in the bivariate case, $G(x) = \exp(-\|x\|)$, $x \le 0 \in \mathbb{R}^2$, defines a df with standard negative exponential margins iff the norm $\|\cdot\|$ satisfies $\|x\|_\infty \le \|x\| \le \|x\|_1$, $x \ge 0$.
Writing $(a_1, b_2) = \frac{b_1 - a_1}{b_1}(0, b_2) + \frac{a_1}{b_1}(b_1, b_2)$, the triangle inequality and $\|(0, b_2)\| = b_2 \le \|b\|$ yield

$$\|(a_1, b_2)\| \le \frac{b_1 - a_1}{b_1}\, \|(0, b_2)\| + \frac{a_1}{b_1}\, \|(b_1, b_2)\| \le \|b\|,$$

and

$$a = (a_1, a_2) = \frac{b_2 - a_2}{b_2}(a_1, 0) + \frac{a_2}{b_2}(a_1, b_2)$$

gives

$$\|a\| \le \frac{b_2 - a_2}{b_2}\underbrace{\|(a_1, 0)\|}_{=a_1 \le b_1 \le \|b\|} + \frac{a_2}{b_2}\underbrace{\|(a_1, b_2)\|}_{\le \|b\|, \text{ see above}} \le \frac{b_2 - a_2}{b_2}\, \|b\| + \frac{a_2}{b_2}\, \|b\| = \|b\|.$$

Therefore, the norm is monotone.
$$\min(\max(a_1, \dots, a_n), a_{n+1}) = \max(\min(a_1, a_{n+1}), \dots, \min(a_n, a_{n+1})),$$
$$\max(a_1, a_2) = a_1 + a_2 - \min(a_1, a_2).$$
and

$$P\left(\bigcup_{i=1}^n A_i\right) = \sum_{\emptyset \ne T \subset \{1,\dots,n\}} (-1)^{|T|-1}\, P\left(\bigcap_{i\in T} A_i\right).$$

Hence,

$$E\left(\max_{1\le i\le n} \mathbf{1}_{A_i}\right) = \sum_{\emptyset \ne T \subset \{1,\dots,n\}} (-1)^{|T|-1}\, E\left(\min_{i\in T} \mathbf{1}_{A_i}\right),$$

where we used

$$E\left(\max_{1\le i\le n} \mathbf{1}_{A_i}\right) = P\left(\bigcup_{i=1}^n A_i\right), \qquad E\left(\min_{i\in T} \mathbf{1}_{A_i}\right) = P\left(\bigcap_{i\in T} A_i\right).$$
Corollary 1.6.3 If $Z^{(1)}, Z^{(2)}$ generate the same D-norm, then, for each $x \in \mathbb{R}^d$,

$$E\left(\min_{1\le i\le d}\left(|x_i|\, Z_i^{(1)}\right)\right) = E\left(\min_{1\le i\le d}\left(|x_i|\, Z_i^{(2)}\right)\right).$$

Proof. We have

$$E\left(\min_{1\le i\le d}\left(|x_i|\, Z_i^{(1)}\right)\right) = E\left(\sum_{\emptyset \ne T \subset \{1,\dots,d\}} (-1)^{|T|-1} \max_{j\in T}\left(|x_j|\, Z_j^{(1)}\right)\right)$$
$$= \sum_{\emptyset \ne T \subset \{1,\dots,d\}} (-1)^{|T|-1}\, E\left(\max_{j\in T}\left(|x_j|\, Z_j^{(1)}\right)\right) = \sum_{\emptyset \ne T \subset \{1,\dots,d\}} (-1)^{|T|-1} \left\|\sum_{j\in T} |x_j|\, e_j\right\|_D$$
$$= \sum_{\emptyset \ne T \subset \{1,\dots,d\}} (-1)^{|T|-1}\, E\left(\max_{j\in T}\left(|x_j|\, Z_j^{(2)}\right)\right) = E\left(\sum_{\emptyset \ne T \subset \{1,\dots,d\}} (-1)^{|T|-1} \max_{j\in T}\left(|x_j|\, Z_j^{(2)}\right)\right)$$
$$= E\left(\min_{1\le i\le d}\left(|x_i|\, Z_i^{(2)}\right)\right).$$
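For generators with finite support, both sides of Corollary 1.6.3's identity — and the inclusion–exclusion expansion used in its proof — reduce to finite sums and can be checked exactly (helper names are ours):

```python
from itertools import combinations

def e_max(x, idx, support):
    """E(max_{j in idx} |x_j| Z_j) for a finite-support generator."""
    return sum(p * max(abs(x[j]) * z[j] for j in idx) for p, z in support)

def e_min_direct(x, support):
    return sum(p * min(abs(xj) * zj for xj, zj in zip(x, z)) for p, z in support)

def e_min_incl_excl(x, support):
    """E(min_i |x_i| Z_i) via inclusion-exclusion over nonempty subsets T."""
    d = len(x)
    total = 0.0
    for r in range(1, d + 1):
        for T in combinations(range(d), r):
            total += (-1) ** (r - 1) * e_max(x, T, support)
    return total

# generator of ||.||_1 in dimension 3: P(Z = 3 e_j) = 1/3
support = [(1.0 / 3, tuple(3.0 if i == j else 0.0 for i in range(3)))
           for j in range(3)]
x = (1.0, 2.0, 3.0)
assert abs(e_min_incl_excl(x, support) - e_min_direct(x, support)) < 1e-9
# the dual D-norm function of ||.||_1 vanishes for d >= 2 (cf. (1.12))
assert abs(e_min_direct(x, support)) < 1e-9
```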
The mapping $\|\cdot\|_D \mapsto E\left(\min_{1\le i\le d}(|x_i|\, Z_i)\right)$, which assigns to each D-norm its dual D-norm function, is well defined by Corollary 1.6.3. Clearly,

$$E\left(\min_{1\le i\le d}(|x_i|\, Z_i)\right) = 0 \tag{1.12}$$

is the least dual D-norm function, corresponding, for example, to $\|\cdot\|_D = \|\cdot\|_1$ if $d \ge 2$, whereas the greatest one, $\min_{1\le i\le d}|x_i|$, corresponds to $\|\cdot\|_D = \|\cdot\|_\infty$. Thus, for an arbitrary dual D-norm function and $d \ge 2$, we have the bounds
with

$$\|x\|_p = \left(\sum_{i=1}^d |x_i|^p\right)^{1/p}, \qquad x = (x_1, \dots, x_d) \in \mathbb{R}^d.$$
Example 1.7.3 Put $Z^{(1)} := (1, \dots, 1)$ and $Z^{(2)} := (X, \dots, X)$, where $X \ge 0$ is an rv with $E(X) = 1$. Both generate the D-norm $\|\cdot\|_\infty$, but only $Z^{(1)}$ satisfies $\left\|Z^{(1)}\right\|_1 = d$.
$$Z := d\tilde{Z} \tag{1.15}$$

$$= d\, E\left(\frac{\max_{1\le i\le d}(|x_i|\, V_i)}{V_1 + \dots + V_d}\right) = \frac{1}{\alpha}\, E(V_1 + \dots + V_d)\, E\left(\frac{\max_{1\le i\le d}(|x_i|\, V_i)}{V_1 + \dots + V_d}\right) = \frac{1}{\alpha}\, E\left(\max_{1\le i\le d}(|x_i|\, V_i)\right).$$
$$a_i = \frac{\lambda_2}{\lambda_1}\, a_j$$

and, thus,

$$c = \|a_i\| = \frac{\lambda_2}{\lambda_1}\, \|a_j\| = \frac{\lambda_2}{\lambda_1}\, c,$$

i.e., $\lambda_1 = \lambda_2$ and, hence, $a_i = a_j$, i.e., $i = j$. We therefore obtain

$$c\, \Phi(B_\infty) = \lim_{n\in\mathbb{N}} E\left(\sum_{i=1}^n \mathbf{1}_{\mathbb{R}_+\cdot A_i}(Z)\, \|Z\|\right) = \lim_{n\in\mathbb{N}} \sum_{i=1}^n E\left(\mathbf{1}_{\mathbb{R}_+\cdot A_i}(Z)\, \|Z\|\right) = \lim_{n\in\mathbb{N}} \sum_{i=1}^n c\, \Phi(A_i) = \sum_{i\in\mathbb{N}} c\, \Phi(A_i),$$
$$f_n = \sum_{i=1}^{m(n)} \alpha_{i,n}\, \mathbf{1}_{A_{i,n}}, \qquad n \in \mathbb{N},$$

with $A_{i,n} \in \mathcal{B}_{S_c}$ and $\alpha_{i,n} > 0$, $1 \le i \le m(n)$, $n \in \mathbb{N}$, such that $f_n(s) \uparrow_{n\in\mathbb{N}} f(s)$, $s \in S_c$. By applying the monotone convergence theorem twice, we obtain

$$\int_{S_c} \max_{1\le j\le d}(x_j s_j)\, \Phi(ds) = \int_{S_c} f\, d\Phi = \lim_{n\in\mathbb{N}} \int_{S_c} f_n\, d\Phi = \lim_{n\in\mathbb{N}} \sum_{i=1}^{m(n)} \alpha_{i,n}\, \Phi(A_{i,n})$$
$$= \lim_{n\in\mathbb{N}} \frac{1}{c} \sum_{i=1}^{m(n)} \alpha_{i,n}\, E\left(\|Z\|\, \mathbf{1}_{\mathbb{R}_+\cdot A_{i,n}}(Z)\right) = \lim_{n\in\mathbb{N}} \frac{1}{c}\, E\left(\|Z\| \sum_{i=1}^{m(n)} \alpha_{i,n}\, \mathbf{1}_{\mathbb{R}_+\cdot A_{i,n}}(Z)\right)$$
$$= \lim_{n\in\mathbb{N}} \frac{1}{c}\, E\left(\|Z\| \sum_{i=1}^{m(n)} \alpha_{i,n}\, \mathbf{1}_{\mathbb{R}_+\cdot A_{i,n}}\left(\frac{Z c}{\|Z\|}\right)\right) = \frac{1}{c}\, E\left(\|Z\|\, f\left(\frac{Z c}{\|Z\|}\right)\right)$$
The following consequence of the two preceding auxiliary results is obvious.
Proof. Use polar coordinates to identify the set $E$ with $(0,\infty)\cdot S_c$ and identify $\nu$ in these coordinates with the product measure $\mu \times \Phi$ on $(0,\infty)\times S_c$, where the measure $\mu$ on $(0,\infty)$ is defined by $\mu((\lambda,\infty)) = 1/\lambda$, $\lambda > 0$. Precisely, define the one-to-one function $T: (0,\infty)\times S_c \to E$ by

$$T(\lambda, a) := \lambda a$$

$$= \int_{S_c} \mu\left(\left\{\lambda > 0 : \lambda > \min_{1\le i\le d} \frac{x_i}{s_i}\right\}\right) \Phi(ds) = \int_{S_c} \frac{1}{\min_{1\le i\le d}(x_i/s_i)}\, \Phi(ds)$$
$$= \int_{S_c} \max_{1\le i\le d} \frac{s_i}{x_i}\, \Phi(ds) = \left\|\left(\frac{1}{x_1}, \dots, \frac{1}{x_d}\right)\right\|_D.$$
Corollary 1.7.10 The distribution of a generator $\tilde{Z}$ of $\|\cdot\|_D$ with $\|\tilde{Z}\| = c$ is uniquely determined.

Proof. Let $Z^{(1)}, Z^{(2)}$ be two generators of $\|\cdot\|_D$ with $\left\|Z^{(1)}\right\| = \left\|Z^{(2)}\right\| = c$. For $A \in \mathcal{B}_{S_c}$ and $i = 1, 2$, put

$$\Phi_i(A) := \frac{1}{c}\, E\left(\mathbf{1}_{\mathbb{R}_+\cdot A}\left(Z^{(i)}\right)\, \left\|Z^{(i)}\right\|\right) = E\left(\mathbf{1}_{\mathbb{R}_+\cdot A}\left(Z^{(i)}\right)\right).$$

Then, we obtain

$$\Phi_i(A) = E\left(\mathbf{1}_A\left(Z^{(i)}\right)\right) = P\left(Z^{(i)} \in A\right),$$
E(Z) = const.
Proof. Let $Z^{(1)}, Z^{(2)}$ be two generators of $\|\cdot\|_D$. For $i = 1, 2$, put $c_i := E\left(\left\|Z^{(i)}\right\|\right)$, $S_i := \left\{s \ge 0 \in \mathbb{R}^d : \|s\| = c_i\right\}$ and $\Phi_i$ as in Lemma 1.7.5. We have $S_2 = (c_2/c_1)\, S_1$. Since the measure $\nu$ defined in Lemma 1.7.8 depends only on $\|\cdot\|_D$ according to Lemma 1.7.9, we obtain the equations

$$\Phi_2(S_2) = \nu((1,\infty)\cdot S_2) = \nu\left(\left(\frac{c_2}{c_1}, \infty\right)\cdot S_1\right) = \frac{c_1}{c_2}\, \nu((1,\infty)\cdot S_1) = \frac{c_1}{c_2}\, \Phi_1(S_1).$$

But $1 = \Phi_1(S_1) = \Phi_2(S_2)$ and, thus, $c_1/c_2 = 1$, which completes the proof.
For example, for $\|\cdot\|_D = \|\cdot\|_\infty$ and an arbitrary norm $\|\cdot\|$ on $\mathbb{R}^d$, we obtain that each generator $Z$ of $\|\cdot\|_\infty$ satisfies

$$E(Z) = \sum_{i=1}^d e_i.$$

The discrete generator of $\|\cdot\|_1$, which attains the value $d\, e_i$ with probability $1/d$ each, likewise satisfies

$$E(Z) = \sum_{i=1}^d \frac{1}{d}\, d\, e_i = \sum_{i=1}^d e_i.$$
$$R(x) := r$$

and $R(0) = 0$. Note that the radial function $R$ is homogeneous of order one, i.e., $R(\lambda x) = \lambda R(x)$, $\lambda \ge 0$, $x \in [0,\infty)^d$. If $x = rs$ with $s \in S$, then $\lambda x = \lambda r s$ and, thus,

$$R(\lambda x) = \lambda r = \lambda R(x).$$

Repeating the arguments in the derivation of Theorem 1.7.1, the conclusion in this result can be generalized as follows.
Choose $v \in (0,1)$. Then, there exist $\lambda > 1$ and $u \in (0,1)$ such that

$$(v, 1-v) = \lambda\, (u, 1-u^2).$$

As a consequence, we obtain

$$\|(v, 1-v)\| = \lambda\, \left\|(u, 1-u^2)\right\| = \lambda c > c \ge \|(v, 1-v)\|,$$

which is a contradiction.

An example of an unbounded complete angular set in $\mathbb{R}^2$ is $\{(u, 1/u) : u > 0\} \cup \{(0,1), (1,0)\}$. Angular sets that are not necessarily complete are introduced in Definition 1.11.2.
$$d_W(P, Q) := \inf\{E(\|X - Y\|_1) : X \text{ has distribution } P,\ Y \text{ has distribution } Q\}.$$
As a consequence, we obtain

$$\sum_{i=1}^d \left|\int_{S_d} x_i\, P(dx) - 1\right| \le d_W(P, P_n) \to_{n\to\infty} 0$$

and, thus, $P \in \mathcal{P}_D$.
The separability of $\mathcal{P}_D$ can be seen as follows. Let $\tilde{\mathcal{P}}$ be a countable and dense subset of $\mathcal{P}$. Identify each distribution $P$ in $\tilde{\mathcal{P}}$ with an rv $Y = (Y_1, \dots, Y_d)$ on $S_d$ which follows this distribution $P$, i.e., each component $Y_i$ is non-negative, and we have $Y_1 + \dots + Y_d = d$. Without loss of generality (wlog), we can assume that $E(Y_i) > 0$ for each component. This can be seen as follows. Let $T \subset \{1,\dots,d\}$ be the set of those indices $i$ with $E(Y_i) = 0$. Suppose that $T \ne \emptyset$. As $Y_i \ge 0$, this implies $Y_i = 0$ a.s. for $i \in T$. For $n \in \mathbb{N}$, put

$$Y_i^{(n)} := \begin{cases} \left(1 - \frac{1}{n}\right) Y_i, & \text{if } i \notin T, \\[4pt] \frac{d}{n\,|T|}, & \text{if } i \in T. \end{cases}$$

Then, $\sum_{i=1}^d Y_i^{(n)} = d$ and, with $Y^{(n)} := \left(Y_1^{(n)}, \dots, Y_d^{(n)}\right)$,

$$E\left(\left\|Y - Y^{(n)}\right\|_1\right) = \sum_{i\in T} \frac{d}{n\,|T|} + \sum_{i\notin T} \frac{1}{n}\, E(Y_i) = \frac{2d}{n}.$$
see, for example, Villani (2009). But since for each probability measure $P \in \mathcal{P}_D$ we have

$$\int_{S_d} \|x\|_1\, P(dx) = \int_{S_d} d\, P(dx) = d,$$
$$\|x\|_{D_1} = E\left(\max_{1\le i\le d}\left(|x_i|\, Z_i^{(1)}\right)\right)$$
Proof. From Corollary 1.7.2 we know that, for every D-norm $\|\cdot\|_{D_n}$, there exists a generator $Z^{(n)}$ that realizes in $S_d := \left\{x \ge 0 \in \mathbb{R}^d : \|x\|_1 = d\right\}$. The simplex $S_d$ is a compact subset of $\mathbb{R}^d$, and thus, the sequence $Z^{(n)}$, $n \in \mathbb{N}$, is tight, i.e., for each $\varepsilon > 0$ there exists a compact set $K$ in $\mathbb{R}^d$ such that $P\left(Z^{(n)} \in K\right) > 1 - \varepsilon$ for $n \in \mathbb{N}$; just choose $K = S_d$. But this implies that the sequence is relatively compact, i.e., there exists a subsequence $Z^{(m)}$, $m = m(n)$, that converges in distribution to some rv $Z = (Z_1, \dots, Z_d)$; see, for example, Billingsley (1999, Prokhorov's theorem).

One readily finds that this limit $Z$ realizes in $S_d$ as well, and that each of its components has expected value equal to one. The Portmanteau theorem implies

$$\limsup_{n\in\mathbb{N}} P\left(Z^{(m)} \in S_d\right) \le P(Z \in S_d)$$

as $S_d$ is a closed subset of $\mathbb{R}^d$. But $P\left(Z^{(m)} \in S_d\right) = 1$ for each $m$, and thus, $P(Z \in S_d) = 1$ as well. The sequence of components $Z_i^{(m)}$, $n \in \mathbb{N}$, is uniformly integrable for each $i \in \{1,\dots,d\}$ as $Z_i^{(m)}$ realizes in $[0, d]$, and thus, weak convergence $Z_i^{(m)} \to_D Z_i$ implies $E\left(Z_i^{(m)}\right) \to_{n\to\infty} E(Z_i)$; see Billingsley (1999, Theorem 5.4). But $E\left(Z_i^{(m)}\right) = 1$, and thus, we obtain $E(Z_i) = 1$ as well for each $i \in \{1,\dots,d\}$.

The rv $Z$ is, therefore, the generator of a D-norm $\|\cdot\|_D$. From Proposition 1.8.3, we obtain that $d_W\left(\|\cdot\|_{D_m}, \|\cdot\|_D\right) \to_{n\to\infty} 0$. Lemma 1.8.4 implies that $\|x\|_{D_m} \to \|x\|_D$, $x \in \mathbb{R}^d$, and, thus, $f(\cdot) = \|\cdot\|_D$.
$$Z := Z^{(1)} Z^{(2)}$$

Lemma 1.9.1 The D-norm $\|\cdot\|_{D_1 D_2}$ does not depend on the particular choice of generators $Z^{(1)}, Z^{(2)}$, provided that they are independent.

$$\|x\|_{D_1 D_2} = E\left(\max_{1\le i\le d}\left(|x_i|\, Z_i^{(1)} Z_i^{(2)}\right)\right) = \int E\left(\max_{1\le i\le d}\left(|x_i|\, z_i^{(1)} Z_i^{(2)}\right)\right) \left(P * Z^{(1)}\right)\left(d\left(z_1^{(1)}, \dots, z_d^{(1)}\right)\right)$$
$$= \int \left\|x z^{(1)}\right\|_{D_2} \left(P * Z^{(1)}\right)\left(dz^{(1)}\right) = E\left(\left\|x Z^{(1)}\right\|_{D_2}\right), \tag{1.20}$$

i.e., $\|x\|_{D_1 D_2}$ is independent of the particular choice of $Z^{(2)}$. Repeating the above arguments and conditioning on $Z^{(2)}$, we obtain the equation

$$\|x\|_{D_1 D_2} = E\left(\left\|x Z^{(2)}\right\|_{D_1}\right), \tag{1.21}$$

i.e., if $\|\cdot\|_{D_2} = \|\cdot\|_\infty$ with constant generator $Z^{(2)} = (1, \dots, 1)$, then $\|\cdot\|_{D_1 D_2} = \|\cdot\|_{D_1}$. The sup-norm $\|\cdot\|_\infty$ is, therefore, the identity element within the set of D-norms, equipped with the above multiplication. There is no other D-norm with this property.

Equipped with this commutative multiplication, the set of D-norms on $\mathbb{R}^d$ is, therefore, a semigroup with an identity element.
The rv

$$X + Y = (X_1 + Y_1, \dots, X_d + Y_d)$$

$$\left(\|\cdot\|_{HR_{\Sigma/n}}\right)^n = \|\cdot\|_{HR_\Sigma}, \qquad n \in \mathbb{N},$$

where the power denotes $n$-fold multiplication of D-norms.
i.e., $\|\cdot\|_{D_1 D_2} = \|\cdot\|_1$. Multiplication with the norm $\|\cdot\|_1$ yields $\|\cdot\|_1$ again, and thus, $\|\cdot\|_1$ is an absorbing element among the set of D-norms. There is clearly no other D-norm with this property.
Idempotent D-Norms

The maximum-norm $\|\cdot\|_\infty$ and the norm $\|\cdot\|_1$ both satisfy

$$\|\cdot\|_{D^2} := \|\cdot\|_{DD} = \|\cdot\|_D.$$

Such a D-norm is called idempotent. Naturally, the question of how to characterize the set of idempotent D-norms arises. This is achieved in what follows. It turns out that in the bivariate case, $\|\cdot\|_\infty$ and $\|\cdot\|_1$ are the only idempotent D-norms, whereas in higher dimensions, each idempotent D-norm is a certain combination of $\|\cdot\|_\infty$ and $\|\cdot\|_1$.
$$E(|X + Y|) = E(|X|),$$

and

$$E(|X|) = -\int_0^{F(0)} F^{-1}(u)\, du + \int_{F(0)}^1 F^{-1}(u)\, du,$$

or

$$0 = (1 - 2F(0)) \int_0^1 F^{-1}(v)\, dv + 2\int_0^{F(0)}\!\!\int_{F(0)}^1 \left|F^{-1}(u) + F^{-1}(v)\right| dv\, du.$$
The assumption $0 = E(X) = \int_0^1 F^{-1}(v)\, dv$ now yields

$$\int_0^{F(0)}\!\!\int_{F(0)}^1 \left|F^{-1}(u) + F^{-1}(v)\right| dv\, du = 0$$

and, thus,

$$F^{-1}(u) + F^{-1}(v) = 0 \tag{1.23}$$

for $\lambda$-a.e. $(u, v) \in [0, F(0)] \times [F(0), 1]$, where $\lambda$ denotes the Lebesgue measure on $[0, 1]$.
If F (0) = 0, then P (X > 0) = 1, and thus, E(X) > 0, which would be a
contradiction. If F (0) = 1, then P (X < 0) > 0 unless P (X = 0) = 1, which
we have excluded, and thus, E(X) < 0, which would again be a contradiction.
Consequently, we have established 0 < F (0) < 1.
As the function $F^{-1}(q)$, $q \in (0,1)$, is in general continuous from the left (see, e.g., Reiss (1989, Lemma A.1.2)), equation (1.23) implies that $F^{-1}(v)$ is a constant function on $(0, F(0)]$ and on $(F(0), 1)$, precisely,

$$F^{-1}(v) = \begin{cases} -m, & v \in (0, F(0)], \\ m, & v \in (F(0), 1), \end{cases}$$

for some $m > 0$. Note that the representation $X = F^{-1}(U_1)$, together with the assumption that $X$ is not a.s. the constant zero, implies $m \ne 0$. The condition

$$0 = E(X) = \int_0^{F(0)} F^{-1}(v)\, dv + \int_{F(0)}^1 F^{-1}(v)\, dv = m(1 - 2F(0))$$
Idempotent D-Norms on $\mathbb{R}^2$

The next result characterizes bivariate idempotent D-norms.

as well as

$$E\left(\max\left(Z_1^{(1)}, Z_2^{(1)}\right)\right) = 1 + E(|X|).$$
Idempotent D-Norms on $\mathbb{R}^d$

Next, we extend Proposition 1.9.4 to arbitrary dimensions $d \ge 2$. Denote again by $e_i$ the $i$-th unit vector in $\mathbb{R}^d$, $1 \le i \le d$, and let $\|\cdot\|_D$ be an arbitrary D-norm on $\mathbb{R}^d$. Recall that, for $1 \le i < j \le d$,

$$\|(x,y)\|_{D_{i,j}} = \|x e_i + y e_j\|_D, \qquad (x,y) \in \mathbb{R}^2.$$

$$\|x\|_D = \sum_{k=1}^K \max_{i\in A_k} |x_i| + \sum_{i\in\{1,\dots,d\}\setminus\bigcup_{k=1}^K A_k} |x_i|, \qquad x \in \mathbb{R}^d.$$
Proof. The easiest way to establish this result is to use the fact that $G(x) := \exp(-\|x\|_D)$, $x \le 0 \in \mathbb{R}^d$, defines a df for an arbitrary D-norm $\|\cdot\|_D$ on $\mathbb{R}^d$. This is the content of Theorem 2.3.3 later in this book.
$$= P\left(\eta_{k^*} \le \min_{i\in A_k} x_i,\ 1 \le k \le K;\ \eta_j \le x_j,\ j \notin \bigcup\nolimits_{k=1}^K A_k\right),$$

where $k^* \in A_k$ is an arbitrary but fixed element of $A_k$ for each $k \in \{1,\dots,K\}$. The rv $\eta^*$ with joint components $\eta_{k^*}$, $1 \le k \le K$, and $\eta_j$, $j \notin \bigcup_{k=1}^K A_k$, is an rv of a dimension less than $d$, and $\eta^*$ has no pair of completely dependent components. The rv $\eta^*$ may be viewed as the rv $\eta$ after having removed the copies of the completely dependent components. Its corresponding D-norm is, of course, still idempotent. From Proposition 1.9.5, we obtain its df, i.e.,

$$G(x) = \exp\left(-\sum_{k=1}^K \left|\min_{i\in A_k} x_i\right| - \sum_{j\notin\bigcup_{k=1}^K A_k} |x_j|\right) = \exp\left(-\sum_{k=1}^K \max_{i\in A_k} |x_i| - \sum_{j\notin\bigcup_{k=1}^K A_k} |x_j|\right), \qquad x \le 0 \in \mathbb{R}^d,$$

and hence

$$E\left(\max_{1\le i\le d}(|x_i|\, Z_i)\right) = \sum_{k=1}^K \max_{i\in A_k} |x_i| + \sum_{j\notin\bigcup_{k=1}^K A_k} |x_j|, \qquad x \in \mathbb{R}^d.$$

It is easy to see that this D-norm is idempotent, and thus, the proof is complete.
The set of all idempotent trivariate D-norms is, for example, given by the following five cases:

$$\|(x,y,z)\|_D = \begin{cases} \max(|x|, |y|, |z|), \\ \max(|x|, |y|) + |z|, \\ \max(|x|, |z|) + |y|, \\ \max(|y|, |z|) + |x|, \\ |x| + |y| + |z|, \end{cases}$$

where the three mixed versions are just permutations of the arguments and may be viewed as equivalent.
Tracks of D-Norms

The multiplication of D-norms $D_1, D_2, \dots$ on $\mathbb{R}^d$ can obviously be iterated:

$$\|\cdot\|_{\prod_{i=1}^{n+1} D_i} := \|\cdot\|_{D_{n+1} \prod_{i=1}^n D_i}, \qquad n \in \mathbb{N}.$$
$$\Sigma = \lambda_1 r_1 r_1^\top + \dots + \lambda_d r_d r_d^\top,$$
We establish several auxiliary results in what follows. The first one shows
that the multiplication of two D-norms is increasing.
Proof. Let $Z^{(1)}$ and $Z^{(2)}$ be independent generators of $\|\cdot\|_{D_1}$ and $\|\cdot\|_{D_2}$. By equation (1.21), we have for $x \in \mathbb{R}^d$

$$\|x\|_{D_1 D_2} = E\left(\left\|x Z^{(2)}\right\|_{D_1}\right). \tag{1.24}$$

Note that

$$\|x\|_{D_1} = \left\|x\, E\left(Z^{(2)}\right)\right\|_{D_1} = \left\|E\left(x Z^{(2)}\right)\right\|_{D_1}, \tag{1.25}$$

where the expectation of an rv is meant componentwise, i.e., $E\left(Z^{(2)}\right) = \left(E\left(Z_1^{(2)}\right), \dots, E\left(Z_d^{(2)}\right)\right)$, etc. Put

$$T(x) := \|x\|_{D_1}, \qquad x \in \mathbb{R}^d.$$

Check that $T$ is a convex function by the triangle inequality and the homogeneity satisfied by any norm. Jensen's inequality states that a convex function $T: \mathbb{R}^d \to \mathbb{R}$ entails $T(E(Y_1), \dots, E(Y_d)) \le E(T(Y_1, \dots, Y_d))$ for arbitrary integrable rv $Y_1, \dots, Y_d$. We thus obtain from Jensen's inequality together with equations (1.24) and (1.25)

$$\|x\|_{D_1 D_2} = E\left(\left\|x Z^{(2)}\right\|_{D_1}\right) = E\left(T\left(x Z^{(2)}\right)\right) \ge T\left(E\left(x Z^{(2)}\right)\right) = \left\|E\left(x Z^{(2)}\right)\right\|_{D_1} = \|x\|_{D_1}.$$

Exchanging $Z^{(1)}$ and $Z^{(2)}$ completes the proof.
Proof. From Lemma 1.9.8, we know that, for each $x \in \mathbb{R}^d$ and each $n \in \mathbb{N}$,

$$\|x\|_{\prod_{i=1}^n D_i} \le \|x\|_{\prod_{i=1}^{n+1} D_i}.$$

is an idempotent D-norm on $\mathbb{R}^d$.

for each $k \in \mathbb{N}$. By letting $k$ tend to infinity and repeating the above arguments, we obtain

$$\|x\|_{D^*} = E\left(\left\|x \prod_{i=1}^k Z^{(i)}\right\|_{D^*}\right) \uparrow_{k\to\infty} E\left(\|x Z^*\|_{D^*}\right) = \|x\|_{D^* D^*},$$
An Application to Copulas

Let the rv $U = (U_1, \dots, U_d)$ follow a copula, i.e., each component $U_i$ is uniformly distributed on $(0,1)$. As $E(U_i) = \int_0^1 u\, du = 1/2$, the rv $Z := 2U$ generates a D-norm; see also the discussion on page 152. The following result is an immediate consequence of the previous considerations.
$$Z_t \ge 0, \qquad E(Z_t) = 1, \qquad t \in [0, 1],$$

and

$$E\left(\sup_{0\le t\le 1} Z_t\right) < \infty.$$

If $\|f\|_D = 0$, then

$$0 = \|f\|_D = E\left(\sup_{t\in[0,1]}(|f(t)|\, Z_t)\right),$$

so that $|f(t)|\, Z_t = 0$ a.s. for each $t$; as $E(Z_t) = 1$, this gives $f(t) = 0$ for every $t$, i.e.,

$$\|f\|_D = 0 \implies f = 0.$$

Homogeneity follows from

$$\|\lambda f\|_D = E\left(\sup_{t\in[0,1]}(|\lambda|\, |f(t)|\, Z_t)\right) = |\lambda|\, \|f\|_D.$$

The triangle inequality for $\|\cdot\|_D$ follows from the triangle inequality for real numbers $|x + y| \le |x| + |y|$, $x, y \in \mathbb{R}$:
As $\max_{1\le i\le n}(|f(t_i)|\, Z_{t_i})$ is an rv for each $n \in \mathbb{N}$, the limit of this sequence, i.e., $\sup_{t\in[0,1]}(|f(t)|\, Z_t)$, is an rv as well. We can therefore compute its expectation, which is finite by the bound

$$\sup_{t\in[0,1]}(|f(t)|\, Z_t) =: \|f Z\|_\infty \le \sup_{t\in[0,1]}(|f(t)|)\, \sup_{t\in[0,1]} Z_t = \|f\|_\infty\, \|Z\|_\infty$$

and taking expectations. Recall that each function $f \in E[0,1]$ is by the definition of $E[0,1]$ bounded. The process $Z = (Z_t)_{t\in[0,1]}$ is again called the generator of the D-norm $\|\cdot\|_D$.

This example shows that the generator of a D-norm is also not uniquely determined in the functional setup.
The functional sup-norm $\|\cdot\|_\infty$ is again the smallest D-norm,

$$\|f\|_\infty \le \|f\|_D, \qquad f \in E[0,1]$$

(see Lemma 1.10.2 below), but unlike the multivariate case, there is no independence D-norm in the functional setup. Suppose there exists a D-norm with generator $Z = (Z_t)_{t\in[0,1]}$ such that

$$\|x\|_{D_{t_1,\dots,t_d}} := E\left(\max_{1\le i\le d}(|x_i|\, Z_{t_i})\right) = \sum_{i=1}^d |x_i|.$$

On $\mathbb{R}^d$, any two norms are equivalent; for instance, there is a constant $K$ with

$$\|x\|_1 \le K\, \|x\|_2, \qquad x \in \mathbb{R}^d.$$

This is no longer true for arbitrary norms on $E[0,1]$. But it turns out that each functional D-norm is equivalent to the sup-norm $\|f\|_\infty = \sup_{t\in[0,1]}|f(t)|$ on $E[0,1]$.
Proof. Let $Z = (Z_t)_{t\in[0,1]}$ be a generator of $\|\cdot\|_D$. For each $t_0 \in [0,1]$ and $f \in E[0,1]$, we have

$$|f(t_0)| = E(|f(t_0)|\, Z_{t_0}) \le \|f\|_D \le E\left(\|f\|_\infty\, \|Z\|_\infty\right) = \|f\|_\infty\, \|1\|_D,$$

where $1$ denotes the constant function one, and the triangle inequality yields

$$\|f\|_D \le \|f - g\|_D + \|g\|_D.$$
Proof. Choose $\varepsilon \in (0,1)$ and put $f_\varepsilon(\cdot) := \mathbf{1}_{[0,\varepsilon]}(\cdot) \in E[0,1]$. Then, $\|f_\varepsilon\|_\infty = 1 > \varepsilon^{1/p} = \|f_\varepsilon\|_p$. The $L_p$-norm, therefore, does not satisfy the first inequality in Lemma 1.10.2.
$$Z_t := \exp\left(B_t - \frac{t}{2}\right), \qquad t \in [0, 1], \tag{1.26}$$

where $B := (B_t)_{t\ge 0}$ is a standard Brownian motion on $[0,\infty)$. The corresponding max-stable process is a Brown–Resnick process (Brown and Resnick (1977)); see Section 4.2.
The characteristic properties of a standard Brownian motion $B$ are that it realizes in $C[0,1]$, $B_0 = 0$, and that the increments $B_t - B_s$ are independent and $N(0, t-s)$-distributed rvs with mean zero and variance $t - s$, formulated a little loosely. As a consequence, each $B_t$ with $t > 0$ follows the normal distribution $N(0, t)$. We have, therefore,

$$E(Z_t) = \exp\left(-\frac{t}{2}\right) E(\exp(B_t)) = \exp\left(-\frac{t}{2}\right) E\left(\exp\left(t^{1/2}\, \frac{B_t}{t^{1/2}}\right)\right)$$
$$= \exp\left(-\frac{t}{2}\right) \int_{-\infty}^\infty \exp\left(t^{1/2} x\right) \frac{1}{(2\pi)^{1/2}} \exp\left(-\frac{x^2}{2}\right) dx = \int_{-\infty}^\infty \frac{1}{(2\pi)^{1/2}} \exp\left(-\frac{(x - t^{1/2})^2}{2}\right) dx = 1, \tag{1.27}$$

as $\exp\left(-(x - t^{1/2})^2/2\right)/(2\pi)^{1/2}$, $x \in \mathbb{R}$, is the density of the normal $N(t^{1/2}, 1)$-distribution.
It is well known that, for $x \ge 0$,

$$P\left(\sup_{t\in[0,1]} B_t > x\right) = 2\, P(B_1 > x),$$

which is called the reflection principle for the standard Brownian motion; see, for example, Revuz and Yor (1999, Proposition 3.7). From this equation and the representation of the expectation of an rv in Lemma 1.2.2, we obtain

$$E\left(\sup_{t\in[0,1]} Z_t\right) \le E\left(\sup_{t\in[0,1]} \exp(B_t)\right) = E\left(\exp\left(\sup_{t\in[0,1]} B_t\right)\right) = \int_0^\infty P\left(\exp\left(\sup_{t\in[0,1]} B_t\right) > x\right) dx$$
$$\le 1 + \int_1^\infty P\left(\sup_{t\in[0,1]} B_t > \log(x)\right) dx = 1 + 2\int_1^\infty P(B_1 > \log(x))\, dx$$
$$\le 1 + 2\int_0^\infty P(\exp(B_1) > x)\, dx = 1 + 2\, E(\exp(B_1)) < \infty,$$
$$\Sigma = \begin{pmatrix} \operatorname{Var}(B_s) & \operatorname{Cov}(B_s, B_t) \\ \operatorname{Cov}(B_s, B_t) & \operatorname{Var}(B_t) \end{pmatrix} = \begin{pmatrix} s & s \\ s & t \end{pmatrix}.$$
Lemma 1.10.6 can be extended to zero-mean Gaussian processes with stationary increments; see Kabluchko et al. (2009, Remark 24). For the trivariate case, we refer to Huser and Davison (2013).
Proof (of Lemma 1.10.6). We provide quite an elementary proof, which uses the independence of the increments of a Brownian motion. We have, for $0 \le s < t$ and $x, y > 0$,

$$\|(x,y)\|_{D_{s,t}} = E(\max(x Z_s,\, y Z_t)) = E\left(\max\left(x \exp\left(B_s - \frac{s}{2}\right),\, y \exp\left(B_t - \frac{t}{2}\right)\right)\right)$$
$$= x \exp\left(-\frac{s}{2}\right) E\left(\exp(B_s)\, \mathbf{1}\left\{B_t - \frac{t}{2} + \log(y) \le B_s - \frac{s}{2} + \log(x)\right\}\right) + y \exp\left(-\frac{t}{2}\right) E\left(\exp(B_t)\, \mathbf{1}\left\{B_t - \frac{t}{2} + \log(y) > B_s - \frac{s}{2} + \log(x)\right\}\right)$$
$$= x \exp\left(-\frac{s}{2}\right) E\left(\exp(B_s)\, \mathbf{1}\left\{B_t - B_s \le \frac{t-s}{2} + \log\frac{x}{y}\right\}\right) + y \exp\left(-\frac{t}{2}\right) E\left(\exp(B_t)\, \mathbf{1}\left\{B_t - B_s > \frac{t-s}{2} + \log\frac{x}{y}\right\}\right)$$
$$=: x \exp\left(-\frac{s}{2}\right) I + y \exp\left(-\frac{t}{2}\right) II.$$

By the independence of $B_s$ and $B_t - B_s$,

$$I = E(\exp(B_s))\, P\left(B_t - B_s \le \frac{t-s}{2} + \log\frac{x}{y}\right) = \exp\left(\frac{s}{2}\right) \int_{-\infty}^{(t-s)/2 + \log(x/y)} \frac{1}{\sqrt{t-s}}\, \varphi\left(\frac{u}{\sqrt{t-s}}\right) du$$
$$= \exp\left(\frac{s}{2}\right) \int_{-\infty}^{\left((t-s)/2 + \log(x/y)\right)/\sqrt{t-s}} \varphi(u)\, du = \exp\left(\frac{s}{2}\right) \Phi\left(\frac{\sqrt{t-s}}{2} + \frac{\log(x/y)}{\sqrt{t-s}}\right),$$

where $\varphi$ denotes the standard normal density on the real line. By repeating the above arguments, we obtain

$$II = E\left(\exp(B_s) \exp(B_t - B_s)\, \mathbf{1}\left\{B_t - B_s > \frac{t-s}{2} + \log\frac{x}{y}\right\}\right) = E(\exp(B_s))\, E\left(\exp(B_t - B_s)\, \mathbf{1}\left\{B_t - B_s > \frac{t-s}{2} + \log\frac{x}{y}\right\}\right)$$
$$= \exp\left(\frac{s}{2}\right) \int_{(t-s)/2 + \log(x/y)}^\infty \exp(u)\, \frac{1}{\sqrt{t-s}}\, \varphi\left(\frac{u}{\sqrt{t-s}}\right) du.$$

The equation

$$\exp(u)\, \varphi\left(\frac{u}{\sqrt{t-s}}\right) = \frac{1}{\sqrt{2\pi}} \exp(u) \exp\left(-\frac{u^2}{2(t-s)}\right) = \frac{1}{\sqrt{2\pi}} \exp\left(-\frac{(u - (t-s))^2}{2(t-s)}\right) \exp\left(\frac{t-s}{2}\right) = \exp\left(\frac{t-s}{2}\right) \varphi\left(\frac{u - (t-s)}{\sqrt{t-s}}\right)$$

implies

$$II = \exp\left(\frac{t}{2}\right) \int_{(t-s)/2 + \log(x/y)}^\infty \frac{1}{\sqrt{t-s}}\, \varphi\left(\frac{u - (t-s)}{\sqrt{t-s}}\right) du = \exp\left(\frac{t}{2}\right) \int_{\left(\log(x/y) - (t-s)/2\right)/\sqrt{t-s}}^\infty \varphi(u)\, du$$
$$= \exp\left(\frac{t}{2}\right)\left(1 - \Phi\left(\frac{\log(x/y)}{\sqrt{t-s}} - \frac{\sqrt{t-s}}{2}\right)\right) = \exp\left(\frac{t}{2}\right) \Phi\left(\frac{\sqrt{t-s}}{2} + \frac{\log(y/x)}{\sqrt{t-s}}\right)$$

by appropriate elementary substitutions and the equation $1 - \Phi(u) = \Phi(-u)$, $u \in \mathbb{R}$. The assertion is now a consequence of the equation

$$x \exp\left(-\frac{s}{2}\right) I + y \exp\left(-\frac{t}{2}\right) II = x\, \Phi\left(\frac{\sqrt{t-s}}{2} + \frac{\log(x/y)}{\sqrt{t-s}}\right) + y\, \Phi\left(\frac{\sqrt{t-s}}{2} + \frac{\log(y/x)}{\sqrt{t-s}}\right).$$
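The closed form of Lemma 1.10.6 can be cross-checked by simulating the Brownian increments directly; in the sketch below (helper names are ours), $\Phi$ is evaluated via the error function:

```python
import math
import random

def Phi(z):
    """Standard normal df via erf."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def hr_formula(x, y, s, t):
    """Closed form of ||(x,y)||_{D_{s,t}} from Lemma 1.10.6."""
    r = math.sqrt(t - s)
    return (x * Phi(r / 2 + math.log(x / y) / r)
            + y * Phi(r / 2 + math.log(y / x) / r))

def hr_mc(x, y, s, t, n=100000, seed=5):
    """Monte Carlo E(max(x Z_s, y Z_t)) for Z_u = exp(B_u - u/2)."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n):
        bs = math.sqrt(s) * rng.gauss(0.0, 1.0)
        bt = bs + math.sqrt(t - s) * rng.gauss(0.0, 1.0)   # independent increment
        total += max(x * math.exp(bs - s / 2), y * math.exp(bt - t / 2))
    return total / n

x, y, s, t = 1.0, 2.0, 0.25, 1.0
assert abs(hr_mc(x, y, s, t) - hr_formula(x, y, s, t)) < 0.08
```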
As in the multivariate case in Corollary 1.6.3, the value of the dual expression $E\left(\inf_{t\in[0,1]}(|f(t)|\, Z_t)\right)$ does not depend on the particular process $Z = (Z_t)_{t\in[0,1]}$ that generates the functional D-norm $\|\cdot\|_D$.
Lemma 1.10.7 Let $Z = (Z_t)_{t\in[0,1]}$ and $\tilde{Z} = (\tilde{Z}_t)_{t\in[0,1]}$ be two generators of the functional D-norm $\|\cdot\|_D$. Then,

$$E\left(\inf_{t\in[0,1]}(|f(t)|\, Z_t)\right) = E\left(\inf_{t\in[0,1]}\left(|f(t)|\, \tilde{Z}_t\right)\right), \qquad f \in E[0,1].$$

Proof. Choose a dense sequence $t_1, t_2, \dots$ in $[0,1]$; then

$$\inf_{t\in[0,1]}(|f(t)|\, Z_t) = \lim_{n\to\infty} \min_{1\le i\le n}(|f(t_i)|\, Z_{t_i})$$

and

$$\inf_{t\in[0,1]}\left(|f(t)|\, \tilde{Z}_t\right) = \lim_{n\to\infty} \min_{1\le i\le n}\left(|f(t_i)|\, \tilde{Z}_{t_i}\right).$$
But for each $n \in \mathbb{N}$, $(Z_{t_i})_{i=1}^n$ and $\left(\tilde{Z}_{t_i}\right)_{i=1}^n$ are generators of the same D-norm $\|\cdot\|_{D_{t_1,\dots,t_n}}$ on $\mathbb{R}^n$, as they satisfy, for $x = (x_1, \dots, x_n) \in \mathbb{R}^n$,

$$E\left(\max_{1\le i\le n}(|x_i|\, Z_{t_i})\right) = E\left(\sup_{t\in[0,1]}(|f_x(t)|\, Z_t)\right) = E\left(\sup_{t\in[0,1]}\left(|f_x(t)|\, \tilde{Z}_t\right)\right)$$

with

$$f_x(t) := \begin{cases} x_i, & \text{if } t = t_i, \\ 0, & \text{elsewhere} \end{cases} \;=\; \sum_{i=1}^n x_i\, \mathbf{1}_{\{t_i\}}(t), \qquad t \in [0,1], \tag{1.29}$$
Note that we use a capital letter S in the index for such a seminorm, which is defined by a generator $Z$. The above definition is quite close to that of a D-norm in Lemma 1.1.3; the difference is that $\|x\|_S = 0$ does not necessarily imply $x = 0$ in (1.30), as we allow $E(Z_j) = 0$ for some $j \in \{1,\dots,d\}$, i.e., $Z_j = 0$ a.s. In this case, we obtain for the unit vector $e_j = (0, \dots, 0, 1, 0, \dots, 0) \in \mathbb{R}^d$

$$\|e_j\|_S = E(Z_j) = 0.$$

The seminorm $\|\cdot\|_S$ is, consequently, a norm iff $E(Z_j) > 0$ for all $j = 1, \dots, d$, with the special case of it being a D-norm iff $E(Z_j) = 1$ for all $j$.
Ressel (2013, Theorem 1) characterized the set of seminorms as defined in (1.30) when the generators $Z$ realize in $S = \left\{x \ge 0 \in \mathbb{R}^d : \|x\|_\infty = 1\right\}$. This characterization is achieved in the setup of functional analysis. The set of seminorms turns out to be a Bauer simplex, whose extremal elements are the seminorms with a constant generator. The aim of this section is to extend this characterization in Theorem 1.11.19 to the case where the generators $Z$ all realize in an angular set, defined below.

As a consequence of our considerations, we show in particular in Proposition 1.11.20 that the set of D-norms whose generators follow a discrete distribution on the set $S_d = \left\{x \ge 0 \in \mathbb{R}^d : \|x\|_1 = d\right\}$ is a dense subset of the set of all D-norms.

Before we can present the results, we have to introduce various definitions and auxiliary results.
An angular set in $\mathbb{R}^2$ is, for example, $S_1 := \left\{(u, 1-u^2) : u \in [0,1]\right\}$; see the discussion after Theorem 1.7.13. The set

$$S_2 := \{(u, 1/u) : u > 0\} \cup \{(0,1), (1,0)\}$$

is not an angular set in $\mathbb{R}^2$ in the sense of Definition 1.11.2 as it is not compact. The set

$$S_3 := \{(u, 1-u) : u \in [0, 1/2]\}$$

is an angular set in $\mathbb{R}^2$, but not a complete one as in Definition 1.7.12, since not every $(x,y) > 0 \in \mathbb{R}^2$ can be represented as $(x,y) = \lambda(u, 1-u)$ with some $\lambda > 0$ and some $u \in [0, 1/2]$. The vector $(3/4, 1/4)$, for example, cannot be represented this way.

Repeating the arguments in the proof of Proposition 1.4.1 yields the convexity of the set $K_S$.
Lemma 1.11.3 The set $K_S$ is convex, i.e., if $\|\cdot\|_{S,1}, \|\cdot\|_{S,2}$ are seminorms in $K_S$, then $\lambda\, \|\cdot\|_{S,1} + (1-\lambda)\, \|\cdot\|_{S,2} \in K_S$ for any $\lambda \in [0,1]$ as well.
$$\nu(E \setminus ([0,\infty)\cdot S)) := 0$$

$$\nu\left(\left([0, x]\right)^\complement\right) = E\left(\max_{1\le j\le d} \frac{Z_j}{x_j}\right) = \left\|\left(\frac{1}{x_1}, \dots, \frac{1}{x_d}\right)\right\|_S. \tag{1.33}$$
Proof. Clearly, we only have to prove the implication "⇒." It can be seen by induction as follows. Suppose equation (1.34) is true for $n \ge 2$; it is true for $n = 2$ by the convexity of $K$. Choose $\lambda_1, \dots, \lambda_{n+1} \in [0,1]$ with $\sum_{i=1}^{n+1} \lambda_i = 1$ and $x_1, \dots, x_{n+1} \in K$. We can assume wlog that $\sum_{i=1}^n \lambda_i > 0$. Then, we obtain

$$\sum_{i=1}^{n+1} \lambda_i x_i = \lambda_{n+1} x_{n+1} + \sum_{i=1}^n \lambda_i x_i = \lambda_{n+1} x_{n+1} + \left(\sum_{j=1}^n \lambda_j\right) \sum_{i=1}^n \frac{\lambda_i}{\sum_{j=1}^n \lambda_j}\, x_i$$
$$= \lambda_{n+1} x_{n+1} + (1 - \lambda_{n+1}) \sum_{i=1}^n \frac{\lambda_i}{\sum_{j=1}^n \lambda_j}\, x_i \in K$$

by induction.
Barycentric Coordinates

Let $K \subset \mathbb{R}^d$ be a convex and compact set whose extremal set $\operatorname{ex}(K) = \{x_1, \dots, x_n\}$ is a set of $n$ distinct vectors in $\mathbb{R}^d$. For any $x \in K$ there exists, according to Theorem 1.11.6, a vector $w = (w_1, \dots, w_n) = w(x)$ of weights $w_1, \dots, w_n \in [0,1]$ with $\sum_{i=1}^n w_i = 1$, such that

$$x = \sum_{i=1}^n w_i x_i.$$

$$\left(\frac{1}{2}, \frac{1}{2}\right) = \frac{1}{2}(0,0) + \frac{1}{2}(1,1) = \frac{1}{2}(0,1) + \frac{1}{2}(1,0).$$

In this example, we have two vectors of generalized barycentric coordinates of $(1/2, 1/2)$: $(1/2, 0, 1/2, 0)$ and $(0, 1/2, 0, 1/2)$.
Each vector of barycentric coordinates $w = w(x) = (w_1, \dots, w_n)$ for a fixed $x \in K$ with corresponding extremal points $x_1, \dots, x_n$ can be interpreted as a discrete probability measure $Q_w$ on the set $\operatorname{ex}(K)$ of the extremal points of $K$:

$$Q_w(B) = \sum_{i=1}^n w_i\, \varepsilon_{x_i}(B), \qquad B \subset \operatorname{ex}(K),$$

where $\varepsilon_z(\cdot)$ is the Dirac measure or point measure with mass one at $z$, i.e., $\varepsilon_z(B) = 1$ if $z \in B$ and zero elsewhere.

As a consequence, we can write for any linear affine functional $f: K \to \mathbb{R}$

$$f(x) = \int_{\operatorname{ex}(K)} f(x')\, Q_w(dx');$$

recall that $x$ is kept fixed. This representation can easily be seen as follows. Each linear affine functional $f: K \to \mathbb{R}$ can be written as $f(\cdot) = \ell(\cdot) + b$, where $\ell$ is a linear function and $b \in \mathbb{R}$ is a fixed real number. We therefore obtain

$$f(x) = \ell\left(\sum_{i=1}^n w_i x_i\right) + b = \sum_{i=1}^n w_i\, \ell(x_i) + b = \sum_{i=1}^n w_i (\ell(x_i) + b) = \sum_{i=1}^n w_i\, f(x_i) = \int_{\operatorname{ex}(K)} f(x')\, Q_w(dx'). \tag{1.36}$$
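The non-uniqueness of barycentric coordinates, together with the representation (1.36), can be checked numerically on the unit-square example (helper names are ours):

```python
def affine(point, coeffs, b):
    """Linear affine functional f(x) = <coeffs, x> + b."""
    return sum(c * xi for c, xi in zip(coeffs, point)) + b

def barycentric_value(f_vals, weights):
    """Integral of f over ex(K) w.r.t. Q_w = sum_i w_i * dirac(x_i)."""
    return sum(w * v for w, v in zip(weights, f_vals))

# unit square: extremal points are the four corners
corners = [(0.0, 0.0), (1.0, 1.0), (0.0, 1.0), (1.0, 0.0)]
x = (0.5, 0.5)
w1 = (0.5, 0.5, 0.0, 0.0)       # (1/2,1/2) = 1/2*(0,0) + 1/2*(1,1)
w2 = (0.0, 0.0, 0.5, 0.5)       # (1/2,1/2) = 1/2*(0,1) + 1/2*(1,0)

coeffs, b = (2.0, -3.0), 0.7
f_vals = [affine(c, coeffs, b) for c in corners]
fx = affine(x, coeffs, b)

# (1.36): f(x) = int f dQ_w for EVERY vector of barycentric coordinates,
# even though the coordinates themselves are not unique
assert abs(barycentric_value(f_vals, w1) - fx) < 1e-12
assert abs(barycentric_value(f_vals, w2) - fx) < 1e-12
```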
$$V := \mathbb{R}^{\mathbb{N}} = \{x = (x_1, x_2, \dots) : x_i \in \mathbb{R},\ i \in \mathbb{N}\}$$

Example 1.11.10 The space $V_d := \left\{f : \mathbb{R}^d \to \mathbb{R}\right\}$ of real-valued functions on $\mathbb{R}^d$ is a vector space, equipped with the usual componentwise operations. By defining for $x \in \mathbb{R}^d$

$$\|f\|_{s,x} := |f(x)|, \qquad f \in V_d,$$

we obtain a family $\left\{\|\cdot\|_{s,x} : x \in \mathbb{R}^d\right\}$ of seminorms on $V_d$, indexed by $I = \mathbb{R}^d$.
We define convergence of a sequence fk , k ∈ N, to f in V by
$$K = \overline{\operatorname{conv}(\operatorname{ex}(K))}.$$

Note that, because of the closure in the previous result, it is not guaranteed that every $x \in K$ is a convex combination of extremal elements of $K$.
$$\|(1,\dots,1)\|_S = E\left(\max_{1\le j\le d} Z_j\right) \le c.$$
For every $x \in \mathbb{R}^d$ and every $\varepsilon > 0$ there exists $y \in \mathbb{Q}^d$ with $\|x - y\|_\infty < \varepsilon$. As a consequence, we obtain by the triangle inequality, for any $n \in \mathbb{N}$,

$$\|y\|_{S,n} - \varepsilon c \le \|x\|_{S,n} \le \|y\|_{S,n} + \varepsilon c,$$

as well as

$$\|y\|_{S,0} - \varepsilon c \le \|x\|_{S,0} \le \|y\|_{S,0} + \varepsilon c.$$

But this yields

$$\limsup_{n\to\infty} \left|\, \|x\|_{S,n} - \|x\|_{S,0} \right| \le 2\varepsilon c.$$

This is equivalent to

$$\max_{1\le j\le d}(|x_j|\, z_j) = E\left(\max_{1\le j\le d}(|x_j|\, Z_j)\right), \qquad x \in \mathbb{R}^d,$$
$$E\Big(\max_{1\le j\le d} |x_j|\, Z_j\Big) = \lambda\, E\Big(\max_{1\le j\le d} |x_j|\, Z_j^{(1)}\Big) + (1 - \lambda)\, E\Big(\max_{1\le j\le d} |x_j|\, Z_j^{(2)}\Big),$$
with λ := P (Z ∈ A) ∈ (0, 1). Note that the distributions of Z (1) and Z (2) are
different. The seminorm generated by Z is, therefore, not extremal.
Introducing a Homeomorphism
The functional $T : S \to ex(K_S)$ maps each $z \in S$ onto the seminorm $\|\cdot\|_{S,z} \in K_S$ with constant generator $Z = z = (z_1, \dots, z_d) \in S$. The functional $T$, as well as its inverse, is continuous; $T$ is, therefore, a homeomorphism. It maps the Euclidean topology on
S one-to-one onto the topology defined on ex(KS ), which is the topology of
pointwise convergence. It can be metrized as in (1.39). We state this relation-
ship explicitly.
The next result was established by Ressel (2013, Theorem 1) for the complete angular set $S = \{x \in [0,1]^d : \|x\|_\infty = 1\}$. Its extension to an arbitrary angular set was proved by Fuller (2016).
Proof. The set KS is, according to Lemmas 1.11.3 and 1.11.16, a convex and
compact subset of Vd , which is a locally convex vector space, as shown in
Example 1.11.10. According to equation (1.40), the set ex(KS ) is closed. In
order to prove that KS is a Bauer simplex, it remains to show that, for every
element ·S ∈ KS , the probability measure Q · S on ex(KS ), defined in the
Choquet–Bishop–de Leeuw theorem 1.11.12, is uniquely determined.
Choose ·S ∈ KS and let Q · S be a probability measure on ex(KS ) =
ex(KS ) that satisfies
$$f(\|\cdot\|_S) = \int_{ex(K_S)} f\big(\|\cdot\|_{S,z}\big)\, Q_{\|\cdot\|_S}\big(d\|\cdot\|_{S,z}\big)$$
According to Lemma 1.11.18, we can identify ex(KS ) with S and their topolo-
gies as well. As a consequence, the probability measure Q · S on the Borel
σ-field of ex(KS ) can be identified with a probability measure σ on the Borel
σ-field of S. Equation (1.41), therefore, becomes
$$\|x\|_S = \int_S \|x\|_{S,z}\, \sigma(dz) = \int_S \max_{1\le j\le d}(|x_j|\, z_j)\, \sigma(dz)$$
$$\|\cdot\|_{S,n} := \sum_{i=1}^{m(n)} w_{i,n}\, \|\cdot\|_{S,i,n}, \qquad n \in \mathbb N,$$
$$P_n(\cdot) := \sum_{i=1}^{m(n)} w_{i,n}\, \varepsilon_{z_{i,n}}(\cdot).$$
Let $Z^{(n)} := (Z_1^{(n)}, \dots, Z_d^{(n)}) \in S$ be an rv which follows this discrete probability measure $P_n$ with support $\{z_{1,n}, \dots, z_{m(n),n}\}$, $n \in \mathbb N$. The rv $Z^{(n)}$ generates the seminorm $\|\cdot\|_{S,n}$, since for every $x \in \mathbb R^d$, with $z_{i,n} = (z_{i,n,1}, \dots, z_{i,n,d})$, we have
$$E\Big(\max_{1\le j\le d} |x_j|\, Z_j^{(n)}\Big) = \sum_{i=1}^{m(n)} \max_{1\le j\le d}(|x_j|\, z_{i,n,j})\, P\big(Z^{(n)} = z_{i,n}\big) = \sum_{i=1}^{m(n)} \|x\|_{S,i,n}\, w_{i,n} = \|x\|_{S,n}.$$
and each summand on the left-hand side of the above equation is non-negative.
Let δj be the probability measure on Sd that puts mass one on the vector
dej , 1 ≤ j ≤ d. Then,
$$Q_n := \frac{1}{B_n}\, P_n + \sum_{j=1}^d \frac{B_n - \beta_j^{(n)}}{d\, B_n}\, \delta_j, \qquad n \in \mathbb N,$$
for each j ∈ {1, . . . , d}, and therefore, Z̃ (n) generates a D-norm, say, ·D,n .
The convergence in (1.45), together with Lemma 1.2.2, implies, for $x \in \mathbb R^d$,
$$\begin{aligned}
\|x\|_{D,n} &= E\Big(\max_{1\le j\le d} |x_j|\, \tilde Z_j^{(n)}\Big)\\
&= \int_0^\infty 1 - P\big(|x_j|\, \tilde Z_j^{(n)} \le t,\ 1\le j\le d\big)\, dt\\
&= \int_0^{d \max_{1\le j\le d} |x_j|} 1 - P\big(|x_j|\, \tilde Z_j^{(n)} \le t,\ 1\le j\le d\big)\, dt\\
&= \int_0^{d \max_{1\le j\le d} |x_j|} 1 - P\big(|x_j|\, Z_j^{(n)} \le t,\ 1\le j\le d\big)\, dt + o(1)\\
&= E\Big(\max_{1\le j\le d} |x_j|\, Z_j^{(n)}\Big) + o(1)
\end{aligned}$$
$$\langle x, y\rangle := \sum_{i=1}^d x_i y_i \in \mathbb R,$$
Note that
$$\langle x, x\rangle^{1/2} = \Big(\sum_{i=1}^d x_i^2\Big)^{1/2} = \|x\|_2.$$
We see that the inner product $\langle x, y\rangle$ of $x$ and $y$ is just the coordinate $s_0$ of the orthogonal projection of $y$ onto the line $L_x$. If $x$ has arbitrary length $\|x\|_2 > 0$, then $\langle x, y\rangle = s_0\, \|x\|_2^2$.
[Figure: orthogonal projection of $y$ onto the line $L_x$; the projection point is $s_0 x = \langle x, y\rangle\, x$, and $y - s_0 x$ is orthogonal to $L_x$.]
which defines the support function h(L, ·) of L. The support function is one
of the most central basic concepts in convex geometry.
A convex and compact set L ⊂ Rd is uniquely determined by its support
function h(L, ·). This is a consequence of the next result. Put, for x ∈ Rd ,
$$H_L(x) := \big\{y \in \mathbb R^d : \langle y, x\rangle \le h(L, x)\big\},$$
Proof. Each $y \in L$ satisfies $\langle y, x\rangle \le h(L, x)$, and thus, $L \subset H_L(x)$ for each $x \in \mathbb R^d$, i.e.,
$$L \subset \bigcap_{x \in \mathbb R^d} H_L(x).$$
Choose $z \in \mathbb R^d$, $z \notin L$. It is well known that $z$ and $L$ can be separated in the following way: we can find $x \in \mathbb R^d$, $x \ne 0$, such that, for all $y \in L$,
$$\langle y, x\rangle < \langle z, x\rangle.$$
This is the hyperplane separation theorem; see, for example, Rockafellar (1970,
Corollary 11.4.2.). As g(·) := ·, x is a continuous function on Rd and L ⊂ Rd
$$0 = \|x\|_S \ge \sum_{i=1}^d y_i\, |x_i| \ge 0,$$
thus, $\|x\|_1 = 0$, i.e., $x = 0 \in \mathbb R^d$.
Suppose next that $\|\cdot\|_S$ is a norm on $\mathbb R^d$. We have to show $K \cap (0, \infty)^d \ne \emptyset$. As $\|\cdot\|_S$ is a norm, we have $\|e_j\|_S > 0$ for each $j = 1, \dots, d$, i.e., there exists $y_j \in K$ whose $j$-th component is strictly positive. Since $K$ is convex, the vector $y := \sum_{j=1}^d y_j / d$ is in $K$ as well, and it is in $(0, \infty)^d$.
i.e., the convex and compact set that generates the norm $\|\cdot\|_p$ as in equation (1.46) is $K_q = \{y \ge 0 \in \mathbb R^d : \|y\|_q \le 1\}$.
$$\sum_{i=1}^d x_i y_i \le \sum_{i=1}^d x_i = \|x\|_1,$$
and equality holds for y = (1, . . . , 1) ∈ Rd . This proves (1.47) for the combi-
nation p = 1, q = ∞.
For $p = \infty$ and $q = 1$, we obtain, for every $y \ge 0 \in \mathbb R^d$ with $\|y\|_1 \le 1$,
$$\sum_{i=1}^d x_i y_i \le \|x\|_\infty \sum_{i=1}^d y_i = \|x\|_\infty\, \|y\|_1 \le \|x\|_\infty,$$
and equality holds for the choice y = ei∗ , where i∗ ∈ {1, . . . , d} is that index
with xi∗ = max(x1 , . . . , xd ) = x∞ . This proves (1.47) for the combination
p = ∞, q = 1.
Finally, we consider $p, q \in (1, \infty)$ with $p^{-1} + q^{-1} = 1$. We obtain, with $x^* := (x_1, \dots, x_d)/\|x\|_p$,
$$\sup\Big\{\sum_{i=1}^d x_i y_i : y \ge 0 \in \mathbb R^d,\ \|y\|_q \le 1\Big\} = \|x\|_p \sup\Big\{\sum_{i=1}^d \frac{x_i}{\|x\|_p}\, y_i : y \ge 0 \in \mathbb R^d,\ \|y\|_q \le 1\Big\} = \|x\|_p \sup\Big\{\sum_{i=1}^d x_i^* y_i : y \ge 0 \in \mathbb R^d,\ \|y\|_q \le 1\Big\}.$$
Hölder's inequality implies
$$\sum_{i=1}^d x_i^* y_i \le \|x^*\|_p\, \|y\|_q = \|y\|_q \le 1;$$
therefore, it is sufficient to find $y \in K_q$ such that $\sum_{i=1}^d x_i^* y_i = 1$. We have equality in Hölder's inequality if
$$x_i^{*\,p} = y_i^q, \qquad 1 \le i \le d.$$
Therefore, put
$$y_i := x_i^{*\,p/q}, \qquad 1 \le i \le d.$$
Then, we obtain
$$\sum_{i=1}^d x_i^* y_i = \|x^*\|_p\, \|y\|_q = \|y\|_q$$
with
$$\|y\|_q = \Big(\sum_{i=1}^d x_i^{*\,p}\Big)^{1/q} = \|x^*\|_p^{p/q} = 1.$$
$$h(L(K_c), x) = \|x\|_1, \qquad x \in \mathbb R^d.$$
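The duality just derived can be checked numerically. The following sketch (with illustrative values; `pnorm` is a hypothetical helper name) verifies that the supremum of $\sum_i x_i y_i$ over $y \ge 0$ with $\|y\|_q \le 1$ equals $\|x\|_p$, and that it is attained at $y_i = x_i^{*\,p/q}$.

```python
def pnorm(v, p):
    # plain p-norm of a vector
    return sum(abs(t) ** p for t in v) ** (1.0 / p)

x = (3.0, 1.0, 2.0)          # illustrative values, not from the book
p, q = 3.0, 1.5              # conjugate exponents: 1/3 + 1/1.5 = 1
xs = [t / pnorm(x, p) for t in x]      # x* = x / ||x||_p
y = [t ** (p / q) for t in xs]         # maximizer from the equality condition
value = sum(a * b for a, b in zip(x, y))
print(pnorm(y, q), value, pnorm(x, p))
```

The printed values show that $y$ lies on the boundary of $K_q$ and that the supremum $\|x\|_p$ is attained.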
This shows that a convex and compact subset of $[0, \infty)^d$ whose support function generates a norm is not uniquely determined by this norm. Note that $L(K_c) = \{y \in \mathbb R^d : |y| = \lambda(1, \dots, 1),\ c \le \lambda \le 1\}$ is not a convex set if $d \ge 2$, for any $c \in [0, 1)$.
If we put, however, $K := [0,1]^d$, then $K$ is convex and compact, $K \cap (0, \infty)^d \ne \emptyset$, and $L(K) = [-1, 1]^d$ is convex as well. The norm $\|\cdot\|_K$ that is generated by $K$ is again $\|\cdot\|_1$, but now $K$ is uniquely determined: it is the only convex and compact subset of $[0, \infty)^d$ with $K \cap (0, \infty)^d \ne \emptyset$ generating $\|\cdot\|_1$ such that $L(K)$ is convex. This is a consequence of our preceding considerations, summarized in the next result.
summarized in the next result.
Lemma 1.12.6 Let $K \subset [0, \infty)^d$ be a convex and compact set with $K \cap (0, \infty)^d \ne \emptyset$. If $L(K)$ is a convex set, then $K$ is uniquely determined by the generated norm $\|\cdot\|_K$.
$$h(L(K), x) = \|x\|_K.$$
This equation identifies the set $L(K)$ according to Corollary 1.12.2. But $L(K) = \{y \in \mathbb R^d : |y| \in K\}$ identifies the set $K$.
Cross-Polytopes
For $z = (z_1, \dots, z_d) \ge 0 \in \mathbb R^d$, put
$$\Delta_z := \mathrm{conv}(\{0, z_1 e_1, \dots, z_d e_d\}) = \Big\{\sum_{i=1}^d \lambda_i z_i e_i : \lambda_1, \dots, \lambda_d \ge 0,\ \sum_{i=1}^d \lambda_i \le 1\Big\}, \tag{1.49}$$
$$x = \sum_{i=1}^d \lambda_i z_i e_i, \qquad y = \sum_{i=1}^d \kappa_i z_i e_i,$$
with $\sum_{i=1}^d |\lambda_i| \le 1$, $\sum_{i=1}^d |\kappa_i| \le 1$. We obtain, for $\vartheta \in [0, 1]$,
$$\vartheta x + (1 - \vartheta) y = \sum_{i=1}^d \big(\vartheta \lambda_i + (1 - \vartheta)\kappa_i\big) z_i e_i$$
with
$$\sum_{i=1}^d \big|\vartheta \lambda_i + (1 - \vartheta)\kappa_i\big| \le \vartheta \sum_{i=1}^d |\lambda_i| + (1 - \vartheta) \sum_{i=1}^d |\kappa_i| \le 1,$$
Introducing Max-Zonoids
Let Z = (Z1 , . . . , Zd ) ≥ 0 ∈ Rd be an rv with the property E(Zi ) ∈ (0, ∞),
1 ≤ i ≤ d. Then,
Example 1.12.10 Each logistic norm $\|\cdot\|_p$, $p \in [1, \infty]$, is, according to Proposition 1.2.1, a D-norm. Lemma 1.12.5 shows that each $\|\cdot\|_p$ is generated by the D-max-zonoid $K_q = \{y \ge 0 \in \mathbb R^d : \|y\|_q \le 1\}$, where $1/p + 1/q = 1$.
A Random Cross-Polytope
The obvious question When is a convex and compact set K a max-zonoid?
was answered by Molchanov (2008). The answer is given within the framework
of stochastic geometry.
Let Z = (Z1 , . . . , Zd ) ≥ 0 ∈ Rd be a rv with E(Zi ) ∈ (0, ∞), 1 ≤ i ≤ d.
Then,
$$\Delta_Z = \mathrm{conv}(\{0, Z_1 e_1, \dots, Z_d e_d\}) = \Big\{\sum_{i=1}^d \lambda_i Z_i e_i : \lambda_1, \dots, \lambda_d \ge 0,\ \sum_{i=1}^d \lambda_i \le 1\Big\}, \tag{1.50}$$
thus,
see equation (1.48). Note that the set L(ΔZ ) is, according to Lemma 1.12.7,
convex as well.
The preceding observation raises the idea that the random cross-polytopes
ΔZ play a major role when answering the question When is a convex and
compact set a max-zonoid? posed earlier in this section. This is actually true;
see Corollary 1.12.17, which characterizes max-zonoids.
has finite expectation, i.e., $E(\|X\|_1) < \infty$. A random closed set $X$ with this property is called integrably bounded.
At this point, we ignore the precise definition of a proper σ-field on F
such that X1 is a Borel-measurable rv. Instead, we refer to Molchanov
(2005, Section 1.2.1).
If X is a random closed set that is integrably bounded, then X is bounded
with probability one, and thus, it is compact with probability one. In what
follows, we assume that X is an integrably bounded closed and convex subset
of [0, ∞)d ; thus, it is in particular compact with probability one.
The proper definition of the expectation E(X) of a random set X, given
below, is crucial.
$$\|\xi\|_1 = \sum_{i=1}^d \xi_i \le \|X\|_1,$$
thus,
$$E(\|\xi\|_1) = \sum_{i=1}^d E(\xi_i) \le E(\|X\|_1) < \infty,$$
i.e., each component $\xi_i$ of $\xi \in S(X)$ has finite expectation $E(\xi_i) < \infty$. Recall that $\xi_i \ge 0$. The selection expectation of $X$ is now the set
$$E(X) := \overline{\{E(\xi) : \xi \in S(X)\}} \subset [0, E(\|X\|_1)]^d.$$
Proof. Since $E(X)$ is a closed and bounded subset of $[0, \infty)^d$, it is compact. It remains to show that it is convex as well. For each $y^{(1)}, y^{(2)} \in E(X)$, there exist sequences $\xi_n^{(1)}, \xi_n^{(2)} \in S(X)$, $n \in \mathbb N$, with $\lim_{n\to\infty} E\big(\xi_n^{(1)}\big) = y^{(1)}$ and $\lim_{n\to\infty} E\big(\xi_n^{(2)}\big) = y^{(2)}$. The convexity of $X$ implies that $\lambda \xi_n^{(1)} + (1 - \lambda)\xi_n^{(2)} \in S(X)$ for each $\lambda \in [0, 1]$ as well; thus,
$$\lambda y^{(1)} + (1 - \lambda) y^{(2)} = \lim_{n\to\infty} E\big(\lambda \xi_n^{(1)} + (1 - \lambda)\xi_n^{(2)}\big) \in E(X);$$
L(E(X)) ⊂ E(L(X)).
L(E(X)) = E(L(X)).
L (E (ΔZ )) = E (L (ΔZ )) ;
$$\xi := \sum_{i=1}^d \lambda_i Z_i e_i \in X = \Delta_Z.$$
This implies
where ΔE(Z) = conv({0, E(Z1 )e1 , . . . , E(Zd )ed }) and [0, E(Z)] =
[0, E(Z1 )] × · · · × [0, E(Zd )]. Lemma 1.12.7, together with Lem-
mas 1.12.13 and 1.12.14, implies that the symmetric extension
L(E(ΔZ )) = E(L(ΔZ )) is a convex set.
Next, we establish the reverse inequality. Choose x ∈ Rd and put for ε > 0
$$X_\varepsilon := \{y \in X : \langle y, x\rangle \ge h(X, x) - \varepsilon\}.$$
$$\langle \xi_\varepsilon, x\rangle \ge h(X, x) - \varepsilon;$$
thus,
E(h(X, x)) − ε ≤ h(E(X), x).
$$\begin{aligned}
\|x\|_K &= h(L(K), x)\\
&= \|x\|_Z\\
&= E\big(h(L(\Delta_Z), x)\big) && \text{by equation (1.51)}\\
&= h\big(E(L(\Delta_Z)), x\big) && \text{according to Theorem 1.12.16}\\
&= h\big(L(E(\Delta_Z)), x\big) && \text{according to Lemma 1.12.14.}
\end{aligned}$$
The set L(E(ΔZ )) is convex, as shown in Example 1.12.15, and the set L(K) is
convex according to the assumption that K is a max-zonoid. Theorem 1.12.16
now implies that these sets coincide, because they provide identical support
functions as shown above. But this yields E(ΔZ ) = K, completing the proof.
norm $\|\cdot\|_p$, with $p \in [1, \infty]$, can be identified by Lemma 1.12.5 with $K_q = \{y \ge 0 \in \mathbb R^d : \|y\|_q \le 1\}$, where $1/p + 1/q = 1$.
Each max-zonoid K satisfies
ΔE(Z) ⊂ K ⊂ [0, E(Z)]
for some rv Z ≥ 0 ∈ Rd with E(Zi ) ∈ (0, ∞), 1 ≤ i ≤ d; see equation (1.53).
This is a characterization of a max-zonoid in dimension d = 2.
Δz ⊂ K ⊂ [0, z]
Proof. One easily checks that $L(K)$ is a convex set if $d = 2$. The norm $\|\cdot\|_K$ generated by $K$ is monotone; see Lemma 1.12.3. Corollary 1.5.4 implies that $\|\cdot\|_K = \|\cdot\|_Z$ for some rv $Z \ge 0 \in \mathbb R^2$ with $E(Z_i) \in (0, \infty)$, $i = 1, 2$. In this case, $z = E(Z)$.
Let $d \ge 3$ and put $z := (1, \dots, 1) \in \mathbb R^d$, $K := \mathrm{conv}(\{0, e_1, \dots, e_d, y\}) \subset \mathbb R^d$, where $y$ has constant entry $3/4$. Then $\Delta_z \subset K \subset [0, 1]^d$, but $L(K)$ is not convex:
$$\frac12\, y + \frac12\Big({-}\frac34, \frac34, \dots, \frac34\Big) = \Big(0, \frac34, \dots, \frac34\Big) \notin K.$$
where ·p , ·q are logistic norms with p, q ∈ [1, ∞] such that 1/p + 1/q = 1.
Both are D-norms according to Proposition 1.2.1. In what follows, we show
that this inequality can be extended to D-norms and their dual norms.
In equations (1.55) and (1.56), we show that a dual norm always exists. Its
uniqueness is shown below. We do not require ·(D) to be a D-norm itself.
This is actually true in dimension d = 2; see Proposition 1.12.26.
Prominent examples are the logistic norms ·p and ·q with 1/p + 1/q =
1, which are dual to one another according to Lemma 1.12.5. This symmetric
duality does not hold in the general case, i.e., if ·(D) is the dual norm of
·D , then ·D is generally not the dual norm of ·(D) .
Lemma 1.12.21 Let $\|\cdot\|_{(1)}$ and $\|\cdot\|_{(2)}$ be two radially symmetric norms on $\mathbb R^d$ such that
$$\{y \ge 0 \in \mathbb R^d : \|y\|_{(1)} \le 1\} = \{y \ge 0 \in \mathbb R^d : \|y\|_{(2)} \le 1\}.$$
Then $\|\cdot\|_{(1)} = \|\cdot\|_{(2)}$.
Proof. Choose $y \ge 0 \in \mathbb R^d$, $y \ne 0$, and put $y^* := y / \|y\|_{(2)}$. Then $\|y^*\|_{(2)} = 1$ and, thus, $\|y^*\|_{(1)} \le 1$. But this is $\|y^*\|_{(1)} \le \|y^*\|_{(2)}$, which implies $\|y\|_{(1)} \le \|y\|_{(2)}$. Interchanging both norms implies equality.
$$\sum_{i=1}^d |x_i y_i| \le \|x\|_D\, \|y\|_{(D)}, \qquad x, y \in \mathbb R^d.$$
Together, we obtain
$$\sum_{i=1}^d x_i y_i \le \|x\|_D\, \|y\|_{(D)},$$
we obtain
$$\lambda t_1 x + (1 - \lambda) t_2 y = \frac{1}{\frac{1}{t_1} + \frac{1}{t_2}}\,(x + y) \in L(K),$$
hence,
$$\|x + y\|_{(K)} = \frac{1}{\max\{t > 0 : t(x + y) \in L(K)\}} \le \Big(\frac{1}{\frac{1}{t_1} + \frac{1}{t_2}}\Big)^{-1} = \frac{1}{t_1} + \frac{1}{t_2} = \|x\|_{(K)} + \|y\|_{(K)},$$
which is one of the two inequalities we want to establish. The fact that $K \subset [0, 1]^d$ by (1.53) implies
$$r_1 := \max\{r > 0 : r x \in K\} \le r_2 := \max\{r > 0 : r x \in [0, 1]^d\}.$$
$$\sum_{i=1}^d |x_i y_i| \le \|x\|_D\, \|y\|_{(D)}, \qquad x, y \in \mathbb R^d.$$
$$\|x\|_{(D)} := \frac{1}{\sup\{t > 0 : t\,|x| \in \lambda K_{q_1} + (1 - \lambda) K_{q_2}\}}.$$
The mapping ·D → ·(D) between the set of D-norms and the set of
dual norms on R2 is, consequently, one-to-one.
But this follows from the general inequalities $\|\cdot\|_\infty \le \|\cdot\|_D \le \|\cdot\|_1$ in (1.4): we have, for $x \in K$,
$$\|x\|_\infty \le \|x\|_D \le 1;$$
thus, $x \in [0, 1]^2$. On the other hand, for arbitrary $x = \lambda_1 e_1 + \lambda_2 e_2 \in \Delta_{(1,1)}$, $\lambda_1, \lambda_2 \ge 0$, $\lambda_1 + \lambda_2 \le 1$, we have
$$\|\lambda_1 e_1 + \lambda_2 e_2\|_D \le \|\lambda_1 e_1 + \lambda_2 e_2\|_1 \le \lambda_1 + \lambda_2 \le 1,$$
$$P\Big(\frac{\max_{1\le i\le n} X_i - b_n}{a_n} \le x\Big) = P(X_i \le a_n x + b_n,\ 1 \le i \le n) = F^n(a_n x + b_n) \to_{n\to\infty} G(x) \tag{2.2}$$
and
G0 (x) := exp(−e−x ), x ∈ R, (2.4)
being the family of reverse Weibull, Fréchet, and Gumbel distributions. Note
that G1 (x) = exp(x), x ≤ 0, is the standard negative exponential df.
The assumption F ∈ D(G) is quite a mild one, practically satisfied by
any textbook df F . We refer, for example, to Galambos (1987, Section 2.3)
or Resnick (1987, Chapter 1), where the condition F ∈ D(G) is characterized
and the choice of the constants an > 0, bn ∈ R, is specified.
Note that the signs of β and α in these two representations are flipped about
zero, i.e., Fβ with β > 0 corresponds to Gα with α < 0, etc. With this
particular parametrization, the set of univariate distributions {Fβ : β ∈ R} is
commonly called the family of generalized extreme value distributions.
$$P\Big(\frac{\max_{1\le i\le n} \eta^{(i)} - b_n}{a_n} \le x\Big) = P(\eta \le x), \qquad x \in \mathbb R.$$
This is the reason why Gα is called a max-stable df, and the set {Gα : α ∈ R}
collects all univariate max-stable distributions, which are non-degenerate; see,
for example, Galambos (1987, Theorem 2.4.1).
$$P\Big(\frac{X - b_n}{a_n} \le t + s \,\Big|\, \frac{X - b_n}{a_n} > t\Big) = 1 - \frac{1 - F(a_n(t + s) + b_n)}{1 - F(a_n t + b_n)} \to_{n\to\infty} 1 - \frac{\log(G_\alpha(t + s))}{\log(G_\alpha(t))} = \begin{cases} H_\alpha\big({-}\big(1 + \frac{s}{t}\big)\big), & \text{if } \alpha > 0,\\[2pt] H_\alpha\big(1 + \frac{s}{t}\big), & \text{if } \alpha < 0,\ s \ge 0,\\[2pt] H_0(s), & \text{if } \alpha = 0, \end{cases} \tag{2.6}$$
Zi ≤ c, 1 ≤ i ≤ d, (2.8)
$$P\Big(\frac{1}{U} \le x\Big) = P\Big(\frac{1}{x} \le U\Big) = 1 - \frac{1}{x},$$
$$\begin{aligned}
P\Big(\frac{Z_i}{U} \le x\Big) &= P\Big(\frac{Z_i}{x} \le U\Big)\\
&= E\big(1(Z_i/x \le U)\big)\\
&= \int_{[0,1]\times[0,c]} 1(z/x \le u)\, \big(P * (U, Z_i)\big)\,d(u, z)\\
&= \int_{[0,1]\times[0,c]} 1(z/x \le u)\, \big((P * U) \times (P * Z_i)\big)\,d(u, z)\\
&= \int_0^c \int_{z/x}^1 du\, (P * Z_i)(dz) = \int_0^c \Big(1 - \frac{z}{x}\Big)\,(P * Z_i)(dz) = 1 - \frac{E(Z_i)}{x} = 1 - \frac{1}{x}, \qquad x \ge c,
\end{aligned}$$
since $E(Z_i) = 1$.
$$\begin{aligned}
P(V \le x) &= P\Big(\frac{Z_i}{U} \le x_i,\ 1 \le i \le d\Big)\\
&= P\Big(\frac{Z_i}{x_i} \le U,\ 1 \le i \le d\Big)\\
&= \int_{[0,c]^d} P\Big(U \ge \frac{z_i}{x_i},\ 1 \le i \le d\Big)\,(P * Z)\,d(z_1, \dots, z_d)\\
&= \int_{[0,c]^d} P\Big(U \ge \max_{1\le i\le d} \frac{z_i}{x_i}\Big)\,(P * Z)\,d(z_1, \dots, z_d)\\
&= \int_{[0,c]^d} 1 - \max_{1\le i\le d} \frac{z_i}{x_i}\,(P * Z)\,d(z_1, \dots, z_d)\\
&= 1 - \int_{[0,c]^d} \max_{1\le i\le d} \frac{z_i}{x_i}\,(P * Z)\,d(z_1, \dots, z_d)\\
&= 1 - E\Big(\max_{1\le i\le d} \frac{Z_i}{x_i}\Big)\\
&= 1 - \Big\|\frac{1}{x}\Big\|_D,
\end{aligned} \tag{2.11}$$
$$\begin{aligned}
P(V \ge x) &= P\Big(U \le \frac{Z_i}{x_i},\ 1 \le i \le d\Big)\\
&= \int_{[0,c]^d} P\Big(U \le \frac{z_i}{x_i},\ 1 \le i \le d\Big)\,(P * Z)\,d(z_1, \dots, z_d)\\
&= \int_{[0,c]^d} P\Big(U \le \min_{1\le i\le d} \frac{z_i}{x_i}\Big)\,(P * Z)\,d(z_1, \dots, z_d)\\
&= \int_{[0,c]^d} \min_{1\le i\le d} \frac{z_i}{x_i}\,(P * Z)\,d(z_1, \dots, z_d)\\
&= E\Big(\min_{1\le i\le d} \frac{Z_i}{x_i}\Big) = ⦀1/x⦀_D,
\end{aligned} \tag{2.12}$$
i.e., the survival function of $V$ is equal to the dual D-norm function $⦀1/x⦀_D$, for $x \ge c$.
As an obvious consequence of (2.12), we obtain the equation
$$P(V \ge x) = E\Big(\min_{1\le i\le d} \frac{Z_i}{x_i}\Big) = ⦀1/x⦀_D.$$
For the D-norm $\|\cdot\|_D = \|\cdot\|_\infty$ with constant generator $Z = 1$, this yields
$$P(V \ge x) = \min_{1\le i\le d} \frac{1}{x_i} = \frac{1}{\max_{1\le i\le d} x_i}, \qquad x \ge (1, \dots, 1).$$
If we choose $\|\cdot\|_D = \|\cdot\|_1$, then we know from (1.12) that $⦀\cdot⦀_1 = 0$; thus,
$$P(V \ge x) = 0, \qquad x \ge (d, \dots, d).$$
This example shows that assessing the risk of a portfolio is highly sensitive
to the choice of the stochastic model. For x = (d, . . . , d) and ·D = ·∞ ,
the probability of the event that the losses jointly exceed the value d is 1/d,
whereas for ·D = ·1 , it is zero. In dimension d = 2, this means a joint loss
probability of 1/2 versus a joint loss probability of 0.
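Identity (2.11) is easy to check by simulation. The following minimal sketch (not from the book) uses the toy generator $Z$ with iid uniform$(0,2)$ components — so $E(Z_i) = 1$ and $Z$ is bounded by $c = 2$ — and compares the empirical $P(V \le x)$ for $V = Z/U$ with $1 - \|1/x\|_D$ estimated from the same samples.

```python
import random

random.seed(1)
d, trials = 2, 200_000
x = (3.0, 4.0)    # x >= c = (2, 2) componentwise, as (2.11) requires

hits = 0          # counts the event {V <= x} for V = Z/U
norm_sum = 0.0    # accumulates max_i Z_i/x_i, whose mean is ||1/x||_D
for _ in range(trials):
    z = [random.uniform(0.0, 2.0) for _ in range(d)]   # toy generator, E(Z_i) = 1
    u = 1.0 - random.random()                          # uniform on (0, 1]
    if all(z[i] / u <= x[i] for i in range(d)):
        hits += 1
    norm_sum += max(z[i] / x[i] for i in range(d))

p_emp = hits / trials
p_theory = 1.0 - norm_sum / trials   # 1 - ||1/x||_D from (2.11)
print(p_emp, p_theory)
```

The two printed values agree up to Monte Carlo error, illustrating (2.11).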
U/Z; however, in this case, we may divide by zero. To overcome this problem,
choose a number K < 0 and put
$$W := (W_1, \dots, W_d) := \Big(\max\Big({-}\frac{U}{Z_1}, K\Big), \dots, \max\Big({-}\frac{U}{Z_d}, K\Big)\Big). \tag{2.14}$$
$$P(W \le x) = 1 - \|x\|_D, \qquad x_0 \le x \le 0 \in \mathbb R^d,$$
$$P(W \ge x) = ⦀x⦀_D, \qquad x_0 \le x \le 0 \in \mathbb R^d. \tag{2.16}$$
$$-\frac{1}{W} = \Big({-}\frac{1}{\max(-U/Z_1, K)}, \dots, {-}\frac{1}{\max(-U/Z_d, K)}\Big) = \Big(\frac{1}{\min(U/Z_1, -K)}, \dots, \frac{1}{\min(U/Z_d, -K)}\Big) = \frac{1}{U}\,(Z_1, \dots, Z_d) = V$$
if $U/Z_i \le -K$, i.e., $Z_i/U \ge -1/K$, for $1 \le i \le d$.
With $\alpha_1 = \dots = \alpha_d = 0$ and $Z_i/U \ge -1/K$ for $1 \le i \le d$, we obtain
$$\big(\psi_0^{-1}(W_1), \dots, \psi_0^{-1}(W_d)\big) = \big({-}\log(U/Z_1), \dots, {-}\log(U/Z_d)\big) = \big(\log(Z_1) - \log(U), \dots, \log(Z_d) - \log(U)\big),$$
if $(\psi_{\alpha_1}(x_1), \dots, \psi_{\alpha_d}(x_d)) \ge x_0$; for such $x$, its survival function follows from equation (2.16):
$$P(Y \ge x) = P\big((W_1, \dots, W_d) \ge (\psi_{\alpha_1}(x_1), \dots, \psi_{\alpha_d}(x_d))\big) = ⦀(\psi_{\alpha_1}(x_1), \dots, \psi_{\alpha_d}(x_d))⦀_D.$$
$$P\Big(\frac{\max_{1\le i\le n} \eta^{(i)} - b_n}{a_n} \le x\Big) = P(\eta \le x), \qquad x \in \mathbb R^d.$$
Note that both the maximum function and division are taken componentwise.
Different from the univariate case in (2.4), the class of multivariate max-stable distributions, or multivariate extreme value distributions (also abbreviated as EVDs), is no longer a parametric one indexed by some $\alpha \in \mathbb R$. A parametric family is obviously still necessary for the univariate margins of $G$, but in addition a non-parametric part occurs, which can best be described in terms of D-norms, as is shown in what follows.
$$G_i(x) = \exp\Big({-}\frac{1}{x}\Big), \qquad x > 0,\ 1 \le i \le d.$$
Next, we show that such simple EVDs actually exist. Choose an arbitrary
(1) (1) (2) (2)
D-norm ·D on Rd . Let V (1) = (V1 , . . . , Vd ), V (2) = (V1 , . . . , Vd ), . . .
be independent copies of the rv V = Z/U , where Z = (Z1 , . . . , Zd ) is a
generator of ·D with the additional property that it is bounded by some
c ≥ 1 ∈ Rd , and the rv U is uniformly distributed on (0, 1). The rv Z and U
are assumed to be independent as well; thus, the rv V follows a simple GPD.
For the vector of the componentwise maxima,
we obtain from equation (2.11), for x > 0 and n large enough such that
nx > c,
$$\begin{aligned}
P\Big(\frac{1}{n}\max_{1\le i\le n} V^{(i)} \le x\Big) &= P\big(V^{(i)} \le nx,\ 1 \le i \le n\big)\\
&= \prod_{i=1}^n P\big(V^{(i)} \le nx\big)\\
&= P(V \le nx)^n\\
&= \Big(1 - \Big\|\frac{1}{nx}\Big\|_D\Big)^n\\
&= \Big(1 - \frac{1}{n}\Big\|\frac{1}{x}\Big\|_D\Big)^n\\
&\to_{n\to\infty} \exp\Big({-}\Big\|\frac{1}{x}\Big\|_D\Big) =: G(x),
\end{aligned} \tag{2.19}$$
$$= P\Big(\frac{Z_{i_0}}{U} \le 0\Big) = P(Z_{i_0} \le 0) = P(Z_{i_0} = 0) < 1$$
and
$$P\Big(\frac{1}{n}\max_{1\le i\le n} V^{(i)} \le x\Big) = P(V \le nx)^n \le P(Z_{i_0} = 0)^n \to_{n\to\infty} 0.$$
Hence, we have
$$P\Big(\frac{1}{n}\max_{1\le i\le n} V^{(i)} \le x\Big) \to_{n\to\infty} G(x), \qquad x \in \mathbb R^d,$$
where
$$G(x) = \begin{cases} \exp\big({-}\|1/x\|_D\big), & \text{if } x > 0,\\ 0 & \text{elsewhere.} \end{cases} \tag{2.20}$$
Since P n−1 max1≤i≤n V (i) ≤ · , n ∈ N, is a sequence of df on Rd , one easily
checks that its limit G(·) is a df itself; see, for example, Reiss (1989, (2.2.19)).
It is obvious that the df $G$ satisfies
$$G^n(nx) = \exp\Big({-}\Big\|\frac{1}{nx}\Big\|_D\Big)^n = \exp\Big({-}\Big\|\frac{1}{x}\Big\|_D\Big) = G(x), \qquad x > 0 \in \mathbb R^d,$$
and, thus,
$$G^n(nx) = G(x), \qquad x \in \mathbb R^d,\ n \in \mathbb N,$$
which is the max-stability of $G$. Let the rv $\xi \in \mathbb R^d$ have df $G$. By keeping $x_i > 0$ fixed and letting $x_j$ tend to infinity for $j \ne i$, we obtain the marginal distribution of $G$:
$$\begin{aligned}
G_i(x_i) &= P(\xi_i \le x_i)\\
&= \lim_{x_j\to\infty,\ j\ne i} P(\xi_i \le x_i,\ \xi_j \le x_j,\ j \ne i)\\
&= \lim_{x_j\to\infty,\ j\ne i} G(x)\\
&= \lim_{x_j\to\infty,\ j\ne i} \exp\Big({-}\Big\|\Big(\frac{1}{x_1}, \dots, \frac{1}{x_i}, \dots, \frac{1}{x_d}\Big)\Big\|_D\Big)\\
&= \exp\Big({-}\Big\|\Big(0, \dots, 0, \frac{1}{x_i}, 0, \dots, 0\Big)\Big\|_D\Big)\\
&= \exp\Big({-}\frac{1}{x_i}\,\|e_i\|_D\Big)\\
&= \exp\Big({-}\frac{1}{x_i}\Big)
\end{aligned}$$
by the fact that each D-norm is standardized. Each univariate marginal df $G_i$ of $G$ is, consequently, the unit Fréchet df
$$G_i(x) = \exp\Big({-}\frac{1}{x}\Big), \qquad x > 0.$$
This proves that simple EVDs actually exist. We see later on in Theorem 2.3.4
that, actually, each simple EVD can be represented by means of a D-norm as
in (2.20).
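The convergence (2.19) can be observed directly in a simulation. The sketch below (a toy setup, not from the book) assumes the generator $Z$ with iid uniform$(0,2)$ components ($E(Z_i) = 1$, bounded by $c = 2$), simulates scaled componentwise maxima of $V^{(i)} = Z^{(i)}/U_i$, and compares the empirical df value with $\exp(-\|1/x\|_D)$.

```python
import math
import random

random.seed(7)
d, n, trials = 2, 50, 10_000
x = (1.0, 2.0)

def draw_V():
    # V = Z/U with the toy generator Z_i iid uniform(0, 2) (so E(Z_i) = 1)
    # and U uniform on (0, 1], independent of Z.
    u = 1.0 - random.random()
    return [random.uniform(0.0, 2.0) / u for _ in range(d)]

count = 0
for _ in range(trials):
    mx = [0.0] * d
    for _ in range(n):
        v = draw_V()
        mx = [max(a, b) for a, b in zip(mx, v)]
    if all(mx[i] / n <= x[i] for i in range(d)):
        count += 1
p_emp = count / trials

# Estimate ||1/x||_D = E max_i Z_i/x_i with a separate sample.
norm_est = sum(
    max(random.uniform(0.0, 2.0) / x[i] for i in range(d))
    for _ in range(100_000)
) / 100_000
print(p_emp, math.exp(-norm_est))   # both approximate G(x) = exp(-||1/x||_D)
```

For moderate $n$ the empirical probability is already close to the limiting simple EVD value.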
Gi (x) = exp(x), x ≤ 0, 1 ≤ i ≤ d.
$$\eta = -\frac{1}{\xi} = -\Big(\frac{1}{\xi_1}, \dots, \frac{1}{\xi_d}\Big)$$
$$\begin{aligned}
P(\eta \le x) &= P\Big({-}\frac{1}{\xi_i} \le x_i,\ 1 \le i \le d\Big)\\
&= P\Big(\xi_i \le -\frac{1}{x_i},\ 1 \le i \le d\Big)\\
&= P\Big(\xi \le -\frac{1}{x}\Big)\\
&= \exp(-\|x\|_D) =: G_D(x).
\end{aligned}$$
Since $\eta \le 0$ componentwise, we obtain for each margin
$$P(\eta_i \le x) = P(\eta \le x e_i) = \exp(-\|x e_i\|_D) = \exp(-|x|\,\|e_i\|_D) = \exp(x), \qquad x \le 0.$$
Characterization of SMS DF
We are going to show that any SMS df or standard EVD can be represented as
in (2.21), i.e., the theory of D-norms allows a mathematically elegant charac-
terization of the family of SMS dfs, presented in the next result. It comes from
results found in Balkema and Resnick (1977); de Haan and Resnick (1977);
Pickands (1981), and Vatan (1985).
Theorem 2.3.3 A df $G$ on $\mathbb R^d$ is an SMS df iff there exists a D-norm $\|\cdot\|_D$ on $\mathbb R^d$ such that
$$G(x) = \exp(-\|x\|_D), \qquad x \le 0 \in \mathbb R^d.$$
$$P\big(\xi_i^{(c)} \le x\big) = \exp\Big({-}\frac{1}{x^\alpha}\Big) =: F_\alpha(x), \qquad x > 0,\ 1 \le i \le d.$$
Its expectation is $\Gamma(1 - c) =: \mu_c$, which can be seen by applying elementary rules of integration as follows; note that the density of $F_\alpha$ is $\exp(-x^{-\alpha})\, x^{-\alpha - 1}\alpha$, $x > 0$. We have
$$\begin{aligned}
E\Big(\frac{1}{|\eta_i|^c}\Big) &= \int_0^\infty x \exp(-x^{-\alpha})\, x^{-\alpha - 1}\alpha\, dx\\
&= \alpha \int_0^\infty x^{-\alpha}\exp(-x^{-\alpha})\, dx\\
&= \int_0^\infty x^{-1/\alpha}\exp(-x)\, dx\\
&= \int_0^\infty x^{(1 - 1/\alpha) - 1}\exp(-x)\, dx\\
&= \Gamma\Big(1 - \frac{1}{\alpha}\Big)\\
&= \Gamma(1 - c).
\end{aligned} \tag{2.23}$$
$$G_c(tx) = G_c(x)^{1/t^\alpha}, \qquad t > 0,\ x \in \mathbb R^d.$$
In the following proof, we use the fact that $G_c(x) > 0$ for $x > 0$. Otherwise, the max-stability of $G_c$ would imply $G_c(tx) = 0$ for some $x > 0 \in \mathbb R^d$ and each $t > 0$; letting $t$ converge to infinity would obviously produce a contradiction. Using Lemma 1.2.2, we obtain, for $x > 0 \in \mathbb R^d$,
$$\begin{aligned}
\|x\|_{D_c} &= E\Big(\max_{1\le i\le d} x_i Z_i^{(c)}\Big)\\
&= \frac{1}{\mu_c}\int_0^\infty P\Big(\max_{1\le i\le d} x_i \xi_i^{(c)} > t\Big)\, dt\\
&= \frac{1}{\mu_c}\int_0^\infty 1 - G_c\Big(t\, \frac{1}{x}\Big)\, dt\\
&= \frac{1}{\mu_c}\int_0^\infty 1 - G_c\Big(\frac{1}{x}\Big)^{1/t^\alpha} dt\\
&= \frac{1}{\mu_c}\int_0^\infty 1 - \exp\Big(\frac{1}{t^\alpha}\,\log G_c\Big(\frac{1}{x}\Big)\Big)\, dt\\
&= \Big({-}\log G_c\Big(\frac{1}{x}\Big)\Big)^{1/\alpha}\, \frac{1}{\mu_c}\int_0^\infty 1 - \exp\Big({-}\frac{1}{t^\alpha}\Big)\, dt\\
&= \Big({-}\log G_c\Big(\frac{1}{x}\Big)\Big)^{1/\alpha}
\end{aligned}$$
by the substitution $t \mapsto (-\log(G_c(1/x)))^{1/\alpha}\, t$; note that $\int_0^\infty 1 - \exp(-1/t^\alpha)\, dt = \mu_c$. Observe that, for $x > 0 \in \mathbb R^d$,
$$G_c\Big(\frac{1}{x}\Big) = G(-x^\alpha);$$
thus, we have, for $x \in \mathbb R^d$,
$$\|x\|_{D_c} = \big({-}\log\big(G(-|x|^\alpha)\big)\big)^{1/\alpha}, \tag{2.24}$$
where $|x|^\alpha$ is also meant componentwise. This yields
$$\lim_{c\to 1}\|x\|_{D_c} = -\log\big(G(-|x|)\big).$$
Proof. We establish the first part; the second part is obvious. Since $G_{\alpha_i}(X_i)$ is uniformly distributed on $(0, 1)$, it is clear that $\eta_i = \log(G_{\alpha_i}(X_i))$ has df $\exp(x) = G_1(x)$, $x < 0$. It remains to show that the df of the rv $\eta$, say $H$, is max-stable. But this follows from the fact that $G_{(\alpha_1, \dots, \alpha_d)}$ is max-stable with $G^n_{\alpha_i}\big(\psi^{-1}_{\alpha_i}(x/n)\big) = G_{\alpha_i}\big(\psi^{-1}_{\alpha_i}(x)\big)$: for $x_i < 0$, $1 \le i \le d$, we have
defines a bivariate SMS df. Let the rv $\eta = (\eta_1, \eta_2)$ follow this df. Then, by Lemma 2.3.5, the rv
$$X = (X_1, X_2) := \big(\psi_0^{-1}(\eta_1), \psi_0^{-1}(\eta_2)\big)$$
$$= \exp\Big({-}\exp(-x)\,\Phi\Big(\frac{\sqrt{t-s}}{2} + \frac{y - x}{\sqrt{t-s}}\Big) - \exp(-y)\,\Phi\Big(\frac{\sqrt{t-s}}{2} + \frac{x - y}{\sqrt{t-s}}\Big)\Big) = G_{(0,0)}(x, y), \qquad x, y \in \mathbb R,$$
which is the bivariate Hüsler–Reiss distribution, with parameter $\lambda = \sqrt{t-s}/2$; see, for example, Falk et al. (2011, Example 4.1.4).
Min-Stable Distributions
Let X (1) , X (2) , . . . be independent copies of the rv X on Rd . The rv X (or its
distribution) is called min-stable if there are constants an > 0 ∈ Rd , bn ∈ Rd ,
n ∈ N, such that, for each n ∈ N,
min1≤i≤n X (i) + bn
P ≥ x = P (X ≥ x), x ∈ Rd . (2.25)
an
max1≤i≤n −X (i) − bn
P ≤ −x = P (−X ≤ −x), x ∈ Rd , (2.26)
an
2.3 Multivariate Max-Stable Distributions 117
P (X ≥ x) = P (−X ≤ −x)
= exp − (ψα1 (−x1 ), . . . , ψαd (−xd ))D (2.27)
P (X ≥ x) = exp(− xD ), x ≥ 0 ∈ Rd ,
Takahashi Revisited
We can now present the original version of Takahashi’s characterizations
in terms of multivariate max-stable dfs. In what follows, let the rv η =
(η1 , . . . , ηd ) have the SMS df
$$\iff \exists\, y < 0 \in \mathbb R^d :\quad P(\eta_i \le y_i,\ 1 \le i \le d) = \prod_{i=1}^d P(\eta_i \le y_i).$$
(ii) $\eta_1 = \eta_2 = \dots = \eta_d$ a.s.
$$P(\eta_1 \le x_1, \dots, \eta_d \le x_d) = \prod_{i=1}^d P(\eta_i \le x_i) = \prod_{i=1}^d \exp(x_i) = \exp\Big(\sum_{i=1}^d x_i\Big) = \exp(-\|x\|_1), \qquad x \le 0 \in \mathbb R^d,$$
Part (ii) in the next result is obviously trivial. We list it for the sake of
completeness.
Theorem 2.3.8 With η as above, we have the equivalences
(i) η1 , . . . , ηd are independent iff η1 , . . . , ηd are pairwise independent.
(ii) η1 = η2 = · · · = ηd a.s. iff η1 , . . . , ηd are pairwise completely depen-
dent.
$$\|x\|_D = \|x\|_1\, \Big\|\frac{x}{\|x\|_1}\Big\|_D =: \|x\|_1\, A\Big(\frac{x}{\|x\|_1}\Big),$$
where $A(\cdot)$ is a function on the unit sphere $S = \{y \in \mathbb R^d : \|y\|_1 = 1\}$. It is evident that it suffices to define the function $A(\cdot)$ on $S_+ := \{u \ge 0 \in \mathbb R^{d-1} : \sum_{i=1}^{d-1} u_i \le 1\}$ by putting
$$A(u) := \Big\|\Big(u_1, \dots, u_{d-1},\ 1 - \sum_{i=1}^{d-1} u_i\Big)\Big\|_D.$$
yielding
$$\varepsilon = \|1\|_D \in [1, d], \tag{2.28}$$
according to the general inequalities $\|\cdot\|_\infty \le \|\cdot\|_D \le \|\cdot\|_1$ in (1.4). The extremal coefficient is, therefore, the D-norm of the vector $1$.
For the family of logistic D-norms $\|x\|_p = \big(\sum_{i=1}^d |x_i|^p\big)^{1/p}$, $p \in [1, \infty]$, we obtain, for example,
$$\|1\|_p = \begin{cases} d, & \text{if } p = 1,\\ d^{1/p}, & \text{if } p \in (1, \infty),\\ 1, & \text{if } p = \infty. \end{cases}$$
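The extremal coefficients of the logistic family are straightforward to compute; the following short sketch (the helper name `logistic_norm` is illustrative) evaluates $\|1\|_p$ for several $p$ and checks the general bounds from (2.28).

```python
def logistic_norm(v, p):
    # logistic (p-)norm, including the limiting sup-norm for p = inf
    if p == float("inf"):
        return max(abs(t) for t in v)
    return sum(abs(t) ** p for t in v) ** (1.0 / p)

d = 4
ones = (1.0,) * d
for p in (1.0, 2.0, 10.0, float("inf")):
    eps = logistic_norm(ones, p)
    assert 1.0 <= eps <= d      # the general bounds in (2.28)
    print(p, eps)
```

The extremal coefficient interpolates between $d$ (independence, $p = 1$) and $1$ (complete dependence, $p = \infty$).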
Denote by
$$U_{1:n} \le U_{2:n} \le \dots \le U_{n:n}$$
the ordered values of $U_1, \dots, U_n$, $n \in \mathbb N$, or order statistics, for short. It is well known that
$$(U_{i:n})_{i=1}^n =_D \Big(\frac{\sum_{k=1}^i E_k}{\sum_{k=1}^{n+1} E_k}\Big)_{i=1}^n, \tag{2.29}$$
where $E_1, E_2, \dots$ are iid standard exponential rvs; see, for example, Reiss
(1989, Corollary 1.6.9). In what follows, we suppose that the sequence
E1 , E2 , . . . is independent of the sequence Z (1) , Z (2) , . . . as well.
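The distributional identity (2.29) can be sanity-checked empirically. The sketch below (a toy simulation, not from the book) compares the means of the uniform order statistics with the means of the normalized exponential partial sums; both should be close to $i/(n+1)$.

```python
import random

random.seed(3)
n, trials = 5, 100_000
mean_os = [0.0] * n     # empirical E(U_{i:n})
mean_exp = [0.0] * n    # empirical mean of the exponential representation
for _ in range(trials):
    u = sorted(random.random() for _ in range(n))
    e = [random.expovariate(1.0) for _ in range(n + 1)]
    total = sum(e)
    partial = 0.0
    for i in range(n):
        partial += e[i]
        mean_os[i] += u[i]
        mean_exp[i] += partial / total
mean_os = [m / trials for m in mean_os]
mean_exp = [m / trials for m in mean_exp]
print(mean_os)
print(mean_exp)   # both lists are close to i/(n+1) = 1/6, ..., 5/6
```

Agreement of the two lists (up to Monte Carlo error) is consistent with the equality in distribution stated in (2.29).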
$$\eta := -\frac{1}{\sup_{i\in\mathbb N} V_i Z^{(i)}}$$
Proof. Adopting the arguments in (2.11) and (2.19), we obtain, for x > 0 ∈
Rd , even without the assumption that Z is bounded,
$$P\Big(\frac{1}{n}\max_{1\le i\le n}\frac{1}{U_i}\, Z^{(i)} \le x\Big) \to_{n\to\infty} \exp\Big({-}\Big\|\frac{1}{x}\Big\|_D\Big) = P(\xi \le x), \tag{2.30}$$
$$P\Big(\frac{1}{n}\max_{1\le i\le n}\frac{1}{U_i}\, Z^{(i)} \le x\Big) = P\Big(\frac{1}{n}\max_{1\le i\le n}\frac{1}{U_{i:n}}\, Z^{(i)} \le x\Big),$$
$$P\Big(\frac{1}{n}\max_{1\le i\le n}\frac{1}{U_{i:n}}\, Z^{(i)} \le x\Big) = P\Big(\frac{\sum_{k=1}^{n+1} E_k}{n}\, \max_{1\le i\le n}\frac{1}{\sum_{k=1}^i E_k}\, Z^{(i)} \le x\Big).$$
The law of large numbers implies $\sum_{k=1}^{n+1} E_k / n \to_{n\to\infty} 1$ a.s. Moreover,
$$\max_{1\le i\le n}\frac{1}{\sum_{k=1}^i E_k}\, Z^{(i)} \to_{n\to\infty} \sup_{i\in\mathbb N}\frac{1}{\sum_{k=1}^i E_k}\, Z^{(i)} = \sup_{i\in\mathbb N} V_i Z^{(i)} =: \xi,$$
$$P(\xi \le x) = \exp\Big({-}\Big\|\frac{1}{x}\Big\|_D\Big), \qquad x > 0 \in \mathbb R^d.$$
Putting
$$\eta := -\frac{1}{\xi}$$
completes the proof.
$$\text{(ii)}\qquad \lim_{s\downarrow 0}\frac{P(\eta > sx)}{s} = ⦀x⦀_D.$$
$$P(\eta > x) = P\Big(\sup_{i\in\mathbb N} V_i Z^{(i)} > \frac{1}{|x|}\Big) = 1 - P\Big(\sup_{i\in\mathbb N} V_i Z^{(i)} \not> \frac{1}{|x|}\Big)$$
with
$$P\Big(\sup_{i\in\mathbb N} V_i Z^{(i)} \not> \frac{1}{|x|}\Big) = P\Big(\bigcup_{j=1}^d \bigcap_{i\in\mathbb N}\Big\{V_i Z_j^{(i)} \le \frac{1}{|x_j|}\Big\}\Big) = \lim_{n\to\infty}\Big(1 - \frac{1}{n}\Big(E\Big(\min_{1\le j\le d}(|x_j|\, Z_j)\Big) + o(1)\Big)\Big)^n = \exp\big({-}⦀x⦀_D\big),$$
which is part (i).
Part (ii) follows from the inclusion–exclusion principle (see Corol-
lary 1.6.2), together with (1.10):
$$P(\eta > sx) = 1 - P\Big(\bigcup_{i=1}^d \{\eta_i \le s x_i\}\Big) = 1 - \sum_{\emptyset \ne T \subset \{1, \dots, d\}} (-1)^{|T|-1}\, P(\eta_i \le s x_i,\ i \in T) = \sum_{\emptyset \ne T \subset \{1, \dots, d\}} (-1)^{|T|-1}\big(1 - P(\eta_i \le s x_i,\ i \in T)\big).$$
But
$$1 - P(\eta_i \le s x_i,\ i \in T) = s\, E\Big(\max_{i\in T}(|x_i|\, Z_i)\Big) + o(s),$$
so that, summing over $T$,
$$P(\eta > sx) = s\, ⦀x⦀_D + o(s)$$
according to Lemma 1.6.1. This completes the proof of Lemma 2.4.2.
where we have used the equality $1_{(x,\infty)}(X_1) = 1 - 1_{(-\infty,x]}(X_1)$, etc. As a consequence, by using Fubini's theorem, we obtain
$$\begin{aligned}
E\big((X_1 - X_2)(Y_1 - Y_2)\big) &= \int_{-\infty}^\infty\int_{-\infty}^\infty E\Big(\big(1_{(-\infty,x]}(X_2) - 1_{(-\infty,x]}(X_1)\big)\big(1_{(-\infty,y]}(Y_2) - 1_{(-\infty,y]}(Y_1)\big)\Big)\, dx\, dy\\
&= \int_{-\infty}^\infty\int_{-\infty}^\infty 2\,P(X \le x, Y \le y) - 2\,P(X \le x)\,P(Y \le y)\, dx\, dy,
\end{aligned}$$
where this time we have used the equality $E\big(1_{(-\infty,x]}(X_2)\, 1_{(-\infty,y]}(Y_2)\big) = P(X \le x, Y \le y)$, etc. This completes the proof.
Proof (of Lemma 2.5.1). From Lemma 2.5.2 and Lemma 1.2.2, we obtain
$$\begin{aligned}
\mathrm{Cov}(\eta_1, \eta_2) &= \int_{-\infty}^\infty\int_{-\infty}^\infty P(\eta_1 \le x, \eta_2 \le y) - P(\eta_1 \le x)\,P(\eta_2 \le y)\, dx\, dy\\
&= \int_{-\infty}^0\int_{-\infty}^0 P(\eta_1 \le x, \eta_2 \le y) - P(\eta_1 \le x)\,P(\eta_2 \le y)\, dx\, dy\\
&= \int_{-\infty}^0\int_{-\infty}^0 P(\eta_1 \le x, \eta_2 \le y)\, dx\, dy - E(\eta_1)\,E(\eta_2)\\
&= \int_{-\infty}^0\int_{-\infty}^0 P(\eta_1 \le x, \eta_2 \le y)\, dx\, dy - 1\\
&= E(\eta_1 \eta_2) - 1.
\end{aligned}$$
But
$$\int_{-\infty}^0\int_{-\infty}^0 P(\eta_1 \le x, \eta_2 \le y)\, dx\, dy = \int_{-\infty}^0\int_{-\infty}^0 \exp\big({-}\|(x, y)\|_D\big)\, dy\, dx = \int_{-\infty}^0\int_{-\infty}^0 \exp\Big(x\,\Big\|\Big(1, \frac{y}{x}\Big)\Big\|_D\Big)\, dy\, dx$$
$$= -\int_{-\infty}^0 x \int_0^\infty \exp\big(x\,\|(1, y)\|_D\big)\, dy\, dx = -\int_0^\infty\int_{-\infty}^0 \frac{x}{\|(1, y)\|_D^2}\,\exp(x)\, dx\, dy = \int_0^\infty \frac{1}{\|(1, y)\|_D^2}\, dy$$
For the logistic D-norm $\|\cdot\|_p$, this yields
$$\mathrm{Cov}(\eta_1, \eta_2) = \frac{1}{p}\, B\Big(\frac{1}{p}, \frac{1}{p}\Big) - 1,$$
where
$$B(x, y) = \int_0^1 t^{x-1}(1 - t)^{y-1}\, dt = \int_0^\infty \frac{t^{y-1}}{(1 + t)^{x+y}}\, dt, \qquad x, y > 0,$$
$$E(|\eta_1 - \eta_2|) = 2\Big(1 - \frac{1}{\|(1, 1)\|_D}\Big) = \frac{E(|Z_1 - Z_2|)}{\|(1, 1)\|_D}.$$
As $\|(1,1)\|_D$ lies between one and two according to equation (1.4), the preceding equation implies the bounds
$$\frac{E(|Z_1 - Z_2|)}{2} \le E(|\eta_1 - \eta_2|) \le E(|Z_1 - Z_2|).$$
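The logistic covariance formula above can be cross-checked numerically: the integral $\int_0^\infty \|(1, y)\|_p^{-2}\, dy$ should match $\frac1p B(\frac1p, \frac1p)$, computed via $B(a, b) = \Gamma(a)\Gamma(b)/\Gamma(a + b)$. The sketch below (illustrative helper names, simple trapezoidal quadrature) checks the case $p = 2$, where both sides equal $\pi/2$.

```python
import math

def integrand(y, p):
    # ||(1, y)||_p^(-2) for the logistic norm
    return (1.0 + y ** p) ** (-2.0 / p)

def integral(p, upper=1000.0, steps=200_000):
    # plain trapezoidal rule for the integral over [0, upper];
    # the tail beyond `upper` is of order 1/upper and is ignored
    h = upper / steps
    s = 0.5 * (integrand(0.0, p) + integrand(upper, p))
    for k in range(1, steps):
        s += integrand(k * h, p)
    return s * h

p = 2.0
beta = math.gamma(1.0 / p) ** 2 / math.gamma(2.0 / p)  # B(1/p, 1/p)
print(integral(p), beta / p)   # both close to pi/2 for p = 2
```

The covariance then follows as $\frac1p B(\frac1p, \frac1p) - 1 \approx 0.571$ for $p = 2$.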
$$\max(a, b) = \frac{a + b}{2} + \frac{|b - a|}{2}, \tag{2.31}$$
which holds for arbitrary numbers a, b ∈ R, we obtain
Recall that 1 D can be zero, in which case the preceding upper bound
is infinity and less helpful.
$$\begin{aligned}
E\Big(\max_{1\le i\le d}\eta_i\Big) &= -\int_{-\infty}^0 P\Big(\max_{1\le i\le d}\eta_i \le t\Big)\, dt\\
&= -\int_{-\infty}^0 P(\eta \le t\,1)\, dt\\
&= -\int_{-\infty}^0 \exp\big({-}\|t\,1\|_D\big)\, dt\\
&= -\int_{-\infty}^0 \exp\big(t\,\|1\|_D\big)\, dt\\
&= -\frac{1}{\|1\|_D}\int_{-\infty}^0 \exp(t)\, dt = -\frac{1}{\|1\|_D},
\end{aligned}$$
$$E\Big(\min_{1\le i\le d}\eta_i\Big) = -\int_{-\infty}^0 P\Big(\min_{1\le i\le d}\eta_i \le t\Big)\, dt,$$
and, consequently,
$$E\Big(\max_{1\le i\le d}\eta_i\Big) - E\Big(\min_{1\le i\le d}\eta_i\Big) \le \frac{1}{⦀1⦀_D} - \frac{1}{\|1\|_D}.$$
It is interesting to note that this upper bound converges to 1/λ if the dimen-
sion d tends to infinity.
$$P\Big(\frac{1}{|\eta_i|^c} \le x\Big) = P\Big(\frac{1}{x} \le |\eta_i|^c\Big) = P\Big(\frac{1}{x^{1/c}} \le -\eta_i\Big) = P\Big(\eta_i \le -\frac{1}{x^{1/c}}\Big) = \exp\Big({-}\frac{1}{x^{1/c}}\Big), \qquad x > 0,\ 1 \le i \le d,$$
i.e., $|\eta_i|^{-c}$ follows the Fréchet df $F_\alpha(x) = \exp(-x^{-\alpha})$, $x > 0$, with parameter $\alpha = 1/c$; note that $P(\eta_i = 0) = 0$. Its expectation is, by (2.23), $\mu_c = \Gamma(1 - c)$. The rv $\big(1/|\eta_i|^c\big)_{i=1}^d$ has df
$$H(x) = P\Big(\frac{1}{|\eta_i|^c} \le x_i,\ 1 \le i \le d\Big) = P\Big(\eta_i \le -\frac{1}{x_i^{1/c}},\ 1 \le i \le d\Big) = \exp\Big({-}\Big\|\Big(\frac{1}{x_1^{1/c}}, \dots, \frac{1}{x_d^{1/c}}\Big)\Big\|_D\Big), \qquad x > 0 \in \mathbb R^d,$$
On the other hand, from the fact that each D-norm is larger than the sup-norm $\|\cdot\|_\infty$ and smaller than the norm $\|\cdot\|_1$ (see (1.4)), we obtain
$$\Big\|\big(|x_1|^{1/c}, \dots, |x_d|^{1/c}\big)\Big\|_D^c \ge \Big(\max_{1\le i\le d}|x_i|^{1/c}\Big)^c = \max_{1\le i\le d}(|x_i|) = \|x\|_\infty$$
and
$$\Big\|\big(|x_1|^{1/c}, \dots, |x_d|^{1/c}\big)\Big\|_D^c \le \Big(\sum_{i=1}^d |x_i|^{1/c}\Big)^c \to_{c\to 0} \|x\|_\infty$$
Proof (of Proposition 2.6.1). From Lemma 1.2.2, we obtain that, for $x > 0 \in \mathbb R^d$,
$$\begin{aligned}
E\Big(\max_{1\le i\le d} x_i Z_i^{(c)}\Big) &= \frac{1}{\mu_c}\, E\Big(\max_{1\le i\le d} \frac{x_i}{|\eta_i|^c}\Big)\\
&= \frac{1}{\mu_c}\int_0^\infty P\Big(\max_{1\le i\le d} \frac{x_i}{|\eta_i|^c} > t\Big)\, dt\\
&= \frac{1}{\mu_c}\int_0^\infty 1 - P\Big(\max_{1\le i\le d} \frac{x_i}{|\eta_i|^c} \le t\Big)\, dt\\
&= \frac{1}{\mu_c}\int_0^\infty 1 - P\Big(\frac{x_i}{|\eta_i|^c} \le t,\ 1 \le i \le d\Big)\, dt\\
&= \frac{1}{\mu_c}\int_0^\infty 1 - P\Big(\frac{1}{|\eta_i|^c} \le \frac{t}{x_i},\ 1 \le i \le d\Big)\, dt\\
&= \frac{1}{\mu_c}\int_0^\infty 1 - \exp\Big({-}\Big\|\Big(\frac{1}{(t/x_1)^{1/c}}, \dots, \frac{1}{(t/x_d)^{1/c}}\Big)\Big\|_D\Big)\, dt\\
&= \frac{1}{\mu_c}\int_0^\infty 1 - \exp\Big({-}\frac{1}{t^{1/c}}\Big\|\big(x_1^{1/c}, \dots, x_d^{1/c}\big)\Big\|_D\Big)\, dt\\
&= \Big\|\big(x_1^{1/c}, \dots, x_d^{1/c}\big)\Big\|_D^c\, \frac{1}{\mu_c}\int_0^\infty 1 - \exp\Big({-}\frac{1}{t^{1/c}}\Big)\, dt
\end{aligned}$$
by the substitution $t \mapsto \big\|\big(x_1^{1/c}, \dots, x_d^{1/c}\big)\big\|_D^c\, t$.
The integral $\int_0^\infty 1 - \exp\big({-}1/t^{1/c}\big)\, dt$ equals $E(Y)$ according to Lemma 1.2.2, where $Y$ follows a Fréchet distribution with parameter $1/c$. It was shown in (2.23) that $E(Y) = \mu_c$, which completes the proof.
This may raise the conjecture that the sequence of D-norms converges to the
sup-norm ·∞ , if it converges. This is actually true and can easily be seen as
follows.
Recall that ·∞ ≤ ·D ≤ ·1 for an arbitrary D-norm and that c ∈
(0, 1). As a consequence, we obtain
$$\|(x_1, \dots, x_d)\|_{D(n)} = \Big\|\big(x_1^{1/c_n}, \dots, x_d^{1/c_n}\big)\Big\|_D^{c_n} \le \Big\|\big(x_1^{1/c_n}, \dots, x_d^{1/c_n}\big)\Big\|_1^{c_n} = \Big(\sum_{i=1}^d x_i^{1/c_n}\Big)^{c_n} \to_{n\to\infty} \|(x_1, \dots, x_d)\|_\infty, \qquad x \ge 0 \in \mathbb R^d,$$
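The limit can be observed numerically with the toy choice $\|\cdot\|_D = \|\cdot\|_1$: then $\|x\|_{D(n)} = \big(\sum_i x_i^{1/c_n}\big)^{c_n}$, the logistic norm with $p = 1/c_n$, which tends to the sup-norm as $c_n \to 0$. A minimal sketch with illustrative values:

```python
x = (1.0, 3.0, 2.0)        # illustrative values
vals = {}
for c in (0.5, 0.1, 0.01):
    # logistic norm with exponent 1/c, i.e. ||x||_{1/c}
    vals[c] = sum(t ** (1.0 / c) for t in x) ** c
    print(c, vals[c])
# the printed values decrease toward max(x) = 3.0 as c -> 0
```

Already for $c = 0.01$ the value is within about $10^{-3}$ of $\|x\|_\infty = 3$.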
This chapter reveals the crucial role that copulas play in MEVT. The D-norm
approach again proves to be quite a helpful tool. In particular, it turns out
that a multivariate df F is in the domain of attraction of a multivariate EVD
iff this is true for the univariate margins of F together with the condition
that the copula of F in its upper tail is close to that of a generalized Pareto
copula. As a consequence, MEVT actually means extreme value theory for
copulas.
Sklar’s Theorem
A copula is a multivariate df with the particular property that each univariate
margin is the uniform distribution on (0, 1). For an exhaustive account of
copulas we refer to Nelsen (2006). Sklar’s theorem plays a major role in the
characterization of F ∈ D(G) for a general df F on Rd .
where Fi−1 (u) = inf{t ∈ R : Fi (t) ≥ u}, u ∈ (0, 1), is the generalized inverse
of Fi . The copula of an rv Y = (Y1 , . . . , Yd ) is meant to be the copula of its
df.
If $Y = (Y_1, \dots, Y_d)$ is an rv such that, for each $i \in \{1, \dots, d\}$, the df $F_i$ of $Y_i$ is continuous in its upper tail, then the copula $C$ of $Y$ is, for $u = (u_1, \dots, u_d)$ close to $1 \in \mathbb R^d$, uniquely determined and given by
$$C(u) = F\big(F_1^{-1}(u_1), \dots, F_d^{-1}(u_d)\big).$$
A copula $C$ with the property
$$C(u) = 1 - \|1 - u\|_D, \qquad u_0 \le u \le 1 \in \mathbb R^d,$$
for some $u_0 < 1$, is called a generalized Pareto copula (GPC). These copulas turn out to be a key to MEVT; see, for example, Propositions 3.1.5 and 3.1.10.
Note that any marginal distribution of a GPC C is a lower dimensional
GPC as well: if the rv U = (U1 , . . . , Ud ) follows the GPC C on Rd , then the
for u close to 0 ∈ Rm .
$$\begin{aligned}
P(X \ge tx \mid X \ge x) &= \frac{P(X \ge tx)}{P(X \ge x)}\\
&= \frac{P(U_i \ge F(t x_i),\ 1 \le i \le d)}{P(U_i \ge F(x_i),\ 1 \le i \le d)}\\
&= \frac{P(U_i \ge 1 - (1 - F(t x_i)),\ 1 \le i \le d)}{P(U_i \ge 1 - (1 - F(x_i)),\ 1 \le i \le d)}\\
&= \frac{P\big(U_i \ge 1 - \frac{1}{t}(1 - F(x_i)),\ 1 \le i \le d\big)}{P(U_i \ge 1 - (1 - F(x_i)),\ 1 \le i \le d)}\\
&= \frac{1}{t}, \qquad t \ge 1,
\end{aligned}$$
by equation (3.3), provided $P(U \ge u) > 0$ for all $u \in [0, 1)^d$ close to $1 \in \mathbb R^d$. The preceding result can easily be extended to arbitrary univariate generalized Pareto margins as given in (2.7).
The previous result shows that if one wants to model the copula of multi-
variate exceedances above high thresholds, then a GPC is a first option.
The uniformity in the preceding result is meant as follows: For each ε > 0,
there exists δ > 0 such that
$$\frac{|C(u)-(1-\|1-u\|_D)|}{\|1-u\|}\le\varepsilon,\quad\text{if }u\in[1-\delta,1]^d,\ u\ne\mathbf1.$$
The norm $\|\cdot\|$ in the denominator can be arbitrarily chosen, due to the fact that all norms on $\mathbb R^d$ are equivalent.
As an example, we show in Corollary 3.1.15 that an Archimedean copula
Cϕ on Rd , whose generator function ϕ satisfies condition (3.11) below, is in
the domain of attraction of an SMS df with corresponding logistic D-norm.
Proof (of Proposition 3.1.5). The implication “⇐” is obvious: we have for
x ≤ 0 ∈ Rd
$$C^n\left(1+\frac xn\right)=\left(1-\frac1n\|x\|_D+o\left(\frac1n\right)\right)^n\to_{n\to\infty}\exp(-\|x\|_D)=:G(x),$$
and, by the expansion $\log(1+\varepsilon)=\varepsilon+O(\varepsilon^2)$,
$$\log C\left(1+\frac xt\right)=C\left(1+\frac xt\right)-1+O\left(\left(C\left(1+\frac xt\right)-1\right)^2\right).$$
The lower Fréchet bound for a multivariate df (see, for example, Galambos
(1987, Theorem 5.1.1)) for x = (x1 , . . . , xd ) ≤ 0 provides the inequality
$$0\ge C\left(1+\frac xt\right)-1\ge\frac1t\sum_{i=1}^dx_i.$$
Choose $u\in[0,1]^d$ with $\|1-u\|_D\le1/2$. The preceding equation with $t:=1/\|1-u\|_\infty$ implies
$$\frac{|C(u)-(1-\|1-u\|_D)|}{\|1-u\|_\infty}\le r(\|1-u\|_\infty);\tag{3.5}$$
note that we can apply (3.4) with these choices of u and t, since
$$1-\frac1t\mathbf1\le u\le\mathbf1\iff\mathbf0\le\frac{1-u}{\|1-u\|_\infty}\le\mathbf1,$$
and therefore
$$\frac{|C(u)-(1-\|1-u\|_D)|}{\|1-u\|_\infty}\to_{\|1-u\|_\infty\to0}0,\tag{3.6}$$
3.1 Characterizing Multivariate Domain of Attraction 141
as $u\to\mathbf1$.
As described above, uniformity in u in the above expansion means that for all $\varepsilon>0$, there exists $\delta>0$ such that the remainder term satisfies $|o(\|1-u\|_\infty)|\le\varepsilon\|1-u\|_\infty$ if $\|1-u\|_\infty\le\delta$. We prove this by contradiction. Suppose this uniformity is not valid. Then there exists $\varepsilon^*>0$ such that, for all $\delta>0$, there exists $u_\delta\in[0,1]^d$ with $\|1-u_\delta\|_\infty\le\delta$ and $|o(\|1-u_\delta\|_\infty)|>\varepsilon^*\|1-u_\delta\|_\infty$. But this clearly contradicts equation (3.6).
Since all norms on $\mathbb R^d$ are equivalent, the remainder term $o(\|1-u\|_\infty)$ in expansion (3.7) can be substituted by $o(\|1-u\|)$ with an arbitrary norm on $\mathbb R^d$. This completes the proof of Proposition 3.1.5.
The following consequence of Proposition 3.1.5 provides a handy charac-
terization of the condition C ∈ D(G).
Corollary 3.1.6 A copula C on $\mathbb R^d$ satisfies $C\in\mathcal D(G)$, with $G(x)=\exp(-\|x\|_D)$, $x\le0\in\mathbb R^d$, iff for all $x\le0\in\mathbb R^d$, the limit
$$\lim_{t\downarrow0}\frac{1-C(1+tx)}t=:\ell(x)\tag{3.8}$$
The limit $\ell(\cdot)$ is also known as the stable tail dependence function of C (Huang (1991)). The fact that each stable tail dependence function is actually a D-norm opens the way to estimating an underlying D-norm by using estimators of the stable tail dependence function.
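The defining limit (3.8) can be evaluated numerically once a copula is given; a minimal sketch with the independence copula as a stand-in (function names are ours):

```python
def C_indep(u, v):
    """Independence copula C(u, v) = u * v, used only as an illustration."""
    return u * v

def stdf(x1, x2, t=1e-6):
    """(1 - C(1 + t*x)) / t approximates the stable tail dependence
    function l(x) for x <= 0 as t tends to 0."""
    return (1.0 - C_indep(1.0 + t * x1, 1.0 + t * x2)) / t

print(stdf(-1.0, -0.5))   # close to |x1| + |x2| = 1.5 for independence
```

For the independence copula the limit is $\ell(x)=|x_1|+|x_2|=\|x\|_1$, the largest D-norm, consistent with Remark 3.1.7 below.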
Proof. We know from Proposition 3.1.5 that the condition C ∈ D(G) is equiv-
alent to the expansion
$$\sum_{i=1}^d(1+tx_i)-d+1\le C(1+tx)\le\min_{1\le i\le d}(1+tx_i);$$
142 3 Copulas & Multivariate Extremes
thus,
$$t\sum_{i=1}^d|x_i|\ge1-C(1+tx)\ge t\max_{1\le i\le d}|x_i|.$$
This implies in particular that $\ell(x)\to_{x\to0}0$ and $\ell(x)\to\infty$ if one component of x decreases to $-\infty$.
The Taylor expansion $\log(1+\varepsilon)=\varepsilon+O(\varepsilon^2)$ for $\varepsilon\to0$ implies, for $x\le0\in\mathbb R^d$ and $n\in\mathbb N$ large,
$$C^n\left(1+\frac xn\right)=\exp\left(n\log C\left(1+\frac xn\right)\right)=\exp\left(n\left(C\left(1+\frac xn\right)-1+O\left(\left(C\left(1+\frac xn\right)-1\right)^2\right)\right)\right)$$
$$=\exp\left(-\frac{1-C\left(1+\frac xn\right)}{1/n}+O\left(\frac1n\right)\right)\to_{n\to\infty}\exp(-\ell(x)).$$
Note that
$$C^n\left(1+\frac xn\right)=P\left(n\left(\max_{1\le i\le n}U^{(i)}-1\right)\le x\right),\quad n\in\mathbb N,$$
and
$$\ell(xe_i)=|x|,\quad x\le0,\ 1\le i\le d,$$
and thus, G has standard negative exponential margins. Theorem 2.3.3 implies $G(x)=\exp(-\|x\|_D)$, $x\le0\in\mathbb R^d$, which completes the proof.
Remark 3.1.7 Equation (3.9) in the preceding proof reveals why $\|\cdot\|_\infty$ and $\|\cdot\|_1$ are the smallest and the largest D-norms: this is actually due to the Hoeffding–Fréchet bounds for a multivariate df.
$$\frac{1-C_\vartheta(1+tx_1,\,1+tx_2)}t$$
$$G_{HR_\lambda}(x,y)=\exp\left(-\exp(-x)\,\Phi\left(\lambda+\frac{y-x}{2\lambda}\right)-\exp(-y)\,\Phi\left(\lambda+\frac{x-y}{2\lambda}\right)\right),\quad x,y\in\mathbb R,$$
has, according to equation (3.10) and Lemma 1.10.6, where the D-norm is explicitly given, the copula
$$C_{HR_\lambda}(u,v)=\exp\left(\log(u)\,\Phi\left(\lambda+\frac{\log(\log(u)/\log(v))}{2\lambda}\right)+\log(v)\,\Phi\left(\lambda+\frac{\log(\log(v)/\log(u))}{2\lambda}\right)\right),\quad u,v\in(0,1).$$
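The Hüsler–Reiss copula is straightforward to evaluate; a sketch using the error function for the standard normal df $\Phi$ (function names are ours):

```python
import math

def Phi(x):
    """Standard normal df via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def C_HR(u, v, lam):
    """Husler-Reiss copula with dependence parameter lam > 0."""
    lu, lv = math.log(u), math.log(v)
    return math.exp(
        lu * Phi(lam + math.log(lu / lv) / (2.0 * lam))
        + lv * Phi(lam + math.log(lv / lu) / (2.0 * lam))
    )

print(C_HR(0.3, 0.6, 50.0))   # large lam: close to independence, u*v = 0.18
```

As $\lambda\to\infty$, $\Phi(\cdot)\to1$ and the copula tends to independence $uv$; as $\lambda\downarrow0$, it tends to complete dependence.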
Taking the logarithm on both sides and applying the Taylor expansion log(1+
ε) = ε + O(ε2 ) for ε → 0, one obtains for x ∈ R with Gi (x) > 0
or
$$F^n(a_nx+b_n)=C_F^n\left(1+\frac{\big(n(F_i(a_{ni}x_i+b_{ni})-1)\big)_{i=1}^d}n\right)=\left(1-\frac1n\left\|\big(n(F_i(a_{ni}x_i+b_{ni})-1)\big)_{i=1}^d\right\|_D+o\left(\frac1n\right)\right)^n$$
$$\to_{n\to\infty}\exp\big(-\|(\psi_1(x_1),\dots,\psi_d(x_d))\|_D\big)=G(x_1,\dots,x_d)$$
$$H(u):=P\left(\max_{1\le i\le n}U^{(i)}\le u\right)=C_F^n(u).$$
$$H_j(u)=P\left(\max_{1\le i\le n}U_j^{(i)}\le u\right)=u^n,\quad u\in[0,1],$$
Proof. We know from Proposition 3.1.5 that the condition C ∈ D(G) is equiv-
alent to the expansion
$$C^n\left(1+\frac xn+o\left(\frac1n\right)\right)=\left(1-\left\|\frac xn+O\left(\frac1{n^2}\right)+o\left(\frac1n\right)\right\|_D\right)^n=\left(1-\frac1n\left\|x+O\left(\frac1n\right)+o(1)\right\|_D\right)^n$$
$$\to_{n\to\infty}\exp(-\|x\|_D)=\exp(-\|\log(u)\|_D),$$
in particular,
$$C^n\left(1+\frac xn+o\left(\frac1n\right)\right)\to_{n\to\infty}\exp(-\|x\|_D).$$
But
$$C^n\left(1+\frac xn+o\left(\frac1n\right)\right)=C^n\left(1+\frac xn\right)+o(1)$$
as $n\to\infty$, which follows from the general bound
$$|F(x)-F(y)|\le\sum_{i=1}^d|F_i(x_i)-F_i(y_i)|$$
$$\|y\|_{D_T}:=\left\|\sum_{j=1}^my_je_{i_j}\right\|_D=E\left(\max_{1\le j\le m}|y_j|Z_{i_j}\right),\quad y\in\mathbb R^m,$$
and the corresponding dual D-norm function is
$$E\left(\min_{1\le j\le m}|y_j|Z_{i_j}\right),\quad y\in\mathbb R^m.$$
The proof of the preceding lemma shows that, for every $T=\{i_1,\dots,i_m\}\subset\{1,\dots,d\}$,
$$P\left(U_{i_j}\ge u_j,\,1\le j\le m\right)=E\left(\min_{1\le j\le m}(1-u_j)Z_{i_j}\right)$$
for u close to $\mathbf1\in\mathbb R^m$, if C is a GPC; the right-hand side is the dual D-norm function of $1-u$.
The uniformity condition on u in the preceding result can be dropped for
the reverse implication “⇐.”
Note that the survival probability P (U ≥ u) of an rv U that follows a
copula C, also known as a survival copula, is not a copula itself.
Proof. We first establish the implication “⇒.” We can assume wlog that T =
{1, . . . , d}. From Proposition 3.1.5, we obtain the expansion
$$P(U\ge1-v)=1-P\left(\bigcup_{i=1}^d\{U_i\le1-v_i\}\right)=1-\sum_{\emptyset\ne T\subset\{1,\dots,d\}}(-1)^{|T|-1}P(U_i\le1-v_i,\,i\in T)$$
$$=1-\sum_{\emptyset\ne T\subset\{1,\dots,d\}}(-1)^{|T|-1}\left(1-\left\|\sum_{i\in T}v_ie_i\right\|_D+o\left(\left\|\sum_{i\in T}v_ie_i\right\|\right)\right)=\sum_{\emptyset\ne T\subset\{1,\dots,d\}}(-1)^{|T|-1}\left\|\sum_{i\in T}v_ie_i\right\|_D+o(\|v\|)$$
$$\frac{1-C(1-sx)}s=\frac{1-P(U_i\le1-sx_i,\,1\le i\le d)}s=\frac{P\left(\bigcup_{i=1}^d\{U_i\ge1-sx_i\}\right)}s=\sum_{\emptyset\ne T\subset\{1,\dots,d\}}(-1)^{|T|-1}\,\frac{P(U_i\ge1-sx_i,\,i\in T)}s$$
$$\lim_{s\downarrow0}\frac{P(U_i\ge1-sx_i,\,1\le i\le d)}s=\begin{cases}0,&\text{if }p=1,\\[2pt]E\left(\min_{1\le i\le d}x_iZ_i\right),&\text{if }1<p<\infty,\\[2pt]\min\{x_1,\dots,x_d\},&\text{if }p=\infty,\end{cases}$$
where, for $1<p<\infty$, $Z=(Z_1,\dots,Z_d)$ denotes a generator of $\|\cdot\|_p$; these values are the corresponding dual D-norm functions.
The preceding example gives rise to the conjecture that Cϕ ∈ D(Gp ) under
condition (3.11). This conjecture can easily be established.
Corollary 3.1.15 Let $C_\varphi$ be an arbitrary Archimedean copula on $\mathbb R^d$ with generator $\varphi$ that satisfies condition (3.11). Then, $C_\varphi\in\mathcal D(G_p)$, where $G_p$ is the standard max-stable df with D-norm $\|\cdot\|_p$, $p\in[1,\infty]$.
as ϕ(1) = 0. Since condition (3.11) does not depend on the dimension d, the
preceding Example 3.1.14 also entails that, for x = (x1 , . . . , xm ) ≥ 0 ∈ Rm ,
$$\lim_{s\downarrow0}\frac{P(U_{i_1}\ge1-sx_1,\dots,U_{i_m}\ge1-sx_m)}s=\begin{cases}0,&\text{if }p=1,\\[2pt]E\left(\min_{1\le j\le m}x_jZ_j\right),&\text{if }1<p<\infty,\\[2pt]\min\{x_1,\dots,x_m\},&\text{if }p=\infty,\end{cases}\tag{3.12}$$
where these dual D-norm functions are defined on Rm . Lemma 3.1.13 now
implies the assertion.
$$-\frac{s\,\varphi'(1-s)}{\varphi(1-s)}=p,\quad s\in(0,s_0],\tag{3.13}$$
or
$$\log\left(\frac{\varphi(1-s)}{\varphi(1-s_0)}\right)=\log\left(\left(\frac s{s_0}\right)^p\right),\quad s\in(0,s_0],$$
which yields
$$\varphi(1-s)=\frac{\varphi(1-s_0)}{s_0^p}\,s^p,\quad s\in[0,s_0],$$
i.e.,
$$\varphi(s)=c(1-s)^p,\quad s\in[1-s_0,1],$$
with $c:=\varphi(1-s_0)/s_0^p$. But this implies
Putting $x=y=1$, we obtain
$$2=2E(\max(U_1,U_2)),$$
or
$$E\big(\underbrace{1-\max(U_1,U_2)}_{\in[0,1]}\big)=0,$$
and, thus,
$$P(\max(U_1,U_2)=1)=1.$$
But
This idea is investigated in what follows. It turns out that it is actually possible
to cut off the upper tail of a given copula C and to impute a GPC Q in such
a way that the result is again a copula.
Note that
$$F^{[x_0]}(x)=P(X\le x\mid X>x_0)=\frac{F(x)-F(x_0)}{1-F(x_0)},\quad x\ge x_0,$$
where we require $F(x_0)<1$. The univariate POT approach is the approximation of the upper tail of F by that of a GPD H,
where α, μ, and σ are shape, location and scale parameters of the GPD
H respectively. Recall that the family of univariate standardized GPDs is
given by
$$H_\alpha(x)=\begin{cases}1-(-x)^\alpha,&-1\le x\le0,&\text{if }\alpha>0,\\[2pt]1-x^\alpha,&x\ge1,&\text{if }\alpha<0,\\[2pt]1-\exp(-x),&x\ge0,&\text{if }\alpha=0.\end{cases}$$
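These three cases translate directly into code; a minimal sketch (the function name is ours):

```python
import math

def H(alpha, x):
    """Standardized GPD df H_alpha, following the three cases above."""
    if alpha > 0:                    # beta part: -1 <= x <= 0
        return 1.0 - (-x) ** alpha
    if alpha < 0:                    # Pareto part: x >= 1
        return 1.0 - x ** alpha
    return 1.0 - math.exp(-x)        # exponential part: x >= 0

print(H(-1.0, 2.0), H(1.0, -0.25), H(0.0, 1.0))
```

The caller is responsible for staying inside the stated support of each case.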
Multivariate Piecing-Together
A multivariate extension of the univariate PT approach was developed in
Aulbach et al. (2012a) and, for illustration, applied to operational loss data.
This approach is based on the idea that a multivariate df F can be decomposed by Sklar's theorem 3.1.1 into its copula C and its marginal dfs. The multivariate PT approach then consists of the two steps:
(i) The upper tail of the given d-dimensional copula C is cut off and sub-
stituted by a GPC in a continuous manner, such that the result is again
a copula, called a PT copula. Figure 3.1 illustrates this approach in the
bivariate case: the copula C is replaced in the upper right rectangle of
the unit square by a GPC Q; the lower part of C is kept in the lower left
rectangle, whereas the other two rectangles are needed for a continuous
transition from C to Q.
(ii) Univariate dfs $F_1^*,\dots,F_d^*$ are injected into the resulting copula.
[Figure 3.1: the unit square with corners (0,0), (1,0), (0,1), (1,1); C is kept near (0,0), and the GPC Q is imputed near (1,1).]
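The two PT steps can be sketched in code. The concrete choices below — C the independence copula and Q the comonotone GPC (the one with D-norm $\|\cdot\|_\infty$) — are placeholder assumptions for illustration only, and `pt_sample` is our own helper name:

```python
import random

random.seed(1)

def pt_sample(u_threshold=0.9):
    """One draw from a bivariate piecing-together copula: below the
    threshold keep U ~ C; above it, rescale V ~ Q into (u, 1]."""
    U = (random.random(), random.random())   # placeholder: independence copula C
    w = random.random()
    V = (w, w)                               # placeholder: comonotone GPC piece Q
    return tuple(
        Ui if Ui <= u_threshold else u_threshold + (1 - u_threshold) * Vi
        for Ui, Vi in zip(U, V)
    )

sample = [pt_sample() for _ in range(1000)]
```

Since V is independent of U and uniform on (0,1) in each margin, the resulting margins remain uniform, mirroring the computation of $P(Y\le x)=C(x)$ below.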
$$\|x\|_D=E\left(\max_{1\le j\le d}|x_j|\,Z_j\,\frac{1(U_j>u_j)}{1-u_j}\right),\quad x\in\mathbb R^d,$$
$$P(Y_i\le x)=x,\quad0\le x\le1,$$
$$P(Y\le x)=\sum_{K\subset\{1,\dots,d\}}P\left(Y\le x;\ U_k\le u_k,\,k\in K;\ U_j>u_j,\,j\notin K\right)$$
$$=\sum_{K\subset\{1,\dots,d\}}P\big(U_i1(U_i\le u_i)+(u_i+(1-u_i)V_i)1(U_i>u_i)\le x_i,\ 1\le i\le d;\ U_k\le u_k,\,k\in K;\ U_j>u_j,\,j\notin K\big)$$
$$=P(U_i\le x_i,\,1\le i\le d)=C(x)$$
$$P(Y\le x)=\sum_{K\subset\{1,\dots,d\}}P\left(Y\le x;\ U_k\le u_k,\,k\in K;\ U_j>u_j,\,j\notin K\right)$$
$$=\sum_{K\subset\{1,\dots,d\}}P\left(U_k\le u_k,\,k\in K;\ u_j+(1-u_j)V_j\le x_j,\ U_j>u_j,\,j\notin K\right)$$
$$=\sum_{K\subset\{1,\dots,d\}}P\left(U_k\le u_k,\,k\in K;\ U_j>u_j,\,j\notin K\right)P\left(V_j\le\frac{x_j-u_j}{1-u_j},\,j\notin K\right)$$
$$=\sum_{K\subset\{1,\dots,d\}}E\left(\prod_{k\in K}1(U_k\le u_k)\prod_{j\notin K}1(U_j>u_j)\right)P\left(V_j\le\frac{x_j-u_j}{1-u_j},\,j\notin K\right).$$
$$P\left(V_j\le\frac{x_j-u_j}{1-u_j},\,j\notin K\right)=1-E\left(\max_{j\notin K}\left|\frac{x_j-u_j}{1-u_j}-1\right|Z_j\right)=1-E\left(\max_{j\notin K}\frac{|x_j-1|}{1-u_j}\,Z_j\right)$$
Writing $\overline K:=\{1,\dots,d\}\setminus K$,
$$P(Y\le x)=P(U_k\le u_k,\,1\le k\le d)+\sum_{\substack{K\subset\{1,\dots,d\}\\\overline K\ne\emptyset}}E\left(\prod_{k\in K}1(U_k\le u_k)\prod_{j\in\overline K}1(U_j>u_j)\right)\left(1-E\left(\max_{j\in\overline K}\frac{|x_j-1|}{1-u_j}\,Z_j\right)\right)$$
$$=1-\sum_{\substack{K\subset\{1,\dots,d\}\\\overline K\ne\emptyset}}E\left(\prod_{k\in K}1(U_k\le u_k)\prod_{j\in\overline K}1(U_j>u_j)\right)E\left(\max_{j\in\overline K}\frac{|x_j-1|}{1-u_j}\,Z_j\right)$$
$$=1-E\left(\sum_{\substack{K\subset\{1,\dots,d\}\\\overline K\ne\emptyset}}\prod_{k\in K}1(U_k\le u_k)\prod_{j\in\overline K}1(U_j>u_j)\,\max_{j\in\overline K}\frac{|x_j-1|}{1-u_j}\,Z_j\right)$$
3.2 Multivariate Piecing-Together 157
With $\overline K:=\{1,\dots,d\}\setminus K$, this continues as
$$=1-E\left(\sum_{\substack{K\subset\{1,\dots,d\}\\\overline K\ne\emptyset}}\prod_{k\in K}1(U_k\le u_k)\prod_{j\in\overline K}1(U_j>u_j)\,\max_{1\le j\le d}|x_j-1|\,Z_j\,\frac{1(U_j>u_j)}{1-u_j}\right)$$
$$=1-E\left(\max_{1\le j\le d}|x_j-1|\,Z_j\,\frac{1(U_j>u_j)}{1-u_j}\ \sum_{\substack{K\subset\{1,\dots,d\}\\\overline K\ne\emptyset}}\prod_{k\in K}1(U_k\le u_k)\prod_{j\in\overline K}1(U_j>u_j)\right)$$
$$=1-E\left(\max_{1\le j\le d}|x_j-1|\,Z_j\,\frac{1(U_j>u_j)}{1-u_j}\,\big(1-1(U_j\le u_j,\,1\le j\le d)\big)\right)$$
$$=1-E\left(\max_{1\le j\le d}|x_j-1|\,Z_j\,\frac{1(U_j>u_j)}{1-u_j}\right)=1-\|x-1\|_D,$$
The term $o(\|1-v\|)$ can be dropped in the preceding result if C is a GPC itself, i.e., if $C(v)=1-\|1-v\|_D$, $v\in[u,1]\subset\mathbb R^d$.
uniformly for v ∈ [0, 1]d . On the other hand, we have for v close enough to 1,
where the final equation follows from (2.16). This completes the proof.
$$X:=\frac V2\left(\frac1{S_1},\,\frac1{S_2}\right)\in(-\infty,0]^2\tag{3.17}$$
$$\|x\|_D=\|x\|_1-\frac{|x_1|\,|x_2|}{\|x\|_1}\quad\text{for }x=(x_1,x_2)\ne0.$$
3.3 Copulas Not in the Domain of Attraction 159
$$\lim_{t\downarrow0}\frac{1-C_\lambda(1-t,1-t)}t$$
does not exist for $\lambda\in\left[-1/\sqrt2,\,1/\sqrt2\right]\setminus\{0\}$. Since $C_\lambda$ coincides with the copula of 2X, we obtain
$$\frac{1-C_\lambda\big(F_\lambda(s),F_\lambda(s)\big)}{1-F_\lambda(s)}=\frac{1-P(-V/S_1\le s,\,-V/S_2\le s)}{1-P(-V/S_1\le s)}=\frac{1-P\big(V\ge|s|\max(U,1-U)\big)}{1-P\big(V\ge|s|\,U\big)}$$
$$=\frac{\int_0^1P\big(V\le|s|\max(u,1-u)\big)\,du}{\int_0^1P\big(V\le|s|\,u\big)\,du}=\frac{\int_0^{1/2}H_\lambda\big(|s|(1-u)\big)\,du+\int_{1/2}^1H_\lambda\big(|s|\,u\big)\,du}{\int_0^1H_\lambda\big(|s|\,u\big)\,du}=\frac{2\int_{1/2}^1H_\lambda\big(|s|\,u\big)\,du}{\int_0^1H_\lambda\big(|s|\,u\big)\,du}.$$
and
$$\int_0^c\frac1u\,u^2\sin(\log(u))\,du=\frac{c^2}5\big(2\sin(\log(c))-\cos(\log(c))\big),$$
whose limit does not exist for $s\uparrow0$ if $\lambda\in\left[-1/\sqrt2,\,1/\sqrt2\right]\setminus\{0\}$; consider, e.g., the sequences $s_n^{(1)}=-\exp\big((1-2n)\pi\big)$ and $s_n^{(2)}=-\exp\big((1/2-2n)\pi\big)$ as $n\to\infty$.
On the other hand, elementary computations for x = (x1 , x2 ) ∈ (−∞, 0]2 \
{0} show
Corollary 3.1.6 now implies that C0 ∈ D(G), with the corresponding D-norm
being the above limit.
4 An Introduction to Functional Extreme Value Theory
$$Z_t\le c,\quad t\in[0,1],\tag{4.1}$$
for some constant c ≥ 1. For each functional D-norm, there exists a generator
with this additional property; see Theorem 1.10.8. Let U be an rv that is
uniformly distributed on (0, 1) and that is independent of Z. Put
$$V:=(V_t)_{t\in[0,1]}:=\left(\frac1U\,Z_t\right)_{t\in[0,1]}=:\frac1U\,Z.\tag{4.2}$$
Denote by $[0,c]^{[0,1]}:=\{f:[0,1]\to[0,c]\}$ the set of all functions from the interval [0,1] to the interval [0,c]. Repeating the arguments in equation (2.11), we obtain, for $g\in E[0,1]$ with $g(t)\ge c$, $t\in[0,1]$,
$$P(V\le g)=P\left(U\ge\frac{Z_t}{g(t)},\,t\in[0,1]\right)=\int_{[0,c]^{[0,1]}}P\left(U\ge\frac{z_t}{g(t)},\,t\in[0,1]\right)(P*Z)\big(d(z_t)_{t\in[0,1]}\big)$$
$$=\int_{[0,c]^{[0,1]}}P\left(U\ge\sup_{t\in[0,1]}\frac{z_t}{g(t)}\right)(P*Z)\big(d(z_t)_{t\in[0,1]}\big)=\int_{[0,c]^{[0,1]}}\left(1-P\left(U\le\sup_{t\in[0,1]}\frac{z_t}{g(t)}\right)\right)(P*Z)\big(d(z_t)_{t\in[0,1]}\big)$$
$$=1-\int_{[0,c]^{[0,1]}}\sup_{t\in[0,1]}\frac{z_t}{g(t)}\,(P*Z)\big(d(z_t)_{t\in[0,1]}\big)=1-E\left(\sup_{t\in[0,1]}\frac{Z_t}{g(t)}\right)=1-\|1/g\|_D,\tag{4.3}$$
$$P(V\ge g)=P(V>g)=E\left(\inf_{t\in[0,1]}\frac{Z_t}{g(t)}\right),$$
the dual D-norm function of $1/g$.
4.1 Generalized Pareto Processes 163
$$P(V>g)=\int_{[0,c]^{[0,1]}}P\left(U<\frac{z_t}{g(t)},\,t\in[0,1]\right)(P*Z)\big(d(z_t)_{t\in[0,1]}\big)=\int_{[0,c]^{[0,1]}}P\left(U\le\inf_{t\in[0,1]}\frac{z_t}{g(t)}\right)(P*Z)\big(d(z_t)_{t\in[0,1]}\big)$$
$$=\int_{[0,c]^{[0,1]}}\inf_{t\in[0,1]}\frac{z_t}{g(t)}\,(P*Z)\big(d(z_t)_{t\in[0,1]}\big)=E\left(\inf_{t\in[0,1]}\frac{Z_t}{g(t)}\right).$$
$$P(V\ge tg\mid V\ge g)=\frac1t,\quad t\ge1.$$
Proof. We have
$$P(V\ge tg\mid V\ge g)=\frac{P(V\ge tg,\,V\ge g)}{P(V\ge g)}=\frac{P(V\ge tg)}{P(V\ge g)}=\frac{E\left(\inf_{s\in[0,1]}\frac{Z_s}{t\,g(s)}\right)}{E\left(\inf_{s\in[0,1]}\frac{Z_s}{g(s)}\right)}=\frac1t.$$
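This exceedance-stability property can be checked by simulation; a one-dimensional sketch with the constant generator $Z\equiv1$, so that $V=1/U$ (function names are ours):

```python
import random

random.seed(0)
n = 200_000
# V = Z/U with Z = 1: a standard Pareto variable with P(V >= t) = 1/t
V = [1.0 / random.random() for _ in range(n)]

def cond_exceed(t):
    """Empirical P(V >= t | V >= 1); the Pareto property predicts 1/t."""
    return sum(1 for v in V if v >= t) / n   # P(V >= 1) = 1 here

for t in (2.0, 4.0):
    print(t, cond_exceed(t))
```

The empirical conditional exceedance probabilities match $1/t$ up to Monte Carlo noise, illustrating why generalized Pareto processes model exceedances over high thresholds.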
$$E(ST(s))=E\left(\int_0^11_{(s,\infty)}(V_t)\,dt\right)=\frac1s.$$
Given that the sojourn time ST (s) is positive, this implies for the conditional
expectation of the sojourn time the equation
$$E(ST(s)\mid ST(s)>0)=\frac{E(ST(s))}{1-P(ST(s)=0)}=\frac{1/s}{1-P(V_t\le s,\,t\in[0,1])}=\frac1{\|1\|_D},\tag{4.4}$$
$$\|1\|_D\ge\|1\|_\infty=1,$$
and thus, $E(ST(s)\mid ST(s)>0)$ increases with decreasing $\|1\|_D$; its maximum value is one in the case $\|1\|_D=1$, which characterizes the functional sup-norm $\|\cdot\|_\infty$ by the functional version of Takahashi's Theorem in Corollary 1.10.5.
4.2 Max-Stable Processes 165
$$\to_{n\to\infty}\exp\left(-\left\|\frac1g\right\|_D\right),$$
where the mathematical operations $\max_{1\le i\le n}V^{(i)}$, etc., are taken componentwise. The above reasoning is rigorous if $\inf_{t\in[0,1]}g(t)>0$. Otherwise, check that the above convergence still holds with the limit $\exp(-\|1/g\|_D)=0$.
Next, we ask: Is there a stochastic process ξ = (ξt )t∈[0,1] on [0, 1] with
$$P(\xi\le g)=\exp\left(-\left\|\frac1g\right\|_D\right),\quad g\in E[0,1],\ g>0\,?$$
$$P\left(\frac1n\max_{1\le i\le n}\xi^{(i)}\le g\right)=P\left(\max_{1\le i\le n}\xi^{(i)}\le ng\right)=P\left(\xi^{(i)}\le ng,\,1\le i\le n\right)=\prod_{i=1}^nP\left(\xi^{(i)}\le ng\right)$$
$$=P(\xi\le ng)^n=\exp\left(-n\left\|\frac1{ng}\right\|_D\right)=\exp\left(-\left\|\frac1g\right\|_D\right)=P(\xi\le g).$$
Such processes ξ actually exist, see equation (4.7).
166 4 An Introduction to Functional Extreme Value Theory
$$P\left(n\max_{1\le i\le n}\eta^{(i)}\le f\right)=P(\eta\le f).$$
$$P(X\le x)=\exp(x/\vartheta),\quad x\le0,$$
$$P(X=0)=1-P(X<0)=1-P(X\le0)=0.\tag{4.5}$$
$$P(\eta<f)=\lim_{n\in\mathbb N}P\left(\eta\le f-\frac1n\right)=\lim_{n\in\mathbb N}\exp\left(-\left\|f-\frac1n\right\|_D\right)=\exp(-\|f\|_D).$$
Because
$$P(\eta<f)\le P(\eta\le f)=\exp(-\|f\|_D),$$
the assertion follows.
$$P(\xi_t\le y)=P\left(\eta_t\le-\frac1y\right)=\exp\left(-\frac1y\right),\quad y>0,$$
and, for $g\in E[0,1]$, $g>0$, we have
$$P(\xi\le g)=P\left(\eta\le-\frac1g\right)=\exp\left(-\left\|\frac1g\right\|_D\right).$$
where {t1 , t2 , . . . } is a dense subset of [0, 1] that also contains the finitely
many points t ∈ [0, 1] at which the function f is discontinuous. From Propo-
sition 2.4.1, we obtain that
$$\eta_{t_1,\dots,t_d}:=(\eta_{t_1},\dots,\eta_{t_d})=-\left(\frac1{\sup_{i\in\mathbb N}\big(V_iZ_{t_j}^{(i)}\big)}\right)_{j=1}^d$$
with corresponding D-norm
$$\|x\|_{D_{t_1,\dots,t_d}}=E\left(\max_{1\le j\le d}|x_j|Z_{t_j}\right),\quad x\in\mathbb R^d.$$
The dominated convergence theorem, together with the fact that Z has
continuous sample paths, implies
$$P\left(\bigcap_{i=1}^n\{\eta_{t_i}\le f(t_i)\}\right)=P\left(\eta_{t_1,\dots,t_d}\le(f(t_j))_{j=1}^d\right)=\exp\left(-\|(f(t_1),\dots,f(t_d))\|_{D_{t_1,\dots,t_d}}\right)$$
$$=\exp\left(-E\left(\max_{1\le j\le d}|f(t_j)|Z_{t_j}\right)\right)\to_{n\to\infty}\exp\left(-E\left(\sup_{t\in[0,1]}\big(|f(t)|Z_t\big)\right)\right)=\exp(-\|f\|_D).$$
It remains to show that η has continuous sample paths. Put $\xi(t):=\sup_{i\in\mathbb N}\big(V_iZ_t^{(i)}\big)$, $t\in[0,1]$. We show
$$\liminf_{t\to t_0}\xi(t)\ge\xi(t_0)\quad\text{and}\quad\limsup_{t\to t_0}\xi(t)\le\xi(t_0)$$
for each $t_0\in[0,1]$ with probability one. This implies pathwise continuity of the process $\xi=(\xi(t))_{t\in[0,1]}$.
Recall that we require boundedness of Z, i.e., $\sup_{t\in[0,1]}Z_t\le c$ for some number $c\ge1$. For any $M\in\mathbb N$, we have
$$\xi(t)=\max\left(\max_{1\le i\le M}V_iZ_t^{(i)},\ \sup_{i\ge M+1}\frac{Z_t^{(i)}}{\sum_{k=1}^iE_k}\right)\begin{cases}\le\max_{1\le i\le M}V_iZ_t^{(i)}+\dfrac c{\sum_{k=1}^{M+1}E_k},\\[6pt]\ge\max_{1\le i\le M}V_iZ_t^{(i)}.\end{cases}$$
The continuity of each $Z_t^{(i)}$ implies
$$\liminf_{t\to t_0}\xi(t)\ge\max_{1\le i\le M}V_iZ_{t_0}^{(i)}$$
for each t0 ∈ [0, 1], with probability one. This shows that the process ξ has
continuous sample paths and, therefore, the process η = −1/ξ as well. Note
that P (ξ > 0) = 1, which can easily be seen using the fact that ξ is a max-
stable process, as in the proof of equation (9.4.6) in de Haan and Ferreira
(2006).
Proof. Choose $f\in E^-[0,1]$, and let $\{t_1,t_2,\dots\}$ be a dense set in [0,1] that also contains the finitely many points $t\in[0,1]$ at which f is discontinuous. We can assume wlog that $\sup_{t\in[0,1]}f(t)=:K<0$; otherwise, the probability $P(\eta>f)$ would be zero and parts (i) and (ii) of Lemma 4.2.5 are obviously true, as $E\left(\inf_{t\in[0,1]}(|f(t)|Z_t)\right)=0$. From Lemma 2.4.2 and the continuity of η, we obtain, for $\varepsilon\in(0,|K|)$,
$$P(\eta>f)\ge P\left(\bigcap_{i\in\mathbb N}\{\eta_{t_i}>f(t_i)+\varepsilon\}\right)=\lim_{n\in\mathbb N}P\left(\bigcap_{i=1}^n\{\eta_{t_i}>f(t_i)+\varepsilon\}\right)$$
$$\limsup_{s\downarrow0}\frac{P(\eta>sf)}s\le E\left(\min_{1\le i\le n}\big(|f(t_i)|Z_{t_i}\big)\right),\quad n\in\mathbb N.$$
$$=1-\sum_{\emptyset\ne T\subset\{1,\dots,n\}}(-1)^{|T|-1}\exp\left(-sE\left(\max_{j\in T}|f(t_j)|Z_{t_j}\right)\right)$$
by equation (1.10).
The function H is differentiable; thus,
$$H'(0)=\sum_{\emptyset\ne T\subset\{1,\dots,n\}}(-1)^{|T|-1}E\left(\max_{j\in T}|f(t_j)|Z_{t_j}\right)=E\left(\sum_{\emptyset\ne T\subset\{1,\dots,n\}}(-1)^{|T|-1}\max_{j\in T}|f(t_j)|Z_{t_j}\right)=E\left(\min_{1\le j\le n}|f(t_j)|Z_{t_j}\right)$$
$$\limsup_{s\downarrow0}\frac{P(\eta>sf)}s\le E\left(\inf_{t\in[0,1]}\big(|f(t)|Z_t\big)\right),$$
the dual D-norm function of f,
which completes the proof of part (ii) and, thus, of Lemma 4.2.5.
It is easy to find an SMS process η and f ∈ E − [0, 1] with a strict inequality
in part (i) of Lemma 4.2.5; see the next example. This construction of an
SMS process is a particular example of a max-linear model discussed and
generalized in Section 4.3.
Example 4.2.6 (Simple Max-Linear Model) Take two indepen-
dent and identically standard negative exponentially distributed rvs η0 ,
$\eta_1$, and put, for $t\in(0,1)$,
$$\eta_t:=\max\left(\frac{\eta_0}{1-t},\,\frac{\eta_1}t\right).$$
Put
$$f(t):=\begin{cases}-\dfrac1{1-t},&t\in\left[0,\frac12\right],\\[6pt]-\dfrac1t,&t\in\left(\frac12,1\right].\end{cases}$$
The function f is negative and continuous, and we obtain
$$P(\eta>f)=P\left(\max\left(\frac{\eta_0}{1-t},\frac{\eta_1}t\right)>f(t),\,t\in[0,1]\right)\ge P\left(\frac{\eta_0}{1-t}>f(t),\,t\in[0,1/2];\ \frac{\eta_1}t>f(t),\,t\in(1/2,1]\right)$$
$$=P(\eta_0>-1,\,\eta_1>-1)=P(\eta_0>-1)^2=\exp(-2)>0.$$
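The simple max-linear model is easy to simulate directly; a quick Monte Carlo sketch (our own code, not the book's) confirms the standard negative exponential margins $P(\eta_t\le x)=e^{(1-t)x}e^{tx}=e^x$:

```python
import math
import random

random.seed(42)

def eta_t(t):
    """One draw of eta_t = max(eta0/(1-t), eta1/t) with iid standard
    negative exponential eta0, eta1."""
    e0 = -random.expovariate(1.0)
    e1 = -random.expovariate(1.0)
    return max(e0 / (1 - t), e1 / t)

n, t, x = 100_000, 0.3, -0.5
freq = sum(1 for _ in range(n) if eta_t(t) <= x) / n
print(freq, math.exp(x))   # the two values agree up to Monte Carlo noise
```

Independence of $\eta_0$ and $\eta_1$ makes the df factor, and the exponents $(1-t)x$ and $tx$ add up to $x$, independently of $t$.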
Proof (of Example 4.2.6). The process Z is non-negative, pointwise not larger
than 2, and satisfies for each t ∈ [0, 1]
We have, moreover,
$$=\exp(-\|f\|_D),$$
which proves equation (4.8).
$$\le\frac1{E\left(\min_{t\in[a,b]}Z_t\right)}-\frac1{E\left(\max_{t\in[a,b]}Z_t\right)}.$$
Example 4.2.6 shows that the dual D-norm function of the constant function 1 can be zero, in which case the preceding upper bound is not helpful.
Clearly, the process η has continuous sample paths in our setup. It is worth mentioning, on the other hand, that the upper bound in Lemma 4.2.7 implies continuity in probability of η, i.e., $P(|\eta_t-\eta_s|\ge\varepsilon)\to_{t\to s}0$ for each $s\in[0,1]$: the pathwise continuity of Z, together with the dominated convergence theorem, yields
if a ≤ s ≤ b.
4.3 Generalized Max-Linear Models 175
$$\sum_{i=0}^dg_i(t)=1,\quad t\in[0,1].$$
The system of Bernstein polynomials
$$g_i(t):=\binom di\,t^i(1-t)^{d-i},\quad i=0,\dots,d,\ t\in[0,1],$$
satisfies condition (4.9). Particularly helpful functions g0∗ , . . . , gd∗ are defined
in (4.12).
Now, for $t\in[0,1]$, put
$$\eta_t:=\max_{i=0,\dots,d}\frac{X_i}{g_i(t)}.\tag{4.10}$$
The model (4.10) is called the generalized max-linear model. It defines an SMS
process, as the next lemma shows.
Lemma 4.3.1 The stochastic process $\eta=(\eta_t)_{t\in[0,1]}$ in (4.10) defines an SMS process with generator process $\hat Z=(\hat Z_t)_{t\in[0,1]}$ given by
In model (4.10) we have not made any further assumptions on the D-norm $\|\cdot\|_{D_{0,\dots,d}}$, that is, on the dependence structure of the rv $(X_0,\dots,X_d)$. The special case $\|\cdot\|_{D_{0,\dots,d}}=\|\cdot\|_1$ characterizes independence of $X_0,\dots,X_d$. This is the regular max-linear model; see Wang and Stoev (2011). In contrast, $\|\cdot\|_{D_{0,\dots,d}}=\|\cdot\|_\infty$ provides the case of complete dependence $X_0=\dots=X_d$ a.s., with the constant generator $Z_0=\dots=Z_d=1$. Thus, condition (4.9) becomes $\max_{i=0,\dots,d}g_i(t)=1$, $t\in[0,1]$; therefore,
Proof (of Lemma 4.3.1). At first, we verify that the process $\hat Z$ is indeed a generator process. It is obvious that the sample paths of $\hat Z$ are in $C^+[0,1]$, owing to the continuity of each $g_i$. Furthermore, for each $t\in[0,1]$, we have by construction
$$E\big(\hat Z_t\big)=\|(g_0(t),\dots,g_d(t))\|_{D_{0,\dots,d}}=1.$$
$$=\exp\left(-E\left(\sup_{t\in[0,1]}\left(|f(t)|\max_{i=0,\dots,d}\big(g_i(t)Z_i\big)\right)\right)\right)$$
$$P\left(n\max_{1\le k\le n}\eta^{(k)}\le f\right)=P\left(X_i\le\inf_{t\in[0,1]}\frac{g_i(t)f(t)}n,\ i=0,\dots,d\right)^n$$
$$=\exp\left(-\left\|\left(\sup_{t\in[0,1]}\big(g_0(t)|f(t)|\big),\dots,\sup_{t\in[0,1]}\big(g_d(t)|f(t)|\big)\right)\right\|_{D_{0,\dots,d}}\right)=P(\eta\le f).$$
the mean squared error $MSE\big(\hat\eta_t^{(d)}\big):=E\big(\big(\eta_t-\hat\eta_t^{(d)}\big)^2\big)$ vanishes for all $t\in[0,1]$ as d increases. Moreover, we establish uniform convergence of the "predictive" processes and the corresponding generator processes to the original ones.
Clearly, $g_0^*,\dots,g_d^*\in C^+[0,1]$: the fact that a D-norm is standardized implies
$$\lim_{t\uparrow s_i}g_i^*(t)=\frac{s_i-s_{i-1}}{\|(0,\,s_i-s_{i-1})\|_{D_{i-1,i}}}=1=\frac{s_{i+1}-s_i}{\|(s_{i+1}-s_i,\,0)\|_{D_{i,i+1}}}=\lim_{t\downarrow s_i}g_i^*(t).$$
Hence, the functions $g_0^*,\dots,g_d^*$ are suitable for the generalized max-linear model (4.10). In addition, they have the following property:
Lemma 4.3.2 The functions $g_0^*,\dots,g_d^*$ defined above satisfy
In view of their properties described above, the functions $g_i^*$ work like kernels in non-parametric kernel density estimation. Each function $g_i^*$ attains its maximum value 1 at $t=s_i$ and, as the distance between t and $s_i$ increases, the value $g_i^*(t)$ shrinks to zero.
Proof (of Lemma 4.3.2). From the fact that a D-norm is monotone and standardized, we obtain, for $i=1,\dots,d-1$ and $t\in[s_{i-1},s_i)$,
$$g_i^*(t)=\frac{t-s_{i-1}}{\|(s_i-t,\,t-s_{i-1})\|_{D_{i-1,i}}}=\frac1{\left\|\left(\frac{s_i-t}{t-s_{i-1}},\,1\right)\right\|_{D_{i-1,i}}}\le\frac1{\|(0,1)\|_{D_{i-1,i}}}=1.$$
Analogously, we have $g_0^*\le1$ and $g_d^*\le1$. The assertion now follows since $g_i^*(s_i)=1$, $i=0,\dots,d$.
$$\hat\eta_t=\max\left(\frac{\eta_{s_{i-1}}}{g_{i-1}^*(t)},\,\frac{\eta_{s_i}}{g_i^*(t)}\right)=\|(s_i-t,\,t-s_{i-1})\|_{D_{i-1,i}}\max\left(\frac{\eta_{s_{i-1}}}{s_i-t},\,\frac{\eta_{s_i}}{t-s_{i-1}}\right),\tag{4.13}$$
for $t\in[s_{i-1},s_i]$, $i=1,\dots,d$. Note that $\eta_{s_i}<0$ a.s., $i=0,\dots,d$. This implies that the maximum over $d+1$ points in (4.10) reduces to a maximum over only two points in (4.13), since all but two of the $g_i$ vanish on $[s_{i-1},s_i]$, $i=1,\dots,d$. We have, moreover,
$$\hat\eta_{s_i}=\eta_{s_i},\quad i=0,\dots,d;$$
thus, the above process interpolates the rv $(\eta_{s_0},\dots,\eta_{s_d})$. In summary, we have established the following result.
and
$$\inf_{t\in[s_{i-1},s_i]}\hat\eta_t=-\left\|\big(\eta_{s_{i-1}},\eta_{s_i}\big)\right\|_{D_{i-1,i}}.$$
Proof. We know from Lemma 4.3.2 that $g_{i-1}^*(t),g_i^*(t)\le1$ for arbitrary $i=1,\dots,d$ and $t\in[s_{i-1},s_i]$. Hence,
$$\hat\eta_t=\max\left(\frac{\eta_{s_{i-1}}}{g_{i-1}^*(t)},\,\frac{\eta_{s_i}}{g_i^*(t)}\right)\le\max\big(\eta_{s_{i-1}},\eta_{s_i}\big)$$
for $i=1,\dots,d$ and $t\in[s_{i-1},s_i]$. The fact that $g_{i-1}^*(s_{i-1})=1=g_i^*(s_i)$ yields the first part of the assertion. Recall that $\eta_{s_i}<0$ with probability one, $i=0,\dots,d$. Moreover, for $t\in(s_{i-1},s_i)$, we have
$$\frac{\eta_{s_{i-1}}}{s_i-t}\le\frac{\eta_{s_i}}{t-s_{i-1}}\iff\frac{s_i-t}{t-s_{i-1}}\le\frac{\eta_{s_{i-1}}}{\eta_{s_i}}\iff t\ge\frac{s_{i-1}\eta_{s_{i-1}}+s_i\eta_{s_i}}{\eta_{s_{i-1}}+\eta_{s_i}},$$
where equality in one of these expressions occurs iff it does in the other two. In this case of equality, we have
$$\hat\eta_t=\|(s_i-t,\,t-s_{i-1})\|_{D_{i-1,i}}\,\frac{\eta_{s_i}}{t-s_{i-1}}=-\left\|\big(\eta_{s_{i-1}},\eta_{s_i}\big)\right\|_{D_{i-1,i}}.$$
On the other hand, the monotonicity of a D-norm implies, for every $t\in(s_{i-1},s_i)$ with $t\ge(s_{i-1}\eta_{s_{i-1}}+s_i\eta_{s_i})/(\eta_{s_{i-1}}+\eta_{s_i})$,
$$\hat\eta_t\ge\|(s_i-t,\,t-s_{i-1})\|_{D_{i-1,i}}\,\frac{\eta_{s_i}}{t-s_{i-1}}=\left\|\left(\frac{s_i-t}{t-s_{i-1}},\,1\right)\right\|_{D_{i-1,i}}\eta_{s_i}\ge\left\|\left(\frac{\eta_{s_{i-1}}}{\eta_{s_i}},\,1\right)\right\|_{D_{i-1,i}}\eta_{s_i}=-\left\|\big(\eta_{s_{i-1}},\eta_{s_i}\big)\right\|_{D_{i-1,i}}.$$
Moreover, for $i=1,\dots,d$,
$$\inf_{t\in[s_{i-1},s_i]}\hat Z_t=\begin{cases}\left\|\big(1/Z_{s_{i-1}},\,1/Z_{s_i}\big)\right\|_{D_{i-1,i}}^{-1},&\text{if }Z_{s_{i-1}},Z_{s_i}>0,\\[4pt]0,&\text{else}.\end{cases}$$
The minimum is attained for $t=\big(s_{i-1}Z_{s_i}+s_iZ_{s_{i-1}}\big)/\big(Z_{s_{i-1}}+Z_{s_i}\big)$ in the first case.
grids, whose diameter converges to zero. It turns out that such a sequence
converges to the initial SMS process in the function space C[0, 1] equipped
with the sup-norm. Thus, our method is suitable for reconstructing the initial
process.
Let
$$\mathcal G_d:=\left\{s_0^{(d)},s_1^{(d)},\dots,s_d^{(d)}\right\},\quad 0=:s_0^{(d)}<s_1^{(d)}<\dots<s_d^{(d)}:=1,\quad d\in\mathbb N,$$
Let $\hat\eta^{(d)}=\big(\hat\eta_t^{(d)}\big)_{t\in[0,1]}$ be the discretized version of an SMS process $\eta=(\eta_t)_{t\in[0,1]}$ with grid $\mathcal G_d$. Denote by $\hat Z^{(d)}=\big(\hat Z_t^{(d)}\big)_{t\in[0,1]}$ and $Z=(Z_t)_{t\in[0,1]}$ the generator processes pertaining to $\hat\eta^{(d)}$ and η, respectively. Uniform convergence of $\hat\eta^{(d)}$ to η and of $\hat Z^{(d)}$ to Z, as d tends to infinity, is established in the next result.
Theorem 4.3.6 The processes $\hat\eta^{(d)}$ and $\hat Z^{(d)}$, $d\in\mathbb N$, converge uniformly to η and Z pathwise, i.e., $\big\|\hat\eta^{(d)}-\eta\big\|_\infty\to_{d\to\infty}0$ and $\big\|\hat Z^{(d)}-Z\big\|_\infty\to_{d\to\infty}0$ with probability one.
as well as
$$\hat\eta_{s^{(d)}}=-\left\|\left(\eta_{\lfloor s^{(d)}\rfloor_d},\,\eta_{\lceil s^{(d)}\rceil_d}\right)\right\|_{D_{\lfloor s^{(d)}\rfloor_d,\lceil s^{(d)}\rceil_d}}$$
and
$$\hat\eta_{s^{(d)}}\ge\min_{s\in\left[\lfloor s^{(d)}\rfloor_d,\,\lceil s^{(d)}\rceil_d\right]}\eta_s\to_{d\to\infty}\eta_s,$$
where $\|\cdot\|_{D_{\lfloor s^{(d)}\rfloor_d,\lceil s^{(d)}\rceil_d}}$ denotes the D-norm pertaining to $\big(\eta_{\lfloor s^{(d)}\rfloor_d},\eta_{\lceil s^{(d)}\rceil_d}\big)$, and $\lfloor s\rfloor_d$, $\lceil s\rceil_d$ denote the neighboring grid points of s in $\mathcal G_d$. Hence, the first part of the assertion is proven.
Now, we show that $\hat Z^{(d)}\to_{d\to\infty}Z$ in $(C[0,1],\|\cdot\|_\infty)$. If $Z_s\ne0$, the continuity of Z implies $Z_{\lfloor s^{(d)}\rfloor_d}\ne0\ne Z_{\lceil s^{(d)}\rceil_d}$ for sufficiently large values of d. Repeating the above arguments, the assertion now follows from Lemma 4.3.5. If $Z_s=0$, the continuity of Z implies
$$\hat Z^{(d)}_{s^{(d)}}\le2\max\left(Z_{\lfloor s^{(d)}\rfloor_d},\,Z_{\lceil s^{(d)}\rceil_d}\right)\to_{d\to\infty}2Z_s=0,$$
$$Z_t:=\exp\left(B_t-\frac t2\right),\quad t\in[0,1],$$
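A generator of this Brown–Resnick type can be sketched as follows; we approximate the Brownian motion B by a Gaussian random walk on a grid (an assumption of this sketch, and the function name is ours):

```python
import math
import random

random.seed(7)

def brown_resnick_generator(n_steps=100):
    """One path of Z_t = exp(B_t - t/2) on a grid, with B a standard
    Brownian motion simulated as a Gaussian random walk."""
    dt = 1.0 / n_steps
    B, path = 0.0, []
    for k in range(1, n_steps + 1):
        B += random.gauss(0.0, math.sqrt(dt))
        path.append(math.exp(B - k * dt / 2.0))
    return path

# E(Z_t) = E(exp(B_t)) * exp(-t/2) = 1 for every t (lognormal mean)
m = sum(brown_resnick_generator()[-1] for _ in range(20_000)) / 20_000
print(m)
```

The drift correction $-t/2$ is exactly what makes $E(Z_t)=1$, the defining normalization of a generator process.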
$$\times\max\left(\frac{\eta_{s_{i-1}}}{s_i-t},\,\frac{\eta_{s_i}}{t-s_{i-1}}\right),\quad t\in[s_{i-1},s_i],\ 1\le i\le d.$$
$(\eta_{s_0},\dots,\eta_{s_d})$. More precisely, the only additional knowledge we need for these predictions is the set of adjacent bivariate marginal distributions of $(\eta_{s_0},\dots,\eta_{s_d})$, that is, the bivariate D-norms $\|\cdot\|_{D_{i-1,i}}$, $i=1,\dots,d$. However, this may be a restrictive condition in practice, and it suggests the problem of how to fit models of bivariate D-norms to data, which is beyond the scope of the present book. The Brown–Resnick process, including additional parameters, may serve as a parametric model to start with.
The following results, however, are obvious. Let $\hat\eta_t$ be a point of the discretized version defined in (4.13) and define a defective discretized version via
$$\tilde\eta_t:=\|(s_i-t,\,t-s_{i-1})\|_{\tilde D_i}\max\left(\frac{\eta_{s_{i-1}}}{s_i-t},\,\frac{\eta_{s_i}}{t-s_{i-1}}\right)$$
for $t\in[s_{i-1},s_i]$, $i=1,\dots,d$, where $\|\cdot\|_{\tilde D_i}$ is an arbitrary D-norm on $\mathbb R^2$, which we call the defective norm. Then, for every $t\in[s_{i-1},s_i]$, $i=1,\dots,d$,
$$|\hat\eta_t-\tilde\eta_t|=\left|\|(s_i-t,\,t-s_{i-1})\|_{D_{i-1,i}}-\|(s_i-t,\,t-s_{i-1})\|_{\tilde D_i}\right|\,\min\left(\frac{-\eta_{s_{i-1}}}{s_i-t},\,\frac{-\eta_{s_i}}{t-s_{i-1}}\right).$$
In particular, we have $\tilde\eta_{s_i}=\hat\eta_{s_i}=\eta_{s_i}$, $i=0,\dots,d$. This means that we obtain an interpolating process even if we replace the D-norm $\|\cdot\|_{D_{i-1,i}}$ with the defective norm $\|\cdot\|_{\tilde D_i}$. Furthermore, the defective discretized version still defines a max-stable process with sample paths in $C^-[0,1]=\{f\in C[0,1]:f\le0\}$.
Check that its univariate marginal distributions are given by
(si − t, t − si−1 )Di−1,i
P (η̃t ≤ x) = exp x , x ≤ 0,
(si − t, t − si−1 )D̃i
for $t\in[s_{i-1},s_i]$, $i=1,\dots,d$. These are still negative exponential distributions, but no longer standard ones, as they are for the discretized version in (4.13). In addition, the assertions in Lemma 4.3.4 also hold for the defective discretized version, since each defective norm $\|\cdot\|_{\tilde D_i}$ is monotone and standardized. Repeating the arguments in the proof of Theorem 4.3.6 now shows that the uniform convergence toward the original process η is retained if we replace the norms $\|\cdot\|_{D_{i-1,i}}$ with arbitrary monotone and standardized norms $\|\cdot\|_{\tilde D_i}$. Note that by Lemma 1.5.2 these two properties already ensure that the bivariate norm $\|\cdot\|_{\tilde D_i}$ is a D-norm. In that case, the only property of the discretized version that we lose is the standardization of the univariate margins, i.e., the resulting process is no longer a standard max-stable process.
vector (ηt , η̂t ) was standard max-stable itself. This is verified in the following
result.
Lemma 4.3.7 Let $\eta=(\eta_t)_{t\in[0,1]}$ be an SMS process and denote by $\hat\eta=(\hat\eta_t)_{t\in[0,1]}$ its discretized version with grid $\{s_0,\dots,s_d\}$. Then, the bivariate rv $(\eta_t,\hat\eta_t)$ is an SMS rv for every $t\in[0,1]$ with corresponding D-norm of the two-dimensional marginal
$$\|(x,y)\|_{D_t}:=\left\|\big(x,\,g_{i-1}^*(t)\,y,\,g_i^*(t)\,y\big)\right\|_{D_{t,i-1,i}},\quad t\in[s_{i-1},s_i],\ i=1,\dots,d,$$
$$P(\eta_t\le x,\,\hat\eta_t\le y)=P\left(\eta_t\le x,\ \eta_{s_{i-1}}\le g_{i-1}^*(t)y,\ \eta_{s_i}\le g_i^*(t)y\right)$$
$$=\exp\left(-E\left(\max\left(|x|Z_t,\,g_{i-1}^*(t)|y|Z_{s_{i-1}},\,g_i^*(t)|y|Z_{s_i}\right)\right)\right)=\exp\left(-E\left(\max\left(|x|Z_t,\,|y|\max\big(g_{i-1}^*(t)Z_{s_{i-1}},\,g_i^*(t)Z_{s_i}\big)\right)\right)\right).$$
The vector
$$\left(Z_t,\ \max\big(g_{i-1}^*(t)Z_{s_{i-1}},\,g_i^*(t)Z_{s_i}\big)\right)$$
$$MSE\big(\hat\eta_t^{(d)}\big):=E\left(\big(\eta_t-\hat\eta_t^{(d)}\big)^2\right)=2\left(2-\int_0^\infty\frac1{\|(1,u)\|_{D_t^{(d)}}^2}\,du\right)\to_{d\to\infty}0.$$
Next, we show $\|\cdot\|_{D_t^{(d)}}\to_{d\to\infty}\|\cdot\|_\infty$ pointwise for all $t\in[0,1]$. Then, $E(\tilde Z)=1$, and thus, $(Z_t,\tilde Z)$ defines a generator of a D-norm $\|\cdot\|_{\tilde D}$ on $\mathbb R^2$ for all $t\in[0,1]$. Lemma 4.3.5 implies $\hat Z_t^{(d)}\le m\tilde Z$ for all $d\in\mathbb N$. Therefore, for arbitrary $x,y\in\mathbb R$, $d\in\mathbb N$, and $t\in[0,1]$, we have
$$\max\left(|x|Z_t,\,|y|\hat Z_t^{(d)}\right)\le\max\left(|x|Z_t,\,|my|\tilde Z_t\right),$$
where
$$E\left(\max\left(|x|Z_t,\,|my|\tilde Z_t\right)\right)=\|(x,my)\|_{\tilde D}<\infty.$$
Hence, we can apply the dominated convergence theorem to the sequence $\max\big(|x|Z_t,\,|y|\hat Z_t^{(d)}\big)$, $d\in\mathbb N$. Together with the fact that $\hat Z_t^{(d)}\to_{d\to\infty}Z_t$ for all $t\in[0,1]$ by Theorem 4.3.6, we obtain, for $x,y\in\mathbb R$,
$$\|(x,y)\|_{D_t^{(d)}}=E\left(\max\left(|x|Z_t,\,|y|\hat Z_t^{(d)}\right)\right)$$
Then, it turns out that knowledge of this D-norm fully identifies the distribution of Z; it is actually enough to know this D-norm for t = 1, as the following Lemma 5.1.1 shows, and this shall be the basis for our definition of a max-CF. By $=_D$ we mean equality in distribution.
Lemma 5.1.1 Let $X=(X_1,\dots,X_d)\ge0$, $Y=(Y_1,\dots,Y_d)\ge0$ be rvs with $E(X_i),E(Y_i)<\infty$, $1\le i\le d$. If we have, for each $x>0\in\mathbb R^d$,
$$E\big(\max(1,x_1X_1,\dots,x_dX_d)\big)=E\big(\max(1,x_1Y_1,\dots,x_dY_d)\big),$$
then $X=_DY$.
Proof. From Lemma 1.2.2, for arbitrary $x>0\in\mathbb R^d$ and $c>0$, we obtain the equation
$$E\left(\max\left(1,\frac{X_1}{cx_1},\dots,\frac{X_d}{cx_d}\right)\right)=\int_0^\infty1-P\left(\max\left(1,\frac{X_1}{cx_1},\dots,\frac{X_d}{cx_d}\right)\le t\right)dt$$
$$=\int_0^\infty1-P(1\le t,\ X_i\le tcx_i,\,1\le i\le d)\,dt=1+\int_1^\infty1-P(X_i\le tcx_i,\,1\le i\le d)\,dt.$$
The substitution $t\mapsto t/c$ yields that the right-hand side above equals
$$1+\frac1c\int_c^\infty1-P(X_i\le tx_i,\,1\le i\le d)\,dt.$$
Then, $a=1$ and $X=_DY$.
Now, according to Lemma 1.2.2, we have, for any $c>0$ and any $x>0$,
$$\varphi_{c,X}\left(\frac1x\right)=\int_0^\infty1-P\left(\max\left(c,\frac{X_1}{x_1},\dots,\frac{X_d}{x_d}\right)\le t\right)dt=c+\int_c^\infty1-P(X_i\le tx_i,\,1\le i\le d)\,dt$$
for each $c>0$ and $x>0\in\mathbb R^d$. Taking right derivatives with respect to c yields
$$P(X_i\le cx_i,\,1\le i\le d)=0.$$
Letting $c\to\infty$ clearly produces a contradiction.
Suppose next that $a>0$. We have
$$c+\int_c^\infty1-P(X_i\le tx_i,\,1\le i\le d)\,dt=ca+\int_{ca}^\infty1-P(Y_i\le tx_i,\,1\le i\le d)\,dt.$$
Put
$$X:=(X_0,\dots,X_d):=(1,Z_1,\dots,Z_d).$$
A repetition of the arguments in the proof of Lemma 1.1.3 yields that
$$\varphi_Z(\lambda x+(1-\lambda)y)=E\big(\max\big(1,(\lambda x_1+(1-\lambda)y_1)Z_1,\dots,(\lambda x_d+(1-\lambda)y_d)Z_d\big)\big)$$
$$=E\big(\max\big(\lambda+(1-\lambda),(\lambda x_1+(1-\lambda)y_1)Z_1,\dots,(\lambda x_d+(1-\lambda)y_d)Z_d\big)\big)$$
$$\varphi_\lambda:=\lambda\varphi_1+(1-\lambda)\varphi_2$$
The proof of Lemma 5.1.4 repeats the arguments in the proof of Propo-
sition 1.4.1, which states that the set of D-norms is convex. It provides in
particular an rv Zλ , whose max-CF is given by ϕλ .
Proof. Let Z (1) , Z (2) be rvs with corresponding max-CFs ϕ1 , ϕ2 . Take an rv
ξ that attains only the values one and two, with probability P (ξ = 1) = λ =
1 − P (ξ = 2), and suppose that ξ is independent of Z (1) and of Z (2) . Note
that
$$Z^{(\xi)}:=\left(Z_1^{(\xi)},\dots,Z_d^{(\xi)}\right)$$
where
$$SP_Z(\alpha)=P(Z>q_Z(\alpha))\big(ES_Z(\alpha)-q_Z(\alpha)\big)$$
is the stop-loss premium risk measure of Z; see Embrechts et al. (1997).
The preceding remarks suggest that max-CFs might be closely connected
to well-known elementary objects such as conditional expectations and risk
measures; a particular consequence of it is that computing a max-CF may,
in certain cases, be much easier than computing a standard characteristic
function (CF), i.e., a Fourier transform. The following example illustrates
this idea.
5.1 Max-Characteristic Function 193
Example 5.1.5 Let Z be an rv that has the GPD with location parameter $\mu\ge0$, scale parameter $\sigma>0$, and shape parameter $\xi\in(0,1)$, whose df is
$$P(Z\le z)=1-\left(1+\xi\,\frac{z-\mu}\sigma\right)^{-1/\xi},\quad z\ge\mu.$$
Its max-CF is
$$\varphi_Z(x)=\begin{cases}xE(Z)=x\left(\mu+\dfrac\sigma{1-\xi}\right),&\text{if }x>\dfrac1\mu,\\[8pt]1+\dfrac{\sigma x}{1-\xi}\left(1+\xi\,\dfrac{1-\mu x}{\sigma x}\right)^{1-1/\xi},&\text{if }x\le\dfrac1\mu.\end{cases}$$
$$\varphi_G(x)=1+\int_1^\infty1-\exp\left(-\frac{\|x^\alpha\|_D}{y^\alpha}\right)dy=1+\|x^\alpha\|_D^{1/\alpha}\int_{1/\|x^\alpha\|_D^{1/\alpha}}^\infty1-\exp\left(-y^{-\alpha}\right)dy.$$
1/ xα D
see, for example, Villani (2009). Let X and Y be integrable rvs in $\mathbb R^d$ with distributions P and Q. We denote by $d_W(X,Y):=d_W(P,Q)$ the Wasserstein distance between X and Y. The next result says precisely that pointwise convergence of max-CFs is equivalent to convergence with respect to the Wasserstein metric.
Theorem 5.1.7 Let $Z,Z^{(n)}$, $n\in\mathbb N$, be non-negative and integrable rvs in $\mathbb R^d$ with pertaining max-CFs $\varphi_Z,\varphi_{Z^{(n)}}$, $n\in\mathbb N$. Then $\varphi_{Z^{(n)}}\to_{n\to\infty}\varphi_Z$ pointwise iff $d_W\big(Z^{(n)},Z\big)\to_{n\to\infty}0$.
(n) (n)
Proof. Suppose that $d_W(Z^{(n)}, Z) \to_{n\to\infty} 0$. Then we can find versions $Z^{(n)}$ and $Z$ with $E\|Z^{(n)} - Z\|_1 \to_{n\to\infty} 0$. For $x = (x_1, \dots, x_d) \ge 0$, this implies
$$\varphi_{Z^{(n)}}(x) = E\left(\max\left(1,\, x_1\left(Z_1 + \left(Z_1^{(n)} - Z_1\right)\right), \dots, x_d\left(Z_d + \left(Z_d^{(n)} - Z_d\right)\right)\right)\right)$$
$$\le E\left(\max(1, x_1Z_1, \dots, x_dZ_d)\right) + \|x\|_\infty\, E\left\|Z^{(n)} - Z\right\|_1$$
and, likewise,
$$\varphi_{Z^{(n)}}(x) \ge E\left(\max(1, x_1Z_1, \dots, x_dZ_d)\right) - \|x\|_\infty\, E\left\|Z^{(n)} - Z\right\|_1,$$
i.e., $\varphi_{Z^{(n)}}(x) = \varphi_Z(x) + o(1)$.
$$P\left(Z^{(n)} \le x\right) = P\left(\frac{1}{x_i}Z_i^{(n)} \le 1,\ 1 \le i \le d\right).$$
If
$$\limsup_{n\to\infty} P\left(\frac{1}{x_i}Z_i^{(n)} \le 1,\ 1 \le i \le d\right) > P\left(\frac{1}{x_i}Z_i \le 1,\ 1 \le i \le d\right)$$
or
$$\liminf_{n\to\infty} P\left(\frac{1}{x_i}Z_i^{(n)} \le 1,\ 1 \le i \le d\right) < P\left(\frac{1}{x_i}Z_i \le 1,\ 1 \le i \le d\right),$$
then equation (5.4) readily produces a contradiction by putting $s = 1$ and $t = 1 + \varepsilon$, or $t = 1$ and $s = 1 - \varepsilon$ with a small $\varepsilon > 0$. Thus, we have
$$P\left(Z^{(n)} \le x\right) \to_{n\to\infty} P(Z \le x) \tag{5.5}$$
Suppose that
$$\limsup_{n\to\infty} P\left(Z_i^{(n)} \le x_i,\ i \in T;\ Z_j^{(n)} \le 0,\ j \notin T\right) = c > 0.$$
$$d_W\left(\frac{\max_{1\le i\le n} X^{(i)}}{a(n)},\ \xi\right) \to_{n\to\infty} 0$$
$$\varphi_n \to_{n\to\infty} \varphi_\xi \quad \text{pointwise},$$
$$P\left(Y^{(n)} \le x\right) = \left(1 - \left\|\frac{1}{n x^\alpha}\right\|_D\right)^n \to_{n\to\infty} \exp\left(-\left\|\frac{1}{x^\alpha}\right\|_D\right) = P(\xi \le x), \tag{5.6}$$
$$\varphi_{Y^{(n)}}(x) = 1 + \int_1^\infty \left(1 - P\left(Y^{(n)} \le \frac{t}{x}\right)\right) dt$$
$$E\left(n\left(1 - \max_{1\le j\le n} U_i^{(j)}\right)\right) = \frac{n}{n+1} \to_{n\to\infty} 1 = E(-\eta_i),$$
$$V^{(n)} \to_D \eta \iff d_W\left(V^{(n)}, \eta\right) \to_{n\to\infty} 0 \iff \varphi_{-V^{(n)}} \to_{n\to\infty} \varphi_{-\eta} \ \text{pointwise}.$$
$$\varphi_{-\eta}(x) = 1 + x_1\exp(-1/x_1) + x_2\exp(-1/x_2) - \frac{1}{\|1/x\|_D}\exp\left(-\|1/x\|_D\right).$$
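In the special case of the independence D-norm $\|\cdot\|_D = \|\cdot\|_1$, the components $-\eta_1, -\eta_2$ are iid standard exponential rvs, and the bivariate formula above can be verified by simulation. A sketch with arbitrarily chosen arguments $x_1, x_2$ (hypothetical values for illustration):

```python
import numpy as np

rng = np.random.default_rng(2)
n = 2_000_000

# Independence case ||.||_D = ||.||_1: eta_1, eta_2 are iid standard negative
# exponential rvs, hence -eta_i ~ Exp(1).
E1 = rng.exponential(size=n)
E2 = rng.exponential(size=n)

def phi_closed(x1, x2):
    s = 1.0 / x1 + 1.0 / x2              # ||1/x||_1 for the independence D-norm
    return 1.0 + x1 * np.exp(-1.0 / x1) + x2 * np.exp(-1.0 / x2) - np.exp(-s) / s

x1, x2 = 1.5, 0.7
# E(max(1, x1*(-eta_1), x2*(-eta_2)))
mc = np.mean(np.maximum(np.maximum(1.0, x1 * E1), x2 * E2))
print(f"closed form {phi_closed(x1, x2):.4f}, Monte Carlo {mc:.4f}")
```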
$$\|x\|_{D^{(n)}} = E\left(\max_{1\le i\le d}\left(x_i Z_i^{(n)}\right)\right) = E\left(\max_{1\le i\le d}\left(x_i Z_i + x_i\left(Z_i^{(n)} - Z_i\right)\right)\right) = E\left(\max_{1\le i\le d}(x_i Z_i)\right) + O\left(E\left\|Z^{(n)} - Z\right\|_1\right)$$
The following consequence of Corollaries 5.1.11 and 5.1.12 is obvious.
Corollary 5.1.13 Let $Z^{(0)}$, $Z^{(n)}$ be arbitrary generators of the D-norms $\|\cdot\|_{D_0}$, $\|\cdot\|_{D_n}$ on $\mathbb R^d$, $n \in \mathbb N$. If $Z^{(n)} \to_D Z^{(0)}$, then $\|\cdot\|_{D_n} \to_{n\to\infty} \|\cdot\|_{D_0}$ pointwise.
The reverse implication in the preceding result is not true; just put $Z^{(0)} := (1, \dots, 1) \in \mathbb R^d$ and $Z^{(n)} := (X, \dots, X)$, where $X \ge 0$ is an rv with $E(X) = 1$. Both generate the sup-norm $\|\cdot\|_\infty$, but, clearly, $Z^{(n)}$ does not converge to $Z^{(0)}$ as $n \to \infty$, unless $X = 1$ a.s.
$$Z := \left(\frac{\exp(Y_1)}{\mu_1}, \dots, \frac{\exp(Y_d)}{\mu_d}\right) = \bigl(\exp(Y_1 - \log(\mu_1)), \dots, \exp(Y_d - \log(\mu_d))\bigr)$$
$$\left(\exp\left(\frac{1}{\sqrt n}\sum_{j=1}^n Y_1^{(j)} - n\log\left(E\left(\exp\left(\frac{Y_1}{\sqrt n}\right)\right)\right)\right), \dots, \exp\left(\frac{1}{\sqrt n}\sum_{j=1}^n Y_d^{(j)} - n\log\left(E\left(\exp\left(\frac{Y_d}{\sqrt n}\right)\right)\right)\right)\right)$$
generates the product D-norm $\|\cdot\|_{D_n}$. Note that $E\left(\exp\left(Y/\sqrt n\right)\right)^n < \infty$ for $n \in \mathbb N$ if $E(\exp(Y)) < \infty$.
Using the Taylor expansion $\exp(x) = 1 + x + \exp(\vartheta x)x^2/2$, with some $0 < \vartheta < 1$, $x \in \mathbb R$, and $\log(1+\varepsilon) = \varepsilon + O(\varepsilon^2)$ as $\varepsilon \to 0$, it is easy to see that
$$n\log\left(E\left(\exp\left(\frac{Y_j}{\sqrt n}\right)\right)\right) \to_{n\to\infty} \frac{E\left(Y_j^2\right)}{2} = \frac{\sigma_{jj}}{2}, \qquad 1 \le j \le d.$$
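This limit is easy to observe numerically. A sketch with a hypothetical centred rv $Y$ taking the values $\pm 1$ with probability $1/2$ each, so that $E(Y) = 0$, $E(Y^2) = 1$ and $E(\exp(Y/\sqrt n)) = \cosh(1/\sqrt n)$:

```python
import math

# n * log(E(exp(Y/sqrt(n)))) should approach E(Y^2)/2 = 1/2 for this centred Y.
vals = []
for n in (10, 100, 10_000):
    m = math.cosh(1.0 / math.sqrt(n))    # E(exp(Y/sqrt(n))) for Y = +-1 w.p. 1/2
    vals.append(n * math.log(m))
    print(n, vals[-1])
```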
1 ≤ max(1, x1 Z1 , . . . , xd Zd )
and
max(x1 Z1 , . . . , xd Zd ) ≤ max(1, x1 Z1 , . . . , xd Zd )
and taking expectations. The upper bound is a consequence of the inequality
max(a, b) ≤ a + b, valid when a, b ≥ 0. Finally, the uniform convergence result
is obtained by writing
$$1 \le \frac{\varphi_Z(x)}{\|x\|_{D,Z}} \le 1 + \frac{1}{\|x\|_{D,Z}}$$
for all $Z \in \mathcal Z$ and all $x \in \mathbb R_+^d \setminus \{0\}$. Because $\|\cdot\|_{D,Z} \ge \|\cdot\|_\infty$, this entails
$$\sup_{Z\in\mathcal Z}\left|\frac{\varphi_Z(x)}{\|x\|_{D,Z}} - 1\right| \le \frac{1}{\|x\|_\infty},$$
from which the conclusion follows.
It is worth noting that the inequalities of Lemma 5.1.15 are sharp in the sense that, for $Z = (1, \dots, 1)$, $\varphi_Z(x) = \max(1, \|x\|_\infty) = \max(1, \|x\|_{D,Z})$. Therefore, the leftmost inequality is in fact an equality in this case, whereas the rightmost inequality $\varphi_Z(x) \le a + b\|x\|_{D,Z}$ can only be true if $a, b \ge 1$ because of the leftmost inequality.
Lemma 5.1.15 has the following consequence.
Corollary 5.1.16 No constant function can be the max-CF of a gen-
erator of a D-norm.
Such a result is, of course, not true for max-CFs of arbitrary non-negative rvs, since, for instance, the max-CF of the constant rv zero is the constant function one. The next result supplements Lemma 5.1.15.
202 5 Further Applications of D-Norms to Probability & Statistics
$$\varphi_Z(x) - \|x\|_D = E\bigl(\max(1, x_1Z_1, \dots, x_dZ_d)\bigr) - E\left(\max_{1\le i\le d}(x_iZ_i)\right) = \int_0^1 P\left(\max_{1\le i\le d}(x_iZ_i) \le t\right) dt$$
$$= \int_0^1 P\left(Z_i \le \frac{t}{x_i},\ 1 \le i \le d\right) dt \to_{x\to\infty} \int_0^1 P\left(Z_i = 0,\ 1 \le i \le d\right) dt = P(Z = 0)$$
It follows from the bounds in Lemma 5.1.15 that the D-norm $\|\cdot\|_{D,Z}$ can be deduced from $\varphi_Z$, since
$$\lim_{t\to\infty} \frac{\varphi_Z(tx)}{t} = \|x\|_{D,Z}, \qquad x \ge 0 \in \mathbb R^d.$$
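This recovery of the D-norm from the max-CF can be illustrated numerically. The sketch below uses an assumed generator $Z = (2V, 2(1-V))$ with $V$ uniform on $(0,1)$ (a hypothetical choice; both components have mean one, and $\|(1/2, 1/2)\|_{D,Z} = E(\max(V, 1-V)) = 3/4$):

```python
import numpy as np

rng = np.random.default_rng(3)
V = rng.uniform(size=1_000_000)
Z = np.column_stack((2.0 * V, 2.0 * (1.0 - V)))   # generator: both components have mean 1

x = np.array([0.5, 0.5])
d_norm = np.mean(np.max(x * Z, axis=1))           # ||x||_{D,Z} = E(max_i x_i Z_i), here 3/4
ratios = []
for t in (1.0, 10.0, 1000.0):
    phi = np.mean(np.maximum(1.0, np.max(t * x * Z, axis=1)))   # phi_Z(t*x)
    ratios.append(phi / t)
    print(f"t={t}: phi_Z(t x)/t = {ratios[-1]:.4f}  (||x||_D estimate {d_norm:.4f})")
```

The ratio $\varphi_Z(tx)/t$ decreases toward the D-norm value as $t$ grows.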
$$P(Z \le x) = \frac{\partial_+}{\partial t}\left(t\,\varphi_Z\left(\frac{1}{tx}\right)\right)\Bigg|_{t=1} = \lim_{h\downarrow 0}\frac{1}{h}\left((1+h)\,\varphi_Z\left(\frac{1}{(1+h)x}\right) - \varphi_Z\left(\frac{1}{x}\right)\right),$$
$$\frac{\partial}{\partial t}\left(t\,\psi\left(\frac{1}{tx}\right)\right)\Bigg|_{t=1} = P(Z \le x)$$
and
$$\lim_{t\to\infty} t\left(\psi\left(\frac{1}{tx}\right) - 1\right) = 0, \qquad \lim_{t\to\infty} t\left(\varphi\left(\frac{1}{tx}\right) - 1\right) = 0, \qquad x = (x_1, \dots, x_d) > 0,$$
$$t\,\varphi_Z\left(\frac{1}{tx}\right) = t\,E\left(\max\left(1, \frac{Z_1}{tx_1}, \dots, \frac{Z_d}{tx_d}\right)\right) = E\left(\max\left(t, \frac{Z_1}{x_1}, \dots, \frac{Z_d}{x_d}\right)\right).$$
This gives
$$t\,\varphi_Z\left(\frac{1}{tx}\right) = \int_0^\infty P\left(\max\left(t, \frac{Z_1}{x_1}, \dots, \frac{Z_d}{x_d}\right) > y\right) dy$$
$$= t + \int_t^\infty P\left(\max\left(\frac{Z_1}{x_1}, \dots, \frac{Z_d}{x_d}\right) > y\right) dy = t + \int_t^\infty \bigl(1 - P\left(Z_j \le y x_j,\ 1 \le j \le d\right)\bigr)\, dy.$$
$$\frac{\partial_+}{\partial t}\left(t\,\varphi_Z\left(\frac{1}{tx}\right)\right) = P\left(Z_j \le t x_j,\ 1 \le j \le d\right).$$
Setting t = 1 concludes the proof of (i). To prove (ii), note that, for all $t > 0$,
$$\frac{\partial}{\partial t}\left(t\,\psi\left(\frac{1}{tx}\right)\right) = \psi\left(\frac{1}{tx}\right) - \frac{1}{t}\sum_{j=1}^d \frac{1}{x_j}\,\partial_j\psi\left(\frac{1}{tx}\right), \tag{5.8}$$
where ∂j ψ denotes the partial derivative of ψ with respect to its jth compo-
nent. In particular, because
$$P\left(Z_j \le x_j,\ 1 \le j \le d\right) = \frac{\partial}{\partial t}\left(t\,\psi\left(\frac{1}{tx}\right)\right)\Bigg|_{t=1} = \psi\left(\frac{1}{x}\right) - \sum_{j=1}^d \frac{1}{x_j}\,\partial_j\psi\left(\frac{1}{x}\right), \tag{5.9}$$
and thus
$$\frac{\partial}{\partial t}\left(t\,\psi\left(\frac{1}{tx}\right)\right) = P\left(Z_j \le t x_j,\ 1 \le j \le d\right).$$
Now write
$$t\,\psi\left(\frac{1}{tx}\right) = t - \int_t^\infty \left(\frac{\partial}{\partial y}\left(y\,\psi\left(\frac{1}{yx}\right)\right) - 1\right) dy$$
$$= t + \int_t^\infty \bigl(1 - P\left(Z_j \le y x_j,\ 1 \le j \le d\right)\bigr)\, dy = t\,\varphi_Z\left(\frac{1}{tx}\right)$$
$$\frac{\partial_+}{\partial t}\left(t\,\varphi_G\left(\frac{1}{tx}\right)\right)\Bigg|_{t=1} = \exp\left(-\left\|\frac{1}{x^\alpha}\right\|_D\right) = G(x)$$
for $x > 0 \in \mathbb R^d$.
$$\left(\frac{n}{\sqrt{k_i}}\left(X_{n-k_i:n,i} - \frac{n-k_i}{n}\right)\right)_{i=1}^d \to_D N(0, \Sigma),$$
If, for example, the underlying D-norm $\|\cdot\|_D$ is the logistic norm $\|x\|_p = \left(\sum_{i=1}^d |x_i|^p\right)^{1/p}$, $p \ge 1$, then $\sigma_{ij} = k_{ij} + k_{ji} - \left(k_{ij}^p + k_{ji}^p\right)^{1/p}$, $i \ne j$. Note that $\sigma_{ij} = 0$, $i \ne j$, if $\|\cdot\|_D = \|\cdot\|_1$, which is the case of independent margins of $G(x) = \exp(-\|x\|_D) = \prod_{i=1}^d \exp(x_i)$, $x \le 0 \in \mathbb R^d$.
Then, the components of X = (X1 , . . . , Xd ) are tail independent. The re-
verse implication is true as well, i.e., the preceding result entails that the
componentwise intermediate os Xn−k1 :n,1 , . . . , Xn−kd :n,d are asymptotically
independent iff they are pairwise asymptotically independent. But this is
one of Takahashi’s characterizations of ·D = ·1 ; see Corollary 1.3.5 and
Theorem 2.3.8.
Note that σij ≥ 0 for each pair i, j, i.e., the componentwise os are asymp-
totically positively correlated. This is an obvious consequence of the fact that
each D-norm ·D is pointwise less than ·1 ; see (1.4).
where $E_1, \dots, E_{n+1}$ are iid standard exponential rvs; see equation (2.29). Let $\xi_1, \xi_2, \dots, \xi_{2(n+1)}$ be iid standard normal rvs. From the fact that $\left(\xi_1^2 + \xi_2^2\right)/2$ follows the standard exponential distribution on $(0, \infty)$, we obtain the representation
$$(U_{i:n})_{i=1}^n =_D \left(\frac{\sum_{j=1}^{2i}\xi_j^2}{\sum_{j=1}^{2(n+1)}\xi_j^2}\right)_{i=1}^n. \tag{5.11}$$
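Representation (5.11) is easy to check by simulation, e.g., against the known means $E(U_{i:n}) = i/(n+1)$ of uniform order statistics. A minimal sketch for $n = 5$ (sample sizes chosen arbitrarily):

```python
import numpy as np

rng = np.random.default_rng(4)
n, reps = 5, 200_000

# Squared standard normals: (xi_1^2 + xi_2^2)/2 is standard exponential, and the
# ratios of partial sums in (5.11) reproduce the uniform order statistics.
sq = rng.standard_normal((reps, 2 * (n + 1))) ** 2
csum = np.cumsum(sq, axis=1)
# sum_{j<=2i} xi_j^2 / sum_{j<=2(n+1)} xi_j^2 for i = 1, ..., n
U = csum[:, 1::2][:, :n] / csum[:, -1:]
print(np.round(U.mean(axis=0), 3))       # compare with i/(n+1) = 1/6, ..., 5/6
```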
This matrix is positive definite for a = 1/31/2 , but not for a = 1/31/4 . This is
the reason why we require the extra condition in Corollary 5.2.3 that the ma-
trix Λ is positive semidefinite. The matrix Λ is, for example, positive semidef-
inite if the value of ei + ej D does not depend on the pair of indices i
= j,
in which case Λ satisfies the compound symmetry condition.
as well. But this follows from the central limit theorem and elementary argu-
ments, using the fact that Cov(X 2 , Y 2 ) = 2c2 if (X, Y ) is bivariate normal
with Cov(X, Y ) = c.
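The covariance identity $\mathrm{Cov}(X^2, Y^2) = 2c^2$ for standard bivariate normal $(X, Y)$ is quickly confirmed by simulation (with an arbitrary illustrative correlation $c = 0.6$):

```python
import numpy as np

rng = np.random.default_rng(5)
c, n = 0.6, 4_000_000

# Standard bivariate normal with correlation c via a linear construction.
A = rng.standard_normal(n)
B = rng.standard_normal(n)
X, Y = A, c * A + np.sqrt(1.0 - c * c) * B

cov_sq = np.mean(X**2 * Y**2) - np.mean(X**2) * np.mean(Y**2)
print(f"Cov(X^2, Y^2) = {cov_sq:.4f}, 2*c^2 = {2 * c * c:.4f}")
```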
$$P\left(\left(\frac{n}{\sqrt{k_i}}\left(X_{n-k_i:n,i} - \frac{n-k_i}{n}\right)\right)_{i=1}^d \le x\right) \tag{5.12}$$
$$= P\left(X_{n-k_i:n,i} \le \frac{\sqrt{k_i}}{n}x_i + \frac{n-k_i}{n},\ 1 \le i \le d\right)$$
$$= P\left(\sum_{j=1}^n 1\left(X_i^{(j)} \in \left[0, \frac{\sqrt{k_i}}{n}x_i + \frac{n-k_i}{n}\right]\right) \ge n - k_i,\ 1 \le i \le d\right)$$
$$= P\left(\left(\frac{1}{\sqrt{k_i}}\sum_{j=1}^n\left(\frac{\sqrt{k_i}}{n}x_i + \frac{n-k_i}{n} - 1\left(X_i^{(j)} \in \left[0, \frac{\sqrt{k_i}}{n}x_i + \frac{n-k_i}{n}\right]\right)\right)\right)_{i=1}^d \le x\right).$$
Now, put
$$Y^{(n)} := \left(Y_1^{(n)}, \dots, Y_d^{(n)}\right) := \left(1\left(X_i \in \left[0, \frac{\sqrt{k_i}}{n}x_i + \frac{n-k_i}{n}\right]\right)\right)_{i=1}^d$$
with values in $\{0, 1\}^d$. The entries of its covariance matrix $\Sigma^{(n)} = \left(\sigma_{ij}^{(n)}\right)$ for $i \ne j$ are given by
$$\sigma_{ij}^{(n)} = E\left(Y_i^{(n)}Y_j^{(n)}\right) - E\left(Y_i^{(n)}\right)E\left(Y_j^{(n)}\right) = P\left(Y_i^{(n)} = Y_j^{(n)} = 1\right) - P\left(Y_i^{(n)} = 1\right)P\left(Y_j^{(n)} = 1\right)$$
$$= P\left(X_i \le \frac{\sqrt{k_i}}{n}x_i + \frac{n-k_i}{n},\ X_j \le \frac{\sqrt{k_j}}{n}x_j + \frac{n-k_j}{n}\right)$$
5.2 Multivariate Order Statistics: The Intermediate Case 211
$$-\ P\left(X_i \le \frac{\sqrt{k_i}}{n}x_i + \frac{n-k_i}{n}\right)P\left(X_j \le \frac{\sqrt{k_j}}{n}x_j + \frac{n-k_j}{n}\right)$$
$$= C_{ij}\left(\frac{\sqrt{k_i}}{n}x_i + \frac{n-k_i}{n},\ \frac{\sqrt{k_j}}{n}x_j + \frac{n-k_j}{n}\right) - \left(\frac{\sqrt{k_i}}{n}x_i + \frac{n-k_i}{n}\right)\left(\frac{\sqrt{k_j}}{n}x_j + \frac{n-k_j}{n}\right)$$
if $n$ is large, where
$$\sigma_{ij}^{(n)} = \frac{k_i + k_j}{n} - \frac{\left\|k_ie_i + k_je_j\right\|_D}{n} + o\left(\frac{\sqrt{k_ik_j}}{n}\right) = \frac{\sqrt{k_ik_j}}{n}\left(k_{ij} + k_{ji} - \left\|k_{ij}e_i + k_{ji}e_j\right\|_D + o(1)\right).$$
For $i = j$, one deduces
$$\sigma_{ii}^{(n)} = \frac{k_i}{n}(1 + o(1)).$$
The asymptotic normality N (0, Σ)(−∞, x] of the final term in equation
(5.12) now follows from Lemma 5.2.4.
$$\lim_{x\to\infty}\frac{x f_i(x)}{1 - F_i(x)} = \alpha_i; \qquad \text{(von Mises (2))}$$
5.3 Multivariate Records and Champions 213
rv X (n) such that X (n) > max(X (1) , . . . , X (n−1) ). Trying to generalize this
concept for the case of random vectors, or even stochastic processes with
continuous sample paths, gives rise to the question of how to define records
in higher dimensions. We consider two different concepts: a simple record is
meant to be an rv X (n) that is larger than X (1) , . . . , X (n−1) in at least one
component, whereas a complete record has to be larger than its predecessors
in all components. In addition to this sequential approach, we say that a set
of rvs $X^{(1)}, \dots, X^{(n)}$ contains a champion if there is an index $i \in \{1, \dots, n\}$ with $X^{(i)} > X^{(j)}$, $j \ne i$. In this case, $X^{(i)}$ is called the champion among $X^{(1)}, \dots, X^{(n)}$.
Terminology
Let $X, X^{(1)}, X^{(2)}, \dots$ be iid rvs in $\mathbb R^d$ with continuous df F. We call $X^{(n)}$ a simple record if $X^{(n)} \not\le \max_{i=1,\dots,n-1} X^{(i)}$, and we call it a complete record if $X^{(n)} > \max_{i=1,\dots,n-1} X^{(i)}$ (Figures 5.1–5.3). We further define
$$\pi_n(X) := P\left(X^{(n)} \text{ is a simple record}\right), \qquad \bar\pi_n(X) := P\left(X^{(n)} \text{ is a complete record}\right).$$
(Figures 5.1–5.3: scatterplots of bivariate observations, marked ×, with a simple record and a complete record highlighted.)
$$P\left(X^{(n)} > \max_{1\le i\le n-1} X^{(i)}\right) = \frac{1}{n}.$$
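The univariate record probability $1/n$ is also immediate to verify empirically; a minimal sketch for an arbitrarily chosen $n = 8$:

```python
import numpy as np

rng = np.random.default_rng(6)
n, reps = 8, 400_000

x = rng.uniform(size=(reps, n))
# X^(n) is a record iff the last of n iid continuous observations is the largest.
p = np.mean(x[:, -1] == x.max(axis=1))
print(f"simulated record probability {p:.4f}, 1/n = {1/n:.4f}")
```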
where
$$U^{(i)} = \left(U_1^{(i)}, \dots, U_d^{(i)}\right), \qquad i = 1, 2, \dots$$
are iid rvs that follow the copula C.
Recall that since F is continuous, the margins are continuous as well, and
in this case, C is uniquely determined by
C(u) = F F1−1 (u1 ), . . . , Fd−1 (ud ) , u = (u1 , . . . , ud ) ∈ (0, 1)d .
or, equivalently,
In that case, we call $X^{(k)}$ the champion among $X^{(1)}, \dots, X^{(n)}$. Note that, different from univariate iid observations $X_1, \dots, X_n$ with a continuous df F, there is not necessarily a champion for multivariate observations.
We denote the sample concurrence probability by $p_n(X)$ and, due to the iid property, obtain as before
$$p_n(X) = \sum_{i=1}^n P\left(X^{(i)} > \max_{1\le j\le n,\, j\ne i} X^{(j)}\right) = n\,\bar\pi_n(X). \tag{5.15}$$
If the limit $\lim_{n\to\infty} p_n(X)$ exists in $[0, 1]$, we call it the extremal concurrence probability.
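For iid observations with continuous and mutually independent components, the sample concurrence probability can be computed directly: each index is the componentwise argmax with probability $(1/n)^d$, so $p_n(X) = n\,(1/n)^d$. A sketch verifying this for $d = 2$, $n = 5$ (hypothetical sizes, for illustration):

```python
import numpy as np

rng = np.random.default_rng(7)
n, reps = 5, 300_000

x = rng.uniform(size=(reps, n, 2))       # n iid bivariate rvs, independent components
argmax = x.argmax(axis=1)                # componentwise index of the maximum
champ = argmax[:, 0] == argmax[:, 1]     # champion exists iff one index is maximal in both
p = champ.mean()
print(f"simulated p_n = {p:.4f}, n*(1/n)^2 = 1/n = {1/n:.4f}")
```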
Unlike records, the concept of multivariate and functional champions is very recent. It was established in the work of Dombry et al. (2017), who derive the limit sample concurrence probability for iid rvs $X^{(1)}, \dots, X^{(n)}$ in $\mathbb R^d$ and also present many results on statistical inference. The D-norm approach provides an elegant formulation of their results; see below.
According to Lemma 5.3.1, we can assume wlog that the observed iid rvs follow a copula, say C. To emphasize this assumption, in what follows we use the notation U instead of X.
Theorem 5.3.2 Let $U^{(1)}, U^{(2)}, \dots$ be independent copies of the rv U, which follows a copula C on $\mathbb R^d$ satisfying $C \in D(G)$, where G is an SMS df with corresponding D-norm $\|\cdot\|_D$. Then,
Note that (ii) implies the uniform integrability of the sequence (Yn )n∈N .
We first show (i). From Lemma 3.1.13 we obtain
$$E\left(M_1^{(n)}\right) = \frac{2n}{n+1} \le 2,$$
n+1
which completes the proof of Theorem 5.3.2.
$$\frac{E(R(n))}{\log(n)} \to_{n\to\infty} E\left(\|\eta\|_D\right).$$
Proof. We have
$$E\left(\|\eta\|_D\right) = E\left(\frac{1}{\|1/Z\|_D}\,1(Z > 0)\right).$$
Further, for $x = (x_1, \dots, x_d) \le 0 \in \mathbb R^d$, we have
$$= E\left(\frac{1}{\|1/Z\|_D}\left(1 - \exp\left(\|1/Z\|_D \max_{1\le i\le d}(x_iZ_i)\right)\right)1(Z > 0)\right)$$
$$= E\left(\|\eta\|_D\right).$$
Lemma 1.2.2, and the fact that $\eta$ and Z are independent, entail
$$= E\left(\frac{1}{\|1/Z\|_D}\,1(Z > 0)\right),$$
which is the first assertion. The second assertion can be shown by repeating the above arguments.
$$E\left(\frac{1}{\|1/Z\|_D}\,1(Z > 0)\right) = 0 \iff \min_{1\le i\le d} Z_i = 0 \ \text{a.s.} \tag{5.16}$$
Note that we avoid division by zero in the preceding formula by the as-
sumption P (Z > 0) > 0.
Proof. We have
$$\bar H_n(x) = \frac{\Pi_n(x)}{\bar\pi_n} := \frac{P\left(n(U-1) > x,\ U > \max_{i=1,\dots,n-1} U^{(i)}\right)}{P\left(U > \max_{i=1,\dots,n-1} U^{(i)}\right)}.$$
$$\bar H_D(x) = 1 - \frac{E\left(\frac{1}{\|1/Z\|_D}\exp\left(\|1/Z\|_D\max_{1\le i\le d}(x_iZ_i)\right)1(Z > 0)\right)}{E\left(\frac{1}{\|1/Z\|_D}\,1(Z > 0)\right)}, \tag{5.17}$$
where Z is a generator of ·D . This is due to Lemma 5.3.4.
Example 5.3.7 For the Marshall–Olkin D-norm with generator
$$Z := \xi(1, \dots, 1) + (1 - \xi)Z^*,$$
we obtain
$$\bar H_\lambda(x) = 1 - \frac{E\left(\frac{1}{\|1/Z\|_{D_\lambda}}\exp\left(\|1/Z\|_{D_\lambda}\max_{1\le i\le d}(x_iZ_i)\right)1(Z > 0,\ \xi = 1)\right)}{E\left(\frac{1}{\|1/Z\|_{D_\lambda}}\,1(Z > 0,\ \xi = 1)\right)}$$
Simple Records
So far, we have investigated the (normalized) probability of a complete record
and in particular, its limit, the extremal concurrence probability. Now, we re-
peat this procedure, this time for the simple record probability. Unlike before,
where we were actually dealing with the probability of having a champion,
normalizing the record probability with the factor n does not yield an inter-
pretation in terms of a probability in the simple record case.
The following result is the equivalent of Theorem 5.3.2 and Proposi-
tion 5.3.6 in the context of multivariate simple records. Let X, X (1) , X (2) , . . .
be iid rvs in $\mathbb R^d$ with common continuous df F. Recall that $X^{(n)}$ is a simple record if
$$X^{(n)} \not\le \max_{1\le i\le n-1} X^{(i)},$$
and $\pi_n(X)$ denotes the probability of $X^{(n)}$ being a simple record within the iid sequence $X^{(1)}, X^{(2)}, \dots$
Theorem 5.3.8 Let $U^{(1)}, U^{(2)}, \dots$ be independent copies of an rv $U \in \mathbb R^d$ following a copula C. Suppose that $C \in D(G)$, $G(x) = \exp(-\|x\|_D)$, $x \le 0 \in \mathbb R^d$. Let $\eta$ be an rv with this df G. Then
$$n\,\pi_n(U) \to_{n\to\infty} E\left(\|\eta\|_D\right),$$
and
This is in general not a probability df on (−∞, 0]d since, for example, H1 (x)
does not converge to zero if only one component xi converges to −∞.
On the other hand, take ·D = ·∞ , which is the least D-norm. In this
case, the components η1 , . . . , ηd of η are completely dependent, i.e., η1 = η2 =
· · · = ηd a.s.; thus,
$$H_\infty(x) = E\left(\left\|\left(\min(x_i, \eta_1)\right)_{i=1}^d\right\|_\infty\right) - \|x\|_\infty = E\bigl(\max\left(\|x\|_\infty, |\eta_1|\right)\bigr) - \|x\|_\infty = \exp\left(-\|x\|_\infty\right), \qquad x = (x_1, \dots, x_d) \le 0 \in \mathbb R^d,$$
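The final identity used here, $E(\max(a, |\eta_1|)) - a = \exp(-a)$ with $a = \|x\|_\infty$ and $|\eta_1|$ standard exponential, is easily confirmed numerically:

```python
import numpy as np

rng = np.random.default_rng(8)
E = rng.exponential(size=2_000_000)      # |eta_1| is standard exponential

for a in (0.5, 1.0, 2.0):
    val = np.mean(np.maximum(a, E)) - a  # E(max(a, |eta_1|)) - a
    print(f"a={a}: simulated {val:.4f}, exp(-a) = {np.exp(-a):.4f}")
```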
$$n\,\pi_n(U) = n\,P\left(U \not\le \max_{1\le i\le n-1} U^{(i)}\right) = n\,P\left(\bigcup_{j=1}^d \left\{U_j > \max_{1\le i\le n-1} U_j^{(i)}\right\}\right)$$
$$= \sum_{\emptyset \ne T \subset \{1,\dots,d\}} (-1)^{|T|-1}\, n\,P\left(U_j > \max_{1\le i\le n-1} U_j^{(i)},\ j \in T\right)$$
$$= E\left(\max_{1\le j\le d}\left(|\eta_j|\,Z_j\right)\right) = E\left(\|\eta\|_D\right).$$
$$n\,P\left(n(U-1) \not\le \min(x, M_n)\right) \to_{n\to\infty} E\left(\|\min(x,\eta)\|_D\right),$$
$$n\,P\left(U \le 1 + \frac{x}{n},\ U \not\le \max_{1\le i\le n-1} U^{(i)}\right) = n\,P\left(n(U-1) \not\le \min(x, M_n)\right) - n\,P\left(U \not\le 1 + \frac{x}{n}\right)$$
$$\to_{n\to\infty} E\left(\|\min(x,\eta)\|_D\right) - \|x\|_D,$$
$$\frac{E(m(n))}{\log(n)} \to_{n\to\infty} E\left(\|\eta\|_D\right),$$
$$n\left(1 - F_i(a_{ni}x + b_{ni})\right) \to_{n\to\infty} -\log(G_i(x)) =: -\psi_i(x), \qquad G_i(x) > 0.$$
$$P\left(\frac{X - b_n}{a_n} \le x \,\Big|\, X^{(n)} \text{ is a simple record}\right) \to_{n\to\infty} H_D(\psi(x)).$$
$$C(u) = \prod_{i=1}^d u_i, \qquad u = (u_1, \dots, u_d) \in [0,1]^d.$$
Then, we obtain
$$\int_{[0,1]^d} \frac{C(u)}{1 - C(u)}\, C(du) = \int_0^1 \cdots \int_0^1 \frac{\prod_{i=1}^d u_i}{1 - \prod_{i=1}^d u_i}\, du_1 \cdots du_d < \infty$$
using elementary arguments and, thus, $E(N(2)) < \infty$. This observation gives rise to the problem of how to characterize those copulas C on $[0,1]^d$, with $d \ge 2$, such that $E(N(2))$ is finite. Note that $E(N(2)) = \infty$ if the components of C are completely dependent.
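For the bivariate independence copula, the integral above can even be evaluated exactly: expanding $C/(1-C) = \sum_{n\ge 1}(uv)^n$ and integrating term by term gives $\sum_{n\ge 1} 1/(n+1)^2 = \pi^2/6 - 1$, confirming finiteness. A sketch checking the series value:

```python
import math

# sum over n >= 1 of the integral of (u*v)^n over [0,1]^2 is sum 1/(n+1)^2,
# whose value is pi^2/6 - 1 (Basel series minus the first term).
s = sum(1.0 / (n + 1) ** 2 for n in range(1, 200_000))
print(s, math.pi ** 2 / 6 - 1)
```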
$$E(N(2)) = E\left(\frac{1}{1 - C(U)}\right) = 1 + \int_1^\infty P\left(C(U) > 1 - \frac{1}{t}\right) dt \le 1 + \int_1^\infty P\left(U_i > 1 - \frac{1}{t},\ 1 \le i \le d\right) dt.$$
On the other hand, the lower bound in (5.20) yields
$$E(N(2)) - 1 = \int_1^\infty P\left(C(U) > 1 - \frac{1}{t}\right) dt \ge \int_1^\infty P\left(\sum_{i=1}^d (1 - U_i) < \frac{1}{t}\right) dt$$
$$\ge \int_1^\infty P\left(1 - U_i < \frac{1}{dt},\ 1 \le i \le d\right) dt = \frac{1}{d}\int_d^\infty P\left(1 - U_i < \frac{1}{t},\ 1 \le i \le d\right) dt$$
$$= \frac{1}{d}\int_d^\infty P\left(U_i > 1 - \frac{1}{t},\ 1 \le i \le d\right) dt.$$
As a consequence, we have established the equivalence
$$E(N(2)) < \infty \iff \int_1^\infty P\left(U_i > 1 - \frac{1}{t},\ 1 \le i \le d\right) dt < \infty.$$
$$\frac{P\left(U_i \ge u,\ 1 \le i \le d\right)}{1 - u} \ge \frac{⦀1⦀_D}{2}$$
for $u \in [1 - \varepsilon, 1)$. This implies
$$\int_0^1 \frac{P(U_i \ge u,\ 1\le i\le d)}{(1-u)^2}\, du \ge \int_{1-\varepsilon}^1 \frac{P(U_i \ge u,\ 1\le i\le d)}{(1-u)^2}\, du \ge \frac{⦀1⦀_D}{2}\int_{1-\varepsilon}^1 \frac{1}{1-u}\, du = \infty,$$
$$\bar\chi := \lim_{u\uparrow 1} \frac{2\log(1-u)}{\log\left(P(U_1 > u,\ U_2 > u)\right)} - 1$$
is a popular measure of tail comparison, provided that this limit exists (Coles
et al. (1999); Heffernan (2000)). In this case, we have χ̄ ∈ [−1, 1] (Beirlant
et al. (2004, (9.83))). For a bivariate normal copula with a coefficient of cor-
relation ρ ∈ (−1, 1), it is, for instance, well known that χ̄ = ρ.
Note that the next result does not require C ∈ D(G). It requires only
the existence of the above tail dependence coefficient for at least one pair of
components.
Proposition 5.3.13 Let $U = (U_1, \dots, U_d)$ follow a copula C. Suppose that there exist indices $k \ne j$ such that
$$\bar\chi_{k,j} = \lim_{u\uparrow 1} \frac{2\log(1-u)}{\log\left(P(U_k > u,\ U_j > u)\right)} - 1 \in [-1, 1). \tag{5.21}$$
Since
$$\frac{2\log(1-u)}{\log\left(P(U_k > u,\ U_j > u)\right)} - 1 \to_{u\uparrow 1} \bar\chi_{k,j} \in [-1, 1),$$
there exist $\varepsilon > 0$ and $c < 1/2$ such that
$$\int_{1-\varepsilon}^1 \frac{P\left(U_k \ge u,\ U_j \ge u\right)}{(1-u)^2}\, du = \int_{1-\varepsilon}^1 \exp\left(\log\left(\frac{P\left(U_k \ge u,\ U_j \ge u\right)}{(1-u)^2}\right)\right) du$$
Corollary 5.3.14 We have $E(N(2)) < \infty$ for multivariate normal rvs, unless all components are completely dependent, i.e., unless all bivariate correlation coefficients are one.
References
List of Symbols
1(·): indicator function, 6
1_A(t): indicator function of set A, 8
Γ: Gamma function, 6
=_D: equality of distributions, 121, 187
|T|: number of elements in set T, 20
Ā: topological closure of set A, 69
⊤: transpose of matrix, 3
e_j: j-th unit vector, 5
C[0,1]: set of continuous functions on [0,1], 52
C⁺[0,1]: set of non-negative continuous functions on [0,1], 175
C⁻[0,1]: set of continuous and non-positive functions on [0,1], 184
E[0,1]: subset of functions on [0,1], 52
E⁻[0,1]: subset of non-positive functions in E[0,1], 166
[0,c]^[0,1]: set of functions from [0,1] to [0,c], 162
⦀·⦀_D: dual D-norm function, 22
‖·‖: norm, 1
∂₊: right derivative of a function, 203
E: matrix with constant entry one, 9
→_D: convergence in distribution, 34
ε_z: Dirac measure, 67
F ∈ D(G): F is in the domain of attraction of G, 100, 135
F⁻¹: generalized inverse of df, 136, 212
P ∗ Z: distribution of rv Z, 39
a.s.: almost surely, 7
Δ-inequality: triangle inequality, 1
Δ-monotone: Delta-monotone, 17

Index
Absorbing D-norm, 40
Angular measure, 26
Angular set, 63
Aumann integral, 88
Barycentric coordinates, 68, 75
Bauer simplex, 70
Beta function, 127
Bilinear map, 78