
6  The Multivariate Normal Distribution and Copulas

Introduction
In this chapter we introduce the multivariate normal distribution and show how to generate random variables having this joint distribution. We also introduce copulas, which are useful when choosing joint distributions to model random variables whose marginal distributions are known.

6.1 The Multivariate Normal


Let Z_1, ..., Z_m be independent and identically distributed normal random variables, each with mean 0 and variance 1. If for constants a_{ij}, i = 1, ..., n, j = 1, ..., m, and μ_i, i = 1, ..., n,

X_1 = a_{11} Z_1 + a_{12} Z_2 + ... + a_{1m} Z_m + μ_1
⋮
X_i = a_{i1} Z_1 + a_{i2} Z_2 + ... + a_{im} Z_m + μ_i
⋮
X_n = a_{n1} Z_1 + a_{n2} Z_2 + ... + a_{nm} Z_m + μ_n

then the vector X_1, ..., X_n is said to have a multivariate normal distribution. That is, X_1, ..., X_n has a multivariate normal distribution if each X_i is a constant plus a linear combination of the same set of independent standard normal random variables. Because the sum of independent normal random variables is itself normal, it follows that each X_i is itself a normal random variable.


The means and covariances of multivariate normal random variables are as follows:

E[X_i] = μ_i

and

Cov(X_i, X_j) = Cov(∑_{k=1}^m a_{ik} Z_k, ∑_{r=1}^m a_{jr} Z_r)
             = ∑_{k=1}^m ∑_{r=1}^m Cov(a_{ik} Z_k, a_{jr} Z_r)
             = ∑_{k=1}^m ∑_{r=1}^m a_{ik} a_{jr} Cov(Z_k, Z_r)
             = ∑_{k=1}^m a_{ik} a_{jk}    (6.1)

where the preceding used that Cov(Z_k, Z_r) = 1 if r = k, and 0 if r ≠ k.

The preceding can be compactly expressed in matrix notation. Namely, if we let A be the n × m matrix whose row i, column j element is a_{ij}, then the defining equation of the multivariate normal is

X = AZ + µ    (6.2)

where X = (X_1, ..., X_n)ᵀ is the multivariate normal vector, Z = (Z_1, ..., Z_m)ᵀ is the column vector of independent standard normals, µ = (μ_1, ..., μ_n)ᵀ is the vector of means, and where Bᵀ denotes the transpose of the matrix B. Because Equation (6.1) states that Cov(X_i, X_j) is the element in row i, column j of the matrix AAᵀ, it follows that if C is the matrix whose row i, column j element is c_{ij} = Cov(X_i, X_j), then Equation (6.1) can be written as

C = AAᵀ    (6.3)
An important property of multivariate normal vectors is that the joint distribution of X = (X_1, ..., X_n) is completely determined by the quantities E[X_i] and Cov(X_i, X_j), i, j = 1, ..., n. That is, the joint distribution is determined by knowledge of the mean vector µ = (μ_1, ..., μ_n) and the covariance matrix C. This result can be proved by calculating the joint moment generating function of X_1, ..., X_n, namely E[exp{∑_{i=1}^n t_i X_i}], which is known to completely specify the joint distribution. To determine this quantity, note first that ∑_{i=1}^n t_i X_i is itself a linear combination of the independent normal random variables Z_1, ..., Z_m, and is thus also a normal random variable. Hence, using that E[e^W] = exp{E[W] + Var(W)/2} when W is normal, we see that

E[exp{∑_{i=1}^n t_i X_i}] = exp{E[∑_{i=1}^n t_i X_i] + (1/2) Var(∑_{i=1}^n t_i X_i)}

As

E[∑_{i=1}^n t_i X_i] = ∑_{i=1}^n t_i μ_i

and

Var(∑_{i=1}^n t_i X_i) = Cov(∑_{i=1}^n t_i X_i, ∑_{j=1}^n t_j X_j)
                      = ∑_{i=1}^n ∑_{j=1}^n t_i t_j Cov(X_i, X_j)

we see that the joint moment generating function, and thus the joint distribution, of the multivariate normal vector is specified by knowledge of the mean values and the covariances.

6.2 Generating a Multivariate Normal Random Vector


Suppose now that we want to generate a multivariate normal vector X = (X_1, ..., X_n) having a specified mean vector µ and covariance matrix C. Using Equations (6.2) and (6.3), along with the fact that the distribution of X is determined by its mean vector and covariance matrix, one way to accomplish this is to first find a matrix A such that

C = AAᵀ

then generate independent standard normals Z_1, ..., Z_n and set

X = AZ + µ

To find such a matrix A we can make use of a result known as the Choleski decomposition, which states that for any n × n symmetric and positive definite matrix M there is an n × n lower triangular matrix A such that M = AAᵀ, where by lower triangular we mean that all elements in the upper triangle of the matrix are equal to 0. (That is, a matrix is lower triangular if the element in row i, column j is 0 whenever i < j.) Because a covariance matrix C is symmetric (as Cov(X_i, X_j) = Cov(X_j, X_i)), and as we will assume that it is positive definite (which is usually the case), we can use the Choleski decomposition to find such a matrix A.
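In Python, for instance, this recipe might be sketched as follows, using NumPy's numpy.linalg.cholesky to compute the lower triangular factor; the function name and example values are ours, for illustration only:

```python
import numpy as np

def generate_multivariate_normal(mu, C, rng=None):
    # One draw of X = A Z + mu, where A is the Choleski factor of C (C = A A^T)
    rng = rng or np.random.default_rng()
    mu = np.asarray(mu, dtype=float)
    A = np.linalg.cholesky(np.asarray(C, dtype=float))  # lower triangular A
    Z = rng.standard_normal(len(mu))                    # independent standard normals
    return A @ Z + mu

# Illustrative: mean vector (1, 2) with a positive definite covariance matrix
x = generate_multivariate_normal([1.0, 2.0], [[9.0, 4.0], [4.0, 8.0]])
```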
100 6 The Multivariate Normal Distribution and Copulas

Example 6a  The Bivariate Normal Distribution  Suppose we want to generate the multivariate normal vector X_1, X_2, having means μ_i, variances σ_i², i = 1, 2, and covariance c = Cov(X_1, X_2). (When n = 2, the multivariate normal vector is called a bivariate normal.) If the Choleski decomposition matrix is

A = \begin{pmatrix} a_{11} & 0 \\ a_{21} & a_{22} \end{pmatrix}    (6.4)

then we need to solve

\begin{pmatrix} a_{11} & 0 \\ a_{21} & a_{22} \end{pmatrix} \begin{pmatrix} a_{11} & a_{21} \\ 0 & a_{22} \end{pmatrix} = \begin{pmatrix} σ_1² & c \\ c & σ_2² \end{pmatrix}

That is,

\begin{pmatrix} a_{11}² & a_{11}a_{21} \\ a_{11}a_{21} & a_{21}² + a_{22}² \end{pmatrix} = \begin{pmatrix} σ_1² & c \\ c & σ_2² \end{pmatrix}

This yields that

a_{11}² = σ_1²
a_{11}a_{21} = c
a_{21}² + a_{22}² = σ_2²

Letting ρ = c/(σ_1σ_2) be the correlation between X_1 and X_2, the preceding gives that

a_{11} = σ_1
a_{21} = c/σ_1 = ρσ_2
a_{22} = √(σ_2² − ρ²σ_2²) = σ_2√(1 − ρ²)

Hence, letting

A = \begin{pmatrix} σ_1 & 0 \\ ρσ_2 & σ_2√(1 − ρ²) \end{pmatrix}    (6.5)

we can generate X_1, X_2 by generating independent standard normals Z_1 and Z_2 and then setting

X = AZ + µ

That is,

X_1 = σ_1 Z_1 + μ_1
X_2 = ρσ_2 Z_1 + σ_2√(1 − ρ²) Z_2 + μ_2

The preceding can also be used to derive the joint density of the bivariate normal vector X_1, X_2. Start with the joint density function of Z_1, Z_2:

f_{Z_1,Z_2}(z_1, z_2) = (1/2π) exp{−(z_1² + z_2²)/2}

and consider the transformation

x_1 = σ_1 z_1 + μ_1    (6.6)
x_2 = ρσ_2 z_1 + σ_2√(1 − ρ²) z_2 + μ_2    (6.7)

The Jacobian of this transformation is

J = \begin{vmatrix} σ_1 & 0 \\ ρσ_2 & σ_2√(1 − ρ²) \end{vmatrix} = σ_1 σ_2 √(1 − ρ²)    (6.8)

Moreover, inverting the transformation yields

z_1 = (x_1 − μ_1)/σ_1
z_2 = (x_2 − μ_2 − ρ(σ_2/σ_1)(x_1 − μ_1)) / (σ_2√(1 − ρ²))

giving that

z_1² + z_2² = ((x_1 − μ_1)²/σ_1²)(1 + ρ²/(1 − ρ²)) + (x_2 − μ_2)²/(σ_2²(1 − ρ²)) − 2ρ(x_1 − μ_1)(x_2 − μ_2)/(σ_1σ_2(1 − ρ²))
           = (x_1 − μ_1)²/(σ_1²(1 − ρ²)) + (x_2 − μ_2)²/(σ_2²(1 − ρ²)) − 2ρ(x_1 − μ_1)(x_2 − μ_2)/(σ_1σ_2(1 − ρ²))

Thus, we obtain that the joint density of X_1, X_2 is

f_{X_1,X_2}(x_1, x_2) = (1/|J|) f_{Z_1,Z_2}((x_1 − μ_1)/σ_1, (x_2 − μ_2 − ρ(σ_2/σ_1)(x_1 − μ_1))/(σ_2√(1 − ρ²)))
= C exp{ −(1/(2(1 − ρ²))) [ ((x_1 − μ_1)/σ_1)² + ((x_2 − μ_2)/σ_2)² − 2ρ(x_1 − μ_1)(x_2 − μ_2)/(σ_1σ_2) ] }

where C = 1/(2π σ_1 σ_2 √(1 − ρ²)).
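Returning to the generation recipe of this example, a minimal Python sketch, assuming only NumPy (the function name is ours):

```python
import numpy as np

def generate_bivariate_normal(mu1, mu2, sigma1, sigma2, rho, rng=None):
    # X1 = sigma1*Z1 + mu1;  X2 = rho*sigma2*Z1 + sigma2*sqrt(1 - rho^2)*Z2 + mu2
    rng = rng or np.random.default_rng()
    z1, z2 = rng.standard_normal(2)
    x1 = sigma1 * z1 + mu1
    x2 = rho * sigma2 * z1 + sigma2 * np.sqrt(1.0 - rho**2) * z2 + mu2
    return x1, x2
```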

It is generally easy to solve the equations for the Choleski decomposition of an n × n covariance matrix C. As we equate the successive elements of the matrix AAᵀ to the corresponding values of the matrix C, the computations are easiest if we look at the elements of the matrices by going down successive columns. That is, we equate the element in row i, column j of AAᵀ to c_{ij} in the following order of (i, j):

(1, 1), (2, 1), ..., (n, 1), (2, 2), (3, 2), ..., (n, 2), (3, 3), ..., (n, 3), ..., (n − 1, n − 1), (n, n − 1), (n, n)

By symmetry the equations obtained for (i, j) and (j, i) are the same, and so only the first to appear is given.

For instance, suppose we want the Choleski decomposition of the matrix

C = \begin{pmatrix} 9 & 4 & 2 \\ 4 & 8 & 3 \\ 2 & 3 & 7 \end{pmatrix}    (6.9)

The matrix equation becomes

\begin{pmatrix} a_{11} & 0 & 0 \\ a_{21} & a_{22} & 0 \\ a_{31} & a_{32} & a_{33} \end{pmatrix} \begin{pmatrix} a_{11} & a_{21} & a_{31} \\ 0 & a_{22} & a_{32} \\ 0 & 0 & a_{33} \end{pmatrix} = \begin{pmatrix} 9 & 4 & 2 \\ 4 & 8 & 3 \\ 2 & 3 & 7 \end{pmatrix}

yielding the solution

a_{11}² = 9  ⇒  a_{11} = 3
a_{21}a_{11} = 4  ⇒  a_{21} = 4/3
a_{31}a_{11} = 2  ⇒  a_{31} = 2/3
a_{21}² + a_{22}² = 8  ⇒  a_{22} = √(56)/3 ≈ 2.4944
a_{31}a_{21} + a_{32}a_{22} = 3  ⇒  a_{32} = (3 − 8/9)/(√(56)/3) = 19/(3√56) ≈ 0.8463
a_{31}² + a_{32}² + a_{33}² = 7  ⇒  a_{33} = (1/3)√(59 − 19²/56) ≈ 2.4165
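As a machine check of this computation, one can compare against NumPy's built-in factorization (a sketch, using the matrix of Equation (6.9)):

```python
import numpy as np

C = np.array([[9.0, 4.0, 2.0],
              [4.0, 8.0, 3.0],
              [2.0, 3.0, 7.0]])
A = np.linalg.cholesky(C)  # lower triangular A with A @ A.T == C
print(A)        # first column 3, 4/3, 2/3; a22 ≈ 2.4944, a32 ≈ 0.8463, a33 ≈ 2.4165
print(A @ A.T)  # recovers C up to floating-point roundoff
```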

6.3 Copulas

A joint probability distribution function that results in both marginal distributions being uniformly distributed on (0, 1) is called a copula. That is, the joint distribution function C(x, y) is a copula if C(0, 0) = 0 and, for 0 ≤ x, y ≤ 1,

C(x, 1) = x,  C(1, y) = y

Suppose we are interested in finding an appropriate joint probability distribution function H(x, y) for random variables X and Y, whose marginal distributions are known to be the continuous distribution functions F and G, respectively. That is, knowing that

P(X ≤ x) = F(x)

and

P(Y ≤ y) = G(y)

and having some knowledge about the type of dependency between X and Y, we want to choose an appropriate joint distribution function H(x, y) = P(X ≤ x, Y ≤ y). Because X has distribution F and Y has distribution G, it follows that F(X) and G(Y) are both uniform on (0, 1). Consequently, the joint distribution function of F(X), G(Y) is a copula. Also, because F and G are both increasing functions, it follows that X ≤ x, Y ≤ y if and only if F(X) ≤ F(x), G(Y) ≤ G(y). Consequently, if we choose the copula C(x, y) as the joint distribution function of F(X), G(Y), then

H(x, y) = P(X ≤ x, Y ≤ y)
        = P(F(X) ≤ F(x), G(Y) ≤ G(y))
        = C(F(x), G(y))

The copula approach to choosing an appropriate joint probability distribution function for random variables X and Y is to first decide on their marginal distributions F and G, and then choose an appropriate copula to model the joint distribution of F(X), G(Y). An appropriate copula would be one that models the presumed dependencies between F(X) and G(Y). Because F and G are increasing, the dependencies resulting from the copula chosen should be similar to the dependency that we think holds between X and Y. For instance, if we believe that the correlation between X and Y is ρ, then we could try to choose a copula such that random variables whose joint distribution is given by that copula would have correlation equal to ρ. (Because correlation measures only the linear relationship between random variables, the correlation of X and Y is, however, not equal to the correlation of F(X) and G(Y).)

Example 6b  The Gaussian Copula  A very popular copula used in modeling is the Gaussian copula. Let Φ be the standard normal distribution function. If X and Y are standard normal random variables whose joint distribution is a bivariate normal distribution with correlation ρ, then the joint distribution of Φ(X) and Φ(Y) is called the Gaussian copula. That is, the Gaussian copula C is given by

C(x, y) = P(Φ(X) ≤ x, Φ(Y) ≤ y)
        = P(X ≤ Φ^{−1}(x), Y ≤ Φ^{−1}(y))
        = ∫_{−∞}^{Φ^{−1}(x)} ∫_{−∞}^{Φ^{−1}(y)} (2π√(1 − ρ²))^{−1} exp{−(u² + v² − 2ρuv)/(2(1 − ρ²))} dv du
Remark  The terminology "Gaussian copula" is used because the normal distribution is often called the Gaussian distribution, in honor of the famous mathematician C.F. Gauss, who made important use of the normal distribution in his astronomical studies.
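A sketch of evaluating this copula numerically in Python, assuming SciPy is available (norm.ppf for Φ⁻¹ and multivariate_normal.cdf for the bivariate normal distribution function; the function name is ours):

```python
from scipy.stats import norm, multivariate_normal

def gaussian_copula(x, y, rho):
    # C(x, y) = P(X <= Phi^{-1}(x), Y <= Phi^{-1}(y)) for a standard
    # bivariate normal (X, Y) with correlation rho
    point = [norm.ppf(x), norm.ppf(y)]
    return multivariate_normal(mean=[0.0, 0.0],
                               cov=[[1.0, rho], [rho, 1.0]]).cdf(point)

# Near the boundary, C(x, 1) = x should approximately hold:
print(gaussian_copula(0.3, 0.999999, 0.5))  # close to 0.3
```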

Suppose X, Y has a joint distribution function H(x, y), and let

F(x) = lim_{y→∞} H(x, y)

and

G(y) = lim_{x→∞} H(x, y)

be the marginal distributions of X and Y. The joint distribution of F(X), G(Y) is called the copula generated by X, Y, and is denoted C_{X,Y}. That is,

C_{X,Y}(x, y) = P(F(X) ≤ x, G(Y) ≤ y)
             = P(X ≤ F^{−1}(x), Y ≤ G^{−1}(y))
             = H(F^{−1}(x), G^{−1}(y))

For instance, the Gaussian copula is the copula generated by random variables that have a bivariate normal distribution with means 0, variances 1, and correlation ρ. We now show that if s(x) and t(x) are increasing functions, then the copula generated by the random vector s(X), t(Y) is equal to the copula generated by X, Y.
Proposition  If s and t are increasing functions, then

C_{s(X),t(Y)}(x, y) = C_{X,Y}(x, y)

Proof  If F and G are the respective distribution functions of X and Y, then the distribution function of s(X), call it F_s, is

F_s(x) = P(s(X) ≤ x)
       = P(X ≤ s^{−1}(x))    (because s is an increasing function)
       = F(s^{−1}(x))

Similarly, the distribution function of t(Y), call it F_t, is

F_t(y) = G(t^{−1}(y))

Consequently,

F_s(s(X)) = F(s^{−1}(s(X))) = F(X)

and

F_t(t(Y)) = G(Y)

showing that

C_{s(X),t(Y)}(x, y) = P(F_s(s(X)) ≤ x, F_t(t(Y)) ≤ y)
                    = P(F(X) ≤ x, G(Y) ≤ y)
                    = C_{X,Y}(x, y)

Suppose again that X, Y has a joint distribution function H(x, y) and that the continuous marginal distribution functions are F and G. Another way to obtain a copula, aside from using that F(X) and G(Y) are both uniform on (0, 1), is to use that 1 − F(X) and 1 − G(Y) are also uniform on (0, 1). Hence,

C(x, y) = P(1 − F(X) ≤ x, 1 − G(Y) ≤ y)
        = P(F(X) ≥ 1 − x, G(Y) ≥ 1 − y)
        = P(X ≥ F^{−1}(1 − x), Y ≥ G^{−1}(1 − y))    (6.10)

is also a copula. It is sometimes called the copula generated by the tail distributions of X and Y.

Example 6c  The Marshall–Olkin Copula  A tail distribution generated copula that indicates a positive correlation between X and Y, and which gives a positive probability that X = Y, is the Marshall–Olkin copula. The model that generated it originated as follows. Imagine that there are three types of shocks. Let T_i denote the time until a type i shock occurs, and suppose that T_1, T_2, T_3 are independent exponential random variables with respective means E[T_i] = 1/λ_i. Now suppose that there are two items, and that a type 1 shock causes item 1 to fail, a type 2 shock causes item 2 to fail, and a type 3 shock causes both items to fail. Let X be the time at which item 1 fails and let Y be the time at which item 2 fails. Because item 1 will fail either when a type 1 or a type 3 shock occurs, it follows from the fact that the minimum of independent exponential random variables is also exponential, with a rate equal to the sum of the rates, that X is exponential with rate λ_1 + λ_3. Similarly, Y is exponential with rate λ_2 + λ_3. That is, X and Y have respective distribution functions

F(x) = 1 − exp{−(λ_1 + λ_3)x},  x ≥ 0    (6.11)
G(y) = 1 − exp{−(λ_2 + λ_3)y},  y ≥ 0    (6.12)

Now, for x ≥ 0, y ≥ 0,

P(X > x, Y > y) = P(T_1 > x, T_2 > y, T_3 > max(x, y))
                = P(T_1 > x) P(T_2 > y) P(T_3 > max(x, y))
                = exp{−λ_1 x − λ_2 y − λ_3 max(x, y)}
                = exp{−λ_1 x − λ_2 y − λ_3(x + y − min(x, y))}
                = exp{−(λ_1 + λ_3)x} exp{−(λ_2 + λ_3)y} exp{λ_3 min(x, y)}
                = exp{−(λ_1 + λ_3)x} exp{−(λ_2 + λ_3)y} min(exp{λ_3 x}, exp{λ_3 y})    (6.13)

Now, if p(x) = 1 − e^{−ax}, then p^{−1}(x) is such that

x = p(p^{−1}(x)) = 1 − e^{−a p^{−1}(x)}

which yields that

p^{−1}(x) = −(1/a) ln(1 − x)    (6.14)

Consequently, setting a = λ_1 + λ_3 in Equation (6.14), we see from Equation (6.11) that

F^{−1}(1 − x) = −ln(x)/(λ_1 + λ_3),  0 ≤ x ≤ 1

Similarly, setting a = λ_2 + λ_3 in Equation (6.14) yields from Equation (6.12) that

G^{−1}(1 − y) = −ln(y)/(λ_2 + λ_3),  0 ≤ y ≤ 1

Consequently,

exp{−(λ_1 + λ_3)F^{−1}(1 − x)} = x
exp{−(λ_2 + λ_3)G^{−1}(1 − y)} = y
exp{λ_3 F^{−1}(1 − x)} = x^{−λ_3/(λ_1+λ_3)}
exp{λ_3 G^{−1}(1 − y)} = y^{−λ_3/(λ_2+λ_3)}

Hence, from Equations (6.10) and (6.13) we obtain that the copula generated by the tail distribution of X and Y, referred to as the Marshall–Olkin copula, is

C(x, y) = P(X ≥ F^{−1}(1 − x), Y ≥ G^{−1}(1 − y))
        = x y min(x^{−λ_3/(λ_1+λ_3)}, y^{−λ_3/(λ_2+λ_3)})
        = min(x^α y, x y^β)

where α = λ_1/(λ_1 + λ_3) and β = λ_2/(λ_2 + λ_3).
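In code the closed form is immediate; a small sketch (the function name is ours):

```python
def marshall_olkin_copula(x, y, lam1, lam2, lam3):
    # C(x, y) = min(x**alpha * y, x * y**beta) with
    # alpha = lam1/(lam1 + lam3) and beta = lam2/(lam2 + lam3)
    alpha = lam1 / (lam1 + lam3)
    beta = lam2 / (lam2 + lam3)
    return min(x**alpha * y, x * y**beta)
```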

Multidimensional Copulas
We can also use copulas to model n-dimensional probability distributions. The n-dimensional distribution function C(x_1, ..., x_n) is said to be a copula if all n marginal distributions are uniform on (0, 1). We can then choose a joint distribution of a random vector X_1, ..., X_n by first choosing the marginal distribution functions F_i, i = 1, ..., n, and then choosing a copula for the joint distribution of F_1(X_1), ..., F_n(X_n). Again, a popular choice is the Gaussian copula, which takes C to be the joint distribution function of Φ(W_1), ..., Φ(W_n) when W_1, ..., W_n has a multivariate normal distribution with mean vector 0 and a specified covariance matrix whose diagonal (variance) values are all 1. (The diagonal values of the covariance matrix are taken equal to 1 so that the distribution of Φ(W_i) is uniform on (0, 1).) In addition, so that the relationship between X_i and X_j is similar to that between W_i and W_j, it is usual to let Cov(W_i, W_j) = Cov(X_i, X_j), i ≠ j.

6.4 Generating Variables from Copula Models

Suppose we want to generate a random vector X = (X_1, ..., X_n) with marginal distributions F_1, ..., F_n and copula C. Provided that we can generate a random vector whose distribution is C, and that we can invert the distribution functions F_i, i = 1, ..., n, it is easy to generate X. Because the joint distribution of F_1(X_1), ..., F_n(X_n) is C, we can generate X_1, ..., X_n by first generating a random vector having distribution C and then inverting the generated values to obtain the desired vector X. That is, if the generated values from the copula distribution function are y_1, ..., y_n, then the generated values of X_1, ..., X_n are F_1^{−1}(y_1), ..., F_n^{−1}(y_n).

Example 6d  The following can be used to generate X_1, ..., X_n having marginal distributions F_1, ..., F_n and covariances Cov(X_i, X_j), i ≠ j, by using a Gaussian copula; a code sketch follows the steps.

1. Use the Choleski decomposition method to generate W_1, ..., W_n from a multivariate normal distribution with means all equal to 0, variances all equal to 1, and with Cov(W_i, W_j) = Cov(X_i, X_j), i ≠ j.
2. Compute the values Φ(W_i), i = 1, ..., n, and note that the joint distribution of Φ(W_1), ..., Φ(W_n) is the Gaussian copula.
3. Let F_i(X_i) = Φ(W_i), i = 1, ..., n.
4. Invert to obtain X_i = F_i^{−1}(Φ(W_i)), i = 1, ..., n.
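A hedged end-to-end sketch of these four steps in Python; the exponential marginals and all names are our illustrative assumptions, and the supplied matrix must have unit diagonal:

```python
import numpy as np
from scipy.stats import norm, expon

def gaussian_copula_sample(cov, inverse_cdfs, rng=None):
    # Step 1: W via Choleski; Step 2: Phi(W_i); Steps 3-4: X_i = F_i^{-1}(Phi(W_i))
    rng = rng or np.random.default_rng()
    A = np.linalg.cholesky(np.asarray(cov))
    W = A @ rng.standard_normal(len(inverse_cdfs))
    U = norm.cdf(W)  # the joint distribution of the U_i is the Gaussian copula
    return np.array([F_inv(u) for F_inv, u in zip(inverse_cdfs, U)])

# Illustrative: two exponential marginals with means 1 and 1/2
cov = [[1.0, 0.6], [0.6, 1.0]]
x = gaussian_copula_sample(cov, [expon(scale=1.0).ppf, expon(scale=0.5).ppf])
```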

Example 6e  Suppose that we want to generate V, W having marginal distribution functions H and R using a Marshall–Olkin tail copula. Rather than generating directly from the copula, it is easier to first generate the Marshall–Olkin vector X, Y. With F and G denoting the marginal distribution functions of X and Y, we then take 1 − F(X) = e^{−(λ_1+λ_3)X}, 1 − G(Y) = e^{−(λ_2+λ_3)Y} as the generated value of the vector having the distribution of the copula. We then set these values equal to H(V) and to R(W) and solve for V and W. That is, we use the following approach; a code sketch follows the steps.

1. Generate T_1, T_2, T_3, independent exponential random variables with rates λ_1, λ_2, λ_3.
2. Let X = min(T_1, T_3), Y = min(T_2, T_3).
3. Set H(V) = e^{−(λ_1+λ_3)X}, R(W) = e^{−(λ_2+λ_3)Y}.
4. Solve the preceding to obtain V, W.
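A sketch of this algorithm in Python, with the marginal inverse distribution functions H⁻¹ and R⁻¹ passed in as callables (our naming convention):

```python
import numpy as np

def marshall_olkin_tail_sample(lam1, lam2, lam3, H_inv, R_inv, rng=None):
    rng = rng or np.random.default_rng()
    t1, t2, t3 = rng.exponential([1/lam1, 1/lam2, 1/lam3])  # step 1: shock times
    x, y = min(t1, t3), min(t2, t3)                          # step 2
    u = np.exp(-(lam1 + lam3) * x)                           # step 3: value of H(V)
    v = np.exp(-(lam2 + lam3) * y)                           #         value of R(W)
    return H_inv(u), R_inv(v)                                # step 4: solve for V, W
```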

Exercises

1. Suppose Y_1, ..., Y_m are independent normal random variables with means E[Y_i] = μ_i and variances Var(Y_i) = σ_i², i = 1, ..., m. If

X_i = a_{i1} Y_1 + a_{i2} Y_2 + ... + a_{im} Y_m,  i = 1, ..., n

argue that X_1, ..., X_n is a multivariate normal random vector.

2. Suppose that X_1, ..., X_n has a multivariate normal distribution. Show that X_1, ..., X_n are independent if and only if

Cov(X_i, X_j) = 0 when i ≠ j

3. If X is a multivariate normal n-vector with mean vector µ and covariance matrix C, show that AX is multivariate normal with mean vector Aµ and covariance matrix ACAᵀ, when A is an m × n matrix.

4. Find the Choleski decomposition of the matrix

\begin{pmatrix} 4 & 2 & 2 & 4 \\ 2 & 5 & 7 & 0 \\ 2 & 7 & 19 & 11 \\ 4 & 0 & 11 & 25 \end{pmatrix}

5. Let X_1, X_2 have a bivariate normal distribution, with means E[X_i] = μ_i, variances Var(X_i) = σ_i², i = 1, 2, and correlation ρ. Show that the conditional distribution of X_2 given that X_1 = x is normal with mean μ_2 + ρ(σ_2/σ_1)(x − μ_1) and variance σ_2²(1 − ρ²).

6. Give an algorithm for generating random variables X_1, X_2, X_3 having a multivariate normal distribution with means E[X_i] = i, i = 1, 2, 3, and covariance matrix

\begin{pmatrix} 3 & -2 & 1 \\ -2 & 5 & 3 \\ 1 & 3 & 4 \end{pmatrix}
7. Find the copula C_{X,X}.

8. Find the copula C_{X,−X}.

9. Find the copula C_{X,Y} when X and Y are independent.

10. If s is an increasing function and t is a decreasing function, find C_{s(X),t(Y)} in terms of C_{X,Y}.
