
Solution to exercises on MVN

Richard Gill
Mathematical Institute, Leiden University; http://www.math.leidenuniv.nl/gill
March 16, 2009
Abstract
This is the solution to the two exercises on the multivariate normal
distribution in my lecture notes for Statistics II.

Question 1 (i)

Let us compute the characteristic function of $X = AV + b$ where the $V_i$ are independent standard normally distributed. I will use inner product notation $\langle u, v\rangle = u^\top v$ for conformable column vectors $u, v$. The characteristic function of a random vector $X$ is then $\varphi_X(t) = E\exp i\langle t, X\rangle$, and it determines the distribution of $X$. We know that $E e^{it_1 V_1} = e^{-\frac12 t_1^2}$. It follows that $E e^{i\langle t, V\rangle} = e^{-\frac12\|t\|^2}$ and hence $E e^{i\langle s, X\rangle} = E e^{i\langle s, AV+b\rangle} = E e^{i\langle A^\top s, V\rangle} e^{i\langle s, b\rangle} = e^{-\frac12\langle s, AA^\top s\rangle} e^{i\langle s, b\rangle} = e^{-\frac12 s^\top \Sigma s + i s^\top \mu}$, where $\Sigma = AA^\top$ is the covariance matrix of $X$ and $\mu = b$ is its mean vector. It follows that the distribution of $X$ only depends on $\mu$ and $\Sigma$.

We will call the distribution with characteristic function $e^{-\frac12 s^\top \Sigma s + i s^\top \mu}$ the multivariate normal distribution with mean $\mu$ and covariance matrix $\Sigma$. If $X$ has this distribution, we write $X \sim N(\mu, \Sigma)$. According to this definition, $V \sim N(0, I)$.
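
As a numerical sanity check, here is a minimal sketch (assuming numpy; the matrix $A$, shift $b$ and test point $s$ are arbitrary illustrative choices, not part of the exercise) comparing a Monte Carlo estimate of $E e^{i\langle s, X\rangle}$ with the closed form just derived.

import numpy as np

rng = np.random.default_rng(0)
A = np.array([[1.0, 0.5],
              [0.0, 2.0]])              # arbitrary mixing matrix
b = np.array([1.0, -1.0])               # arbitrary shift vector
Sigma = A @ A.T                         # covariance of X = AV + b
s = np.array([0.3, -0.7])               # arbitrary argument of the characteristic function

V = rng.standard_normal((100_000, 2))   # rows: iid N(0, I) vectors
X = V @ A.T + b                         # rows: iid samples of X = AV + b

mc = np.mean(np.exp(1j * (X @ s)))                   # Monte Carlo estimate of E e^{i<s,X>}
exact = np.exp(-0.5 * s @ Sigma @ s + 1j * (s @ b))  # closed form
print(mc, exact)                        # agree to roughly two or three decimals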

Question 1 (ii)

If $\Sigma$ is $n \times n$ and positive semidefinite, we can write $\Sigma = AA^\top$ where $A$ is $n \times p$ and $p$ is the rank of $\Sigma$. The $p$ columns of $A$ are linearly independent $n$-vectors and they span the range of $\Sigma$; so $R(\Sigma) = \mathrm{col}(A)$, the column space of $A$. Suppose now $V \sim N_p(0, I)$ and define $X = AV + \mu$. Then $X \sim N_n(\mu, \Sigma)$. By its construction, the support of the distribution of $X$ is indeed the affine subspace $\mu + R(\Sigma)$.
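
The construction is easy to carry out numerically. A minimal sketch (assuming numpy; the singular $\Sigma$ and the vector $\mu$ are made-up examples): factor $\Sigma$ through its eigendecomposition and check that all samples of $X$ lie in the affine subspace $\mu + R(\Sigma)$.

import numpy as np

rng = np.random.default_rng(1)
Sigma = np.array([[1.0, 2.0],
                  [2.0, 4.0]])          # rank 1, positive semidefinite
mu = np.array([3.0, 3.0])

# Factor Sigma = A A^T with A of size n x p, where p = rank(Sigma).
w, U = np.linalg.eigh(Sigma)
keep = w > 1e-12
A = U[:, keep] * np.sqrt(w[keep])       # here A is 2 x 1
p = A.shape[1]

V = rng.standard_normal((1000, p))
X = V @ A.T + mu                        # rows: samples from N(mu, Sigma)

# Each X - mu should lie in col(A) = R(Sigma): the residual after
# projecting onto col(A) should vanish (up to rounding error).
Q, _ = np.linalg.qr(A)
resid = (X - mu) - (X - mu) @ Q @ Q.T
print(np.abs(resid).max())              # about 1e-15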


Question 1 (iii)

Suppose now $\Sigma$ is non-singular. As in part (ii) we can construct the distribution of $X$ as $X = AV + \mu$ where $A$ is $n \times n$ and of full rank, $\Sigma = AA^\top$, and $V \sim N_n(0, I)$. By the formula for transformation of multivariate densities under smooth transformations, $X$ has a probability density on $\mathbb{R}^n$, which is equal to $|\det A|^{-1}(2\pi)^{-n/2} e^{-\frac12\|A^{-1}(x-\mu)\|^2}$. A simple rewriting (using $\det\Sigma = (\det A)^2$, so that $|\det A|^{-1}(2\pi)^{-n/2} = \det(2\pi\Sigma)^{-1/2}$, and $\|A^{-1}(x-\mu)\|^2 = (x-\mu)^\top \Sigma^{-1}(x-\mu)$) produces the required formula $\det(2\pi\Sigma)^{-1/2}\exp\left(-\frac12(x-\mu)^\top \Sigma^{-1}(x-\mu)\right)$.
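
A short sketch of this formula (assuming numpy and scipy are available; $\Sigma$, $\mu$ and the evaluation point $x$ are arbitrary): the hand-derived density agrees with scipy.stats.multivariate_normal.

import numpy as np
from scipy.stats import multivariate_normal

Sigma = np.array([[2.0, 0.3],
                  [0.3, 1.0]])          # arbitrary non-singular covariance
mu = np.array([0.5, -1.0])
x = np.array([1.0, 0.0])                # arbitrary evaluation point

d = x - mu
by_hand = np.linalg.det(2 * np.pi * Sigma) ** -0.5 \
          * np.exp(-0.5 * d @ np.linalg.solve(Sigma, d))
print(by_hand, multivariate_normal(mean=mu, cov=Sigma).pdf(x))  # identical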

Question 2: Marginal and conditional

By our construction, it follows that affine transformations of multivariate normal vectors are multivariate normal, and in order to determine their distribution it suffices to compute the mean vector and covariance matrix.

Now suppose $E_1 \sim N(0, \Sigma_{11})$ and $E_2 \sim N(0, \Sigma_{22} - \Sigma_{21}\Sigma_{11}^{-1}\Sigma_{12})$, where $E_1$ and $E_2$ are independent. I will verify later that $\Sigma_{22} - \Sigma_{21}\Sigma_{11}^{-1}\Sigma_{12}$ is positive semi-definite. It follows that $E_1$ and $E_2$ are jointly multivariate normally distributed, and so also is the vector $Y$ formed by concatenating $Y_1 = E_1$ and $Y_2 = E_2 + \Sigma_{21}\Sigma_{11}^{-1}E_1$. This random vector has mean $0$ and, by direct computation, covariance matrix $\Sigma$. The marginal distribution of $Y_1$ is $N(0, \Sigma_{11})$, while the conditional distribution of $Y_2$ given $Y_1 = y_1$ is $N(\Sigma_{21}\Sigma_{11}^{-1}y_1,\ \Sigma_{22} - \Sigma_{21}\Sigma_{11}^{-1}\Sigma_{12})$.
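
A quick numerical check of this construction (a sketch assuming numpy; the partitioned $\Sigma$ below is an arbitrary example): build $Y$ from independent $E_1$ and $E_2$ as above and confirm that its sample covariance matrix is approximately $\Sigma$.

import numpy as np

rng = np.random.default_rng(2)
# Arbitrary 3 x 3 covariance, partitioned with dim(Y1) = 1, dim(Y2) = 2.
S = np.array([[2.0, 0.6, 0.4],
              [0.6, 1.5, 0.2],
              [0.4, 0.2, 1.0]])
S11, S12 = S[:1, :1], S[:1, 1:]
S21, S22 = S[1:, :1], S[1:, 1:]
schur = S22 - S21 @ np.linalg.solve(S11, S12)   # Sigma_22 - Sigma_21 Sigma_11^{-1} Sigma_12

N = 200_000
E1 = rng.multivariate_normal(np.zeros(1), S11, size=N)    # E1 ~ N(0, Sigma_11)
E2 = rng.multivariate_normal(np.zeros(2), schur, size=N)  # independent of E1
Y1 = E1
Y2 = E2 + E1 @ np.linalg.solve(S11, S12)        # Y2 = E2 + Sigma_21 Sigma_11^{-1} E1
Y = np.hstack([Y1, Y2])
print(np.round(np.cov(Y.T), 2))                 # approximately S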
Finally, if $X \sim N(\mu, \Sigma)$ with $\Sigma$ partitioned as
$$\Sigma = \begin{pmatrix} \Sigma_{11} & \Sigma_{12} \\ \Sigma_{21} & \Sigma_{22} \end{pmatrix},$$
then $X - \mu$ has the same distribution as the just constructed $Y$. It follows that $X_1 \sim N(\mu_1, \Sigma_{11})$, whilst conditional on $X_1 = x_1$, $X_2 \sim N(\mu_2 + \Sigma_{21}\Sigma_{11}^{-1}(x_1 - \mu_1),\ \Sigma_{22} - \Sigma_{21}\Sigma_{11}^{-1}\Sigma_{12})$.
To return to the question of whether or not $\Sigma_{22} - \Sigma_{21}\Sigma_{11}^{-1}\Sigma_{12}$ is a covariance matrix, suppose $Y \sim N(0, \Sigma)$ and consider the problem of determining $A_{21}$ to minimize $E\|Y_2 - A_{21}Y_1\|^2$. The choice $A_{21} = \Sigma_{21}\Sigma_{11}^{-1}$ makes $Y_2 - A_{21}Y_1$ uncorrelated with $Y_1$. So replacing $A_{21}$ with anything else, if anything, only makes this expected sum of squares larger, and the given choice solves the minimization problem. Anyway, by direct computation again, the variance of $Y_2 - A_{21}Y_1$ with the optimal choice of $A_{21}$ is nothing else than $\Sigma_{22} - \Sigma_{21}\Sigma_{11}^{-1}\Sigma_{12}$, which consequently is indeed a covariance matrix (i.e., a positive semi-definite symmetric matrix).
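
This least-squares argument can be probed numerically as well (a sketch assuming numpy, reusing an arbitrary partitioned $\Sigma$): the Schur complement has nonnegative eigenvalues, and moving $A_{21}$ away from $\Sigma_{21}\Sigma_{11}^{-1}$ only increases $E\|Y_2 - A_{21}Y_1\|^2$.

import numpy as np

rng = np.random.default_rng(3)
S = np.array([[2.0, 0.6, 0.4],
              [0.6, 1.5, 0.2],
              [0.4, 0.2, 1.0]])                  # same arbitrary partitioned covariance
S11, S12 = S[:1, :1], S[:1, 1:]
S21, S22 = S[1:, :1], S[1:, 1:]

A_opt = S21 @ np.linalg.inv(S11)                 # Sigma_21 Sigma_11^{-1}
schur = S22 - A_opt @ S12
print(np.linalg.eigvalsh(schur))                 # all eigenvalues >= 0

Y = rng.multivariate_normal(np.zeros(3), S, size=200_000)
Y1, Y2 = Y[:, :1], Y[:, 1:]

def mean_sq(A21):
    # Monte Carlo estimate of E ||Y2 - A21 Y1||^2.
    R = Y2 - Y1 @ A21.T
    return np.mean(np.sum(R**2, axis=1))

print(mean_sq(A_opt), mean_sq(A_opt + 0.5))      # the optimal choice is smaller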

Concluding remarks

For those who are not familiar with characteristic functions, let me remark that by Fourier theory, very many functions $g$ can be expressed as the Fourier transform of another function $\hat g$, by the formula $g(x) = \int_{t=-\infty}^{\infty} \hat g(t) e^{itx}\,dt$. Therefore if we know $\varphi_X(t) = E e^{itX}$ for all $t$, and if $g$ has the expression just given, then $E g(X) = \int_x \int_t \hat g(t) e^{itx}\,dt\,P_X(dx)$. If we can exchange the two integrations, we have $E g(X) = \int \hat g(t)\varphi_X(t)\,dt$. Thus knowledge of the characteristic function of the distribution of $X$ entails knowledge of the expectation of a very large class of functions of $X$. If the indicators of intervals are in this class (or if they can be approximated arbitrarily well by functions in this class) then the probabilities of $X$ lying in any interval are determined, and thus the distribution of $X$ is determined.
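
To make this concrete, here is a small sketch (assuming numpy; the choices $g(x) = e^{-x^2/2}$ and $X \sim N(0,1)$ are illustrative) in which $E g(X)$ computed directly agrees with $\int \hat g(t)\varphi_X(t)\,dt$.

import numpy as np

# Take g(x) = exp(-x^2/2), whose Fourier representation
# g(x) = int ghat(t) e^{itx} dt has ghat(t) = exp(-t^2/2) / sqrt(2 pi),
# and let X ~ N(0, 1), so that phi_X(t) = exp(-t^2/2).
t = np.linspace(-10.0, 10.0, 20_001)
dt = t[1] - t[0]
ghat = np.exp(-t**2 / 2) / np.sqrt(2 * np.pi)
phi = np.exp(-t**2 / 2)
via_cf = np.sum(ghat * phi) * dt                   # int ghat(t) phi_X(t) dt

x = np.linspace(-10.0, 10.0, 20_001)
dx = x[1] - x[0]
density = np.exp(-x**2 / 2) / np.sqrt(2 * np.pi)   # N(0, 1) density
direct = np.sum(np.exp(-x**2 / 2) * density) * dx  # E g(X) directly
print(via_cf, direct, 1 / np.sqrt(2))              # all three are about 0.70711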
All this can be made precise, and the result is indeed that the characteristic function of the probability distribution of a real random variable characterises that distribution. The natural generalization to random vectors is also true.
Alternatively one can prove all the statements in the exercises without any use of characteristic functions. Probably one should then start by computing the probability density of the vector $X = AV + b$ in the case when $A$ is non-singular. In particular, if $A$ is an orthonormal matrix, $A^\top A = AA^\top = I$, one obtains that $X = AV$ has the same distribution as $V$.

In general, if some rows of $A$ are linear combinations of others, one can delete some elements of $X$, since these are affine functions of the others. This brings us to the case $X = AV + b$ where $A$ is an $n \times p$ matrix of rank $n$, so $n \le p$. If $n = p$ we are done. If on the other hand $n < p$ we should apply an orthonormal transformation $B$ to $V$, writing $X = AB^\top BV + b$, such that the last $p - n$ columns of $AB^\top$ are identically equal to zero, so that we can delete superfluous elements from $BV$ and superfluous columns from $AB^\top$. This is the case exactly when the $n$ rows of $A$ are orthogonal to the last $p - n$ rows of $B$. Thus the orthonormal matrix $B$ has to be such that its last $p - n$ rows span the null space of $A$. And that is easy to arrange, as sketched below.
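
A sketch of this reduction (assuming numpy; the dimensions and the random $A$ are arbitrary): the rows of the third factor of a full singular value decomposition $A = U\,\mathrm{diag}(s)\,V^\top$ (called Vt in the code, not to be confused with the Gaussian vector $V$) give exactly such a $B$, since its last $p - n$ rows span the null space of $A$.

import numpy as np

rng = np.random.default_rng(4)
n, p = 2, 4
A = rng.standard_normal((n, p))     # has rank n with probability one
b = np.zeros(n)

# Full SVD: A = U diag(s) Vt. The last p - n rows of Vt span the
# null space of A, so B = Vt is an orthonormal matrix of the kind required.
U, s, Vt = np.linalg.svd(A, full_matrices=True)
B = Vt

ABt = A @ B.T
print(np.round(ABt, 10))            # the last p - n columns are zero

# X = (A B^T)(B V) + b therefore depends only on the first n
# components of BV, which are again iid standard normal.
V = rng.standard_normal((5, p))     # five sample vectors, as rows
X_full = V @ A.T + b
X_reduced = (V @ B.T)[:, :n] @ ABt[:, :n].T + b
print(np.allclose(X_full, X_reduced))   # True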
