Unit 19
Structure
19.1 Introduction
     Objectives
19.2 Multivariate Normal Distribution and Random Sampling
19.3 Maximum Likelihood Estimation
19.4 Summary
19.5 Solutions/Answers
19.6 Practical Assignment
19.1 INTRODUCTION
We shall begin this unit by recalling the multivariate normal distribution. Many of our explanations use the representation of the columns of X as k points in n dimensions. In Sec. 19.2, we shall introduce the assumptions under which the observations constitute a random sample. We shall confine ourselves to random sampling, for which we shall assume that the traits or measurements considered on the different trials are independent and that the joint distribution of all the variables remains the same. In Sec. 19.3, we shall discuss maximum likelihood estimation.
Objectives
After studying this unit, you should be able to:
• describe the likelihood function and the role of maximum likelihood estimation in deriving the normally distributed estimators.
19.2 MULTIVARIATE NORMAL DISTRIBUTION AND RANDOM SAMPLING

Let $X' = (X_1, X_2, \ldots, X_p)$ represent a p-dimensional random variable. The mean vector and variance-covariance matrix are denoted by $\mu$ and $\Sigma$, respectively, and are given by

$\mu' = E(X') = (E(X_1), E(X_2), \ldots, E(X_p)) = (\mu_1, \mu_2, \ldots, \mu_p)$  (1)

$\Sigma = E[(X - \mu)(X - \mu)']$  (2)

$\;\;\; = \begin{bmatrix} \mathrm{var}(X_1) & \mathrm{cov}(X_1, X_2) & \cdots & \mathrm{cov}(X_1, X_p) \\ \mathrm{cov}(X_2, X_1) & \mathrm{var}(X_2) & \cdots & \mathrm{cov}(X_2, X_p) \\ \vdots & \vdots & & \vdots \\ \mathrm{cov}(X_p, X_1) & \mathrm{cov}(X_p, X_2) & \cdots & \mathrm{var}(X_p) \end{bmatrix}$
Distributions Associated with MVN

The density of a p-variate normal random variable $x$ is given by

$N_p(\mu, \Sigma) = (2\pi)^{-p/2} \, |\Sigma|^{-1/2} \exp\left[-\tfrac{1}{2}(x - \mu)'\, \Sigma^{-1} (x - \mu)\right]$  (3)

where the mean of $x$ is $\mu$, and the variance-covariance matrix of $x$ is $\Sigma$.
We now obtain the multivariate normal density by transforming the random vector $z = (z_1, z_2, \ldots, z_p)'$, where each $z_i$ has the $N(0, 1)$ distribution and the $z_i$'s are independent. Thus $E(z) = 0$ and $\mathrm{cov}(z) = I$. When random variables are independently distributed, their joint density function is the product of their individual densities. We can write

$f(z) = f_1(z_1)\, f_2(z_2) \cdots f_p(z_p) = \prod_{i=1}^{p} \frac{1}{\sqrt{2\pi}}\, e^{-z_i^2/2} = (2\pi)^{-p/2} \exp\left(-\tfrac{1}{2}\, z'z\right)$  (4)

since $f_i(z_i) = \frac{1}{\sqrt{2\pi}}\, e^{-z_i^2/2}$.
Here, we wish to obtain the multivariate normal density with arbitrary mean vector $\mu$ and covariance matrix $\Sigma$ which is positive definite. We define the transformation

$y = \Sigma^{1/2} z + \mu$  (5)

where $\Sigma^{1/2}$ is the (symmetric) square root matrix of $\Sigma$. The mean vector and covariance matrix of the transformed random vector $y$ are:

$E(y) = \Sigma^{1/2} E(z) + \mu = \mu, \qquad \mathrm{cov}(y) = \Sigma^{1/2}\, \mathrm{cov}(z)\, \Sigma^{1/2} = \Sigma^{1/2}\, I\, \Sigma^{1/2} = \Sigma$  (6)

Applying the change-of-variable formula to the density of $z$, with $z = \Sigma^{-1/2}(y - \mu)$ and Jacobian $|\Sigma|^{-1/2}$, gives

$f(y) = (2\pi)^{-p/2}\, |\Sigma|^{-1/2} \exp\left[-\tfrac{1}{2}(y - \mu)'\, \Sigma^{-1} (y - \mu)\right]$

which is the multivariate normal density function with mean vector $\mu$ and covariance matrix $\Sigma$.
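The construction above is easy to check numerically. The sketch below (with illustrative numbers) transforms independent $N(0,1)$ draws with a factor $A$ satisfying $AA' = \Sigma$; the text uses the symmetric square root $\Sigma^{1/2}$, while numpy conveniently supplies the Cholesky factor, and either choice gives $\mathrm{cov}(y) = \Sigma$:

```python
import numpy as np

rng = np.random.default_rng(0)

mu = np.array([1.0, 2.0])              # illustrative mean vector
sigma = np.array([[2.0, 0.6],
                  [0.6, 1.0]])         # illustrative positive definite Sigma

# Any A with A A' = Sigma works in y = A z + mu; the text uses the
# symmetric square root, here we take the Cholesky factor instead.
A = np.linalg.cholesky(sigma)

z = rng.standard_normal((100_000, 2))  # rows: independent N(0,1) vectors z
y = z @ A.T + mu                       # transformed sample y = A z + mu

y_mean = y.mean(axis=0)                # should be close to mu
y_cov = np.cov(y, rowvar=False)        # should be close to Sigma
```

With 100,000 draws, the sample mean and covariance of $y$ reproduce $\mu$ and $\Sigma$ to about two decimal places.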
Now, we shall discuss random sampling. For this, consider the data matrix X of order n × k, which can be plotted in an n-dimensional scatterplot by representing the columns of X as points. We can write

$X = [y_1 \;\; y_2 \;\; \cdots \;\; y_k]$

where $y_i$ is the i-th column of the data matrix. Let us consider $\bar{x}_i$ as the sample mean of the i-th variable, which is

$\bar{x}_i = \frac{1}{n} \sum_{h=1}^{n} x_{ih}, \quad i = 1, 2, \ldots, k.$

Then the deviation vector, denoted by $e_i$, is computed as

$e_i = y_i - \bar{x}_i \mathbf{1}$, where $\mathbf{1} = [1, 1, \ldots, 1]'$
$\;\;\; = [x_{i1} - \bar{x}_i, \; x_{i2} - \bar{x}_i, \; \ldots, \; x_{in} - \bar{x}_i]'.$

For two vectors $e_i$ and $e_j$, we have $e_i' e_j = n\, s_{ij}$, where $s_{ij}$ is the sample variance-covariance. Also, the sample correlation coefficient is

$r_{ij} = \frac{s_{ij}}{\sqrt{s_{ii}}\, \sqrt{s_{jj}}} = \cos(\theta_{ij}),$

where $\theta_{ij}$ is the angle between $e_i$ and $e_j$.
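These relations, $e_i' e_j = n\, s_{ij}$ and $r_{ij} = \cos(\theta_{ij})$, can be verified directly on a small data matrix; a minimal sketch with illustrative numbers:

```python
import numpy as np

# Illustrative n x k data matrix (n = 3 observations, k = 2 variables).
X = np.array([[5.0, 2.0],
              [0.0, 4.0],
              [4.0, 6.0]])
n, k = X.shape

xbar = X.mean(axis=0)            # sample mean of each variable
E = X - xbar                     # column i is the deviation vector e_i

S = (E.T @ E) / n                # e_i' e_j = n s_ij
d = np.sqrt(np.diag(S))
R = S / np.outer(d, d)           # r_ij = s_ij / (sqrt(s_ii) sqrt(s_jj))

# r_12 equals the cosine of the angle between e_1 and e_2.
e1, e2 = E[:, 0], E[:, 1]
cos_theta = e1 @ e2 / (np.linalg.norm(e1) * np.linalg.norm(e2))
```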
The probability density function of a bivariate normal random variable with values in $\mathbb{R}^2$ is

$f(x_1, x_2) = \frac{1}{2\pi\sigma_1\sigma_2\sqrt{1-\rho^2}} \exp\left[-\frac{1}{2(1-\rho^2)} \left[ \left(\frac{x_1-\mu_1}{\sigma_1}\right)^2 + \left(\frac{x_2-\mu_2}{\sigma_2}\right)^2 - 2\rho\left(\frac{x_1-\mu_1}{\sigma_1}\right)\left(\frac{x_2-\mu_2}{\sigma_2}\right) \right]\right]$
Example 2: Let $\Sigma$ be the given 3 × 3 covariance matrix of $X = (X_1, X_2, X_3)'$. Check whether $X_1$ and $X_2$ are independent or not. Also, check the independence of $(X_1, X_2)$ and $X_3$.

Since $X_1$ and $X_2$ have covariance $\sigma_{12} = 1$, they are not independent. However, partitioning $X$ and $\Sigma$ with $X_{(1)} = (X_1, X_2)'$ and $X_{(2)} = X_3$, we see that $(X_1, X_2)$ and $X_3$ have covariance matrix $\Sigma_{12} = 0$, a matrix of zeros, and hence they are independent.
Taking $a' = (1, 0, \ldots, 0)$, we have $a'X = X_1$, and it follows from Result 1 that $X_1$ is distributed as $N(\mu_1, \sigma_{11})$. More generally, the marginal distribution of any component $X_i$ of $X$ is $N(\mu_i, \sigma_{ii})$.

[Result 1: If $X$ is distributed as $N_p(\mu, \Sigma)$, then any linear combination of variables $a'X = a_1 X_1 + a_2 X_2 + \cdots + a_p X_p$ is distributed as $N(a'\mu, a'\Sigma a)$.]
The mean is

$\bar{x} = \begin{bmatrix} \frac{5+0+4}{3} \\[2pt] \frac{2+4+6}{3} \end{bmatrix} = \begin{bmatrix} 3 \\ 4 \end{bmatrix}$

Here, $y_1 - \bar{x}_1 \mathbf{1} = \begin{bmatrix} 5 \\ 0 \\ 4 \end{bmatrix} - \begin{bmatrix} 3 \\ 3 \\ 3 \end{bmatrix} = \begin{bmatrix} 2 \\ -3 \\ 1 \end{bmatrix}$

Therefore, $s_{11} = \frac{1}{3}\left(2^2 + (-3)^2 + 1^2\right) = \frac{14}{3}$.
EX) Find the sample correlation matrix R for the data matrix X =
E6) Find the mean and the covariance matrix of the random vector $X = (X_1, X_2)'$ with probability density function
In the following section, we shall define and discuss maximum likelihood estimation for multivariate normal distributions.

19.3 MAXIMUM LIKELIHOOD ESTIMATION
The value of a parameter, as a function of the data, that maximizes the likelihood
function is called a maximum likelihood estimator (MLE). In many important cases
MLEs can be found quite readily by differentiating the likelihood function, or its
logarithm, setting the result equal to zero, and solving. In other cases the maximum
occurs at a point where the derivative does not exist, and hence other procedures have
to be followed. Often numerical methods have to be used. Iterative proportional fitting
and the Newton-Raphson method are two such algorithms frequently used to generate
MLEs in this case.
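As a toy illustration of Newton-Raphson applied to a likelihood equation, the sketch below iterates on the score of $\theta = \sigma^2$ for a univariate normal sample; here the root is the known closed-form MLE $S/n$, so the iteration can be checked against it (the data values are arbitrary):

```python
import numpy as np

# Arbitrary sample; mu is estimated by the sample mean, and we solve the
# likelihood equation for theta = sigma^2:
#   score(theta) = d log L / d theta = -n/(2 theta) + S/(2 theta^2),
# where S = sum (x_i - xbar)^2. Its root is the closed-form MLE S/n.
x = np.array([4.2, 5.1, 3.8, 4.9, 5.6, 4.4])
n = len(x)
S = np.sum((x - x.mean()) ** 2)

def score(theta):
    return -n / (2 * theta) + S / (2 * theta ** 2)

def score_deriv(theta):
    return n / (2 * theta ** 2) - S / theta ** 3

# Newton-Raphson is only locally convergent, so the starting value must
# be reasonably close to the root.
theta = 0.5
for _ in range(50):
    step = score(theta) / score_deriv(theta)
    theta -= step
    if abs(step) < 1e-12:
        break
```

The iteration converges to $S/n$, the value the closed-form differentiation argument gives directly.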
From this result, known as the factorization theorem, it is clear that maximum likelihood estimates are often functions of sufficient statistics. For the multivariate normal family, the factorization theorem shows directly that the sample mean vector $\bar{x}$ and sample variance-covariance matrix $S$ are sufficient statistics for the population parameters $\mu$ and $\Sigma$, respectively.
Now, we shall discuss the following theorem.
Theorem 1: If $x_1, x_2, \ldots, x_n$ is a random sample from $N_p(\mu, \Sigma)$, then the maximum likelihood estimators of $\mu$ and $\Sigma$ are

$\hat{\mu} = \bar{x}, \qquad \hat{\Sigma} = \frac{1}{n} \sum_{i=1}^{n} (x_i - \bar{x})(x_i - \bar{x})'$

Proof: Since the $x_i$'s are independent (because they arise from a random sample), the likelihood function (joint density) is the product of the densities of the $x_i$'s:

$L(\mu, \Sigma) = \prod_{i=1}^{n} (2\pi)^{-p/2}\, |\Sigma|^{-1/2} \exp\left[-\tfrac{1}{2}(x_i - \mu)'\, \Sigma^{-1} (x_i - \mu)\right]$  (13)

We use the notation $L(\mu, \Sigma)$ because we consider $x_1, x_2, \ldots, x_n$ to be known or available from the sample. For the given values of $x_1, x_2, \ldots, x_n$, we seek the values of $\mu$ and $\Sigma$ that maximize Eqn. (13). We first express Eqn. (13) in a form that will facilitate finding the maximum.

The scalar quantity $(x_i - \mu)'\, \Sigma^{-1} (x_i - \mu)$ is equal to its trace. Hence, we have

$\sum_{i=1}^{n} (x_i - \mu)'\, \Sigma^{-1} (x_i - \mu) = \mathrm{tr}\left[\Sigma^{-1} \sum_{i=1}^{n} (x_i - \mu)(x_i - \mu)'\right]$  (14)

Now, by adding and subtracting $\bar{x}$ in the sum on the right side of Eqn. (14), we obtain

$\sum_{i=1}^{n} (x_i - \mu)(x_i - \mu)' = \sum_{i=1}^{n} (x_i - \bar{x})(x_i - \bar{x})' + n(\bar{x} - \mu)(\bar{x} - \mu)'$

since the cross-product terms vanish (see E10).
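A minimal numerical sketch of the estimators stated in Theorem 1 (the data are arbitrary; note the divisor $n$ rather than $n - 1$):

```python
import numpy as np

# n = 4 observations on p = 2 variables (arbitrary illustrative data).
X = np.array([[2.0, 1.0],
              [3.0, 4.0],
              [4.0, 0.0],
              [5.0, 3.0]])
n, p = X.shape

mu_hat = X.mean(axis=0)                    # MLE of mu: the sample mean vector
centered = X - mu_hat
sigma_hat = (centered.T @ centered) / n    # (1/n) sum (x_i - xbar)(x_i - xbar)'

# np.cov uses divisor n - 1 by default; bias=True reproduces the MLE.
assert np.allclose(sigma_hat, np.cov(X, rowvar=False, bias=True))
```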
Since $Q - Q_0 = Q_0 R$, we obtain

$P(Q_0 R \le -z_\alpha) = 1 - \alpha$  (26)
Note that the definition of VaR does not require normality. However, the calculation of VaR becomes considerably simpler if we assume that $(x_1, \ldots, x_n)$ follows an n-variate normal distribution. Then, the rate of return $R$ in Eqn. (24) is normally distributed with mean

$E(R) = \sum_{i=1}^{n} c_i r_i$

and variance

$V(R) = \sum_{i=1}^{n} \sum_{j=1}^{n} c_i\, \sigma_{ij}\, c_j.$
The value $z_\alpha$ satisfying Eqn. (26) can be obtained using a table of the standard normal distribution. In many standard textbooks of statistics, a table of the survival probability

$L(x) = \int_x^{\infty} \frac{1}{\sqrt{2\pi}}\, e^{-y^2/2}\, dy, \quad x > 0,$

is given. Then, using the standardization $Y = \frac{R - \mu}{\sigma}$ and the symmetry of the density of $Y$ about 0, we can obtain the value $z_\alpha$ with ease. Namely, letting $r_\alpha = -z_\alpha / Q_0$, it follows from Eqn. (26) that

$z_\alpha = Q_0 (x_\alpha \sigma - \mu)$  (27)

The value $z_\alpha$ in Eqn. (27) is the VaR with confidence level $100\alpha\%$. The value $x_\alpha$ is the $100(1 - \alpha)$ percentile of the standard normal distribution given in Table 1.
For example, if $\mu = 0$, then the 99% VaR is given by $2.326\, \sigma Q_0$.

Note: Since the risk horizon is very short (e.g., one day or one week) in market risk management, the mean rate of return $\mu$ is often set to zero. In this case, the VaR with confidence level $100\alpha\%$ is given by $x_\alpha \sigma Q_0$.
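A sketch of the resulting calculation with $\mu$ set to zero, as in the Note. The weights $c$, the covariance matrix of asset returns, the initial value $Q_0$, and the 99% level are all illustrative assumptions, not values from the text:

```python
import numpy as np

c = np.array([0.5, 0.3, 0.2])               # portfolio weights (assumed)
cov = np.array([[0.0400, 0.0060, 0.0040],
                [0.0060, 0.0225, 0.0030],
                [0.0040, 0.0030, 0.0100]])  # covariance of returns (assumed)
Q0 = 1_000_000.0                            # initial portfolio value (assumed)

var_R = c @ cov @ c                         # V(R) = sum_i sum_j c_i sigma_ij c_j
sigma = np.sqrt(var_R)                      # standard deviation of R

x_alpha = 2.326                             # 99th percentile of N(0, 1)
VaR_99 = x_alpha * sigma * Q0               # 99% VaR with mu = 0
```

Here $V(R) \approx 0.0154$, so the one-period 99% VaR is roughly $0.29\, Q_0$.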
Let $\mu$ be similarly partitioned as $\mu = (\mu_{(1)}', \mu_{(2)}')'$, and let $\Sigma$ be partitioned as
E10) Show that

$\sum_{j=1}^{n} (x_j - \bar{x})(\bar{x} - \mu)'$ and $\sum_{j=1}^{n} (\bar{x} - \mu)(x_j - \bar{x})'$

are both $p \times p$ matrices of zeros. Here, $x_j' = [x_{j1}, x_{j2}, \ldots, x_{jp}]$, $j = 1, 2, \ldots, n$, and $\bar{x} = \frac{1}{n} \sum_{j=1}^{n} x_j$.
19.4 SUMMARY
In this unit, we have covered the following points.
19.5 SOLUTIONS/ANSWERS

Note that, with this assignment, $X$, $\mu$, and $\Sigma$ can respectively be rearranged and partitioned accordingly. It is clear from this example that the normal distribution for any subset can be expressed by simply selecting the appropriate means and covariances from the original $\mu$ and $\Sigma$.
E3) Since $\Sigma$ is a diagonal matrix of order p, let us prove the result for p = 2. For p = 2,

$\Sigma = \begin{bmatrix} \sigma_1^2 & 0 \\ 0 & \sigma_2^2 \end{bmatrix}$

and we find that

$f(x_1, x_2) = f(x_1) \cdot f(x_2)$

i.e., if the random variables $X_1$ and $X_2$ are uncorrelated, so that $\rho_{12} = 0$, then the joint density is the product of two univariate normal densities and $X_1$, $X_2$ are independent.
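The claim in E3) can be checked numerically: with $\rho_{12} = 0$ the bivariate density equals the product of the two univariate densities at every point. A small sketch with arbitrary parameter values:

```python
import numpy as np

mu1, mu2 = 1.0, -2.0      # arbitrary means
s1, s2 = 1.5, 0.7         # arbitrary standard deviations

def univariate(x, mu, s):
    # N(mu, s^2) density
    return np.exp(-0.5 * ((x - mu) / s) ** 2) / (s * np.sqrt(2 * np.pi))

def bivariate(x1, x2, rho):
    # bivariate normal density; rho = 0 corresponds to sigma_12 = 0
    z1, z2 = (x1 - mu1) / s1, (x2 - mu2) / s2
    q = (z1 ** 2 + z2 ** 2 - 2 * rho * z1 * z2) / (1 - rho ** 2)
    return np.exp(-0.5 * q) / (2 * np.pi * s1 * s2 * np.sqrt(1 - rho ** 2))

points = [(0.0, 0.0), (1.0, -2.0), (2.5, -1.0)]
factorizes = all(
    np.isclose(bivariate(x1, x2, rho=0.0),
               univariate(x1, mu1, s1) * univariate(x2, mu2, s2))
    for x1, x2 in points
)
```

With a nonzero $\rho$, the equality fails, which is the other half of the statement.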
E5) a) Since $X_1$ and $X_2$ have covariance $\sigma_{12} = -2$, they are not independent.

b) Now, here $(X_1, X_3)$ and $X_2$ have covariance matrix $\Sigma_{12} =$
It follows from the fact that the trace of a sum of matrices is equal to the sum of the traces of the matrices.

(since $\frac{\partial}{\partial \mu}\, \frac{n}{2} (\bar{x} - \mu)'\, \Sigma^{-1} (\bar{x} - \mu) = -n (\bar{x} - \mu)'\, \Sigma^{-1}$, which vanishes at $\mu = \bar{x}$)
$= (x_{(1)} - \mu_{(1)})'\, \Sigma_{11}^{-1} (x_{(1)} - \mu_{(1)}) + (x_{(2)} - \mu_{(2)})'\, \Sigma_{22}^{-1} (x_{(2)} - \mu_{(2)})$

i.e., $x_{(1)}$ and $x_{(2)}$ are independently normally distributed with means $\mu_{(1)}$ and $\mu_{(2)}$ respectively and covariance matrices $\Sigma_{11}$ and $\Sigma_{22}$ respectively.
n
E10) Letustake, x ( x j - Z ) ( Z - p ) '
F,
Distributions Associated
with MVN
= n ~ ~ - n ~ ~ p - n ~ ~ + n ~ ~ = 0
Similarly, other can be solved.
19.6 PRACTICAL ASSIGNMENT

1. Write a program in 'C' language to find the ML estimates of the mean ($\mu$) and variance ($\sigma^2$).