7. Jacobians of Matrix Transformations
[This Chapter is based on the lectures of Professor A.M. Mathai of McGill University,
Canada (Director of the SERC Schools).]
7.0. Introduction
Real scalar functions of matrix argument, when the matrices are real, will be dealt with. It is difficult to develop a theory of functions of matrix argument for general matrices. Let X = (x_{ij}), i = 1, ..., m, j = 1, ..., n, be an m × n matrix where the x_{ij}'s are real elements. It is assumed that the readers have a basic knowledge of matrices and determinants. The following standard notations will be used here. A prime denotes the transpose, X' = transpose of X; |(·)| denotes the determinant of the square m × m matrix (·). The same notation will also be used for the absolute value. tr(X) denotes the trace of a square matrix X: tr(X) = sum of the eigenvalues of X = sum of the leading diagonal elements in X. A real symmetric positive definite X (definiteness is defined only for symmetric matrices when real) will be denoted by X = X' > 0. Then 0 < X = X' < I means X = X' > 0 and I − X > 0. Further, dX will denote the wedge product, or skew symmetric product, of the differentials dx_{ij}. That is, when X = (x_{ij}) is an m × n matrix of functionally independent real variables,
$$dX = dx_{11} \wedge dx_{12} \wedge \cdots \wedge dx_{1n} \wedge dx_{21} \wedge \cdots \wedge dx_{mn},$$
and when X = X' is p × p the product is taken over the p(p + 1)/2 distinct elements only.
Proof 7.1.1.
$$Y = \begin{bmatrix} y_1\\ \vdots\\ y_p\end{bmatrix} = AX = \begin{bmatrix} a_{11} & a_{12} & \cdots & a_{1p}\\ a_{21} & a_{22} & \cdots & a_{2p}\\ \vdots & \vdots & \ddots & \vdots\\ a_{p1} & a_{p2} & \cdots & a_{pp}\end{bmatrix} \begin{bmatrix} x_1\\ x_2\\ \vdots\\ x_p\end{bmatrix} \Rightarrow y_i = a_{i1}x_1 + \cdots + a_{ip}x_p,\ i = 1, \ldots, p.$$
Then
$$\frac{\partial y_i}{\partial x_j} = a_{ij} \Rightarrow \left(\frac{\partial y_i}{\partial x_j}\right) = (a_{ij}) = A \Rightarrow J = |A|.$$
Hence,
$$dY = |A|\, dX.$$
That is, if Y = AX where Y and X are p × 1, A is p × p, |A| ≠ 0 and A is a constant matrix, then dY = |A| dX.
Solution 7.1.1. From the above equations, by taking the differentials, we have dY = |A| dX, where
$$A = \begin{bmatrix} 1 & 1 & 1\\ 0 & 3 & 1\\ 0 & 0 & 5\end{bmatrix},\quad |A| = (1)(3)(5) = 15,$$
so that dy₁ ∧ dy₂ ∧ dy₃ = 15 dx₁ ∧ dx₂ ∧ dx₃.
This verifies Theorem 7.1.1 also. This theorem is the standard result seen in elementary textbooks. Now we will investigate more elaborate linear transformations.
Proof 7.1.2. Let Y = AX = (AX^{(1)}, AX^{(2)}, ..., AX^{(n)}) where X^{(1)}, ..., X^{(n)} are the columns of X. Then the Jacobian matrix for X going to Y is of the form
$$\begin{bmatrix} A & O & \cdots & O\\ O & A & \cdots & O\\ \vdots & \vdots & \ddots & \vdots\\ O & O & \cdots & A\end{bmatrix}, \qquad J = |A|^n \tag{7.1.3}$$
where O denotes a null matrix and J is the Jacobian for the transformation of X going to Y, that is, dY = |A|^n dX.
Proof 7.1.3.
$$Y = XB = \begin{bmatrix} X^{(1)}B\\ \vdots\\ X^{(m)}B\end{bmatrix}$$
where X^{(1)}, ..., X^{(m)} are the rows of X. The Jacobian matrix is of the form
$$\begin{bmatrix} B & O & \cdots & O\\ O & B & \cdots & O\\ \vdots & \vdots & \ddots & \vdots\\ O & O & \cdots & B\end{bmatrix} \Rightarrow J = |B|^m \Rightarrow dY = |B|^m\, dX.$$
Then combining the above two theorems we have the Jacobian for the most general
linear transformation.
Proof 7.1.4. For proving this result first consider the transformation Z = AX and
then the transformation Y = ZB, and make use of Theorems 7.1.2 and 7.1.3.
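As a quick aside, here is a minimal numerical sketch of this combined result (Y = AXB with X of size p × n giving dY = |A|^n |B|^p dX). The sizes, the random seed and the use of numpy are illustrative assumptions, not part of the original text; the check rests on the standard identity vec(AXB) = (B' ⊗ A) vec(X).

```python
import numpy as np

# Numerical sketch: for Y = AXB with X (p x n), the Jacobian matrix of
# vec(X) -> vec(Y) is B' (kron) A, whose determinant is |A|^n |B|^p.
p, n = 3, 4
rng = np.random.default_rng(0)
A = rng.standard_normal((p, p))
B = rng.standard_normal((n, n))

J = np.kron(B.T, A)                          # Jacobian of vec(X) -> vec(AXB)
lhs = abs(np.linalg.det(J))
rhs = abs(np.linalg.det(A))**n * abs(np.linalg.det(B))**p
assert np.isclose(lhs, rhs)                  # |J| = |A|^n |B|^p
```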
In Theorems 7.1.2 to 7.1.4 the matrix X was rectangular. Now we will examine a situation where the matrix X is square and symmetric. If X is p × p and symmetric then there are only 1 + 2 + ⋯ + p = p(p + 1)/2 functionally independent elements in X because here x_{ij} = x_{ji} for all i and j. Let Y = Y' = AXA', X = X', |A| ≠ 0. Then we can obtain the following result:
Proof 7.1.5. This result can be proved by using the fact that a nonsingular matrix such as A can be written as a product of elementary matrices in the form
$$A = E_1 E_2 \cdots E_k$$
where E₁, ..., E_k are elementary matrices. Then
$$Y = AXA' = E_1 E_2 \cdots E_k\, X\, E_k' \cdots E_2' E_1'$$
where E_j' is the transpose of E_j. Let Y_k = E_k X E_k', Y_{k−1} = E_{k−1} Y_k E_{k−1}', and so on, and finally Y = Y₁ = E₁Y₂E₁'. Evaluate the Jacobians in these transformations to obtain the result, observing the following facts. If, for example, the elementary matrix E_k is formed by multiplying the i-th row of an identity matrix by the nonzero scalar c, then taking the wedge product of differentials we have dY_k = c^{p+1} dX. Similarly, if the elementary matrix E_{k−1} is formed by adding the i-th row of an identity matrix to its j-th row, then the determinant remains 1 and hence dY_{k−1} = dY_k. Since these are the only two types of basic elementary matrices, systematic evaluation of the successive Jacobians gives the final result as |A|^{p+1}.
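The result dY = |A|^{p+1} dX can also be checked numerically. The sketch below is an illustration under assumed sizes and seed: it builds the linear map X ↦ AXA' in the p(p+1)/2 symmetric coordinates by hand and compares its determinant with |A|^{p+1}.

```python
import numpy as np

# Numerical sketch of Theorem 7.1.5: for symmetric X (p(p+1)/2 free
# elements) and Y = AXA', the Jacobian determinant equals |A|^(p+1).
p = 3
rng = np.random.default_rng(1)
A = rng.standard_normal((p, p))

idx = [(i, j) for i in range(p) for j in range(i + 1)]   # lower-triangle order

def vech(S):
    return np.array([S[i, j] for i, j in idx])

def unvech(v):
    S = np.zeros((p, p))
    for k, (i, j) in enumerate(idx):
        S[i, j] = S[j, i] = v[k]
    return S

m = len(idx)
J = np.zeros((m, m))
for k in range(m):                     # columns of the (linear) Jacobian
    e = np.zeros(m); e[k] = 1.0
    J[:, k] = vech(A @ unvech(e) @ A.T)

assert np.isclose(abs(np.linalg.det(J)), abs(np.linalg.det(A))**(p + 1))
```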
Note 7.1.1. From the above theorems the following properties are evident: if X is a p × q matrix of functionally independent real variables, c is a scalar quantity and B is a p × q constant matrix, then
$$Y = cX \Rightarrow dY = c^{pq}\, dX \tag{7.1.7}$$
$$Y = cX + B \Rightarrow dY = c^{pq}\, dX. \tag{7.1.8}$$
Note 7.1.3. For any p × p lower triangular (or upper triangular) matrix X of p(p + 1)/2 functionally independent real variables, Y = X + X' is a symmetric matrix, where X' denotes the transpose of X = (x_{ij}). Observing that the diagonal elements in Y = (y_{ij}) are multiplied by 2, that is, y_{ii} = 2x_{ii}, i = 1, ..., p, we have
$$Y = X + X' \Rightarrow dY = 2^p\, dX. \tag{7.1.10}$$
Solution 7.1.2. Since A and B are positive definite matrices we can write A = A₁A₁' and B = B₁B₁' where A₁ and B₁ are nonsingular matrices, that is, |A₁| ≠ 0, |B₁| ≠ 0. Also we know that for any two matrices P and Q, tr(PQ) = tr(QP) as long as PQ and QP are defined; PQ need not be equal to QP. Then
$$\mathrm{tr}\left[A(X-M)B(X-M)'\right] = \mathrm{tr}(YY')$$
where
$$Y = A_1'(X - M)B_1.$$
But from Theorem 7.1.4,
$$dY = |A_1|^q\, |B_1|^p\, dX = |A|^{q/2}\, |B|^{p/2}\, dX.$$
But
$$\int_{-\infty}^{\infty} e^{-u^2}\, du = \sqrt{\pi}$$
and therefore
$$1 = c\, \pi^{pq/2}\, |A|^{-q/2}\, |B|^{-p/2} \Rightarrow c = \left(|A|^{-q/2}\, |B|^{-p/2}\, \pi^{pq/2}\right)^{-1} = |A|^{q/2}\, |B|^{p/2}\, \pi^{-pq/2}.$$
Such a Y is symmetric, with p(p + 1)/2 distinct variables, whereas in X there are p² variables, and hence this is not a one-to-one transformation.
Exercises 7.1.
7.1.1. If X and A are p × p lower triangular matrices where A = (a_{ij}) is a constant matrix with a_{jj} > 0, j = 1, ..., p, X = (x_{ij}) and the x_{ij}'s, i ≥ j, are functionally independent real variables, then show that
$$Y = XA \Rightarrow dY = \prod_{j=1}^{p} a_{jj}^{\,p-j+1}\, dX,$$
$$Y = AX \Rightarrow dY = \prod_{j=1}^{p} a_{jj}^{\,j}\, dX,$$
and
$$Y = aX \Rightarrow dY = a^{p(p+1)/2}\, dX. \tag{7.1.11}$$
and
$$Z = A'X'B' \Rightarrow dZ = \prod_{j=1}^{p} b_{jj}^{\,j}\, a_{jj}^{\,p+1-j}\, dX. \tag{7.1.13}$$
and
$$Y = AX + X'A' \Rightarrow dY = 2^p \prod_{j=1}^{p} a_{jj}^{\,j}\, dX. \tag{7.1.16}$$
and
$$Y = AX' + XA' \Rightarrow dY = 2^p \prod_{j=1}^{p} a_{jj}^{\,p+1-j}\, dX. \tag{7.1.18}$$
Solution 7.2.1.
$$X = (x_{ij}) = \begin{bmatrix} x_{11} & x_{12} & \cdots & x_{1p}\\ \vdots & \vdots & \ddots & \vdots\\ x_{p1} & x_{p2} & \cdots & x_{pp}\end{bmatrix}$$
with x_{ij} = x_{ji} for all i and j, X = X' > 0. When X is positive definite, that is, X > 0, then x_{jj} > 0, j = 1, ..., p, also. Writing X = TT' and equating the elements, x₁₁ = t₁₁², x₁₂ = t₁₁t₂₁, x₂₂ = t₂₁² + t₂₂²,
and so on. Taking the x_{ij}'s in the order x₁₁, x₁₂, ..., x_{1p}, x₂₂, ..., x_{2p}, ..., x_{pp} and the t_{ij}'s in the order t₁₁, t₂₁, t₂₂, ..., t_{pp}, we have a triangular Jacobian matrix with the following diagonal elements: t₁₁ is repeated p times, t₂₂ is repeated p − 1 times, and so on, and finally t_{pp} appears once. The number 2 appears a total of p times. Hence the determinant is the product of the diagonal elements, giving
$$2^p\, t_{11}^{\,p}\, t_{22}^{\,p-1} \cdots t_{pp}.$$
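Before using this Jacobian, here is a small numerical sketch of it, i.e. of dX = 2^p (∏_j t_jj^{p+1−j}) dT for X = TT' with T lower triangular and t_jj > 0. The size and seed are illustrative assumptions.

```python
import numpy as np

# Differentiate the map T -> TT' in the p(p+1)/2 symmetric coordinates and
# compare the Jacobian determinant with 2^p * prod_j t_jj^(p+1-j).
p = 3
rng = np.random.default_rng(2)
T = np.tril(rng.standard_normal((p, p)))
np.fill_diagonal(T, np.abs(np.diag(T)) + 1.0)    # ensure t_jj > 0

idx = [(i, j) for i in range(p) for j in range(i + 1)]
m = len(idx)

def vech(S):
    return np.array([S[i, j] for i, j in idx])

J = np.zeros((m, m))
for k, (i, j) in enumerate(idx):
    E = np.zeros((p, p)); E[i, j] = 1.0           # perturb one entry of T
    J[:, k] = vech(E @ T.T + T @ E.T)             # derivative of TT'

expected = 2**p * np.prod([T[j, j]**(p - j) for j in range(p)])
assert np.isclose(abs(np.linalg.det(J)), expected)
```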
Example 7.2.2. If X is p × p real symmetric positive definite then evaluate the following integral, which we will call the real matrix-variate gamma, denoted by Γ_p(α):
$$\Gamma_p(\alpha) = \int_{X} |X|^{\alpha - \frac{p+1}{2}}\, e^{-\mathrm{tr}(X)}\, dX \tag{7.2.2}$$
for ℜ(α) > (p − 1)/2.
Since X = TT',
$$|X| = \prod_{j=1}^{p} t_{jj}^2 \quad\text{and}\quad \mathrm{tr}(X) = t_{11}^2 + (t_{21}^2 + t_{22}^2) + \cdots + (t_{p1}^2 + \cdots + t_{pp}^2).$$
Then substituting these, the integral over X reduces to the following:
$$\int_X |X|^{\alpha-\frac{p+1}{2}}\, e^{-\mathrm{tr}(X)}\, dX = \prod_{j=1}^{p} \left[2 \int_0^{\infty} t_{jj}^{2\alpha - j}\, e^{-t_{jj}^2}\, dt_{jj}\right] \prod_{i>j} \int_{-\infty}^{\infty} e^{-t_{ij}^2}\, dt_{ij}.$$
Observe that
$$2 \int_0^{\infty} t_{jj}^{2\alpha - j}\, e^{-t_{jj}^2}\, dt_{jj} = \Gamma\!\left(\alpha - \frac{j-1}{2}\right),\quad \Re(\alpha) > \frac{j-1}{2},$$
$$\int_{-\infty}^{\infty} e^{-t_{ij}^2}\, dt_{ij} = \sqrt{\pi},$$
and there are p(p − 1)/2 such Gaussian integrals, so that
$$\Gamma_p(\alpha) = \pi^{p(p-1)/4}\, \Gamma(\alpha)\, \Gamma\!\left(\alpha - \frac{1}{2}\right) \cdots \Gamma\!\left(\alpha - \frac{p-1}{2}\right),$$
with the condition ℜ(α − (j − 1)/2) > 0, j = 1, ..., p, that is, ℜ(α) > (p − 1)/2. This establishes the result.
Notation 7.2.1. Γ_p(α): the real matrix-variate gamma function.
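For later computations it is convenient to have the product form above in code. The helper below is a minimal sketch (function name and real-valued α are assumptions of this illustration); Γ₁(α) must reduce to the ordinary gamma function.

```python
import math

# Gamma_p(alpha) = pi^{p(p-1)/4} * prod_{j=1}^p Gamma(alpha - (j-1)/2),
# valid for alpha > (p-1)/2 (real alpha assumed for simplicity).
def matrix_variate_gamma(alpha: float, p: int) -> float:
    if alpha <= (p - 1) / 2:
        raise ValueError("need alpha > (p-1)/2")
    out = math.pi ** (p * (p - 1) / 4)
    for j in range(1, p + 1):
        out *= math.gamma(alpha - (j - 1) / 2)
    return out

assert math.isclose(matrix_variate_gamma(2.5, 1), math.gamma(2.5))
```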
Proof 7.2.1. This can be proved by observing the following: when X is nonsingular, XX^{−1} = I where I denotes the identity matrix. Taking differentials on both sides we have
$$(dX)X^{-1} + X(dX^{-1}) = O \Rightarrow (dX^{-1}) = -X^{-1}(dX)X^{-1} \tag{7.2.5}$$
where (dX) means the matrix of differentials. Now we can apply Theorem 7.1.4, treating X^{−1} as a constant matrix because it is free of the differentials, since we are taking only the wedge product of the differentials on the left side.
Note 7.2.1. If the square matrix X is nonsingular and skew symmetric then, proceeding as above, it follows that
$$Y = X^{-1},\ |X| \ne 0,\ X' = -X \Rightarrow dY = |X|^{-(p-1)}\, dX. \tag{7.2.6}$$
When the rows of the lower triangular matrix T are of unit length,
$$X = TT', \quad \text{with } \sum_{j=1}^{i} t_{ij}^2 = 1,\ i = 1, \ldots, p,$$
then
$$dX = \prod_{j=2}^{p} t_{jj}^{\,p-j}\, dT, \tag{7.2.8}$$
and when the columns of T are of unit length,
$$X = T'T, \quad \text{with } \sum_{i=j}^{p} t_{ij}^2 = 1,\ j = 1, \ldots, p,$$
then
$$dX = \prod_{j=1}^{p-1} t_{jj}^{\,j-1}\, dT. \tag{7.2.9}$$
Example 7.2.3. Let R = (r_{ij}) be a p × p real symmetric positive definite matrix such that r_{jj} = 1, j = 1, ..., p, and −1 < r_{ij} = r_{ji} < 1, i ≠ j. (This is known as the correlation matrix in statistical theory.) Then show that
$$f(R) = \frac{[\Gamma(\alpha)]^p}{\Gamma_p(\alpha)}\, |R|^{\alpha - \frac{p+1}{2}}$$
is a density function for ℜ(α) > (p − 1)/2.
Solution 7.2.3. Since R is positive definite, f(R) ≥ 0 for all R. Let us check the total integral. Let T be a lower triangular matrix as defined in the above theorem and let R = TT'. Then
$$\int_R |R|^{\alpha - \frac{p+1}{2}}\, dR = \int_T \prod_{j=2}^{p} (t_{jj})^{2\alpha - j - 1}\, dT.$$
Observe that
$$t_{jj}^2 = 1 - t_{j1}^2 - \cdots - t_{j,j-1}^2$$
where −1 < t_{ij} < 1, i > j. Then let
$$B = \int_R |R|^{\alpha - \frac{p+1}{2}}\, dR = \prod_{j=2}^{p} \delta_j$$
where
$$\delta_j = \int_{w_j} \left(1 - t_{j1}^2 - \cdots - t_{j,j-1}^2\right)^{\alpha - \frac{j+1}{2}}\, dt_{j1} \cdots dt_{j,j-1}$$
with w_j = {(t_{j1}, ..., t_{j,j−1}) : −1 < t_{jk} < 1, k = 1, ..., j − 1, ∑_{k=1}^{j−1} t_{jk}² < 1}. Evaluating the integral with the help of the Dirichlet integral of Chapter 1 and then taking the product, we have the final result, showing that f(R) is a density.
Exercises 7.2.
7.2.1. For a p × p lower triangular matrix T = (t_{ij}) of functionally independent real variables with positive diagonal elements, show that
$$X = T'T \Rightarrow dX = 2^p \prod_{j=1}^{p} t_{jj}^{\,j}\, dT. \tag{7.2.10}$$
7.2.4. Let X = TT' where X and T are p × p lower triangular or upper triangular matrices of functionally independent real variables with positive diagonal elements. Then show that
7.2.5. For real symmetric positive definite matrices X and Y show that
$$\lim_{t\to\infty} \left|I + \frac{XY}{t}\right|^{t} = e^{\mathrm{tr}(XY)} = \lim_{t\to\infty} \left|I - \frac{XY}{t}\right|^{-t}. \tag{7.2.14}$$
7.2.6. Let X = (x_{ij}), W = (w_{ij}) be lower triangular p × p matrices of distinct real variables with x_{jj} > 0, w_{jj} > 0, j = 1, ..., p, and ∑_{k=1}^{j} w_{jk}² = 1, j = 1, ..., p. Let D = diag(λ₁, ..., λ_p), λ_j > 0, j = 1, ..., p, real and distinct, where diag(λ₁, ..., λ_p) denotes a diagonal matrix with diagonal elements λ₁, ..., λ_p. Show that
$$X = DW \Rightarrow dX = \prod_{j=1}^{p} \lambda_j^{\,j-1}\, w_{jj}^{-1}\, dD \wedge dW. \tag{7.2.15}$$
and
$$Y = (A - X)(A + X)^{-1} \Rightarrow dY = 2^{p(p+1)/2}\, |A + X|^{-(p+1)} \prod_{j=1}^{p} |a_{jj}|^{\,j}\, dX. \tag{7.2.22}$$
7.3. Transformations Involving Orthonormal Matrices

Proof 7.3.1. For proving the main part of the theorem, take the differentials on both sides of X = TU₁' and then take the wedge product of the differentials systematically. Since it involves many steps, the proof of the main part is not given here. The second part can be proved without much difficulty. Consider the p × n, n ≥ p, real matrix X. Observe that
$$\mathrm{tr}(XX') = \sum_{i,j} x_{ij}^2,$$
that is, the sum of squares of all elements in X = (x_{ij}), and there are np terms in ∑_{ij} x_{ij}². Now consider the integral
$$\int_X e^{-\mathrm{tr}(XX')}\, dX = \prod_{ij} \int_{-\infty}^{\infty} e^{-x_{ij}^2}\, dx_{ij} = \pi^{np/2}$$
since each integral over x_{ij} gives √π. Now let us evaluate the same integral by using Theorem 7.3.1. Consider the same transformation as in Theorem 7.3.1, X = TU₁'. Then
$$\pi^{np/2} = \int_X e^{-\mathrm{tr}(XX')}\, dX = \int_T \prod_{j=1}^{p} |t_{jj}|^{\,n-j}\, e^{-\sum_{i\ge j} t_{ij}^2}\, dT \int_{V_{p,n}} (dU_1).$$
But for 0 < t_{jj} < ∞, −∞ < t_{ij} < ∞, i > j, and U₁ an unrestricted semiorthonormal matrix, we have
$$\int_T \prod_{j=1}^{p} |t_{jj}|^{\,n-j}\, e^{-\sum_{i\ge j} t_{ij}^2}\, dT = 2^{-p}\, \Gamma_p\!\left(\frac{n}{2}\right), \tag{7.3.3}$$
$$\int_0^{\infty} |t_{jj}|^{\,n-j}\, e^{-t_{jj}^2}\, dt_{jj} = 2^{-1}\, \Gamma\!\left(\frac{n-j+1}{2}\right),\quad n > j - 1, \tag{7.3.4}$$
and each of the p(p − 1)/2 integrals
$$\int_{-\infty}^{\infty} e^{-t_{ij}^2}\, dt_{ij} = \sqrt{\pi},\quad i > j. \tag{7.3.5}$$
Now, substituting these, the result in (7.3.2) is established.
Remark 7.3.1. For the transformation X = TU₁' to be unique, either one can take T with the diagonal elements t_{jj} > 0, j = 1, ..., p, and U₁ an unrestricted semiorthonormal matrix, or −∞ < t_{jj} < ∞ and U₁ a unique semiorthonormal matrix.
From the outline of the proof of Theorem 7.3.1 we have the following result:
$$\int_{V_{p,n}} (dU_1) = \frac{2^p\, \pi^{np/2}}{\Gamma_p\!\left(\frac{n}{2}\right)} \tag{7.3.6}$$
where V_{p,n} is the Stiefel manifold, or the set of semiorthonormal matrices of the type U₁, n ≥ p, such that U₁'U₁ = I_p, where I_p is an identity matrix of order p. For n = p the Stiefel manifold becomes the full orthogonal group, denoted by O_p. Then we have, for n = p,
$$\int_{O_p} (dU_1) = \frac{2^p\, \pi^{p^2/2}}{\Gamma_p\!\left(\frac{p}{2}\right)}. \tag{7.3.7}$$
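A quick consistency check of (7.3.6) is available in the simplest case p = 1, where V_{1,n} is the unit sphere S^{n−1} in ℝⁿ and the formula must give its surface area. The sketch below is an illustration of that reduction only.

```python
import math

# p = 1 case of (7.3.6): 2 * pi^{n/2} / Gamma(n/2) is the surface area of
# the unit sphere S^{n-1} (2*pi for n = 2, 4*pi for n = 3).
def stiefel_volume_p1(n: int) -> float:
    return 2 * math.pi ** (n / 2) / math.gamma(n / 2)

assert math.isclose(stiefel_volume_p1(2), 2 * math.pi)   # circle circumference
assert math.isclose(stiefel_volume_p1(3), 4 * math.pi)   # sphere surface area
```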
Following through the same steps as in Theorem 7.3.1 we can have the following theorem involving an upper triangular matrix T₁: let X₁ = U₁T₁ where T₁ is a p × p upper triangular matrix with distinct nonzero diagonal elements and U₁ is a unique real n × p semiorthonormal matrix, that is, U₁'U₁ = I_p. Then, ignoring the sign,
$$X_1 = U_1 T_1 \Rightarrow dX_1 = \prod_{j=1}^{p} |t_{jj}|^{\,n-j}\, dT_1 \wedge (dU_1). \tag{7.3.8}$$
Corollary 7.3.1. Let X, T, U₁ be as defined in Theorem 7.3.1 with the diagonal elements of T positive, that is, t_{jj} > 0, j = 1, ..., p, and U₁ an arbitrary semiorthonormal matrix, and let A = XX', which implies A = TT' also. Then
$$A = XX' = TT' \tag{7.3.9}$$
$$\Rightarrow dA = 2^p \prod_{j=1}^{p} t_{jj}^{\,p+1-j}\, dT \tag{7.3.10}$$
$$\Rightarrow dT = 2^{-p} \prod_{j=1}^{p} t_{jj}^{\,-(p+1-j)}\, dA. \tag{7.3.11}$$
and, finally,
$$dX = |S|^{\frac{n}{2} - \frac{p+1}{2}}\, dS. \tag{7.3.17}$$
Integrating out (dU₁) we have the marginal density of T, denoted by g(T). That is,
$$g(T)\, dT = \frac{2^p\, e^{-\frac{1}{2}\mathrm{tr}(V^{-1}TT')}}{2^{np/2}\, |V|^{n/2}\, \Gamma_p\!\left(\frac{n}{2}\right)} \prod_{j=1}^{p} t_{jj}^{\,n-j}\, dT.$$
Now, substituting from Corollary 7.3.2 for S and dS in terms of T and dT, we have the density of S, denoted by h(S), given by
$$h(S) = C_1\, |S|^{\frac{n}{2} - \frac{p+1}{2}}\, e^{-\frac{1}{2}\mathrm{tr}(V^{-1}S)},\quad S = S' > 0.$$
Since the total integral ∫_S h(S) dS = 1, we have
$$C_1 = \left[2^{np/2}\, \Gamma_p\!\left(\frac{n}{2}\right)\, |V|^{n/2}\right]^{-1}.$$
Exercises 7.3.
7.3.1. Let X = (x_{ij}), W = (w_{ij}) be p × p lower triangular matrices of distinct real variables such that x_{jj} > 0, w_{jj} > 0 and ∑_{k=1}^{j} w_{jk}² = 1, j = 1, ..., p. Let D = diag(λ₁, ..., λ_p), λ_j > 0, j = 1, ..., p, real, positive and distinct, and let D^{1/2} = diag(λ₁^{1/2}, ..., λ_p^{1/2}). Then show that
$$X = DW \Rightarrow dX = \prod_{j=1}^{p} \lambda_j^{\,j-1}\, w_{jj}^{-1}\, dD \wedge dW \tag{7.3.18}$$
and
$$X = W'DW \Rightarrow dX = \prod_{j=1}^{p} \left(\lambda_j^{\,j-1}\, w_{jj}\right)\, dD \wedge dW \tag{7.3.21}$$
and
$$X = D^{\frac12} W'W D^{\frac12} \Rightarrow dX = \prod_{j=1}^{p} \lambda_j^{\,\frac{p-1}{2}}\, w_{jj}^{\,j-1}\, dD \wedge dW. \tag{7.3.25}$$
and
$$X = T'U \Rightarrow dX = \prod_{j=1}^{p} |t_{jj}|^{\,j-1}\, dT \wedge dU. \tag{7.3.27}$$
7.3.7. For a 3 × 3 matrix X such that X = X' > 0 and I − X > 0, show that
$$\int_X dX = \frac{\pi^2}{90}.$$
where (dG) = U'(dU) and (dH) = (dV')V, and the λ_j's are known as the singular values of X.
7.3.10. Let λ₁ > ⋯ > λ_p > 0 be real variables and D = diag(λ₁, ..., λ_p). Show that
$$\int_D e^{-\mathrm{tr}(D^2)} \prod_{i<j} |\lambda_i - \lambda_j|\, dD = \frac{\left[\Gamma_p\!\left(\frac{p}{2}\right)\right]^2}{2^p\, \pi^{p^2/2}}.$$
A real matrix-variate gamma density, corresponding to the gamma integral in (7.2.2), is the following:
$$f(X) = \begin{cases} \left[\Gamma_p(\alpha)\right]^{-1}\, |X|^{\alpha - \frac{p+1}{2}}\, e^{-\mathrm{tr}(X)}, & X = X' > 0,\ \Re(\alpha) > \frac{p-1}{2}\\[4pt] 0, & \text{elsewhere}. \end{cases} \tag{7.4.3}$$
If another parameter matrix is to be introduced then we obtain a gamma density with parameters (α, B), B = B' > 0, as follows:
$$f_1(X) = \begin{cases} \dfrac{|B|^{\alpha}}{\Gamma_p(\alpha)}\, |X|^{\alpha - \frac{p+1}{2}}\, e^{-\mathrm{tr}(BX)}, & X = X' > 0,\ B = B' > 0,\ \Re(\alpha) > \frac{p-1}{2}\\[4pt] 0, & \text{elsewhere}. \end{cases} \tag{7.4.4}$$
As in the scalar case, two matrix random variables X and Y are said to be independently distributed if the joint density of X and Y is the product of their marginal densities. We will examine the densities of some functions of independently distributed matrix random variables. To this end we will introduce a few more functions.
$$|I - AB| = |I - BA|$$
and if A = A' > 0 and B = B' > 0 then
$$|I - AB| = |I - A^{\frac12}BA^{\frac12}| = |I - B^{\frac12}AB^{\frac12}|.$$
Now, writing the product of the two gamma integrals as a double integral over 0 < X < U (with U standing for the sum of the two matrices),
$$\Gamma_p(\alpha)\, \Gamma_p(\beta) = \int_U \int_X |U|^{\alpha - \frac{p+1}{2}}\, |X|^{\beta - \frac{p+1}{2}}\, \left|I - U^{-\frac12}XU^{-\frac12}\right|^{\alpha - \frac{p+1}{2}}\, e^{-\mathrm{tr}(U)}\, dU\, dX.$$
Let Z = U^{−1/2}XU^{−1/2} for fixed U. Then dX = |U|^{(p+1)/2} dZ by using Theorem 7.1.5. Now,
$$\Gamma_p(\alpha)\, \Gamma_p(\beta) = \int_Z |Z|^{\beta - \frac{p+1}{2}}\, |I - Z|^{\alpha - \frac{p+1}{2}}\, dZ \int_{U=U'>0} |U|^{\alpha+\beta-\frac{p+1}{2}}\, e^{-\mathrm{tr}(U)}\, dU.$$
Evaluation of the U-integral by using (7.4.2) yields Γ_p(α + β). Then we have
$$B_p(\alpha, \beta) = \frac{\Gamma_p(\alpha)\, \Gamma_p(\beta)}{\Gamma_p(\alpha + \beta)} = \int_Z |Z|^{\beta - \frac{p+1}{2}}\, |I - Z|^{\alpha - \frac{p+1}{2}}\, dZ.$$
Since the integral has to remain non-negative we have Z = Z' > 0, I − Z > 0. Therefore, one representation of a real matrix-variate beta function is the following, which is also called the type-1 beta integral:
$$B_p(\alpha, \beta) = \int_{0<Z=Z'<I} |Z|^{\alpha - \frac{p+1}{2}}\, |I - Z|^{\beta - \frac{p+1}{2}}\, dZ,\quad \Re(\alpha) > \frac{p-1}{2},\ \Re(\beta) > \frac{p-1}{2}. \tag{7.4.6}$$
By making the transformation V = I − Z, note that α and β can be interchanged in the integral, which also shows that B_p(α, β) = B_p(β, α) in the integral representation.
Let
$$W = (I - Z)^{-\frac12}\, Z\, (I - Z)^{-\frac12} \Rightarrow W = (Z^{-1} - I)^{-1} \Rightarrow W^{-1} = Z^{-1} - I$$
$$\Rightarrow |W|^{-(p+1)}\, dW = |Z|^{-(p+1)}\, dZ \Rightarrow dZ = |I + W|^{-(p+1)}\, dW.$$
Under this transformation the integral in (7.4.6) becomes the following:
$$B_p(\alpha, \beta) = \int_{W=W'>0} |W|^{\alpha - \frac{p+1}{2}}\, |I + W|^{-(\alpha+\beta)}\, dW, \tag{7.4.7}$$
for ℜ(α) > (p − 1)/2, ℜ(β) > (p − 1)/2. The representation in (7.4.7) is known as the type-2 integral for a real matrix-variate beta function. With the transformation V = W^{−1} the parameters α and β in (7.4.7) will be interchanged. With the help of the type-1 and type-2 integral representations one can define the type-1 and type-2 beta densities in the real matrix-variate case.
Definition 7.4.2. Real matrix-variate type-1 beta density, for the p × p real symmetric positive definite matrix X such that X = X' > 0, I − X > 0:
$$f_2(X) = \begin{cases} \dfrac{1}{B_p(\alpha,\beta)}\, |X|^{\alpha - \frac{p+1}{2}}\, |I - X|^{\beta - \frac{p+1}{2}}, & 0 < X = X' < I,\ \Re(\alpha) > \frac{p-1}{2},\ \Re(\beta) > \frac{p-1}{2}\\[4pt] 0, & \text{elsewhere}. \end{cases} \tag{7.4.8}$$
Definition 7.4.3. Real matrix-variate type-2 beta density, for the p × p real symmetric positive definite matrix X:
$$f_3(X) = \begin{cases} \dfrac{\Gamma_p(\alpha+\beta)}{\Gamma_p(\alpha)\, \Gamma_p(\beta)}\, |X|^{\alpha - \frac{p+1}{2}}\, |I + X|^{-(\alpha+\beta)}, & X = X' > 0,\ \Re(\alpha) > \frac{p-1}{2},\ \Re(\beta) > \frac{p-1}{2}\\[4pt] 0, & \text{elsewhere}. \end{cases} \tag{7.4.9}$$
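The normalizing constant in both beta densities is B_p(α, β) = Γ_p(α)Γ_p(β)/Γ_p(α + β), which can be computed from the product form of Γ_p. The sketch below (function names are illustrative assumptions) checks the p = 1 reduction to the ordinary beta function.

```python
import math

# B_p(alpha, beta) = Gamma_p(alpha) Gamma_p(beta) / Gamma_p(alpha + beta),
# using the product form of the real matrix-variate gamma.
def gamma_p(alpha: float, p: int) -> float:
    out = math.pi ** (p * (p - 1) / 4)
    for j in range(1, p + 1):
        out *= math.gamma(alpha - (j - 1) / 2)
    return out

def beta_p(alpha: float, beta: float, p: int) -> float:
    return gamma_p(alpha, p) * gamma_p(beta, p) / gamma_p(alpha + beta, p)

# p = 1 must agree with the ordinary beta function:
assert math.isclose(beta_p(2.0, 3.0, 1),
                    math.gamma(2.0) * math.gamma(3.0) / math.gamma(5.0))
```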
Let X₁ and X₂ be independently distributed real matrix-variate gamma random variables with the joint density
$$f(X_1, X_2) = \frac{|X_1|^{\alpha_1 - \frac{p+1}{2}}\, |X_2|^{\alpha_2 - \frac{p+1}{2}}\, e^{-\mathrm{tr}(X_1 + X_2)}}{\Gamma_p(\alpha_1)\, \Gamma_p(\alpha_2)},\quad X_1 = X_1' > 0,\ X_2 = X_2' > 0,$$
$$\Re(\alpha_1) > \frac{p-1}{2},\quad \Re(\alpha_2) > \frac{p-1}{2}. \tag{7.4.10}$$
Let
$$U = X_1 + X_2 \Rightarrow |X_2| = |U - X_1| = |U|\, \left|I - U^{-\frac12}X_1U^{-\frac12}\right|.$$
Then the joint density of (U, U₁) = (X₁ + X₂, X₁), for which the Jacobian is unity, is available, with
$$c_1 = \frac{1}{\Gamma_p(\alpha_1 + \alpha_2)},\quad c_2 = \frac{\Gamma_p(\alpha_1 + \alpha_2)}{\Gamma_p(\alpha_1)\, \Gamma_p(\alpha_2)},\quad \Re(\alpha_1) > 0,\ \Re(\alpha_2) > 0.$$
Integrating out the gamma-type factor with the help of (7.4.4) gives
$$\Gamma_p(\alpha_1 + \alpha_2)\, |I + W|^{-(\alpha_1 + \alpha_2)}.$$
Hence,
$$g_w(W) = \begin{cases} \dfrac{\Gamma_p(\alpha_1 + \alpha_2)}{\Gamma_p(\alpha_1)\, \Gamma_p(\alpha_2)}\, |W|^{\alpha_1 - \frac{p+1}{2}}\, |I + W|^{-(\alpha_1+\alpha_2)}, & W = W' > 0,\ \Re(\alpha_1) > \frac{p-1}{2},\ \Re(\alpha_2) > \frac{p-1}{2}\\[4pt] 0, & \text{elsewhere}, \end{cases}$$
which is a type-2 beta density with the parameters α₁ and α₂. Thus, W is real matrix-variate type-2 beta distributed.
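A small Monte Carlo sketch of the scalar (p = 1) case of this statement follows; treating X₁/X₂ as the 1 × 1 analogue of the matrix ratio, and using scipy's betaprime as the scalar type-2 beta, are assumptions of the illustration.

```python
import numpy as np
from scipy import stats

# p = 1 sanity check: W = X1/X2 with independent gamma numerator and
# denominator is type-2 beta (beta prime) with parameters (alpha1, alpha2).
a1, a2, N = 2.0, 3.0, 200000
rng = np.random.default_rng(5)
w = rng.gamma(a1, size=N) / rng.gamma(a2, size=N)

for t in [0.5, 1.0, 2.0]:                     # compare empirical and exact CDF
    emp = (w <= t).mean()
    theo = stats.betaprime.cdf(t, a1, a2)
    assert abs(emp - theo) < 0.01
```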
Exercises 7.4.
7.4.1. For a real p × p matrix X such that X = X' > 0, 0 < X < I, show that
$$\int_X dX = \frac{\left[\Gamma_p\!\left(\frac{p+1}{2}\right)\right]^2}{\Gamma_p(p+1)}.$$
7.4.4. For a 4 × 4 real positive definite matrix X such that 0 < X < I, show that
$$\int_X dX = \frac{2\pi^4}{7!\; 5}.$$
7.5. The Laplace Transform in the Matrix Case

For a real symmetric p × p matrix X = (x_{ij}) = X', the exponent of a Laplace transform should have the linear function
$$t_{11}x_{11} + (t_{21}x_{21} + t_{22}x_{22}) + \cdots + (t_{p1}x_{p1} + \cdots + t_{pp}x_{pp}).$$
Even if we take a symmetric matrix T = (t_{ij}) = T', the trace of TX is
$$\mathrm{tr}(TX) = t_{11}x_{11} + \cdots + t_{pp}x_{pp} + 2\sum_{i<j}^{p} t_{ij}x_{ij}.$$
Hence the Laplace transform in the matrix case, for a real symmetric positive definite matrix X, is defined with the parameter matrix T.
Example 7.5.1. Evaluate the Laplace transform for the two-parameter gamma density
in (7.4.4).
Solution 7.5.1. For the integral to converge, the exponent has to remain positive definite; the condition B + T > 0 is then sufficient. Let (B + T)^{1/2} be the symmetric positive definite square root of B + T. Then
$$\mathrm{tr}[(B+T)X] = \mathrm{tr}\left[(B+T)^{\frac12}\, X\, (B+T)^{\frac12}\right],$$
$$(B+T)^{\frac12}\, X\, (B+T)^{\frac12} = Y \Rightarrow dX = |B+T|^{-\frac{p+1}{2}}\, dY$$
and
$$|X|^{\alpha - \frac{p+1}{2}}\, dX = |B+T|^{-\alpha}\, |Y|^{\alpha - \frac{p+1}{2}}\, dY.$$
Hence,
$$L_f(T) = \frac{|B|^{\alpha}}{\Gamma_p(\alpha)}\, |B+T|^{-\alpha} \int_{Y=Y'>0} |Y|^{\alpha - \frac{p+1}{2}}\, e^{-\mathrm{tr}(Y)}\, dY = |B|^{\alpha}\, |B+T|^{-\alpha} = |I + B^{-1}T|^{-\alpha}. \tag{7.5.4}$$
Thus for known B and arbitrary T , (7.5.4) will uniquely determine (7.5.3) through
the uniqueness of the inverse Laplace transform. The conditions for the uniqueness
will not be discussed here. For some results in this direction see Mathai (1993,
1997) and the references therein.
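For p = 1 the result (7.5.4) reduces to the familiar scalar gamma Laplace transform (1 + t/b)^{−α}, which can be verified by quadrature. The sketch below assumes arbitrary illustrative parameter values.

```python
import math
from scipy import integrate

# Scalar (p = 1) check of (7.5.4): for f(x) = b^a x^{a-1} e^{-bx} / Gamma(a),
# the Laplace transform is (1 + t/b)^{-a}.
alpha, b, t = 2.5, 1.5, 0.7
f = lambda x: b**alpha * x**(alpha - 1) * math.exp(-b * x) / math.gamma(alpha)
lhs, _ = integrate.quad(lambda x: math.exp(-t * x) * f(x), 0, math.inf)
assert math.isclose(lhs, (1 + t / b)**(-alpha), rel_tol=1e-6)
```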
Note that {S < X, X > 0} is also equivalent to {X > S, S > 0}. Hence we may interchange the integrals. Then
$$L_{f_3}(T) = \int_{S>0} f_2(S) \left[\int_{X>S} e^{-\mathrm{tr}(TX)}\, f_1(X - S)\, dX\right] dS.$$
Putting Y = X − S in the inner integral,
$$L_{f_3}(T) = \int_{S>0} e^{-\mathrm{tr}(TS)}\, f_2(S) \left[\int_{Y>0} e^{-\mathrm{tr}(TY)}\, f_1(Y)\, dY\right] dS = g_2(T)\, g_1(T).$$
Example 7.5.2. Using the convolution property of the Laplace transform and an integral representation for the real matrix-variate beta function, show that B_p(α, β) = Γ_p(α)Γ_p(β)/Γ_p(α + β).
$$B_p(\alpha, \beta) = \int_{0<X<I} |X|^{\alpha - \frac{p+1}{2}}\, |I - X|^{\beta - \frac{p+1}{2}}\, dX,\quad \Re(\alpha) > \frac{p-1}{2},\ \Re(\beta) > \frac{p-1}{2}.$$
Consider the integral
$$\int_{0<U<X} |U|^{\alpha - \frac{p+1}{2}}\, |X - U|^{\beta - \frac{p+1}{2}}\, dU = |X|^{\beta - \frac{p+1}{2}} \int_{0<U<X} |U|^{\alpha - \frac{p+1}{2}}\, \left|I - X^{-\frac12}UX^{-\frac12}\right|^{\beta - \frac{p+1}{2}}\, dU$$
$$= |X|^{\alpha+\beta-\frac{p+1}{2}} \int_{0<Y<I} |Y|^{\alpha - \frac{p+1}{2}}\, |I - Y|^{\beta - \frac{p+1}{2}}\, dY,\quad Y = X^{-\frac12}UX^{-\frac12}.$$
Then
$$B_p(\alpha, \beta)\, |X|^{\alpha+\beta-\frac{p+1}{2}} = \int_{0<U<X} |U|^{\alpha - \frac{p+1}{2}}\, |X - U|^{\beta - \frac{p+1}{2}}\, dU. \tag{7.5.6}$$
Example 7.5.3. Let h(T) be the Laplace transform of f(X), that is, h(T) = L_f(T). Then show that the Laplace transform of |X|^{-\frac{p+1}{2}}\, \Gamma_p\!\left(\frac{p+1}{2}\right) f(X) is equivalent to ∫_{U>T} h(U) dU.
Solution: From (7.5.3) observe that for a symmetric positive definite constant matrix B the following is an identity:
$$|B|^{-\alpha} = \frac{1}{\Gamma_p(\alpha)} \int_{X>0} |X|^{\alpha - \frac{p+1}{2}}\, e^{-\mathrm{tr}(BX)}\, dX,\quad \Re(\alpha) > \frac{p-1}{2}. \tag{7.5.7}$$
Then we can replace |X|^{-\frac{p+1}{2}} Γ_p(\frac{p+1}{2}) by an equivalent integral:
$$|X|^{-\frac{p+1}{2}}\, \Gamma_p\!\left(\frac{p+1}{2}\right) = \int_{Y>0} |Y|^{\frac{p+1}{2} - \frac{p+1}{2}}\, e^{-\mathrm{tr}(XY)}\, dY = \int_{Y>0} e^{-\mathrm{tr}(XY)}\, dY.$$
Then the Laplace transform of |X|^{-\frac{p+1}{2}} Γ_p(\frac{p+1}{2}) f(X) is given by
$$\int_{X>0} e^{-\mathrm{tr}(TX)}\, f(X) \left[\int_{Y>0} e^{-\mathrm{tr}(YX)}\, dY\right] dX = \int_{X>0}\int_{Y>0} e^{-\mathrm{tr}[(T+Y)X]}\, f(X)\, dY\, dX \quad (\text{put } T + Y = U \Rightarrow U > T)$$
$$= \int_{Y>0} h(T + Y)\, dY = \int_{U>T} h(U)\, dU.$$
Example 7.5.4. For X > B, B = B' > 0 and γ > −1, show that the Laplace transform of |X − B|^γ is
$$|T|^{-\left(\gamma + \frac{p+1}{2}\right)}\, e^{-\mathrm{tr}(TB)}\, \Gamma_p\!\left(\gamma + \frac{p+1}{2}\right).$$
Exercises 7.5.
7.5.1. By using the process in Example 7.5.3, or otherwise, show that the Laplace transform of \left[\Gamma_p\!\left(\frac{p+1}{2}\right) |X|^{-\frac{p+1}{2}}\right]^n f(X) can be written as
$$\int_{W_1>T} \int_{W_2>W_1} \cdots \int_{W_n>W_{n-1}} h(W_n)\, dW_1 \cdots dW_n$$
where h(T) is the Laplace transform of f(X).
7.5.2. Show that the Laplace transform of |X|^n is |T|^{-\left(n + \frac{p+1}{2}\right)}\, \Gamma_p\!\left(n + \frac{p+1}{2}\right) for n > −1.
7.5.3. If the p × p real matrix random variable X has a type-1 beta density with parameters (α₁, α₂), then show that
(i) U = (I − X)^{−1/2} X (I − X)^{−1/2} ∼ type-2 beta (α₁, α₂),
(ii) V = X^{−1} − I ∼ type-2 beta (α₂, α₁),
where ∼ indicates "distributed as", and the parameters are given in the brackets.
7.5.4. If the p × p real symmetric positive definite matrix random variable X has a type-2 beta density with parameters α₁ and α₂ then show that

7.5.5. If g(T) = L_T(f(X)) is the Laplace transform of f(X), then show that
$$L_T\left(|X|^n f(X)\right) = (-1)^{np} \left|\frac{\partial}{\partial T}\right|^n g(T)$$
where |∂/∂T| means that first the partial derivatives with respect to the t_{ij}'s for all i and j are taken, then written in matrix form, and then the determinant is taken, where T = (t_{ij}).
$$f(X) = f(X') = f(QQ'X) = f(Q'XQ) = f(D),\quad D = \mathrm{diag}(\lambda_1, \ldots, \lambda_p).$$
and
$$_rF_{s+1}(a_1, \ldots, a_r;\, b_1, \ldots, b_s, c;\, -\Lambda^{-1})\, |\Lambda|^{-c} = \frac{\Gamma_p(c)}{(2\pi i)^{p(p+1)/2}} \int_{\Re(Z)=X>X_0} e^{\mathrm{tr}(\Lambda Z)}\; {}_rF_s(a_1, \ldots, a_r;\, b_1, \ldots, b_s;\, -Z^{-1})\, |Z|^{-c}\, dZ \tag{7.6.2}$$
0
where Z = X + iY, i = 1, X = X > 0, and X and Y belong to the class of
symmetric matrices with the non-diagonal elements weighted by 12 . The function
r F s satisfying (7.6.1) and (7.6.2) can be shown to be unique under certain conditions
and that function is defined as the hypergeometric function of matrix argument ,
according to this definition.
Then, by taking ₀F₀( ; ; −X) = e^{−tr(X)} and by using the convolution property of the Laplace transform and equations (7.6.1) and (7.6.2), one can systematically build up the hypergeometric functions. The Bessel function ₀F₁ for matrix argument is defined by Herz (1955). Thus we can go from ₀F₀ to ₁F₀ to ₀F₁ to ₁F₁ to ₂F₁ and so on to a general ₚF_q.

Example 7.6.1. Obtain an explicit form for ₁F₀ from the above definition by using ₀F₀( ; ; −U) = e^{−tr(U)}.

Solution 7.6.1. With ₀F₀( ; ; −U) = e^{−tr(U)}, the Laplace integral in (7.6.1) can be evaluated directly as a matrix-variate gamma integral, giving Γ_p(c)|I + Λ|^{−c} up to the defining constant.
But
$$|I + \Lambda|^{-c} = |\Lambda|^{-c}\, |I + \Lambda^{-1}|^{-c}.$$
Then, from (7.6.1),
$$_1F_0(c;\, ;\, -\Lambda^{-1}) = |I + \Lambda^{-1}|^{-c},$$
which is an explicit representation.
$$\int_{X=X'>0} |X|^{\alpha - \frac{p+1}{2}}\, e^{-\mathrm{tr}(XZ)}\, C_K(XT)\, dX = |Z|^{-\alpha}\, C_K(TZ^{-1})\, \Gamma_p(\alpha, K) \tag{7.6.7}$$
where
$$\Gamma_p(\alpha, K) = \pi^{p(p-1)/4} \prod_{j=1}^{p} \Gamma\!\left(\alpha + k_j - \frac{j-1}{2}\right) = \Gamma_p(\alpha)\, (\alpha)_K. \tag{7.6.8}$$
$$\frac{1}{(2\pi i)^{p(p+1)/2}} \int_{\Re(Z)=X>X_0} e^{\mathrm{tr}(SZ)}\, |Z|^{-\alpha}\, C_K(Z^{-1})\, dZ = \frac{1}{\Gamma_p(\alpha, K)}\, |S|^{\alpha - \frac{p+1}{2}}\, C_K(S),\quad i = \sqrt{-1}, \tag{7.6.9}$$
for Z = X + iY, X = X' > 0, where X and Y are symmetric and the non-diagonal elements are weighted by 1/2. If the non-diagonal elements are not weighted then the left side in (7.6.9) is to be multiplied by 2^{p(p−1)/2}. Further,
$$\int_{0<X<I} |X|^{\alpha - \frac{p+1}{2}}\, |I - X|^{\beta - \frac{p+1}{2}}\, C_K(TX)\, dX = \frac{\Gamma_p(\alpha, K)\, \Gamma_p(\beta)}{\Gamma_p(\alpha + \beta, K)}\, C_K(T),$$
$$\Re(\alpha) > \frac{p-1}{2},\quad \Re(\beta) > \frac{p-1}{2}. \tag{7.6.10}$$
Example 7.6.2. By using zonal polynomials establish the following result:
$$_2F_1(a, b; c; X) = \frac{\Gamma_p(c)}{\Gamma_p(a)\, \Gamma_p(c - a)} \int_{0<\Lambda<I} |\Lambda|^{a - \frac{p+1}{2}}\, |I - \Lambda|^{c - a - \frac{p+1}{2}}\, |I - \Lambda X|^{-b}\, d\Lambda \tag{7.6.11}$$
for ℜ(a) > (p − 1)/2, ℜ(c − a) > (p − 1)/2.
Solution 7.6.2. Expand
$$|I - X|^{-b} = \sum_{k=0}^{\infty} \sum_K (b)_K\, \frac{C_K(X)}{k!} \quad\text{for } 0 < X < I$$
and
$$\int_{0<\Lambda<I} |\Lambda|^{a - \frac{p+1}{2}}\, |I - \Lambda|^{c - a - \frac{p+1}{2}}\, C_K(\Lambda X)\, d\Lambda = \frac{\Gamma_p(a, K)\, \Gamma_p(c - a)}{\Gamma_p(c, K)}\, C_K(X)$$
by using (7.6.10). But
$$\frac{\Gamma_p(a, K)\, \Gamma_p(c - a)}{\Gamma_p(c, K)} = \frac{\Gamma_p(a)\, \Gamma_p(c - a)}{\Gamma_p(c)}\, \frac{(a)_K}{(c)_K}.$$
Substituting these back, the right side becomes
$$\sum_{k=0}^{\infty} \sum_K \frac{(a)_K\, (b)_K}{(c)_K}\, \frac{C_K(X)}{k!} = {}_2F_1(a, b; c; X).$$
This establishes the result.
Example 7.6.3. Show that
$$_2F_1(a, b; c; I) = \frac{\Gamma_p(c)\, \Gamma_p(c - a - b)}{\Gamma_p(c - a)\, \Gamma_p(c - b)} \tag{7.6.12}$$
for ℜ(c − a − b) > (p − 1)/2, ℜ(c − a) > (p − 1)/2, ℜ(c − b) > (p − 1)/2.
Solution 7.6.3. In (7.6.11) put X = I, combine the last factor on the right with
the previous factor and integrate out with the help of a matrix-variate type-1 beta
integral.
symmetric function in the sense f(AB) = f(BA) for all A and B when AB and BA are defined. Then the M-transform of f, with parameter ρ, denoted by M_ρ(f), is defined as
$$M_\rho(f) = \int_{U=U'>0} |U|^{\rho - \frac{p+1}{2}}\, f(U)\, dU. \tag{7.6.13}$$
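For p = 1 this is just the ordinary Mellin transform, which gives a quick way to sanity-check M-transform manipulations; the sketch below (illustrative parameter value) verifies M_ρ(e^{−u}) = Γ(ρ).

```python
import math
from scipy import integrate

# p = 1 case of (7.6.13): the Mellin transform of exp(-u) is Gamma(rho),
# matching Gamma_1(rho) in the matrix notation.
rho = 2.3
val, _ = integrate.quad(lambda u: u**(rho - 1) * math.exp(-u), 0, math.inf)
assert math.isclose(val, math.gamma(rho), rel_tol=1e-6)
```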
Solution 7.6.4. We will show that the M-transforms of both sides of (7.6.15) are one and the same. Taking the M-transform of the left side, with respect to the parameter ρ, we have
$$\int_{T>0} |T|^{\rho - \frac{p+1}{2}}\, \{L_T(|X - B|^{\gamma})\}\, dT = \int_{T>0} |T|^{\rho - \frac{p+1}{2}} \left[\int_{X>B} |X - B|^{\gamma}\, e^{-\mathrm{tr}(TX)}\, dX\right] dT$$
$$= \int_{T>0} |T|^{\rho - \frac{p+1}{2}}\, e^{-\mathrm{tr}(TB)} \left[\int_{Y>0} |Y|^{\gamma}\, e^{-\mathrm{tr}(TY)}\, dY\right] dT.$$
Noting that γ = γ + \frac{p+1}{2} − \frac{p+1}{2}, the Y-integral gives |T|^{-\left(\gamma + \frac{p+1}{2}\right)}\, \Gamma_p\!\left(\gamma + \frac{p+1}{2}\right). Then the T-integral gives
$$M_\rho(\text{left side}) = \Gamma_p\!\left(\gamma + \frac{p+1}{2}\right) \Gamma_p\!\left(\rho - \gamma - \frac{p+1}{2}\right) |B|^{-\rho + \gamma + \frac{p+1}{2}}.$$
The M-transform of the right side gives
$$M_\rho(\text{right side}) = \int_{T>0} |T|^{\rho - \frac{p+1}{2}} \left\{\Gamma_p\!\left(\gamma + \frac{p+1}{2}\right) |T|^{-\left(\gamma + \frac{p+1}{2}\right)}\, e^{-\mathrm{tr}(TB)}\right\} dT$$
$$= \Gamma_p\!\left(\gamma + \frac{p+1}{2}\right) \Gamma_p\!\left(\rho - \gamma - \frac{p+1}{2}\right) |B|^{-\rho + \gamma + \frac{p+1}{2}}.$$
The two sides have the same M-transform.
Starting with ₀F₀( ; ; −X) = e^{−tr(X)}, we can build up a general ₚF_q by using the M-transform and the convolution form for M-transforms, which will be stated next.
Example 7.6.5. Show that
$$_1F_1(a; c; -\Lambda) = \frac{\Gamma_p(c)}{\Gamma_p(a)\, \Gamma_p(c - a)} \int_{0<U<I} |U|^{a - \frac{p+1}{2}}\, |I - U|^{c - a - \frac{p+1}{2}}\, e^{-\mathrm{tr}(\Lambda U)}\, dU. \tag{7.6.18}$$
Solution 7.6.5. We will establish this by showing that both sides have the same M-transform. From the definition in (7.6.14) the M-transform of the left side, with respect to the parameter ρ, is given by the following:
$$M_\rho(\text{left side}) = \int_{\Lambda=\Lambda'>0} |\Lambda|^{\rho - \frac{p+1}{2}}\; {}_1F_1(a; c; -\Lambda)\, d\Lambda = \Gamma_p(\rho) \left[\frac{\Gamma_p(c)}{\Gamma_p(c - \rho)}\, \frac{\Gamma_p(a - \rho)}{\Gamma_p(a)}\right].$$
$$M_\rho(\text{right side}) = \frac{\Gamma_p(c)}{\Gamma_p(a)\, \Gamma_p(c - a)} \int_{\Lambda>0} |\Lambda|^{\rho - \frac{p+1}{2}} \left[\int_{0<U<I} |U|^{a - \frac{p+1}{2}}\, |I - U|^{c - a - \frac{p+1}{2}}\, e^{-\mathrm{tr}(\Lambda U)}\, dU\right] d\Lambda.$$
Take
$$f_1(U) = e^{-\mathrm{tr}(U)} \quad\text{and}\quad f_2(U) = |U|^{a - \frac{p+1}{2}}\, |I - U|^{c - a - \frac{p+1}{2}}.$$
Then
$$M_\rho(f_1) = g_1(\rho) = \int_{U>0} |U|^{\rho - \frac{p+1}{2}}\, e^{-\mathrm{tr}(U)}\, dU = \Gamma_p(\rho),\quad \Re(\rho) > \frac{p-1}{2},$$
$$M_\rho(f_2) = g_2(\rho) = \int_{0<U<I} |U|^{\rho - \frac{p+1}{2}}\, |U|^{a - \frac{p+1}{2}}\, |I - U|^{c - a - \frac{p+1}{2}}\, dU = \frac{\Gamma_p\!\left(a + \rho - \frac{p+1}{2}\right)\, \Gamma_p(c - a)}{\Gamma_p\!\left(c + \rho - \frac{p+1}{2}\right)},$$
for ℜ(c − a) > (p − 1)/2, ℜ(a + ρ) > p, ℜ(c + ρ) > p. Taking f₃ in (7.6.16) as the second integral on the right above, we have
$$M_\rho(\text{right side}) = \frac{\Gamma_p(c)}{\Gamma_p(a)} \left\{\Gamma_p(\rho)\, \frac{\Gamma_p(a - \rho)}{\Gamma_p(c - \rho)}\right\} = M_\rho(\text{left side}).$$
Hence the result.
Almost all properties, analogous to the ones in the scalar case for hypergeometric functions, can be established very easily by using the M-transform technique. These can then be shown to be unique, if necessary, through the uniqueness of the Laplace and inverse Laplace transform pair. Theories for functions of several matrix arguments, Dirichlet integrals, Dirichlet densities, their extensions, Appell's functions, Lauricella functions, and the like, are available. All of these real cases have also been extended to complex cases. For details see Mathai (1997). Problems involving scalar functions of matrix argument, real and complex cases, are still being worked out and applied in many areas such as statistical distribution theory, econometrics, quantum mechanics and engineering. Since the aim of this brief note is only to introduce the subject matter, more details will not be given here.
Exercises 7.6.
7.6.1. Show that, for Λ = Λ' > 0 and p × p,
$$_1F_1(a; c; -\Lambda) = e^{-\mathrm{tr}(\Lambda)}\; {}_1F_1(c - a; c; \Lambda).$$
7.6.2. For p × p real symmetric positive definite matrices Λ and V show that
$$_1F_1(a; c; -\Lambda) = \frac{\Gamma_p(c)}{\Gamma_p(a)\, \Gamma_p(c - a)}\, |\Lambda|^{-\left(c - \frac{p+1}{2}\right)} \int_{0<V<\Lambda} |V|^{a - \frac{p+1}{2}}\, |\Lambda - V|^{c - a - \frac{p+1}{2}}\, e^{-\mathrm{tr}(V)}\, dV.$$
7.6.7. For ℜ(s) > (p − 1)/2, ℜ(b − s) > (p − 1)/2, ℜ(c − a − s) > (p − 1)/2, show that
$$\int_{0<X<I} |X|^{s - \frac{p+1}{2}}\, |I - X|^{b - s - \frac{p+1}{2}}\; {}_2F_1(a, b; c; X)\, dX = \frac{\Gamma_p(c)\, \Gamma_p(s)\, \Gamma_p(b - s)\, \Gamma_p(c - a - s)}{\Gamma_p(b)\, \Gamma_p(c - a)\, \Gamma_p(c - s)}.$$
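For p = 1 this identity involves only ordinary gamma functions and the classical ₂F₁, so it can be checked by quadrature. The sketch below uses arbitrary illustrative parameters satisfying the stated conditions.

```python
import math
from scipy import integrate, special

# p = 1 case of Exercise 7.6.7: parameters chosen with s > 0, b - s > 0,
# c - a - s > 0 so that both sides are finite.
a, b, c, s = 0.5, 2.0, 3.0, 1.2
lhs, _ = integrate.quad(
    lambda x: x**(s - 1) * (1 - x)**(b - s - 1) * special.hyp2f1(a, b, c, x),
    0, 1)
g = math.gamma
rhs = g(c) * g(s) * g(b - s) * g(c - a - s) / (g(b) * g(c - a) * g(c - s))
assert math.isclose(lhs, rhs, rel_tol=1e-4)
```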
7.6.8. If the Bessel function is defined as
$$A_r(S) = \frac{1}{\Gamma_p\!\left(r + \frac{p+1}{2}\right)}\; {}_0F_1\!\left(\, ;\, r + \frac{p+1}{2};\, -S\right), \tag{7.6.19}$$
show that
$$\int_{S>0} |S|^{\gamma - \frac{p+1}{2}}\, A_r(S)\, e^{-\mathrm{tr}(\Lambda S)}\, dS = \frac{\Gamma_p(\gamma)}{\Gamma_p\!\left(r + \frac{p+1}{2}\right)}\, |\Lambda|^{-\gamma}\; {}_1F_1\!\left(\gamma;\, r + \frac{p+1}{2};\, -\Lambda^{-1}\right).$$
7.6.9. If
$$M(\alpha, \beta; A) = \int_{X=X'>0} |X|^{\alpha - \frac{p+1}{2}}\, |I + X|^{\beta - \frac{p+1}{2}}\, e^{-\mathrm{tr}(AX)}\, dX,\quad \Re(\alpha) > \frac{p-1}{2},\ A = A' > 0,$$
then show that
$$\int_{X>0} |X + A|^{\gamma}\, e^{-\mathrm{tr}(TX)}\, dX = |A|^{\gamma + \frac{p+1}{2}}\, M\!\left(\frac{p+1}{2},\ \gamma + \frac{p+1}{2};\ A^{\frac12}TA^{\frac12}\right).$$
7.7. A Pathway Model

Here A and B are free of the elements in X; a, γ and q are scalar constants with a > 0, γ > 0; and c is the normalizing constant. A^{1/2} and B^{1/2} denote the real positive definite square roots of A and B respectively.
For evaluating the normalizing constant c one can go through the following procedure: let
$$Y = A^{\frac12}XB^{\frac12} \Rightarrow dY = |A|^{\frac{r}{2}}\, |B|^{\frac{p}{2}}\, dX$$
by using Theorem 7.1.4. Let
$$U = YY' \Rightarrow dY = \frac{\pi^{rp/2}}{\Gamma_p\!\left(\frac{r}{2}\right)}\, |U|^{\frac{r}{2} - \frac{p+1}{2}}\, dU.$$
Note 7.7.1. Note that from (7.7.2) and (7.7.3) we can also infer the densities of Y and U respectively.

Note 7.7.2. In statistical problems the parameters are usually real and hence we will assume the parameters to be real here, as well as in the discussions to follow. If γ is in the complex domain then the condition will reduce to ℜ(γ) + r/2 > (p − 1)/2.
Case (ii): q > 1. Writing the determinant factor as |I + a(q − 1)U|^{−1/(q−1)}, and then making the substitution V = a(q − 1)U, we have
$$c^{-1} = \frac{\pi^{rp/2}}{\Gamma_p\!\left(\frac{r}{2}\right)\, |A|^{\frac{r}{2}}\, |B|^{\frac{p}{2}}\, [a(q-1)]^{p\left(\gamma + \frac{r}{2}\right)}} \int_V |V|^{\gamma + \frac{r}{2} - \frac{p+1}{2}}\, |I + V|^{-\frac{1}{q-1}}\, dV.$$
Evaluate the integral by using a real matrix-variate type-2 beta integral. We have the following:
$$c^{-1} = \frac{\pi^{rp/2}\, \Gamma_p\!\left(\gamma + \frac{r}{2}\right)\, \Gamma_p\!\left(\frac{1}{q-1} - \gamma - \frac{r}{2}\right)}{\Gamma_p\!\left(\frac{r}{2}\right)\, |A|^{\frac{r}{2}}\, |B|^{\frac{p}{2}}\, [a(q-1)]^{p\left(\gamma + \frac{r}{2}\right)}\, \Gamma_p\!\left(\frac{1}{q-1}\right)} \tag{7.7.7}$$
for γ + \frac{r}{2} > \frac{p-1}{2}, \frac{1}{q-1} − γ − \frac{r}{2} > \frac{p-1}{2}.
Case (iii): q = 1.
When q approaches 1 from the left or from the right it can be shown that the determinant containing q in (7.7.3) and (7.7.6) approaches an exponential form, which will be stated as a lemma.

Lemma 7.7.1.
$$\lim_{q\to1} |I - a(1-q)U|^{\frac{1}{1-q}} = e^{-a\,\mathrm{tr}(U)}. \tag{7.7.8}$$
This lemma can be proved easily by observing that for any real symmetric matrix U there exists an orthonormal matrix Q such that QQ' = I = Q'Q and Q'UQ = diag(λ₁, ..., λ_p), where the λ_j's are the eigenvalues of U. Then
$$|I - a(1-q)U| = |I - a(1-q)QQ'UQQ'| = |I - a(1-q)Q'UQ| = |I - a(1-q)\,\mathrm{diag}(\lambda_1, \ldots, \lambda_p)| = \prod_{j=1}^{p} \left(1 - a(1-q)\lambda_j\right).$$
But
$$\lim_{q\to1} \left(1 - a(1-q)\lambda_j\right)^{\frac{1}{1-q}} = e^{-a\lambda_j}.$$
Then
$$\lim_{q\to1} |I - a(1-q)U|^{\frac{1}{1-q}} = e^{-a\,\mathrm{tr}(U)}.$$
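The convergence in Lemma 7.7.1 is easy to see numerically; the sketch below uses illustrative values of a, q and the matrix size.

```python
import numpy as np

# Numerical look at Lemma 7.7.1: |I - a(1-q)U|^{1/(1-q)} -> exp(-a tr U)
# as q -> 1, for a real symmetric U.
p, a = 3, 0.8
rng = np.random.default_rng(4)
M = rng.standard_normal((p, p))
U = (M + M.T) / 2                      # a real symmetric matrix

for q in [0.9, 0.99, 0.999]:
    val = np.linalg.det(np.eye(p) - a * (1 - q) * U) ** (1 / (1 - q))
    print(q, val, np.exp(-a * np.trace(U)))

assert np.isclose(val, np.exp(-a * np.trace(U)), rtol=0.05)
```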
by evaluating the integral with the help of a real matrix-variate gamma integral.

Note 7.7.3. Observe that f(X) maintains a generalized real matrix-variate type-1 beta form for −∞ < q < 1, f(X) maintains a generalized real matrix-variate type-2 beta form for 1 < q < ∞, and f(X) keeps a generalized real matrix-variate gamma form when q → 1.
Remark 7.7.1. The parameter q in the density f (X) can be taken as a pathway
parameter. It defines a pathway from a generalized type-1 beta form to a type-2
beta form to a gamma form. Thus a wide variety of probability models are available
from f (X). If the experimenter needs a model with a thicker tail or thinner tail or
the right and left tails cut off, all such models are available from f (X) for various
values of q. For γ = 0 one has the matrix-variate Gaussian form coming from f(X).
where
$$c = \frac{[a(1-q)]^{p\left(\gamma + \frac{r}{2}\right)}\, \Gamma_p\!\left(\gamma + \frac{r}{2} + \frac{1}{1-q} + \frac{p+1}{2}\right)}{\Gamma_p\!\left(\gamma + \frac{r}{2}\right)\, \Gamma_p\!\left(\frac{1}{1-q} + \frac{p+1}{2}\right)} \tag{7.7.18}$$
for q < 1, γ + \frac{r}{2} > \frac{p-1}{2};
$$= \frac{[a(q-1)]^{p\left(\gamma + \frac{r}{2}\right)}\, \Gamma_p\!\left(\frac{1}{q-1}\right)}{\Gamma_p\!\left(\gamma + \frac{r}{2}\right)\, \Gamma_p\!\left(\frac{1}{q-1} - \gamma - \frac{r}{2}\right)} \tag{7.7.19}$$
for q > 1, γ + \frac{r}{2} > \frac{p-1}{2}, \frac{1}{q-1} − γ − \frac{r}{2} > \frac{p-1}{2}; and
$$= \frac{a^{p\left(\gamma + \frac{r}{2}\right)}}{\Gamma_p\!\left(\gamma + \frac{r}{2}\right)} \tag{7.7.20}$$
for q = 1, γ + \frac{r}{2} > \frac{p-1}{2}.
Exercises 7.7.
7.7.1. By using Stirling's approximation for gamma functions, namely
$$\Gamma(z + a) \approx (2\pi)^{\frac12}\, z^{z + a - \frac12}\, e^{-z} \tag{7.7.21}$$
for |z| → ∞ and a a bounded quantity, show that the moment expressions in (7.7.13) and (7.7.14) reduce to the moment expression in (7.7.15).
7.7.2. By opening up Γ_p(·) in terms of gamma functions and by examining the structure of the gamma products in (7.7.13), show that for q < 1 we can write
$$E\left[\left|a(1-q)\, A^{\frac12}XBX'A^{\frac12}\right|^h\right] = \prod_{j=1}^{p} E(x_j^h) \tag{7.7.22}$$
where the x_j's are independently distributed real scalar type-1 beta random variables with the parameters
$$\left(\gamma + \frac{r}{2} - \frac{j-1}{2},\ \frac{1}{1-q} + \frac{p+1}{2}\right),\quad j = 1, \ldots, p.$$
7.7.3. By going through the procedure in Exercise 7.7.2, show that, for q > 1,
$$E\left[\left|a(q-1)\, A^{\frac12}XBX'A^{\frac12}\right|^h\right] = \prod_{j=1}^{p} E(y_j^h). \tag{7.7.23}$$
References

Mathai, A.M. (1978). Some results on functions of matrix argument, Math. Nachr., 84, 171-177.

Mathai, A.M., Provost, S.B. and Hayakawa, T. (1995). Bilinear Forms and Zonal Polynomials, Springer-Verlag Lecture Notes in Statistics, 102, New York.

Mathai, A.M. (2004). Modules 1, 2, 3, Centre for Mathematical Sciences, India.

Mathai, A.M. (2005). A pathway to matrix-variate gamma and normal densities, Linear Algebra and Its Applications, 396, 317-328.