Chapter 4 - Function of Random Variables: EE385 Class Notes 7/6/2015 John Stensby
Let g(x) denote a real-valued function of the real variable x. Consider the transformation
Y = g(X). (4-1)
This is a transformation of the random variable X into the random variable Y. Random variable X(ζ) is a mapping from the sample space S into the real line. But so is g(X(ζ)). We are interested in methods for finding the density fY(y) and the distribution FY(y).

When dealing with Y = g(X(ζ)), there are a few technicalities that should be considered. For example, for every y, the set {Y = g(X) ≤ y} must be an event; that is, the set {ζ ∈ S : g(X(ζ)) ≤ y} must belong to the underlying σ-algebra of events. In practice, these technicalities are assumed to hold, and they do not cause any problems.
To find FY(y), first find the set Iy of x values for which g(x) ≤ y,

Iy = { x : g(x) ≤ y },   (4-2)

so that FY(y) = P[Y ≤ y] = P[X ∈ Iy].

Example 4-1: Let Y = aX + b with a > 0. Then Iy = { x : ax + b ≤ y } = { x : x ≤ (y − b)/a }, so that

FY(y) = P[X ∈ Iy] = P[X ≤ (y − b)/a] = FX( (y − b)/a ).   (4-3)

[Figure 4-1: the transformation y = g(x) = x².]
Example 4-2: Given random variable X and the function y = g(x) = x² shown on Figure 4-1, define Y = g(X) = X² and find FY(y). If y < 0, then there are no values of x such that x² ≤ y. Hence

FY(y) = 0,  y < 0.

If y ≥ 0, then Iy = { x : −√y ≤ x ≤ √y }, so that

FY(y) = P[−√y ≤ X ≤ √y] = FX(√y) − FX(−√y),  y ≥ 0.
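This result is easy to check numerically. The following is a minimal simulation sketch (Python; the choice X ~ N(0;1) is an illustrative assumption, not part of the example):

    import numpy as np
    from scipy.stats import norm

    # Empirical check of FY(y) = FX(sqrt(y)) - FX(-sqrt(y)) for Y = X^2,
    # assuming X ~ N(0;1) for illustration.
    rng = np.random.default_rng(0)
    y_samples = rng.standard_normal(100_000) ** 2
    for y in (0.5, 1.0, 2.0):
        empirical = np.mean(y_samples <= y)
        analytic = norm.cdf(np.sqrt(y)) - norm.cdf(-np.sqrt(y))
        print(f"y = {y}: empirical {empirical:.4f}, analytic {analytic:.4f}")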
Special Considerations
Special consideration is due for functions g(x) that have “flat spots” and/or jump
discontinuities. These cases are considered next.
Watch for places where g(x) is constant (“flat spots”). Suppose g(x) is constant on the interval (x0, x1]. That is, g(x) = y1, x0 < x ≤ x1, where y1 is a constant, and g(x) ≠ y1 outside of x0 < x ≤ x1. Hence, all of the probability that X has in the interval x0 < x ≤ x1 is assigned to the single value Y = y1, so that

P[Y = y1] = P[x0 < X ≤ x1] = FX(x1) − FX(x0).   (4-4)

That is, FY(y) has a jump discontinuity at y = y1. The amount of jump is FX(x1) − FX(x0). As an
example, consider the case of a saturating amplifier/limiter transformation.
Example 4-3 (Saturating Amplifier/Limiter): In terms of FX(x), find the distribution FY(y) for
Y = g(X) where
g(x) = b,  x > b
     = x,  −b < x ≤ b
     = −b,  x ≤ −b.

Case: y ≥ b
For y ≥ b, we have g(x) ≤ y for all x. Hence, FY(y) = 1, y ≥ b.

Case: −b ≤ y < b
For −b ≤ y < b, we have g(x) ≤ y for x ≤ y. Hence, FY(y) = P(Y = g(X) ≤ y) = FX(y), −b ≤ y < b.

Case: y < −b
For y < −b, we have g(x) ≤ y for NO x. Hence, FY(y) = 0, y < −b.
The result of these cases is shown by Figure 4-3.
[Figure 4-2: Transformation y = g(x) and distribution FX(x) used in Ex. 4-3.]
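The jump discontinuities of FY(y) at y = ±b can be observed in simulation. A short sketch (assuming, for illustration only, X ~ N(0;1) and b = 1):

    import numpy as np
    from scipy.stats import norm

    # Saturating limiter of Example 4-3; the clip() call implements g(x).
    # Illustrative assumptions: X ~ N(0;1), b = 1.
    b = 1.0
    rng = np.random.default_rng(1)
    x = rng.standard_normal(200_000)
    y = np.clip(x, -b, b)

    # Probability masses at the rails, per the jump-discontinuity reasoning:
    print("P[Y = -b]:", np.mean(y == -b), " predicted FX(-b):", norm.cdf(-b))
    print("P[Y = +b]:", np.mean(y == b), " predicted 1 - FX(b):", 1 - norm.cdf(b))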
As an example of a jump discontinuity, consider the transformation

g(x) = x + c,  x ≥ 0
     = x − c,  x < 0,

where c > 0, a function with a jump discontinuity at x = 0. Note that g(x) takes on no values in the interval [−c, c).

Case y ≥ c: If y ≥ c, then g(x) ≤ y for x ≤ y − c. Hence, FY(y) = FX(y − c) for y ≥ c.

Case −c ≤ y < c: If −c ≤ y < c, then g(x) ≤ y only for x < 0. Hence, FY(y) = P[X < 0] for −c ≤ y < c.

Case y < −c: If y < −c, then g(x) ≤ y for x ≤ y + c. Hence, FY(y) = FX(y + c) for y < −c.
To find the density fY(y) directly, solve the equation y = g(x) for its real roots x1, x2, ..., xn (if y = g(x) has no real solutions, then fY(y) = 0). Note that x1 through xn are functions of y. The range of each xi(y) covers part of the domain of g(x). The union of the ranges of xi(y), 1 ≤ i ≤ n, covers all, or part of, the domain of g(x). The desired fY(y) is

fY(y) = fX(x1)/|g′(x1)| + fX(x2)/|g′(x2)| + ⋯ + fX(xn)/|g′(xn)|,   (4-6)

where g′(x) denotes the derivative of g(x).
[Figure: the transformation y = g(x) = x², with increment Δy about y and increments Δx1, Δx2 about the roots x1 = −√y and x2 = √y.]
To see why (4-6) holds, consider the case y = g(x) = x² with roots x1 = −√y and x2 = √y. Note that

P(y < Y ≤ y + Δy) = ∫ from y to y+Δy of fY(η) dη ≈ fY(y)Δy

for small Δy (increments Δx1, Δx2 and Δy are defined to be positive). Similarly, the event {y < Y ≤ y + Δy} corresponds to X falling in one of two small intervals, of lengths Δx1 and Δx2, about the roots x1 and x2, so that

fY(y)Δy ≈ fX(x1)Δx1 + fX(x2)Δx2,

which gives

fY(y) ≈ fX(x1)/(Δy/Δx1) + fX(x2)/(Δy/Δx2).

Now, let the increments approach zero. The positive quantities Δy/Δx1 and Δy/Δx2 approach

lim as Δx1→0 of Δy/Δx1 = |dg(x1)/dx|,  lim as Δx2→0 of Δy/Δx2 = |dg(x2)/dx|,   (4-9)

so that

fY(y) = fX(x1)/|dg(x1)/dx| + fX(x2)/|dg(x2)/dx|.   (4-10)
Example 4-6: Consider Y = aX² where a > 0. If y < 0, then y = ax² has no real solutions, and fY(y) = 0. If y > 0, then y = ax² has solutions x1 = −√(y/a) and x2 = √(y/a). Also, note that g′(x) = 2ax. Hence,

fY(y) = fX(x1)/|g′(x1)| + fX(x2)/|g′(x2)|
      = [ fX(−√(y/a)) + fX(√(y/a)) ] / ( 2a√(y/a) ),  y > 0   (4-12)
      = 0,  y < 0.
To see a specific example, assume that X is Rayleigh distributed with parameter σ. The density for X is given by (2-24); substitute this density into (4-12) (only the positive root x2 contributes, since fX(x) = 0 for x < 0) to obtain

fY(y) = [ 1/(2a√(y/a)) ] [ (√(y/a)/σ²) exp(−(y/a)/2σ²) ] U(y)
      = (1/(2σ²a)) exp(−y/(2σ²a)) U(y),   (4-13)

which is the density for an exponential random variable with parameter λ = 1/(2σ²a), as can be seen from inspection of (2-27). Hence, the square of a Rayleigh random variable produces an exponential random variable.
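A quick Monte Carlo confirmation of this Rayleigh-to-exponential result (the values a = 2 and σ = 1.5 are arbitrary illustrative choices):

    import numpy as np
    from scipy import stats

    # Y = a*X^2 with X Rayleigh(sigma) should be exponential
    # with parameter lambda = 1/(2*sigma^2*a).
    a, sigma = 2.0, 1.5
    x = stats.rayleigh.rvs(scale=sigma, size=100_000,
                           random_state=np.random.default_rng(2))
    y = a * x**2

    lam = 1.0 / (2 * sigma**2 * a)
    print("sample mean:", y.mean(), " predicted 1/lambda:", 1 / lam)
    # Goodness-of-fit against the predicted exponential law:
    print(stats.kstest(y, "expon", args=(0, 1 / lam)))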
Expected Value of Transformed Random Variable
Given random variable X, with density fX(x), and a function g(x), we form the random
variable Y = g(X). We know that
ηY = E[Y] = ∫ from −∞ to ∞ of y fY(y) dy.   (4-14)

This requires knowledge of fY(y). However, ηY can be expressed directly in terms of g(x) and fX(x).

Theorem 4-1: Let X be a random variable and y = g(x) a function. The expected value of Y = g(X) can be expressed as

ηY = E[Y] = E[g(X)] = ∫ from −∞ to ∞ of g(x) fX(x) dx.   (4-15)
[Figure 4-7: the transformation y = g(x) = x², with increments Δy, Δx1, Δx2 about y and the roots x1, x2, used in the argument below.]
To see this, consider the following example that is illustrated by Figure 4-7. Recall that fY(y)Δy ≈ fX(x1)Δx1 + fX(x2)Δx2. Multiply this expression by y = g(x1) = g(x2) to obtain

y fY(y)Δy ≈ g(x1)fX(x1)Δx1 + g(x2)fX(x2)Δx2.

Now, partition the y-axis as 0 = y0 < y1 < y2 < ⋯, where Δy = yk+1 − yk, k = 0, 1, 2, ... . By the mappings x1 = −√y and x2 = √y, this leads to a partition x1k, k = 0, 1, 2, ..., of the negative x-axis and a partition x2k, k = 0, 1, 2, ..., of the positive x-axis. Sum both sides over these partitions and obtain

Σ over k of yk fY(yk)Δy ≈ Σ over k of g(x1k)fX(x1k)Δx1k + Σ over k of g(x2k)fX(x2k)Δx2k.   (4-17)

In the limit, as the partitions become dense, the sums become integrals, and

∫ from 0 to ∞ of y fY(y) dy = ∫ from −∞ to 0 of g(x)fX(x) dx + ∫ from 0 to ∞ of g(x)fX(x) dx = ∫ from −∞ to ∞ of g(x)fX(x) dx,   (4-18)

the desired result (for this example, fY(y) = 0 for y < 0, so the left-hand side is E[Y]). Observe that this argument can be applied to practically any function y = g(x).
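Theorem 4-1 can be sanity-checked by computing E[Y] both ways for the example above. A minimal sketch (assuming X ~ N(0;1) for illustration, so that Y = X² is chi-square with one degree of freedom):

    import numpy as np
    from scipy import stats
    from scipy.integrate import quad

    # E[Y] for Y = g(X) = X^2, X ~ N(0;1), computed two ways.
    g = lambda x: x**2

    # Route 1 (needs fY): Y = X^2 is chi-square with 1 degree of freedom.
    route1, _ = quad(lambda y: y * stats.chi2.pdf(y, df=1), 0, np.inf)
    # Route 2 (Theorem 4-1): integrate g(x)*fX(x); no fY required.
    route2, _ = quad(lambda x: g(x) * stats.norm.pdf(x), -np.inf, np.inf)
    print(route1, route2)   # both approximately 1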
Example 4-7: Let X be N(0; σ), and let Y = X^n. Find E[|Y|] = E[|X|^n]. For n even (i.e., n = 2k) we know, from Example 2-10, that E[|X|^n] = E[X^n] = 1·3·5⋯(n − 1)σ^n (for odd n, E[X^n] = 0 by symmetry, so we compute E[|X|^n] instead). For odd n (i.e., n = 2k + 1), write

E[|X|^(2k+1)] = ∫ from −∞ to ∞ of |x|^(2k+1) f(x) dx = ( 2/(√(2π)σ) ) ∫ from 0 to ∞ of x^(2k+1) exp(−x²/2σ²) dx.   (4-19)

With the change of variable y = x²/2σ² (so that σ² dy = x dx), this becomes

E[|X|^(2k+1)] = ( 2/(√(2π)σ) ) (2σ²)^k ∫ from 0 to ∞ of (x²/2σ²)^k exp(−x²/2σ²) x dx
             = ( (2σ²)^(k+1)/(√(2π)σ) ) ∫ from 0 to ∞ of y^k e^(−y) dy.   (4-20)

The remaining integral is the gamma function

Γ(k + 1) = ∫ from 0 to ∞ of y^k e^(−y) dy = k!.   (4-21)

Combining the even and odd cases, we have

E[|X|^n] = ( 1/(√(2π)σ) ) ∫ from −∞ to ∞ of |x|^n exp(−x²/2σ²) dx
         = 1·3·5⋯(n − 1) σ^n,  n = 2k (n even)   (4-22)
         = 2^((n−1)/2) ((n−1)/2)! σ^n √(2/π),  n = 2k + 1 (n odd).
In many problems, the integral

E[g(X)] = ∫ from −∞ to ∞ of g(x) fX(x) dx   (4-23)

cannot be evaluated in closed form. To approximate it, expand g(x) in a Taylor's series around the mean η = E[X] to obtain

g(x) = g(η) + g′(η)(x − η) + ⋯ + g⁽ⁿ⁾(η)(x − η)^n/n! + ⋯.   (4-24)

Substitute this expansion into (4-23) and integrate term by term; the first-order term drops out since E[X − η] = 0, leaving

E[g(X)] = ∫ from −∞ to ∞ of [ g(η) + g′(η)(x − η) + ⋯ + g⁽ⁿ⁾(η)(x − η)^n/n! + ⋯ ] fX(x) dx   (4-25)
        = g(η) + g″(η)μ2/2! + g⁽³⁾(η)μ3/3! + ⋯ + g⁽ⁿ⁾(η)μn/n! + ⋯,

where μk = E[(X − η)^k] denotes the kth central moment of X. An approximation to E[g(X)] can be based on this formula; just compute a finite number of terms in the expansion.
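For instance, keeping terms through second order gives E[g(X)] ≈ g(η) + g″(η)μ2/2. A small sketch compares this truncation with the exact value (illustrative assumptions: g(x) = exp(x) and X Gaussian, for which the exact mean is known):

    import numpy as np
    from scipy import stats
    from scipy.integrate import quad

    # Second-order truncation of (4-25) for g(x) = exp(x), X ~ N(eta, sigma^2).
    eta, sigma = 0.5, 0.2
    g = lambda x: np.exp(x)

    exact, _ = quad(lambda x: g(x) * stats.norm.pdf(x, eta, sigma),
                    -np.inf, np.inf)
    approx = g(eta) + g(eta) * sigma**2 / 2   # g'' = exp for this g
    print(exact, approx)   # close, since sigma is small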
Characteristic Functions
The characteristic function of a random variable X is

Φ(ω) = ∫ from −∞ to ∞ of fX(x) e^(jωx) dx = E[e^(jωX)].   (4-26)

The magnitude of Φ(ω) is bounded,

|Φ(ω)| ≤ ∫ from −∞ to ∞ of |fX(x)e^(jωx)| dx = ∫ from −∞ to ∞ of fX(x) dx = 1,   (4-27)

and Φ(ω) is recognized as the Fourier transform of fX(x), but with the sign of jω reversed. Hence, the density can be recovered from the inversion formula

fX(x) = (1/2π) ∫ from −∞ to ∞ of Φ(ω) e^(−jωx) dω.   (4-29)
Definition (4-26) takes the form of a sum when X is a discrete random variable. Suppose that X takes on the values xi with probabilities pi = P[X = xi] for index i in some index set I (i ∈ I). Then the characteristic function of X is

Φ(ω) = ∫ from −∞ to ∞ of fX(x) e^(jωx) dx = Σ over i ∈ I of pi exp[jωxi].   (4-30)

Due to the delta functions in density fX(x), the integral in (4-30) becomes a sum.
Example 4-8: Consider the zero-mean Gaussian density function

fX(x) = ( 1/(√(2π)σ) ) e^(−x²/2σ²).   (4-31)

From the known Fourier transform of a Gaussian pulse, the characteristic function is

Φ(ω) = e^(−σ²ω²/2).   (4-32)

More generally, if fX(x) = ( 1/(√(2π)σ) ) e^(−(x−η)²/2σ²), then

Φ(ω) = exp[ jηω − σ²ω²/2 ].   (4-33)
Example 4-9: Let random variable N be Poisson with parameter λ. That is,

P[N = n] = e^(−λ) λ^n/n!,  n = 0, 1, 2, ... .   (4-34)

The characteristic function is

Φ(ω) = Σ over n ≥ 0 of exp[jωn] e^(−λ) λ^n/n!
     = e^(−λ) Σ over n ≥ 0 of (λe^(jω))^n/n!
     = e^(−λ) exp[λe^(jω)]
     = exp[ λ(e^(jω) − 1) ].   (4-35)
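Equation (4-35) can be verified numerically by truncating the sum in (4-30). A minimal sketch (λ = 3 and ω = 0.7 are arbitrary test values):

    import numpy as np
    from scipy import stats

    # Compare the truncated sum (4-30) with the closed form (4-35).
    lam, w = 3.0, 0.7
    n = np.arange(0, 100)
    pmf = stats.poisson.pmf(n, lam)
    phi_sum = np.sum(pmf * np.exp(1j * w * n))        # truncated (4-30)
    phi_closed = np.exp(lam * (np.exp(1j * w) - 1))   # closed form (4-35)
    print(phi_sum, phi_closed)                        # agree to many digits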
Joint characteristic functions are defined in a similar manner. For two random variables X and Y,

Φ_XY(ω1, ω2) = E[ exp{j(ω1X + ω2Y)} ] = ∫∫ e^(j(ω1x + ω2y)) fXY(x, y) dx dy   (4-37)

for the continuous case. Equation (4-37) is recognized as the two-dimensional Fourier transform (with the sign of j reversed) of fXY(x, y). Generalizing these definitions, we can define the joint characteristic function of n random variables X1, X2, ..., Xn as

Φ(ω1, ..., ωn) = E[ exp{j(ω1X1 + ω2X2 + ⋯ + ωnXn)} ].   (4-38)
Equation (4-38) can be simplified using vector notation. Define the two vectors

ω = [ω1 ω2 ⋯ ωn]^T,  X = [X1 X2 ⋯ Xn]^T.   (4-39)

Then, we can write the n-dimensional characteristic function in the compact form

Φ_X(ω) = E[ e^(jω^T X) ].   (4-40)
Equations (4-38) and (4-40) convey the same information; however, (4-40) is much easier to
write and work with.
Characteristic Function for Multi-dimensional Gaussian Case
Let X = [X1 X2 ... Xn]^T be a Gaussian random vector with mean η = E[X] and covariance matrix Λ = E[(X − η)(X − η)^T]. Let ω = [ω1 ω2 ... ωn]^T be a vector of n algebraic variables. Note that

ω^T X = [ω1 ω2 ⋯ ωn][X1 X2 ⋯ Xn]^T = Σ from k = 1 to n of ωkXk   (4-41)

is a scalar. The characteristic function of X is given as

Φ_X(ω) = E[ exp(jω^T X) ] = exp[ jω^T η − (1/2) ω^T Λ ω ].   (4-42)
Characteristic functions also provide a transform method for finding the density of Y = g(X). First, compute

Φ_Y(ω) = E[e^(jωY)] = E[e^(jωg(X))] = ∫ from −∞ to ∞ of e^(jωg(x)) fX(x) dx.   (4-43)

If a change of variable y = g(x) can be made (usually, this requires g to have an inverse), this last integral will have the form

Φ_Y(ω) = ∫ e^(jωy) h(y) dy.   (4-44)

The desired result fY(y) = h(y) follows (by uniqueness of the Fourier transform).
Example 4-10: Suppose X is N(0; σ) and Y = aX², a > 0. Then

Φ_Y(ω) = E[e^(jωY)] = E[e^(jωaX²)] = ∫ from −∞ to ∞ of e^(jωax²) fX(x) dx = ( 2/(√(2π)σ) ) ∫ from 0 to ∞ of e^(jωax²) e^(−x²/2σ²) dx.

For 0 ≤ x < ∞, note that the transformation y = ax² is one-to-one. Hence, make the change of variable y = ax², dy = (2ax)dx = 2√(ay) dx, to obtain

Φ_Y(ω) = ( 2/(√(2π)σ) ) ∫ from 0 to ∞ of e^(jωy) e^(−y/2aσ²) dy/(2√(ay)) = ∫ from 0 to ∞ of e^(jωy) [ e^(−y/2aσ²)/(σ√(2πay)) ] dy.

Hence, we have

fY(y) = ( e^(−y/2aσ²)/(σ√(2πay)) ) U(y).
Other Applications
Sometimes, a characteristic function is used as a tool to obtain qualitative results about a random phenomenon of interest. For example, suppose we want to show that some random
phenomenon is Gaussian distributed. We may be able to do this by deriving the characteristic
function that describes the random phenomenon (and showing that the characteristic function has
the form given by (4-33)). In Chapter 9 of these notes, we do this for shot noise. We use
characteristic function theory to show that classical shot noise becomes Gaussian distributed as
its intensity parameter becomes large.
Moment Generating Function
The moment generating function is

Φ(s) = ∫ from −∞ to ∞ of fX(x) e^(sx) dx = E[e^(sX)].   (4-45)

Differentiating n times with respect to s yields

dⁿΦ/dsⁿ = ∫ from −∞ to ∞ of xⁿ fX(x) e^(sx) dx = E[Xⁿ e^(sX)],   (4-46)

so that

dⁿΦ/dsⁿ evaluated at s = 0 gives E[Xⁿ] = mₙ.   (4-47)
Example 4-11: Suppose X has an exponential density fX(x) = λe^(−λx)U(x). Then the moment generating function is

Φ(s) = λ ∫ from 0 to ∞ of e^(−λx) e^(sx) dx = λ/(λ − s),  s < λ.

The first two moments are

dΦ/ds at s = 0:  E[X] = 1/λ
d²Φ/ds² at s = 0:  E[X²] = 2/λ².

Hence, the variance of X is

σ² = E[X²] − (E[X])² = 2/λ² − (1/λ)² = 1/λ².
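These moment computations are easily reproduced symbolically; a short sketch using SymPy:

    import sympy as sp

    # Phi(s) = lambda/(lambda - s) for the exponential density; moments via (4-47).
    s, lam = sp.symbols("s lambda", positive=True)
    Phi = lam / (lam - s)

    m1 = sp.diff(Phi, s).subs(s, 0)       # E[X]   -> 1/lambda
    m2 = sp.diff(Phi, s, 2).subs(s, 0)    # E[X^2] -> 2/lambda^2
    var = sp.simplify(m2 - m1**2)         # variance -> 1/lambda^2
    print(m1, m2, var)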
Theorem 4-2
Let X and Y be independent random variables. Let g(x) and h(y) be arbitrary functions. Define the transformed random variables

Z = g(X)
W = h(Y).   (4-48)

Then Z and W are independent. To see this, for real z and w define the sets

A_z = [x : g(x) ≤ z]
B_w = [y : h(y) ≤ w].

The events {Z ≤ z} = {X ∈ A_z} and {W ≤ w} = {Y ∈ B_w} are independent since X and Y are independent, so FZW(z, w) = P[X ∈ A_z]P[Y ∈ B_w] = FZ(z)FW(w), which establishes the claim.

One Function of Two Random Variables
Given random variables X and Y and a function g(x, y), form the new random variable

Z = g(X, Y).   (4-49)
We want to find the density and distribution of Z in terms of like quantities for X and Y. For real z, denote Dz as

Dz = { (x, y) : g(x, y) ≤ z },

so that

FZ(z) = P[(X, Y) ∈ Dz] = ∫∫ over Dz of fXY(x, y) dx dy.

Example 4-12 (Sum of Two Random Variables): Let Z = X + Y, so that Dz = { (x, y) : x + y ≤ z }. In this case, the region of integration is depicted by the shaded area shown on Figure 4-8. Now, we can write

FZ(z) = ∫ from −∞ to ∞ [ ∫ from −∞ to z−y of fXY(x, y) dx ] dy.

By using Leibnitz's rule (see below) for differentiating an integral, we get the density

fZ(z) = (d/dz) FZ(z) = ∫ from −∞ to ∞ [ (d/dz) ∫ from −∞ to z−y of fXY(x, y) dx ] dy = ∫ from −∞ to ∞ of fXY(z − y, y) dy.

Leibnitz's rule applies to integrals of the form

F(t) = ∫ from a(t) to b(t) of ρ(x, t) dx.

Note that the t variable appears in the integrand and limits. Leibnitz's rule states that

dF/dt = ρ(b(t), t)(db/dt) − ρ(a(t), t)(da/dt) + ∫ from a(t) to b(t) of (∂ρ/∂t) dx.

If X and Y are independent, then fXY(x, y) = fX(x)fY(y), and the density of Z = X + Y becomes the convolution

fZ(z) = ∫ from −∞ to ∞ of fX(z − y)fY(y) dy.   (4-53)
Example 4-13: Let X and Y be independent, with X uniform on (−1/2, 1/2) and Y exponential with density fY(y) = e^(−y)U(y). Find the density of Z = X + Y. Carrying out the convolution (4-53) case by case (Figure 4-11 depicts Case II, −1/2 < z < 1/2) gives

fZ(z) = 0,  z ≤ −1/2
      = 1 − e^(−(z+1/2)),  −1/2 ≤ z ≤ 1/2
      = [ e^(1/2) − e^(−1/2) ] e^(−z),  1/2 ≤ z.

[Figure 4-13: Final result for Example 4-13.]
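The piecewise result can be checked against a discrete numerical convolution of the two densities, per (4-53):

    import numpy as np

    # Numerical convolution check of Example 4-13:
    # X ~ Uniform(-1/2, 1/2), Y ~ Exponential(1), Z = X + Y.
    dz = 0.001
    fx = np.ones(int(1.0 / dz))               # uniform density on (-1/2, 1/2)
    fy = np.exp(-np.arange(0, 20, dz))        # exponential density on (0, 20)
    fz = np.convolve(fx, fy) * dz             # fZ = fX * fY, support from -1/2

    zt = 1.0                                  # test point in the z > 1/2 branch
    analytic = (np.exp(0.5) - np.exp(-0.5)) * np.exp(-zt)
    print(fz[int((zt + 0.5) / dz)], analytic) # both approximately 0.383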
Example 4-14: Consider Z = X/Y. For real z, the region Dz is

Dz = { (x, y) : x/y ≤ z },

the shaded region on the plot depicted by Figure 4-14. For y > 0, x/y ≤ z requires x ≤ yz; for y < 0, it requires x ≥ yz. Now, compute the distribution

FZ(z) = ∫ from 0 to ∞ [ ∫ from −∞ to yz of fXY(x, y) dx ] dy + ∫ from −∞ to 0 [ ∫ from yz to ∞ of fXY(x, y) dx ] dy.
[Figure 4-14: Integrate over shaded region Dz (drawn for case z > 0) to obtain FZ for Example 4-14.]
Differentiate with respect to z (applying Leibnitz's rule to the inner integrals) to obtain the density

fZ(z) = (d/dz) FZ(z) = ∫ from 0 to ∞ of y fXY(yz, y) dy − ∫ from −∞ to 0 of y fXY(yz, y) dy = ∫ from −∞ to ∞ of |y| fXY(yz, y) dy.
Example 4-15: Consider the transformation Z = √(X² + Y²). For this transformation, the region DZ is given by

Dz = { (x, y) : √(x² + y²) ≤ z } = { (x, y) : x² + y² ≤ z² },  z ≥ 0.   (4-54)

Now, suppose X and Y are independent, jointly Gaussian random variables with

fXY(x, y) = ( 1/2πσ² ) exp[ −(x² + y²)/2σ² ].   (4-56)

The distribution of Z is

FZ(z) = ( 1/2πσ² ) ∫∫ over Dz of exp[ −(x² + y²)/2σ² ] dx dy.
To integrate this, use Figure 4-15 and transform from rectangular to polar coordinates:

x = r cos θ,  y = r sin θ
r = √(x² + y²),  r ≥ 0
θ = tan⁻¹(y/x),

with differential area dA = r dr dθ.

[Figure 4-15: the rectangular-to-polar transformation, with a cut-away view detailing the differential area dA = r dr dθ.]

The change to polar coordinates yields

FZ(z) = ( 1/2πσ² ) ∫ from 0 to 2π ∫ from 0 to z of exp[ −r²/2σ² ] r dr dθ.

The integrand does not depend on θ, so the integral over θ is elementary. For the integral over r, let u = r²/2σ² and du = (r/σ²) dr to obtain

FZ(z) = ∫ from 0 to z of (r/σ²) exp[ −r²/2σ² ] dr = ∫ from 0 to z²/2σ² of e^(−u) du = 1 − e^(−z²/2σ²),  z ≥ 0,

so that

fZ(z) = (d/dz) FZ(z) = (z/σ²) e^(−z²/2σ²),  z ≥ 0,
a Rayleigh density with parameter σ. Hence, if X and Y are zero-mean, identically distributed, independent Gaussian random variables, then Z = √(X² + Y²) is Rayleigh distributed.
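A one-line simulation confirms the distributional claim (σ = 2 is an arbitrary illustrative choice):

    import numpy as np
    from scipy import stats

    # Z = sqrt(X^2 + Y^2) for independent N(0, sigma^2) components
    # should pass a goodness-of-fit test against Rayleigh(sigma).
    sigma = 2.0
    rng = np.random.default_rng(3)
    x, y = sigma * rng.standard_normal((2, 100_000))
    z = np.hypot(x, y)
    print(stats.kstest(z, "rayleigh", args=(0, sigma)))  # large p-value expected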
Two Functions of Two Random Variables
Given random variables X and Y and functions z = g(x,y), w = h(x,y), we form the new random
variables
Z = g(X,Y) (4-57)
W = h(X,Y).
Express the joint statistics of Z, W in terms of functions g, h and fXY. To accomplish this, define the region

Dzw = { (x, y) : g(x, y) ≤ z, h(x, y) ≤ w },

so that FZW(z, w) = P[(X, Y) ∈ Dzw].

Example 4-16: Consider independent Gaussian X and Y with the joint density function

fXY(x, y) = ( 1/2πσ² ) exp[ −(x² + y²)/2σ² ].

Define

Z = √(X² + Y²)
W = Y/X.
Dzw is the shaded region on Figure 4-16. The figure is drawn for the case w > 0 (the case w < 0 gives results that are identical to those given below). In polar coordinates, the constraint Y/X ≤ w confines θ to two sectors of angular width Tan⁻¹(w) + π/2 each; by circular symmetry, the two sectors contribute equally, so integrating over Dzw gives

FZW(z, w) = 2 ( 1/2πσ² ) ∫ from −π/2 to Tan⁻¹(w) ∫ from 0 to z of e^(−r²/2σ²) r dr dθ,

which leads to

FZW(z, w) = [ 1/2 + (1/π) Tan⁻¹(w) ][ 1 − e^(−z²/2σ²) ],  z ≥ 0, −∞ < w < ∞   (4-60)
          = 0,  z < 0.

The marginal distributions are

FZ(z) = { 1 − e^(−z²/2σ²) } U(z)
FW(w) = 1/2 + (1/π) Tan⁻¹(w),  −∞ < w < ∞.   (4-61)
Note that Z and W are independent, Z is Rayleigh distributed and W is Cauchy distributed.
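Both claims, and the independence, can be probed by simulation; a minimal sketch with σ = 1:

    import numpy as np
    from scipy import stats

    # W = Y/X should be standard Cauchy; Z and W should be independent.
    rng = np.random.default_rng(4)
    x, y = rng.standard_normal((2, 100_000))
    z, w = np.hypot(x, y), y / x

    print(stats.kstest(w, "cauchy"))   # Cauchy goodness-of-fit
    # Crude independence probe: the joint probability should factor.
    pz, pw = np.mean(z <= 1.0), np.mean(np.abs(w) <= 1.0)
    pjoint = np.mean((z <= 1.0) & (np.abs(w) <= 1.0))
    print(pjoint, pz * pw)             # approximately equal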
Joint Density Transformations: Determine fZW Directly in Terms of fXY.
Let X and Y be random variables with joint density fXY(x,y). Let
z = g(x,y) (4-62)
w = h(x,y)
be (generally nonlinear) functions that relate algebraic variables x, y to the algebraic variables z,
w. Also, we assume that g and h have continuous first-partial derivatives at the point (x,y) used
below. Now, define the new random variables
Z = g(X,Y) (4-63)
W = h(X,Y).
In this section, we provide a method for determining the joint density fZW(z,w) directly in terms
of the known joint density fXY(x,y).
First, consider the relatively simple case where (4-62) can be inverted. That is, it is
possible to solve (4-62) for unique functions
x = φ(z, w)
y = ψ(z, w).   (4-64)

In the z-w plane, consider an infinitesimal rectangle R1 with the four corner points

P1 = (z, w)
P2 = (z, w + dw)
P3 = (z + dz, w + dw)   (4-65)
P4 = (z + dz, w).
[Figure 4-17: the infinitesimal rectangle R1 in the z-w plane and its image under the inverse map, the parallelogram R2 with vertices P1′-P4′ in the x-y plane.]
The z-w plane infinitesimal rectangle R1 gets mapped into the x-y plane, where it shows up as
parallelogram R2. As shown on the x-y plane of Figure 4-17, to first-order in dw and dz,
P1′ = (x, y)
P2′ = ( x + (∂φ/∂w)dw, y + (∂ψ/∂w)dw )
P3′ = ( x + (∂φ/∂z)dz + (∂φ/∂w)dw, y + (∂ψ/∂z)dz + (∂ψ/∂w)dw )   (4-66)
P4′ = ( x + (∂φ/∂z)dz, y + (∂ψ/∂z)dz ).
The requirement that (4-64) have continuous first-partial derivatives was used to write (4-66). Note that P1 maps to P1′, P2 maps to P2′, etc. (it is easy to show that P2′ − P1′ = P3′ − P4′ and P4′ − P1′ = P3′ − P2′, so that we have a parallelogram in the x-y plane). Denote the area of the x-y plane parallelogram R2 as AREA(R2).

If random variables Z, W fall in the z-w plane infinitesimal rectangle R1, then the random variables X, Y must fall in the x-y plane parallelogram R2, and vice-versa. In fact, we can claim

fZW(z, w) AREA(R1) ≈ fXY(x, y) AREA(R2),   (4-67)

where the approximation becomes exact as dz and dw approach zero. Since AREA(R1) = dzdw, Equation (4-67) yields the desired fZW once an expression for AREA(R2) is obtained.
[Figure 4-18: parallelogram R2 with vector sides P1′P2′ and P1′P4′.]

Figure 4-18 depicts the x-y plane parallelogram R2 for which area AREA(R2) must be obtained. This parallelogram has sides P1′P2′ and P1′P4′ (shown as vectors with arrow heads on Figure 4-18) that can be represented as
P1′P4′ = (∂φ/∂z) dz î + (∂ψ/∂z) dz ĵ
P1′P2′ = (∂φ/∂w) dw î + (∂ψ/∂w) dw ĵ,   (4-68)

where î and ĵ are unit vectors in the x and y directions, respectively. Now, the vector cross product of sides P1′P4′ and P1′P2′ is denoted as P1′P4′ × P1′P2′. And, the area of parallelogram R2 is the magnitude |P1′P4′| |P1′P2′| sin(θ) = |P1′P4′ × P1′P2′|, where θ is the positive angle between the vectors. Since î × ĵ = k̂, ĵ × î = −k̂, and ĵ × ĵ = î × î = k̂ × k̂ = 0, we write

P1′P4′ × P1′P2′ = det [ î            ĵ            k̂
                        (∂φ/∂z)dz    (∂ψ/∂z)dz    0
                        (∂φ/∂w)dw    (∂ψ/∂w)dw    0 ],

so that

AREA(R2) = | P1′P4′ × P1′P2′ | = | det [ ∂φ/∂z   ∂ψ/∂z
                                          ∂φ/∂w   ∂ψ/∂w ] | dz dw.   (4-69)
In the literature, the last determinant on the right-hand-side of (4-69) is called the Jacobian of the transformation (4-64); symbolically, it is denoted as J(x, y / z, w); instead, the notation ∂(x, y)/∂(z, w) may be used. We write

J(x, y / z, w) = ∂(x, y)/∂(z, w) = det [ ∂x/∂z   ∂x/∂w
                                          ∂y/∂z   ∂y/∂w ],  where x = φ(z, w), y = ψ(z, w).   (4-70)

Finally, substitute (4-69) into (4-67), cancel out the dzdw term that is common to both sides, and obtain the desired result

fZW(z, w) = fXY(x, y) | ∂(x, y)/∂(z, w) |,  evaluated at x = φ(z, w), y = ψ(z, w),   (4-71)
a formula for the density fZW in terms of the density fXY. It is possible to obtain (4-71) directly
from the change of variable formula in multi-dimensional integrals; this fact is discussed briefly
in Appendix 4A.
It is useful to think of (4-69) as

AREA(R2) = | ∂(x, y)/∂(z, w) | AREA(R1),   (4-72)
a relationship between AREA(R2) and AREA(R1). So, the Jacobian can be thought of as the
“area gain” imposed by the transformation (the Jacobian shows how area is scaled by the
transformation).
By considering the mapping of a rectangle on the x-y plane to a parallelogram on the z-w plane (i.e., in the argument just given, switch planes so that the rectangle is in the x-y plane and the parallelogram is in the z-w plane), it is not difficult to show

fXY(x, y) = fZW(z, w) | ∂(z, w)/∂(x, y) |,   (4-73)

where (x, y) and (z, w) are related by (4-62) and (4-64). Now, substitute (4-73) into (4-71) to obtain

fZW(z, w) = fZW(z, w) | ∂(z, w)/∂(x, y) | | ∂(x, y)/∂(z, w) |,   (4-74)

so that the two Jacobians are reciprocals of one another,

∂(x, y)/∂(z, w) = [ ∂(z, w)/∂(x, y) ]⁻¹.   (4-75)

Hence, (4-71) can also be written as

fZW(z, w) = fXY(x, y) / | ∂(z, w)/∂(x, y) |,  evaluated at x = φ(z, w), y = ψ(z, w),   (4-76)
Suppose, instead, that for a given (z, w) the system (4-62) has n roots

x_k = φ_k(z, w),  y_k = ψ_k(z, w),

for each root, 1 ≤ k ≤ n. For this case, a simple extension of (4-71) leads to

fZW(z, w) = Σ from k = 1 to n of fXY(x, y) | ∂(x, y)/∂(z, w) |,  evaluated at (x, y) = (x_k, y_k),   (4-78)

or, equivalently,

fZW(z, w) = Σ from k = 1 to n of fXY(x, y) / | ∂(z, w)/∂(x, y) |,  evaluated at (x, y) = (x_k, y_k).   (4-79)

That is, to obtain fZW(z, w), we should evaluate the right-hand-side of (4-71) (or (4-76)) at each of the n roots x_k(z, w), y_k(z, w), 1 ≤ k ≤ n, and sum up the results.
Example 4-17: Consider the linear transformation

z = ax + by
w = cx + dy,

or, in matrix form, [z  w]^T = [ a  b ; c  d ][x  y]^T, where ad − bc ≠ 0. The inverse transformation is

x = Az + Bw
y = Cz + Dw,

where A, B, C and D are appropriate constants (can you find A, B, C and D??). Now, compute

∂(z, w)/∂(x, y) = det [ a  b
                         c  d ] = ad − bc.

If X and Y are random variables described by fXY(x, y), the density function for the random variables Z = aX + bY, W = cX + dY is

fZW(z, w) = fXY(Az + Bw, Cz + Dw) / | ad − bc |.
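The Jacobian bookkeeping for this linear case can be reproduced symbolically; a short SymPy sketch:

    import sympy as sp

    # Forward Jacobian of z = ax + by, w = cx + dy, and the inverse map.
    x, y, a, b, c, d = sp.symbols("x y a b c d")
    J = sp.Matrix([a*x + b*y, c*x + d*y]).jacobian([x, y])
    print(J.det())                             # a*d - b*c, the "area gain"

    # The constants A, B, C, D of the inverse transformation:
    print(sp.Matrix([[a, b], [c, d]]).inv())   # entries scaled by 1/(a*d - b*c)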
Example 4-18: Consider X, an n×1, zero-mean Gaussian random vector with positive definite covariance matrix Λx. Define Y = AX, where A is an n×n nonsingular matrix; componentwise,

y_i = Σ from k = 1 to n of a_ik x_k,  1 ≤ i ≤ n.
As discussed previously, the density for X is

f_X(X) = ( 1/( (2π)^(n/2) |Λx|^(1/2) ) ) exp[ −(1/2) X^T Λx⁻¹ X ].

Since A is invertible, we can write fY(Y) as

f_Y(Y) = f_X(X) / | ∂(Y)/∂(X) |,  evaluated at X = A⁻¹Y,

where

∂(Y)/∂(X) = det[A].

Also, the covariance of Y is

ΛY = E[Y Y^T] = E[ (AX)(AX)^T ] = A E[X X^T] A^T = A Λx A^T.

Substituting X = A⁻¹Y into the density for X yields

f_Y(Y) = ( 1/( (2π)^(n/2) |Λx|^(1/2) |det A| ) ) exp[ −(1/2) (A⁻¹Y)^T Λx⁻¹ (A⁻¹Y) ]
       = ( 1/( (2π)^(n/2) |Λx|^(1/2) |det A| ) ) exp[ −(1/2) Y^T (A⁻¹)^T Λx⁻¹ A⁻¹ Y ].

Since (A⁻¹)^T Λx⁻¹ A⁻¹ = (A Λx A^T)⁻¹ = ΛY⁻¹ and |ΛY|^(1/2) = |Λx|^(1/2) |det A|, this result can be rewritten as

f_Y(Y) = ( 1/( (2π)^(n/2) |ΛY|^(1/2) ) ) exp[ −(1/2) Y^T ΛY⁻¹ Y ],
where ΛY = A Λx A^T is the covariance of Gaussian random vector Y. This example leads to the conclusion that a linear transformation of Gaussian random variables produces Gaussian random variables (remember this!!).
Example 4-19 (Polar Coordinates): Consider the transformation

r = √(x² + y²)
θ = Tan⁻¹(y/x)

that is illustrated by Figure 4-19. With the limitation of θ to the (−π, π] range, the transformation has the inverse

x = r cos(θ)
y = r sin(θ).
The Jacobian of the inverse transformation is

∂(x, y)/∂(r, θ) = det [ cos θ   −r sin θ
                         sin θ    r cos θ ] = r,

so that

f_rθ(r, θ) = fXY(x, y) | ∂(x, y)/∂(r, θ) |,  evaluated at x = r cos θ, y = r sin θ
           = r fXY(r cos θ, r sin θ)
for r > 0 and −π < θ ≤ π. Suppose that X and Y are independent, jointly Gaussian, zero mean with a common variance σ². For this case, the above result yields

f_rθ(r, θ) = [ 1/2π ] [ (r/σ²) exp(−r²/2σ²) ] = f_θ(θ) f_r(r).

Note that r and θ are independent, r is Rayleigh and θ is uniform over (−π, π].
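A simulation check of both marginals (σ = 1 for illustration):

    import numpy as np
    from scipy import stats

    # r = sqrt(x^2 + y^2) should be Rayleigh(sigma); theta = atan2(y, x)
    # should be uniform on (-pi, pi].
    sigma = 1.0
    rng = np.random.default_rng(5)
    x, y = sigma * rng.standard_normal((2, 100_000))
    r, theta = np.hypot(x, y), np.arctan2(y, x)
    print(stats.kstest(r, "rayleigh", args=(0, sigma)))
    print(stats.kstest(theta, "uniform", args=(-np.pi, 2 * np.pi)))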
Example 4-20: Consider the random variables Z = g(X, Y) and W = h(X, Y), where

z = g(x, y) = √(x² + y²)
w = h(x, y) = y/x.   (4-80)

Transformation (4-80) has roots (x1, y1) and (x2, y2) given by

x1 = z/√(1 + w²),  y1 = wz/√(1 + w²)
x2 = −x1,  y2 = −y1,   (4-81)

for −∞ < w < ∞ and z ≥ 0; the transformation has no real roots for z < 0. A direct evaluation of
the Jacobian leads to
∂(z, w)/∂(x, y) = det [ ∂z/∂x   ∂z/∂y
                         ∂w/∂x   ∂w/∂y ] = det [ x(x² + y²)^(−1/2)   y(x² + y²)^(−1/2)
                                                  −y/x²               1/x              ],

so that

∂(z, w)/∂(x, y) = (x² + y²)^(−1/2) ( 1 + y²/x² ).   (4-82)

When evaluated at both (x1, y1) and (x2, y2), the Jacobian yields

∂(z, w)/∂(x, y) at (x1, y1) = ∂(z, w)/∂(x, y) at (x2, y2) = (1 + w²)/z.   (4-83)

Hence, (4-79) yields
fZW(z, w) = [ z/(1 + w²) ] [ fXY(x1, y1) + fXY(x2, y2) ],  z ≥ 0, −∞ < w < ∞,   (4-84)
where (x1,y1) and (x2,y2) are given by (4-81). If, for example, X and Y are independent, zero-
mean Gaussian random variables with the joint density
fXY(x, y) = ( 1/2πσ² ) exp[ −(x² + y²)/2σ² ],   (4-85)

then

fZW(z, w) = [ (z/σ²) exp(−z²/2σ²) U(z) ] [ (1/π)/(1 + w²) ] = fZ(z)fW(w),   (4-86)
where
fZ(z) = (z/σ²) exp[ −z²/2σ² ] U(z)
fW(w) = (1/π)/(1 + w²),   (4-87)

so that Z is Rayleigh, W is Cauchy, and Z and W are independent.
Generating a Gaussian Random Vector with Specified Covariance
Let Y be an n×1 vector of independent, zero-mean, unit-variance Gaussian random variables, so that

f(Y) = ( 1/(2π)^(n/2) ) exp[ −(1/2) Y^T Y ].   (4-88)
Now, let A be an n×n nonsingular, real-valued matrix, and consider the linear transformation

X = AY.   (4-89)

The transformation is one-to-one. For every Y there is but one X, and for every X there is but one Y = A⁻¹X. We can express the density of X in terms of the density of Y as
f_x(X) = f_y(Y)/abs[J],  evaluated at Y = A⁻¹X,   (4-90)

where

J = ∂(X)/∂(Y) = det[A].

Hence, we have
f_x(X) = ( 1/|det A| ) f_Y(A⁻¹X)
       = ( 1/( (2π)^(n/2) |det A| ) ) exp[ −(1/2) (A⁻¹X)^T A⁻¹X ]   (4-92)
       = ( 1/( (2π)^(n/2) |det A| ) ) exp[ −(1/2) X^T (A⁻¹)^T A⁻¹X ].

Define Λx ≡ AA^T. Then (A⁻¹)^T A⁻¹ = (AA^T)⁻¹ = Λx⁻¹ and |det A| = |Λx|^(1/2), so that

f_x(X) = ( 1/( (2π)^(n/2) |Λx|^(1/2) ) ) exp[ −(1/2) X^T Λx⁻¹ X ],   (4-93)

a zero-mean Gaussian density with covariance matrix

Λx = AA^T.   (4-94)

Hence, to generate a zero-mean Gaussian vector X with a specified covariance Λx, it suffices to find a matrix A with AA^T = Λx and set X = AY.
The solution to this problem comes from linear algebra. Given any positive definite symmetric matrix Λx, there exists a nonsingular matrix P such that

P^T Λx P = I,   (4-96)

which means that Λx = (P^T)⁻¹P⁻¹ = (P⁻¹)^T P⁻¹ (we say that Λx is congruent to I). Compare this to the result given above to see that matrix A can be found by using

A = (P^T)⁻¹.
Matrix P can be computed by reducing [Λx ⋮ I] with elementary row operations, each paired with the corresponding column operation applied to the Λx block; when the left block becomes I, the right block is P^T. As an example, consider

Λx = [ 1  2
       2  5 ].

1) Form [Λx ⋮ I] = [ 1  2 ⋮ 1  0
                     2  5 ⋮ 0  1 ].

2) Add −2 times the 1st row to the 2nd row, then −2 times the 1st column to the 2nd column (of the left block):

[ 1  0 ⋮  1  0
  0  1 ⋮ −2  1 ] = [ I ⋮ P^T ].

3) P^T = [  1  0
           −2  1 ].

4) A = (P^T)⁻¹ = [ 1  0
                   2  1 ],

and it is easily verified that AA^T = Λx.
As a second example, consider

Λx = [ 2  0  3
       0  1  0
       3  0  10 ].

Form [Λx ⋮ I]. Add −3/2 times the 1st row to the 3rd row, and add −3/2 times the 1st column to the 3rd column (of the left block); this reduces the left block to diag(2, 1, 11/2). Then scale the 1st row and column by 1/√2, and the 3rd row and column by √(2/11), to obtain [I ⋮ P^T]. Finally, compute

A = (P^T)⁻¹ = [ √2    0  0
                0     1  0
                3/√2  0  √(11/2) ],

which satisfies AA^T = Λx.
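As a numerical cross-check, note that any A with AA^T = Λx will do; the Cholesky factor is one convenient choice, and for this Λx it coincides with the A found above. A minimal NumPy sketch:

    import numpy as np

    # Factor Lambda_x = A A^T and generate correlated Gaussian samples X = A Y.
    Lx = np.array([[2.0, 0.0, 3.0],
                   [0.0, 1.0, 0.0],
                   [3.0, 0.0, 10.0]])
    A = np.linalg.cholesky(Lx)     # lower-triangular, with A @ A.T == Lx
    print(A)                       # [[sqrt(2),0,0],[0,1,0],[3/sqrt(2),0,sqrt(11/2)]]
    print(np.allclose(A @ A.T, Lx))

    rng = np.random.default_rng(6)
    Y = rng.standard_normal((3, 50_000))   # unit-variance components
    X = A @ Y
    print(np.cov(X))               # approximately Lambda_x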