ST 610 Lect 4
The joint pmf can be used to compute the probability of any event
defined in terms of (X, Y ). Let A be any subset of R². Then
$$P((X, Y) \in A) = \sum_{(x,y) \in A} f(x, y).$$
Proof: For any x ∈ R, let Ax = {(x, y) : −∞ < y < ∞}. That is,
Ax is the line in the plane with first coordinate equal to x. Then, for
any x ∈ R,
$$f_X(x) = P(X = x) = P(X = x,\ -\infty < Y < \infty) = P((X, Y) \in A_x) = \sum_{(x,y) \in A_x} f_{X,Y}(x, y) = \sum_{y} f_{X,Y}(x, y).$$
Example 4.1.3 (Marginal pmf for dice) Using the table given in
Example 4.1.2, compute the marginal pmf of Y . Using Theorem
4.1.1, we have
$$f_Y(0) = f_{X,Y}(2, 0) + \cdots + f_{X,Y}(12, 0) = \frac{1}{6}.$$
Similarly, we obtain
$$f_Y(1) = \frac{5}{18}, \quad f_Y(2) = \frac{2}{9}, \quad f_Y(3) = \frac{1}{6}, \quad f_Y(4) = \frac{1}{9}, \quad f_Y(5) = \frac{1}{18}.$$
Notice that $\sum_{i=0}^{5} f_Y(i) = 1$.
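These marginal values can be reproduced by brute-force enumeration. The sketch below assumes, as the computed values suggest, that Example 4.1.2 (whose table is not reproduced here) takes X to be the sum and Y the absolute difference of two fair dice; it also illustrates the event-probability formula $P((X, Y) \in A) = \sum_{(x,y)\in A} f(x,y)$ from the start of the section.

```python
from fractions import Fraction
from collections import defaultdict

# Joint pmf of (X, Y) for two fair dice, where (as the marginal values above
# suggest) X is the sum and Y the absolute difference of the two faces.
# These definitions of X and Y are an assumption; the table of Example 4.1.2
# is not shown here.
joint = defaultdict(Fraction)
for d1 in range(1, 7):
    for d2 in range(1, 7):
        joint[(d1 + d2, abs(d1 - d2))] += Fraction(1, 36)

# Marginal pmf of Y: sum the joint pmf over all values of x.
marginal_Y = defaultdict(Fraction)
for (x, y), p in joint.items():
    marginal_Y[y] += p

print(dict(marginal_Y))          # 1/6, 5/18, 2/9, 1/6, 1/9, 1/18 for y = 0, ..., 5
print(sum(marginal_Y.values()))  # 1

# Probability of an event defined in terms of (X, Y), e.g. A = {X >= 7, Y <= 2}:
print(sum(p for (x, y), p in joint.items() if x >= 7 and y <= 2))
```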
For example, suppose (X, Y) has joint pdf f(x, y) = 6xy², 0 < x < 1, 0 < y < 1, and we want P(X + Y ≥ 1). Write the event as
$$A = \{(x, y) : x + y \ge 1,\ 0 < x < 1,\ 0 < y < 1\} = \{(x, y) : 1 - y \le x < 1,\ 0 < y < 1\}.$$
Thus, we have
$$P(X + Y \ge 1) = \iint_A f(x, y)\,dx\,dy = \int_0^1 \int_{1-y}^{1} 6xy^2\,dx\,dy = \frac{9}{10}.$$
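The double integral is easy to verify symbolically; a minimal sympy sketch:

```python
import sympy as sp

x, y = sp.symbols('x y', positive=True)
f = 6 * x * y**2  # joint pdf on the unit square (0, 1) x (0, 1)

# P(X + Y >= 1): integrate over {1 - y <= x < 1, 0 < y < 1}
prob = sp.integrate(sp.integrate(f, (x, 1 - y, 1)), (y, 0, 1))
print(prob)  # 9/10
```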
The joint cdf is the function F(x, y) defined by
$$F(x, y) = P(X \le x, Y \le y) = \int_{-\infty}^{x} \int_{-\infty}^{y} f(s, t)\,dt\,ds.$$
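Continuing with the same pdf f(x, y) = 6xy² on the unit square (my choice of running example, not a new result), the joint cdf for 0 ≤ x ≤ 1, 0 ≤ y ≤ 1 works out to F(x, y) = x²y³:

```python
import sympy as sp

x, y, s, t = sp.symbols('x y s t', positive=True)
f = 6 * s * t**2  # the joint pdf, written in the dummy variables s, t

# F(x, y) = P(X <= x, Y <= y) for 0 <= x, y <= 1; the pdf vanishes outside
# the unit square, so the lower limits of integration can be taken as 0.
F = sp.integrate(sp.integrate(f, (t, 0, y)), (s, 0, x))
print(sp.simplify(F))  # x**2*y**3
```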
The marginal of X is
$$f_X(x) = \int_{-\infty}^{\infty} f(x, y)\,dy = \int_{x}^{\infty} e^{-y}\,dy = e^{-x}, \qquad x > 0,$$
which integrates to 1, as a marginal pdf must.
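A symbolic check, assuming (as the integration limit suggests) that the joint pdf is f(x, y) = e^{−y} on the region 0 < x < y < ∞:

```python
import sympy as sp

x, y = sp.symbols('x y', positive=True)
f = sp.exp(-y)  # joint pdf on 0 < x < y < infinity (assumed support)

fX = sp.integrate(f, (y, x, sp.oo))      # marginal of X
print(fX)                                # exp(-x)
print(sp.integrate(fX, (x, 0, sp.oo)))   # 1, so fX is a valid pdf
```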
$$f(x, y) = g(x)h(y).$$
Proof: The "only if" part is proved by defining g(x) = fX(x) and
h(y) = fY(y). To prove the "if" part for continuous random variables,
suppose that f(x, y) = g(x)h(y). Define
$$\int_{-\infty}^{\infty} g(x)\,dx = c \quad \text{and} \quad \int_{-\infty}^{\infty} h(y)\,dy = d.$$
Then, since f(x, y) is a joint pdf,
$$cd = \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} g(x)h(y)\,dx\,dy = \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} f(x, y)\,dx\,dy = 1.$$
Furthermore, the marginal pdfs are
$$f_X(x) = \int_{-\infty}^{\infty} g(x)h(y)\,dy = g(x)d$$
and
$$f_Y(y) = \int_{-\infty}^{\infty} g(x)h(y)\,dx = h(y)c.$$
Thus f(x, y) = g(x)h(y) = g(x)h(y)cd = (g(x)d)(h(y)c) = fX(x)fY(y), so X and Y are independent.
Thus, with g(x) defined as above (the factor of f depending only on x) and
$$h(y) = \begin{cases} y^4 e^{-y}/384, & y > 0, \\ 0, & y \le 0, \end{cases}$$
we have f(x, y) = g(x)h(y) for all x ∈ R and all y ∈ R. By Lemma
4.2.1, we conclude that X and Y are independent random variables.
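To see the lemma in action, here is a minimal sympy sketch with a hypothetical product-form density (g and h below are chosen arbitrarily for illustration; they are not the factors of this example). It checks that cd = 1, that the marginals are g(x)d and h(y)c, and that the joint pdf factors into the product of its marginals:

```python
import sympy as sp

x, y = sp.symbols('x y', positive=True)

# Hypothetical factors: any nonnegative g, h with finite integrals work,
# provided g(x)h(y) integrates to 1 over the whole plane.
g = sp.exp(-x / 2)          # integrates to c = 2 over x > 0
h = y * sp.exp(-y) / 2      # integrates to d = 1/2 over y > 0
f = g * h                   # candidate joint pdf on x > 0, y > 0

c = sp.integrate(g, (x, 0, sp.oo))
d = sp.integrate(h, (y, 0, sp.oo))
print(c, d, c * d)          # 2 1/2 1  (so f is a valid joint pdf)

fX = sp.integrate(f, (y, 0, sp.oo))   # marginal of X
fY = sp.integrate(f, (x, 0, sp.oo))   # marginal of Y
print(sp.simplify(fX - g * d))        # 0, i.e. fX(x) = g(x)d
print(sp.simplify(fY - h * c))        # 0, i.e. fY(y) = h(y)c
print(sp.simplify(f - fX * fY))       # 0, i.e. X and Y are independent
```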
$$= (Eg(X))(Eh(Y)).$$
The result for discrete random variables is proved by replacing integrals
with sums.
Part (a) can be proved similarly. Let g(x) be the indicator function
of the set A and let h(y) be the indicator function of the set B. Note
that g(x)h(y) is the indicator function of the set C ⊂ R² defined by
C = {(x, y) : x ∈ A, y ∈ B}. Then
$$P(X \in A, Y \in B) = P((X, Y) \in C) = E(g(X)h(Y)) = (Eg(X))(Eh(Y)) = P(X \in A)P(Y \in B). \qquad \square$$
Proof:
Hence, Z ∼ N(µ + γ, σ² + τ²). □
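A quick Monte Carlo check of this fact (the particular parameter values below are arbitrary illustrations):

```python
import numpy as np

rng = np.random.default_rng(0)
mu, sigma = 1.0, 2.0     # X ~ N(mu, sigma^2), illustrative values
gamma, tau = -0.5, 1.5   # Y ~ N(gamma, tau^2), independent of X

x = rng.normal(mu, sigma, size=1_000_000)
y = rng.normal(gamma, tau, size=1_000_000)
z = x + y

print(z.mean(), mu + gamma)        # both near 0.5
print(z.var(), sigma**2 + tau**2)  # both near 6.25
```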
If (X, Y ) is a continuous random vector with joint pdf fX,Y (x, y),
then the joint pdf of (U, V ) can be expressed in terms of fX,Y (x, y) in
a similar way. As before, let A = {(x, y) : fX,Y (x, y) > 0} and B =
{(u, v) : u = g1(x, y) and v = g2(x, y) for some (x, y) ∈ A}. For the
simplest version of this result, we assume the transformation u =
g1(x, y) and v = g2(x, y) defines a one-to-one transformation of A
to B. For such a one-to-one, onto transformation, we can solve the
equations u = g1(x, y) and v = g2(x, y) for x and y in terms of u and
v. We will denote this inverse transformation by x = h1(u, v) and
y = h2(u, v). The role played by a derivative in the univariate case is
now played by a quantity called the Jacobian of the transformation.
It is defined by
$$J = \begin{vmatrix} \dfrac{\partial x}{\partial u} & \dfrac{\partial x}{\partial v} \\[6pt] \dfrac{\partial y}{\partial u} & \dfrac{\partial y}{\partial v} \end{vmatrix} = \frac{\partial x}{\partial u}\frac{\partial y}{\partial v} - \frac{\partial x}{\partial v}\frac{\partial y}{\partial u},$$
where $\frac{\partial x}{\partial u} = \frac{\partial h_1(u,v)}{\partial u}$, $\frac{\partial x}{\partial v} = \frac{\partial h_1(u,v)}{\partial v}$, $\frac{\partial y}{\partial u} = \frac{\partial h_2(u,v)}{\partial u}$, and $\frac{\partial y}{\partial v} = \frac{\partial h_2(u,v)}{\partial v}$.
$$f_{X,Y}(x, y) = (2\pi)^{-1} \exp(-x^2/2)\exp(-y^2/2), \qquad -\infty < x < \infty,\ -\infty < y < \infty.$$
$$x = h_1(u, v) = \frac{u + v}{2} \quad \text{and} \quad y = h_2(u, v) = \frac{u - v}{2}.$$
$$f_{U,V}(u, v) = f_{X,Y}(h_1(u, v), h_2(u, v))\,|J| = \frac{1}{2\pi}\, e^{-((u+v)/2)^2/2}\, e^{-((u-v)/2)^2/2} \cdot \frac{1}{2}.$$
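Carrying the algebra one step further, this product simplifies to (1/(4π)) e^{−(u²+v²)/4}, i.e. U and V are independent N(0, 2) random variables (this last observation is added here, and is easy to confirm symbolically):

```python
import sympy as sp

u, v = sp.symbols('u v', real=True)

# Inverse transformation for U = X + Y, V = X - Y.
h1 = (u + v) / 2   # x = h1(u, v)
h2 = (u - v) / 2   # y = h2(u, v)

# Jacobian of the inverse: determinant of [[dx/du, dx/dv], [dy/du, dy/dv]].
J = sp.Matrix([h1, h2]).jacobian([u, v]).det()
print(J)           # -1/2, so |J| = 1/2

# f_{U,V}(u, v) = f_{X,Y}(h1, h2) |J|
fUV = (1 / (2 * sp.pi)) * sp.exp(-h1**2 / 2) * sp.exp(-h2**2 / 2) * sp.Abs(J)

# Compare with the product of two N(0, 2) densities.
target = sp.exp(-u**2 / 4) / sp.sqrt(4 * sp.pi) * sp.exp(-v**2 / 4) / sp.sqrt(4 * sp.pi)
print(fUV.equals(target))  # True: U and V are independent N(0, 2)
```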
$$F_{U,V}(u, v) = P(U \le u, V \le v) = P(X \in A_u, Y \in B_v) = P(X \in A_u)\,P(Y \in B_v).$$
and
$$f_{X,Y}(x, y) = \frac{1}{2\pi}\, e^{-x^2/2}\, e^{-y^2/2},$$
we have
$$f_{U,V}(u, v) = \frac{1}{2\pi}\, e^{-(uv)^2/2}\, e^{-v^2/2}\,|v| + \frac{1}{2\pi}\, e^{-(-uv)^2/2}\, e^{-(-v)^2/2}\,|v| = \frac{v}{\pi}\, e^{-(u^2+1)v^2/2}, \qquad -\infty < u < \infty,\ 0 < v < \infty.$$
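Integrating out v gives the marginal of U (an observation added here; the two-branch inverse above, x = ±uv and y = ±v, corresponds to the transformation U = X/Y, V = |Y|): U has the standard Cauchy density 1/(π(1 + u²)).

```python
import sympy as sp

u = sp.Symbol('u', real=True)
v = sp.Symbol('v', positive=True)

fUV = (v / sp.pi) * sp.exp(-(u**2 + 1) * v**2 / 2)

fU = sp.integrate(fUV, (v, 0, sp.oo))  # marginal pdf of U
print(sp.simplify(fU))                 # equals 1/(pi*(u**2 + 1)), the standard Cauchy pdf
```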
a hierarchical model. Using the identity
$$EX = E(E(X|Y)),$$
together with E(X|Y) = pY and EY = λ, we obtain
$$EX = E(E(X|Y)) = E(pY) = p\lambda.$$
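A Monte Carlo sketch of this hierarchy (assuming, consistently with E(X|Y) = pY and EY = λ, that X|Y ∼ binomial(Y, p) with Y ∼ Poisson(λ); the numerical values of p and λ are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(0)
lam, p = 4.0, 0.3                    # illustrative parameter values
n_sim = 1_000_000

y = rng.poisson(lam, size=n_sim)     # Y ~ Poisson(lambda)
x = rng.binomial(y, p)               # X | Y ~ binomial(Y, p)

print(x.mean(), p * lam)             # both near 1.2
```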
Since E[E(X|Y)] = EX, we have
$$E\big([E(X|Y) - EX]^2\big) = \mathrm{Var}(E(X|Y)),$$
one of the two pieces in the conditional variance identity Var X = Var(E(X|Y)) + E(Var(X|Y)).
$$X|P \sim \mathrm{binomial}(n, P), \qquad P \sim \mathrm{beta}(\alpha, \beta).$$
The mean of X is then
$$EX = E[E(X|P)] = E[nP] = \frac{n\alpha}{\alpha + \beta}.$$
Since P ∼ beta(α, β),
$$\mathrm{Var}(E(X|P)) = \mathrm{Var}(nP) = n^2\,\frac{\alpha\beta}{(\alpha + \beta)^2(\alpha + \beta + 1)}.$$
Also, since X|P is binomial(n, P), Var(X|P) = nP(1 − P). We
then have
$$E[\mathrm{Var}(X|P)] = nE[P(1 - P)] = n\,\frac{\Gamma(\alpha + \beta)}{\Gamma(\alpha)\Gamma(\beta)} \int_0^1 p(1 - p)\,p^{\alpha-1}(1 - p)^{\beta-1}\,dp = n\,\frac{\Gamma(\alpha + \beta)}{\Gamma(\alpha)\Gamma(\beta)}\,\frac{\Gamma(\alpha + 1)\Gamma(\beta + 1)}{\Gamma(\alpha + \beta + 2)} = \frac{n\alpha\beta}{(\alpha + \beta)(\alpha + \beta + 1)}.$$
Adding together the two pieces, we get
$$\mathrm{Var}\,X = \frac{n\alpha\beta(\alpha + \beta + n)}{(\alpha + \beta)^2(\alpha + \beta + 1)}.$$
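A Monte Carlo check of both moments of this beta-binomial hierarchy (the parameter values are chosen arbitrarily for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)
n, alpha, beta = 10, 2.0, 3.0           # illustrative values
n_sim = 1_000_000

p = rng.beta(alpha, beta, size=n_sim)   # P ~ beta(alpha, beta)
x = rng.binomial(n, p)                  # X | P ~ binomial(n, P)

mean_theory = n * alpha / (alpha + beta)
var_theory = (n * alpha * beta * (alpha + beta + n)
              / ((alpha + beta) ** 2 * (alpha + beta + 1)))
print(x.mean(), mean_theory)  # both near 4.0
print(x.var(), var_theory)    # both near 6.0
```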
Cov(X, Y ) = EXY − µX µY .
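A quick numerical illustration of this alternative covariance formula (the data-generating choice below is an arbitrary example):

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=1_000_000)
y = 2 * x + rng.normal(size=1_000_000)   # an arbitrary dependent pair

# Cov(X, Y) = E[XY] - (EX)(EY), estimated from the sample:
print(np.mean(x * y) - np.mean(x) * np.mean(y))
# Agrees with the direct covariance estimate (both near 2):
print(np.cov(x, y, bias=True)[0, 1])
```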
a. −1 ≤ ρXY ≤ 1.
For any t,
$$0 \le E\big[((X - \mu_X)t + (Y - \mu_Y))^2\big] = t^2\sigma_X^2 + 2t\,\mathrm{Cov}(X, Y) + \sigma_Y^2.$$
Since this quadratic in t is nonnegative for every t, it has at most one real root, so its discriminant satisfies
$$(2\,\mathrm{Cov}(X, Y))^2 - 4\sigma_X^2\sigma_Y^2 \le 0.$$
This is equivalent to
$$-\sigma_X\sigma_Y \le \mathrm{Cov}(X, Y) \le \sigma_X\sigma_Y.$$
That is,
$$-1 \le \rho_{XY} \le 1.$$
P ((X − µX )t + (Y − µY ) = 0) = 1.
Note that f(x, y) can be obtained from the relationship f(x, y) = f(y|x)f(x).
Then
$$\mathrm{Cov}(X, Y) = EXY - (EX)(EY) = E[X(X^2 + Z)] - (EX)\,E(X^2 + Z) = EX^3 + EXZ - 0 \cdot E(X^2 + Z) = 0.$$
Thus, ρXY = Cov(X, Y)/(σXσY) = 0.
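A simulation sketch of this kind of example (the specific distributions below, X uniform on (−1, 1) and Z an independent uniform on (0, 1/10), are assumptions made for illustration; any X symmetric about 0 and independent of Z behaves the same way): X and Y = X² + Z are uncorrelated even though Y is essentially a function of X.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.uniform(-1, 1, size=1_000_000)   # symmetric about 0, so EX = EX^3 = 0
z = rng.uniform(0, 0.1, size=1_000_000)  # independent noise term
y = x**2 + z

print(np.corrcoef(x, y)[0, 1])     # near 0: X and Y are uncorrelated...
print(np.corrcoef(x**2, y)[0, 1])  # ...yet strongly dependent (near 1)
```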
Assuming (a) and (b) are true, we will prove (c). Let
$$s = \Big(\frac{x - \mu_X}{\sigma_X}\Big)\Big(\frac{y - \mu_Y}{\sigma_Y}\Big) \quad \text{and} \quad t = \frac{x - \mu_X}{\sigma_X}.$$
Then x = σX t + µX , y = (σY s/t) + µY , and the Jacobian of the
transformation is J = σX σY /t. With this change of variables, we
obtain
$$\rho_{XY} = \int_{-\infty}^{\infty}\!\int_{-\infty}^{\infty} s\, f\Big(\sigma_X t + \mu_X,\ \frac{\sigma_Y s}{t} + \mu_Y\Big)\Big|\frac{\sigma_X \sigma_Y}{t}\Big|\,ds\,dt$$
$$= \int_{-\infty}^{\infty}\!\int_{-\infty}^{\infty} s\,\big(2\pi\sigma_X\sigma_Y\sqrt{1 - \rho^2}\big)^{-1} \exp\Big(-\frac{1}{2(1 - \rho^2)}\Big(t^2 - 2\rho s + \big(\tfrac{s}{t}\big)^2\Big)\Big)\Big|\frac{\sigma_X \sigma_Y}{t}\Big|\,ds\,dt$$
$$= \int_{-\infty}^{\infty} \frac{1}{\sqrt{2\pi}}\exp\Big(-\frac{t^2}{2}\Big)\left[\int_{-\infty}^{\infty} \frac{s}{\sqrt{2\pi}\sqrt{(1 - \rho^2)t^2}}\exp\Big(-\frac{(s - \rho t^2)^2}{2(1 - \rho^2)t^2}\Big)\,ds\right]dt.$$
The inner integral is ES, where S is a normal random variable with
ES = ρt2 and VarS = (1 − ρ2)t2. Thus,
$$\rho_{XY} = \int_{-\infty}^{\infty} \frac{\rho t^2}{\sqrt{2\pi}}\exp\{-t^2/2\}\,dt = \rho.$$
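A Monte Carlo check that the parameter ρ of the bivariate normal really is the correlation (the means, standard deviations and ρ below are arbitrary illustrative values):

```python
import numpy as np

rng = np.random.default_rng(0)
mu_x, mu_y = 1.0, -2.0
sigma_x, sigma_y, rho = 2.0, 0.5, 0.6    # illustrative values

cov = [[sigma_x**2, rho * sigma_x * sigma_y],
       [rho * sigma_x * sigma_y, sigma_y**2]]
xy = rng.multivariate_normal([mu_x, mu_y], cov, size=1_000_000)

print(np.corrcoef(xy[:, 0], xy[:, 1])[0, 1])  # near 0.6
```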
and
$$Eg(X) = \sum_{x \in \mathbb{R}^n} g(x) f(x)$$
in the continuous and discrete cases, respectively.
The marginal distribution of (X1, . . . , Xk), the first k coordinates
of (X1, . . . , Xn), is given by the pdf or pmf
$$f(x_1, \ldots, x_k) = \int_{-\infty}^{\infty} \cdots \int_{-\infty}^{\infty} f(x_1, \ldots, x_n)\,dx_{k+1} \cdots dx_n$$
or
$$f(x_1, \ldots, x_k) = \sum_{(x_{k+1}, \ldots, x_n) \in \mathbb{R}^{n-k}} f(x_1, \ldots, x_n).$$
If the Xi’s are all one dimensional, then X1, . . . , Xn are called
mutually independent random variables.
Let (X1, . . . , Xn) be a random vector with pdf fX (x1, . . . , xn). Let
A = {x : fX (x) > 0}. Consider a new random vector (U1, . . . , Un),
defined by U1 = g1(X1, . . . , Xn), . . ., Un = gn(X1, . . . , Xn). Suppose
that A0, A1, . . . , Ak form a partition of A with these properties. The
set A0, which may be empty, satisfies P ((X1, . . . , Xn) ∈ A0) = 0.
The transformation (U1, . . . , Un) = (g1(X), . . . , gn(X)) is a one-to-
one transformation from Ai onto B for each i = 1, 2, . . . , k. Then for
each i, the inverse functions from B to Ai can be found. Denote the
ith inverse by x1 = h1i(u1, . . . , un), . . . , xn = hni(u1, . . . , un). Let
Ji denote the Jacobian computed from the ith inverse. That is,
$$J_i = \begin{vmatrix} \dfrac{\partial h_{1i}(u)}{\partial u_1} & \dfrac{\partial h_{1i}(u)}{\partial u_2} & \cdots & \dfrac{\partial h_{1i}(u)}{\partial u_n} \\[6pt] \dfrac{\partial h_{2i}(u)}{\partial u_1} & \dfrac{\partial h_{2i}(u)}{\partial u_2} & \cdots & \dfrac{\partial h_{2i}(u)}{\partial u_n} \\[6pt] \vdots & \vdots & \ddots & \vdots \\[6pt] \dfrac{\partial h_{ni}(u)}{\partial u_1} & \dfrac{\partial h_{ni}(u)}{\partial u_2} & \cdots & \dfrac{\partial h_{ni}(u)}{\partial u_n} \end{vmatrix},$$
the determinant of an n × n matrix. Assuming that these Jacobians
do not vanish identically on B, we have the following representation
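As a concrete, purely illustrative example of computing such a Jacobian, the sketch below uses sympy for the hypothetical 3-dimensional transformation u1 = x1 + x2 + x3, u2 = x2 + x3, u3 = x3, whose single inverse is x1 = u1 − u2, x2 = u2 − u3, x3 = u3 (here the transformation is one-to-one on all of A, so the partition has a single piece):

```python
import sympy as sp

u1, u2, u3 = sp.symbols('u1 u2 u3', real=True)

# Inverse of the (hypothetical) transformation u1 = x1+x2+x3, u2 = x2+x3, u3 = x3.
h1 = u1 - u2   # x1 = h1(u1, u2, u3)
h2 = u2 - u3   # x2 = h2(u1, u2, u3)
h3 = u3        # x3 = h3(u1, u2, u3)

# Jacobian: determinant of the n x n matrix of partials of the inverse functions.
J = sp.Matrix([h1, h2, h3]).jacobian([u1, u2, u3]).det()
print(J)  # 1
```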