Joint Probability Functions
Similarly,

F_Y(b) = lim_{t→∞} F(t, b) = F(∞, b)

The marginal probability mass functions are obtained by summing the joint one over the other variable:

p_X(x) = ∑_{y: p(x,y)>0} p(x, y)

and

p_Y(y) = ∑_{x: p(x,y)>0} p(x, y)
EXAMPLE: We flip a fair coin twice. Let X be 1 if head on first flip, 0 if tail on first. Let Y be the number of heads. Find p(x, y) and p_X, p_Y.
Solution: The ranges for X and Y are {0, 1}, {0, 1, 2}, respectively. We have
p(0, 0) = P (X = 0, Y = 0) = P (X = 0)P (Y = 0 | X = 0) = (1/2)(1/2) = 1/4
p(0, 1) = P (X = 0, Y = 1) = P (X = 0)P (Y = 1 | X = 0) = (1/2)(1/2) = 1/4
p(0, 2) = P (X = 0, Y = 2) = 0
p(1, 0) = P (X = 1, Y = 0) = 0
p(1, 1) = P (X = 1, Y = 1) = P (X = 1)P (Y = 1 | X = 1) = (1/2)(1/2) = 1/4
p(1, 2) = P (X = 1, Y = 2) = P (X = 1)P (Y = 2 | X = 1) = (1/2)(1/2) = 1/4
We can collect these values in a table:

x \ y                  |  0   |  1   |  2   | Row sum = P{X = x}
-----------------------+------+------+------+-------------------
0                      | 1/4  | 1/4  |  0   | 1/2
1                      |  0   | 1/4  | 1/4  | 1/2
-----------------------+------+------+------+-------------------
Column sum = P{Y = y}  | 1/4  | 1/2  | 1/4  |
Further,

p_X(0) = ∑_{y=0}^{2} p(0, y) = 1/4 + 1/4 + 0 = 1/2

p_X(1) = ∑_{y=0}^{2} p(1, y) = 0 + 1/4 + 1/4 = 1/2

p_Y(0) = ∑_{x=0}^{1} p(x, 0) = 1/4 + 0 = 1/4

p_Y(1) = ∑_{x=0}^{1} p(x, 1) = 1/4 + 1/4 = 1/2

p_Y(2) = ∑_{x=0}^{1} p(x, 2) = 0 + 1/4 = 1/4
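The table and marginal sums above can be checked with a short enumeration. This is a sketch; the dictionaries `joint`, `pX`, `pY` are my own names, not from the text:

```python
from fractions import Fraction
from itertools import product

# Two fair coin flips: X = 1 if the first flip is heads, Y = number of heads.
joint = {}
for first, second in product([0, 1], repeat=2):  # 1 = heads, 0 = tails
    x, y = first, first + second
    joint[(x, y)] = joint.get((x, y), Fraction(0)) + Fraction(1, 4)

# Marginals: sum the joint pmf over the other variable.
pX = {x: sum(p for (a, _), p in joint.items() if a == x) for x in (0, 1)}
pY = {y: sum(p for (_, b), p in joint.items() if b == y) for y in (0, 1, 2)}

print(pX)  # {0: Fraction(1, 2), 1: Fraction(1, 2)}
print(pY)  # {0: Fraction(1, 4), 1: Fraction(1, 2), 2: Fraction(1, 4)}
```

Note that the impossible pairs (0, 2) and (1, 0) never appear as keys of `joint`, matching the zeros in the table.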
DEFINITION: We say that X and Y are jointly continuous if there exists a function f(x, y) defined for all x, y such that for any C ⊆ R² we have

P{(X, Y) ∈ C} = ∬_{(x,y)∈C} f(x, y) dx dy

In particular, taking C = {(x, y) : x ∈ A, y ∈ B} gives

P{X ∈ A, Y ∈ B} = ∫_B ∫_A f(x, y) dx dy
Also,

F(a, b) = ∫_{−∞}^{b} ∫_{−∞}^{a} f(x, y) dx dy   and   f(a, b) = ∂²F(a, b)/∂a∂b
Similar to before, we view the joint density as a measure of the likelihood that the random
vector (X, Y ) will be in the vicinity of (a, b). As before,
P{a < X < a + da, b < Y < b + db} = ∫_b^{b+db} ∫_a^{a+da} f(x, y) dx dy ≈ f(a, b) da db

for small da and db. The marginal densities are obtained by integrating out the other variable: for any set A,

P{X ∈ A} = ∫_A ∫_{−∞}^{∞} f(x, y) dy dx = ∫_A f_X(x) dx

where

f_X(x) = ∫_{−∞}^{∞} f(x, y) dy   and, similarly,   f_Y(y) = ∫_{−∞}^{∞} f(x, y) dx
EXAMPLE: Let the joint density of X and Y be f(x, y) = 2e^{−x}e^{−2y} for x, y > 0 (and 0 otherwise). Then

f_X(x) = ∫_0^∞ f(x, y) dy = ∫_0^∞ 2e^{−x}e^{−2y} dy = e^{−x}, x > 0
Next,

P{X > 1, Y < 1} = ∫_0^1 ∫_1^∞ 2e^{−x}e^{−2y} dx dy = ∫_0^1 2e^{−2y}e^{−1} dy = e^{−1}(1 − e^{−2})
Finally,

P{X < Y} = ∬_{(x,y): x<y} f(x, y) dx dy = ∫_0^∞ ∫_0^y 2e^{−x}e^{−2y} dx dy = ∫_0^∞ 2e^{−2y}(1 − e^{−y}) dy

= ∫_0^∞ 2e^{−2y} dy − ∫_0^∞ 2e^{−3y} dy = 1 − 2/3 = 1/3
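As a sanity check on the two probabilities just computed, one can simulate from this density: since f factors as e^{−x} · 2e^{−2y}, X and Y are independent with X exponential of rate 1 and Y exponential of rate 2. A Monte Carlo sketch (sample size and seed are arbitrary choices of mine):

```python
import math
import random

# f(x, y) = 2 e^{-x} e^{-2y} factors, so sample X ~ Exp(rate 1) and
# Y ~ Exp(rate 2) independently and estimate the two probabilities.
random.seed(0)
n = 200_000
xs = [random.expovariate(1.0) for _ in range(n)]
ys = [random.expovariate(2.0) for _ in range(n)]

est_a = sum(1 for x, y in zip(xs, ys) if x > 1 and y < 1) / n  # P{X>1, Y<1}
est_b = sum(1 for x, y in zip(xs, ys) if x < y) / n            # P{X<Y}

print(est_a, math.exp(-1) * (1 - math.exp(-2)))  # both ≈ 0.318
print(est_b, 1 / 3)                              # both ≈ 0.333
```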
Everything extends to more than two random variables in the obvious way. For example, for n
random variables we have
F(a1, a2, . . . , an) = P{X1 ≤ a1, X2 ≤ a2, . . . , Xn ≤ an}
Also, joint probability mass functions and joint densities are defined in the obvious manner.
DEFINITION: We say that X and Y are independent if for all sets A and B

P{X ∈ A, Y ∈ B} = P{X ∈ A}P{Y ∈ B}   (1)
It's easy to see that for discrete random variables condition (1) is satisfied if and only if

p(x, y) = p_X(x)p_Y(y)   (2)

for all x, y, where p(x, y) = P{X = x, Y = y} is the joint probability mass function.
Proof: Note that (2) follows immediately from condition (1) by taking A = {x} and B = {y}. For the other direction, suppose (2) holds; then for any A and B we have

P{X ∈ A, Y ∈ B} = ∑_{y∈B} ∑_{x∈A} p(x, y) = ∑_{y∈B} ∑_{x∈A} p_X(x)p_Y(y) = ∑_{y∈B} p_Y(y) ∑_{x∈A} p_X(x) = P{Y ∈ B}P{X ∈ A}
More generally, X1, . . . , Xn are independent if for all sets A1, . . . , An

P{X1 ∈ A1, . . . , Xn ∈ An} = ∏_{i=1}^{n} P{Xi ∈ Ai}

The analogous conditions for the joint probability mass functions (in the discrete case) and joint densities (in the continuous case) hold. That is, in both cases independence is equivalent to the joint probability mass function or density being equal to the product of the respective marginals.
EXAMPLES:
1. Let the joint density of X and Y be given by

f(x, y) = 6e^{−2x}e^{−3y} for x, y > 0

(and f(x, y) = 0 otherwise). Are X and Y independent?

Solution: For X we have

f_X(x) = ∫_0^∞ 6e^{−2x}e^{−3y} dy = 2e^{−2x}, x > 0

For Y we have

f_Y(y) = ∫_0^∞ 6e^{−2x}e^{−3y} dx = 3e^{−3y}, y > 0

Therefore, f(x, y) = f_X(x)f_Y(y) for all x, y ∈ R (note that the relation also holds if one, or both, of x and y are negative, since then both sides are 0), so X and Y are independent. Also, X is Exp(2) and Y is Exp(3).
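To double-check the marginal just derived, a crude numerical integration of f over y should reproduce 2e^{−2x}. This is a sketch; the function names and the truncation of the infinite range at y = 20 are my own choices:

```python
import math

def f(x, y):
    # Joint density from the example, valid for x, y > 0.
    return 6 * math.exp(-2 * x) * math.exp(-3 * y)

def fX_numeric(x, ymax=20.0, steps=100_000):
    # Trapezoid rule on [0, ymax]; the tail beyond ymax is negligible here.
    h = ymax / steps
    total = 0.5 * (f(x, 0.0) + f(x, ymax))
    total += sum(f(x, k * h) for k in range(1, steps))
    return total * h

x = 0.7
print(fX_numeric(x), 2 * math.exp(-2 * x))  # both ≈ 0.4932
```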
2. Let the joint density of X and Y be given by

f(x, y) = 24xy for x, y > 0, x + y < 1

(and f(x, y) = 0 otherwise). Are X and Y independent?

Solution: For 0 < x < 1,

f_X(x) = ∫_0^{1−x} f(x, y) dy = ∫_0^{1−x} 24xy dy = 12x(1 − x)²

Similarly, for 0 < y < 1,

f_Y(y) = ∫_0^{1−y} f(x, y) dx = ∫_0^{1−y} 24xy dx = 12y(1 − y)²

Therefore, we do not have f(x, y) = f_X(x)f_Y(y) for all x, y, and hence X and Y are not independent.
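The failure of independence here can already be seen from the support: at a point such as (0.6, 0.6), which lies outside the triangle, f is 0 while the product of the marginals is not. A small check (the function names are mine):

```python
def f(x, y):
    # Joint density from example 2: 24xy on the triangle x, y > 0, x + y < 1.
    return 24 * x * y if x > 0 and y > 0 and x + y < 1 else 0.0

def fX(x):
    # Marginal density 12 x (1 - x)^2 on (0, 1).
    return 12 * x * (1 - x) ** 2 if 0 < x < 1 else 0.0

fY = fX  # by the symmetry of the density, fY has the same form

x, y = 0.6, 0.6
print(f(x, y))        # 0.0  (the point lies outside the triangle)
print(fX(x) * fY(y))  # > 0, so f(x, y) != fX(x) fY(y)
```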
3. For i = 1, . . . , n, let Xi ∼ Exp(λi) be independent exponential random variables with parameters λi > 0. What is the distribution of the minimum of the Xi?

Solution: Let Z = min{Xi}. Clearly, the range of Z is t > 0. Therefore, for t > 0,

P{Z > t} = P{min{Xi} > t} = P{X1 > t, X2 > t, . . . , Xn > t}

= P{X1 > t}P{X2 > t} · · · P{Xn > t}   (by independence)

= e^{−λ1 t} e^{−λ2 t} · · · e^{−λn t} = exp{−(∑_{i=1}^{n} λi) t}

therefore Z ∼ Exp(λ1 + . . . + λn).
One natural way to interpret this result is the following: if n alarm clocks are set to go off after an exponentially distributed amount of time (each with a potentially different rate λi), then the time at which the first alarm rings is also exponentially distributed, with parameter equal to the sum of the parameters of all the clocks, that is, ∑_i λi.
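The alarm-clock picture is easy to test by simulation. A sketch with three clocks of rates 1, 2, 3 (arbitrary choices of mine), comparing the empirical tail of the minimum against the Exp(6) tail e^{−6t}:

```python
import math
import random

random.seed(1)
rates = [1.0, 2.0, 3.0]  # lambda_1, lambda_2, lambda_3
n = 100_000
# Each sample: the time at which the first of three independent
# exponential alarms rings.
zs = [min(random.expovariate(r) for r in rates) for _ in range(n)]

t = 0.2
empirical = sum(1 for z in zs if z > t) / n   # estimate of P{Z > t}
theory = math.exp(-sum(rates) * t)            # Exp(6) tail: e^{-6t}
print(empirical, theory)  # both ≈ 0.301
```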