ST2334 Chapter 3 Slides

Chapter 3: Joint Distributions

1 JOINT DISTRIBUTIONS FOR MULTIPLE RANDOM VARIABLES

• Very often, we are interested in more than one random variable
  simultaneously.

• For example, an investigator might be interested in both the height
  (H) and the weight (W) of an individual from a certain population.

• Another investigator could be interested in both the hardness (H)
  and the tensile strength (T) of a piece of cold-drawn copper.

DEFINITION 1

• Let E be an experiment and S be a corresponding sample space.

• Let X and Y be two functions, each assigning a real number to each
  s ∈ S.

• We call (X,Y) a two-dimensional random vector, or a two-dimensional
  random variable.
Similarly to the one-dimensional situation, we denote the range space
of (X,Y) by

    RX,Y = {(x, y) | x = X(s), y = Y(s), s ∈ S}.

The definition above can be extended to more than two random variables.

DEFINITION 2
Let X1, X2, . . . , Xn be n functions, each assigning a real number to every
outcome s ∈ S. We call (X1, X2, . . . , Xn) an n-dimensional random variable
(or an n-dimensional random vector).

We define the discrete and continuous two-dimensional RVs as follows.

DEFINITION 3
1 (X,Y) is a discrete two-dimensional RV if the number of possible values
of (X(s), Y(s)) is finite or countable. That is, the possible values of
(X(s), Y(s)) may be represented by (xi, yj), i = 1, 2, 3, . . . ; j = 1, 2, 3, . . .

2 (X,Y) is a continuous two-dimensional RV if the possible values of
(X(s), Y(s)) can assume any value in some region of the Euclidean space R².

REMARK
We can view X and Y separately to judge whether (X,Y) is discrete or
continuous.

• If both X and Y are discrete RVs, then (X,Y) is a discrete RV.

• Likewise, if both X and Y are continuous random variables, then
  (X,Y) is a continuous RV.

• Clearly, there are other cases; for example, X is discrete but Y is
  continuous. These are not our focus in this module.
Example 3.1 (Discrete Random Vector)
• Consider a TV set to be serviced.
• Let
X = {age to the nearest year of the set};
Y = {# of defective components in the set}.

• (X,Y ) is a discrete 2-dimensional RV.


• RX,Y = {(x, y)|x = 0, 1, 2, . . . ; y = 0, 1, 2, . . . , n}, where n is the total
number of components in the TV.
• (X,Y ) = (5, 3) means that the TV is 5 years old and has 3 defective
components.
L–example 3.1

• A fast food restaurant operates a drive-up facility and a walk-up
  window.

• On a given day, let

  X = the proportion of time that the drive-up facility is in use;
  Y = the proportion of time that the walk-up window is in use.

• Then RX,Y = {(x, y) | 0 ≤ x ≤ 1, 0 ≤ y ≤ 1}.

• (X,Y) is a continuous 2-dimensional RV.


Joint Probability Function

• We introduce the probability function for the discrete and
  continuous RVs separately.

• For a discrete random vector, similar to the one-dimensional case,
  we define its probability function by associating a number with
  each possible value of the RV.
DEFINITION 4 (JOINT PROBABILITY FUNCTION FOR DISCRETE RV)
Let (X,Y) be a 2-dimensional discrete RV. The joint probability (mass)
function is defined by

    fX,Y(x, y) = P(X = x, Y = y),

for x, y being possible values of X and Y; in other words, (x, y) ∈ RX,Y.
The joint probability mass function has the following properties:

(1) fX,Y(x, y) ≥ 0 for any (x, y) ∈ RX,Y.

(2) fX,Y(x, y) = 0 for any (x, y) ∉ RX,Y.

(3) ∑i ∑j fX,Y(xi, yj) = ∑i ∑j P(X = xi, Y = yj) = 1;
    or equivalently, ∑∑(x,y)∈RX,Y fX,Y(x, y) = 1.

(4) Let A be any subset of RX,Y; then

        P((X,Y) ∈ A) = ∑∑(x,y)∈A fX,Y(x, y).

Example 3.2 Find the value of k such that f (x, y) = kxy for x = 1, 2, 3
and y = 1, 2, 3 can serve as a joint probability function.
Solution: RX,Y = {(x, y)|x = 1, 2, 3; y = 1, 2, 3}.
f (1, 1) = k, f (1, 2) = 2k, f (1, 3) = 3k,
f (2, 1) = 2k, f (2, 2) = 4k, f (2, 3) = 6k,
f (3, 1) = 3k, f (3, 2) = 6k, f (3, 3) = 9k.
Based on property (3), we have
1 = ∑∑(x,y)∈RX,Y f (x, y)
= 1k + 2k + 3k + 2k + 4k + 6k + 3k + 6k + 9k,
which results in k = 1/36.
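The normalization in Example 3.2 can also be checked mechanically. A minimal Python sketch (not part of the slides) using exact fractions:

```python
from fractions import Fraction

# Example 3.2: f(x, y) = k*x*y on x, y ∈ {1, 2, 3}.
# Property (3) forces k * Σ x*y = 1, so k = 1 / Σ x*y.
total = sum(x * y for x in (1, 2, 3) for y in (1, 2, 3))
k = Fraction(1, total)
print(k)  # 1/36

# With this k, the probabilities sum to exactly 1.
assert sum(k * x * y for x in (1, 2, 3) for y in (1, 2, 3)) == 1
```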
L–example 3.2

• A company has 2 production lines, A and B, which produce at most
  5 and 3 machines respectively.

• Let

  X = number of machines produced by line A;
  Y = number of machines produced by line B.

• The joint probability function f(x, y) for (X,Y) is given in the
  table, where each entry represents f(xi, yj) = P(X = xi, Y = yj).

• What is the probability that in a day line A produces more machines
  than line B?
Table for the joint probability function f(x, y)

  y \ x        0     1     2     3     4     5   Row Total
  0         0.00  0.01  0.02  0.05  0.06  0.08       0.22
  1         0.01  0.03  0.04  0.05  0.05  0.07       0.25
  2         0.02  0.03  0.05  0.06  0.06  0.07       0.29
  3         0.02  0.04  0.03  0.04  0.06  0.05       0.24
  Col Total 0.05  0.11  0.14  0.20  0.23  0.27          1
Consider the event

A = {line A produces more machines than line B} = {X > Y }.

Then we have

    P(A) = P(X > Y)
         = P((X,Y) = (1, 0) or (X,Y) = (2, 0) or (X,Y) = (2, 1) or
             . . . or (X,Y) = (5, 3))
         = P((X,Y) = (1, 0)) + . . . + P((X,Y) = (5, 3))
         = f(1, 0) + f(2, 0) + . . . + f(5, 3) = 0.73.
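The event sum above is exactly property (4) applied to A = {x > y}; a quick sketch with the table transcribed as a Python dict keyed by (x, y):

```python
# Joint p.f. from the table: f[(x, y)] = P(X = x, Y = y).
f = {
    (0, 0): 0.00, (1, 0): 0.01, (2, 0): 0.02, (3, 0): 0.05, (4, 0): 0.06, (5, 0): 0.08,
    (0, 1): 0.01, (1, 1): 0.03, (2, 1): 0.04, (3, 1): 0.05, (4, 1): 0.05, (5, 1): 0.07,
    (0, 2): 0.02, (1, 2): 0.03, (2, 2): 0.05, (3, 2): 0.06, (4, 2): 0.06, (5, 2): 0.07,
    (0, 3): 0.02, (1, 3): 0.04, (2, 3): 0.03, (3, 3): 0.04, (4, 3): 0.06, (5, 3): 0.05,
}
# Property (4): sum f over the event A = {line A produces more, i.e. x > y}.
p_A = sum(p for (x, y), p in f.items() if x > y)
print(round(p_A, 2))  # 0.73
```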
L–example 3.3

• A company has 9 executives: 4 are married, 3 have never married,
  and 2 are divorced.

• Three executives are to be randomly selected for promotion.

• Among the selected executives, let

  X = {number of married executives};
  Y = {number of never-married executives}.

• Find the joint probability function of X and Y.

Solution: Note that the executives are selected randomly, so every
possible selection of three executives is equally likely.

• The total number of ways to select 3 executives out of 9 is C(9,3) = 84.

• The possible values of x and y are constrained by x, y = 0, 1, 2, 3
  and 1 ≤ x + y ≤ 3. The number of ways to select x married and y
  never-married executives is C(4,x) C(3,y) C(2, 3−x−y).

• Therefore, the joint probability function of (X,Y) is given by

      fX,Y(x, y) = P(X = x, Y = y) = C(4,x) C(3,y) C(2, 3−x−y) / C(9,3),

  for x, y = 0, 1, 2, 3 such that 1 ≤ x + y ≤ 3, and fX,Y(x, y) = 0
  otherwise.

• This joint p.f. can be summarized as a table.


  x \ y        0      1      2      3   Row Total
  0            0   3/84   6/84   1/84       10/84
  1         4/84  24/84  12/84      0       40/84
  2        12/84  18/84      0      0       30/84
  3         4/84      0      0      0        4/84
  Col Total 20/84  45/84  18/84  1/84           1
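The counting argument translates directly into code. A sketch (my own, not the slides') using `math.comb` for the binomial coefficients C(n, k):

```python
from fractions import Fraction
from math import comb

# Joint p.f. of (X, Y): choose 3 of 9 executives
# (4 married, 3 never married, 2 divorced).
def f(x, y):
    d = 3 - x - y  # number of divorced executives selected
    if 0 <= x <= 4 and 0 <= y <= 3 and 0 <= d <= 2:
        return Fraction(comb(4, x) * comb(3, y) * comb(2, d), comb(9, 3))
    return Fraction(0)

assert f(1, 1) == Fraction(24, 84)  # matches the table entry
assert f(2, 0) == Fraction(12, 84)
# The probabilities over all (x, y) sum to 1.
assert sum(f(x, y) for x in range(4) for y in range(4)) == 1
```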


DEFINITION 5 (JOINT PROBABILITY FUNCTION FOR CONTINUOUS RV)
Let (X,Y) be a 2-dimensional continuous RV; its joint probability
(density) function is a function fX,Y(x, y) such that

    P((X,Y) ∈ D) = ∫∫(x,y)∈D fX,Y(x, y) dy dx,

for any D ⊂ R². More specifically,

    P(a ≤ X ≤ b, c ≤ Y ≤ d) = ∫_a^b ∫_c^d fX,Y(x, y) dy dx.
The joint probability density function has the following properties:

(1) fX,Y(x, y) ≥ 0 for any (x, y) ∈ RX,Y.

(2) fX,Y(x, y) = 0 for any (x, y) ∉ RX,Y.

(3) ∫_{−∞}^{∞} ∫_{−∞}^{∞} fX,Y(x, y) dx dy = 1;
    or equivalently, ∫∫(x,y)∈RX,Y fX,Y(x, y) dx dy = 1.
Example 3.3 Find the value c such that f(x, y) below can serve as a
joint p.d.f. for a RV (X,Y):

    f(x, y) = cx(x + y),  0 ≤ x ≤ 1, 1 ≤ y ≤ 2;
    f(x, y) = 0,          elsewhere.

Solution: In order for f(x, y) to be a p.d.f., we need

    1 = ∫_{−∞}^{∞} ∫_{−∞}^{∞} f(x, y) dy dx = ∫_0^1 ∫_1^2 cx(x + y) dy dx
      = c ∫_0^1 x [xy + y²/2]_{y=1}^{2} dx
      = c ∫_0^1 x(x + 1.5) dx = c [x³/3 + 1.5·x²/2]_0^1 = c · 13/12,

which implies c = 12/13.
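As a numeric sanity check of the normalizing constant, a rough midpoint-rule sketch (my own helper, not a library routine):

```python
# Midpoint rule over a rectangle: the integral of x(x + y) over
# [0,1] x [1,2] should come out near 13/12, so c = 12/13 normalizes it.
def double_integral(g, ax, bx, ay, by, n=400):
    hx, hy = (bx - ax) / n, (by - ay) / n
    return sum(
        g(ax + (i + 0.5) * hx, ay + (j + 0.5) * hy)
        for i in range(n) for j in range(n)
    ) * hx * hy

I = double_integral(lambda x, y: x * (x + y), 0, 1, 1, 2)
print(I)             # ≈ 1.08333... = 13/12
print(12 / 13 * I)   # ≈ 1, confirming c = 12/13
```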
L–example 3.4
Reuse the p.d.f. of Example 3.3:

    f(x, y) = (12/13) x(x + y),  0 ≤ x ≤ 1, 1 ≤ y ≤ 2;
    f(x, y) = 0,                 elsewhere.

Assume that it is the joint p.d.f. of (X,Y). Let A = {(x, y) | 0 < x < 1/2,
1 < y < 2}. Compute P((X,Y) ∈ A).

• Set A corresponds to the shaded area in the figure on the right
  (the rectangle 0 < x < 1/2, 1 < y < 2).

• We have

    P((X,Y) ∈ A) = P(0 < X < 1/2, 1 < Y < 2)
                 = ∫_0^{1/2} ∫_1^2 (12/13) x(x + y) dy dx
                 = (12/13) ∫_0^{1/2} x(x + 1.5) dx
                 = (12/13) [x³/3 + 1.5·x²/2]_0^{1/2}
                 = 11/52.
2 MARGINAL AND CONDITIONAL DISTRIBUTIONS

DEFINITION 6 (MARGINAL PROBABILITY DISTRIBUTION)
Let (X,Y) be a two-dimensional RV with joint p.f. fX,Y(x, y). We define
the marginal distribution for X as follows.

• If Y is a discrete RV, then for any x,

      fX(x) = ∑y fX,Y(x, y).

• If Y is a continuous RV, then for any x,

      fX(x) = ∫_{−∞}^{∞} fX,Y(x, y) dy.
REMARK
• fY(y) for Y is defined in the same way as that of X.

• We can view the marginal distribution as the "projection" of the 2D
  function fX,Y(x, y) onto a 1D function.

• More intuitively, it is the distribution of X, ignoring the presence
  of Y. For example, consider a person from a certain community:

  – suppose X = body weight and Y = height, so (X,Y) has a joint
    distribution fX,Y(x, y);
  – the marginal distribution fX(x) of X is the distribution of body
    weights over all people in the community.

• fX(x) does not involve the variable y; this can be seen from its
  definition: y is either summed out or integrated over.

• fX(x) is a probability function, so it satisfies all the properties
  of a probability function.
Example 3.4

• Revisit Example 3.2. The joint p.f. is given by f(x, y) = (1/36)xy
  for x = 1, 2, 3 and y = 1, 2, 3.

• Note that X has three possible values: 1, 2, and 3. The marginal
  distribution for X is given by

  – for x = 1, fX(1) = f(1, 1) + f(1, 2) + f(1, 3) = 6/36 = 1/6;
  – for x = 2, fX(2) = f(2, 1) + f(2, 2) + f(2, 3) = 12/36 = 1/3;
  – for x = 3, fX(3) = f(3, 1) + f(3, 2) + f(3, 3) = 18/36 = 1/2;
  – for other values of x, fX(x) = 0.

• Alternatively, for each x ∈ {1, 2, 3},

      fX(x) = ∑y f(x, y) = ∑_{y=1}^{3} (1/36)xy
            = (1/36) x ∑_{y=1}^{3} y = x/6.
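Summing out y is a one-liner in code; a small sketch with exact fractions (not part of the slides):

```python
from fractions import Fraction

# Marginal p.f. of X for Example 3.4: sum the joint p.f. over y.
f = lambda x, y: Fraction(x * y, 36)
fX = {x: sum(f(x, y) for y in (1, 2, 3)) for x in (1, 2, 3)}
print(fX)  # {1: Fraction(1, 6), 2: Fraction(1, 3), 3: Fraction(1, 2)}
```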
L–example 3.5
We reuse the joint p.f. of (X,Y) derived in L–example 3.3:

  x \ y        0      1      2      3   Row Total
  0            0   3/84   6/84   1/84       10/84
  1         4/84  24/84  12/84      0       40/84
  2        12/84  18/84      0      0       30/84
  3         4/84      0      0      0        4/84
  Col Total 20/84  45/84  18/84  1/84           1

Can we read out the marginal p.f. of X and Y from the table directly?
L–example 3.6
Reuse the p.d.f. of Example 3.3:

    f(x, y) = (12/13) x(x + y),  0 ≤ x ≤ 1, 1 ≤ y ≤ 2;
    f(x, y) = 0,                 elsewhere.

Assume that it is the joint p.d.f. of (X,Y). Find the marginal
distribution of X.

Solution: (X,Y) is a continuous RV. For each x ∈ [0, 1], we have

    fX(x) = ∫_{−∞}^{∞} f(x, y) dy = ∫_1^2 (12/13) x(x + y) dy
          = (12/13) x ∫_1^2 (x + y) dy
          = (12/13) x (x + 1.5);

and for x ∉ [0, 1], fX(x) = 0.
DEFINITION 7 (CONDITIONAL DISTRIBUTION)
Let (X,Y) be a RV with joint p.f. fX,Y(x, y), and let fX(x) be the
marginal p.f. of X. Then for any x such that fX(x) > 0, the conditional
probability function of Y given X = x is defined to be

    fY|X(y|x) = fX,Y(x, y) / fX(x).
REMARK
• For any y such that fY(y) > 0, we can similarly define the
  conditional distribution of X given Y = y:

      fX|Y(x|y) = fX,Y(x, y) / fY(y).

• fY|X(y|x) is defined only for x such that fX(x) > 0; likewise,
  fX|Y(x|y) is defined only for y such that fY(y) > 0.

• The practical meaning of fY|X(y|x): it is the distribution of Y given
  that the random variable X is observed to take the value x.

• Considering y as the variable (with x a fixed value), fY|X(y|x) is a
  p.f., so it must satisfy all the properties of a p.f.

• But fY|X(y|x) is not a p.f. in x; there is NO requirement that
  ∫_{−∞}^{∞} fY|X(y|x) dx = 1 for X continuous, or ∑x fY|X(y|x) = 1
  for X discrete.

• From the definition, we immediately have

  – if fX(x) > 0, then fX,Y(x, y) = fX(x) fY|X(y|x);
  – if fY(y) > 0, then fX,Y(x, y) = fY(y) fX|Y(x|y).

• One immediate application of the conditional distribution is to
  compute, for a continuous RV,

      P(Y ≤ y | X = x) = ∫_{−∞}^{y} fY|X(t|x) dt;

      E(Y | X = x) = ∫_{−∞}^{∞} y fY|X(y|x) dy.

  Their practical meanings are clear: the former is the probability
  that Y ≤ y given X = x; the latter is the average value of Y given
  X = x.

  For the discrete case, the computation is similarly established
  based on fY|X(y|x); please fill in the details on your own.
Example 3.5 Revisit Examples 3.2 and 3.4.

• The joint p.f. for (X,Y) is given by f(x, y) = (1/36)xy for
  x = 1, 2, 3 and y = 1, 2, 3.

• The marginal p.f. for X is fX(x) = x/6 for x = 1, 2, 3.

• Therefore, fY|X(y|x) is defined for any x = 1, 2, or 3:

      fY|X(y|x) = f(x, y) / fX(x) = ((1/36)xy) / ((1/6)x) = y/6,

  for y = 1, 2, 3.

• We can compute

      P(Y = 2 | X = 1) = fY|X(2|1) = (1/6)·2 = 1/3;

      P(Y ≤ 2 | X = 1) = P(Y = 1 | X = 1) + P(Y = 2 | X = 1)
                       = fY|X(1|1) + fY|X(2|1) = 1/6 + 1/3 = 1/2;

      E(Y | X = 2) = 1·fY|X(1|2) + 2·fY|X(2|2) + 3·fY|X(3|2)
                   = 1·(1/6) + 2·(2/6) + 3·(3/6) = 7/3.
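These conditional computations can be checked with exact fractions; a minimal sketch (my own helper names):

```python
from fractions import Fraction

# Conditional p.f. fY|X(y|x) = f(x, y) / fX(x) for Example 3.5.
f = lambda x, y: Fraction(x * y, 36)
fX = lambda x: sum(f(x, y) for y in (1, 2, 3))
fY_given_X = lambda y, x: f(x, y) / fX(x)

assert fY_given_X(2, 1) == Fraction(1, 3)                     # P(Y=2 | X=1)
assert fY_given_X(1, 1) + fY_given_X(2, 1) == Fraction(1, 2)  # P(Y<=2 | X=1)
E = sum(y * fY_given_X(y, 2) for y in (1, 2, 3))              # E(Y | X=2)
assert E == Fraction(7, 3)
```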
L–example 3.7
We reuse the joint p.f. of (X,Y) derived in L–example 3.3:

  x \ y        0      1      2      3   Row Total
  0            0   3/84   6/84   1/84       10/84
  1         4/84  24/84  12/84      0       40/84
  2        12/84  18/84      0      0       30/84
  3         4/84      0      0      0        4/84
  Col Total 20/84  45/84  18/84  1/84           1

Can we read out the conditional p.f. fX|Y(x|y) and fY|X(y|x) from the
table directly? How do we compute E(Y | X = x)?
L–example 3.8 Reuse Example 3.3 and L–example 3.6.

• The joint p.d.f. for (X,Y) is given by

      f(x, y) = (12/13) x(x + y),  0 ≤ x ≤ 1, 1 ≤ y ≤ 2;
      f(x, y) = 0,                 elsewhere.

• The marginal p.d.f. for X is given by

      fX(x) = (12/13) x (x + 1.5),

  for x ∈ [0, 1].

• For each x ∈ (0, 1], the conditional p.d.f. fY|X(y|x) is

      fY|X(y|x) = f(x, y) / fX(x)
                = ((12/13) x(x + y)) / ((12/13) x (x + 1.5))
                = (x + y) / (x + 1.5),

  for y ∈ [1, 2].

• We can compute

      P(Y ≤ 1.5 | X = 0.5) = ∫_1^{1.5} (0.5 + y)/(0.5 + 1.5) dy
                           = (1/2) [0.5y + y²/2]_1^{1.5} = 0.4375.

• Furthermore,

      E(Y | X = 0.5) = ∫_1^2 y (0.5 + y)/(0.5 + 1.5) dy
                     = (1/2) ∫_1^2 (0.5y + y²) dy
                     = (1/2) (3/4 + 7/3) = 37/24.
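A numeric check of the conditional density f_{Y|X}(y|0.5) = (0.5 + y)/2 on [1, 2], using a rough midpoint-rule helper of my own (note that it confirms P(Y ≤ 1.5 | X = 0.5) = 0.4375):

```python
def integral(g, a, b, n=20000):
    # midpoint rule; a quick numeric sketch, not a library routine
    h = (b - a) / n
    return sum(g(a + (i + 0.5) * h) for i in range(n)) * h

cond = lambda y: (0.5 + y) / 2.0  # f_{Y|X}(y | x = 0.5) on [1, 2]

print(integral(cond, 1, 2))                   # ≈ 1.0: a valid density in y
print(integral(cond, 1, 1.5))                 # ≈ 0.4375 = P(Y ≤ 1.5 | X = 0.5)
print(integral(lambda y: y * cond(y), 1, 2))  # ≈ 37/24 = E(Y | X = 0.5)
```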
3 INDEPENDENT RANDOM VARIABLES

DEFINITION 8 (INDEPENDENT RANDOM VARIABLES)

• Random variables X and Y are independent if and only if for any x
  and y,

      fX,Y(x, y) = fX(x) fY(y).

• Random variables X1, X2, . . . , Xn are independent if and only if
  for any x1, x2, . . . , xn,

      fX1,X2,...,Xn(x1, x2, . . . , xn) = fX1(x1) fX2(x2) · · · fXn(xn).
REMARK
• The above definition is applicable whether (X,Y) is continuous or
  discrete.

• The "product feature" in the definition implies one necessary
  condition for independence: RX,Y must be a product space. Indeed, if
  X and Y are independent, then for any x ∈ RX and any y ∈ RY, we have

      fX,Y(x, y) = fX(x) fY(y) > 0,

  implying RX,Y = {(x, y) | x ∈ RX, y ∈ RY} = RX × RY.

  Conclusion: if RX,Y is not a product space, then X and Y are not
  independent!
Properties of Independent Random Variables
Suppose X and Y are independent RVs.

(1) If A and B are arbitrary subsets of R, the events X ∈ A and Y ∈ B
    are independent events in S. Thus

        P(X ∈ A, Y ∈ B) = P(X ∈ A) P(Y ∈ B).

    In particular, for any real numbers x and y,

        P(X ≤ x, Y ≤ y) = P(X ≤ x) P(Y ≤ y).

(2) For arbitrary functions g1(·) and g2(·), g1(X) and g2(Y) are
    independent. For example,

    • X² and Y are independent;
    • sin(X) and cos(Y) are independent;
    • e^X and log(Y) are independent.

(3) Independence is connected with the conditional distribution:

    • if fX(x) > 0, then fY|X(y|x) = fY(y);
    • likewise, if fY(y) > 0, then fX|Y(x|y) = fX(x).
Example 3.6 The joint p.f. of (X,Y) is given below.

  x \ y      1     3     5   fX(x)
  2        0.1   0.2   0.1     0.4
  4       0.15   0.3  0.15     0.6
  fY(y)   0.25   0.5  0.25       1

Are X and Y independent?


Solution:
• We need to check whether, for every (x, y) combination,

      fX,Y(x, y) = fX(x) fY(y).

  For example, from the table we have fX,Y(2, 1) = 0.1, fX(2) = 0.4,
  and fY(1) = 0.25. Therefore

      fX,Y(2, 1) = 0.1 = 0.4 × 0.25 = fX(2) fY(1).

• In fact, the equality holds for every combination of x ∈ {2, 4} and
  y ∈ {1, 3, 5}.

• We conclude that X and Y are independent.
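Checking every (x, y) combination is mechanical; a small sketch over the table (tolerances allow for floating-point rounding):

```python
# Example 3.6: verify fX,Y(x, y) = fX(x) * fY(y) for every combination.
f = {(2, 1): 0.10, (2, 3): 0.20, (2, 5): 0.10,
     (4, 1): 0.15, (4, 3): 0.30, (4, 5): 0.15}
fX = {x: sum(p for (a, _), p in f.items() if a == x) for x in (2, 4)}
fY = {y: sum(p for (_, b), p in f.items() if b == y) for y in (1, 3, 5)}

independent = all(abs(f[(x, y)] - fX[x] * fY[y]) < 1e-12
                  for x in (2, 4) for y in (1, 3, 5))
print(independent)  # True
```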
L–example 3.9 Given that

    fX,Y(x, y) = 2(x + y),  for 0 ≤ x ≤ 1, 0 < y < x;
    fX,Y(x, y) = 0,         elsewhere.

Are X and Y independent?

Solution:

• The direct way of checking independence is to check whether

      fX,Y(x, y) = fX(x) fY(y)

  holds for every (x, y) combination. The details of this method are
  left as an exercise.

• For this question, we can immediately conclude that X and Y are not
  independent, because RX,Y (the triangle 0 < y < x ≤ 1 shown in the
  figure) is not a product space.
L–example 3.10 Suppose that (X,Y) is a discrete RV. The joint p.f. is
given by

  x \ y      0     1     2     3   fX(x)
  0        1/8   1/4   1/8     0     1/2
  1          0   1/8   1/4   1/8     1/2
  fY(y)    1/8   3/8   3/8   1/8       1

Are X and Y independent?

Solution:
The zero entries in the table indicate that RX,Y is not a product space.
Therefore, X and Y are not independent.
L–example 3.11 We have a handy way to check independence when
fX,Y(x, y) has an explicit formula on RX,Y.

X and Y are independent if and only if both of the following hold:

• RX,Y, the region where the p.f. is positive, is a product space.

• For any (x, y) ∈ RX,Y, we have fX,Y(x, y) = C · g1(x) g2(y); that is,
  it can be "factorized" as the product of two functions g1 and g2,
  where the former depends on x only, the latter depends on y only,
  and C is a constant depending on neither x nor y.

Note: g1(x) and g2(y) on their own are NOT necessarily p.f.s.

• We use the joint p.f. in Example 3.2 to illustrate:
  f(x, y) = (1/36)xy for x = 1, 2, 3 and y = 1, 2, 3.

• A1 = {1, 2, 3} and A2 = {1, 2, 3}, so RX,Y = A1 × A2 is a product
  space.

• fX,Y(x, y) = (1/36) · (x) · (y): C = 1/36, g1(x) = x, g2(y) = y.

• We conclude that X and Y are independent.

• The advantage of this method is that we don't need to find the
  marginal distributions fX(x) and fY(y) and check
  fX,Y(x, y) = fX(x) fY(y).
Following this strategy, we can get fX(x) and fY(y) by standardizing
g1(x) and g2(y). Consider fX(x) for illustration; fY(y) is obtained
similarly.

• If X is a discrete RV, its p.m.f. is given by

      fX(x) = g1(x) / ∑_{t∈RX} g1(t).

• If X is a continuous RV, its p.d.f. is given by

      fX(x) = g1(x) / ∫_{t∈RX} g1(t) dt.

• We continue with the example above. Here X is a discrete RV with
  RX = A1 = {1, 2, 3}. We obtain its p.m.f.:

      fX(x) = g1(x) / ∑_{t∈RX} g1(t) = x / (1 + 2 + 3) = x/6.

• Similarly, we get fY(y) = y/6.


L–example 3.12 Given that

    fX,Y(x, y) = (1/3) x(1 + y),  for 0 < x < 2, 0 < y < 1;
    fX,Y(x, y) = 0,               elsewhere.

Are X and Y independent?

Solution:
• Set A1 = (0, 2) and A2 = (0, 1); then RX,Y = A1 × A2 is a product
  space.

• fX,Y(x, y) on RX,Y can be factorized with C = 1/3, g1(x) = x,
  g2(y) = 1 + y. Therefore, we conclude that X and Y are independent.

• Furthermore,

      fX(x) = g1(x) / ∫_{A1} g1(x) dx = x / ∫_0^2 x dx = x/2;

      fY(y) = g2(y) / ∫_{A2} g2(y) dy = (1 + y) / ∫_0^1 (1 + y) dy
            = (2/3)(1 + y).
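As a sanity check that the standardized marginals recover the joint density, a quick numeric sketch (random sample points of my choosing):

```python
import random

# L–example 3.12: fX(x) = x/2 on (0, 2) and fY(y) = (2/3)(1 + y) on (0, 1)
# should multiply back to the joint density fX,Y(x, y) = (1/3) x (1 + y).
f_joint = lambda x, y: x * (1 + y) / 3.0
fX = lambda x: x / 2.0
fY = lambda y: 2.0 * (1 + y) / 3.0

random.seed(0)
ok = all(
    abs(f_joint(x, y) - fX(x) * fY(y)) < 1e-12
    for x, y in ((random.uniform(0, 2), random.uniform(0, 1))
                 for _ in range(1000))
)
print(ok)  # True
```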
4 EXPECTATION AND COVARIANCE

DEFINITION 9 (EXPECTATION)
For any two-variable function g(x, y),

• if (X,Y) is a discrete RV,

      E(g(X,Y)) = ∑x ∑y g(x, y) fX,Y(x, y);

• if (X,Y) is a continuous RV,

      E(g(X,Y)) = ∫_{−∞}^{∞} ∫_{−∞}^{∞} g(x, y) fX,Y(x, y) dy dx.

If we let

    g(X,Y) = (X − E(X))(Y − E(Y)) = (X − µX)(Y − µY),

the expectation E[g(X,Y)] leads to the covariance of X and Y.

DEFINITION 10 (COVARIANCE)
The covariance of X and Y is defined to be

    cov(X,Y) = E[(X − E(X))(Y − E(Y))].

• If X and Y are discrete RVs,

      cov(X,Y) = ∑x ∑y (x − µX)(y − µY) fX,Y(x, y).

• If X and Y are continuous RVs,

      cov(X,Y) = ∫_{−∞}^{∞} ∫_{−∞}^{∞} (x − µX)(y − µY) fX,Y(x, y) dx dy.
The covariance has the following properties.

(1) cov(X,Y) = E(XY) − E(X)E(Y).

(2) If X and Y are independent, then cov(X,Y) = 0. However,
    cov(X,Y) = 0 does not imply that X and Y are independent.

(3) cov(aX + b, cY + d) = ac · cov(X,Y).

(4) V(aX + bY) = a²V(X) + b²V(Y) + 2ab · cov(X,Y).

Example 3.7 Given the joint distribution for (X,Y):

  x \ y      0     1     2     3   fX(x)
  0        1/8   1/4   1/8     0     1/2
  1          0   1/8   1/4   1/8     1/2
  fY(y)    1/8   3/8   3/8   1/8       1

(a) Find E(Y − X).

(b) Find cov(X,Y ).


Solution:
(a) Method 1:
E(Y − X) = (0 − 0)(1/8) + (1 − 0)(1/4) + (2 − 0)(1/8)
+ . . . + (3 − 1)(1/8) = 1.
Method 2:
E(Y − X) = E(Y ) − E(X) = 1.5 − 0.5 = 1,
where
E(Y ) = 0 · (1/8) + 1 · (3/8) + 2 · (3/8) + 3 · (1/8) = 1.5
E(X) = 0 · (1/2) + 1 · (1/2) = 0.5.
(b) We use cov(X,Y) = E(XY) − E(X)E(Y). Note that we have computed
E(X) and E(Y) in Part (a).

    E(XY) = (0)(0)(1/8) + (0)(1)(1/4) + (0)(2)(1/8) + . . . + (1)(3)(1/8) = 1.

Therefore,

    cov(X,Y) = E(XY) − E(X)E(Y) = 1 − (0.5)(1.5) = 0.25.
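Both parts of Example 3.7 can be verified with exact fractions; a compact sketch (the generic `E` helper is my own):

```python
from fractions import Fraction

# Example 3.7: cov(X,Y) from the table, via E(XY) - E(X)E(Y).
f = {(0, 0): Fraction(1, 8), (0, 1): Fraction(1, 4),
     (0, 2): Fraction(1, 8), (0, 3): Fraction(0),
     (1, 0): Fraction(0),    (1, 1): Fraction(1, 8),
     (1, 2): Fraction(1, 4), (1, 3): Fraction(1, 8)}

E = lambda g: sum(g(x, y) * p for (x, y), p in f.items())
EX, EY, EXY = E(lambda x, y: x), E(lambda x, y: y), E(lambda x, y: x * y)

assert E(lambda x, y: y - x) == 1       # part (a): E(Y - X) = 1
assert EXY - EX * EY == Fraction(1, 4)  # part (b): cov(X,Y) = 0.25
```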


L–example 3.13 Suppose that (X,Y) has the p.f.

    fX,Y(x, y) = x² + xy/3,  for 0 ≤ x ≤ 1, 0 ≤ y ≤ 2;
    fX,Y(x, y) = 0,          otherwise.

(a) Find fX (x), fY (y) and fY |X (y|x).

(b) Find cov(X,Y ).


Solution:

(a) We first find the marginal density of X. For 0 ≤ x ≤ 1,

        fX(x) = ∫_{−∞}^{∞} fX,Y(x, y) dy = ∫_0^2 (x² + xy/3) dy
              = [x²y + xy²/6]_{y=0}^{2} = 2x² + 2x/3.

    It is clear that fX(x) = 0 for x < 0 or x > 1. Thus

        fX(x) = 2x² + 2x/3,  for 0 ≤ x ≤ 1;
        fX(x) = 0,           otherwise.

    Similarly, the marginal density of Y is given as

        fY(y) = 1/3 + y/6,  for 0 ≤ y ≤ 2;
        fY(y) = 0,          otherwise.

    The conditional probability density function of Y given X = x, for
    0 < x ≤ 1, is then given as

        fY|X(y|x) = fX,Y(x, y) / fX(x) = (x² + xy/3) / (2x² + 2x/3)
                  = (3x + y) / (2(3x + 1)),  for 0 ≤ y ≤ 2;
        fY|X(y|x) = 0,                       otherwise.

(b) We shall use the expression cov(X,Y) = E(XY) − E(X)E(Y). Now

        E(XY) = ∫_0^2 ∫_0^1 xy (x² + xy/3) dx dy
              = ∫_0^2 ∫_0^1 (x³y + x²y²/3) dx dy
              = ∫_0^2 [x⁴y/4 + x³y²/9]_{x=0}^{1} dy
              = ∫_0^2 (y/4 + y²/9) dy
              = 43/54.

    We have computed the marginal distributions of X and Y in Part (a).
    Thus

        E(X) = ∫_0^1 x (2x² + 2x/3) dx = [2x⁴/4 + 2x³/9]_{x=0}^{1} = 13/18,

    and

        E(Y) = ∫_0^2 y (1/3 + y/6) dy = [y²/6 + y³/18]_{y=0}^{2} = 10/9.

    This gives

        cov(X,Y) = E(XY) − E(X)E(Y) = 43/54 − (13/18)(10/9) = −1/162.
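The exact answer −1/162 can be confirmed numerically; a rough midpoint-rule sketch over the support (my own `E` helper, not library code):

```python
# L–example 3.13: E[g(X,Y)] by midpoint rule on the support [0,1] x [0,2];
# cov(X,Y) should come out near -1/162 ≈ -0.00617.
def E(g, n=300):
    hx, hy = 1.0 / n, 2.0 / n
    s = 0.0
    for i in range(n):
        x = (i + 0.5) * hx
        for j in range(n):
            y = (j + 0.5) * hy
            s += g(x, y) * (x * x + x * y / 3.0) * hx * hy
    return s

EX, EY, EXY = E(lambda x, y: x), E(lambda x, y: y), E(lambda x, y: x * y)
cov = EXY - EX * EY
print(cov)  # ≈ -1/162 ≈ -0.00617
```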
L–example 3.14

• Starting from V(X + Y) = V(X) + V(Y) + 2 cov(X,Y), we can derive
  some interesting results.

• By induction, for any random variables X1, X2, . . . , Xn,

      V(X1 + X2 + · · · + Xn)
        = V(X1) + V(X2) + · · · + V(Xn) + 2 ∑_{i<j} cov(Xi, Xj).

• If X and Y are independent, we have

      V(X ± Y) = V(X) + V(Y).

• By induction, if X1, X2, . . . , Xn are independent,

      V(X1 ± X2 ± · · · ± Xn) = V(X1) + V(X2) + · · · + V(Xn).
