
COVARIANCE AND CORRELATION

LECTURE NOTES

PROFESSOR HOHN

Contents

1. Covariance
2. Correlation
3. Solutions to Exercises

1. Covariance

Definition 1.1. Let X and Y be jointly distributed random variables. The covariance of X
and Y is defined by
Cov(X, Y ) = E[XY ] − E[X]E[Y ]

This is equivalent to
Cov(X, Y ) = E[(X − µX )(Y − µY )]

where µX = E[X] and µY = E[Y ].
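
The equivalence of the two formulas is easy to see numerically. Below is a minimal simulation sketch (not part of the original notes; it assumes NumPy, and the dependent pair X, Y is an arbitrary choice for illustration):

    import numpy as np

    rng = np.random.default_rng(0)
    n = 1_000_000
    x = rng.standard_normal(n)
    y = 0.5 * x + rng.standard_normal(n)   # Y built to depend on X; true Cov(X, Y) = 0.5

    # First formula: E[XY] - E[X]E[Y]
    cov_product = np.mean(x * y) - np.mean(x) * np.mean(y)
    # Second formula: E[(X - mu_X)(Y - mu_Y)]
    cov_centered = np.mean((x - x.mean()) * (y - y.mean()))
    print(cov_product, cov_centered)       # both approximately 0.5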

Properties 1.2. Let X, Y , and Z be jointly distributed random variables. From the
definition of covariance, we derive the following:

(1) The covariance generalizes variance: Cov(X, X) = Var(X).


(2) The covariance is symmetric: Cov(X, Y ) = Cov(Y, X).
(3) For any fixed scalars a, b ∈ R, Cov(aX + b, Y ) = a Cov(X, Y ).
(4) The covariance is bilinear: Cov(X + aY, Z) = Cov(X, Z) + a Cov(Y, Z) for any fixed
a ∈ R.
(5) If X and Y are independent, then Cov(X, Y ) = 0.
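
To make the linearity properties (3) and (4) concrete, here is a short numerical sanity check (illustration only; NumPy and the particular constants and distributions are arbitrary choices, not part of the original notes):

    import numpy as np

    rng = np.random.default_rng(1)
    n = 1_000_000
    x = rng.standard_normal(n)
    y = 0.3 * x + rng.standard_normal(n)       # dependent on X so Cov(X, Y) != 0
    z = 0.2 * x - 0.4 * y + rng.standard_normal(n)

    def cov(u, v):
        return np.mean(u * v) - np.mean(u) * np.mean(v)

    a, b = 2.5, -1.0
    print(cov(x, x), np.var(x))                          # property (1)
    print(cov(a * x + b, y), a * cov(x, y))              # property (3)
    print(cov(x + a * y, z), cov(x, z) + a * cov(y, z))  # property (4)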

Exercise 1.3. Show that each of the listed properties of the covariance is true.

Solution. See Section 3.



Remark 1.4. It is true that if X and Y are independent, then Cov(X, Y ) = 0. However, if
Cov(X, Y ) = 0, then we cannot immediately conclude that X and Y are independent. Put
graphically:

X & Y independent =⇒ Cov(X, Y ) = 0

Cov(X, Y ) = 0 ⇏ X & Y independent.

Example 1.5. Let Z ∼ N (0, 1) and define X = Z². Clearly X and Z are not
independent (X is a function of Z). Show that Cov(X, Z) = 0 even though X and Z are
dependent.

Proof. With X = Z² we have

Cov(X, Z) = E[XZ] − E[X]E[Z] = E[Z³] − E[Z²]E[Z].

Now, we know that E[Z] = 0 (since Z ∼ N (0, 1), the mean is 0). Also,

E[Z³] = (1/√(2π)) ∫₋∞^∞ x³ e^(−x²/2) dx.

We can perform an integration by parts with u = x² and dv = x e^(−x²/2) dx. Then du = 2x dx
and v = −e^(−x²/2). So,

∫₋∞^∞ x³ e^(−x²/2) dx = [−x² e^(−x²/2)]₋∞^∞ + 2 ∫₋∞^∞ x e^(−x²/2) dx

                      = 0 + 2 ∫₋∞^∞ x e^(−x²/2) dx.

Recognize that this last integral is (up to a scalar constant) the same integral we calculate
to find E[Z], which is 0. Therefore E[Z³] = 0. Hence Cov(X, Z) = 0 − 0 = 0.

Another way to see that E[Z³] = 0 is by noticing that x³ is an odd function and e^(−x²/2)
is an even function. Hence x³ e^(−x²/2) is an odd function, and the integral of an odd
function from −∞ to ∞ is zero. Thus,

E[Z³] = (1/√(2π)) ∫₋∞^∞ x³ e^(−x²/2) dx = 0.
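
A short simulation sketch (not in the original notes; it assumes NumPy) makes the point of this example tangible: X = Z² is a deterministic function of Z, yet the sample covariance is essentially zero.

    import numpy as np

    rng = np.random.default_rng(2)
    z = rng.standard_normal(5_000_000)
    x = z ** 2                              # X is completely determined by Z

    cov_xz = np.mean(x * z) - np.mean(x) * np.mean(z)
    print(cov_xz)                           # near 0, since Cov(Z^2, Z) = E[Z^3] = 0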

Exercise 1.6. Let X be a number drawn at random from {2, 3, 4, 5}, equally likely to be any
of the four numbers. Once X is drawn, Y is drawn at random from the numbers {1, 2, . . . , X}
with equal probability of getting each number. For example, given that X = 3, then Y is
drawn from the numbers {1, 2, 3} with P (Y = 1 | X = 3) = 1/3, P (Y = 2 | X = 3) = 1/3,
and P (Y = 3 | X = 3) = 1/3. Find the joint probability mass function of X and Y and find
Cov(X, Y ).

Solution. See Section 3.

Exercise 1.7. Let X and Y be jointly continuous random variables with joint density

fX,Y (s, t) = { c(s²e^(−2t) + e^(−t))   if 0 < s < 1, 0 < t < ∞
             { 0                        otherwise

Find Cov(X, Y ).

Solution. Let’s first find c. We have

∫₀^∞ ∫₀^1 c(s²e^(−2t) + e^(−t)) ds dt = c ∫₀^∞ ((1/3)e^(−2t) + e^(−t)) dt = c(1/6 + 1) = 7c/6.

Therefore, c = 6/7. For Cov(X, Y ) = E[XY ] − E[X]E[Y ], we will need the following:

E[XY ] = (6/7) ∫₀^∞ ∫₀^1 st(s²e^(−2t) + e^(−t)) ds dt
       = (6/7) ∫₀^∞ ∫₀^1 (s³t e^(−2t) + st e^(−t)) ds dt
       = (6/7) ∫₀^∞ ((t/4) e^(−2t) + (t/2) e^(−t)) dt
       = (6/7) ((1/4) · (1/4) + (1/2) · 1)
       = (6/7) · (9/16),

E[X] = (6/7) ∫₀^∞ ∫₀^1 s(s²e^(−2t) + e^(−t)) ds dt
     = (6/7) ∫₀^∞ ∫₀^1 (s³ e^(−2t) + s e^(−t)) ds dt
     = (6/7) ∫₀^∞ ((1/4) e^(−2t) + (1/2) e^(−t)) dt
     = (6/7) ((1/4) · (1/2) + (1/2) · 1)
     = (6/7) · (5/8),

and

E[Y ] = (6/7) ∫₀^∞ ∫₀^1 t(s²e^(−2t) + e^(−t)) ds dt
      = (6/7) ∫₀^∞ ∫₀^1 (s²t e^(−2t) + t e^(−t)) ds dt
      = (6/7) ∫₀^∞ ((t/3) e^(−2t) + t e^(−t)) dt
      = (6/7) ((1/3) · (1/4) + 1)
      = (6/7) · (13/12).

Therefore,

Cov(X, Y ) = E[XY ] − E[X]E[Y ] = (6/7) · (9/16) − ((6/7) · (5/8)) · ((6/7) · (13/12)) = −3/196.
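
These integrals can also be verified symbolically. The sketch below (not part of the original notes; it assumes the SymPy library) reproduces c = 6/7 and Cov(X, Y ) = −3/196:

    import sympy as sp

    s, t, c = sp.symbols('s t c', positive=True)
    f = c * (s**2 * sp.exp(-2 * t) + sp.exp(-t))

    # Normalization: total probability must equal 1, which forces c = 6/7
    total = sp.integrate(f, (s, 0, 1), (t, 0, sp.oo))
    c_val = sp.solve(sp.Eq(total, 1), c)[0]
    f = f.subs(c, c_val)

    E_XY = sp.integrate(s * t * f, (s, 0, 1), (t, 0, sp.oo))
    E_X = sp.integrate(s * f, (s, 0, 1), (t, 0, sp.oo))
    E_Y = sp.integrate(t * f, (s, 0, 1), (t, 0, sp.oo))
    print(c_val, sp.simplify(E_XY - E_X * E_Y))   # 6/7, -3/196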

2. Correlation

Definition 2.1. Suppose that X and Y are jointly distributed random variables with
nonzero variances. Then the correlation of X and Y is

Corr(X, Y ) = Cov(X, Y ) / √(Var(X) Var(Y )).

Intuition 2.2. Remember from multivariable calculus that for vectors v, w ∈ Rⁿ we
could define the dot product v · w of the two vectors. Also remember that to find the
length of a vector v, you could compute ‖v‖ = √(v · v). Now, speaking abstractly, the
covariance of two random variables acts like a generalized “dot product” between the two
variables. That is, we can think of Cov(X, Y ) roughly as a dot product of X and Y . In this
analogy, the “length” of a random variable is ‖X‖ = √(Cov(X, X)) = √(Var(X)). Let’s
also remember that with two vectors v and w, we had an interpretation of the dot product as
v · w = ‖v‖ ‖w‖ cos(θ), where θ is the angle between the vectors. Solving for cos(θ), we have
cos(θ) = (v · w) / (‖v‖ ‖w‖). Hence, with the interpretation of Cov(X, Y ) as the dot product
of X and Y , ‖X‖ = √(Var(X)), and ‖Y ‖ = √(Var(Y )), the formula for correlation gives
Corr(X, Y ) = Cov(X, Y ) / (‖X‖ ‖Y ‖). With this interpretation, you can envision
Corr(X, Y ) = cos(θ), where θ is (very roughly) the “angle between” the random variables X and Y .

Proposition 2.3. For any random variables X and Y , it holds that −1 ≤ Corr(X, Y ) ≤ 1.

Intuitive Proof. With the interpretation that Corr(X, Y ) = cos(θ), where θ is some
generalized notion of the angle between X and Y , since cosine is always bounded between
−1 and 1, so must be the correlation. (A rigorous proof follows from the Cauchy–Schwarz
inequality applied to Cov(X, Y ).)

Definition 2.4. If Corr(X, Y ) = 0, we say that X and Y are uncorrelated.

Remark 2.5. Notice that Corr(X, Y ) = 0 if and only if Cov(X, Y ) = 0. So, all the previous
properties discussing when the covariance is zero still hold for the correlation.

Example 2.6. Show that Corr(aX + b, Y ) = Corr(X, Y ) for any fixed scalars a > 0 and
b ∈ R.

Solution. For random variables X and Y , and scalars a, b ∈ R, we have shown that
Cov(aX + b, Y ) = a Cov(X, Y ), and we have seen that Var(aX + b) = a² Var(X). Also, note
that since a > 0, √(a²) = |a| = a. Therefore,

Corr(aX + b, Y ) = Cov(aX + b, Y ) / (√(Var(aX + b)) √(Var(Y )))
                 = a Cov(X, Y ) / (√(a² Var(X)) √(Var(Y )))
                 = Cov(X, Y ) / (√(Var(X)) √(Var(Y ))) = Corr(X, Y ).

Let’s make a note that if a < 0, then √(a²) = |a| = −a, so our previous calculation would have
left us with Corr(aX + b, Y ) = − Corr(X, Y ). But this makes sense with our “dot product”
intuition, since if a < 0, then aX “switches the direction” of X.
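
A numerical check of this example (illustration only; NumPy and the particular constants are arbitrary assumptions, not part of the original notes):

    import numpy as np

    rng = np.random.default_rng(3)
    n = 1_000_000
    x = rng.standard_normal(n)
    y = 0.7 * x + rng.standard_normal(n)

    print(np.corrcoef(x, y)[0, 1])                 # baseline Corr(X, Y)
    print(np.corrcoef(3.0 * x + 5.0, y)[0, 1])     # unchanged for a = 3 > 0
    print(np.corrcoef(-3.0 * x + 5.0, y)[0, 1])    # sign flipped for a = -3 < 0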

Example 2.7. Let X be a number drawn at random from {2, 3, 4, 5}, equally likely to be any
of the four numbers. Once X is drawn, Y is drawn at random from the numbers {1, 2, . . . , X}
with equal probability of getting each number. For example, given that X = 3, then Y is
drawn from the numbers {1, 2, 3} with P (Y = 1 | X = 3) = 1/3, P (Y = 2 | X = 3) = 1/3,
and P (Y = 3 | X = 3) = 1/3. Find Corr(X, Y ).

Solution. Much of the work was already done in Exercise 1.6. We found Cov(X, Y ) = 5/8,
E[X] = 7/2, and E[Y ] = 9/4. We have left to find E[X²] and E[Y ²]. To this end,

E[X²] = Σ_s Σ_t s² pX,Y (s, t) = Σ_{s=2}^{5} Σ_{t=1}^{s} s² · (1/(4s)) = Σ_{s=2}^{5} s²/4
      = 4/4 + 9/4 + 16/4 + 25/4 = 27/2

and

E[Y ²] = Σ_s Σ_t t² pX,Y (s, t) = Σ_{s=2}^{5} Σ_{t=1}^{s} t² · (1/(4s)) = Σ_{s=2}^{5} (1/(4s)) Σ_{t=1}^{s} t²
       = (1/(4·2))(1 + 4) + (1/(4·3))(1 + 4 + 9) + (1/(4·4))(1 + 4 + 9 + 16) + (1/(4·5))(1 + 4 + 9 + 16 + 25)
       = 77/12.

From before, we now have

Var(X) = 27/2 − (7/2)² = 5/4

and

Var(Y ) = 77/12 − (9/4)² = 65/48.

Therefore,

Corr(X, Y ) = Cov(X, Y ) / (√(Var(X)) √(Var(Y ))) = (5/8) / √((5/4)(65/48)) ≈ 0.4804.
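
One can also estimate this correlation by simulating the two-stage experiment directly. A sketch (assuming NumPy; not part of the original solution), whose output should land near the exact value ≈ 0.4804 for large sample sizes:

    import numpy as np

    rng = np.random.default_rng(4)
    n = 1_000_000
    x = rng.integers(2, 6, size=n)     # X uniform on {2, 3, 4, 5}
    y = rng.integers(1, x + 1)         # Y | X uniform on {1, ..., X}
    print(np.corrcoef(x, y)[0, 1])     # approximately 0.4804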

Exercise 2.8. Let X and Y be jointly continuous random variables with joint density

fX,Y (s, t) = { c(s²e^(−2t) + e^(−t))   if 0 < s < 1, 0 < t < ∞
             { 0                        otherwise

Find Corr(X, Y ).

Solution. See Section 3.



3. Solutions to Exercises

Solution to 1.3. Let X, Y , and Z be random variables and a, b ∈ R be scalars. Then,

(1) For a random variable X,

    Cov(X, X) = E[X · X] − E[X]E[X] = E[X²] − (E[X])² = Var(X).

(2) For random variables X and Y ,

    Cov(X, Y ) = E[XY ] − E[X]E[Y ] = E[Y X] − E[Y ]E[X] = Cov(Y, X).

(3) For random variables X and Y , and scalars a, b ∈ R,

    Cov(aX + b, Y ) = E[(aX + b)Y ] − E[aX + b]E[Y ]
                    = aE[XY ] + bE[Y ] − (aE[X]E[Y ] + bE[Y ])
                    = a(E[XY ] − E[X]E[Y ]) + b(E[Y ] − E[Y ])
                    = a Cov(X, Y ).

(4) For random variables X, Y , and Z, and scalar a ∈ R,

    Cov(X + aY, Z) = E[(X + aY )Z] − E[X + aY ]E[Z]
                   = E[XZ] + aE[Y Z] − (E[X]E[Z] + aE[Y ]E[Z])
                   = (E[XZ] − E[X]E[Z]) + a(E[Y Z] − E[Y ]E[Z])
                   = Cov(X, Z) + a Cov(Y, Z).

(5) If X and Y are independent random variables, then E[XY ] = E[X]E[Y ], so

    Cov(X, Y ) = E[XY ] − E[X]E[Y ] = E[X]E[Y ] − E[X]E[Y ] = 0.

Solution to 1.6. For the joint probability mass function, we have

pX,Y (s, t) = P (X = s, Y = t) = P (Y = t | X = s) P (X = s).

Further, by the assumptions of the problem, P (X = s) = 1/4 for any choice of s ∈ {2, 3, 4, 5},
and

P (Y = t | X = s) = { 1/s   if t ≤ s
                    { 0     otherwise

So, we have

pX,Y (s, t) = P (Y = t | X = s) P (X = s) = { 1/(4s)   if t ≤ s
                                            { 0        otherwise

As a table, this is

  X \ Y  |   1       2       3       4      5     | P (X = s)
  -------+------------------------------------------+----------
    2    |  1/8     1/8      0       0      0     |   1/4
    3    |  1/12    1/12    1/12     0      0     |   1/4
    4    |  1/16    1/16    1/16    1/16    0     |   1/4
    5    |  1/20    1/20    1/20    1/20   1/20   |   1/4
  -------+------------------------------------------+----------
 P (Y=t) | 77/240  77/240  47/240   9/80   1/20   |

To find Cov(X, Y ) = E[XY ] − E[X]E[Y ] we have

E[XY ] = Σ_s Σ_t st pX,Y (s, t)
       = Σ_{s=2}^{5} Σ_{t=1}^{s} st · (1/(4s))
       = Σ_{s=2}^{5} Σ_{t=1}^{s} t/4
       = (1/4 + 2/4) + (1/4 + 2/4 + 3/4) + (1/4 + 2/4 + 3/4 + 4/4) + (1/4 + 2/4 + 3/4 + 4/4 + 5/4)
       = 17/2,

E[X] = 2(1/4) + 3(1/4) + 4(1/4) + 5(1/4) = 7/2,

and

E[Y ] = 1(77/240) + 2(77/240) + 3(47/240) + 4(9/80) + 5(1/20) = 9/4.

Therefore, Cov(X, Y ) = 17/2 − (7/2) · (9/4) = 5/8.
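
The table and moments above can be double-checked with exact rational arithmetic (a sketch using Python’s standard fractions module; not part of the original solution):

    from fractions import Fraction as F

    # Joint pmf: p(s, t) = 1/(4s) for s in {2, ..., 5} and 1 <= t <= s
    p = {(s, t): F(1, 4 * s) for s in range(2, 6) for t in range(1, s + 1)}
    assert sum(p.values()) == 1                         # pmf sums to 1

    E_XY = sum(s * t * q for (s, t), q in p.items())    # 17/2
    E_X = sum(s * q for (s, t), q in p.items())         # 7/2
    E_Y = sum(t * q for (s, t), q in p.items())         # 9/4
    print(E_XY, E_X, E_Y, E_XY - E_X * E_Y)             # 17/2 7/2 9/4 5/8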

Solution to 2.8. Much of the work has already been done in Exercise 1.7. We found
Cov(X, Y ) = −3/196, E[X] = (6/7) · (5/8) = 15/28, and E[Y ] = (6/7) · (13/12) = 13/14.
We have left to find E[X²] and E[Y ²]. To this end,

E[X²] = (6/7) ∫₀^∞ ∫₀^1 s²(s²e^(−2t) + e^(−t)) ds dt = (6/7) ∫₀^∞ ∫₀^1 (s⁴e^(−2t) + s²e^(−t)) ds dt
      = (6/7) ∫₀^∞ ((1/5) e^(−2t) + (1/3) e^(−t)) dt = (6/7)(1/10 + 1/3) = 13/35

and

E[Y ²] = (6/7) ∫₀^∞ ∫₀^1 t²(s²e^(−2t) + e^(−t)) ds dt = (6/7) ∫₀^∞ ∫₀^1 (s²t²e^(−2t) + t²e^(−t)) ds dt
       = (6/7) ∫₀^∞ ((t²/3) e^(−2t) + t²e^(−t)) dt = (6/7)(1/12 + 2) = 25/14.

We then find

Var(X) = 13/35 − (15/28)² = 331/3920

and

Var(Y ) = 25/14 − (13/14)² = 181/196.

Therefore,

Corr(X, Y ) = Cov(X, Y ) / (√(Var(X)) √(Var(Y ))) = (−3/196) / √((331/3920)(181/196)) ≈ −0.0548.
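
As with Exercise 1.7, these moments can be confirmed symbolically (a sketch assuming SymPy; not part of the original solution):

    import sympy as sp

    s, t = sp.symbols('s t', positive=True)
    f = sp.Rational(6, 7) * (s**2 * sp.exp(-2 * t) + sp.exp(-t))

    def E(g):
        # Expectation of g(X, Y) against the joint density f
        return sp.integrate(g * f, (s, 0, 1), (t, 0, sp.oo))

    var_x = E(s**2) - E(s)**2                    # 331/3920
    var_y = E(t**2) - E(t)**2                    # 181/196
    corr = (E(s * t) - E(s) * E(t)) / sp.sqrt(var_x * var_y)
    print(var_x, var_y, float(corr))             # 331/3920 181/196 ~ -0.0548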
