
COVARIANCE AND CORRELATION

LECTURE NOTES

PROFESSOR HOHN

Contents

1. Covariance
2. Correlation
3. Solutions to Exercises

1. Covariance

Definition 1.1. Let X and Y be jointly distributed random variables. The covariance of X
and Y is defined by
Cov(X, Y ) = E[XY ] − E[X]E[Y ]

This is equivalent to
Cov(X, Y ) = E[(X − µX )(Y − µY )]

where µX = E[X] and µY = E[Y ].

Properties 1.2. Let X, Y, and Z be jointly distributed random variables. From the definition of covariance, we derive the following:

(1) The covariance generalizes variance: Cov(X, X) = Var(X).


(2) The covariance is symmetric: Cov(X, Y ) = Cov(Y, X).
(3) For any fixed scalars a, b ∈ R, Cov(aX + b, Y ) = a Cov(X, Y ).
(4) The covariance is bilinear: Cov(X + aY, Z) = Cov(X, Z) + a Cov(Y, Z) for any fixed
a ∈ R.
(5) If X and Y are independent, then Cov(X, Y ) = 0.
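For readers who want to experiment, here is a minimal Monte Carlo sanity check of these properties, a sketch assuming numpy is available. Sample covariances only approximate the true values, so each printed pair should agree to a few decimal places rather than exactly.

```python
import numpy as np

# Monte Carlo sanity check of Properties 1.2 (a sketch, not a proof).
rng = np.random.default_rng(0)
n = 1_000_000
X = rng.normal(size=n)
Y = rng.normal(size=n)
Z = rng.normal(size=n)
a, b = 3.0, 5.0

def cov(U, V):
    # Sample version of Cov(U, V) = E[UV] - E[U]E[V].
    return np.mean(U * V) - np.mean(U) * np.mean(V)

print(cov(X, X), np.var(X))                          # (1) Cov(X, X) = Var(X)
print(cov(X, Y), cov(Y, X))                          # (2) symmetry
print(cov(a * X + b, Y), a * cov(X, Y))              # (3) scaling and shifts
print(cov(X + a * Y, Z), cov(X, Z) + a * cov(Y, Z))  # (4) bilinearity
print(cov(X, Y))                                     # (5) ~0: X, Y independent
```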

Exercise 1.3. Show that each of the listed properties of the covariance is true.

Solution. See Section 3.



Remark 1.4. It is true that if X and Y are independent, then Cov(X, Y ) = 0. However, if
Cov(X, Y ) = 0, we cannot immediately conclude that X and Y are independent. In symbols:

$$X, Y \text{ independent} \implies \operatorname{Cov}(X, Y) = 0, \qquad \text{but} \qquad \operatorname{Cov}(X, Y) = 0 \not\Longrightarrow X, Y \text{ independent}.$$

Example 1.5. Let Z ∼ N(0, 1) and define X = Z². Clearly X and Z are not independent,
since X is a function of Z. Show that Cov(X, Z) = 0 even though X and Z are dependent.

Proof. With X = Z², we have

$$\operatorname{Cov}(X, Z) = E[XZ] - E[X]E[Z] = E[Z^3] - E[Z^2]E[Z].$$

Now, we know that E[Z] = 0, since Z ∼ N(0, 1) has mean 0. Also,

$$E[Z^3] = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} x^3 e^{-x^2/2} \, dx.$$

We can perform an integration by parts with u = x² and dv = x e^{−x²/2} dx. Then du = 2x dx and v = −e^{−x²/2}. So,

$$\int_{-\infty}^{\infty} x^3 e^{-x^2/2} \, dx = \left[-x^2 e^{-x^2/2}\right]_{-\infty}^{\infty} + 2\int_{-\infty}^{\infty} x e^{-x^2/2} \, dx = 0 + 2\int_{-\infty}^{\infty} x e^{-x^2/2} \, dx.$$

Recognize that this last integral is (up to a scalar constant) the same integral we calculate
to find E[Z], which is 0. Therefore E[Z^3] = 0, and hence Cov(X, Z) = 0 − 0 = 0.

Another way to see that E[Z^3] = 0 is to notice that x³ is an odd function and e^{−x²/2}
is an even function, so x³e^{−x²/2} is an odd function. The integral of an odd (integrable)
function from −∞ to ∞ is zero. Thus,

$$E[Z^3] = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} x^3 e^{-x^2/2} \, dx = 0.$$
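As a complement to the proof, a short simulation makes the point concrete (a sketch, assuming numpy): the sample covariance of X = Z² and Z hovers near zero, even though X is completely determined by Z.

```python
import numpy as np

# Example 1.5 by simulation: X = Z^2 is dependent on Z, yet Cov(X, Z) = 0.
rng = np.random.default_rng(1)
Z = rng.standard_normal(1_000_000)
X = Z**2

print(np.mean(X * Z) - np.mean(X) * np.mean(Z))  # ~0 up to Monte Carlo noise

# The dependence is still there: X is a deterministic function of |Z|.
print(np.corrcoef(X, np.abs(Z))[0, 1])           # strongly positive
```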

Exercise 1.6. Let X be a number selected at random from {2, 3, 4, 5}, equally likely to be
any of the four numbers. Once X is drawn, Y is drawn at random from the numbers {1, 2, ..., X},
with each number equally likely. For example, given that X = 3, Y is drawn from the numbers
{1, 2, 3} with P (Y = 1 | X = 3) = 1/3, P (Y = 2 | X = 3) = 1/3, and P (Y = 3 | X = 3) = 1/3.
Find the joint probability mass function of X and Y, and find Cov(X, Y ).

Solution. See Section 3.

Exercise 1.7. Let X and Y be jointly continuous random variables with joint density

$$f_{X,Y}(s, t) = \begin{cases} c\,(s^2 e^{-2t} + e^{-t}) & 0 < s < 1,\ 0 < t < \infty \\ 0 & \text{otherwise.} \end{cases}$$

Find Cov(X, Y ).

Solution. Let’s first find c. We have

$$\int_0^\infty \int_0^1 c\,(s^2 e^{-2t} + e^{-t}) \, ds\, dt = c \int_0^\infty \left(\frac{1}{3} e^{-2t} + e^{-t}\right) dt = c\left(\frac{1}{6} + 1\right) = \frac{7c}{6}.$$

Therefore, c = 6/7. For Cov(X, Y ) = E[XY ] − E[X]E[Y ], we will need the following:


$$E[XY] = \frac{6}{7} \int_0^\infty \int_0^1 st\,(s^2 e^{-2t} + e^{-t}) \, ds\, dt = \frac{6}{7} \int_0^\infty \int_0^1 (s^3 t e^{-2t} + s t e^{-t}) \, ds\, dt$$
$$= \frac{6}{7} \int_0^\infty \left(\frac{t}{4} e^{-2t} + \frac{t}{2} e^{-t}\right) dt = \frac{6}{7} \left(\frac{1}{4} \cdot \frac{1}{4} + \frac{1}{2} \cdot 1\right) = \frac{6}{7} \cdot \frac{9}{16},$$
$$E[X] = \frac{6}{7} \int_0^\infty \int_0^1 s\,(s^2 e^{-2t} + e^{-t}) \, ds\, dt = \frac{6}{7} \int_0^\infty \int_0^1 (s^3 e^{-2t} + s e^{-t}) \, ds\, dt$$
$$= \frac{6}{7} \int_0^\infty \left(\frac{1}{4} e^{-2t} + \frac{1}{2} e^{-t}\right) dt = \frac{6}{7} \left(\frac{1}{4} \cdot \frac{1}{2} + \frac{1}{2} \cdot 1\right) = \frac{6}{7} \cdot \frac{5}{8}$$

and
$$E[Y] = \frac{6}{7} \int_0^\infty \int_0^1 t\,(s^2 e^{-2t} + e^{-t}) \, ds\, dt = \frac{6}{7} \int_0^\infty \int_0^1 (s^2 t e^{-2t} + t e^{-t}) \, ds\, dt$$
$$= \frac{6}{7} \int_0^\infty \left(\frac{t}{3} e^{-2t} + t e^{-t}\right) dt = \frac{6}{7} \left(\frac{1}{3} \cdot \frac{1}{4} + 1\right) = \frac{6}{7} \cdot \frac{13}{12}.$$

Therefore,

$$\operatorname{Cov}(X, Y) = E[XY] - E[X]E[Y] = \frac{6}{7} \cdot \frac{9}{16} - \left(\frac{6}{7} \cdot \frac{5}{8}\right)\left(\frac{6}{7} \cdot \frac{13}{12}\right) = -\frac{3}{196}.$$
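These moments can also be double-checked numerically. The following sketch (assuming scipy is available) recomputes Cov(X, Y ) by numerical double integration against the density; it is a verification aid, not part of the formal solution.

```python
import numpy as np
from scipy.integrate import dblquad

# Numerical check of Exercise 1.7 (a sketch; assumes scipy is installed).
c = 6 / 7
f = lambda s, t: c * (s**2 * np.exp(-2 * t) + np.exp(-t))

def moment(g):
    # dblquad's integrand takes (inner, outer) = (t, s);
    # s runs over (0, 1) and t over (0, inf).
    val, _ = dblquad(lambda t, s: g(s, t) * f(s, t), 0, 1, 0, np.inf)
    return val

E_XY = moment(lambda s, t: s * t)
E_X = moment(lambda s, t: s)
E_Y = moment(lambda s, t: t)
print(E_XY - E_X * E_Y, -3 / 196)  # both ~ -0.0153
```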

2. Correlation

Definition 2.1. Suppose that X and Y are jointly distributed random variables with non-
zero variances. Then the correlation of X and Y is

$$\operatorname{Corr}(X, Y) = \frac{\operatorname{Cov}(X, Y)}{\sqrt{\operatorname{Var}(X)\operatorname{Var}(Y)}}.$$

Intuition 2.2. Remember back in multivariable calculus that if you take vectors v, w ∈ Rⁿ,
you can define the dot product v · w of the two vectors. Also remember that if you want to
find the length of a vector v, you can do this by ‖v‖ = √(v · v). Now, speaking abstractly,
the covariance of two random variables acts like a generalized “dot product” between the two
variables. That is, we can think of Cov(X, Y ) roughly as a dot product of X and Y. In this
analogy, the “length” of a random variable is ‖X‖ = √(Cov(X, X)) = √(Var(X)). Let’s
also remember that with two vectors v and w, we have an interpretation of the dot product as
v · w = ‖v‖ ‖w‖ cos(θ), where θ is the angle between the vectors. Solving for cos(θ), we have
cos(θ) = (v · w)/(‖v‖ ‖w‖). Hence, with the interpretation of Cov(X, Y ) as the dot product
of X and Y, ‖X‖ = √(Var(X)), and ‖Y‖ = √(Var(Y)), the formula for correlation gives
Corr(X, Y ) = Cov(X, Y )/(‖X‖ ‖Y‖). With this interpretation, you can envision Corr(X, Y ) = cos(θ),
where θ is (very roughly) the “angle between” the random variables X and Y.
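The analogy is exact at the level of data: for centered samples, the sample correlation literally is the cosine of the angle between the two data vectors in Rⁿ. A minimal sketch, assuming numpy:

```python
import numpy as np

# Correlation as a cosine (a sketch). Y is built so that Corr(X, Y) = 0.6.
rng = np.random.default_rng(2)
X = rng.normal(size=100_000)
Y = 0.6 * X + 0.8 * rng.normal(size=100_000)

cov = np.mean(X * Y) - np.mean(X) * np.mean(Y)
print(cov / np.sqrt(np.var(X) * np.var(Y)))  # the correlation formula, ~0.6

# Same number as the cosine of the angle between the centered sample vectors:
Xc, Yc = X - X.mean(), Y - Y.mean()
print(Xc @ Yc / (np.linalg.norm(Xc) * np.linalg.norm(Yc)))
```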

Proposition 2.3. For any random variables X and Y , it holds that −1 ≤ Corr(X, Y ) ≤ 1.

Intuitive Proof. With the interpretation that Corr(X, Y ) = cos(θ), where θ is some
generalized notion of the angle between X and Y: since cosine is always bounded between
−1 and 1, the correlation must be as well. (A rigorous proof uses the Cauchy–Schwarz
inequality.)

Definition 2.4. If Corr(X, Y ) = 0, we say that X and Y are uncorrelated.

Remark 2.5. Notice that Corr(X, Y ) = 0 if and only if Cov(X, Y ) = 0. So, all the previous
properties discussing when the covariance is zero still hold for the correlation.

Example 2.6. Show that Corr(aX + b, Y ) = Corr(X, Y ) for any fixed scalars a > 0 and
b ∈ R.

Solution. For random variables X and Y, and scalars a, b ∈ R, we have shown that
Cov(aX + b, Y ) = a Cov(X, Y ), and we have seen that Var(aX + b) = a² Var(X). Also, note
that since a > 0, √(a²) = |a| = a. Therefore,

$$\operatorname{Corr}(aX + b, Y) = \frac{\operatorname{Cov}(aX + b, Y)}{\sqrt{\operatorname{Var}(aX + b)}\sqrt{\operatorname{Var}(Y)}} = \frac{a \operatorname{Cov}(X, Y)}{\sqrt{a^2 \operatorname{Var}(X)}\sqrt{\operatorname{Var}(Y)}} = \frac{\operatorname{Cov}(X, Y)}{\sqrt{\operatorname{Var}(X)}\sqrt{\operatorname{Var}(Y)}} = \operatorname{Corr}(X, Y).$$

Let’s make a note that if a < 0, then √(a²) = |a| = −a, so our previous calculation would have
left us with Corr(aX + b, Y ) = − Corr(X, Y ), but this makes sense with our “dot product”
intuition, since if a < 0, then aX “switches the direction” of X.
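A quick simulation illustrates both cases (a sketch, assuming numpy): positive a leaves the correlation unchanged, while negative a flips its sign.

```python
import numpy as np

# Example 2.6 by simulation: affine maps and correlation.
rng = np.random.default_rng(3)
X = rng.normal(size=100_000)
Y = X + rng.normal(size=100_000)

r = lambda U, V: np.corrcoef(U, V)[0, 1]
print(r(X, Y))           # baseline correlation
print(r(3 * X + 5, Y))   # a > 0: unchanged
print(r(-3 * X + 5, Y))  # a < 0: sign flipped
```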

Example 2.7. Let X be a number selected at random from {2, 3, 4, 5}, equally likely to be
any of the four numbers. Once X is drawn, Y is drawn at random from the numbers {1, 2, ..., X},
with each number equally likely. For example, given that X = 3, Y is drawn from the numbers
{1, 2, 3} with P (Y = 1 | X = 3) = 1/3, P (Y = 2 | X = 3) = 1/3, and P (Y = 3 | X = 3) = 1/3.
Find Corr(X, Y ).

Solution. Much of the work has already been done in Exercise 1.6, where we found
Cov(X, Y ) = 5/8, E[X] = 7/2, and E[Y ] = 9/4. It remains to find E[X²] and E[Y²]. To this end,

$$E[X^2] = \sum_s \sum_t s^2\, p_{X,Y}(s, t) = \sum_{s=2}^{5} \sum_{t=1}^{s} s^2 \cdot \frac{1}{4s} = \sum_{s=2}^{5} \frac{s^2}{4} = \frac{4 + 9 + 16 + 25}{4} = \frac{27}{2}$$

and

$$E[Y^2] = \sum_s \sum_t t^2\, p_{X,Y}(s, t) = \sum_{s=2}^{5} \frac{1}{4s} \sum_{t=1}^{s} t^2 = \frac{1+4}{4 \cdot 2} + \frac{1+4+9}{4 \cdot 3} + \frac{1+4+9+16}{4 \cdot 4} + \frac{1+4+9+16+25}{4 \cdot 5} = \frac{77}{12}.$$

From before, we now have

$$\operatorname{Var}(X) = \frac{27}{2} - \left(\frac{7}{2}\right)^2 = \frac{5}{4}$$

and

$$\operatorname{Var}(Y) = \frac{77}{12} - \left(\frac{9}{4}\right)^2 = \frac{65}{48}.$$

Therefore,

$$\operatorname{Corr}(X, Y) = \frac{\operatorname{Cov}(X, Y)}{\sqrt{\operatorname{Var}(X)\operatorname{Var}(Y)}} = \frac{5/8}{\sqrt{(5/4)(65/48)}} \approx 0.4804.$$
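Since the pmf is finite, every step above can be reproduced exactly with rational arithmetic. A sketch using Python’s fractions module:

```python
from fractions import Fraction
from math import sqrt

# Exact enumeration of the pmf p(s, t) = 1/(4s) for 1 <= t <= s, s in {2,...,5}.
pmf = {(s, t): Fraction(1, 4 * s) for s in range(2, 6) for t in range(1, s + 1)}

E = lambda g: sum(p * g(s, t) for (s, t), p in pmf.items())
EX, EY, EXY = E(lambda s, t: s), E(lambda s, t: t), E(lambda s, t: s * t)
cov = EXY - EX * EY
var_x = E(lambda s, t: s * s) - EX**2
var_y = E(lambda s, t: t * t) - EY**2

print(cov, var_x, var_y)                 # 5/8, 5/4, 65/48
print(float(cov) / sqrt(var_x * var_y))  # ~0.4804
```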

Exercise 2.8. Let X and Y be jointly continuous random variables with joint density

$$f_{X,Y}(s, t) = \begin{cases} c\,(s^2 e^{-2t} + e^{-t}) & 0 < s < 1,\ 0 < t < \infty \\ 0 & \text{otherwise.} \end{cases}$$

Find Corr(X, Y ).

Solution. See Section 3.



3. Solutions to Exercises

Solution to 1.3. Let X, Y , and Z be random variables and a, b ∈ R be scalars. Then,

(1) For a random variable X,

$$\operatorname{Cov}(X, X) = E[X \cdot X] - E[X]E[X] = E[X^2] - \left(E[X]\right)^2 = \operatorname{Var}(X).$$

(2) For random variables X and Y ,

Cov(X, Y ) = E[XY ] − E[X]E[Y ] = E[Y X] − E[Y ]E[X] = Cov(Y, X).

(3) For random variables X and Y, and scalars a, b ∈ R,

$$\operatorname{Cov}(aX + b, Y) = E[(aX + b)Y] - E[aX + b]E[Y]$$
$$= aE[XY] + bE[Y] - \left(aE[X]E[Y] + bE[Y]\right)$$
$$= a\left(E[XY] - E[X]E[Y]\right) + b\left(E[Y] - E[Y]\right)$$
$$= a \operatorname{Cov}(X, Y).$$

(4) For random variables X, Y, and Z, and scalar a ∈ R,

$$\operatorname{Cov}(X + aY, Z) = E[(X + aY)Z] - E[X + aY]E[Z]$$
$$= E[XZ] + aE[YZ] - \left(E[X]E[Z] + aE[Y]E[Z]\right)$$
$$= \left(E[XZ] - E[X]E[Z]\right) + a\left(E[YZ] - E[Y]E[Z]\right)$$
$$= \operatorname{Cov}(X, Z) + a \operatorname{Cov}(Y, Z).$$

(5) If X and Y are independent random variables, then E[XY ] = E[X]E[Y ], so

Cov(X, Y ) = E[XY ] − E[X]E[Y ] = E[X]E[Y ] − E[X]E[Y ] = 0.

Solution to 1.6. For the joint probability mass function, we have

$$p_{X,Y}(s, t) = P(X = s, Y = t) = P(Y = t \mid X = s)\, P(X = s).$$

Further, by the assumptions of the problem, P (X = s) = 1/4 for any choice of s ∈ {2, 3, 4, 5},
and

$$P(Y = t \mid X = s) = \begin{cases} \dfrac{1}{s} & 1 \le t \le s \\ 0 & \text{otherwise.} \end{cases}$$

So, we have

$$p_{X,Y}(s, t) = P(Y = t \mid X = s)\, P(X = s) = \begin{cases} \dfrac{1}{4s} & 1 \le t \le s \\ 0 & \text{otherwise.} \end{cases}$$

As a table, this is

            Y = 1     Y = 2     Y = 3     Y = 4     Y = 5    P(X = k)
  X = 2      1/8       1/8        0         0         0        1/4
  X = 3      1/12      1/12      1/12       0         0        1/4
  X = 4      1/16      1/16      1/16      1/16       0        1/4
  X = 5      1/20      1/20      1/20      1/20      1/20      1/4
  P(Y = k)  77/240    77/240    47/240     9/80      1/20

To find Cov(X, Y ) = E[XY ] − E[X]E[Y ], we have

$$E[XY] = \sum_s \sum_t st\, p_{X,Y}(s, t) = \sum_{s=2}^{5} \sum_{t=1}^{s} st \cdot \frac{1}{4s} = \sum_{s=2}^{5} \sum_{t=1}^{s} \frac{t}{4}$$
$$= \frac{1 + 2}{4} + \frac{1 + 2 + 3}{4} + \frac{1 + 2 + 3 + 4}{4} + \frac{1 + 2 + 3 + 4 + 5}{4} = \frac{17}{2},$$

       
$$E[X] = 2\left(\frac{1}{4}\right) + 3\left(\frac{1}{4}\right) + 4\left(\frac{1}{4}\right) + 5\left(\frac{1}{4}\right) = \frac{7}{2},$$

and
$$E[Y] = 1\left(\frac{77}{240}\right) + 2\left(\frac{77}{240}\right) + 3\left(\frac{47}{240}\right) + 4\left(\frac{9}{80}\right) + 5\left(\frac{1}{20}\right) = \frac{9}{4}.$$

Therefore, Cov(X, Y ) = E[XY ] − E[X]E[Y ] = 17/2 − (7/2)(9/4) = 5/8.
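The two-stage sampling scheme is also easy to simulate directly, which gives an independent check on Cov(X, Y ) = 5/8. A sketch, assuming numpy:

```python
import numpy as np

# Forward simulation of Exercise 1.6's two-stage draw.
rng = np.random.default_rng(4)
n = 1_000_000
X = rng.integers(2, 6, size=n)  # uniform on {2, 3, 4, 5}
Y = rng.integers(1, X + 1)      # given X, uniform on {1, ..., X}

print(np.mean(X * Y) - np.mean(X) * np.mean(Y), 5 / 8)  # estimate vs exact
```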

Solution to 2.8. Much of the work has already been done in Exercise 1.7, where we found
Cov(X, Y ) = −3/196, E[X] = (6/7) · (5/8) = 15/28, and E[Y ] = (6/7) · (13/12) = 13/14. It
remains to find E[X²] and E[Y²]. To this end,

$$E[X^2] = \frac{6}{7} \int_0^\infty \int_0^1 s^2 (s^2 e^{-2t} + e^{-t}) \, ds\, dt = \frac{6}{7} \int_0^\infty \int_0^1 (s^4 e^{-2t} + s^2 e^{-t}) \, ds\, dt$$
$$= \frac{6}{7} \int_0^\infty \left(\frac{1}{5} e^{-2t} + \frac{1}{3} e^{-t}\right) dt = \frac{6}{7} \left(\frac{1}{10} + \frac{1}{3}\right) = \frac{13}{35}$$

and
$$E[Y^2] = \frac{6}{7} \int_0^\infty \int_0^1 t^2 (s^2 e^{-2t} + e^{-t}) \, ds\, dt = \frac{6}{7} \int_0^\infty \int_0^1 (s^2 t^2 e^{-2t} + t^2 e^{-t}) \, ds\, dt$$
$$= \frac{6}{7} \int_0^\infty \left(\frac{1}{3} t^2 e^{-2t} + t^2 e^{-t}\right) dt = \frac{6}{7} \left(\frac{1}{12} + 2\right) = \frac{25}{14}.$$

We then find
$$\operatorname{Var}(X) = \frac{13}{35} - \left(\frac{15}{28}\right)^2 = \frac{331}{3920}$$
and
$$\operatorname{Var}(Y) = \frac{25}{14} - \left(\frac{13}{14}\right)^2 = \frac{181}{196}.$$
Therefore,

$$\operatorname{Corr}(X, Y) = \frac{\operatorname{Cov}(X, Y)}{\sqrt{\operatorname{Var}(X)\operatorname{Var}(Y)}} = \frac{-3/196}{\sqrt{(331/3920)(181/196)}} \approx -0.0548.$$
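As a final cross-check, the density happens to factor as a two-component mixture, which makes it easy to sample from directly. A Monte Carlo sketch, assuming numpy; the estimate should land near −0.0548.

```python
import numpy as np

# f(s, t) = (1/7)[3s^2][2e^(-2t)] + (6/7)[1][e^(-t)]: a mixture where, with
# probability 1/7, s has density 3s^2 on (0, 1) and t ~ Exp(rate 2), and
# otherwise s ~ Uniform(0, 1) and t ~ Exp(rate 1).
rng = np.random.default_rng(5)
n = 2_000_000
comp1 = rng.random(n) < 1 / 7
S = np.where(comp1, rng.random(n) ** (1 / 3), rng.random(n))  # inverse CDF of 3s^2
T = np.where(comp1, rng.exponential(1 / 2, n), rng.exponential(1.0, n))

cov = np.mean(S * T) - S.mean() * T.mean()
print(cov / np.sqrt(S.var() * T.var()))  # ~ -0.0548
```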
