COVARIANCE AND CORRELATION LECTURE NOTES
PROFESSOR HOHN
Contents
1. Covariance
2. Correlation
3. Solutions to Exercises
1. Covariance
Definition 1.1. Let X and Y be jointly distributed random variables. The covariance of X and Y is defined by
Cov(X, Y ) = E[XY ] − E[X]E[Y ]
This is equivalent to
Cov(X, Y ) = E[(X − µX )(Y − µY )]
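The equivalence of the two forms is an algebraic identity, so it also holds exactly when expectations are replaced by sample means. A quick numerical sanity check (a Python sketch, not part of the original notes; the data is arbitrary illustrative input):

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=100_000)
y = 0.5 * x + rng.normal(size=100_000)  # y is correlated with x

# First form: E[XY] - E[X]E[Y], with sample means in place of expectations
cov1 = np.mean(x * y) - np.mean(x) * np.mean(y)
# Second form: E[(X - mu_X)(Y - mu_Y)]
cov2 = np.mean((x - x.mean()) * (y - y.mean()))

assert abs(cov1 - cov2) < 1e-10  # the two forms agree up to rounding
```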
Properties 1.2. Let X, Y, and Z be jointly distributed random variables and let a, b ∈ R. From the definition of covariance, we derive the following:
(1) Cov(X, X) = Var(X),
(2) Cov(X, Y) = Cov(Y, X),
(3) Cov(aX + b, Y) = a Cov(X, Y),
(4) Cov(X + Y, Z) = Cov(X, Z) + Cov(Y, Z),
(5) if X and Y are independent, then Cov(X, Y) = 0.
Exercise 1.3. Show that each of the listed properties of the covariance is true.
Remark 1.4. It is true that if X and Y are independent, then Cov(X, Y) = 0. However, if Cov(X, Y) = 0, we cannot immediately conclude that X and Y are independent. Put graphically:

Cov(X, Y) = 0  ⇏  X and Y independent.
Example 1.5. Let Z ∼ N(0, 1) and define X = Z². It is clear that X and Z are not independent (X is a function of Z). Show that Cov(X, Z) = 0 even though X and Z are dependent.
Solution. By definition, Cov(X, Z) = E[XZ] − E[X]E[Z] = E[Z³] − E[Z²]E[Z]. Now, we know that E[Z] = 0 (since Z ∼ N(0, 1), the mean is 0). Also,

E[Z³] = (1/√(2π)) ∫_{−∞}^{∞} x³ e^{−x²/2} dx.
We can perform an integration by parts with u = x² and dv = x e^{−x²/2} dx. Then du = 2x dx and v = −e^{−x²/2}. So,

∫_{−∞}^{∞} x³ e^{−x²/2} dx = [−x² e^{−x²/2}]_{−∞}^{∞} + 2 ∫_{−∞}^{∞} x e^{−x²/2} dx
                           = 0 + 2 ∫_{−∞}^{∞} x e^{−x²/2} dx.
Recognize that this last integral is (up to a scalar constant) the same integral we calculate to find E[Z], which is 0. Therefore E[Z³] = 0. Hence Cov(X, Z) = 0 − 0 = 0.
Another way to arrive at E[Z³] = 0 is by noticing that x³ is an odd function and e^{−x²/2} is an even function. Hence x³ e^{−x²/2} is an odd function, and the integral of an odd function from −∞ to ∞ is zero. Thus,

E[Z³] = (1/√(2π)) ∫_{−∞}^{∞} x³ e^{−x²/2} dx = 0.
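The conclusion of Example 1.5 can also be checked by simulation. The following sketch (NumPy assumed; not part of the original notes) estimates Cov(Z², Z) from a large sample:

```python
import numpy as np

rng = np.random.default_rng(42)
z = rng.standard_normal(1_000_000)
x = z**2  # X = Z^2 is a deterministic function of Z, so X and Z are dependent

# Sample estimate of Cov(X, Z) = E[XZ] - E[X]E[Z] = E[Z^3]
cov = np.mean(x * z) - np.mean(x) * np.mean(z)
print(cov)  # close to 0, even though X and Z are dependent
```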
Exercise 1.6. Let X be a number selected at random from {2, 3, 4, 5}, equally likely to be any of the four numbers. Once X is drawn, Y is drawn at random from the numbers {1, 2, ..., X}, with each number equally likely. For example, given that X = 3, Y is drawn from {1, 2, 3} with P(Y = 1 | X = 3) = P(Y = 2 | X = 3) = P(Y = 3 | X = 3) = 1/3. Find the joint probability mass function of X and Y, and find Cov(X, Y).
Example 1.7. Let X and Y be jointly continuous random variables with joint density

f_{X,Y}(s, t) = { c(s² e^{−2t} + e^{−t}),   0 < s < 1, 0 < t < ∞
               { 0,                          otherwise

Find Cov(X, Y).
Solution. Integrating the density over its support and setting the result to 1 gives c = 6/7. Then

E[X] = (6/7) ∫_0^∞ ∫_0^1 s(s² e^{−2t} + e^{−t}) ds dt = (6/7)·(5/8) = 15/28,

E[XY] = (6/7) ∫_0^∞ ∫_0^1 st(s² e^{−2t} + e^{−t}) ds dt = (6/7)·(9/16) = 27/56,

and
E[Y] = (6/7) ∫_0^∞ ∫_0^1 t(s² e^{−2t} + e^{−t}) ds dt
     = (6/7) ∫_0^∞ ∫_0^1 (s² t e^{−2t} + t e^{−t}) ds dt
     = (6/7) ∫_0^∞ ((t/3) e^{−2t} + t e^{−t}) dt
     = (6/7) ((1/3)·(1/4) + 1)
     = (6/7)·(13/12) = 13/14.
Therefore,

Cov(X, Y) = E[XY] − E[X]E[Y] = (6/7)·(9/16) − (15/28)·(13/14) = −3/196.
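The double integrals above can be checked numerically. A sketch (NumPy assumed; not part of the original notes) that approximates the expectations with a midpoint rule on a grid:

```python
import numpy as np

# Midpoint-rule approximation of the double integrals; the t-range is
# truncated at T = 40, where e^{-t} is negligible.
ns, nt, T = 400, 4000, 40.0
ds, dt = 1.0 / ns, T / nt
s = (np.arange(ns) + 0.5) * ds   # midpoints in (0, 1)
t = (np.arange(nt) + 0.5) * dt   # midpoints in (0, T)
S, Tt = np.meshgrid(s, t, indexing="ij")

f = (6 / 7) * (S**2 * np.exp(-2 * Tt) + np.exp(-Tt))  # joint density
w = f * ds * dt                                       # cell probabilities

EX = np.sum(S * w)
EY = np.sum(Tt * w)
EXY = np.sum(S * Tt * w)
cov = EXY - EX * EY
print(cov)  # close to -3/196
```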
2. Correlation
Definition 2.1. Suppose that X and Y are jointly distributed random variables with nonzero variances. Then the correlation of X and Y is

Corr(X, Y) = Cov(X, Y) / √(Var(X) Var(Y)).
Intuition 2.2. Recall from multivariable calculus that for vectors v, w ∈ Rⁿ we can define the dot product v · w, and that the length of a vector v is ‖v‖ = √(v · v). Speaking abstractly, the covariance of two random variables acts like a generalized dot product between them: we can think of Cov(X, Y) roughly as a dot product of X and Y. In this analogy, the "length" of a random variable is ‖X‖ = √(Cov(X, X)) = √(Var(X)). Recall also that for two vectors v and w, the dot product has the interpretation v · w = ‖v‖‖w‖ cos(θ), where θ is the angle between the vectors; solving for cos(θ) gives cos(θ) = (v · w)/(‖v‖‖w‖). Hence, interpreting Cov(X, Y) as the dot product of X and Y, with ‖X‖ = √(Var(X)) and ‖Y‖ = √(Var(Y)), the formula for correlation reads Corr(X, Y) = Cov(X, Y)/(‖X‖‖Y‖). With this interpretation, you can envision Corr(X, Y) = cos(θ), where θ is (very roughly) the "angle between" the random variables X and Y.
Proposition 2.3. For any random variables X and Y with nonzero variances, it holds that −1 ≤ Corr(X, Y) ≤ 1.

Intuitive Proof. With the interpretation that Corr(X, Y) = cos(θ), where θ is some generalized notion of the angle between X and Y: since cosine is always bounded between −1 and 1, the correlation must be as well. (A rigorous proof uses the Cauchy–Schwarz inequality.)
Remark 2.5. Notice that Corr(X, Y ) = 0 if and only if Cov(X, Y ) = 0. So, all the previous
properties discussing when the covariance is zero still hold for the correlation.
Example 2.6. Show that Corr(aX + b, Y ) = Corr(X, Y ) for any fixed scalars a > 0 and
b ∈ R.
Solution. For random variables X and Y, and scalars a, b ∈ R, we have shown that Cov(aX + b, Y) = a Cov(X, Y), and we have seen that Var(aX + b) = a² Var(X). Also, note that since a > 0, √(a²) = |a| = a. Therefore,

Corr(aX + b, Y) = Cov(aX + b, Y) / √(Var(aX + b) Var(Y))
               = a Cov(X, Y) / (√(a² Var(X)) √(Var(Y)))
               = Cov(X, Y) / √(Var(X) Var(Y)) = Corr(X, Y).

Note that if a < 0, then √(a²) = |a| = −a, so the same calculation would give Corr(aX + b, Y) = −Corr(X, Y). This makes sense with our "dot product" intuition: if a < 0, then aX "switches the direction" of X.
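Both the invariance under a > 0 and the sign flip under a < 0 are easy to confirm numerically. A sketch using NumPy's `corrcoef` (not part of the original notes; the data is arbitrary):

```python
import numpy as np

rng = np.random.default_rng(1)
x = rng.normal(size=50_000)
y = x + rng.normal(size=50_000)

corr = lambda u, v: np.corrcoef(u, v)[0, 1]
base = corr(x, y)

assert abs(corr(3.0 * x + 7.0, y) - base) < 1e-9   # a > 0: correlation unchanged
assert abs(corr(-3.0 * x + 7.0, y) + base) < 1e-9  # a < 0: the sign flips
```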
Example 2.7. Let X be a number selected at random from {2, 3, 4, 5}, equally likely to be any of the four numbers. Once X is drawn, Y is drawn at random from the numbers {1, 2, ..., X}, with each number equally likely. For example, given that X = 3, Y is drawn from {1, 2, 3} with P(Y = 1 | X = 3) = P(Y = 2 | X = 3) = P(Y = 3 | X = 3) = 1/3. Find Corr(X, Y).
Solution. Much of the work was already done in Exercise 1.6, where we found Cov(X, Y) = 5/8, E[X] = 7/2, and E[Y] = 9/4. It remains to find E[X²] and E[Y²]. To this end,

E[X²] = Σ_s Σ_t s² p_{X,Y}(s, t) = Σ_{s=2}^{5} Σ_{t=1}^{s} s² · (1/(4s)) = Σ_{s=2}^{5} s²/4
      = 4/4 + 9/4 + 16/4 + 25/4 = 27/2
and

E[Y²] = Σ_s Σ_t t² p_{X,Y}(s, t) = Σ_{s=2}^{5} Σ_{t=1}^{s} t² · (1/(4s)) = Σ_{s=2}^{5} (1/(4s)) Σ_{t=1}^{s} t²
      = (1/(4·2))(1 + 4) + (1/(4·3))(1 + 4 + 9) + (1/(4·4))(1 + 4 + 9 + 16) + (1/(4·5))(1 + 4 + 9 + 16 + 25)
      = 77/12.

Therefore Var(X) = 27/2 − (7/2)² = 5/4 and Var(Y) = 77/12 − (9/4)² = 65/48, so

Corr(X, Y) = (5/8) / √((5/4)·(65/48)) = √(3/13) ≈ 0.4804.
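Because the joint pmf here is finite, all of these moments can be computed exactly by enumeration. A sketch (not part of the original notes) using Python's `fractions` module:

```python
from fractions import Fraction
from math import sqrt

# Joint pmf from Exercise 1.6: p(s, t) = 1/(4s) for s in {2,...,5}, t in {1,...,s}
pmf = {(s, t): Fraction(1, 4 * s)
       for s in range(2, 6) for t in range(1, s + 1)}

EX = sum(s * p for (s, t), p in pmf.items())
EY = sum(t * p for (s, t), p in pmf.items())
EXY = sum(s * t * p for (s, t), p in pmf.items())
EX2 = sum(s * s * p for (s, t), p in pmf.items())
EY2 = sum(t * t * p for (s, t), p in pmf.items())

cov = EXY - EX * EY                     # exactly 5/8
corr = float(cov) / sqrt(float((EX2 - EX**2) * (EY2 - EY**2)))
print(cov, corr)                        # 5/8 and about 0.4804
```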
Exercise 2.8. Let X and Y be jointly continuous random variables with joint density

f_{X,Y}(s, t) = { c(s² e^{−2t} + e^{−t}),   0 < s < 1, 0 < t < ∞
               { 0,                          otherwise

Find Corr(X, Y).
3. Solutions to Exercises
Solution to 1.3 (scaling property). By linearity of expectation,

Cov(aX + b, Y) = E[(aX + b)Y] − E[aX + b]E[Y] = a E[XY] + b E[Y] − (a E[X] + b)E[Y]
              = a(E[XY] − E[X]E[Y]) = a Cov(X, Y).

Solution to 1.6. By the assumptions of the problem, P(X = s) = 1/4 for any choice of s ∈ {2, 3, 4, 5}, and

P(Y = t | X = s) = { 1/s,   t ∈ {1, ..., s}
                   { 0,     otherwise

So, we have

p_{X,Y}(s, t) = P(Y = t | X = s) P(X = s) = { 1/(4s),   t ∈ {1, ..., s}
                                            { 0,         otherwise.
As a table, this is

  s \ t |    1       2       3       4      5    | P(X = s)
  ------+-----------------------------------------+---------
    2   |   1/8     1/8      0       0      0    |   1/4
    3   |   1/12    1/12    1/12     0      0    |   1/4
    4   |   1/16    1/16    1/16    1/16    0    |   1/4
    5   |   1/20    1/20    1/20    1/20   1/20  |   1/4
  ------+-----------------------------------------+---------
 P(Y=t) |  77/240  77/240  47/240   9/80   1/20  |
E[X] = 2·(1/4) + 3·(1/4) + 4·(1/4) + 5·(1/4) = 7/2
and
E[Y] = 1·(77/240) + 2·(77/240) + 3·(47/240) + 4·(9/80) + 5·(1/20) = 9/4
Also, E[XY] = Σ_{s=2}^{5} Σ_{t=1}^{s} st · (1/(4s)) = Σ_{s=2}^{5} (1/4) Σ_{t=1}^{s} t = (1/4)(3 + 6 + 10 + 15) = 17/2.

Therefore Cov(X, Y) = E[XY] − E[X]E[Y] = 17/2 − (7/2)·(9/4) = 5/8.
Solution to 2.8. Much of the work has already been done in Example 1.7. We found Cov(X, Y) = −3/196, E[X] = (6/7)·(5/8) = 15/28, and E[Y] = (6/7)·(13/12) = 13/14. We have left to find E[X²] and E[Y²]. To this end,

E[X²] = (6/7) ∫_0^∞ ∫_0^1 s²(s² e^{−2t} + e^{−t}) ds dt = (6/7) ∫_0^∞ ∫_0^1 (s⁴ e^{−2t} + s² e^{−t}) ds dt
      = (6/7) ∫_0^∞ ((1/5) e^{−2t} + (1/3) e^{−t}) dt = (6/7)·(1/10 + 1/3) = 13/35,
and

E[Y²] = (6/7) ∫_0^∞ ∫_0^1 t²(s² e^{−2t} + e^{−t}) ds dt = (6/7) ∫_0^∞ ∫_0^1 (s² t² e^{−2t} + t² e^{−t}) ds dt
      = (6/7) ∫_0^∞ ((1/3) t² e^{−2t} + t² e^{−t}) dt = (6/7)·(1/12 + 2) = 25/14.
We then find

Var(X) = 13/35 − (15/28)² = 331/3920

and

Var(Y) = 25/14 − (13/14)² = 181/196.
Therefore,

Corr(X, Y) = Cov(X, Y) / √(Var(X) Var(Y)) = (−3/196) / √((331/3920)·(181/196)) ≈ −0.0548.
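This value can also be estimated by Monte Carlo. The density factors as a mixture of two product densities, f(s, t) = (1/7)·(3s²)·(2e^{−2t}) + (6/7)·(1)·(e^{−t}), so we can sample (X, Y) exactly and compute the sample correlation. A sketch (NumPy assumed; not part of the original notes):

```python
import numpy as np

rng = np.random.default_rng(7)
n = 1_000_000

# Mixture weights 1/7 and 6/7 for the two product components of the density
first = rng.random(n) < 1 / 7
s = np.where(first, rng.random(n) ** (1 / 3),   # density 3s^2 on (0, 1)
             rng.random(n))                     # uniform on (0, 1)
t = np.where(first, rng.exponential(0.5, n),    # exponential with rate 2
             rng.exponential(1.0, n))           # exponential with rate 1

r = np.corrcoef(s, t)[0, 1]
print(r)  # close to -0.0548
```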