Covariance - Correlation - Variance of A Sum - Correlation Coefficient
Note that
Cov(X, Y) = E[(X − EX)(Y − EY)] = E[XY] − (EX)(EY).
Intuitively, the covariance between X and Y indicates how the values of X and Y move relative to each other. If large values of X tend to
happen with large values of Y , then (X − E X)(Y − E Y ) is positive on average. In this case, the covariance is positive and we say X
and Y are positively correlated. On the other hand, if X tends to be small when Y is large, then (X − E X)(Y − E Y ) is negative on
average. In this case, the covariance is negative and we say X and Y are negatively correlated.
Example 5.32
Suppose X ∼ Uniform(1, 2), and given X = x, Y is exponential with parameter λ = x. Find Cov(X, Y).
Solution
We can use Cov(X, Y) = E[XY] − (EX)(EY). We have EX = 3/2 and
EY = E[E[Y|X]]
= E[1/X]   (since Y|X ∼ Exponential(X))
= ∫_1^2 (1/x) dx
= ln 2.
We also have
E[XY] = E[E[XY|X]]
= E[X E[Y|X]]
= E[X · (1/X)]   (since Y|X ∼ Exponential(X))
= 1.
Thus,
Cov(X, Y) = E[XY] − (EX)(EY) = 1 − (3/2) ln 2.
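Since both X and Y|X have standard samplers, the answer is easy to sanity-check by simulation. This is only a sketch; the sample size and seed are arbitrary choices, not part of the example.

```python
import math
import random

# Monte Carlo sanity check of Example 5.32. The sample size and seed are
# arbitrary choices, not part of the example.
random.seed(0)
n = 200_000
xs, ys = [], []
for _ in range(n):
    x = random.uniform(1, 2)           # X ~ Uniform(1, 2)
    xs.append(x)
    ys.append(random.expovariate(x))   # Y | X = x ~ Exponential with rate x

def mean(values):
    return sum(values) / len(values)

# Sample version of Cov(X, Y) = E[XY] - (EX)(EY).
cov_est = mean([x * y for x, y in zip(xs, ys)]) - mean(xs) * mean(ys)
cov_exact = 1 - 1.5 * math.log(2)      # 1 - (3/2) ln 2, about -0.04
```

The estimate should fall within a few standard errors of 1 − (3/2) ln 2 ≈ −0.04; note the negative sign, matching the intuition that larger X makes Y (with rate X) stochastically smaller.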
Lemma 5.3
The covariance has the following properties:
1. Cov(X, X) = Var(X);
2. if X and Y are independent, then Cov(X, Y) = 0;
3. Cov(X, c) = 0 for any constant c;
4. Cov(aX, Y) = a Cov(X, Y);
5. Cov(X + c, Y) = Cov(X, Y);
6. Cov(X + Y, Z) = Cov(X, Z) + Cov(Y, Z);
7. more generally,
Cov(∑_{i=1}^m a_i X_i, ∑_{j=1}^n b_j Y_j) = ∑_{i=1}^m ∑_{j=1}^n a_i b_j Cov(X_i, Y_j).
All of the above results can be proven directly from the definition of covariance. For example, if X and Y are independent, then as we have seen before, E[XY] = (EX)(EY), so
Cov(X, Y) = E[XY] − (EX)(EY) = 0.
Note that the converse is not necessarily true. That is, if Cov(X, Y ) = 0, X and Y may or may not be independent.
Let us prove Item 6 in Lemma 5.3, Cov(X + Y, Z) = Cov(X, Z) + Cov(Y, Z). We have
Cov(X + Y, Z) = E[(X + Y)Z] − E(X + Y)EZ
= E[XZ + YZ] − (EX + EY)EZ
= E[XZ] − (EX)(EZ) + E[YZ] − (EY)(EZ)
= Cov(X, Z) + Cov(Y, Z).
You can prove the rest of the items in Lemma 5.3 similarly.
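The items in Lemma 5.3 are algebraic identities, so they can also be sanity-checked numerically. The following sketch treats a few arbitrary sample points as an entire population, where Cov(U, V) = E[UV] − (EU)(EV) holds exactly up to floating-point rounding.

```python
import random

# Numerical sanity check of Items 4-6 of Lemma 5.3 (a sketch, not a proof).
# Three small lists of numbers are treated as entire populations.
random.seed(1)
X = [random.random() for _ in range(5)]
Y = [random.random() for _ in range(5)]
Z = [random.random() for _ in range(5)]

def mean(values):
    return sum(values) / len(values)

def cov(U, V):
    return mean([u * v for u, v in zip(U, V)]) - mean(U) * mean(V)

a, c = 3.0, 7.0
item4 = (cov([a * x for x in X], Y), a * cov(X, Y))   # Cov(aX, Y) = a Cov(X, Y)
item5 = (cov([x + c for x in X], Y), cov(X, Y))       # Cov(X + c, Y) = Cov(X, Y)
item6 = (cov([x + y for x, y in zip(X, Y)], Z),       # Cov(X + Y, Z)
         cov(X, Z) + cov(Y, Z))                       #   = Cov(X, Z) + Cov(Y, Z)
```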
Example 5.33
Let X and Y be two independent N(0, 1) random variables, and let
Z = 1 + X + XY²,
W = 1 + X.
Find Cov(Z, W).
Solution
Cov(Z, W) = Cov(1 + X + XY², 1 + X)
= Cov(X + XY², X)   (by part 5 of Lemma 5.3)
= Cov(X, X) + Cov(XY², X)   (by part 6 of Lemma 5.3)
= Var(X) + E[X²Y²] − E[XY²]E[X]   (by part 1 of Lemma 5.3 & definition of Cov)
= 1 + E[X²]E[Y²] − (E[X])²E[Y²]   (since X and Y are independent)
= 1 + 1 − 0 = 2.
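The solution only uses EX = 0, Var(X) = E[X²] = 1, E[Y²] = 1, and independence, so a quick simulation with independent standard normals can sanity-check the answer (sample size and seed are arbitrary choices):

```python
import random

# Monte Carlo check of Cov(Z, W) = 2, taking X and Y to be independent
# standard normal random variables (any independent pair with EX = 0,
# E[X^2] = 1, E[Y^2] = 1 would give the same covariance).
random.seed(2)
n = 400_000
zs, ws = [], []
for _ in range(n):
    x = random.gauss(0.0, 1.0)
    y = random.gauss(0.0, 1.0)
    zs.append(1 + x + x * y * y)   # Z = 1 + X + XY^2
    ws.append(1 + x)               # W = 1 + X

def mean(values):
    return sum(values) / len(values)

cov_zw = mean([z * w for z, w in zip(zs, ws)]) - mean(zs) * mean(ws)
```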
Variance of a sum:
One of the applications of covariance is finding the variance of a sum of several random variables. In particular, if Z = X + Y, then
Var(Z) = Cov(Z, Z)
= Cov(X + Y, X + Y)
= Cov(X, X) + Cov(X, Y) + Cov(Y, X) + Cov(Y, Y)
= Var(X) + Var(Y) + 2 Cov(X, Y).
More generally, for real numbers a and b,
Var(aX + bY) = a² Var(X) + b² Var(Y) + 2ab Cov(X, Y).   (5.21)
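Equation 5.21 is an algebraic identity in the moments, so it holds exactly (up to floating-point rounding) for the population variance and covariance of any finite list of sample points; a small sketch with arbitrary data:

```python
import random

# Exact check of Equation 5.21 on a small "population" of sample points.
random.seed(3)
X = [random.random() for _ in range(6)]
Y = [random.random() for _ in range(6)]

def mean(values):
    return sum(values) / len(values)

def cov(U, V):
    return mean([u * v for u, v in zip(U, V)]) - mean(U) * mean(V)

def var(U):
    return cov(U, U)   # Item 1 of Lemma 5.3: Cov(X, X) = Var(X)

a, b = 2.0, -3.0
lhs = var([a * x + b * y for x, y in zip(X, Y)])
rhs = a**2 * var(X) + b**2 * var(Y) + 2 * a * b * cov(X, Y)
```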
Correlation Coefficient:
The correlation coefficient, denoted by ρXY or ρ(X, Y ), is obtained by normalizing the covariance. In particular, we define the correlation
coefficient of two random variables X and Y as the covariance of the standardized versions of X and Y . Define the standardized versions
of X and Y as
U = (X − EX)/σX,   V = (Y − EY)/σY.   (5.22)
Then,
ρXY = Cov(U, V) = Cov((X − EX)/σX, (Y − EY)/σY)
= Cov(X/σX, Y/σY)   (by Item 5 of Lemma 5.3)
= Cov(X, Y)/(σX σY).   (by Item 4 of Lemma 5.3)
ρXY = ρ(X, Y) = Cov(X, Y)/√(Var(X) Var(Y)) = Cov(X, Y)/(σX σY).
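Both forms of the definition can be computed on the same data and compared: Cov(U, V) of the standardized versions from Equation 5.22, and Cov(X, Y)/(σX σY) directly. The data below are arbitrary correlated samples used only for illustration.

```python
import math
import random

# rho computed two ways on the same (arbitrary) data.
random.seed(4)
X = [random.gauss(0, 1) for _ in range(1000)]
Y = [x + 0.5 * random.gauss(0, 1) for x in X]   # Y depends on X

def mean(values):
    return sum(values) / len(values)

def cov(U, V):
    return mean([u * v for u, v in zip(U, V)]) - mean(U) * mean(V)

sx, sy = math.sqrt(cov(X, X)), math.sqrt(cov(Y, Y))
U = [(x - mean(X)) / sx for x in X]   # standardized X
V = [(y - mean(Y)) / sy for y in Y]   # standardized Y
rho_standardized = cov(U, V)          # Cov of standardized versions
rho_direct = cov(X, Y) / (sx * sy)    # Cov(X, Y)/(sigma_X sigma_Y)
```

The two values agree up to floating-point rounding, and both land in [−1, 1].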
A nice property of the correlation coefficient is that it is always between −1 and 1. This is an immediate consequence of the Cauchy-Schwarz inequality, which is discussed in Section 6.2.4. One way to prove that −1 ≤ ρ ≤ 1 is to use the following inequality:
αβ ≤ (α² + β²)/2,   for all α, β ∈ ℝ.
This is because (α − β)² ≥ 0. Equality holds only if α = β. From this, we can conclude that for any two random variables U and V,
E[UV] ≤ (E[U²] + E[V²])/2,
with equality only if U = V with probability one. Now, let U and V be the standardized versions of X and Y as defined in Equation 5.22. Then, by definition, ρXY = Cov(U, V) = E[UV]. But since E[U²] = E[V²] = 1, we conclude
ρXY = E[UV] ≤ (E[U²] + E[V²])/2 = 1,
where equality holds only if U = V with probability one, that is,
(Y − EY)/σY = (X − EX)/σX,
which implies
Y = (σY/σX) X + (EY − (σY/σX) EX)
= aX + b,   where a and b are constants.
Similarly, applying the same argument to −X and Y shows that
ρ(−X, Y) ≤ 1.
But ρ(−X, Y) = −ρ(X, Y), thus we conclude ρ(X, Y) ≥ −1. We can summarize some properties of the correlation coefficient as follows.
Properties of the correlation coefficient:
1. −1 ≤ ρ(X, Y) ≤ 1;
2. if ρ(X, Y) = 1, then Y = aX + b, where a > 0;
3. if ρ(X, Y) = −1, then Y = aX + b, where a < 0;
4. ρ(aX + b, cY + d) = ρ(X, Y) for a, c > 0.
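These properties can be illustrated numerically on arbitrary sample data: a perfect positive or negative linear relation gives ρ = ±1, and a positive affine rescaling of each variable leaves ρ unchanged.

```python
import math
import random

# Numerical illustration of the listed properties (a sketch, not a proof).
random.seed(5)
X = [random.random() for _ in range(200)]
Y = [random.random() for _ in range(200)]

def mean(values):
    return sum(values) / len(values)

def cov(U, V):
    return mean([u * v for u, v in zip(U, V)]) - mean(U) * mean(V)

def rho(U, V):
    return cov(U, V) / math.sqrt(cov(U, U) * cov(V, V))

rho_pos = rho(X, [2 * x + 3 for x in X])     # Property 2: rho = +1 (a = 2 > 0)
rho_neg = rho(X, [-2 * x + 3 for x in X])    # Property 3: rho = -1 (a = -2 < 0)
rho_base = rho(X, Y)
rho_scaled = rho([2 * x + 1 for x in X],     # Property 4: unchanged for
                 [3 * y - 4 for y in Y])     #   a = 2 > 0, c = 3 > 0
```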
Definition 5.2
X and Y are said to be uncorrelated if ρ(X, Y) = 0, or equivalently, Cov(X, Y) = 0.
Note that, as we discussed previously, two independent random variables are always uncorrelated, but the converse is not necessarily true: if X and Y are uncorrelated, they may or may not be independent. Also note that if X and Y are uncorrelated, then from Equation 5.21 we conclude that
Var(X + Y) = Var(X) + Var(Y).
In particular, if X and Y are independent, then they are uncorrelated, and so Var(X + Y) = Var(X) + Var(Y). This is a fact that we stated previously in Chapter 3 and can now easily prove using covariance.
Example 5.34
Let X and Y be as in Example 5.24 in Section 5.2.3, i.e., suppose that we choose a point (X, Y) uniformly at random in the unit disc
D = {(x, y) | x² + y² ≤ 1}.
Are X and Y uncorrelated?
Solution
We need to check whether Cov(X, Y) = 0. First note that, in Example 5.24 of Section 5.2.3, we found that X and Y are not independent, and in fact
X|Y ∼ Uniform(−√(1 − Y²), √(1 − Y²)).
In particular, E[X|Y] = 0, so EX = E[E[X|Y]] = 0. Also, we have
E[XY] = E[E[XY|Y]]
= E[Y E[X|Y]]
= E[Y · 0] = 0.
Thus,
Cov(X, Y) = E[XY] − (EX)(EY) = 0.
Therefore, X and Y are uncorrelated, even though they are not independent.
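A simulation makes this example concrete. One standard way (among others) to sample uniformly on the disc is rejection from the square [−1, 1]²; we can then check that Cov(X, Y) is near 0 while a nonlinear covariance such as Cov(X², Y²) is not, reflecting the dependence.

```python
import random

# Monte Carlo illustration of Example 5.34: uniform on the unit disc by
# rejection sampling from the square [-1, 1]^2.
random.seed(6)
n = 200_000
xs, ys = [], []
while len(xs) < n:
    x, y = random.uniform(-1, 1), random.uniform(-1, 1)
    if x * x + y * y <= 1:   # keep only points inside the disc
        xs.append(x)
        ys.append(y)

def mean(values):
    return sum(values) / len(values)

def cov(U, V):
    return mean([u * v for u, v in zip(U, V)]) - mean(U) * mean(V)

cov_xy = cov(xs, ys)                                    # close to 0
cov_sq = cov([x * x for x in xs], [y * y for y in ys])  # clearly negative
```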
https://www.probabilitycourse.com/chapter5/5_3_1_covariance_correlation.php