
Lecture 18: Covariance and Correlation

STOR 435, Spring 2025

3/31/2025

Definition and properties

Pitman: Section 6.4; also see Lecture 8 for the definition.


Covariance:
Cov(X, Y) = E[(X − µ_X)(Y − µ_Y)] = E(XY) − E(X) E(Y).
Correlation: Corr(X, Y) = Cov(X, Y) / (SD(X) SD(Y)) = ρ.
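A quick numerical illustration of both formulas (a minimal sketch in Python with NumPy; the data-generating model here is made up for illustration):

```python
import numpy as np

rng = np.random.default_rng(435)
x = rng.normal(size=100_000)
y = 0.5 * x + rng.normal(size=100_000)  # Y depends on X by construction

# Cov(X, Y) = E[(X - mu_X)(Y - mu_Y)] = E(XY) - E(X) E(Y)
cov_centered = np.mean((x - x.mean()) * (y - y.mean()))
cov_shortcut = np.mean(x * y) - x.mean() * y.mean()

# Corr(X, Y) = Cov(X, Y) / (SD(X) SD(Y))
rho = cov_centered / (x.std() * y.std())

print(cov_centered, cov_shortcut)  # both ~ 0.5 = Cov(X, 0.5 X + Z)
print(rho)                         # ~ 0.5 / sqrt(1.25) ~ 0.447
```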
Properties:
(a) Cov(X, X) = Var(X) and Cov(X, Y) = Cov(Y, X).
(b) Cov(I_A, I_B) = P(A ∩ B) − P(A) P(B).
(c) Var(X + Y) = Var(X) + Var(Y) + 2 Cov(X, Y).
(d) X and Y are independent =⇒ Corr(X, Y) = 0; but not vice versa.
(e) (bilinearity) For constants {a_i}, {b_j} and RVs {X_i}, {Y_j} (checked numerically below),

Cov(∑_{i=1}^{m} a_i X_i, ∑_{j=1}^{n} b_j Y_j) = ∑_{i=1}^{m} ∑_{j=1}^{n} a_i b_j Cov(X_i, Y_j).
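Properties (c) and (e) can be verified numerically (a simulation sketch with NumPy; the distributions of X and Y are arbitrary choices for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.exponential(size=200_000)
Y = X + rng.normal(size=200_000)  # dependent on X by construction

def cov(a, b):
    # sample version of Cov(A, B) = E(AB) - E(A) E(B)
    return np.mean(a * b) - a.mean() * b.mean()

# (c) Var(X + Y) = Var(X) + Var(Y) + 2 Cov(X, Y)
print(np.var(X + Y), np.var(X) + np.var(Y) + 2 * cov(X, Y))  # equal

# (e) bilinearity, with m = n = 2 and (X1, X2) = (Y1, Y2) = (X, Y)
a1, a2, b1, b2 = 2.0, -1.0, 0.5, 3.0
lhs = cov(a1 * X + a2 * Y, b1 * X + b2 * Y)
rhs = (a1 * b1 * cov(X, X) + a1 * b2 * cov(X, Y)
       + a2 * b1 * cov(Y, X) + a2 * b2 * cov(Y, Y))
print(lhs, rhs)  # equal: bilinearity is an identity, not an approximation
```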

Multinomial distribution
Let X_1, ..., X_n be iid with P(X_1 = m) = p_m, m = 1, ..., M. Denote by N_m the frequency of category m in the n trials: N_m = I_{X_1 = m} + · · · + I_{X_n = m}. Then

P(N_1 = n_1, ..., N_M = n_M) = [n! / (n_1! · · · n_M!)] p_1^{n_1} · · · p_M^{n_M}

(why?) where n = n_1 + · · · + n_M.
Facts:
N_1 + · · · + N_M = n and p_1 + · · · + p_M = 1. Hence the free parameters are n and p_m, m = 1, ..., M − 1.
Bin(n, p) is a special case of the multinomial distribution with M = 2, p_1 = p and p_2 = q = 1 − p.
N_m ∼ Bin(n, p_m); N_k + N_m ∼ Bin(n, p_k + p_m); and (N_k, N_m, n − N_k − N_m) follows a multinomial distribution with parameters {n; p_k, p_m, 1 − p_k − p_m}.
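A small Python sketch of the pmf and of the marginal fact N_m ∼ Bin(n, p_m) (pure standard library; the values n = 10 and p = (0.2, 0.3, 0.5) are arbitrary):

```python
from math import comb, factorial, prod

def multinomial_pmf(counts, probs):
    # P(N_1 = n_1, ..., N_M = n_M) = n!/(n_1! ... n_M!) * p_1^n_1 ... p_M^n_M
    coef = factorial(sum(counts))
    for c in counts:
        coef //= factorial(c)
    return coef * prod(p**c for p, c in zip(probs, counts))

n, p = 10, (0.2, 0.3, 0.5)  # M = 3 categories

# the pmf sums to 1 over all (n1, n2, n3) with n1 + n2 + n3 = n
total = sum(multinomial_pmf((a, b, n - a - b), p)
            for a in range(n + 1) for b in range(n + 1 - a))
print(total)  # 1.0 up to rounding

# fact: N_1 ~ Bin(n, p_1); sum the joint pmf over (n2, n3) and compare
marginal = sum(multinomial_pmf((4, b, n - 4 - b), p) for b in range(n - 4 + 1))
print(marginal, comb(n, 4) * 0.2**4 * 0.8**6)  # equal up to rounding
```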
Negative correlation
Fact: Show that Cov(N_k, N_m) = −n p_k p_m for k ≠ m in the multinomial model.
Proof: The representation N_m = I_{X_1 = m} + · · · + I_{X_n = m} and bilinearity imply

Cov(N_k, N_m)
= ∑_{i=1}^{n} ∑_{j=1}^{n} { E[I_{X_i = k, X_j = m}] − E[I_{X_i = k}] E[I_{X_j = m}] }
= ∑_{i=1}^{n} ∑_{j=1}^{n} { P(X_i = k, X_j = m) − P(X_i = k) P(X_j = m) }
= ∑_{i ≠ j} {· · ·} + ∑_{i = j} {· · ·}
= −n p_k p_m.
Why? The first double sum ∑_{i ≠ j} {· · ·} = 0, since X_i and X_j are independent for i ≠ j. On the diagonal i = j, the event {X_i = k, X_i = m} is empty because k ≠ m, so each of the n terms equals 0 − p_k p_m, giving −n p_k p_m.
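A quick simulation check of the fact for a fair die (a minimal sketch with NumPy; the seed, n = 20, and the replication count are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(1)
n, p = 20, np.full(6, 1/6)  # 20 rolls of a fair die, M = 6 categories
counts = rng.multinomial(n, p, size=200_000)  # each row is (N_1, ..., N_6)

N1, N2 = counts[:, 0], counts[:, 1]
sample_cov = np.mean(N1 * N2) - N1.mean() * N2.mean()
print(sample_cov, -n * p[0] * p[1])  # both ~ -20/36 ~ -0.556
```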
Examples of multinomial distribution

Roll a fair die 20 times. Let

X = the number of times that labels "≤ 3" show up,
Y = the number of times that labels "≥ 4" show up,
U = the number of times that labels "≥ 3" show up,
V = the number of times that label "3" shows up.

Examples continued

P(X = 12, U = 10) = P(X − V = 10, V = 2, Y = 8) = [20! / (10! 2! 8!)] (2/6)^{10} (1/6)^{2} (3/6)^{8} = · · ·

(Since X + U = 20 + V, the event {X = 12, U = 10} forces V = 2, hence X − V = 10 and Y = 20 − X = 8.)
Cov(X, U) = Cov(X, V + Y) = Cov(X, V) + Cov(X, Y), where

Cov(X, V) = Cov(X − V, V) + Cov(V, V) = −20 · (1/3) · (1/6) + 20 · (1/6) · (5/6),

and

Cov(X, Y) = −20 · (1/2) · (1/2).
Note: handling the overlapping category (label "3" counts toward both X and U) is the key.
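The slide's numbers can be evaluated directly (a Python sketch using only the standard library):

```python
from math import factorial

# P(X = 12, U = 10) = P(X - V = 10, V = 2, Y = 8) for 20 rolls of a fair die
coef = factorial(20) // (factorial(10) * factorial(2) * factorial(8))
print(coef * (2/6)**10 * (1/6)**2 * (3/6)**8)

# Cov(X, U) = Cov(X, V) + Cov(X, Y)
cov_XV = -20 * (1/3) * (1/6) + 20 * (1/6) * (5/6)  # Cov(X - V, V) + Var(V) = 5/3
cov_XY = -20 * (1/2) * (1/2)                       # = -5
print(cov_XV + cov_XY)                             # Cov(X, U) = -10/3
```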

