MEFall2023 5
Corr & Reg
Given n pairs of observations (x1, y1), (x2, y2), ..., (xn, yn) taken on two rvs X and Y, their linear
correlation is defined as
$$
r = \frac{\sum (X - \bar{X})(Y - \bar{Y})}{\sqrt{\sum (X - \bar{X})^2}\,\sqrt{\sum (Y - \bar{Y})^2}}
$$
This is called Pearson's coefficient of correlation.
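As a rough illustration of the formula above, the following Python sketch computes r directly from the sums of deviations. The function name `pearson_r` and the five data pairs are made up for this example and are not taken from the slides.

```python
import math

def pearson_r(xs, ys):
    """Pearson's r from the deviation form of the formula (illustrative helper)."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equally long samples with at least 2 pairs")
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    # Numerator: sum of products of deviations from the means
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    # Denominator pieces: sums of squared deviations
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    s_yy = sum((y - y_bar) ** 2 for y in ys)
    if s_xx == 0 or s_yy == 0:
        raise ValueError("r is undefined when X or Y is constant")
    return s_xy / (math.sqrt(s_xx) * math.sqrt(s_yy))

# Hypothetical data, chosen only to keep the arithmetic small
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
r = pearson_r(xs, ys)
print(round(r, 4))  # 0.7746, i.e. 6 / sqrt(10 * 6)
```

For these values the sums are 6 (numerator), 10 and 6 (squared deviations of X and Y), so r = 6 / sqrt(60), a fairly strong positive linear association.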
What we actually need is to estimate a and b from the given values of X and Y, which gives an
estimate of the above linear model: Ŷi = â + b̂Xi. The residual is then ei = Yi − Ŷi, the
difference between the observed and estimated responses.
â = Ȳ − b̂X̄
The formulas for the coefficients â and b̂ are derived by the method of least squares. In this
method, we choose the values of a and b that minimize the sum of the squared differences (the
squared residuals) between the observed responses Yi and the estimated responses Ŷi, that is,
$\sum (Y_i - \hat{Y}_i)^2 = \sum e_i^2$. Loosely speaking, the hope is that estimates chosen this
way differ only slightly from the true values of a and b.
Note: You can also denote the coefficients a and b by α and β, and their estimates by α̂ and β̂.
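To make the least squares criterion concrete, here is a small Python sketch that evaluates the sum of squared residuals for a few candidate lines. The data and the candidate (a, b) pairs are hypothetical; the first candidate is the least squares fit for this particular data, so it should give the smallest sum among those tried.

```python
def sse(a, b, xs, ys):
    """Sum of squared residuals for the line Y-hat = a + b*X (illustrative helper)."""
    return sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))

# Hypothetical data, not from the slides
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

# Candidate (a, b) pairs; (2.2, 0.6) is the least squares solution for this data
candidates = [(2.2, 0.6), (2.0, 0.6), (2.2, 0.7), (1.0, 1.0)]
for a, b in candidates:
    print(f"a={a}, b={b}, SSE={sse(a, b, xs, ys):.2f}")

# Pick the candidate with the smallest sum of squared residuals
best = min(candidates, key=lambda ab: sse(ab[0], ab[1], xs, ys))
print("smallest SSE among candidates:", best)
```

Comparing a handful of candidates is only a demonstration of the criterion, not a way to find the minimum in general; the closed-form estimates do that exactly.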
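Solving equations (1) and (2) simultaneously for â and b̂ gives the LSE of a and b, given below.
$$
\hat{b} = \frac{\sum XY - \dfrac{\sum X \sum Y}{n}}{\sum X^2 - \dfrac{\left(\sum X\right)^2}{n}}
$$
and
$$
\hat{a} = \frac{\dfrac{\sum X^2 \sum Y - \sum X \sum XY}{n}}{\sum X^2 - \dfrac{\left(\sum X\right)^2}{n}}
$$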
But in practice, to estimate a (i.e. to calculate â), we use the following simpler formula:
â = Ȳ − b̂X̄
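The computational formulas above translate almost line by line into code. The sketch below is one possible Python version, with a made-up function name (`fit_line`) and made-up data; it also computes â both ways to check that the longer sum-based formula and the shortcut â = Ȳ − b̂X̄ agree.

```python
def fit_line(xs, ys):
    """Least squares estimates from raw sums (illustrative helper).

    Returns (a_hat, b_hat, a_hat_long), where a_hat uses the shortcut
    a_hat = Y-bar - b_hat * X-bar and a_hat_long uses the sum-based formula.
    """
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equally long samples with at least 2 pairs")
    sum_x = sum(xs)
    sum_y = sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)

    denom = sum_x2 - sum_x ** 2 / n  # shared denominator of both formulas
    if denom == 0:
        raise ValueError("all X values are equal; the slope is undefined")

    b_hat = (sum_xy - sum_x * sum_y / n) / denom
    a_hat_long = ((sum_x2 * sum_y - sum_x * sum_xy) / n) / denom
    a_hat = sum_y / n - b_hat * (sum_x / n)
    return a_hat, b_hat, a_hat_long

# Hypothetical data, not from the slides
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
a_hat, b_hat, a_hat_long = fit_line(xs, ys)
print(round(b_hat, 4), round(a_hat, 4), round(a_hat_long, 4))  # 0.6 2.2 2.2
```

Here the sums are ΣX = 15, ΣY = 20, ΣXY = 66 and ΣX² = 55, so b̂ = (66 − 60) / (55 − 45) = 0.6 and both formulas give â = 2.2.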
Note: Calculation (or computation) and estimation are two different things. For example,
multiplying 7 by 8 is a calculation: it gives an exact result and involves no inference about an
unknown quantity. Estimation, by contrast, uses a rule (an estimator) to approximate an unknown
quantity (a parameter) from sample data. For example, we estimate the population mean (µ) by
the sample mean (X̄).
This straight line lets us interpret the relationship between X and Y, and also helps in
predicting the values of Y that correspond to other (missing, past, or future) values of X.
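As a sketch of how the fitted line is used for prediction, the Python snippet below estimates the coefficients in the equivalent deviation form (b̂ as the sum of products of deviations divided by the sum of squared deviations of X), then evaluates Ŷ at a new X. The helper names `fit_line` and `predict`, the data, and the new value X = 6 are all assumptions made for illustration.

```python
def fit_line(xs, ys):
    """Least squares intercept and slope via deviations from the means."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    if s_xx == 0:
        raise ValueError("all X values are equal; the slope is undefined")
    b_hat = s_xy / s_xx
    a_hat = y_bar - b_hat * x_bar
    return a_hat, b_hat

def predict(a_hat, b_hat, x):
    """Estimated response Y-hat = a_hat + b_hat * x."""
    return a_hat + b_hat * x

# Hypothetical data, not from the slides
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
a_hat, b_hat = fit_line(xs, ys)

# Residuals e_i = Y_i - Y-hat_i at the observed X values
residuals = [y - predict(a_hat, b_hat, x) for x, y in zip(xs, ys)]

# Predicted response at an X value that was not observed
y_new = predict(a_hat, b_hat, 6)
print(round(y_new, 2))  # 5.8
```

With â = 2.2 and b̂ = 0.6 for this data, the prediction at X = 6 is 2.2 + 0.6 × 6 = 5.8. Predictions far outside the observed range of X rest on the assumption that the same linear relationship continues to hold there.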
Do Exercises 10.14(b), 10.14(c), 10.15(b), 10.6, 10.7, 10.8, and 10.9 (parts a and b).
Class Quiz:
What is the difference between correlation and regression? (Google it.) Exercises 10.17 and 10.7.