L11 LinearRegression
March 3, 2022
Pfizer COVID-19 Vaccine Trial
https://www.nejm.org/doi/full/10.1056/NEJMoa2034577
Risk ratio:
$$\mathrm{RR} = \frac{\text{risk of COVID-19 with vaccine}}{\text{risk of COVID-19 with placebo}} = \frac{9/18508}{169/18435} \approx 0.053$$
            Vaccine    Placebo
Positive          9        169
Negative     18,499     18,266
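The risk ratio can be recomputed directly from the counts in the table. This is a minimal sketch; the variable names are illustrative and do not come from the lecture.

```python
# Counts from the trial table above.
vaccine_positive, vaccine_negative = 9, 18_499
placebo_positive, placebo_negative = 169, 18_266

# Risk in each arm = positives / everyone in that arm.
risk_vaccine = vaccine_positive / (vaccine_positive + vaccine_negative)
risk_placebo = placebo_positive / (placebo_positive + placebo_negative)

# Risk ratio: risk with vaccine relative to risk with placebo.
risk_ratio = risk_vaccine / risk_placebo
print(f"RR = {risk_ratio:.3f}")  # prints "RR = 0.053"
```

The arm totals, 18,508 and 18,435, are the denominators used in the risk ratio formula.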
Geometry Statistics
Is there a relationship between the heights of mothers
and their daughters?
If you know a mother’s height, can you predict her
daughter’s height with any accuracy?
Linear regression is a tool for answering these types of
questions.
It models the relationship as a straight line.
Regression Setup
Example:
$x_i$ is the height of the $i$th mother
$y_i$ is the height of the $i$th mother’s daughter
Linear Regression
Model the data as a line:
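$$y_i = \alpha + \beta x_i + \epsilon_i$$
$\alpha$: intercept
$\beta$: slope
$\epsilon_i$: error
[Figure: a data point $(x_i, y_i)$ and the line, plotted on axes $x$ and $y$, with the error $\epsilon_i$ marked.]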
Geometry: Least Squares
We want to fit a line as close to the data as possible,
which means we want to minimize the errors, $\epsilon_i$.
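To make "as close as possible" concrete, least squares measures closeness by the sum of squared errors, $\sum_i \epsilon_i^2$. The sketch below evaluates that quantity for two candidate lines. The height values are made up for illustration and are not data from the lecture; the function name is likewise hypothetical.

```python
# Hypothetical mother/daughter heights in inches (illustrative only).
x = [60.0, 62.0, 64.0, 66.0, 68.0]
y = [61.5, 62.0, 64.5, 65.0, 67.5]

def sum_squared_errors(alpha, beta, xs, ys):
    """Sum of squared errors e_i = y_i - (alpha + beta * x_i)."""
    return sum((yi - (alpha + beta * xi)) ** 2 for xi, yi in zip(xs, ys))

# Two candidate lines: the smaller total is the closer fit.
sse_flat = sum_squared_errors(64.1, 0.0, x, y)     # horizontal line at the mean of y
sse_sloped = sum_squared_errors(16.1, 0.75, x, y)  # a sloped line
print(sse_flat, sse_sloped)
```

The sloped line has a much smaller sum of squared errors than the horizontal one, so by this criterion it is the better of the two candidates.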
Geometry: Least Squares
$\tilde{y}_i = y_i - \bar{y}$
$\tilde{x}_i = x_i - \bar{x}$
Note: $\sum_{i=1}^{n} \tilde{y}_i = 0$ and $\sum_{i=1}^{n} \tilde{x}_i = 0$
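The note follows from the definition of the mean. A one-line check, writing $\bar{y} = \frac{1}{n}\sum_{i=1}^{n} y_i$:

```latex
\sum_{i=1}^{n} \tilde{y}_i
  = \sum_{i=1}^{n} (y_i - \bar{y})
  = \sum_{i=1}^{n} y_i - n\bar{y}
  = n\bar{y} - n\bar{y}
  = 0
```

The same argument applies to the centered $x$ values.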
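$$\tilde{y}_i = \beta \tilde{x}_i + \epsilon_i$$
$$\begin{pmatrix} \tilde{y}_1 \\ \tilde{y}_2 \\ \vdots \\ \tilde{y}_n \end{pmatrix} = \beta \begin{pmatrix} \tilde{x}_1 \\ \tilde{x}_2 \\ \vdots \\ \tilde{x}_n \end{pmatrix} + \begin{pmatrix} \epsilon_1 \\ \epsilon_2 \\ \vdots \\ \epsilon_n \end{pmatrix}$$
[Figure: the vectors $\tilde{y}$, $\tilde{x}$, $\beta\tilde{x}$, and $\epsilon$.]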
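One way to obtain the least-squares slope from this vector form is sketched below. It assumes the usual argument that the error vector is shortest when it is orthogonal to the centered $x$ vector (which must be nonzero); the lecture may derive the slope differently.

```latex
\hat{\beta}
  = \arg\min_{\beta} \lVert \tilde{y} - \beta \tilde{x} \rVert^2
\quad\Longrightarrow\quad
\tilde{x} \cdot (\tilde{y} - \hat{\beta}\tilde{x}) = 0
\quad\Longrightarrow\quad
\hat{\beta}
  = \frac{\tilde{x} \cdot \tilde{y}}{\tilde{x} \cdot \tilde{x}}
  = \frac{\sum_{i=1}^{n} \tilde{x}_i \tilde{y}_i}{\sum_{i=1}^{n} \tilde{x}_i^2}
```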
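So far, we have:
$$\tilde{y}_i = \hat{\beta}\tilde{x}_i + \epsilon_i$$
Expanding out $\tilde{x}_i$ and $\tilde{y}_i$ gives
$$y_i - \bar{y} = \hat{\beta}(x_i - \bar{x}) + \epsilon_i$$
Rearranging gives
$$y_i = (\bar{y} - \hat{\beta}\bar{x}) + \hat{\beta}x_i + \epsilon_i$$
Comparing with $y_i = \alpha + \beta x_i + \epsilon_i$, the intercept is $\hat{\alpha} = \bar{y} - \hat{\beta}\bar{x}$.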
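The centering steps can be turned into a small fitting routine. This is a sketch under two assumptions: the slope is the standard least-squares value $\hat{\beta} = \sum_i \tilde{x}_i\tilde{y}_i / \sum_i \tilde{x}_i^2$, and the intercept is $\hat{\alpha} = \bar{y} - \hat{\beta}\bar{x}$, which is what rearranging the centered equation yields. The function name `fit_line` and the data are hypothetical.

```python
def fit_line(xs, ys):
    """Least-squares intercept and slope computed from centered data."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    # Center both variables around their means.
    x_tilde = [xi - x_bar for xi in xs]
    y_tilde = [yi - y_bar for yi in ys]
    # Slope from the centered data (assumed standard least-squares formula).
    beta_hat = sum(a * b for a, b in zip(x_tilde, y_tilde)) / sum(a * a for a in x_tilde)
    # Intercept from rearranging the centered equation.
    alpha_hat = y_bar - beta_hat * x_bar
    return alpha_hat, beta_hat

# Hypothetical heights in inches (illustrative only, not lecture data).
x = [60.0, 62.0, 64.0, 66.0, 68.0]
y = [61.5, 62.0, 64.5, 65.0, 67.5]
alpha_hat, beta_hat = fit_line(x, y)
print(alpha_hat, beta_hat)  # approximately 16.1 and 0.75
```

Because the intercept is defined through the means, the fitted line always passes through the point $(\bar{x}, \bar{y})$.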
$$\epsilon_i \sim N(0, \sigma^2)$$
The likelihood is
$$L(\alpha, \beta) = \prod_{i=1}^{n} \frac{1}{\sqrt{2\pi}\,\sigma} \exp\left(-\frac{\epsilon_i^2}{2\sigma^2}\right)$$
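A product of many small densities underflows quickly, so the likelihood is usually evaluated on the log scale. The sketch below computes $\log L(\alpha, \beta)$ with $\epsilon_i = y_i - \alpha - \beta x_i$ taken from the model. The data, the fixed value of $\sigma$, and the function name are assumptions made for illustration.

```python
import math

def log_likelihood(alpha, beta, sigma, xs, ys):
    """Log of L(alpha, beta) when e_i ~ N(0, sigma^2) and e_i = y_i - alpha - beta * x_i."""
    total = 0.0
    for xi, yi in zip(xs, ys):
        e = yi - alpha - beta * xi
        # log of (1 / (sqrt(2*pi) * sigma)) * exp(-e^2 / (2 * sigma^2))
        total += -math.log(math.sqrt(2 * math.pi) * sigma) - e ** 2 / (2 * sigma ** 2)
    return total

# Hypothetical heights in inches (illustrative only, not lecture data).
x = [60.0, 62.0, 64.0, 66.0, 68.0]
y = [61.5, 62.0, 64.5, 65.0, 67.5]
sigma = 1.0  # treated as known here, purely for illustration

ll_sloped = log_likelihood(16.1, 0.75, sigma, x, y)  # a sloped line
ll_flat = log_likelihood(64.1, 0.0, sigma, x, y)     # horizontal line at the mean of y
print(ll_sloped, ll_flat)
```

For a fixed $\sigma$, the parameters $\alpha$ and $\beta$ enter the log-likelihood only through $\sum_i \epsilon_i^2$, so the line with the smaller sum of squared errors has the larger likelihood.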
Probability: Maximum Likelihood