
SLR

Smoking consumption and lung cancer (figure omitted).

SLR I
The simple linear regression model is:

yi = β0 + β1 xi + ei ,   i = 1, . . . , n     (3)

where
the errors ei are mutually independent
E (ei ) = 0, var(ei ) = σ 2



SLR II

(Figure omitted.)

SLR III

1 Linearity
2 Independence
3 Normality
4 Equal variance

LINE ⟹ LIE
Q1: Which assumptions are necessary?
Q2: How to verify these assumptions?
Q3: What are the remedial measures?

–HL

The observed data are (yi , xi ), i = 1, 2, . . . , n. The unknown
parameters, to be estimated, are β0 , β1 , σ 2 .
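To make the setup concrete, here is a minimal R sketch that simulates one data set from model (3); the parameter values and the x-design are illustrative assumptions, not values taken from the slides.

# Simulate one data set from the SLR model (3).
# beta0, beta1, sigma and the x-design are arbitrary illustrative choices.
set.seed(1)
n <- 44
beta0 <- 6.5; beta1 <- 0.5; sigma <- 3
x <- runif(n, 10, 45)                 # covariate values
e <- rnorm(n, mean = 0, sd = sigma)   # iid errors with E(e) = 0, var(e) = sigma^2
y <- beta0 + beta1 * x + e            # responses
plot(x, y)                            # scatterplot of the simulated data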
SLR IV
The mean parameters β0 , β1 are the parameters of main interest.
Based on the data, we estimate β0 , β1 by β̂0 , β̂1 . Our main tool is the
Method of Least Squares.
Definition: The least squares estimators β̂0 , β̂1 of β0 , β1 are the values
that minimize the residual sum of squares

$$\sum_{i=1}^{n}\bigl[y_i-(\hat\beta_0+\hat\beta_1 x_i)\bigr]^2$$

Put $\bar x=\frac1n\sum_{i=1}^{n}x_i$, $\bar y=\frac1n\sum_{i=1}^{n}y_i$. We will show that

$$\hat\beta_1=\frac{\sum_{i=1}^{n}(x_i-\bar x)(y_i-\bar y)}{\sum_{i=1}^{n}(x_i-\bar x)^2},\qquad
\hat\beta_0=\bar y-\hat\beta_1\bar x$$
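As a quick check, the closed-form estimates can be computed directly in R and compared with lm(); the data here are the simulated x and y from the sketch above (illustrative, not the smoke data).

# Closed-form least squares estimates versus lm().
set.seed(1)
x <- runif(44, 10, 45)
y <- 6.5 + 0.5 * x + rnorm(44, sd = 3)                   # simulated data, as above
xbar <- mean(x); ybar <- mean(y)
b1 <- sum((x - xbar) * (y - ybar)) / sum((x - xbar)^2)   # slope: Sxy / Sxx
b0 <- ybar - b1 * xbar                                   # intercept
c(b0, b1)
coef(lm(y ~ x))   # should match (b0, b1)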

SLR V

The regression line is y = β̂0 + β̂1 x .


The fitted values are ŷi = β̂0 + β̂1 xi , i = 1, 2, . . . , n
The residuals are êi = yi − ŷi , i = 1, 2, . . . , n
The residual sum of squares (RSS) is
$\mathrm{RSS}=\sum_{i=1}^{n}(y_i-\hat y_i)^2=\sum_{i=1}^{n}\hat e_i^2$

> plot(smoke$CIG, smoke$LUNG)
> fitlung <- lm(smoke$LUNG ~ smoke$CIG)
> summary(fitlung, cor = F)



SLR VI

Call:
lm(formula = smoke$LUNG ~ smoke$CIG)
Residuals:
   Min     1Q Median     3Q    Max
-6.943 -1.656  0.382  1.614  7.561

Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept)   6.4717     2.1407    3.02   0.0043
smoke$CIG     0.5291     0.0839    6.31  1.4e-07

Residual standard error: 3.07 on 42 degrees of freedom
Multiple R-squared: 0.486, Adjusted R-squared: 0.474
F-statistic: 39.8 on 1 and 42 DF, p-value: 1.44e-07

> abline(fitlung, lty = 2)
> points(smoke$CIG, fitlung$fitted, col = 3, pch = 14)
> names(fitlung)

[1] "coefficients"  "residuals"     "effects"       "rank"
[5] "fitted.values" "assign"        "qr"            "df.residual"
[9] "xlevels"       "call"          "terms"         "model"

> round(fitlung$fitted[1:8], 3)


SLR VII

1 2 3 4 5 6 7 8
16.1 20.1 16.1 21.6 22.9 24.2 27.9 21.4

> round(fitlung$resid[1:8], 3)

1 2 3 4 5 6 7 8
0.949 -0.332 -0.142 0.467 -0.096 0.301 -0.608 2.141
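To connect this output back to the definitions, here is a small sketch recomputing the residuals and the RSS from the fitlung object above (deviance() returns the RSS of a fitted linear model).

# Residuals and RSS recomputed from the definitions.
yhat <- fitlung$fitted.values          # yhat_i = b0 + b1 * x_i
ehat <- smoke$LUNG - yhat              # ehat_i = y_i - yhat_i
all.equal(unname(ehat), unname(fitlung$residuals))
sum(ehat^2)                            # RSS computed by hand
deviance(fitlung)                      # RSS as stored by R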



SLR VIII

(Figure omitted: the scatterplot with the fitted regression line produced by the code above.)

SLR IX

Theorem 11 (LS Estimation for SLR)


The least squares estimators for the SLR model (3) are

$$\hat\beta_1=\frac{\sum_{i=1}^{n}(x_i-\bar x)(y_i-\bar y)}{\sum_{i=1}^{n}(x_i-\bar x)^2}=\frac{S_{xy}}{S_{xx}},\qquad
\hat\beta_0=\bar y-\hat\beta_1\bar x$$

Proof:
We want to find (β̂0 , β̂1 ), the values that minimize the function
$\mathrm{RSS}(b_0,b_1)=\sum_{i=1}^{n}[y_i-(b_0+b_1x_i)]^2$.



SLR X

(β̂0 , β̂1 ) are the solution of the system of equations

$$\frac{\partial\,\mathrm{RSS}(b_0,b_1)}{\partial b_0}=0,\qquad
\frac{\partial\,\mathrm{RSS}(b_0,b_1)}{\partial b_1}=0$$

$$\begin{aligned}
\frac{\partial\,\mathrm{RSS}(b_0,b_1)}{\partial b_0}
&=\sum_{i=1}^{n}-2\,[y_i-(b_0+b_1x_i)]\\
&=-2\Bigl[\sum y_i-\bigl(nb_0+b_1\sum x_i\bigr)\Bigr]\\
&=-2n\,[\bar y-(b_0+b_1\bar x)]
\end{aligned}$$


SLR XI

$$\begin{aligned}
\frac{\partial\,\mathrm{RSS}(b_0,b_1)}{\partial b_1}
&=\sum_{i=1}^{n}-2x_i\,[y_i-(b_0+b_1x_i)]\\
&=-2\Bigl[\sum x_iy_i-\bigl(b_0\sum x_i+b_1\sum x_i^2\bigr)\Bigr]
\end{aligned}$$

So the solution (β̂0 , β̂1 ) has to satisfy the normal equations:

$$\bar y=\hat\beta_0+\hat\beta_1\bar x,\qquad
\sum x_iy_i=\hat\beta_0\sum x_i+\hat\beta_1\sum x_i^2$$
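Since the normal equations are just a 2 × 2 linear system, they can be solved directly in R; a sketch with the same simulated data as before:

# Solve the normal equations A %*% beta = rhs for (b0, b1).
set.seed(1)
x <- runif(44, 10, 45)
y <- 6.5 + 0.5 * x + rnorm(44, sd = 3)
A   <- matrix(c(length(x), sum(x),
                sum(x),    sum(x^2)), nrow = 2, byrow = TRUE)
rhs <- c(sum(y), sum(x * y))
solve(A, rhs)      # (b0, b1); agrees with coef(lm(y ~ x))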



SLR XII
We replace β̂0 = ȳ − β̂1 x̄ in the second equation, to get:
$$\begin{aligned}
\sum x_iy_i&=\hat\beta_0\sum x_i+\hat\beta_1\sum x_i^2\\
&=(\bar y-\hat\beta_1\bar x)\sum x_i+\hat\beta_1\sum x_i^2\\
&=\bar y\sum x_i+\hat\beta_1\Bigl(\sum x_i^2-n\bar x^2\Bigr)\\
&=\bar y\sum x_i+\hat\beta_1\sum(x_i-\bar x)^2
\end{aligned}$$

It follows that

$$\hat\beta_1=\frac{\sum x_i(y_i-\bar y)}{\sum(x_i-\bar x)^2}$$

Subtract $\sum \bar x(y_i-\bar y)=0$ from the numerator to get

$$\hat\beta_1=\frac{\sum(x_i-\bar x)(y_i-\bar y)}{\sum(x_i-\bar x)^2}=\frac{S_{xy}}{S_{xx}}$$

SLR XIII

Example 12 (Compute β̂0 , β̂1 for the Fuel example in R)


[1] 25.3

[1] 25.3

[1] 216

Matrix notation
Put

$$\hat\beta=\begin{pmatrix}\hat\beta_0\\ \hat\beta_1\end{pmatrix}
=\begin{pmatrix}\bar y-\hat\beta_1\bar x\\[4pt]
\dfrac{\sum_{i=1}^{n}(x_i-\bar x)(y_i-\bar y)}{\sum_{i=1}^{n}(x_i-\bar x)^2}\end{pmatrix}$$

We can show that

$$\hat\beta=(X^\top X)^{-1}X^\top y$$
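The matrix formula is easy to verify numerically; a sketch with the simulated data again (note that lm() itself uses a more stable QR decomposition rather than inverting X⊤X):

# Closed-form (X'X)^{-1} X'y versus lm().
set.seed(1)
x <- runif(44, 10, 45)
y <- 6.5 + 0.5 * x + rnorm(44, sd = 3)
X <- cbind(1, x)                           # n x 2 design matrix
betahat <- solve(t(X) %*% X, t(X) %*% y)   # solves (X'X) b = X'y
drop(betahat)
coef(lm(y ~ x))                            # same values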



   
$$X=\begin{pmatrix}1&x_1\\ 1&x_2\\ \vdots&\vdots\\ 1&x_n\end{pmatrix},\qquad
Y=\begin{pmatrix}y_1\\ y_2\\ \vdots\\ y_n\end{pmatrix},\qquad
X^\top=\begin{pmatrix}1&1&\cdots&1\\ x_1&x_2&\cdots&x_n\end{pmatrix}$$

$$X^\top X=\begin{pmatrix}n&\sum_{i=1}^{n}x_i\\ \sum_{i=1}^{n}x_i&\sum_{i=1}^{n}x_i^2\end{pmatrix},\qquad
X^\top Y=\begin{pmatrix}\sum_{i=1}^{n}y_i\\ \sum_{i=1}^{n}x_iy_i\end{pmatrix}$$

$$\begin{pmatrix}a&b\\ c&d\end{pmatrix}^{-1}
=\frac{1}{ad-bc}\begin{pmatrix}d&-b\\ -c&a\end{pmatrix}$$

$$\begin{aligned}
(X^\top X)^{-1}
&=\frac{1}{n\sum_{i=1}^{n}x_i^2-\bigl(\sum_{i=1}^{n}x_i\bigr)^2}
\begin{pmatrix}\sum_{i=1}^{n}x_i^2&-\sum_{i=1}^{n}x_i\\ -\sum_{i=1}^{n}x_i&n\end{pmatrix}\\
&=\frac{1}{nS_{xx}}
\begin{pmatrix}\sum_{i=1}^{n}x_i^2&-\sum_{i=1}^{n}x_i\\ -\sum_{i=1}^{n}x_i&n\end{pmatrix}
=\begin{pmatrix}\frac1n+\frac{\bar x^2}{S_{xx}}&-\frac{\bar x}{S_{xx}}\\[4pt]
-\frac{\bar x}{S_{xx}}&\frac{1}{S_{xx}}\end{pmatrix}
\end{aligned}$$

$$(X^\top X)^{-1}(X^\top Y)=\frac{1}{nS_{xx}}
\begin{pmatrix}\sum x_i^2\sum y_i-\sum x_iy_i\sum x_i\\[4pt]
-\sum x_i\sum y_i+n\sum x_iy_i\end{pmatrix}$$


$$S_{xx}=\sum_{i=1}^{n}(x_i-\bar x)^2=\sum_{i=1}^{n}x_i^2-n\bar x^2
=\sum_{i=1}^{n}(x_i-\bar x)x_i\neq\sum_{i=1}^{n}x_i^2$$

$$\begin{aligned}
S_{xy}&=\sum_{i=1}^{n}(x_i-\bar x)(y_i-\bar y)=\sum_{i=1}^{n}x_iy_i-n\bar x\bar y\\
&=\sum_{i=1}^{n}(x_i-\bar x)y_i=\sum_{i=1}^{n}(y_i-\bar y)x_i\neq\sum_{i=1}^{n}x_iy_i
\end{aligned}$$
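A numeric sanity check of these identities, again on simulated data:

# The alternative expressions for Sxx and Sxy agree numerically.
set.seed(1)
x <- runif(44, 10, 45)
y <- 6.5 + 0.5 * x + rnorm(44, sd = 3)
n <- length(x); xbar <- mean(x); ybar <- mean(y)
c(sum((x - xbar)^2), sum(x^2) - n * xbar^2, sum((x - xbar) * x))   # Sxx, three ways
c(sum((x - xbar) * (y - ybar)), sum(x * y) - n * xbar * ybar,
  sum((x - xbar) * y), sum((y - ybar) * x))                        # Sxy, four ways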

–HL

$$\hat\beta_1=\frac{-\sum_{i=1}^{n}x_i\sum_{i=1}^{n}y_i+n\sum_{i=1}^{n}x_iy_i}{nS_{xx}}
=\frac{nS_{xy}}{nS_{xx}}=\frac{S_{xy}}{S_{xx}}$$

$$\begin{aligned}
&\sum_{i=1}^{n}x_i^2\sum_{i=1}^{n}y_i-\sum_{i=1}^{n}x_iy_i\sum_{i=1}^{n}x_i\\
&=\sum x_i^2\sum y_i-\frac1n\Bigl(\sum x_i\Bigr)^2\sum y_i
+\frac1n\Bigl(\sum x_i\Bigr)^2\sum y_i-\sum x_iy_i\sum x_i\\
&=\sum y_i\Bigl\{\sum x_i^2-\frac1n\Bigl(\sum x_i\Bigr)^2\Bigr\}
+\sum x_i\Bigl\{\frac1n\sum x_i\sum y_i-\sum x_iy_i\Bigr\}\\
&=\sum y_i\Bigl\{\sum x_i^2-n\bar x^2\Bigr\}-\sum x_i\,S_{xy}\\
&=n\bar y\,S_{xx}-n\bar x\,S_{xy}=nS_{xx}\bigl(\bar y-\hat\beta_1\bar x\bigr)=nS_{xx}\,\hat\beta_0 ,
\end{aligned}$$

so dividing by the factor $nS_{xx}$ gives $\hat\beta_0=\bar y-\hat\beta_1\bar x$.



SLR I

The Simple Linear Regression model is:

yi = β0 + β1 xi + ei ,   i = 1, . . . , n

where the errors e1 , . . . , en are independent (in fact, we may assume
that e1 , . . . , en are merely uncorrelated), E (ei ) = 0, and var(ei ) = σ 2 .
We showed that the least squares estimators are

$$\hat\beta_1=\frac{\sum_{i=1}^{n}(x_i-\bar x)(y_i-\bar y)}{\sum_{i=1}^{n}(x_i-\bar x)^2}=\frac{S_{xy}}{S_{xx}},\qquad
\hat\beta_0=\bar y-\hat\beta_1\bar x \tag{4}$$

How good are β̂0 , β̂1 as estimators of the unknown β0 , β1 ?

SLR II
Theorem 13 (Properties of the Least Squares Estimators)
In the simple linear regression model, the LS estimators (4) have the
following properties:
1 β̂0 , β̂1 are unbiased; i.e., E (β̂0 ) = β0 , E (β̂1 ) = β1
2 Their variances are

$$\operatorname{var}(\hat\beta_1)=\frac{\sigma^2}{\sum_{i=1}^{n}(x_i-\bar x)^2}=\frac{\sigma^2}{S_{xx}},\qquad
\operatorname{var}(\hat\beta_0)=\Bigl(\frac1n+\frac{\bar x^2}{S_{xx}}\Bigr)\sigma^2$$

3 cov(β̂0 , β̂1 ) = −(x̄ /Sxx ) σ 2
4 If the errors ei are normal, then β̂0 , β̂1 have normal distributions, i.e.

$$\hat\beta_1\sim N\Bigl(\beta_1,\ \frac{\sigma^2}{S_{xx}}\Bigr),\qquad
\hat\beta_0\sim N\Bigl(\beta_0,\ \sigma^2\Bigl(\frac1n+\frac{\bar x^2}{S_{xx}}\Bigr)\Bigr)$$
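A small Monte Carlo sketch makes Theorem 13 concrete; the fixed design and parameter values below are illustrative assumptions.

# Simulate many data sets from a fixed design; the empirical mean and
# variance of the LS estimates should match Theorem 13.
set.seed(2)
n <- 30; beta0 <- 1; beta1 <- 2; sigma <- 0.5
x <- seq(0, 5, length.out = n)
Sxx <- sum((x - mean(x))^2)
est <- replicate(5000, {
  y <- beta0 + beta1 * x + rnorm(n, sd = sigma)
  coef(lm(y ~ x))
})
rowMeans(est)                                        # approx (beta0, beta1): unbiasedness
c(var(est[2, ]), sigma^2 / Sxx)                      # empirical vs exact var(b1)
c(var(est[1, ]), (1/n + mean(x)^2 / Sxx) * sigma^2)  # empirical vs exact var(b0)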
SLR-Proof I

1 Show that β̂1 is unbiased. First note that E (ȳ) = β0 + β1 x̄ . Then

$$\begin{aligned}
E(\hat\beta_1)&=E\Bigl(\frac{\sum(x_i-\bar x)(y_i-\bar y)}{S_{xx}}\Bigr)
=\frac{1}{S_{xx}}\,E\Bigl(\sum(x_i-\bar x)(y_i-\bar y)\Bigr)\\
&=\frac{1}{S_{xx}}\sum(x_i-\bar x)\bigl(E(y_i)-E(\bar y)\bigr)\\
&=\frac{1}{S_{xx}}\sum(x_i-\bar x)\bigl(\beta_0+\beta_1 x_i-(\beta_0+\beta_1\bar x)\bigr)\\
&=\frac{1}{S_{xx}}\sum(x_i-\bar x)(x_i-\bar x)\,\beta_1
=\beta_1 .
\end{aligned}$$


SLR-Proof II

Similarly, E (β̂0 ) = E (ȳ) − x̄ E (β̂1 ) = β0 + β1 x̄ − β1 x̄ = β0 .
2 Show now that var(β̂1 ) = σ 2 /Sxx . First write β̂1 as a linear
combination of the yi :

$$\hat\beta_1=\frac{\sum(x_i-\bar x)(y_i-\bar y)}{S_{xx}}
=\frac{\sum(x_i-\bar x)y_i-\bar y\sum(x_i-\bar x)}{S_{xx}}
=\frac{\sum(x_i-\bar x)y_i}{S_{xx}},$$

since $\sum(x_i-\bar x)=0$.
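This linear-combination form is easy to check numerically (simulated x and y again):

# b1 written as sum(a_i * y_i) with weights a_i = (x_i - xbar) / Sxx.
set.seed(1)
x <- runif(44, 10, 45)
y <- 6.5 + 0.5 * x + rnorm(44, sd = 3)
a <- (x - mean(x)) / sum((x - mean(x))^2)
sum(a * y)           # the LS slope as a linear function of the y_i
coef(lm(y ~ x))[2]   # same value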



SLR-Proof III
$$\begin{aligned}
\operatorname{var}(\hat\beta_1)&=\operatorname{var}\Bigl(\frac{\sum(x_i-\bar x)y_i}{S_{xx}}\Bigr)
=\frac{1}{S_{xx}^2}\operatorname{var}\Bigl(\sum(x_i-\bar x)y_i\Bigr)\\
&=\frac{1}{S_{xx}^2}\sum(x_i-\bar x)^2\operatorname{var}(y_i)
=\frac{1}{S_{xx}^2}\sum(x_i-\bar x)^2\,\sigma^2\\
&=\frac{1}{S_{xx}^2}\,S_{xx}\,\sigma^2=\frac{\sigma^2}{S_{xx}} .
\end{aligned}$$

Show that var(β̂0 ) = (1/n + x̄ 2 /Sxx ) σ 2 . First, var(ȳ) = σ 2 /n.


SLR-Proof IV
cov(ȳ , yi ) = σ 2 /n
cov(ȳ , β̂1 ) = 0
var(β̂0 ) = (1/n + x̄ 2 /Sxx ) σ 2
4 Assume now that ei ∼ N (0, σ 2 ). Show that β̂0 , β̂1 have normal
distributions.
β̂1 and β̂0 are linear functions of y1 , . . . , yn .
Since y1 , . . . , yn are normally distributed, β̂0 and β̂1 are also
normally distributed, as linear functions of the yi ’s.

Analysis of Lung Cancer data, continued:

            (Intercept) smoke$CIG
(Intercept)       4.582  -0.17536
smoke$CIG        -0.175   0.00704

[1] 0.0851

[1] 53



SLR-Proof V

[1] -2.12
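The matrix printed above is the estimated covariance matrix of (β̂0 , β̂1 ). The R commands producing these outputs are not shown on the slide; a plausible sketch using the fitlung object is:

# Estimated covariance matrix sigma^2-hat * (X'X)^{-1} and derived quantities.
V <- vcov(fitlung)
V                   # compare with the matrix printed above
sqrt(diag(V))       # standard errors of (b0, b1)
cov2cor(V)[1, 2]    # estimated correlation between b0 and b1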

In vector notation, the theorem states the following:

$$E(\hat\beta)=\beta,\qquad\text{i.e.}\qquad
E\begin{pmatrix}\hat\beta_0\\ \hat\beta_1\end{pmatrix}
=\begin{pmatrix}\beta_0\\ \beta_1\end{pmatrix}$$

$$\operatorname{var}(\hat\beta)=\sigma^2
\begin{pmatrix}\frac1n+\frac{\bar x^2}{S_{xx}}&-\frac{\bar x}{S_{xx}}\\[4pt]
-\frac{\bar x}{S_{xx}}&\frac{1}{S_{xx}}\end{pmatrix}$$


SLR-Proof VI

Definition:
An estimator of a parameter (β0 or β1 ) is a function that can be
computed from the observed values y1 , . . . , yn , i.e. it does not depend
on any unknown quantities.
A linear estimator of β0 or β1 is any linear function of the observed
values y1 , y2 , . . . , yn ; it has the general form $\sum_{i=1}^{n}a_iy_i$ for
some numbers a1 , a2 , . . . , an .
An unbiased estimator of β0 (or β1 ) has expectation equal to β0 (or β1 ).



SLR-Proof VII

Theorem 14 (Gauss–Markov)

For the Simple Linear Regression model, the LS estimators are the Best
Linear Unbiased Estimators (BLUE).

Example 15
Consider the SLR model yi = β0 + β1 xi + ei , i = 1, 2, 3, 4, with
x1 = 1, x2 = −1, x3 = 2, x4 = 2.
We have x̄ = 1, Sxx = 6. The LSE is

$$\hat\beta_1=\frac{S_{xy}}{S_{xx}}=\frac16\,(-2y_2+y_3+y_4),\qquad
\hat\beta_0=\bar y-\bar x\,\hat\beta_1=\frac{1}{12}\,(3y_1+7y_2+y_3+y_4)$$

Put now $\tilde\beta_1=\frac{y_1-y_2}{2}$, $\tilde\beta_0=\frac{y_1+y_2}{2}$.


SLR-Proof VIII

Check that E (β̃1 ) = β1 and E (β̃0 ) = β0 .
So (β̃0 , β̃1 ) is another linear unbiased estimator of (β0 , β1 ).
However, β̂1 is a better estimator of β1 , since

$$\operatorname{var}(\tilde\beta_1)=\operatorname{var}\Bigl(\frac{y_1-y_2}{2}\Bigr)=\frac{\sigma^2}{2},\qquad
\operatorname{var}(\hat\beta_1)=\frac{\sigma^2}{S_{xx}}=\frac{\sigma^2}{6}<\operatorname{var}(\tilde\beta_1) .$$

Moreover, by the Gauss–Markov theorem,
var(Aβ̃0 + B β̃1 ) ≥ var(Aβ̂0 + B β̂1 ) for any numbers A, B . For example:
var(β̂0 ) ≤ var(β̃0 )
var(β̂1 ) ≤ var(β̃1 )
the fitted values ŷi = β̂0 + β̂1 xi have a smaller variance than
ỹi = β̃0 + β̃1 xi .
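A quick simulation confirms the variance comparison in Example 15 (σ = 1 is an arbitrary choice):

# Compare the LS slope with btilde1 = (y1 - y2)/2 on the fixed
# design x = (1, -1, 2, 2) of Example 15.
set.seed(3)
x <- c(1, -1, 2, 2); beta0 <- 1; beta1 <- 2
est <- replicate(10000, {
  y <- beta0 + beta1 * x + rnorm(4)    # sigma = 1
  c(coef(lm(y ~ x))[2], (y[1] - y[2]) / 2)
})
rowMeans(est)        # both approx beta1 = 2: unbiased
apply(est, 1, var)   # approx 1/6 versus 1/2, as the theory predicts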



SLR-Proof IX

Theorem 16 (Gauss–Markov)

The LSE β̂ = (β̂0 , β̂1 )⊤ is the Best Linear Unbiased Estimator of
β = (β0 , β1 )⊤ ; i.e., if (β̃0 , β̃1 ) is any other linear, unbiased estimator
of β, then
var(a β̂0 + b β̂1 ) ≤ var(a β̃0 + b β̃1 )
for any numbers a, b.

