
Basic Linear Regression

Dave Goldsman

Georgia Institute of Technology, Atlanta, GA, USA

1/7/15

Outline

1 Simple Linear Regression Model

2 Basic Properties

3 Confidence Intervals and Inference for β0 and β1

Simple Linear Regression Model

Suppose we have a data set with the following paired observations:

$$(x_1, y_1), (x_2, y_2), \ldots, (x_n, y_n)$$

Example:
xi = height of person i
yi = weight of person i
Can we make a model expressing yi as a function of xi ?


Estimate $y_i$ for fixed $x_i$. Let's model this with the simple linear regression equation,
$$y_i = \beta_0 + \beta_1 x_i + \varepsilon_i,$$
where $\beta_0$ and $\beta_1$ are unknown constants and the error terms are usually assumed to be
$$\varepsilon_1, \ldots, \varepsilon_n \overset{\text{iid}}{\sim} \mathcal{N}(0, \sigma^2) \;\Rightarrow\; y_i \sim \mathcal{N}(\beta_0 + \beta_1 x_i, \sigma^2).$$


[Figure: two scatter plots about the line $y = \beta_0 + \beta_1 x$, one with "high" $\sigma^2$ and one with "low" $\sigma^2$.]


Warning! Look at the data before you fit a line to it:

[Figure: a scatter plot with pronounced curvature; it doesn't look very linear!]


Month   xi: Production   yi: Electric Usage
        ($ million)      (million kWh)
Jan     4.5              2.5
Feb     3.6              2.3
Mar     4.3              2.5
Apr     5.1              2.8
May     5.6              3.0
Jun     5.0              3.1
Jul     5.3              3.2
Aug     5.8              3.5
Sep     4.7              3.0
Oct     5.6              3.3
Nov     4.9              2.7
Dec     4.2              2.5


[Figure: scatter plot of $y_i$ (from 2.2 to 3.4) against $x_i$ (from 3.5 to 6.0), showing a roughly linear upward trend.]
Great... but how do you fit the line?


Fit the regression line $y = \beta_0 + \beta_1 x$ to the data
$$(x_1, y_1), \ldots, (x_n, y_n)$$
by finding the "best" match between the line and the data. The "best" choice of $\beta_0, \beta_1$ will be chosen to minimize
$$Q = \sum_{i=1}^{n} \bigl(y_i - (\beta_0 + \beta_1 x_i)\bigr)^2 = \sum_{i=1}^{n} \varepsilon_i^2.$$


This is called the least squares fit. Let's solve...
$$\frac{\partial Q}{\partial \beta_0} = -2 \sum \bigl(y_i - (\beta_0 + \beta_1 x_i)\bigr) = 0$$
$$\frac{\partial Q}{\partial \beta_1} = -2 \sum x_i \bigl(y_i - (\beta_0 + \beta_1 x_i)\bigr) = 0$$
$$\Leftrightarrow\quad \sum y_i = n\beta_0 + \beta_1 \sum x_i \quad\text{and}\quad \sum x_i y_i = \beta_0 \sum x_i + \beta_1 \sum x_i^2.$$

After a little algebra, get
$$\hat{\beta}_1 = \frac{n \sum x_i y_i - \bigl(\sum x_i\bigr)\bigl(\sum y_i\bigr)}{n \sum x_i^2 - \bigl(\sum x_i\bigr)^2}$$
$$\hat{\beta}_0 = \bar{y} - \hat{\beta}_1 \bar{x}, \quad\text{where } \bar{y} \equiv \tfrac{1}{n} \sum y_i \text{ and } \bar{x} \equiv \tfrac{1}{n} \sum x_i.$$


Let's introduce some more notation:
$$S_{xx} = \sum (x_i - \bar{x})^2 = \sum x_i^2 - n\bar{x}^2 = \sum x_i^2 - \frac{\bigl(\sum x_i\bigr)^2}{n}$$
$$S_{xy} = \sum (x_i - \bar{x}) y_i = \sum (x_i - \bar{x})(y_i - \bar{y}) = \sum x_i y_i - n\bar{x}\bar{y} = \sum x_i y_i - \frac{\bigl(\sum x_i\bigr)\bigl(\sum y_i\bigr)}{n}$$

These are called sums of squares.


Then, after a little more algebra, we can write
$$\hat{\beta}_1 = \frac{S_{xy}}{S_{xx}}.$$

Fact: If the $\varepsilon_i$'s are iid $\mathcal{N}(0, \sigma^2)$, it can be shown that $\hat{\beta}_0$ and $\hat{\beta}_1$ are the maximum likelihood estimators of $\beta_0$ and $\beta_1$, respectively. (See any text for an easy proof.)

Anyhow, the fitted regression line is
$$\hat{y} = \hat{\beta}_0 + \hat{\beta}_1 x.$$


Fix a specific value $x^*$ of the explanatory variable; the equation then gives a fitted value $\hat{y}|x^* = \hat{\beta}_0 + \hat{\beta}_1 x^*$ for the dependent variable $y$.

[Figure: the fitted line $\hat{y} = \hat{\beta}_0 + \hat{\beta}_1 x$ through the scatter of $(x_i, y_i)$ points, with the fitted value $\hat{y}|x^*$ marked at $x = x^*$.]


Notation Summary: For the actual data points $x_i$,
$$\text{observed values: } y_i = \beta_0 + \beta_1 x_i + \varepsilon_i$$
$$\text{fitted values: } \hat{y}_i = \hat{\beta}_0 + \hat{\beta}_1 x_i$$

Example: Suppose
$$n = 12, \quad \sum_{i=1}^{12} x_i = 58.62, \quad \sum y_i = 34.15,$$
$$\sum x_i^2 = 291.231, \quad \sum y_i^2 = 98.697, \quad \sum x_i y_i = 169.253.$$

These give $\hat{\beta}_0 = 0.4090$ and $\hat{\beta}_1 = 0.49883$, and so the fitted regression line is
$$\hat{y} = 0.409 + 0.499x.$$

For example, $\hat{y}|5.5 = 3.1535$.
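As a quick check, a short Python snippet (my own sketch, plugging the sums above into the least-squares formulas from earlier) reproduces these estimates up to rounding:

```python
n = 12
sum_x, sum_y = 58.62, 34.15
sum_x2, sum_xy = 291.231, 169.253

# b1_hat = (n*sum(xi*yi) - (sum xi)(sum yi)) / (n*sum(xi^2) - (sum xi)^2)
b1 = (n * sum_xy - sum_x * sum_y) / (n * sum_x2 - sum_x**2)
# b0_hat = ybar - b1_hat * xbar
b0 = sum_y / n - b1 * sum_x / n

print(round(b0, 4), round(b1, 4))   # approximately 0.409 and 0.499
print(round(b0 + b1 * 5.5, 4))      # fitted value at x* = 5.5, about 3.15
```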

Basic Properties


Since the $y_i$'s are independent with $y_i \sim \mathcal{N}(\beta_0 + \beta_1 x_i, \sigma^2)$ (and the $x_i$'s are constants), we have
$$E[\hat{\beta}_1] = \frac{1}{S_{xx}} E[S_{xy}] = \frac{1}{S_{xx}} \sum (x_i - \bar{x}) E[y_i]$$
$$= \frac{1}{S_{xx}} \sum (x_i - \bar{x})(\beta_0 + \beta_1 x_i)$$
$$= \frac{1}{S_{xx}} \Bigl( \beta_0 \underbrace{\sum (x_i - \bar{x})}_{0} + \beta_1 \sum (x_i - \bar{x}) x_i \Bigr)$$
$$= \frac{\beta_1}{S_{xx}} \sum (x_i^2 - x_i \bar{x}) = \frac{\beta_1}{S_{xx}} \underbrace{\Bigl( \sum x_i^2 - n\bar{x}^2 \Bigr)}_{S_{xx}} = \beta_1.$$

Thus, $\hat{\beta}_1$ is an unbiased estimator of $\beta_1$.


Further, since $\hat{\beta}_1$ is a linear combination of independent normals, $\hat{\beta}_1$ is itself normal. We can also derive
$$\text{Var}(\hat{\beta}_1) = \frac{1}{S_{xx}^2} \text{Var}(S_{xy}) = \frac{1}{S_{xx}^2} \sum (x_i - \bar{x})^2 \, \text{Var}(y_i) = \frac{\sigma^2}{S_{xx}}.$$

Thus, $\hat{\beta}_1 \sim \mathcal{N}\bigl(\beta_1, \tfrac{\sigma^2}{S_{xx}}\bigr)$.
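A short Monte Carlo sketch (my own illustration, not from the slides; the "true" parameter values are made up for the demo, and the design points reuse the production data from the earlier table) confirms both the unbiasedness and the variance formula empirically:

```python
import numpy as np

rng = np.random.default_rng(1)
beta0, beta1, sigma = 0.4, 0.5, 0.2            # assumed true values for this demo
x = np.array([4.5, 3.6, 4.3, 5.1, 5.6, 5.0,
              5.3, 5.8, 4.7, 5.6, 4.9, 4.2])   # fixed design points (the xi's)
Sxx = np.sum((x - x.mean())**2)

# Repeatedly simulate y from the model and re-estimate the slope.
b1_hats = np.empty(100_000)
for k in range(b1_hats.size):
    y = beta0 + beta1 * x + rng.normal(0.0, sigma, size=x.size)
    b1_hats[k] = np.sum((x - x.mean()) * y) / Sxx   # b1_hat = Sxy / Sxx

print(b1_hats.mean())                   # close to beta1 = 0.5 (unbiasedness)
print(b1_hats.var(), sigma**2 / Sxx)    # the two variances should nearly agree
```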


While we're at it, we can do the same kind of thing with the intercept parameter, $\beta_0$:
$$\hat{\beta}_0 = \bar{y} - \hat{\beta}_1 \bar{x}.$$
We have
$$E[\hat{\beta}_0] = E[\bar{y}] - \bar{x} E[\hat{\beta}_1] = \beta_0 + \beta_1 \bar{x} - \bar{x} \beta_1 = \beta_0,$$
so that $\hat{\beta}_0$ is unbiased for $\beta_0$.

Similar to before, since $\hat{\beta}_0$ is a linear combination of independent normals, it is also normal. Finally,
$$\text{Var}(\hat{\beta}_0) = \frac{\sum x_i^2}{n S_{xx}} \sigma^2.$$


Proof:
$$\text{Cov}(\bar{y}, \hat{\beta}_1) = \frac{1}{S_{xx}} \text{Cov}\Bigl(\bar{y}, \sum (x_i - \bar{x}) y_i\Bigr) = \frac{1}{S_{xx}} \sum (x_i - \bar{x}) \, \text{Cov}(\bar{y}, y_i) = \frac{\sum (x_i - \bar{x})}{S_{xx}} \cdot \frac{\sigma^2}{n} = 0$$
$$\Rightarrow \text{Var}(\hat{\beta}_0) = \text{Var}(\bar{y} - \hat{\beta}_1 \bar{x}) = \text{Var}(\bar{y}) + \bar{x}^2 \, \text{Var}(\hat{\beta}_1) - 2\bar{x} \underbrace{\text{Cov}(\bar{y}, \hat{\beta}_1)}_{0}$$
$$= \frac{\sigma^2}{n} + \bar{x}^2 \frac{\sigma^2}{S_{xx}} = \sigma^2 \, \frac{S_{xx} + n\bar{x}^2}{n S_{xx}} = \frac{\sum x_i^2}{n S_{xx}} \sigma^2. \quad \Box$$

Thus, $\hat{\beta}_0 \sim \mathcal{N}\bigl(\beta_0, \tfrac{\sum x_i^2}{n S_{xx}} \sigma^2\bigr)$.


Now let's estimate the error variation $\sigma^2$ by considering the deviations between $y_i$ and $\hat{y}_i$, i.e., the sum of squared errors,
$$SSE \equiv \sum (y_i - \hat{y}_i)^2 = \sum \bigl(y_i - (\hat{\beta}_0 + \hat{\beta}_1 x_i)\bigr)^2 = \sum y_i^2 - \hat{\beta}_0 \sum y_i - \hat{\beta}_1 \sum x_i y_i.$$

Turns out that a good estimator for $\sigma^2$ is
$$\hat{\sigma}^2 \equiv \frac{SSE}{n-2} \sim \frac{\sigma^2 \chi^2(n-2)}{n-2}.$$
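Continuing the running numerical example, a brief sketch (my own; the sums are carried over from the earlier example slide) computes SSE and $\hat{\sigma}^2$ via the shortcut formula above:

```python
n = 12
sum_x, sum_y = 58.62, 34.15
sum_x2, sum_y2, sum_xy = 291.231, 98.697, 169.253

# Recompute the estimates from the sums (as on the earlier example slide).
b1 = (n * sum_xy - sum_x * sum_y) / (n * sum_x2 - sum_x**2)
b0 = sum_y / n - b1 * sum_x / n

# SSE = sum(yi^2) - b0_hat*sum(yi) - b1_hat*sum(xi*yi)
sse = sum_y2 - b0 * sum_y - b1 * sum_xy
sigma2_hat = sse / (n - 2)          # estimator of sigma^2 with n-2 degrees of freedom
print(sse, sigma2_hat)              # roughly 0.30 and 0.03
```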

Confidence Intervals and Inference for β0 and β1


Back to $\hat{\beta}_1 \sim \mathcal{N}(\beta_1, \sigma^2 / S_{xx})$ . . .
$$\Rightarrow \frac{\hat{\beta}_1 - \beta_1}{\sqrt{\sigma^2 / S_{xx}}} \sim \mathcal{N}(0, 1)$$
In addition, it turns out that:
(1) $\hat{\sigma}^2 = \frac{SSE}{n-2} \sim \frac{\sigma^2 \chi^2(n-2)}{n-2}$;
(2) $\hat{\sigma}^2$ is independent of $\hat{\beta}_1$.
$$\Rightarrow \frac{\dfrac{\hat{\beta}_1 - \beta_1}{\sqrt{\sigma^2 / S_{xx}}}}{\sqrt{\hat{\sigma}^2 / \sigma^2}} \sim \frac{\mathcal{N}(0, 1)}{\sqrt{\chi^2(n-2)/(n-2)}} \sim t(n-2)$$
$$\Rightarrow \frac{\hat{\beta}_1 - \beta_1}{\hat{\sigma} / \sqrt{S_{xx}}} \sim t(n-2).$$


[Figure: the $t(n-2)$ density, with central area $1 - \alpha$ between the critical points $-t_{\alpha/2, n-2}$ and $t_{\alpha/2, n-2}$.]


2-sided Confidence Intervals for $\beta_1$:
$$1 - \alpha = P\Bigl( -t_{\alpha/2, n-2} \le \frac{\hat{\beta}_1 - \beta_1}{\hat{\sigma} / \sqrt{S_{xx}}} \le t_{\alpha/2, n-2} \Bigr)$$
$$= P\Bigl( \hat{\beta}_1 - t_{\alpha/2, n-2} \frac{\hat{\sigma}}{\sqrt{S_{xx}}} \le \beta_1 \le \hat{\beta}_1 + t_{\alpha/2, n-2} \frac{\hat{\sigma}}{\sqrt{S_{xx}}} \Bigr)$$

1-sided CIs for $\beta_1$:
$$\beta_1 \in \Bigl( -\infty, \; \hat{\beta}_1 + t_{\alpha, n-2} \frac{\hat{\sigma}}{\sqrt{S_{xx}}} \Bigr)$$
$$\beta_1 \in \Bigl( \hat{\beta}_1 - t_{\alpha, n-2} \frac{\hat{\sigma}}{\sqrt{S_{xx}}}, \; \infty \Bigr)$$

We can also construct CIs for $\beta_0$, as well as carry out hypothesis tests; a sketch of the 2-sided interval for $\beta_1$ appears below.
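For instance, the two-sided 95% interval for $\beta_1$ in the running example might be computed as follows (a sketch; the numerical inputs are the values derived earlier, and scipy's t quantile stands in for a t table):

```python
from scipy import stats

n = 12
b1_hat = 0.4988        # slope estimate from the running example
Sxx = 4.8723           # sum(xi^2) - (sum xi)^2 / n, from the example sums
sigma_hat = 0.1731     # sqrt(SSE / (n - 2)) from the previous section

alpha = 0.05
t_crit = stats.t.ppf(1 - alpha / 2, df=n - 2)     # t_{alpha/2, n-2}, about 2.228
half_width = t_crit * sigma_hat / Sxx**0.5

print(b1_hat - half_width, b1_hat + half_width)   # roughly (0.32, 0.67)
```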
