Simple Linear Regression and Correlation: Abrasion Loss vs. Hardness

The document summarizes key concepts in simple linear regression: 1) Simple linear regression models the relationship between a dependent variable Y and an independent variable X using the linear equation Y = β0 + β1X + ε, where β0 and β1 are parameters estimated from sample data. 2) The parameters β0 and β1 are estimated by the method of least squares, which finds the values of β0 and β1 that minimize the sum of squared residuals between observed and predicted Y values. 3) The assumptions of simple linear regression include a linear relationship between Y and X, and independent, normally distributed errors with constant variance.

Chapter 11: SIMPLE LINEAR REGRESSION

AND CORRELATION

Part 1: Simple Linear Regression (SLR)


Introduction
Sections 11-1 and 11-2
Abrasion Loss vs. Hardness

Price of clock vs. Age of clock

[Scatterplot: Price Sold at Auction (y-axis) vs. Age of Clock in years (x-axis), with points coded by number of Bidders.]

1
• Regression is a method for studying the
relationship between two or more
quantitative variables

• Simple linear regression (SLR):


One quantitative dependent variable
- response variable
- dependent variable
-Y
One quantitative independent variable
- explanatory variable
- predictor variable
-X

• Multiple linear regression:


One quantitative dependent variable
Many quantitative independent variables

– You’ll see this in STAT:3200/IE:3760 Applied Linear Regression, if you take it.
2
• SLR Examples:
– predict salary from years of experience
– estimate effect of lead exposure on school
testing performance
– predict force at which a metal alloy rod
bends based on iron content

3
• Example: Health data
Variables:
Percent of Obese Individuals
Percent of Active Individuals
Data from CDC. Units are regions of U.S. in 2014.

PercentObesity PercentActive
1 29.7 55.3
2 28.9 51.9
3 35.9 41.2
4 24.7 56.3
5 21.3 60.4
6 26.3 50.9
.
.
.
[Scatterplot: Percent Obese (y-axis) vs. Percent Active (x-axis).]

4
A scatterplot or scatter diagram can give us a general idea of the relationship between obesity and activity...

[Scatterplot: Percent Obese (y-axis) vs. Percent Active (x-axis).]

The points are plotted as the pairs (xi, yi) for i = 1, . . . , 25.

Inspection suggests a linear relationship between obesity and activity (i.e. a straight line would go through the bulk of the points, and the points would look randomly scattered around this line).
5
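As a hedged illustration (not part of the original slides), here is a minimal Python sketch of such a scatterplot, using the first six rows of the CDC table above; matplotlib is an assumed plotting choice.

    import matplotlib.pyplot as plt

    # First six (PercentActive, PercentObesity) pairs from the table above;
    # the full CDC data set is not reproduced here.
    percent_active = [55.3, 51.9, 41.2, 56.3, 60.4, 50.9]
    percent_obese = [29.7, 28.9, 35.9, 24.7, 21.3, 26.3]

    # Plot the (x_i, y_i) pairs: activity on the x-axis, obesity on the y-axis.
    plt.scatter(percent_active, percent_obese)
    plt.xlabel("Percent Active")
    plt.ylabel("Percent Obese")
    plt.show()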
Simple Linear Regression
The model

• The basic model

Yi = β0 + β1xi + εi

– Yi is the observed response or dependent variable for observation i

– xi is the observed predictor, regressor, explanatory variable, independent variable, covariate

– εi is the error term

– the εi are iid N(0, σ²)


(iid means independently and identically distributed)

6
– So, E[Yi|xi] = β0 + β1xi + 0 = β0 + β1xi

The conditional mean (i.e. the expected value of Yi given xi, or after conditioning on xi) is “β0 + β1xi” (a point on the regression line).

– Or, as another notation, E[Y |x] = µY |x

– The random scatter around the mean (i.e. around the line) follows a N(0, σ²) distribution.

7
Example: Consider the model that regresses Oxygen purity on Hydrocarbon level in a distillation process with...

β0 = 75 and β1 = 15

For each xi there is a different Oxygen purity mean (which is the center of a normal distribution of Oxygen purity values).

Plugging in xi to (75 + 15xi) gives you the conditional mean at xi.

8
The conditional mean for x = 1:

E[Y |x] = 75 + 15 · 1 = 90

The conditional mean for x = 1.25:

E[Y |x] = 75 + 15 · 1.25 = 93.75

9
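As a hedged sketch (not from the slides), the Python snippet below evaluates these conditional means and simulates responses scattering around them; β0 = 75 and β1 = 15 come from the example, while σ = 1.5 and the x values are arbitrary choices for illustration.

    import numpy as np

    beta0, beta1 = 75.0, 15.0   # parameters from the Oxygen purity example
    sigma = 1.5                 # assumed error standard deviation (not given in the slides)

    def conditional_mean(x):
        # E[Y | x] = beta0 + beta1 * x
        return beta0 + beta1 * x

    print(conditional_mean(1.0))    # 90.0
    print(conditional_mean(1.25))   # 93.75

    # Each simulated Y_i is its conditional mean plus N(0, sigma^2) noise.
    rng = np.random.default_rng(0)
    x = np.array([1.0, 1.25, 1.5])
    y = conditional_mean(x) + rng.normal(0.0, sigma, size=x.size)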
These values that randomly scatter around a
conditional mean are called errors.

The random error of observation i is denoted εi. The errors around a conditional mean are normally distributed, centered at 0, and have a variance of σ², i.e. εi ∼ N(0, σ²).

Here, we assume all the conditional distributions of the errors are the same, so we’re using a constant variance model.

V[Yi|xi] = V(β0 + β1xi + εi) = V(εi) = σ²

10
• The model can also be written as:

Yi|xi ∼ N(β0 + β1xi, σ²)    (the first argument is the conditional mean)
– mean of Y given x is β0 + β1x (known as
conditional mean)

– β0 + β1xi is the mean value of all the Y ’s for the given value of xi

The regression line itself represents all the conditional means.

Not all of the observed points will fall on the line; there is some random noise around the mean (we model this part with an error term).

Usually, we will not know β0, β1, or σ², so we will estimate them from the data.
11
• Some interpretation of parameters:

– β0 is the conditional mean when x = 0

– β1 is the slope, also stated as the change in the mean of Y per 1-unit change in x

– σ² is the variability of responses about the conditional mean

12
Simple Linear Regression
Assumptions

• Key assumptions

– linear relationship exists between Y and x

*we say the relationship between Y and x is linear if the means of the conditional distributions of Y |x lie on a straight line

– independent errors
(this essentially equates to independent
observations in the case of SLR)

– constant variance of errors

– normally distributed errors

13
Simple Linear Regression
Estimation

We wish to use the sample data to estimate the population parameters: the slope β1 and the intercept β0.
• Least squares estimation
– To choose the ‘best fitting line’ using least
squares estimation, we minimize the sum
of the squared vertical distances of each
point to the fitted line.

14
– We let ‘hats’ denote predicted values or
estimates of parameters, so we have:

ŷi = β̂0 + β̂1xi

where ŷi is the estimated conditional mean for xi, β̂0 is the estimator for β0, and β̂1 is the estimator for β1.

– We wish to choose β̂0 and β̂1 such that we minimize the sum of the squared vertical distances of each point to the fitted line, i.e. minimize Σᵢ₌₁ⁿ (yi − ŷi)².

– Or minimize the function g:

g(β̂0, β̂1) = Σᵢ₌₁ⁿ (yi − ŷi)² = Σᵢ₌₁ⁿ (yi − (β̂0 + β̂1xi))²
15
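To make the minimization concrete, here is a small Python sketch (my own illustration, not from the slides) that minimizes g numerically over (β̂0, β̂1); the data values are placeholders, and scipy.optimize is an assumed choice, since in practice the closed-form solution on the next slides is used instead.

    import numpy as np
    from scipy.optimize import minimize

    # Placeholder data; any paired x, y values would do.
    x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
    y = np.array([2.1, 3.9, 6.2, 8.1, 9.8])

    def g(b):
        # Sum of squared vertical distances from the line b[0] + b[1] * x.
        b0, b1 = b
        return np.sum((y - (b0 + b1 * x)) ** 2)

    # Numerically search for the (b0, b1) pair that minimizes g.
    result = minimize(g, x0=[0.0, 0.0])
    print(result.x)   # numerically close to the least squares estimates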
– This vertical distance of a point from the fitted line is called a residual. The residual for observation i is denoted ei and

ei = yi − ŷi

– So, in least squares estimation, we wish to minimize the sum of the squared residuals (or error sum of squares, SSE).

– To minimize

g(β̂0, β̂1) = Σᵢ₌₁ⁿ (yi − (β̂0 + β̂1xi))²

we take the derivative of g with respect to β̂0 and β̂1, set each equal to zero, and solve:

∂g/∂β̂0 = −2 Σᵢ₌₁ⁿ (yi − (β̂0 + β̂1xi)) = 0

∂g/∂β̂1 = −2 Σᵢ₌₁ⁿ (yi − (β̂0 + β̂1xi)) xi = 0

16
Simplifying the above gives:

n β̂0 + β̂1 Σᵢ₌₁ⁿ xi = Σᵢ₌₁ⁿ yi

β̂0 Σᵢ₌₁ⁿ xi + β̂1 Σᵢ₌₁ⁿ xi² = Σᵢ₌₁ⁿ yi xi

And these two equations are known as the least squares normal equations.

Solving the normal equations gets us our estimators β̂0 and β̂1...

17
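As an illustration (not in the original slides), the two normal equations form a 2x2 linear system that can be solved directly; a minimal numpy sketch with placeholder data:

    import numpy as np

    # Placeholder data.
    x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
    y = np.array([2.1, 3.9, 6.2, 8.1, 9.8])
    n = x.size

    # Normal equations:
    #   n*b0        + (sum xi)*b1   = sum yi
    #   (sum xi)*b0 + (sum xi^2)*b1 = sum yi*xi
    A = np.array([[n, x.sum()],
                  [x.sum(), (x ** 2).sum()]])
    rhs = np.array([y.sum(), (x * y).sum()])

    beta0_hat, beta1_hat = np.linalg.solve(A, rhs)
    print(beta0_hat, beta1_hat)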
Simple Linear Regression
Estimation

– Estimate of the slope:

β̂1 = Σᵢ₌₁ⁿ (xi − x̄)(yi − ȳ) / Σᵢ₌₁ⁿ (xi − x̄)² = Sxy / Sxx

– Estimate of the Y-intercept:

β̂0 = ȳ − β̂1x̄

The point (x̄, ȳ) will always be on the least squares line.

Alternative formulas for β̂0 and β̂1 are also given in the book.

18
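A minimal Python sketch (my own illustration, not code from the slides) implementing these closed-form estimates; applied to the raw cigarette data on the next slide, it should reproduce the software values, assuming the raw x and y values are available.

    import numpy as np

    def least_squares_fit(x, y):
        # Returns (beta0_hat, beta1_hat) from the Sxy/Sxx formulas.
        x = np.asarray(x, dtype=float)
        y = np.asarray(y, dtype=float)
        x_bar, y_bar = x.mean(), y.mean()
        sxy = np.sum((x - x_bar) * (y - y_bar))
        sxx = np.sum((x - x_bar) ** 2)
        beta1_hat = sxy / sxx
        beta0_hat = y_bar - beta1_hat * x_bar   # the line passes through (x_bar, y_bar)
        return beta0_hat, beta1_hat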
• Example: Cigarette data
(Nicotine vs. Tar content)

[Scatterplot: Nicotine (Nic, y-axis) vs. Tar (x-axis).]

n = 25

Least squares estimates from software:

β̂0 = 0.1309 and β̂1 = 0.0610

Summary statistics:

Σᵢ₌₁ⁿ xi = 305.4,  x̄ = 12.216
Σᵢ₌₁ⁿ yi = 21.91,  ȳ = 0.8764
19
Σᵢ₌₁ⁿ (yi − ȳ)(xi − x̄) = 47.01844
Σᵢ₌₁ⁿ (xi − x̄)² = 770.4336
Σᵢ₌₁ⁿ xi² = 4501.2
Σᵢ₌₁ⁿ yi² = 22.2105

Using the previous formulas and the summary statistics...

β̂1 = Sxy / Sxx = 47.01844 / 770.4336 = 0.061029

and

β̂0 = ȳ − β̂1x̄
   = 0.8764 − 0.061029(12.216)
   = 0.130870

(Same estimates as software)

20
Simple Linear Regression
Estimating σ²

• One of the assumptions of simple linear regression is that the variance for each of the conditional distributions of Y |x is the same at all x-values (i.e. constant variance).

• In this case, it makes sense to pool all the observed error information (in the residuals) to come up with a common estimate for σ².

21
Recall the model:

Yi = β0 + β1xi + εi,  with the εi iid N(0, σ²)

– We use the error sum of squares (SSE) to estimate σ²...

σ̂² = SSE / (n − 2) = Σᵢ₌₁ⁿ (yi − ŷi)² / (n − 2) = MSE

∗ SSE = error sum of squares = Σᵢ₌₁ⁿ (yi − ŷi)²

∗ MSE is the mean squared error

∗ E[MSE] = E[σ̂²] = σ² (unbiased estimator)

∗ σ̂ = √σ̂² = √MSE

22
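A short Python sketch (not from the slides) of this pooled estimate, assuming the slope and intercept have already been estimated (e.g. with a helper like the least_squares_fit function sketched earlier):

    import numpy as np

    def estimate_sigma2(x, y, beta0_hat, beta1_hat):
        # Pooled variance estimate: MSE = SSE / (n - 2).
        x = np.asarray(x, dtype=float)
        y = np.asarray(y, dtype=float)
        y_hat = beta0_hat + beta1_hat * x   # fitted conditional means
        residuals = y - y_hat               # e_i = y_i - y_hat_i
        sse = np.sum(residuals ** 2)        # error sum of squares
        n = y.size
        return sse / (n - 2)                # two mean parameters were estimated

    # sigma_hat = np.sqrt(estimate_sigma2(...)) estimates the error standard deviation.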
∗ ‘2’ is subtracted from n in the denominator because we’ve used 2 degrees of freedom for estimating the slope and intercept (i.e. there were 2 parameters estimated when modeling the conditional mean)

∗ When we estimated σ² for a single normal population, we divided Σᵢ₌₁ⁿ (yi − ȳ)² by (n − 1) because we only estimated 1 mean-structure parameter, µ; now we estimate two parameters for our mean structure, β0 and β1.

23
