Notes 2

The document discusses the properties and conditions of a simple linear regression model, focusing on the four 'LINE' conditions: linearity, independence, normality, and equal variances of errors. It explains the estimation of intercept (b0) and slope (b1) using least squares, and how these estimates relate to the mean response and predictor variables. Additionally, it covers the significance of population variance (σ²) and mean square error (MSE) in predicting future responses.


Properties and LINE Conditions

Further Topics...

1 Four “LINE” conditions of a simple linear regression model

2 Math Formulas of b0 (intercept) and b1 (slope)

3 Properties of b0 and b1

4 Estimation of σ² (population variance)


Simple Linear Regression Model: Four “LINE” Conditions

A simple linear regression model for a data set (xi , Yi ) is defined as

Yi = β0 + β1 xi + εi , i = 1, . . . , n.

Four conditions for a simple linear regression model:


1 The mean of the response, E(Yi) = β0 + β1 xi, is a Linear function of the xi.

2 The errors, εi , are Independent.

3 The errors, εi , at each value of the predictor, xi , are Normally distributed.

4 The errors, εi, at each value of the predictor, xi, have Equal variances (denoted σ²).
We are studying “LINE” in this course.
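
To make the four conditions concrete, here is a minimal sketch (not from the notes) that simulates data satisfying all of them: the mean of Yi is a Linear function of xi, and the errors are Independent draws from a Normal distribution with a common (Equal) variance σ². The values of β0, β1, and σ are arbitrary choices for the illustration.

import random

random.seed(1)
beta0, beta1, sigma = -266.53, 6.14, 8.0   # hypothetical parameter values

x = [63, 64, 66, 67, 68, 69, 70, 71, 73, 75]   # predictor values
# Independent, Normally distributed errors with Equal variance sigma**2
# are added to a Linear mean function beta0 + beta1 * x_i.
y = [beta0 + beta1 * xi + random.gauss(0, sigma) for xi in x]

for xi, yi in zip(x, y):
    print(f"x = {xi}, Y = {yi:.1f}")

Simulated data like these are what the estimates b0 and b1, introduced next, are meant to recover.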
Least Squares Estimates: b0 (Estimate of β0) and b1 (Estimate of β1)
In the previous lecture, we talked about a data set of 10 students, and we have
heights (h) and weights (w) of the 10 students.
The “best fitting line” is shown in the following plot: the intercept b0 = −266.53 and
the slope b1 = 6.14.

[Scatter plot: Weight (lb) vs. Height (in) for the 10 students with the least squares line; labeled points include (63, 127), (64, 121), (73, 181), (75, 208); dashed lines mark x̄ = 69.3 and Ȳ = 158.8.]
By differentiation of the least squares criterion
Q = Σ_{i=1}^n [Yi − (b0 + b1 xi)]²

we can get

Σ_{i=1}^n (Yi − b0 − b1 xi) = 0

Σ_{i=1}^n xi (Yi − b0 − b1 xi) = 0

Solving the two equations in the previous slide, we get


b1 = Σ_{i=1}^n (xi − x̄)(Yi − Ȳ) / Σ_{i=1}^n (xi − x̄)² = Sxy / Sxx

b0 = Ȳ − b1 x̄
1 Because the formulas for b0 and b1 are derived using the least squares
criterion, the resulting equation

Ŷi = b0 + b1 xi

is often referred to as the “least squares regression line,” or simply the “least squares line.”

2 Re-arranging the terms in the formula

b0 = Ȳ − b1 x̄,

we can get
Ȳ = b0 + b1 x̄,
which means that the least squares line passes through the point (x̄, Ȳ ).
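
As a sketch of these formulas in code (the full 10-student height/weight data set is not listed in these notes, so the numbers below are hypothetical), the following computes b1 = Sxy/Sxx and b0 = Ȳ − b1 x̄, then checks that the least squares line passes through (x̄, Ȳ) and that the residuals satisfy the first normal equation.

# Hypothetical heights (in) and weights (lb); only the formulas are the point.
x = [63, 64, 66, 67, 68, 69, 70, 71, 73, 75]
y = [127, 121, 142, 157, 162, 155, 169, 165, 181, 208]

n = len(x)
x_bar = sum(x) / n
y_bar = sum(y) / n

sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
sxx = sum((xi - x_bar) ** 2 for xi in x)

b1 = sxy / sxx              # slope estimate
b0 = y_bar - b1 * x_bar     # intercept estimate
print("b0 =", round(b0, 2), " b1 =", round(b1, 2))

# The least squares line passes through (x_bar, y_bar) ...
print(abs((b0 + b1 * x_bar) - y_bar) < 1e-6)
# ... and the residuals sum to zero (first normal equation).
print(abs(sum(yi - (b0 + b1 * xi) for xi, yi in zip(x, y))) < 1e-6)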
Some Notations
We use the notations:
1 Sum of squares for x:

Sxx = Σ_{i=1}^n (xi − x̄)² = Σ_{i=1}^n xi² − n x̄²

2 Sum of squares for Y:

Syy = Σ_{i=1}^n (Yi − Ȳ)² = Σ_{i=1}^n Yi² − n Ȳ²

3 Cross-product sum of squares:

Sxy = Σ_{i=1}^n (xi − x̄)(Yi − Ȳ) = Σ_{i=1}^n xi Yi − n x̄ Ȳ

4 Sample mean for x:

x̄ = (Σ_{i=1}^n xi) / n

5 Sample mean for Y:

Ȳ = (Σ_{i=1}^n Yi) / n
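
The “shortcut” forms on the right-hand sides are algebraically identical to the deviation forms. A small sketch with hypothetical data, checking that the two forms of Sxx, Syy, and Sxy agree numerically:

import math

x = [63, 64, 66, 67, 68, 69, 70, 71, 73, 75]   # hypothetical data
y = [127, 121, 142, 157, 162, 155, 169, 165, 181, 208]
n = len(x)
x_bar, y_bar = sum(x) / n, sum(y) / n

# Deviation form vs. shortcut form for each sum of squares.
sxx_dev = sum((xi - x_bar) ** 2 for xi in x)
sxx_short = sum(xi ** 2 for xi in x) - n * x_bar ** 2

syy_dev = sum((yi - y_bar) ** 2 for yi in y)
syy_short = sum(yi ** 2 for yi in y) - n * y_bar ** 2

sxy_dev = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
sxy_short = sum(xi * yi for xi, yi in zip(x, y)) - n * x_bar * y_bar

print(math.isclose(sxx_dev, sxx_short),
      math.isclose(syy_dev, syy_short),
      math.isclose(sxy_dev, sxy_short))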
What Do b0 and b1 Tell Us?

1 b0 is the predicted response value when x = 0.


1. In the example of 10 students’ height and weight, b0 tells us that a person who
is 0 inches tall is predicted to weigh -267 pounds, which is not meaningful.

2. This happened because we “extrapolated” beyond the “scope of the model”


(the range of the x values).

2 b1 is the estimate of the change in mean response value E(Y ) for every
additional one-unit increase in the predictor x.
1. In the example of 10 students’ height and weight, b1 tells us that we predict the
mean weight to increase by 6.14 pounds for every additional one-inch increase
in height.

2. In general, we can expect the mean response to increase or decrease by b1 units for every one-unit increase in the predictor x.
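
A short worked example using the fitted line from the notes (b0 = −266.53, b1 = 6.14): predictions are sensible inside the observed range of heights but meaningless when we extrapolate to x = 0.

# Fitted line from the notes: weight_hat = -266.53 + 6.14 * height.
b0, b1 = -266.53, 6.14

def predict_weight(height_inches):
    """Predicted mean weight (pounds) at a given height (inches)."""
    return b0 + b1 * height_inches

print(predict_weight(70))   # inside the observed range: about 163.3 pounds
print(predict_weight(0))    # extrapolation to x = 0: about -266.5 pounds (meaningless)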
Understanding the Slope b1

1 If we study the formula for the slope b1:

b1 = Σ_{i=1}^n (xi − x̄)(Yi − Ȳ) / Σ_{i=1}^n (xi − x̄)²

we see that the denominator is necessarily positive, since it sums squared terms (it is strictly positive as long as the xi are not all equal).

2 Therefore, the sign of the slope b1 is solely determined by the numerator.

3 The numerator tells us, for each data point, to sum up the product of two signed deviations: the deviation of the x value from x̄ (the mean of all of the x values) and the deviation of the Y value from Ȳ (the mean of all of the Y values).
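
A tiny sketch of this idea, using only the four labeled points from the plot (not the full data set): a point above-right or below-left of (x̄, Ȳ) contributes a positive product, and here every contribution is positive, so the numerator, and hence b1, is positive.

# Four labeled points from the height-weight plot (not the full data set).
points = [(63, 127), (64, 121), (73, 181), (75, 208)]
x_bar = sum(p[0] for p in points) / len(points)
y_bar = sum(p[1] for p in points) / len(points)

for xv, yv in points:
    product = (xv - x_bar) * (yv - y_bar)
    print(f"({xv}, {yv}): (x - x_bar)(Y - Y_bar) = {product:.1f}")

# The numerator of b1 is the sum of these products.
print("numerator =", sum((xv - x_bar) * (yv - y_bar) for xv, yv in points))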
When is the Slope b1 > 0?
1 Is the trend in the following plot positive, i.e., as x increases, Y tends to increase?

2 If the trend is positive, then the slope b1 must be positive.

3 The vertical dashed line is x̄. The horizontal dashed line is Ȳ .

[Scatter plot: Weight (lb) vs. Height (in); dashed lines mark x̄ = 69.3 and Ȳ = 158.8; labeled points include (63, 127), (64, 121), (73, 181), (75, 208).]
When is the Slope b1 < 0?
1 Is the trend in the following plot negative, i.e., as x increases, Y tends to decrease?

2 If the trend is negative, then the slope b1 must be negative.

3 The vertical dashed line is x̄. The horizontal dashed line is Ȳ .

[Scatter plot: Skin Cancer Mortality (deaths per 10 million) vs. Latitude (at center of state); dashed lines mark x̄ = 39.5 and Ȳ = 152.9; labeled points include (33, 219), (34.5, 160), (43, 134), (44.8, 86).]


Estimation of σ² (Unknown Population Variance)

Why should we care about σ²? One reason is that we want to predict a future response from an estimated regression line.

We have two thermometer brands, (A) and (B). The predictor is Celsius and the response is Fahrenheit. Which thermometer brand, (A) or (B), will yield more precise future predictions?

[Two scatter plots, (A) and (B): Fahrenheit vs. Celsius readings for each thermometer brand.]
Review of Sample Variance
When there is no predictor x, we use Ȳ to estimate E(Y), and we use the sample variance s² to estimate σ². The sample variance is

s² = Σ_{i=1}^n (Yi − Ȳ)² / (n − 1)
[Figure: a probability density curve of the response.]
In the simple linear regression setting, there is a predictor x. At each x value, there is a sub-group of data points, and we use

Ŷi = b0 + b1 xi

to estimate
E(Yi ) = β0 + β1 xi .

[Two scatter plots: College entrance test score vs. High school gpa, for a population of 200 students (with the population regression line) and for a sample of 20 students (with the sample regression line).]


Mean Square Error (MSE) in Simple Linear Regression

The mean square error is

MSE = Σ_{i=1}^n (Yi − Ŷi)² / (n − 2)

1 The numerator again adds up, in squared units, how far each response Yi is from its estimated mean Ŷi.

2 The denominator divides the sum by n − 2, because we effectively estimate two parameters: the population intercept β0 and the population slope β1. That is, we lose two degrees of freedom.

3 It can be shown that E(MSE) = σ², i.e., MSE is an unbiased estimator of σ². We can write σ̂² = MSE.
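
A small sketch with hypothetical data, comparing the no-predictor estimate s² (divisor n − 1) with the regression estimate MSE (divisor n − 2):

x = [63, 64, 66, 67, 68, 69, 70, 71, 73, 75]   # hypothetical data
y = [127, 121, 142, 157, 162, 155, 169, 165, 181, 208]
n = len(x)
x_bar, y_bar = sum(x) / n, sum(y) / n

# Sample variance: squared deviations from Y_bar, divided by n - 1.
s2 = sum((yi - y_bar) ** 2 for yi in y) / (n - 1)

# Least squares fit, then MSE: squared residuals divided by n - 2,
# because two parameters (beta0 and beta1) were estimated.
b1 = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / \
     sum((xi - x_bar) ** 2 for xi in x)
b0 = y_bar - b1 * x_bar
mse = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y)) / (n - 2)

print("s^2 =", round(s2, 2), " MSE (sigma_hat^2) =", round(mse, 2))

When the predictor is informative, the residuals around the fitted line are smaller than the deviations around Ȳ, so MSE is typically much smaller than s².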
