Chapter 3 - Classical Simple Linear Regression
[Figure: types of regression models. Simple regression: linear or non-linear. Multiple regression: linear or non-linear.]
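$$
r = \frac{\sum xy - \dfrac{\sum x \sum y}{n}}{\sqrt{\left(\sum x^2 - \dfrac{(\sum x)^2}{n}\right)\left(\sum y^2 - \dfrac{(\sum y)^2}{n}\right)}}
$$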
Example
Consider the following example
Serial no.   Age, X (years)   Weight, Y (Kg)   XY    X²    Y²
1            7                12               84    49    144
2            6                8                48    36    64
3            8                12               96    64    144
4            5                10               50    25    100
5            6                11               66    36    121
6            9                13               117   81    169
a)
b)
c) Calculate R-squared and comment on the goodness of fit of
the model to the data
d) Fit the regression model and interpret the results
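The calculations for this example can be sketched in Python as follows. This is a minimal illustration, not a prescribed solution: it assumes Age is the independent variable X and Weight the dependent variable Y (as the XY, X² and Y² columns suggest), and it uses the standard least squares formulas for the slope b and intercept a. All variable names are illustrative.

```python
from math import sqrt

# Data from the example table: age in years (X) and weight in kg (Y).
age = [7, 6, 8, 5, 6, 9]
weight = [12, 8, 12, 10, 11, 13]
n = len(age)

# Raw sums, matching the columns of the table.
sum_x = sum(age)
sum_y = sum(weight)
sum_xy = sum(x * y for x, y in zip(age, weight))
sum_x2 = sum(x * x for x in age)
sum_y2 = sum(y * y for y in weight)

# Corrected sums of squares and cross-products.
s_xy = sum_xy - sum_x * sum_y / n
s_xx = sum_x2 - sum_x ** 2 / n
s_yy = sum_y2 - sum_y ** 2 / n

# Correlation coefficient and R-squared.
r = s_xy / sqrt(s_xx * s_yy)
r_squared = r ** 2

# Least squares slope and intercept (assumed standard formulas).
b = s_xy / s_xx
a = sum_y / n - b * sum_x / n

print(f"r = {r:.4f}, R-squared = {r_squared:.4f}")
print(f"fitted line: Y = {a:.4f} + {b:.4f} X")
```

With these six observations the sketch gives r of about 0.76 and R-squared of about 0.58, with a fitted line of roughly Y = 4.69 + 0.92 X.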
Sources of Errors in Regression
[Figure: the line Y = a + bx, with Income on the X axis.]
Explaining Variation
Required: calculate
a) SSR
b) SST
c) SSE
Table
Week   Weekly Sales (1000s of gallons)   Selling Price ($)
1      10                                1.30
2      6                                 2.00
3      5                                 1.70
4      12                                1.50
5      10                                1.60
6      15                                1.20
7      5                                 1.60
8      12                                1.40
9      17                                1.00
10     20                                1.10
         Y     X       XY      X²      Y²
         10    1.30    13.0    1.69    100
         6     2.00    12.0    4.00    36
         5     1.70    8.5     2.89    25
         12    1.50    18.0    2.25    144
         10    1.60    16.0    2.56    100
         15    1.20    18.0    1.44    225
         5     1.60    8.0     2.56    25
         12    1.40    16.8    1.96    144
         17    1.00    17.0    1.00    289
         20    1.10    22.0    1.21    400
Totals:  112   14.40   149.3   21.56   1488
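The column totals can be reproduced with a few lines of Python. This is only a sketch for checking the arithmetic; it assumes Y is weekly sales and X is selling price, and the list names are illustrative.

```python
# Weekly sales in 1000s of gallons (Y) and selling price in dollars (X).
sales = [10, 6, 5, 12, 10, 15, 5, 12, 17, 20]
price = [1.30, 2.00, 1.70, 1.50, 1.60, 1.20, 1.60, 1.40, 1.00, 1.10]

# One row per week: Y, X, XY, X squared, Y squared.
rows = [(y, x, x * y, x * x, y * y) for x, y in zip(price, sales)]

# Sum each column; rounding hides floating-point noise in the decimals.
totals = [round(sum(column), 2) for column in zip(*rows)]

for label, total in zip(["Y", "X", "XY", "X^2", "Y^2"], totals):
    print(f"sum of {label}: {total}")
```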
The line that best fits a collection of X-Y data
points is the line that minimises the sum of
squared vertical distances between the data
points and the line.
This is known as the least squares line or
fitted regression equation.
The fitted line has the form:
$$
\hat{Y} = a + bX
$$
where a is the intercept and b is the slope.
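As an illustration, the least squares line for the weekly sales data can be computed directly from the column totals. This sketch assumes the standard least squares formulas for the slope b and intercept a, with selling price as X and weekly sales as Y; the function name predict is hypothetical.

```python
# Column totals from the weekly sales table (n = 10 weeks).
n = 10
sum_x = 14.40    # sum of selling prices
sum_y = 112      # sum of weekly sales
sum_xy = 149.3   # sum of price * sales
sum_x2 = 21.56   # sum of squared prices

# Least squares slope and intercept (assumed standard formulas).
b = (sum_xy - sum_x * sum_y / n) / (sum_x2 - sum_x ** 2 / n)
a = sum_y / n - b * sum_x / n


def predict(price):
    """Predicted weekly sales (1000s of gallons) at a given price."""
    return a + b * price


print(f"fitted line: Y = {a:.3f} + ({b:.3f}) X")
print(f"predicted sales at $1.30: {predict(1.30):.2f}")
```

The slope comes out negative (about -14.54), meaning the fitted line predicts lower sales at higher prices, with an intercept of about 32.14.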
Calculation of Residuals (SST)
Residuals with Predicted Data
Testing Validity of the Model
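In simple linear regression, the validity of the
model is tested with the coefficient of
determination.
A high coefficient of determination (close to 1)
indicates that the model fits the data well.
We can also check the validity of the model by
comparing sums of squares: if SSR > SSE, the model
explains more of the variation than it leaves
unexplained, which is equivalent to a coefficient
of determination above 0.5.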
Coefficient of Determination
The coefficient of determination measures the
proportion of variability in Y (often quoted as a
percentage) that can be explained through
knowledge of the variability in the independent
variable X.
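A minimal sketch of this calculation for the weekly sales data follows. It assumes the usual definitions: SST is the total variation of Y about its mean, SSR is the variation explained by the fitted line, SSE is the residual variation, and the coefficient of determination is SSR / SST. Variable names are illustrative.

```python
# Weekly sales in 1000s of gallons (Y) and selling price in dollars (X).
sales = [10, 6, 5, 12, 10, 15, 5, 12, 17, 20]
price = [1.30, 2.00, 1.70, 1.50, 1.60, 1.20, 1.60, 1.40, 1.00, 1.10]
n = len(sales)
mean_x = sum(price) / n
mean_y = sum(sales) / n

# Least squares slope and intercept.
s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(price, sales))
s_xx = sum((x - mean_x) ** 2 for x in price)
b = s_xy / s_xx
a = mean_y - b * mean_x

# Predicted sales for each observed price.
predicted = [a + b * x for x in price]

# Decomposition of the variation in Y.
sst = sum((y - mean_y) ** 2 for y in sales)                  # total
ssr = sum((p - mean_y) ** 2 for p in predicted)              # explained
sse = sum((y - p) ** 2 for y, p in zip(sales, predicted))    # unexplained

r_squared = ssr / sst

print(f"SST = {sst:.2f}, SSR = {ssr:.2f}, SSE = {sse:.2f}")
print(f"coefficient of determination = {r_squared:.4f}")
print(f"SSR > SSE: {ssr > sse}")
```

For these data SST is 233.6, SSR is about 174.2 and SSE is about 59.4, giving a coefficient of determination of about 0.75, so SSR exceeds SSE.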