Linear Regression
Empirical Model - Example Plot
where the slope and intercept of the line are called regression
coefficients.
Method of Least Squares
• Suppose that we have n pairs of observations (x1,
y1), (x2, y2), …, (xn, yn). The method of least squares
is used to estimate the parameters, β0 and β1, by
minimizing the sum of the squares of the vertical
deviations.
Figure 11-3: Deviations of the data from the estimated regression model.
Least Squares Normal Equations
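As a sketch of the standard derivation (the symbol L for the least squares criterion is an assumed name, not taken from the slides): the criterion is differentiated with respect to each parameter and the derivatives are set to zero, which gives two linear equations in the estimates.

```latex
L(\beta_0, \beta_1) = \sum_{i=1}^{n} \left( y_i - \beta_0 - \beta_1 x_i \right)^2

% Setting \partial L / \partial \beta_0 = 0 and \partial L / \partial \beta_1 = 0
% at (\hat{\beta}_0, \hat{\beta}_1) gives the normal equations:
n \hat{\beta}_0 + \hat{\beta}_1 \sum_{i=1}^{n} x_i = \sum_{i=1}^{n} y_i

\hat{\beta}_0 \sum_{i=1}^{n} x_i + \hat{\beta}_1 \sum_{i=1}^{n} x_i^2 = \sum_{i=1}^{n} x_i y_i
```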
Fitted Regression Line
Sums of Squares
The following notation may also be used:
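$$S_{xx} = \sum_{i=1}^{n} (x_i - \bar{x})^2 = \sum_{i=1}^{n} x_i^2 - \frac{\left( \sum_{i=1}^{n} x_i \right)^2}{n} \tag{11-10}$$

$$S_{xy} = \sum_{i=1}^{n} (y_i - \bar{y})(x_i - \bar{x}) = \sum_{i=1}^{n} x_i y_i - \frac{\left( \sum_{i=1}^{n} x_i \right) \left( \sum_{i=1}^{n} y_i \right)}{n} \tag{11-11}$$

Then,

$$\hat{\beta}_1 = \frac{S_{xy}}{S_{xx}} \quad \text{and} \quad \hat{\beta}_0 = \bar{y} - \hat{\beta}_1 \bar{x}$$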
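The computational forms of Equations 11-10 and 11-11 translate directly into code. The sketch below is illustrative only: the function name `least_squares_fit` and the four data points are hypothetical, chosen to lie exactly on y = 1 + 2x so the result is easy to check by hand.

```python
def least_squares_fit(x, y):
    """Return (beta0_hat, beta1_hat, s_xx, s_xy) for simple linear regression."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    sum_x = sum(x)
    sum_y = sum(y)
    # Computational forms of Equations 11-10 and 11-11
    s_xx = sum(xi * xi for xi in x) - sum_x ** 2 / n
    s_xy = sum(xi * yi for xi, yi in zip(x, y)) - sum_x * sum_y / n
    beta1_hat = s_xy / s_xx
    beta0_hat = sum_y / n - beta1_hat * sum_x / n
    return beta0_hat, beta1_hat, s_xx, s_xy


# Hypothetical data lying exactly on y = 1 + 2x
x = [1.0, 2.0, 3.0, 4.0]
y = [3.0, 5.0, 7.0, 9.0]
beta0_hat, beta1_hat, s_xx, s_xy = least_squares_fit(x, y)
print(beta0_hat, beta1_hat)  # intercept 1.0 and slope 2.0
```

Here S_xx = 5 and S_xy = 10, so the slope is 10/5 = 2 and the intercept is 6 − 2(2.5) = 1.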
Simple Linear Regression - Example
Example 11-1
Example 11-1 (continued)
Computing σ²
The error sum of squares is:
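In symbols, SSE is the sum of the squared residuals, and the usual unbiased estimate of σ² divides it by n − 2. The sketch below only does the arithmetic, using SSE = 21.25 and n = 20 as they appear in the residual row of the oxygen purity ANOVA output; the variable names are illustrative.

```python
sse = 21.25  # error (residual) sum of squares for the oxygen purity model
n = 20       # number of observations

# Two parameters (intercept and slope) are estimated, leaving n - 2 degrees of freedom
sigma2_hat = sse / (n - 2)
print(round(sigma2_hat, 3))  # 1.181
```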
Regression Statistics
Multiple R           0.937
R Square             0.877
Adjusted R Square    0.871
Standard Error       1.087
Observations         20

ANOVA
              df    SS         MS         F          Significance F
Regression     1    152.127    152.127    128.862    0.000
Residual      18     21.250      1.181
Total         19    173.377
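The entries of this output are tied together by a few simple relationships: each mean square is a sum of squares divided by its degrees of freedom, F is the ratio of the two mean squares, and the reported standard error is the square root of the residual mean square. A minimal check, assuming only the rounded values printed in the table (variable names are illustrative):

```python
import math

ss_regression, df_regression = 152.127, 1
ss_residual, df_residual = 21.250, 18

ms_regression = ss_regression / df_regression  # MSR
ms_residual = ss_residual / df_residual        # MSE, the estimate of sigma^2
f_statistic = ms_regression / ms_residual      # F = MSR / MSE
standard_error = math.sqrt(ms_residual)        # "Standard Error" in the output

print(round(ms_residual, 3), round(standard_error, 3), round(f_statistic, 2))
```

The recomputed F differs from the tabulated 128.862 in the third decimal place because the inputs here are already rounded.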
Properties of Least Squares Estimators
(11-16)
(11-17)
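$$se(\hat{\beta}_1) = \sqrt{\frac{\hat{\sigma}^2}{S_{xx}}} \quad \text{and} \quad se(\hat{\beta}_0) = \sqrt{\hat{\sigma}^2 \left[ \frac{1}{n} + \frac{\bar{x}^2}{S_{xx}} \right]}$$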
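The two estimated standard errors need only the variance estimate, S_xx, the sample size, and the mean of x. The helper below is a sketch: `standard_errors` is a hypothetical name and the inputs are made-up numbers chosen so the arithmetic is easy to follow.

```python
import math


def standard_errors(sigma2_hat, s_xx, n, x_bar):
    """Return (se_beta0, se_beta1), the estimated standard errors of intercept and slope."""
    se_beta1 = math.sqrt(sigma2_hat / s_xx)
    se_beta0 = math.sqrt(sigma2_hat * (1.0 / n + x_bar ** 2 / s_xx))
    return se_beta0, se_beta1


# Hypothetical inputs, not taken from the oxygen purity example
se_beta0, se_beta1 = standard_errors(sigma2_hat=4.0, s_xx=16.0, n=4, x_bar=2.0)
print(se_beta0, se_beta1)
```

With these inputs se(β̂1) = sqrt(4/16) = 0.5 and se(β̂0) = sqrt(4 × (1/4 + 4/16)) = sqrt(2).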
Hypothesis Test for the Slope
If we wish to test whether the slope equals some specified value β1,0:
(11-18)
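The usual statistic for this test divides the difference between the slope estimate and the hypothesized value by the estimated standard error of the slope, and is compared with a t distribution on n − 2 degrees of freedom. A minimal sketch, in which `slope_t_statistic` and all numeric inputs are hypothetical:

```python
import math


def slope_t_statistic(beta1_hat, beta1_null, sigma2_hat, s_xx):
    """t statistic for H0: beta1 = beta1_null, referred to t with n - 2 degrees of freedom."""
    se_beta1 = math.sqrt(sigma2_hat / s_xx)
    return (beta1_hat - beta1_null) / se_beta1


# Hypothetical values: slope estimate 2.5, null value 2.0, sigma^2 estimate 1.0, S_xx = 4.0
t0 = slope_t_statistic(beta1_hat=2.5, beta1_null=2.0, sigma2_hat=1.0, s_xx=4.0)
print(t0)  # 1.0
```

H0 would be rejected when |t0| exceeds the upper α/2 point of the t distribution with n − 2 degrees of freedom.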
(11-21)
Significance of Regression
An important special case of these hypotheses is:
(11-23)
The ANOVA Table
The quantities MSR and MSE are called mean squares of
the regression and the errors, respectively.
Analysis of variance (ANOVA) table:
SSR = β̂1 Sxy = (14.947)(10.17744) = 152.13
SSE = 21.25
Equivalence of t-tests and ANOVA
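For the test of H0: β1 = 0, the square of the t statistic equals the ANOVA F statistic. The sketch below checks this numerically with the oxygen purity values quoted in the ANOVA computation (slope 14.947, S_xy = 10.17744, SSE = 21.25, n = 20). S_xx is not quoted there, so it is recovered here from β̂1 = S_xy/S_xx, which is an assumption of this sketch.

```python
import math

beta1_hat = 14.947   # slope estimate used in the SSR computation
s_xy = 10.17744      # S_xy from the same computation
sse, n = 21.25, 20   # residual sum of squares and sample size

s_xx = s_xy / beta1_hat                 # assumption: recovered from beta1_hat = S_xy / S_xx
mse = sse / (n - 2)                     # estimate of sigma^2
t0 = beta1_hat / math.sqrt(mse / s_xx)  # t statistic for H0: beta1 = 0
f0 = (beta1_hat * s_xy) / mse           # F = MSR / MSE, with SSR = beta1_hat * S_xy

print(round(t0, 2), round(t0 ** 2, 2), round(f0, 2))
```

Both t0² and F0 come to about 128.86, in line with the F value of 128.862 in the regression output (the small gap is rounding in the inputs).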
Confidence Intervals on Regression Model Parameters
The following state the confidence intervals for the slope
and intercept of a regression model.
Example 11–4 (Confidence Interval on the Slope)
12.181 ≤ β1 ≤ 17.713
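This interval can be reproduced as β̂1 ± t(0.025, 18) · se(β̂1). The sketch below assumes the oxygen purity values quoted elsewhere in these slides (slope 14.947, S_xy = 10.17744, SSE = 21.25, n = 20), a table value of 2.101 for t(0.025, 18), and S_xx recovered from β̂1 = S_xy/S_xx.

```python
import math

beta1_hat = 14.947   # slope estimate for the oxygen purity model
s_xy = 10.17744      # S_xy for the same data
sse, n = 21.25, 20   # residual sum of squares and sample size
t_crit = 2.101       # t_{0.025, 18}, taken from a t table

s_xx = s_xy / beta1_hat                       # assumption: recovered from beta1_hat = S_xy / S_xx
se_beta1 = math.sqrt((sse / (n - 2)) / s_xx)  # estimated standard error of the slope
lower = beta1_hat - t_crit * se_beta1
upper = beta1_hat + t_crit * se_beta1
print(round(lower, 2), round(upper, 2))
```

The limits agree with 12.181 and 17.713 to within rounding of the inputs.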
Confidence Interval on the Mean Response
The point estimate of the mean response at a given x = x0 is:

$$\hat{\mu}_{Y|x_0} = \hat{\beta}_0 + \hat{\beta}_1 x_0$$
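The standard interval around this point estimate has half-width t · sqrt(σ̂² [1/n + (x0 − x̄)²/S_xx]), with t the upper α/2 point on n − 2 degrees of freedom. A minimal sketch follows; the function name and every numeric input are hypothetical (they are not the oxygen purity values), and 4.303 is the table value of t(0.025, 2) for n = 4.

```python
import math


def mean_response_interval(x0, beta0_hat, beta1_hat, sigma2_hat, n, x_bar, s_xx, t_crit):
    """Confidence interval for the mean response at x0.

    t_crit is the upper alpha/2 point of the t distribution with n - 2 degrees of freedom.
    """
    mu_hat = beta0_hat + beta1_hat * x0
    half_width = t_crit * math.sqrt(sigma2_hat * (1.0 / n + (x0 - x_bar) ** 2 / s_xx))
    return mu_hat - half_width, mu_hat + half_width


# Hypothetical fitted model and summary statistics
lower, upper = mean_response_interval(
    x0=3.0, beta0_hat=1.0, beta1_hat=2.0,
    sigma2_hat=4.0, n=4, x_bar=3.0, s_xx=16.0, t_crit=4.303,
)
print(lower, upper)
```

Because x0 equals x̄ here, the second term under the square root vanishes and the interval is at its narrowest; it widens as x0 moves away from x̄.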
Example 11–5 (Confidence Interval on the Mean Response)
Example 11–5 (continued)
Figure 11-7: Scatter diagram of oxygen purity data from Example 11-1 with fitted regression line and 95% confidence limits on μY|x0.
Example 11–6 (Prediction Interval)
Example 11–6 (continued)
Figure 11-8: Scatter diagram of oxygen purity data from Example 11-1 with fitted regression line, 95% prediction limits (outer lines), and 95% confidence limits on μY|x0.
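The prediction limits lie outside the confidence limits because a prediction interval for a single new observation adds the observation's own variance: its half-width uses 1 + 1/n + (x0 − x̄)²/S_xx where the confidence interval uses 1/n + (x0 − x̄)²/S_xx. The sketch below compares the two half-widths; the function name and all numbers are hypothetical, not the oxygen purity values.

```python
import math


def interval_half_widths(x0, sigma2_hat, n, x_bar, s_xx, t_crit):
    """Return (confidence_half_width, prediction_half_width) at x0."""
    leverage = 1.0 / n + (x0 - x_bar) ** 2 / s_xx
    ci_half = t_crit * math.sqrt(sigma2_hat * leverage)
    # The extra "1 +" accounts for the variability of a single new observation
    pi_half = t_crit * math.sqrt(sigma2_hat * (1.0 + leverage))
    return ci_half, pi_half


# Hypothetical summary statistics; 4.303 is the table value of t_{0.025, 2}
for x0 in (3.0, 5.0):
    ci_half, pi_half = interval_half_widths(
        x0, sigma2_hat=4.0, n=4, x_bar=3.0, s_xx=16.0, t_crit=4.303
    )
    print(x0, round(ci_half, 3), round(pi_half, 3))
```

At every x0 the prediction half-width exceeds the confidence half-width, and both grow as x0 moves away from x̄.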
Residual Plots
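Residual plots start from the residuals e_i = y_i − ŷ_i, often scaled by sqrt(σ̂²) so that they can be judged against roughly ±2. A minimal sketch of computing both, with no plotting: the function names and the data are hypothetical, and the data were chosen so that ŷ = 1 + 2x is exactly their least squares line.

```python
import math


def residuals(x, y, beta0_hat, beta1_hat):
    """Return residuals e_i = y_i - (beta0_hat + beta1_hat * x_i)."""
    return [yi - (beta0_hat + beta1_hat * xi) for xi, yi in zip(x, y)]


def standardized_residuals(e):
    """Return d_i = e_i / sqrt(sigma2_hat), with sigma2_hat = SSE / (n - 2)."""
    sigma2_hat = sum(ei ** 2 for ei in e) / (len(e) - 2)
    return [ei / math.sqrt(sigma2_hat) for ei in e]


# Hypothetical data whose least squares line is y = 1 + 2x
x = [1.0, 2.0, 3.0, 4.0]
y = [3.1, 4.8, 7.1, 9.0]
e = residuals(x, y, beta0_hat=1.0, beta1_hat=2.0)
d = standardized_residuals(e)
print([round(v, 2) for v in d])
```

Plotting d against the fitted values or against x, and looking for curvature, funnels, or isolated large values, is the usual next step.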
Residual Analysis - Example
Example 11-7
Example 11-7 (continued)
Coefficient of Determination (R²)
• The quantity
R² = SSR/SST = 1 − SSE/SST (11-34)
is called the coefficient of determination.
R² Computations - Example
• For the oxygen purity regression model,
R² = SSR/SST = 152.13/173.38 = 0.877
• Thus, the model accounts for 87.7% of the
variability in the data.
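The same arithmetic can be checked in a few lines. The sketch uses the rounded sums of squares quoted for the oxygen purity model (SSR = 152.13, SSE = 21.25, SST = 173.38, n = 20). The adjusted R² formula shown is the standard one for a single regressor and is included here as an assumption, since the slides only report its value.

```python
import math

ss_regression = 152.13
ss_residual = 21.25
ss_total = 173.38
n = 20

r_squared = ss_regression / ss_total        # R^2 = SSR / SST
r_squared_alt = 1 - ss_residual / ss_total  # equivalent form, 1 - SSE / SST
multiple_r = math.sqrt(r_squared)           # "Multiple R" in the regression output
adjusted_r_squared = 1 - (1 - r_squared) * (n - 1) / (n - 2)

print(round(r_squared, 3), round(multiple_r, 3), round(adjusted_r_squared, 3))
```

The three printed values match the R Square, Multiple R, and Adjusted R Square entries of the regression output (0.877, 0.937, 0.871).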
Regression on Transformed Variables
In many cases a plot of the dependent variable, y,
against the independent variable, x, may show that the
relationship is not linear.
Performing a linear regression would lead to a poor
fit, and residual analysis would show that the model is
inadequate.
However, we can often transform the independent
variable first. This transformed variable, x', may
have a linear relationship with y.
Example 11-9
An engineer has collected data on the DC output from a windmill under different wind speed conditions. He wishes to develop a model describing output in terms of wind speed.
The table below shows the data collected for output, y, as the response and wind speed, x, as the independent variable. The final column shows the transformed variable, x' = 1/x.

Obs.   Output (y)   Velocity (x)   x'=1/x
 1     1.582         5.00          0.200
 2     1.822         6.00          0.167
 3     1.057         3.40          0.294
 4     0.500         2.70          0.370
 5     2.236        10.00          0.100
 6     2.386         9.70          0.103
 7     2.294         9.55          0.105
 8     0.558         3.05          0.328
 9     2.166         8.15          0.123
10     1.866         6.20          0.161
11     0.653         2.90          0.345
12     1.930         6.35          0.157
13     1.562         4.60          0.217
14     1.737         5.80          0.172
15     2.088         7.40          0.135
16     1.137         3.60          0.278
17     2.179         7.85          0.127
18     2.112         8.80          0.114
19     1.800         7.00          0.143
20     1.501         5.45          0.183
21     2.303         9.10          0.110
22     2.310        10.20          0.098
23     1.194         4.10          0.244
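To see what the transformation buys, the sketch below fits a straight line to the 23 observations listed for this example twice, once against x and once against x' = 1/x, and compares R². The helper name `fit_and_r2` is hypothetical, and x' is recomputed from x, not read from the rounded column.

```python
def fit_and_r2(x, y):
    """Least squares fit of y on x; returns (beta0_hat, beta1_hat, r_squared)."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    ss_total = sum((yi - y_bar) ** 2 for yi in y)
    beta1_hat = s_xy / s_xx
    beta0_hat = y_bar - beta1_hat * x_bar
    ss_error = sum((yi - beta0_hat - beta1_hat * xi) ** 2 for xi, yi in zip(x, y))
    return beta0_hat, beta1_hat, 1 - ss_error / ss_total


# Windmill data as listed in the table for this example (23 observations)
velocity = [5.00, 6.00, 3.40, 2.70, 10.00, 9.70, 9.55, 3.05, 8.15, 6.20, 2.90, 6.35,
            4.60, 5.80, 7.40, 3.60, 7.85, 8.80, 7.00, 5.45, 9.10, 10.20, 4.10]
output = [1.582, 1.822, 1.057, 0.500, 2.236, 2.386, 2.294, 0.558, 2.166, 1.866, 0.653,
          1.930, 1.562, 1.737, 2.088, 1.137, 2.179, 2.112, 1.800, 1.501, 2.303, 2.310,
          1.194]

inverse_velocity = [1.0 / v for v in velocity]  # transformed regressor x' = 1/x

_, slope_original, r2_original = fit_and_r2(velocity, output)
_, slope_transformed, r2_transformed = fit_and_r2(inverse_velocity, output)
print(round(r2_original, 3), round(r2_transformed, 3))
```

The fit on 1/x gives the higher R², and its slope is negative because output rises as 1/x falls.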
[Figure: two scatter plots of the windmill data. Original: DC Output versus Wind Velocity, x. Transformed: DC Output versus Transformed Wind Velocity, 1/x.]