Model Building
Statistics
Chap 15-1
Linear vs. Nonlinear Fit
[Figure: two panels, each showing Y versus X above a plot of the residuals versus X.]
Linear fit does not give random residuals.
Nonlinear fit gives random residuals.
Nonlinear Relationships
Model form:
$Y_i = \beta_0 + \beta_1 X_{1i} + \beta_2 X_{1i}^2 + \varepsilon_i$
where:
β0 = Y intercept
β1 = regression coefficient for linear effect of X on Y
β2 = regression coefficient for quadratic effect on Y
εi = random error in Y for observation i
Quadratic Regression Model
$Y_i = \beta_0 + \beta_1 X_{1i} + \beta_2 X_{1i}^2 + \varepsilon_i$
Quadratic models may be considered when the scatter
plot takes on one of the following shapes:
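[Figure: four panels of Y versus X1, one for each sign combination of the coefficients.]
Panel 1: β1 > 0, β2 > 0
Panel 2: β1 > 0, β2 < 0
Panel 3: β1 < 0, β2 > 0
Panel 4: β1 < 0, β2 < 0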
β1 = the coefficient of the linear term
β2 = the coefficient of the squared term
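The quadratic model is still linear in its coefficients, so it can be fitted by ordinary least squares once a squared column is added to the design matrix. The following is a minimal sketch in plain Python, not the method used to produce any output in these slides. The function names `solve_linear_system` and `fit_quadratic` are hypothetical, and the data are illustrative values generated exactly from Y = 2 + 3X + 0.5X^2.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Augmented matrix, so each row operation updates both sides.
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        # Swap in the row with the largest pivot for numerical stability.
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    # Back substitution.
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def fit_quadratic(xs, ys):
    """Least-squares estimates [b0, b1, b2] for Y = b0 + b1*X + b2*X^2."""
    # Design matrix columns: 1, X, X^2.
    rows = [[1.0, x, x * x] for x in xs]
    # Normal equations: (X'X) b = X'y.
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    xty = [sum(r[i] * y for r, y in zip(rows, ys)) for i in range(3)]
    return solve_linear_system(xtx, xty)


# Illustrative, noise-free data (an assumption, not from the slides).
xs = [0, 1, 2, 3, 4, 5]
ys = [2 + 3 * x + 0.5 * x * x for x in xs]
b0, b1, b2 = fit_quadratic(xs, ys)
print(b0, b1, b2)
```

Because the illustrative data contain no random error, the estimates should reproduce the generating coefficients up to floating-point rounding. With real data the estimates will differ from the true β values, which is why the significance test for β2 is needed.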
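Testing for Significance:
Quadratic Effect
Quadratic regression equation:
$\hat{Y}_i = b_0 + b_1 X_{1i} + b_2 X_{1i}^2$
Hypotheses
$H_0: \beta_2 = 0$ (no quadratic effect)
$H_1: \beta_2 \neq 0$ (a quadratic effect is present)
The test statistic is
$t_{STAT} = \dfrac{b_2 - \beta_2}{S_{b_2}}$, with d.f. = n − 3
where:
b2 = squared term slope coefficient
β2 = hypothesized slope (zero)
Sb2 = standard error of the slope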
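As a worked check of the test statistic, the sketch below applies the formula to the Time-squared coefficient (0.24516) and standard error (0.03258) printed in the Excel output of the purity example. The function names are hypothetical. The critical value is not computed here; it would come from a t table with n − 3 degrees of freedom.

```python
def quadratic_t_stat(b2, s_b2, beta2_hypothesized=0.0):
    """t statistic for the squared term: (b2 - beta2) / S_b2."""
    return (b2 - beta2_hypothesized) / s_b2


def degrees_of_freedom(n):
    """d.f. = n - 3: n observations minus three estimated coefficients."""
    return n - 3


# Coefficient and standard error as printed in the quadratic regression output.
t_stat = quadratic_t_stat(b2=0.24516, s_b2=0.03258)
print(f"t STAT = {t_stat:.4f}")
```

The result is close to the reported t Stat of 7.52406. The small difference presumably comes from the printed inputs being rounded to five decimal places.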
Testing the Quadratic Effect
Purity   Time
8        3
15       5
22       7
33       8
40       10
54       12
67       13
70       14
78       15
85       15
87       16
99       17
[Figure: scatter plot of Purity (vertical axis, 0 to 100) versus Time (horizontal axis, 0 to 20).]
Example: Quadratic Model
(continued)
Simple regression results:
$\hat{Y} = -11.283 + 5.985\,\text{Time}$

            Coefficients   Standard Error   t Stat     P-value
Intercept   -11.28267      3.46805          -3.25332   0.00691
Time        5.98520        0.30966          19.32819   2.078E-10

Adjusted R Square   0.96628
Standard Error      6.15997

The t statistic and r2 are both high, but the residuals are not random:
[Figure: residuals versus Time for the simple regression.]
Example: Quadratic Model in Excel & Minitab
(continued)
Quadratic regression results:
$\hat{Y} = 1.539 + 1.565\,\text{Time} + 0.245\,(\text{Time})^2$

Excel
               Coefficients   Standard Error   t Stat    P-value
Intercept      1.53870        2.24465          0.68550   0.50722
Time           1.56496        0.60179          2.60052   0.02467
Time-squared   0.24516        0.03258          7.52406   1.165E-05

Minitab
The regression equation is
Purity = 1.54 + 1.56 Time + 0.245 Time Squared

Predictor      Coef      SE Coef   T      P
Constant       1.5390    2.24500   0.69   0.507
Time           1.5650    0.60180   2.60   0.025
Time Squared   0.24516   0.03258   7.52   0.000
(continued)
Quadratic regression results:
$\hat{Y} = 1.539 + 1.565\,\text{Time} + 0.245\,(\text{Time})^2$
The adjusted r2 of the quadratic model is higher than the adjusted r2 of the
simple regression model. The quadratic model explains 99.4% of the
variation in Y.
Example: Quadratic Model Residual Plots
(continued)
Quadratic regression results:
$\hat{Y} = 1.539 + 1.565\,\text{Time} + 0.245\,(\text{Time})^2$
[Figure: two residual plots for the quadratic model. Left: residuals versus Time. Right: residuals versus Time-squared.]
The residuals plotted versus both Time and Time-squared show a random
pattern.
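The contrast between the two residual patterns can be checked numerically. The sketch below is a rough illustration: it plugs the (Purity, Time) pairs listed in the example into the two reported regression equations, using the rounded coefficients as printed, and lists the residuals side by side. The function names are hypothetical.

```python
# (Purity, Time) pairs as listed in the example data.
observations = [
    (8, 3), (15, 5), (22, 7), (33, 8), (40, 10), (54, 12),
    (67, 13), (70, 14), (78, 15), (85, 15), (87, 16), (99, 17),
]


def linear_prediction(time):
    """Reported simple regression equation."""
    return -11.283 + 5.985 * time


def quadratic_prediction(time):
    """Reported quadratic regression equation."""
    return 1.539 + 1.565 * time + 0.245 * time ** 2


def residuals(predict):
    """Observed Purity minus predicted Purity for every listed observation."""
    return [purity - predict(time) for purity, time in observations]


linear_resid = residuals(linear_prediction)
quadratic_resid = residuals(quadratic_prediction)

for (purity, time), e_lin, e_quad in zip(observations, linear_resid, quadratic_resid):
    print(f"Time={time:2d}  linear={e_lin:7.3f}  quadratic={e_quad:7.3f}")
```

For the listed observations, the linear residuals are negative for every Time from 5 through 12 and positive at the largest Times, which is the systematic curve seen in the first residual plot. The quadratic residuals change sign much more often, consistent with a random pattern.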
Collinearity
(continued)
$VIF_j = \dfrac{1}{1 - R_j^2}$
Regression Analysis: Price and all other X
Output for the pie sales example:

Regression Statistics
Multiple R          0.030438
R Square            0.000926
Adjusted R Square   -0.075925
Standard Error      1.21527
Observations        15
VIF                 1.000927

VIF is < 5.
There is no evidence of collinearity between Price and Advertising.
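The VIF in this output follows directly from the R Square value. A minimal sketch of the arithmetic, with hypothetical function names and the cutoff of 5 taken from the slide:

```python
def variance_inflation_factor(r_squared_j):
    """VIF_j = 1 / (1 - R_j^2), where R_j^2 comes from regressing X_j on the other X variables."""
    if not 0.0 <= r_squared_j < 1.0:
        raise ValueError("R-squared must be in [0, 1)")
    return 1.0 / (1.0 - r_squared_j)


def shows_collinearity(vif, threshold=5.0):
    """Flag collinearity when the VIF exceeds the cutoff used on the slide."""
    return vif > threshold


# R Square from the regression of Price on all other X (pie sales example).
vif_price = variance_inflation_factor(0.000926)
print(f"VIF = {vif_price:.6f}, collinearity flagged: {shows_collinearity(vif_price)}")
```

With R Square = 0.000926 the result agrees with the reported VIF of 1.000927 to six decimal places, well below the cutoff of 5.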