
Part III

Nonlinear Regression

Chapter 13

Introduction to Nonlinear Regression

This chapter introduces nonlinear regression models and contrasts them with the linear regression models studied so far.

13.1 Linear and Nonlinear Regression Models

We compare linear regression models,

Yi = f (Xi, β) + εi
   = X′i β + εi
   = β0 + β1 Xi1 + · · · + βp−1 Xi,p−1 + εi

with nonlinear regression models,

Yi = f (Xi , γ) + εi

where f is a nonlinear function of the parameters γ, and

Xi = (Xi1, . . . , Xiq)′ (q × 1),    γ = (γ0, . . . , γp−1)′ (p × 1)

In both the linear and nonlinear cases, the error terms εi are often (but not always) independent normal random variables with constant variance. The expected value in the linear case is

E{Yi} = f (Xi, β) = X′i β

and in the nonlinear case,

E{Yi} = f (Xi, γ)


Exercise 13.1 (Linear and Nonlinear Regression Models) Identify whether each of
the following regression models is linear, intrinsically linear (nonlinear, but easily
transformed into linear) or nonlinear.

1. Yi = β0 + β1 Xi1 + εi
This regression model is linear / intrinsically linear / nonlinear

2. Yi = β0 + β1 √Xi1 + εi
This regression model is linear / intrinsically linear / nonlinear
because

Yi = β0 + β1 √Xi1 + εi
   = β0 + β1 X′i1 + εi

where X′i1 = √Xi1

3. ln Yi = β0 + β1 √Xi1 + εi
This regression model is linear / intrinsically linear / nonlinear
because

ln Yi = β0 + β1 √Xi1 + εi
Y′i = β0 + β1 X′i1 + εi

where Y′i = ln Yi and X′i1 = √Xi1

4. Yi = γ0 + γ1 Xi1 + γ2 Xi2 + εi
This regression model[1] is linear / intrinsically linear / nonlinear

5. f (Xi, β) = β0 + β1 Xi1 + β2 Xi2

This regression model is linear / intrinsically linear / nonlinear

6. f (Xi, γ) = γ0 [exp(γ1 Xi)]

This regression model is linear / intrinsically linear / nonlinear
because

f (Xi, γ) = γ0 [exp(γ1 Xi)]
ln f (Xi, γ) = ln [γ0 [exp(γ1 Xi)]]
Y′i = ln γ0 + ln[exp(γ1 Xi)]
Y′i = γ′0 + γ1 Xi

where Y′i = ln Yi and γ′0 = ln γ0
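As a concrete illustration, an intrinsically linear model such as this one can be fit by ordinary least squares after the log transform (implicitly assuming multiplicative error). A minimal SAS sketch, using the illumination/reading data from Exercise 13.2 below (dataset and variable names are hypothetical, not from the text):

    * fit Y = g0*exp(g1*X) by regressing ln Y on X;
    data expdemo;
      input X Y @@;
      lY = log(Y);                       * Y' = ln Y;
    datalines;
    1 70  2 70  3 75  4 88  5 91
    6 94  7 100  8 92  9 90  10 85
    ;
    proc reg data=expdemo;
      model lY = X;                      * intercept estimates g0' = ln g0;
    run;

The estimated γ0 is then recovered as the exponential of the fitted intercept.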


[1] Traditionally, the "γi" parameters are reserved for nonlinear regression models and the "βi" parameters are reserved for linear regression models. But, of course, traditions are made to be broken.

7. f (Xi, γ) = γ0 + (γ1/(γ2 γ3)) Xi
This regression model is linear / intrinsically linear / nonlinear
even though

f (Xi, γ) = γ0 + (γ1/(γ2 γ3)) Xi
Yi = γ0 + γ′1 Xi + εi

where γ′1 = γ1/(γ2 γ3); that is, three parameters have been condensed into one.

13.2 Example

An interesting example in which a nonlinear exponential regression model is used to describe hospital data is given in this section of the text.

13.3 Least Squares Estimation in Nonlinear Regression

SAS programs: att12-13-3-read-bounded, att12-13-3-read-logistic

The least squares estimation of the linear regression model,

Y = Xβ + ε

involves minimizing the least squares criterion[2]

Q = Σ_{i=1}^{n} [Yi − f (Xi, β)]²

with respect to the linear regression parameters β0, β1, . . . , βp−1, and so gives the
following (analytically derived) estimators,

b = (X′X)⁻¹X′Y
[2] Another estimation method is the maximum likelihood method, which involves maximizing the likelihood of the distribution of the linear regression model with respect to the parameters. If the error terms, εi, are independent, identically distributed normal random variables, then the least squares and maximum likelihood methods give the same estimators.

In a similar way, the least squares estimation of the nonlinear regression model,

Yi = f (Xi, γ) + εi

involves minimizing the least squares criterion

Q = Σ_{i=1}^{n} [Yi − f (Xi, γ)]²

with respect to the nonlinear regression parameters γ0, γ1, . . . , γp−1 but, often, it
is not possible to derive the estimators analytically. A numerical method[3] is often
required. We will use the Gauss–Newton method, as implemented in SAS, to derive
least squares estimates for various nonlinear regression models.
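For reference, each Gauss–Newton step linearizes f around the current estimate and solves the resulting linear least squares problem. A sketch of the update, in the notation used in Section 13.5 below:

g^(s+1) = g^(s) + (D′s Ds)⁻¹ D′s [Y − f (X, g^(s))]

where Ds is the n × p matrix of partial derivatives ∂f (Xi, γ)/∂γk evaluated at the current estimate g^(s), and Y − f (X, g^(s)) is the n × 1 vector of current residuals. Iteration starts from the initial values g^(0) and continues until the criterion Q stops decreasing appreciably.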

Exercise 13.2 (Least Squares Estimation in Nonlinear Regression)


illumination, X 1 2 3 4 5 6 7 8 9 10
ability to read, Y 70 70 75 88 91 94 100 92 90 85
Fit two different nonlinear regression models to these data. Since there is typically
no analytical solution to a nonlinear regression problem, a numerical procedure, the
Gauss–Newton method, is used to derive each of the fitted regressions below.
1. Simple Bounded Model, Yi = γ2 + (γ0 − γ2 ) exp(−γ1 Xi ) + εi

(a) Initial Estimates, Part 1
Assume the simple bounded model is initially given by

Yi = γ2 (1 − exp(−γ1 Xi)) + γ0 exp(−γ1 Xi) = 100(1 − exp(−Xi)) + 50 exp(−Xi)

In other words, the γ0, γ1 and γ2 parameters are initially estimated by
(choose one)
i. g0^(0) = 100, g1^(0) = −1, g2^(0) = 1
ii. g0^(0) = 50, g1^(0) = 1, g2^(0) = 100
iii. g0^(0) = 50, g1^(0) = −1, g2^(0) = 100
(b) Initial Estimates, Part 2
A sketch of the simple bounded function where

g0^(0) = 50, g1^(0) = 1, g2^(0) = 100

reveals that the upper bound of this function can be found at
(choose one) 1 / 50 / 100
[3] The text not only shows how it is not possible to analytically derive the estimators for the exponential regression model, but also gives a detailed step-by-step explanation of how to derive the estimators for this model using the Gauss–Newton numerical method.

Figure[4] 13.1 (Nonlinear Regression, Simple Bounded Model): sketch of reading ability versus illumination, rising from an intercept of 50 toward an upper bound of 100.

[4] Use -10, 10, 1, 0, 150, 10, 1 in your calculator.


In other words, the upper bound of the simple bounded function, parameter γ2,
which is estimated by g2^(0), is initially set at 100 because we know the
largest observed response is Y = 100 (when X = 7). In a similar way,
the other two parameters, the intercept (γ0) and rate of change (γ1) of the
function, are initially set to 50 and 1, respectively.
(c) Nonlinear Least Squares Estimated Regression
Using the starting values

g0^(0) = 50, g1^(0) = 1, g2^(0) = 100

SAS gives us

g0 = 52.2624, g1 = 0.4035, g2 = 93.6834

In other words, the best fitting simple bounded model,

Yi = γ2 (1 − exp(−γ1 Xi)) + γ0 exp(−γ1 Xi)

is given by (circle one)

Yi = 93.2731/(1 + 0.6994 exp(−0.5117Xi))
Yi = 93.6834(1 − exp(−0.4035Xi)) + 52.2624 exp(−0.4035Xi)
Yi = 84.3854 + 0.0509 exp(0.119Xi)
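The att12-13-3-read-bounded program itself is not reproduced in these notes; a minimal PROC NLIN sketch consistent with the estimates above (dataset and variable names are hypothetical) might look like:

    data read;
      input X Y @@;
    datalines;
    1 70  2 70  3 75  4 88  5 91
    6 94  7 100  8 92  9 90  10 85
    ;
    proc nlin data=read;                  * Gauss-Newton is the default method;
      parms g0=50 g1=1 g2=100;            * starting values from part (a);
      model Y = g2*(1 - exp(-g1*X)) + g0*exp(-g1*X);
    run;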
(d) Various Residual Plots
i. True / False
The nonlinear regression plot shows that the regression line is a fairly good fit to the data.
ii. True / False
The residual versus predicted plot indicates non-homogeneous variance in the error.

iii. True / False
The residual versus predictor (illumination) plot indicates non-homogeneous variance in the error.
iv. True / False
The normal probability plot of residuals is fairly linear, which indicates the error is normal.
On the basis of these residual plots, the simple bounded model is not a great fitting model but, certainly, will be shown to be a better fitting model to the data than the (to-be-analyzed) exponential model.
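One way these residual plots might be produced in SAS (a sketch using the same hypothetical names as above; the OUTPUT statement saves fitted values and residuals from PROC NLIN):

    proc nlin data=read;
      parms g0=50 g1=1 g2=100;
      model Y = g2*(1 - exp(-g1*X)) + g0*exp(-g1*X);
      output out=diag predicted=pred residual=resid;
    run;
    proc sgplot data=diag;                * residual versus predicted plot;
      scatter x=pred y=resid;
      refline 0 / axis=y;
    run;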

2. Logistic Model, Yi = γ0/(1 + γ1 exp(−γ2 Xi)) + εi

(a) Initial Estimates, Part 1
Assume the logistic model is initially given by

Yi = γ0/(1 + γ1 exp(−γ2 Xi)) = 100/(1 + exp(−Xi))

In other words, the γ0, γ1 and γ2 parameters are initially estimated by
(choose one)
i. g0^(0) = 100, g1^(0) = −1, g2^(0) = 1
ii. g0^(0) = 100, g1^(0) = 2, g2^(0) = 1
iii. g0^(0) = 100, g1^(0) = 1, g2^(0) = 1
(b) Initial Estimates, Part 2
A sketch of the logistic function where

g0^(0) = 100, g1^(0) = 1, g2^(0) = 1

reveals that the upper bound of this function can be found at
(choose one) 1 / 50 / 100
Figure 13.2 (Nonlinear Regression, Logistic Model): sketch of reading ability versus illumination over −10 ≤ X ≤ 10, an S-shaped curve rising to an upper bound of 100.



In other words, the upper bound of the logistic function, parameter γ0,
which is estimated by g0^(0), is initially set at 100 because we know the
largest observed response is Y = 100 (when X = 7). In a similar way,
the other two parameters, which control direction and rate of increase or
decrease of the function, are initially set to one (1) each.
(c) Nonlinear Least Squares Estimated Regression
Using the starting values

g0^(0) = 100, g1^(0) = 1, g2^(0) = 1

SAS gives us

g0 = 93.2731, g1 = 0.6994, g2 = −0.5117

In other words, the best fitting logistic model,

Yi = γ0/(1 + γ1 exp(γ2 Xi))

is given by (circle one)

Yi = 93.2731/(1 + 0.6994 exp(−0.5117Xi))
Yi = 73.677 exp(0.0266Xi)
Yi = 84.3854 + 0.0509 exp(0.119Xi)
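A minimal PROC NLIN sketch consistent with the att12-13-3-read-logistic program named earlier (hypothetical names again; note that the sign convention chosen for γ2 in the MODEL statement determines the sign of the reported estimate):

    proc nlin data=read;
      parms g0=100 g1=1 g2=1;
      model Y = g0 / (1 + g1*exp(g2*X));   * exp(+g2*X) convention, matching g2 = -0.5117;
    run;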
(d) Various Residual Plots
i. True / False
The nonlinear regression plot shows that the regression line is a fairly good fit to the data.
ii. True / False
The residual versus predicted plot indicates non-homogeneous variance in the error.
iii. True / False
The residual versus predictor (illumination) plot indicates non-homogeneous variance in the error.
iv. True / False
The normal probability plot of residuals is fairly linear, which indicates the error is normal.
On the basis of these residual plots, it seems the logistic model fits the data about as well as the simple bounded model.

13.4 Model Building and Diagnostics


SAS program: att12-13-4-read-nonlin-lof

It is often not easy to add predictor variables to, or delete them from, nonlinear regression models; in other words, it is hard to model build. However, it is possible to perform diagnostic tests to check for correlation or nonconstant error variance and to check for lack of fit.

Large-sample theory is applicable if the nonlinear regression is "linear enough" at each point of the regression. This is true if the iterative procedure for the estimation of the nonlinear regression converges quickly, if various diagnostic measures indicate it to be true, or if a bootstrap procedure indicates it to be true.

Exercise 13.3 (Model Building and Diagnostics)

illumination, X 1 2 3 4 5 6 7 8 9 9 10
ability to read, Y 70 70 75 88 91 94 100 92 90 92 85

According to the previous analysis, the logistic regression

Yi = γ0/(1 + γ1 exp(−γ2 Xi)) + εi

is the best of the nonlinear models that were considered. Consequently, conduct a
lack of fit test[5] for this nonlinear regression.

att12-13-4-read-nonlin-lof
1. Logistic Model, Yi = γ0/(1 + γ1 exp(−γ2 Xi)) + εi

(a) Statement.
The statement of the test is (check none, one or more):
i. H0: E{Y} = γ0/(1 + γ1 exp(−γ2 Xi)) versus H1: E{Y} > γ0/(1 + γ1 exp(−γ2 Xi)).
ii. H0: E{Y} = γ0/(1 + γ1 exp(−γ2 Xi)) versus H1: E{Y} < γ0/(1 + γ1 exp(−γ2 Xi)).
iii. H0: E{Y} = γ0/(1 + γ1 exp(−γ2 Xi)) versus H1: E{Y} ≠ γ0/(1 + γ1 exp(−γ2 Xi)).
[5] Notice how an additional data point, (9, 92), has been added to the data, so there are now two points at X = 9. Replication is necessary to conduct a lack of fit test.

(b) Test.
The test statistic[6] is

F* = [SSE(R) − SSE(F)]/(dfR − dfF) ÷ SSE(F)/dfF
   = (SSE − SSPE)/[(n − p) − (n − c)] ÷ SSPE/(n − c)
   = SSLF/(c − p) ÷ SSPE/(n − c)
   = (243.2 − 2)/7 ÷ 2/1 =

(circle one) 9.075 / 17.23 / 58.57.
(Here n = 11, there are c = 10 distinct levels of X, and the logistic model has p = 3 parameters, so c − p = 7 and n − c = 1.)
The critical value at α = 0.01, with 7 and 1 degrees of freedom, is
(circle one) 4.83 / 5.20 / 5928
(Use PRGM INVF ENTER 7 ENTER 1 ENTER 0.99 ENTER)
(c) Conclusion.
Since the test statistic, 17.23, is smaller than the critical value, 5928, we
(circle one) accept / reject the null hypothesis that the regression function is E{Y} = γ0/(1 + γ1 exp(−γ2 Xi)).

[6] From SAS, SSE = 243.2 from the "reduced" nonlinear model, Yi = γ0/(1 + γ1 exp(−γ2 Xi)) + εi, and SSPE = 2 from the "full" ANOVA model, Yij = µj + εij.
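As footnote [6] indicates, the two sums of squares come from two separate fits; a sketch of how they might be obtained in SAS (hypothetical dataset read11 holding the 11 observations above):

    proc nlin data=read11;                * reduced model: its error SS is SSE = 243.2;
      parms g0=100 g1=1 g2=1;
      model Y = g0 / (1 + g1*exp(-g2*X));
    run;
    proc glm data=read11;                 * full model Yij = mu_j + eps_ij: its error SS is SSPE = 2;
      class X;                            * one mean per distinct illumination level;
      model Y = X;
    run;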

13.5 Inferences about Nonlinear Regression Parameters

SAS program: att12-13-5-read-nonlin-CI,test

For linear regression models with normal error terms, the least squares (equivalently, maximum likelihood) estimators are, for any given sample size, normally distributed, unbiased and of minimum variance. Although this is not true for nonlinear regression models at any given sample size, it is approximately true for such models when the sample size is large.

Specifically, if the εi are independent N(0, σ²) in the nonlinear regression model

Yi = f (Xi, γ) + εi

then, for large n, approximately,

E{g} ≈ γ
s²{g} = MSE (D′D)⁻¹

where D is the n × p matrix of partial derivatives of f with respect to the parameters γ, evaluated at the final least squares estimates g.
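As a worked illustration (not from the text), the rows of D for the logistic mean function f (Xi, γ) = γ0/(1 + γ1 exp(−γ2 Xi)) contain, writing hi = 1 + γ1 exp(−γ2 Xi),

∂f/∂γ0 = 1/hi
∂f/∂γ1 = −γ0 exp(−γ2 Xi)/hi²
∂f/∂γ2 = γ0 γ1 Xi exp(−γ2 Xi)/hi²

each evaluated at the final estimates g.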

Large-sample theory is applicable if the nonlinear regression is "linear enough" at each point of the regression. This is true if the iterative procedure for the estimation of the nonlinear regression converges quickly, if various diagnostic measures indicate it to be true, or if a bootstrap procedure indicates it to be true.

Exercise 13.4 (Inferences about Nonlinear Regression Parameters)

illumination, X 1 2 3 4 5 6 7 8 9 10
ability to read, Y 70 70 75 88 91 94 100 92 90 85
According to the previous analysis, the logistic regression

Yi = γ0/(1 + γ1 exp(−γ2 Xi)) + εi

is the best of the nonlinear models that were considered. Consequently, determine
various intervals and conduct tests for this nonlinear regression. Assume large-sample
results hold in this case.

att12-13-4-read-nonlin-CI,test

1. 90% Confidence Interval
From SAS, the 90% confidence interval for γ0 is given by

g0 ± t(1 − α/2; n − p) s{g0} = 93.2731 ± t(1 − 0.10/2; 10 − 3)(4.0022)
                             = 93.2731 ± 1.8946(4.0022) =

(circle one) (85.7, 100.9) / (88.7, 98.9) / (90.7, 94.9).
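In SAS, PROC NLIN reports approximate confidence limits for the parameters directly; a sketch (the ALPHA= option sets the level of the parameter confidence limits, here 90%):

    proc nlin data=read alpha=0.1;
      parms g0=100 g1=1 g2=1;
      model Y = g0 / (1 + g1*exp(-g2*X));
    run;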

2. 90% Bonferroni Simultaneous Confidence Interval
The 90% Bonferroni simultaneous interval for γ0 (knowing that it is one of three
(m = 3) parameters, γ0, γ1 and γ2) is given by

g0 ± t(1 − α/2m; n − p) s{g0} = 93.2731 ± t(1 − 0.10/(2 · 3); 10 − 3)(4.0022)
                              = 93.2731 ± 2.642(4.0022) =

(circle one) (85.7, 100.9) / (82.7, 103.9) / (90.7, 94.9).
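The Bonferroni multiplier and interval can also be computed in a short SAS data step (a sketch using the estimate and standard error reported above):

    data _null_;
      m = 3;                                  * number of simultaneous intervals;
      t = tinv(1 - 0.10/(2*m), 10 - 3);       * Bonferroni t multiplier on 7 df;
      lower = 93.2731 - t*4.0022;             * g0 +/- t * s{g0};
      upper = 93.2731 + t*4.0022;
      put t= lower= upper=;
    run;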

3. Test of Single γk
Test if γ0 ≠ 93 at α = 0.05.

(a) Statement.
The statement of the test, in this case, is (circle one)
i. H0 : γ0 = 93 versus Ha : γ0 < 93
ii. H0 : γ0 = 93 versus Ha : γ0 > 93
iii. H0 : γ0 = 93 versus Ha : γ0 ≠ 93
(b) Test.
The standardized test statistic is

t* = (g0 − 93)/s{g0} = (93.2731 − 93)/4.0022 =

(circle one) 0.042 / 0.068 / 0.152.

The standardized critical value at α = 0.05 is
(circle one) −2.31 / −1.76 / 2.31
(Use PRGM ENTER INVT ENTER 8 ENTER 0.025 ENTER)
(c) Conclusion.
Since the test statistic, 0.068, is smaller than the critical value, 2.31, we
(circle one) accept / reject the null guess of γ0 = 93.
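The same test can be sketched in a SAS data step (here using n − p = 10 − 3 = 7 degrees of freedom; the calculator line above uses 8):

    data _null_;
      t = (93.2731 - 93)/4.0022;            * standardized test statistic;
      crit = tinv(1 - 0.05/2, 7);           * two-sided critical value;
      p = 2*(1 - probt(abs(t), 7));         * two-sided p-value;
      put t= crit= p=;
    run;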

13.6 Learning Curve Example
