Nonlinear Regression: What Is Nonlinear Model?

This document discusses nonlinear regression models. It defines nonlinearity as occurring when the dependent variable or independent variables have nonlinear relationships. Some examples of nonlinear dependent variables given are binary, nominal, count, and interval variables. The document discusses how to detect nonlinearity through theory, scatterplots, seasonality in data, and model fit. It also covers transforming nonlinear models into linear models using logarithms, powers, and other functions. Finally, it discusses estimating nonlinear models using nonlinear least squares regression in R, using the nls function and an example of fitting a logistic growth model to population data.

Uploaded by

nagatopein6

07/01/16

Nonlinear regression

What is a nonlinear model?


In general, nonlinearity in a regression model can arise in two ways:
1.  Nonlinearity of the dependent variable (y), for example when y
is binary (logit model), nominal (multinomial model), a count
(Poisson model), interval-valued (symbolic data regression), etc.
2.  Nonlinearity in the independent variables (x's): until now we
have considered only linear relations (dependence, connection)
between the dependent variable and the independent variables.
In this case we can have:
a)  functions (relations) that can be transformed to linear
functions,
b)  functions (relations) that cannot be transformed to linear
form.


How to detect nonlinearity?


1.  Theory: in many sciences there are theories about nonlinear
relations within some phenomenon. For example, in economics the
Laffer curve shows a nonlinear relation between taxation and the
hypothetical resulting levels of government revenue.
2.  Scatterplot: when looking at the plot you can see that the
data points do not lie on, or even near, a straight line.
3.  Seasonality in data: e.g. in agriculture or the building
industry we often have seasonality within the data.
4.  The estimated model fits the data poorly or not at all, or the
estimated β's are not significant; this might suggest nonlinearity.
5.  Formal tests: incremental F tests or Wald tests can often be
used.
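
Points 4 and 5 can be illustrated with a minimal sketch in R. The data are simulated (an assumption, not taken from the slides), and the quadratic term is only one possible nonlinear alternative; the incremental F test compares the linear model with the model that adds this term.

```r
# Simulated data with a curved relation (illustrative assumption)
set.seed(123)
x <- seq(1, 10, length.out = 50)
y <- 2 + 0.5 * x^2 + rnorm(50, sd = 2)

m_lin  <- lm(y ~ x)            # linear model
m_quad <- lm(y ~ x + I(x^2))   # same model plus a quadratic term

# Incremental F test: does the added term reduce the residual sum of squares
# by more than chance would explain?
f_test <- anova(m_lin, m_quad)
print(f_test)
p_value <- f_test[["Pr(>F)"]][2]
```

A small p-value here suggests that the linear specification misses curvature in the data.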

Linear or nonlinear? That is the question


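Transformation of nonlinear models to linear models
Exponential function:
$$\hat{y} = \beta_0 \beta_1^{x}$$
we have to use logarithms:
$$\ln \hat{y} = \ln(\beta_0 \beta_1^{x})$$
$$\ln \hat{y} = \ln \beta_0 + \ln \beta_1^{x}$$
$$\ln \hat{y} = \ln \beta_0 + x \ln \beta_1$$
we substitute $\ln \hat{y} = \hat{y}'$, $\ln \beta_0 = \beta_0'$, $\ln \beta_1 = \beta_1'$ and we
get the linear model:
$$\hat{y}' = \beta_0' + \beta_1' x$$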
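
As a minimal sketch of this transformation in R, the exponential model can be fitted with lm on the log scale and the coefficients back-transformed. The data are simulated with assumed values β0 = 2 and β1 = 1.5 and a multiplicative error, which is what makes the log transformation appropriate; none of these numbers come from the slides.

```r
# Simulated data from y = beta0 * beta1^x with multiplicative noise (assumption)
set.seed(1)
x <- 1:20
y <- 2 * 1.5^x * exp(rnorm(20, sd = 0.05))

# Linearized model: ln(y) = beta0' + beta1' * x
fit <- lm(log(y) ~ x)

# Back-transform: beta0 = exp(beta0'), beta1 = exp(beta1')
beta_hat <- exp(coef(fit))
names(beta_hat) <- c("beta0", "beta1")
print(beta_hat)
```

The back-transformed values should be close to the values used in the simulation.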

Logarithms:
$$\hat{y} = \beta_0 + \beta_1 \ln x$$
we substitute $\ln x = x'$ and we get the linear function:
$$\hat{y} = \beta_0 + \beta_1 x'$$

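Power function:
$$\hat{y} = \beta_0 x^{\beta_1}$$
we have to use logarithms:
$$\ln \hat{y} = \ln \beta_0 + \ln(x^{\beta_1})$$
$$\ln \hat{y} = \ln \beta_0 + \beta_1 \ln x$$
we substitute $\ln \hat{y} = \hat{y}'$, $\ln \beta_0 = \beta_0'$, $\ln x = x'$ and we get
the linear function:
$$\hat{y}' = \beta_0' + \beta_1 x'$$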

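Logistic function:
$$\hat{y} = \frac{\beta_0}{1 + \beta_1 e^{-x}}$$
where: β0 > 0, β1 > 1
we use the substitutions:
$$\beta_0' = \frac{1}{\beta_0}, \quad \beta_1' = \frac{\beta_1}{\beta_0}, \quad \hat{y}' = \frac{1}{\hat{y}}, \quad x' = e^{-x}$$
and we get the linear function:
$$\hat{y}' = \beta_0' + \beta_1' x'$$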
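
The reciprocal substitution can be tried out with a short R sketch. The data are simulated with assumed values β0 = 10 and β1 = 5 and a small multiplicative error; the variable names are chosen only for this illustration.

```r
# Simulated data from y = beta0 / (1 + beta1 * exp(-x)) (assumed values)
set.seed(7)
x <- seq(0, 5, by = 0.25)
y <- 10 / (1 + 5 * exp(-x)) * exp(rnorm(length(x), sd = 0.01))

# Linearized model: 1/y = beta0' + beta1' * exp(-x)
fit <- lm(I(1 / y) ~ I(exp(-x)))
b <- unname(coef(fit))

# Undo the substitutions: beta0 = 1 / beta0', beta1 = beta1' / beta0'
beta0_hat <- 1 / b[1]
beta1_hat <- b[2] / b[1]
print(c(beta0 = beta0_hat, beta1 = beta1_hat))
```

Note that least squares on the transformed scale minimizes errors in 1/y, not in y, so the estimates generally differ somewhat from a direct nonlinear fit.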

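Hyperbolic function:
$$\hat{y} = \beta_0 + \frac{\beta_1}{x} \quad (1) \qquad \hat{y} = \frac{\beta_0 x}{x + \beta_1} \quad (2)$$
we use the substitution for equation (1): $\frac{1}{x} = x'$ and we get:
$$\hat{y} = \beta_0 + \beta_1 x'$$
we use the substitutions for equation (2): $\beta_0' = \frac{1}{\beta_0}$, $\beta_1' = \frac{\beta_1}{\beta_0}$, $x' = \frac{1}{x}$, $\hat{y}' = \frac{1}{\hat{y}}$
to get:
$$\hat{y}' = \beta_0' + \beta_1' x'$$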
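
The step behind the substitutions for equation (2) can be written out explicitly. This assumes equation (2) has the form $\hat{y} = \beta_0 x / (x + \beta_1)$, which is the form consistent with the substitutions listed above:

$$
\frac{1}{\hat{y}} = \frac{x + \beta_1}{\beta_0 x}
= \frac{1}{\beta_0} + \frac{\beta_1}{\beta_0} \cdot \frac{1}{x}
= \beta_0' + \beta_1' x'
$$

so the reciprocal of $\hat{y}$ is linear in the reciprocal of $x$.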

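Nonlinear models in R
To estimate nonlinear models that can be transformed to linear
models we use the lm function, but we have to specify the formula
accordingly. Some examples:
$$\hat{y} = \beta_0 + \beta_1 x_1 + \beta_2 x_2 \quad (1)$$
$$\hat{y} = \beta_1 x_1 + \beta_2 x_2 \quad (2)$$
$$\hat{y} = \beta_0 + \beta_1 \ln x_1 + \beta_2 \ln x_2 \quad (3)$$
$$\log_{10} \hat{y} = \beta_0 + \beta_1 x_1 + \beta_2 x_2 \quad (4)$$
$$\hat{y} = \beta_0 + \beta_1 \frac{x_2}{x_1} + \beta_3 \sqrt{x_3} + \beta_4 x_4^2 \quad (5)$$
$$\hat{y} = \beta_0 \beta_1^{x_1} \beta_2^{x_2} \quad (6)$$
$$\hat{y} = \beta_0 x_1^{\beta_1} x_2^{\beta_2} \quad (7)$$
$$\hat{y}_t = \beta_0 + \beta_1 x_{1,t-1} + \beta_2 x_{2,t-2} \quad (8)$$
For equations (6) and (7) we have to use logarithms.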

How to apply these functions in a formula:
Function no.   Formula
1              y~x1+x2
2              y~-1+x1+x2 or y~0+x1+x2
3              y~log(x1)+log(x2)
4              log10(y)~x1+x2
5              y~I(x2/x1)+sqrt(x3)+I(x4^2)
6              log(y)~x1+x2
7              log(y)~log(x1)+log(x2)
8              z.y~v1+v2

where: z.y = $y_t$, v1 = $x_{1,t-1}$, v2 = $x_{2,t-2}$
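
For formula 8 the lagged variables have to be built before calling lm. One way to do this in base R is sketched below; the series are simulated placeholders and the index vector t_idx is a name introduced only for this illustration.

```r
# Hypothetical time series (simulated, for illustration only)
set.seed(42)
n  <- 12
y  <- rnorm(n, mean = 100, sd = 5)
x1 <- rnorm(n, mean = 50,  sd = 3)
x2 <- rnorm(n, mean = 20,  sd = 2)

# Align y_t with x1 at t-1 and x2 at t-2; the first two periods are lost
t_idx <- 3:n
z.y <- y[t_idx]
v1  <- x1[t_idx - 1]
v2  <- x2[t_idx - 2]

fit <- lm(z.y ~ v1 + v2)
print(coef(fit))
```

Each lag shortens the usable sample, here from 12 to 10 observations.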

Nonlinear functions and nonlinear least squares in R
The nonlinear regression model is a generalization of the linear
regression model in which the conditional mean of the response
variable is not a linear function of the parameters.
As an example we use the decennial U.S. Census population of the
United States (in millions), from 1790 through 2000.

A common simple model for population growth is the logistic
growth model:
$$y = m(x, \theta) + \varepsilon = \frac{\theta_1}{1 + \exp[-(\theta_2 + \theta_3 x)]} + \varepsilon$$
where: y is the response, x is the predictor, and θ = (θ1, θ2, θ3)
are the parameters.
Changing θ stretches or shrinks the axes, and changes the rate at
which the curve varies from its lower value at 0 to its maximum
value θ1.

Let's assume θ1 = 1, θ2 = 1, θ3 = 1.
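
For these parameter values the curve can be evaluated with a few lines of R. This is a minimal sketch; the function name logistic_growth and the grid of x values are assumptions made for this illustration.

```r
# Logistic growth mean function m(x, theta)
logistic_growth <- function(x, theta1, theta2, theta3) {
  theta1 / (1 + exp(-(theta2 + theta3 * x)))
}

x <- seq(-10, 10, by = 0.1)
m <- logistic_growth(x, theta1 = 1, theta2 = 1, theta3 = 1)

# The curve rises from 0 towards theta1 = 1 and passes through
# theta1 / 2 at x = -theta2 / theta3 = -1
print(range(m))
print(logistic_growth(-1, 1, 1, 1))
```

Calling plot(x, m, type = "l") draws the S-shaped curve.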

In general, the nonlinear regression model is:
$$y = E(y \mid x) + \varepsilon = m(x, \theta) + \varepsilon$$
This model posits that the mean E(y|x) depends on x
through the kernel mean function m(x, θ), where the
predictor x has one or more components and the parameter
vector θ also has one or more components.
In the logistic growth model x consists of the single predictor
x = year and the parameter vector θ = (θ1, θ2, θ3) has three
components.

The model further assumes that the errors ε are independent
with variance σ²/w, where the w are known nonnegative
weights and σ² is a generally unknown variance to be
estimated; in most applications w = 1 for all observations.
The nls function can be used to estimate θ as the values
that minimize the residual sum of squares:
$$S(\theta) = \sum w \, [y - m(x, \theta)]^2$$
From now on we will write $\hat{\theta}$ for the minimizer of the
residual sum of squares.

Unlike the linear least-squares problem, there is usually no
formula that provides the minimizer of
$$S(\theta) = \sum w \, [y - m(x, \theta)]^2$$
Rather, an iterative procedure is used, which in broad outline is
as follows:
1) The user supplies an initial guess, say $t_0$, of starting values for
the parameters. Whether or not the algorithm can successfully find
a minimizer will depend on getting starting values that are
reasonably close to the solution. We discuss how this might be
done for the logistic growth function below. For some special
mean functions, including logistic growth, R has self-starting
functions that can avoid this step.

2) At iteration j ≥ 1, the current guess $t_j$ is obtained by updating
$t_{j-1}$. If $S(t_j)$ is smaller than $S(t_{j-1})$ by at least a predetermined
amount, then the counter j is increased by 1 and this step is
repeated. If no improvement is possible, then $t_{j-1}$ is taken as the
estimator.
This simple algorithm hides at least three important
considerations. First, we want a method that will guarantee that at
each step we either get a smaller value of S or at least S will not
increase. There are many nonlinear least-squares algorithms; see,
for example, Bates and Watts (1988). Many algorithms make use
of the derivatives of the mean function with respect to the
parameters.

The default algorithm in nls uses a form of Gauss-Newton
iteration that employs derivatives approximated numerically unless
we provide functions to compute the derivatives. Second, the sum
of squares function S may be a perverse function with multiple
minima. As a consequence, the purported least-squares estimates
could be a local rather than a global minimizer of S. Third, as given,
the algorithm can go on forever if improvements to S are small at
each step. As a practical matter, therefore, there is an iteration
limit that gives the maximum number of iterations permitted, and a
tolerance that defines the minimum improvement that will be
considered to be greater than 0.
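
The outline in steps 1) and 2), together with the iteration limit and tolerance, can be sketched as a small Gauss-Newton loop for the logistic growth model. This is an illustrative sketch only, not the code that nls runs: the function names (logistic_mean, logistic_jac, rss, gauss_newton), the step-halving rule, the default values and the simulated data are all assumptions made for this example, and all weights are taken as w = 1.

```r
# Mean function and its derivatives with respect to theta (analytic Jacobian)
logistic_mean <- function(theta, x) {
  theta[1] / (1 + exp(-(theta[2] + theta[3] * x)))
}
logistic_jac <- function(theta, x) {
  p <- 1 / (1 + exp(-(theta[2] + theta[3] * x)))
  cbind(p, theta[1] * p * (1 - p), theta[1] * p * (1 - p) * x)
}
rss <- function(theta, x, y) sum((y - logistic_mean(theta, x))^2)

gauss_newton <- function(start, x, y, tol = 1e-10, maxiter = 50) {
  t_old <- start
  s_old <- rss(t_old, x, y)
  for (j in seq_len(maxiter)) {
    r <- y - logistic_mean(t_old, x)
    J <- logistic_jac(t_old, x)
    delta <- qr.solve(J, r)        # least-squares solution of J %*% delta = r
    step <- 1
    repeat {                       # halve the step until S does not increase
      t_new <- t_old + step * delta
      s_new <- rss(t_new, x, y)
      if ((is.finite(s_new) && s_new < s_old) || step < 1e-8) break
      step <- step / 2
    }
    # Stop when the improvement is below the tolerance; keep the previous guess
    if (!is.finite(s_new) || s_old - s_new < tol) break
    t_old <- t_new
    s_old <- s_new
  }
  list(theta = t_old, rss = s_old, iterations = j)
}

# Simulated data (assumed true values: theta = (10, -4, 0.4))
set.seed(1)
x <- 0:20
y <- logistic_mean(c(10, -4, 0.4), x) + rnorm(length(x), sd = 0.1)

start <- c(theta1 = 12, theta2 = -3, theta3 = 0.3)
gn <- gauss_newton(start, x, y)
print(gn)
```

Step halving addresses the first consideration (S never increases), while tol and maxiter address the third; nothing in this sketch protects against the second, convergence to a local minimum.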

The nls function uses the following arguments:
formula: the formula argument is used to tell nls about the
mean function.
start: the argument start is a list that tells nls which of the
named quantities on the right side of the formula are parameters,
and thus implicitly which are predictors. It also provides starting
values for the parameter estimates.
algorithm = "default": the "default" algorithm used in
nls is a Gauss-Newton algorithm. Other possible values are
"plinear" for the Golub-Pereyra algorithm for partially linear
models and "port" for an algorithm that should be selected if
there are constraints on the parameters.

lower = -Inf, upper = Inf: one of the characteristics of
nonlinear models is that the parameters of the model might be
constrained to lie in a certain region. In the logistic
population-growth model, for example, we must have θ3 > 0, as
population size is increasing, and we must also have θ1 > 0.
trace = FALSE: if TRUE, print the value of the residual sum of
squares and the parameter estimates at each iteration.
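
A minimal sketch of how these arguments fit together is shown below. In R, bounds are only honoured by the "port" algorithm, so the constraints θ1 > 0 and θ3 > 0 are passed together with algorithm = "port". The data are simulated with assumed parameter values; the bounds are given in the same order as the entries of start.

```r
# Simulated logistic growth data (assumed true values: 10, -4, 0.4)
set.seed(2)
x <- 0:20
y <- 10 / (1 + exp(-(-4 + 0.4 * x))) + rnorm(length(x), sd = 0.1)
dat <- data.frame(x = x, y = y)

fit_port <- nls(
  y ~ theta1 / (1 + exp(-(theta2 + theta3 * x))),
  data      = dat,
  start     = list(theta1 = 12, theta2 = -3, theta3 = 0.3),
  algorithm = "port",
  lower     = c(theta1 = 0,   theta2 = -Inf, theta3 = 0),
  upper     = c(theta1 = Inf, theta2 = Inf,  theta3 = Inf)
)
print(coef(fit_port))
```

Adding trace = TRUE to the call prints the progress of the fit at each iteration.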

Unlike in linear least squares, most nonlinear least-squares
algorithms require specification of starting values for the
parameters.
A starting value for the asymptote θ1 is some value larger than any
value in the data, so a value around t1 = 400 is a reasonable
start. (The estimated population in 2010, before the official Census
count was released, was 307 million.)

A linear model can be used to get a rough guess of what the
remaining initial values should be:
library(car)  # provides logit() and, via carData, the USPop data
m1 <- lm(logit(population/400) ~ year, USPop)
print(m1)
Coefficients:
(Intercept)         year
  -49.24991      0.02507

pop.mod <- nls(population ~ theta1/(1 + exp(-(theta2 + theta3*year))),
               start=list(theta1 = 400, theta2 = -49, theta3 = 0.025),
               data=USPop, trace=TRUE)

By setting trace=TRUE, we can see that S evaluated at the starting values is 3061. The first
iteration reduces this to 558.5, the next iteration to 458, and the remaining iterations result in
only very small changes.
We get convergence in 6 iterations.

summary(pop.mod)

The column marked Estimate displays the least-squares estimates. The
estimated upper bound for the U.S. population is 440.8, or about 441 million.
The column marked Std. Error displays the estimated standard errors of
these estimates. The very large standard error for the asymptote reflects the
uncertainty in the estimated asymptote when all the observed data are much
smaller than the asymptote.
The estimated year in which the population is half the asymptote is
$-\hat{\theta}_2 / \hat{\theta}_3 = 1976.6$.
The standard error of this estimate can be computed with the deltaMethod
function in the car package:
se <- deltaMethod(pop.mod, "-theta2/theta3")
print(se)
The standard error is about 7.6 years.
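
The reason the half-way year equals $-\theta_2/\theta_3$ follows directly from the mean function; setting it equal to half the asymptote gives:

$$
\frac{\theta_1}{1 + \exp[-(\theta_2 + \theta_3 x)]} = \frac{\theta_1}{2}
\;\Longleftrightarrow\;
\exp[-(\theta_2 + \theta_3 x)] = 1
\;\Longleftrightarrow\;
x = -\frac{\theta_2}{\theta_3}
$$

The delta method then approximates the variance of $g(\hat{\theta}) = -\hat{\theta}_2/\hat{\theta}_3$ to first order. Writing $\hat{V}$ for the estimated covariance matrix of $\hat{\theta}$ (a symbol introduced here for illustration):

$$
\widehat{\operatorname{Var}}\, g(\hat{\theta}) \approx \nabla g(\hat{\theta})^{\top} \hat{V} \, \nabla g(\hat{\theta}),
\qquad
\nabla g(\theta) = \left(0, \; -\frac{1}{\theta_3}, \; \frac{\theta_2}{\theta_3^{2}}\right)^{\top}
$$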

The column t value in the summary output shows the ratio of
each parameter estimate to its standard error. In sufficiently large
samples, this ratio will generally have a normal distribution, but
interpreting "sufficiently large" is difficult with nonlinear models.
Even if the errors ε are normally distributed, the estimates may be
far from normally distributed in small samples.
The p-values shown are based on asymptotic normality. The
residual standard deviation is the estimate of σ:

$$\hat{\sigma} = \sqrt{S(\hat{\theta}) / (n - k)}$$
where n is the number of observations and k is the number of
estimated parameters.
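
These quantities can be checked by hand against the summary output. The sketch below uses simulated data rather than USPop (an assumption, so that it runs without extra packages), with all weights w = 1.

```r
# Simulated logistic growth data (assumed true values: 10, -4, 0.4)
set.seed(4)
x <- 0:20
y <- 10 / (1 + exp(-(-4 + 0.4 * x))) + rnorm(length(x), sd = 0.1)
dat <- data.frame(x = x, y = y)

fit <- nls(y ~ theta1 / (1 + exp(-(theta2 + theta3 * x))), data = dat,
           start = list(theta1 = 12, theta2 = -3, theta3 = 0.3))

# Residual standard deviation: sqrt(S(theta-hat) / (n - k))
S_hat <- sum(residuals(fit)^2)
n <- length(y)
k <- length(coef(fit))
sigma_hat <- sqrt(S_hat / (n - k))

# t value: estimate divided by its standard error
coefs <- summary(fit)$coefficients
t_values <- coefs[, "Estimate"] / coefs[, "Std. Error"]

print(c(by_hand = sigma_hat, from_summary = summary(fit)$sigma))
```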

[Figure] U.S. population, with the logistic growth fit extrapolated
to 2100. The circles represent observed Census population counts,
while "x" represents the estimated 2010 population. The broken
horizontal lines are drawn at the asymptotes and midway between the
asymptotes; the broken vertical line is drawn at the year
corresponding to the midway point.


This figure suggests there are systematic features that are missed,
reflecting differences in growth rates, perhaps due to factors such
as changes in immigration.

Bates and Watts (1988, Sec. 3.2) describe many techniques for
finding starting values for fitting nonlinear models.
For the logistic growth model we study, for example,
finding starting values amounts to:
(1) guessing the parameter θ1 as a value larger than any observed
in the data; and
(2) substituting this value into the mean function, rearranging
terms, and then getting the other starting values by OLS simple
linear regression.
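
The two steps can be sketched in base R as follows. The data are simulated (assumed true values θ = (10, -4, 0.4)), the factor 1.2 used for the asymptote guess is an arbitrary choice for this illustration, and qlogis is the base R logit function, used here in place of logit from the car package.

```r
# Simulated logistic growth data with multiplicative noise, so that y > 0
set.seed(3)
x <- 0:20
y <- 10 / (1 + exp(-(-4 + 0.4 * x))) * exp(rnorm(length(x), sd = 0.02))
dat <- data.frame(x = x, y = y)

# Step 1: guess the asymptote as a value larger than any observation
theta1_start <- 1.2 * max(dat$y)

# Step 2: logit(y / theta1) = theta2 + theta3 * x is linear in x
dat$z <- qlogis(dat$y / theta1_start)
start_fit <- lm(z ~ x, data = dat)
start_vals <- list(theta1 = theta1_start,
                   theta2 = unname(coef(start_fit)[1]),
                   theta3 = unname(coef(start_fit)[2]))
print(unlist(start_vals))

# Use the guesses as starting values for nonlinear least squares
fit <- nls(y ~ theta1 / (1 + exp(-(theta2 + theta3 * x))),
           data = dat, start = start_vals)
print(coef(fit))
```

The starting values only need to be roughly right; nls then refines them.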

The self-starting logistic growth model in R is based on a different,
but equivalent, parametrization of the logistic function. We will
start again with the logistic growth model, with mean function:

$$m(x, \theta) = \frac{\theta_1}{1 + \exp[-(\theta_2 + \theta_3 x)]}$$
Fitting a nonlinear model with the self-starting logistic growth
function in R is quite easy:
pop.ss <- nls(population ~ SSlogis(year, phi1, phi2, phi3),
              data=USPop)
summary(pop.ss)

The right side of the formula is now the name of a function that
has the responsibility for computing the mean function and for
finding starting values. The estimate of the asymptote parameter
φ1 and its standard error are identical to the estimate and
standard error for θ1 in the θ-parametrization.
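
The link between the two parametrizations can be made explicit. SSlogis computes Asym/(1 + exp((xmid - input)/scal)), so with φ1 = Asym, φ2 = xmid and φ3 = scal we have θ1 = φ1, θ2 = -φ2/φ3 and θ3 = 1/φ3. The sketch below checks this on simulated data (assumed true values θ = (10, -4, 0.4)) rather than on USPop.

```r
# Simulated logistic growth data (assumed true values: 10, -4, 0.4)
set.seed(5)
x <- 0:20
y <- 10 / (1 + exp(-(-4 + 0.4 * x))) + rnorm(length(x), sd = 0.05)
dat <- data.frame(x = x, y = y)

# No start argument is needed: SSlogis computes its own starting values
fit_ss <- nls(y ~ SSlogis(x, phi1, phi2, phi3), data = dat)
phi <- coef(fit_ss)

# Convert to the theta-parametrization
theta <- c(theta1 = phi[["phi1"]],
           theta2 = -phi[["phi2"]] / phi[["phi3"]],
           theta3 = 1 / phi[["phi3"]])
print(theta)
```

In this parametrization φ2 is directly the x value at which the curve reaches half the asymptote.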
