Lecture Notes 4: Non-Linear Regression

So far we have discussed the simple linear regression model and different
methods of estimation. We have also discussed a simple numerical technique
which can be used to find an approximate solution of a non-linear equation
when the function is sufficiently smooth. Although we illustrated it for a
function of one variable, the method can easily be extended to more than one
variable. Before moving on to the multiple linear regression model and the
non-linear regression model, we discuss another specific regression model
which plays a very important role in different applications.
Let us consider the example given in Lecture Note 1 on the Rumford cooling
experiment, where temperature is observed against time. Suppose we denote
the temperature at the time point t by y_t; then we can think of a more general
model than the simple regression model y_t = β0 + β1 t + ε_t, for example

y_t = β0 + β1 t + ... + βp t^p + ε_t ;   t = t_1, ..., t_n.        (1)

Here ε_t has the same assumptions as before, i.e. it is assumed that the ε_t's are
independent and identically distributed random variables with mean zero and
finite variance. The model (1) is known as the polynomial regression model.
In practice it is quite significant: while the simple linear regression model
can produce only a straight line, the polynomial regression model can also
produce curved fits. All the methods which we have proposed for the simple
linear regression model can now be used directly. For illustrative purposes we
take p = 2 and t_i = i, but all the methods can be used for any general p and
for general t_i's as well. Therefore, we consider the following quadratic model:

y_t = β0 + β1 t + β2 t^2 + ε_t ;   t = 1, 2, ..., n.        (2)
The least squares estimators of β0, β1 and β2 can then be obtained as the
argument minimum of Q_LSE(β0, β1, β2), where

Q_LSE(β0, β1, β2) = Σ_{t=1}^{n} (y_t − β0 − β1 t − β2 t^2)^2.        (3)

Note that Q_LSE(β0, β1, β2) is a nice differentiable function, and the least
squares estimators of β0, β1 and β2 can be obtained as the solutions of the
following three linear equations:

∂Q_LSE(β0, β1, β2)/∂β0 = −2 Σ_{t=1}^{n} (y_t − β0 − β1 t − β2 t^2) = 0        (4)

∂Q_LSE(β0, β1, β2)/∂β1 = −2 Σ_{t=1}^{n} t (y_t − β0 − β1 t − β2 t^2) = 0        (5)

∂Q_LSE(β0, β1, β2)/∂β2 = −2 Σ_{t=1}^{n} t^2 (y_t − β0 − β1 t − β2 t^2) = 0.        (6)

Let us use the following notations:

A1 = Σ_{t=1}^{n} y_t,   A2 = Σ_{t=1}^{n} t y_t,   A3 = Σ_{t=1}^{n} t^2 y_t,

and

C1 = Σ_{t=1}^{n} t,   C2 = Σ_{t=1}^{n} t^2,   C3 = Σ_{t=1}^{n} t^3,   C4 = Σ_{t=1}^{n} t^4.
Then (4), (5) and (6) can be written as

A1 − n β0 − C1 β1 − C2 β2 = 0        (7)
A2 − C1 β0 − C2 β1 − C3 β2 = 0        (8)
A3 − C2 β0 − C3 β1 − C4 β2 = 0.        (9)

The equations (7), (8) and (9) can be expressed in matrix form as
    
[ n   C1  C2 ] [ β0 ]   [ A1 ]
[ C1  C2  C3 ] [ β1 ] = [ A2 ] .        (10)
[ C2  C3  C4 ] [ β2 ]   [ A3 ]
Hence, the solutions of (7), (8) and (9) can be obtained as
   −1  
βb0 n C 1 C2 A1
 β1  = C1 C2 C3   A2  , (11)
 b  
βb2 C2 C3 C4 A3

provided the above matrix is invertible. In this case, because of the specific
structure of the matrix, it can be shown that it is indeed invertible. From (11)
it can be seen that the least squares estimators of β0, β1 and β2 can be
obtained in explicit forms whenever they exist.
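As a small illustration (a numerical sketch that is not part of the notes; Python with numpy is assumed, and the function name quadratic_lse is ours), the explicit solution (11) can be computed by forming A1, A2, A3 and C1, ..., C4 and solving the 3x3 linear system (10):

    # Minimal sketch of the least squares computation in (10)-(11); numpy assumed.
    import numpy as np

    def quadratic_lse(y):
        """Least squares estimates (beta0, beta1, beta2) for the quadratic
        model (2): y_t = beta0 + beta1*t + beta2*t^2 + eps_t, t = 1, ..., n."""
        y = np.asarray(y, dtype=float)
        n = len(y)
        t = np.arange(1, n + 1, dtype=float)
        # A1, A2, A3 and C1, ..., C4 as defined above.
        A = np.array([np.sum(y), np.sum(t * y), np.sum(t**2 * y)])
        C1, C2, C3, C4 = (np.sum(t**k) for k in (1, 2, 3, 4))
        # The 3x3 coefficient matrix of (10); it is invertible whenever n >= 3.
        M = np.array([[n,  C1, C2],
                      [C1, C2, C3],
                      [C2, C3, C4]])
        return np.linalg.solve(M, A)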
It has been shown before that although the least squares estimators can
be obtained in explicit forms and have some nice desirable properties, they
are not robust: in the presence of a few outliers the performance of the least
squares estimators deteriorates significantly. Robust estimators such as the
least absolute deviation estimators or Huber M-estimators can also be used
here to produce more robust estimates. In this section we introduce another
robust estimator which can be obtained almost as easily as the least squares
estimators.
Suppose w(t) is a positive valued continuous function defined on [0, 1],
such that w(t) ≥ γ > 0, for all t ∈ [0, 1]. Let us define
Q_WLSE(β0, β1, β2) = Σ_{t=1}^{n} w(t/n) (y_t − β0 − β1 t − β2 t^2)^2.        (12)

It is assumed that w(t) is known in advance. We obtain the estimators of
β0, β1 and β2 as the argument minimum of Q_WLSE(β0, β1, β2), and we call these
estimators the weighted least squares estimators (WLSEs) of β0, β1 and β2. The
motivation for using the weighted least squares estimators as more robust
alternatives to the least squares estimators is the following. Suppose the
outliers are in the middle portion of the time scale, i.e. near the time point
n/2. In that case, if w(t) = 1 + (t − 0.5)^2, then Q_WLSE(β0, β1, β2) puts less
weight on the middle portion of the data than towards the end points. Similarly,
if it is known that the outliers are at the beginning of the data sequence, then
one can choose a weight function which is increasing, so that it puts less weight
at the beginning and more weight afterwards.
From (12) we obtain three normal equations as follows:
∂Q_WLSE(β0, β1, β2)/∂β0 = −2 Σ_{t=1}^{n} w(t/n) (y_t − β0 − β1 t − β2 t^2) = 0        (13)

∂Q_WLSE(β0, β1, β2)/∂β1 = −2 Σ_{t=1}^{n} t w(t/n) (y_t − β0 − β1 t − β2 t^2) = 0        (14)

∂Q_WLSE(β0, β1, β2)/∂β2 = −2 Σ_{t=1}^{n} t^2 w(t/n) (y_t − β0 − β1 t − β2 t^2) = 0.        (15)

If we use the notation

Ã1 = Σ_{t=1}^{n} w(t/n) y_t,   Ã2 = Σ_{t=1}^{n} w(t/n) t y_t,   Ã3 = Σ_{t=1}^{n} w(t/n) t^2 y_t,

and

C̃0 = Σ_{t=1}^{n} w(t/n),   C̃1 = Σ_{t=1}^{n} w(t/n) t,   C̃2 = Σ_{t=1}^{n} w(t/n) t^2,
C̃3 = Σ_{t=1}^{n} w(t/n) t^3,   C̃4 = Σ_{t=1}^{n} w(t/n) t^4.

Then (13), (14) and (15) can be written as


Ã1 − C̃0 β0 − C̃1 β1 − C̃2 β2 = 0        (16)
Ã2 − C̃1 β0 − C̃2 β1 − C̃3 β2 = 0        (17)
Ã3 − C̃2 β0 − C̃3 β1 − C̃4 β2 = 0.        (18)

The equations (16), (17) and (18) can be expressed in matrix form as
   
[ C̃0  C̃1  C̃2 ] [ β0 ]   [ Ã1 ]
[ C̃1  C̃2  C̃3 ] [ β1 ] = [ Ã2 ] .        (19)
[ C̃2  C̃3  C̃4 ] [ β2 ]   [ Ã3 ]

Hence, the solutions of (16), (17) and (18) can be obtained as


   −1  
βb0 n Ce1 C e2 A
e1
 β1  =  C1 C2 C3   A2  , (20)
 b   e e e   e 
βb2 Ce2 C
e3 C e4 A
e3

provided the above matrix is invertible.
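As a companion sketch (again not part of the notes; Python with numpy is assumed, the weight function w is user supplied, and the name quadratic_wlse is ours), the WLSEs in (20) can be computed by solving the weighted normal equations. With the design matrix X having columns 1, t, t^2 and W = diag(w(1/n), ..., w(n/n)), the system (19) is X'WX β = X'Wy:

    # Minimal sketch of the weighted least squares computation in (19)-(20).
    import numpy as np

    def quadratic_wlse(y, w):
        """Weighted least squares estimates minimizing Q_WLSE in (12),
        where w is a weight function on [0, 1] evaluated at t/n."""
        y = np.asarray(y, dtype=float)
        n = len(y)
        t = np.arange(1, n + 1, dtype=float)
        wt = w(t / n)                               # weights w(t/n), t = 1, ..., n
        X = np.column_stack([np.ones(n), t, t**2])  # columns 1, t, t^2
        XtW = X.T * wt                              # X' diag(w(t/n))
        # Solve the weighted normal equations (19): X'WX beta = X'Wy.
        return np.linalg.solve(XtW @ X, XtW @ y)

Taking w(t) = 1 for all t reduces this to the ordinary least squares solution (11).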


There are two important questions which need to be answered. The first is
how to choose the weight function when it is not known where the outliers may
be present, and the second is how to choose p in the general polynomial
regression model (1). We will address the second question later, when we
discuss the general multiple regression model. For the first question the
following method can be used.
Weight Function Bank
We choose different weight functions, say w1(t), ..., wK(t), where each of
the weight functions satisfies the properties defined before. We normalize
them, i.e. ∫_0^1 wj(t) dt = 1 for j = 1, ..., K. Now find the WLSEs for each of
the weight functions, and choose the weight function which provides the minimum
weighted residual sum of squares, i.e. the minimum Q_WLSE(β̂0, β̂1, β̂2). It is
recommended that one of the weight functions be taken as w(t) = 1, so that the
LSEs also become one of the candidates.
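A minimal sketch of this selection rule (not part of the notes; it reuses the hypothetical quadratic_wlse function from the earlier sketch, numpy is assumed, and the names weighted_rss, select_weight and bank are ours):

    # Pick the weight function from the bank with the smallest Q_WLSE at its own fit.
    import numpy as np

    def weighted_rss(y, beta, w):
        """Q_WLSE of (12) evaluated at the fitted coefficients beta = (b0, b1, b2)."""
        y = np.asarray(y, dtype=float)
        t = np.arange(1, len(y) + 1, dtype=float)
        fit = beta[0] + beta[1] * t + beta[2] * t**2
        return np.sum(w(t / len(y)) * (y - fit)**2)

    def select_weight(y, bank):
        """Return the weight function in 'bank' giving the minimum Q_WLSE."""
        scores = [(weighted_rss(y, quadratic_wlse(y, w), w), w) for w in bank]
        return min(scores, key=lambda s: s[0])[1]

    # Hypothetical bank: the constant weight (so the LSE is one of the candidates)
    # and a weight that downweights the middle of the data; 1 + (t - 0.5)^2 is
    # scaled by 12/13 so that it integrates to 1 on [0, 1].
    bank = [lambda t: np.ones_like(t),
            lambda t: (12.0 / 13.0) * (1.0 + (t - 0.5)**2)]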
