MTH686 - Non-Linear Regression: Lecture 4
So far we have discussed the simple linear regression model and different methods of estimation. We have also discussed a simple numerical technique which can be used to find an approximate solution of a non-linear equation when the function is sufficiently smooth. Although we illustrated it for a function of one variable, the method can easily be extended to more than one variable. Before moving on to the multiple linear regression model and the non-linear regression model, we discuss another specific regression model which plays a very important role in different applications.
Let us consider the example given in Lecture Note 1 on the Rumford cooling experiment, where we observe temperature versus time. If we denote the temperature at time point $t$ by $y_t$, then we can consider a more general model than the simple linear regression model $y_t = \beta_0 + \beta_1 t + \epsilon_t$, for example

$$y_t = \beta_0 + \beta_1 t + \cdots + \beta_p t^p + \epsilon_t; \qquad t = t_1, \ldots, t_n. \qquad (1)$$
Here $\epsilon_t$ has the same assumptions as before, i.e. the $\epsilon_t$'s are assumed to be independent and identically distributed random variables with mean zero and finite variance. The model (1) is known as the polynomial regression model. In practice it is quite significant: while the simple linear regression model can only produce straight lines, the polynomial regression model can also produce curved lines. All the methods which we proposed for the simple linear regression model can be used directly here. For illustrative purposes we take $p = 2$ and $t_i = i$, but all the methods can be used for any general $p$ and for general $t_i$'s as well. Therefore, we consider the following quadratic model:

$$y_t = \beta_0 + \beta_1 t + \beta_2 t^2 + \epsilon_t; \qquad t = 1, 2, \ldots, n. \qquad (2)$$
Therefore, the least squares estimators of $\beta_0$, $\beta_1$ and $\beta_2$ can be obtained as the argument minimum of $Q_{LSE}(\beta_0, \beta_1, \beta_2)$, where

$$Q_{LSE}(\beta_0, \beta_1, \beta_2) = \sum_{t=1}^{n} \left( y_t - \beta_0 - \beta_1 t - \beta_2 t^2 \right)^2. \qquad (3)$$
Note that $Q_{LSE}(\beta_0, \beta_1, \beta_2)$ is a nice differentiable function, and the least squares estimators of $\beta_0$, $\beta_1$ and $\beta_2$ can be obtained as the solutions of the following three linear equations:

$$\frac{\partial}{\partial \beta_0} Q_{LSE}(\beta_0, \beta_1, \beta_2) = -2 \sum_{t=1}^{n} \left( y_t - \beta_0 - \beta_1 t - \beta_2 t^2 \right) = 0 \qquad (4)$$

$$\frac{\partial}{\partial \beta_1} Q_{LSE}(\beta_0, \beta_1, \beta_2) = -2 \sum_{t=1}^{n} t \left( y_t - \beta_0 - \beta_1 t - \beta_2 t^2 \right) = 0 \qquad (5)$$

$$\frac{\partial}{\partial \beta_2} Q_{LSE}(\beta_0, \beta_1, \beta_2) = -2 \sum_{t=1}^{n} t^2 \left( y_t - \beta_0 - \beta_1 t - \beta_2 t^2 \right) = 0. \qquad (6)$$
Let us write

$$A_1 = \sum_{t=1}^{n} y_t, \qquad A_2 = \sum_{t=1}^{n} t\, y_t, \qquad A_3 = \sum_{t=1}^{n} t^2 y_t,$$

and

$$C_1 = \sum_{t=1}^{n} t, \qquad C_2 = \sum_{t=1}^{n} t^2, \qquad C_3 = \sum_{t=1}^{n} t^3, \qquad C_4 = \sum_{t=1}^{n} t^4.$$
Then (4), (5) and (6) can be written as
$$A_1 - n\beta_0 - C_1 \beta_1 - C_2 \beta_2 = 0 \qquad (7)$$
$$A_2 - C_1 \beta_0 - C_2 \beta_1 - C_3 \beta_2 = 0 \qquad (8)$$
$$A_3 - C_2 \beta_0 - C_3 \beta_1 - C_4 \beta_2 = 0. \qquad (9)$$
The equations (7), (8) and (9) can be expressed in matrix form as

$$\begin{pmatrix} n & C_1 & C_2 \\ C_1 & C_2 & C_3 \\ C_2 & C_3 & C_4 \end{pmatrix} \begin{pmatrix} \beta_0 \\ \beta_1 \\ \beta_2 \end{pmatrix} = \begin{pmatrix} A_1 \\ A_2 \\ A_3 \end{pmatrix}. \qquad (10)$$
Hence, the solutions of (7), (8) and (9) can be obtained as

$$\begin{pmatrix} \widehat{\beta}_0 \\ \widehat{\beta}_1 \\ \widehat{\beta}_2 \end{pmatrix} = \begin{pmatrix} n & C_1 & C_2 \\ C_1 & C_2 & C_3 \\ C_2 & C_3 & C_4 \end{pmatrix}^{-1} \begin{pmatrix} A_1 \\ A_2 \\ A_3 \end{pmatrix}, \qquad (11)$$
provided the above matrix is invertible. In this case, because of the specific structure of the matrix, it can be shown that it is indeed invertible. From (11), it can be seen that the least squares estimators of $\beta_0$, $\beta_1$ and $\beta_2$ can be obtained in explicit forms when they exist.
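The computation in (11) is straightforward to carry out numerically. The following Python sketch (the simulated data and the true parameter values used here are illustrative assumptions, not taken from the lecture) forms the matrix in (10) from $A_1, A_2, A_3$ and $C_1, \ldots, C_4$ and solves the linear system:

```python
import numpy as np

def quadratic_lse(y):
    """Least squares estimators of (beta0, beta1, beta2) in the
    quadratic model y_t = beta0 + beta1*t + beta2*t^2 + eps_t,
    t = 1, ..., n, obtained by solving the normal equations (10)."""
    n = len(y)
    t = np.arange(1, n + 1, dtype=float)
    # C_k = sum of t^k; the A's collect the y-dependent sums
    C1, C2, C3, C4 = (np.sum(t**k) for k in (1, 2, 3, 4))
    A = np.array([np.sum(y), np.sum(t * y), np.sum(t**2 * y)])
    M = np.array([[n,  C1, C2],
                  [C1, C2, C3],
                  [C2, C3, C4]])
    return np.linalg.solve(M, A)   # (beta0_hat, beta1_hat, beta2_hat)

# Illustrative data with assumed true values (2.0, -0.5, 0.1):
rng = np.random.default_rng(0)
t = np.arange(1, 51, dtype=float)
y = 2.0 - 0.5 * t + 0.1 * t**2 + rng.normal(0.0, 1.0, size=50)
print(quadratic_lse(y))
```

The result can be cross-checked against a library routine such as np.polyfit(t, y, 2), which returns the same coefficients listed from the highest degree downwards.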
It has been shown before that although the least squares estimators can be obtained in explicit forms and have some nice desirable properties, they are not robust: in the presence of a few outliers the performance of the least squares estimators is affected significantly. All the robust estimators, such as the least absolute deviation estimators or Huber's M-estimators, can be used in this case as well to produce more robust estimators. Here we are going to introduce another robust estimator which can be obtained quite easily, in a manner similar to the least squares estimators.
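For concreteness, here is one way the least absolute deviation idea mentioned above can be applied to the quadratic model (2). This is a minimal sketch assuming SciPy is available; the choice of a generic derivative-free minimizer and of the quadratic_lse function from the previous sketch as the starting value are illustrative assumptions, not prescriptions from the lecture:

```python
import numpy as np
from scipy.optimize import minimize

def quadratic_lad(y):
    """Least absolute deviation estimators for the quadratic model,
    minimizing sum_t |y_t - b0 - b1*t - b2*t^2| numerically."""
    n = len(y)
    t = np.arange(1, n + 1, dtype=float)
    obj = lambda b: np.sum(np.abs(y - b[0] - b[1] * t - b[2] * t**2))
    # Start from the LSE; Nelder-Mead copes with the non-smooth objective
    res = minimize(obj, x0=quadratic_lse(y), method="Nelder-Mead")
    return res.x
```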
Suppose $w(t)$ is a positive valued continuous function defined on $[0, 1]$, such that $w(t) \geq \gamma > 0$ for all $t \in [0, 1]$. Let us define

$$Q_{WLSE}(\beta_0, \beta_1, \beta_2) = \sum_{t=1}^{n} w\left(\frac{t}{n}\right) \left( y_t - \beta_0 - \beta_1 t - \beta_2 t^2 \right)^2. \qquad (12)$$
The weighted least squares estimators (WLSEs) of $\beta_0$, $\beta_1$ and $\beta_2$ can be obtained as the solutions of the following three linear equations:

$$\frac{\partial}{\partial \beta_0} Q_{WLSE}(\beta_0, \beta_1, \beta_2) = -2 \sum_{t=1}^{n} w\left(\frac{t}{n}\right) \left( y_t - \beta_0 - \beta_1 t - \beta_2 t^2 \right) = 0 \qquad (13)$$

$$\frac{\partial}{\partial \beta_1} Q_{WLSE}(\beta_0, \beta_1, \beta_2) = -2 \sum_{t=1}^{n} t\, w\left(\frac{t}{n}\right) \left( y_t - \beta_0 - \beta_1 t - \beta_2 t^2 \right) = 0 \qquad (14)$$

$$\frac{\partial}{\partial \beta_2} Q_{WLSE}(\beta_0, \beta_1, \beta_2) = -2 \sum_{t=1}^{n} t^2 w\left(\frac{t}{n}\right) \left( y_t - \beta_0 - \beta_1 t - \beta_2 t^2 \right) = 0. \qquad (15)$$
Let us write

$$\widetilde{A}_1 = \sum_{t=1}^{n} w\left(\frac{t}{n}\right) y_t, \qquad \widetilde{A}_2 = \sum_{t=1}^{n} w\left(\frac{t}{n}\right) t\, y_t, \qquad \widetilde{A}_3 = \sum_{t=1}^{n} w\left(\frac{t}{n}\right) t^2 y_t,$$

and

$$\widetilde{C}_0 = \sum_{t=1}^{n} w\left(\frac{t}{n}\right), \qquad \widetilde{C}_1 = \sum_{t=1}^{n} w\left(\frac{t}{n}\right) t, \qquad \widetilde{C}_2 = \sum_{t=1}^{n} w\left(\frac{t}{n}\right) t^2, \qquad \widetilde{C}_3 = \sum_{t=1}^{n} w\left(\frac{t}{n}\right) t^3, \qquad \widetilde{C}_4 = \sum_{t=1}^{n} w\left(\frac{t}{n}\right) t^4.$$

Then (13), (14) and (15) can be written as

$$\widetilde{A}_1 - \widetilde{C}_0 \beta_0 - \widetilde{C}_1 \beta_1 - \widetilde{C}_2 \beta_2 = 0 \qquad (16)$$
$$\widetilde{A}_2 - \widetilde{C}_1 \beta_0 - \widetilde{C}_2 \beta_1 - \widetilde{C}_3 \beta_2 = 0 \qquad (17)$$
$$\widetilde{A}_3 - \widetilde{C}_2 \beta_0 - \widetilde{C}_3 \beta_1 - \widetilde{C}_4 \beta_2 = 0. \qquad (18)$$
The equations (16), (17) and (18) can be expressed in matrix form as

$$\begin{pmatrix} \widetilde{C}_0 & \widetilde{C}_1 & \widetilde{C}_2 \\ \widetilde{C}_1 & \widetilde{C}_2 & \widetilde{C}_3 \\ \widetilde{C}_2 & \widetilde{C}_3 & \widetilde{C}_4 \end{pmatrix} \begin{pmatrix} \beta_0 \\ \beta_1 \\ \beta_2 \end{pmatrix} = \begin{pmatrix} \widetilde{A}_1 \\ \widetilde{A}_2 \\ \widetilde{A}_3 \end{pmatrix}, \qquad (19)$$

and the WLSEs $\widehat{\beta}_0$, $\widehat{\beta}_1$, $\widehat{\beta}_2$ are obtained by solving this linear system, provided the coefficient matrix is invertible.
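The following Python sketch computes the WLSEs by solving (19). The exponential weight function used at the end is an illustrative assumption, not a choice made in the lecture; it is normalized to integrate to one on $[0, 1]$ and is bounded below by a positive constant, as required:

```python
import numpy as np

def quadratic_wlse(y, w):
    """Weighted least squares estimators of (beta0, beta1, beta2)
    for the quadratic model with weights w(t/n), t = 1, ..., n,
    obtained by solving the linear system (19)."""
    n = len(y)
    t = np.arange(1, n + 1, dtype=float)
    wt = w(t / n)                    # weights w(t/n), all >= gamma > 0
    # Ctilde_k = sum w(t/n) t^k; the Atilde's collect the y-dependent sums
    C = [np.sum(wt * t**k) for k in range(5)]
    A = np.array([np.sum(wt * y), np.sum(wt * t * y), np.sum(wt * t**2 * y)])
    M = np.array([[C[0], C[1], C[2]],
                  [C[1], C[2], C[3]],
                  [C[2], C[3], C[4]]])
    return np.linalg.solve(M, A)

# Example with an assumed exponential weight, normalized on [0, 1]:
w = lambda u: np.exp(-u) / (1.0 - np.exp(-1.0))
rng = np.random.default_rng(1)
t = np.arange(1, 51, dtype=float)
y = 2.0 - 0.5 * t + 0.1 * t**2 + rng.normal(0.0, 1.0, size=50)
print(quadratic_wlse(y, w))
```

Note that with $w(t) \equiv 1$ the system (19) reduces to (10), so quadratic_wlse with a constant weight reproduces the ordinary LSEs.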
Two questions naturally arise here: how to choose the weight function $w(t)$, and how to choose the degree $p$ of the polynomial regression model (1). The second question will be addressed later, when we discuss the general multiple regression model. For the first question, the following method can be used.
Weight Function Bank

We choose different weight functions, say $w_1(t), \ldots, w_K(t)$, where each weight function satisfies the properties defined before. We normalize them, i.e. $\int_0^1 w_j(t)\, dt = 1$ for $j = 1, \ldots, K$. Now find the WLSEs for each of the weight functions, and choose the particular weight function which provides the minimum weighted residual sum of squares, i.e. the minimum $Q_{WLSE}(\widehat{\beta}_0, \widehat{\beta}_1, \widehat{\beta}_2)$. It is recommended that one of the weight functions be taken as $w(t) = 1$, so that the LSEs also become one of the members.
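This selection rule is easy to implement on top of the quadratic_wlse function sketched above. The particular bank of weight functions below is an illustrative assumption; each member is normalized to integrate to one on $[0, 1]$ and is bounded away from zero:

```python
import numpy as np

def weighted_rss(y, w, beta):
    """Q_WLSE of (12) evaluated at the fitted coefficients beta."""
    n = len(y)
    t = np.arange(1, n + 1, dtype=float)
    resid = y - (beta[0] + beta[1] * t + beta[2] * t**2)
    return np.sum(w(t / n) * resid**2)

# An assumed bank of normalized weight functions on [0, 1]:
bank = {
    "constant":   lambda u: np.ones_like(u),   # w(t) = 1: reproduces the LSEs
    "increasing": lambda u: (1.0 + u) / 1.5,   # integrates to 1, bounded below by 2/3
    "decreasing": lambda u: (2.0 - u) / 1.5,   # integrates to 1, bounded below by 2/3
}

def select_weight(y):
    """Fit the WLSEs for every weight in the bank and keep the weight
    with the smallest weighted residual sum of squares."""
    fits = {name: quadratic_wlse(y, w) for name, w in bank.items()}
    best = min(fits, key=lambda name: weighted_rss(y, bank[name], fits[name]))
    return best, fits[best]
```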