Lab 05-1
Introduction:
Data is often given at discrete values along a continuum. However, we may require
estimates at points between the discrete values, in which case we have to fit curves to the
data to obtain intermediate estimates. In addition, we may require a simplified version of a
complicated function. One way to do this is to compute values of the function at a
number of discrete points along the range of interest; a simpler function may then be
derived to fit these values. Both of these applications are known as curve fitting.
There are two general approaches to curve fitting, distinguished from each other by the
amount of error associated with the data. First, where the data exhibits a significant
degree of error, the strategy is to derive a single curve that represents the general trend
of the data. Because any individual data point may be incorrect, we make no effort to
intersect every point. Rather, the curve is designed to follow the pattern of the points
taken as a group. One approach of this nature is called least squares regression.
Second, where the data is known to be very precise, the basic approach is to fit a curve
that passes directly through each of the points. The estimation of values between well-known
discrete points from the fitted exact curve is called interpolation.
Figure 1: (a) Least squares linear regression (b) linear interpolation (c) curvilinear
interpolation
Linear regression:
The simplest example of least squares regression is fitting a straight line to a set of paired
observations: (x1, y1), (x2, y2), ..., (xn, yn). The mathematical expression for a straight line
is

ym = a0 + a1*x
where a0 and a1 are coefficients representing the intercept and the slope, and ym is the model
value. If y0 is the observed value and e is the error or residual between the model and the
observation, then

e = y0 - ym = y0 - a0 - a1*x
Now we need some criterion that makes the error e as small as possible and also lets us arrive
at a unique solution (for this case, a unique straight line). One such strategy is to minimize
the sum of the squares of the errors. The sum of the squared errors is

Sr = Σ(ei^2) = Σ(yi - a0 - a1*xi)^2,  i = 1, 2, ..., n        (1)
To determine the values of a0 and a1, equation (1) is differentiated with respect to each
coefficient:

dSr/da0 = -2 Σ(yi - a0 - a1*xi)
dSr/da1 = -2 Σ[(yi - a0 - a1*xi)*xi]

Setting these derivatives equal to zero will result in a minimum Sr. If this is done, the
equations can be expressed as

0 = Σyi - Σa0 - Σ(a1*xi)
0 = Σ(xi*yi) - Σ(a0*xi) - Σ(a1*xi^2)

Now, realizing that Σa0 = n*a0, we can express the above equations as a set of two
simultaneous linear equations with two unknowns a0 and a1:

n*a0 + (Σxi)*a1 = Σyi
(Σxi)*a0 + (Σxi^2)*a1 = Σ(xi*yi)

from which

a1 = (n*Σ(xi*yi) - Σxi*Σyi) / (n*Σ(xi^2) - (Σxi)^2)
a0 = (Σyi)/n - a1*(Σxi)/n
Demonstration:
Fit a straight line to the x and y values of Table 1.

Table 1:
x    y
1    0.5
2    2.5
3    2.0
4    4.0
5    3.5
6    6.0
7    5.5
Code:
%taking input
n=input('Number of data points n: ');
x=zeros(1,n);
y=zeros(1,n);
for i=1:n
    x(i)=input('X(i): ');
    y(i)=input('Y(i): ');
end
%calculating the sums needed for the normal equations
sumx=0;
sumy=0;
sumxy=0;
sumxsq=0;
for i=1:n
    sumx=sumx+x(i);
    sumy=sumy+y(i);
    sumxy=sumxy+x(i)*y(i);
    sumxsq=sumxsq+x(i)^2;
end
format long;
%calculating the slope a1 and the intercept a0
a1=(n*sumxy-sumx*sumy)/(n*sumxsq-sumx^2)
a0=sumy/n-a1*sumx/n
Verification:
Ans: a0 = 0.0714, a1 = 0.83928
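As a quick cross-check of the hand-coded result, the same straight-line fit can be obtained with MATLAB's built-in polyfit function. The snippet below is a minimal sketch that assumes the Table 1 data is entered directly as vectors rather than typed in interactively.

% Cross-check of the straight-line fit using MATLAB's built-in polyfit
x = [1 2 3 4 5 6 7];                    % Table 1 x values
y = [0.5 2.5 2.0 4.0 3.5 6.0 5.5];      % Table 1 y values
p = polyfit(x, y, 1);                   % p(1) = slope a1, p(2) = intercept a0
a1 = p(1)
a0 = p(2)
% Expected output (to four decimal places): a1 = 0.8393, a0 = 0.0714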
In some cases, engineering data cannot be properly represented by a straight line. We can fit a
polynomial to such data using polynomial regression.
The least squares procedure can be readily extended to fit the data with a higher-order polynomial.
For example, suppose we want to fit a second-order polynomial

ym = a0 + a1*x + a2*x^2

The sum of the squared errors in this case is

Sr = Σ(yi - a0 - a1*xi - a2*xi^2)^2        (2)
Taking the derivative of equation (2) with respect to each of the unknown coefficients a0, a1
and a2 gives

dSr/da0 = -2 Σ(yi - a0 - a1*xi - a2*xi^2)
dSr/da1 = -2 Σ xi*(yi - a0 - a1*xi - a2*xi^2)
dSr/da2 = -2 Σ xi^2*(yi - a0 - a1*xi - a2*xi^2)

These equations can be set equal to zero and rearranged to develop the following set of
normal equations:

n*a0 + (Σxi)*a1 + (Σxi^2)*a2 = Σyi
(Σxi)*a0 + (Σxi^2)*a1 + (Σxi^3)*a2 = Σ(xi*yi)
(Σxi^2)*a0 + (Σxi^3)*a1 + (Σxi^4)*a2 = Σ(xi^2*yi)

These are three simultaneous linear equations with the three unknowns a0, a1 and a2.
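To illustrate how this system can be assembled and solved in MATLAB, the sketch below builds the 3x3 coefficient matrix from the required sums and solves it with the backslash operator. The variable names are only suggestions; the Task below asks you to write your own version, for example by accumulating the sums in a loop as in the straight-line code above.

% Sketch: solving the normal equations for a second-order polynomial
% (assumes x and y are row vectors of the same length)
n = length(x);
A = [n          sum(x)      sum(x.^2);
     sum(x)     sum(x.^2)   sum(x.^3);
     sum(x.^2)  sum(x.^3)   sum(x.^4)];
b = [sum(y); sum(x.*y); sum((x.^2).*y)];
a = A\b;                                % solve the 3x3 system
a0 = a(1); a1 = a(2); a2 = a(3)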
Task:
Write a code to fit a second-order polynomial to the data given in Table 2.
Table 2:
x    y
0    2.1
1    7.7
2    13.6
3    27.2
4    40.9
5    61.1
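As with the straight-line demonstration, you can verify your result against MATLAB's built-in polyfit. The check below is a minimal sketch using the Table 2 data; note that polyfit returns the coefficients in descending powers of x.

% Verification sketch for the Task using MATLAB's built-in polyfit
x = [0 1 2 3 4 5];                      % Table 2 x values
y = [2.1 7.7 13.6 27.2 40.9 61.1];      % Table 2 y values
p = polyfit(x, y, 2);                   % p = [a2 a1 a0]
a2 = p(1); a1 = p(2); a0 = p(3)
% For reference, this data gives approximately a0 = 2.4786, a1 = 2.3593, a2 = 1.8607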