Curve Fitting
Linear Regression
• Assume the data can be fitted by the straight-line model
  y = a0 + a1*x + e   (1)
where a0 and a1 are coefficients representing the intercept and the slope, respectively,
and e is the error, or residual, between the model and the observations.
• Rearranging Eq. (1), the residual is
  e = y - a0 - a1*x   (2)
One strategy for fitting a “best” line through the data would be to minimize the sum
of the residual errors for all the available data, as in
  Σ ei = Σ (yi - a0 - a1*xi) ,  i = 1, …, n   (3)
This criterion is inadequate, however, because residuals of opposite sign can cancel.
One way to remove the effect of the signs is to minimize the sum of the
squares of the residuals:
  Sr = Σ ei^2 = Σ (yi - a0 - a1*xi)^2   (4)
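A minimal MATLAB sketch of Eqs. (2)-(4), with hypothetical data and a hypothetical candidate line (note that the signed sum of Eq. (3) can vanish even though the fit is imperfect):

a0 = 1; a1 = 2;                          % hypothetical candidate coefficients
x = [0 1 2 3]; y = [1.1 2.8 5.2 6.9];    % hypothetical data
e = y - a0 - a1*x;                       % Eq. (2): residuals
S = sum(e)                               % Eq. (3): signed sum; here it is 0
Sr = sum(e.^2)                           % Eq. (4): sum of squared residuals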
The sum of the squares of the residuals is a function of the two fitting
parameters, a0 and a1, Sr(a0,a1).
Minimize Sr by setting its partial derivatives with respect to a0 and a1 to zero:
  ∂Sr/∂a0 = -2 Σ (yi - a0 - a1*xi) = 0   (5)
  ∂Sr/∂a1 = -2 Σ (yi - a0 - a1*xi)*xi = 0   (6)
Setting these derivatives equal to zero results in a minimum Sr.
The equations can be expanded and solved simultaneously. They reduce to the normal equations
  n*a0 + (Σ xi)*a1 = Σ yi
  (Σ xi)*a0 + (Σ xi^2)*a1 = Σ xi*yi   (7)
whose solution gives the slope and intercept:
  a1 = (n Σ xi*yi - Σ xi Σ yi) / (n Σ xi^2 - (Σ xi)^2)   (8)
  a0 = ȳ - a1*x̄   (9)
where x̄ and ȳ are the means of x and y, respectively.
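Eqs. (8) and (9) translate directly into MATLAB; a brief sketch with hypothetical data:

x = [1 2 3 4 5]; y = [2.2 3.9 6.1 8.0 9.9];                   % hypothetical data
n = length(x);
a1 = (n*sum(x.*y) - sum(x)*sum(y))/(n*sum(x.^2) - sum(x)^2);  % Eq. (8): slope
a0 = mean(y) - a1*mean(x);                                    % Eq. (9): intercept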
[Figure: data points (x1, y1), (x2, y2), (x3, y3), …, (xn, yn) scattered about the mean value ȳ of the dependent variable]
The total sum of the squares around the mean is
  St = Σ (yi - ȳ)^2 ,  i = 1, …, n
St is the magnitude of the residual error associated with the dependent
variable prior to regression. After performing the regression, we can compute
Sr, the sum of the squares of the residuals around the regression line.
The difference between the two quantities, St - Sr, quantifies the improvement or
error reduction due to describing the data in terms of a straight line rather than
as an average value. Because the magnitude of this quantity is scale-dependent,
the difference is normalized to St to yield the coefficient of determination
  r^2 = (St - Sr) / St
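Continuing the sketch above, St, Sr, and the coefficient of determination follow directly:

St = sum((y - mean(y)).^2);       % spread of the data about its mean
Sr = sum((y - a0 - a1*x).^2);     % spread about the regression line
r2 = (St - Sr)/St;                % coefficient of determination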
function [a, r2] = linregr(x,y)
% linregr: linear regression curve fitting
% [a, r2] = linregr(x,y): Least squares fit of straight
% line to data by solving the normal equations
% input:
% x = independent variable
% y = dependent variable
% output:
% a = vector of slope, a(1), and intercept, a(2)
% r2 = coefficient of determination
n = length(x);
if length(y)~=n, error('x and y must be same length'); end
x = x(:); y = y(:); % convert to column vectors
sx = sum(x); sy = sum(y);
sx2 = sum(x.*x); sxy = sum(x.*y); sy2 = sum(y.*y);
a(1) = (n*sxy-sx*sy)/(n*sx2-sx^2);
a(2) = sy/n-a(1)*sx/n;
r2 = ((n*sxy-sx*sy)/sqrt(n*sx2-sx^2)/sqrt(n*sy2-sy^2))^2;
% create plot of data and best fit line
xp = linspace(min(x),max(x),2);
yp = a(1)*xp+a(2);
plot(x,y,'o',xp,yp)
grid on
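A brief usage sketch of linregr (the data here are hypothetical):

>> x = 0:5; y = [0.8 2.1 2.9 4.2 5.1 5.8];
>> [a, r2] = linregr(x, y)      % a(1) is the slope, a(2) the intercept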
LINEARIZATION OF NONLINEAR RELATIONSHIPS
Not all data can be explained by a linear relationship to an independent
variable. For example, the exponential model
  y = α e^(βx)
where α and β are constant coefficients, is nonlinear in x.
Linearize the fitting equation by taking the natural logarithm of both sides:
  ln y = ln α + β*x
which is a straight line in x with slope β and intercept ln α, so linear
regression can be applied to the transformed data (x, ln y).
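A minimal sketch of this linearization in MATLAB, assuming the exponential model above (the data are hypothetical; polyfit with degree 1 serves as the straight-line fit):

x = 0:0.5:3;
y = [2.5 3.8 5.6 8.3 12.4 18.5 27.5];   % hypothetical data, roughly exponential growth
p = polyfit(x, log(y), 1);              % straight line through (x, ln y)
beta  = p(1);                           % slope of the line is beta
alpha = exp(p(2));                      % intercept of the line is ln(alpha)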
Polynomial Regression
The linear least-squares regression can be extended to fit a set of n data
points (x1, y1), …, (xn, yn) with an mth-degree polynomial of the form
  y = am*x^m + … + a2*x^2 + a1*x + a0
with the total squared error
  E = Σ [yi - (am*xi^m + … + a2*xi^2 + a1*xi + a0)]^2 ,  i = 1, …, n
The coefficients am, …, a2, a1, a0 are determined such that E is minimized. A
necessary condition for E to attain a minimum is that its partial derivative with
respect to each of these coefficients vanishes, that is,
  ∂E/∂a0 = 0 , ∂E/∂a1 = 0 , … , ∂E/∂am = 0
Manipulation of these equations yields a system of m + 1 linear equations (the
normal equations) to be solved for am, …, a2, a1, a0:
  n*a0 + (Σ xi)*a1 + … + (Σ xi^m)*am = Σ yi
  (Σ xi)*a0 + (Σ xi^2)*a1 + … + (Σ xi^(m+1))*am = Σ xi*yi
  ⋮
  (Σ xi^m)*a0 + (Σ xi^(m+1))*a1 + … + (Σ xi^(2m))*am = Σ xi^m*yi
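The normal equations can be assembled and solved for any degree m; a compact MATLAB sketch (the data and degree are hypothetical):

m = 3;                                    % hypothetical polynomial degree
x = (0:0.25:2)'; y = 1 + 2*x - 0.5*x.^3;  % hypothetical data (column vectors)
A = zeros(m+1); b = zeros(m+1,1);
for r = 0:m
    for c = 0:m
        A(r+1,c+1) = sum(x.^(r+c));       % power sums of x; A(1,1) = n
    end
    b(r+1) = sum(x.^r .* y);              % right-hand sides
end
a = A\b;                                  % a(1) = a0, ..., a(m+1) = am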
For a quadratic fit (m = 2), the coefficients a2, a1, a0 are determined by solving a
system of three linear equations:
  n*a0 + (Σ xi)*a1 + (Σ xi^2)*a2 = Σ yi
  (Σ xi)*a0 + (Σ xi^2)*a1 + (Σ xi^3)*a2 = Σ xi*yi
  (Σ xi^2)*a0 + (Σ xi^3)*a1 + (Σ xi^4)*a2 = Σ xi^2*yi
function [a2, a1, a0] = QuadraticRegression(x,y)
%
% QuadraticRegression uses quadratic least-squares approximation to fit
% data with a 2nd-degree polynomial in the form y = a2*x^2 + a1*x + a0.
%
% [a2, a1, a0] = QuadraticRegression(x,y), where
%
% x, y are n-dimensional row or column vectors of data,
%
% a2, a1 and a0 are the coefficients that describe the quadratic fit.
%
n = length(x);
Sumx = sum(x); Sumy = sum(y);
Sumx2 = sum(x.^2); Sumx3 = sum(x.^3); Sumx4 = sum(x.^4);
Sumxy = sum(x.*y); Sumx2y = sum(x.*x.*y);
% Form the coefficient matrix and the vector of right-hand sides
A = [n Sumx Sumx2;Sumx Sumx2 Sumx3;Sumx2 Sumx3 Sumx4];
b = [Sumy;Sumxy;Sumx2y];
w = A\b; % Solve for the coefficients
a2 = w(3); a1 = w(2); a0 = w(1);
% Plot the data and the quadratic fit
xx = linspace(x(1),x(end)); % Generate 100 points for plotting purposes
p = zeros(100,1); % Pre-allocate
for i = 1:100,
p(i) = a2*xx(i)^2 + a1*xx(i) + a0; % Calculate 100 points
end
plot(x,y,'o')
hold on
plot(xx,p)
end
MATLAB Built-In Function: polyfit
A brief description of the MATLAB built-in function polyfit is given below:
POLYFIT Fit polynomial to data.
P = POLYFIT(X,Y,N) finds the coefficients of a polynomial P(X) of
degree N that fits the data Y best in a least-squares sense. P is
a row vector of length N + 1 containing the polynomial
coefficients in descending powers :
P(1)*X^N + P(2)*X^(N-1) + ... + P(N)*X + P(N+1).
EXAMPLE
Using the polyfit function, find and plot the second-degree polynomial that
best fits the data. Apply the user-defined function QuadraticRegression and
compare the results.
Solution
>> x = 0:0.3:1.2; y = [3.6 4.8 5.9 7.6 10.9];
>> P = polyfit(x,y,2)
P = 3.8095 1.2286 3.7657 % Coefficients of the second-deg polynomial fit
>> xi = linspace(0,1.2); % Generate 100 points for plotting purposes
>> yi = polyval(P,xi); % Evaluate the polynomial at these points
>> plot(xi,yi,x,y,'o')
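Applying the user-defined QuadraticRegression to the same data should reproduce these coefficients, since both routines solve the same normal equations:

>> [a2, a1, a0] = QuadraticRegression(x,y)
% expect a2 = 3.8095, a1 = 1.2286, a0 = 3.7657 (same fit as polyfit)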