
Curve Fitting

 We often have a set of data, y, that is a function of some independent variable, x,
but the underlying relationship is unknown (we don't know f(x)).
• Measured data
• Tabulated data
 Such a function may be obtained through curve fitting, or approximation, of
the data.
 Curve fitting is a procedure where a function is used to fit a given set of
data in the “best” possible manner without having to match the data
exactly.
 Two categories of curve fitting (a short MATLAB illustration follows this list):
• Least‐squares regression
o Noisy data – uncertainty in the y value for a given x value
o Want “good” agreement between f(x) and the data points. The curve f(x) may
not pass through any data point
• Polynomial interpolation
o Data points are known exactly – noiseless data
o Resulting curve passes through all data points
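
A minimal MATLAB illustration of the two categories, using made-up data and the
built-in polyfit, polyval, and spline functions (a sketch, not part of the original notes):

% Least-squares regression: noisy data; the best-fit line need not pass through any point
x = 0:1:5;                          % known x values (hypothetical)
y = [0.5 2.4 3.9 6.2 7.8 10.3];     % noisy measurements (hypothetical)
a = polyfit(x,y,1);                 % straight line in the least-squares sense
yfit = polyval(a,x);                % line values at the data points (generally ~= y)

% Polynomial interpolation: noiseless data; the curve passes through every point
xs = linspace(0,5);                 % fine grid for plotting
ys = spline(x,y,xs);                % cubic spline interpolant through all data points

plot(x,y,'o',x,yfit,'-',xs,ys,'--')
legend('data','least-squares line','interpolating spline')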

Linear Regression

• Noisy data: y values at known values of x
• Suspect the relationship between x and y is linear
• Assume y ≈ a0 + a1 x
• Determine a0 and a1 that define the “best‐fit” line for the data

• The simplest example is fitting a straight line to a set of paired observations:
(x1, y1), (x2, y2), . . . , (xn, yn). The mathematical expression for the straight line is:

y = a0 + a1 x + e    (1)

a0 and a1 are coefficients representing the intercept and the slope, respectively,
and e is the error, or residual, between the model and the observations.
Rearranging Eq. (1) gives the residual as

e = y - a0 - a1 x    (2)

One strategy for fitting a “best” line through the data would be to minimize the sum
of the residual errors for all the available data, as in

Σ ei = Σ (yi - a0 - a1 xi),  i = 1, ..., n    (3)

n = total number of points
One way to remove the effect of the signs is to minimize the sum of the
squares of the residuals:

Sr = Σ ei^2 = Σ (yi - a0 - a1 xi)^2,  i = 1, ..., n    (4)

The sum of the squares of the residuals is a function of the two fitting
parameters, a0 and a1: Sr(a0, a1).

Minimize Sr by setting its partial derivatives with respect to a0 and a1 to zero
and solving for a0 and a1:

∂Sr/∂a0 = -2 Σ (yi - a0 - a1 xi) = 0
∂Sr/∂a1 = -2 Σ (yi - a0 - a1 xi) xi = 0

Setting these derivatives equal to zero results in a minimum Sr. The equations
can be rearranged as a set of two simultaneous linear equations with two
unknowns (a0 and a1):

n a0 + (Σ xi) a1 = Σ yi    (5)

(Σ xi) a0 + (Σ xi^2) a1 = Σ xi yi    (6)

These are called the normal equations.

The equations can be solved simultaneously:

a1 = (n Σ xi yi - Σ xi Σ yi) / (n Σ xi^2 - (Σ xi)^2)    (7)

a0 = ȳ - a1 x̄    (8)

where x̄ and ȳ are the mean values of x and y, respectively.
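
As a quick check (using made-up data), the closed-form expressions (7) and (8) agree
with solving the normal equations (5) and (6) directly as a 2-by-2 linear system with
MATLAB's backslash operator:

x = [1 2 3 4 5]; y = [2.1 3.9 6.2 7.8 10.1];   % hypothetical data
n = length(x);
sx = sum(x); sy = sum(y); sxy = sum(x.*y); sx2 = sum(x.^2);
% Closed-form solution, Eqs. (7) and (8)
a1 = (n*sxy - sx*sy)/(n*sx2 - sx^2);
a0 = mean(y) - a1*mean(x);
% Same result from the normal equations (5)-(6) solved with backslash
A = [n sx; sx sx2];  b = [sy; sxy];
c = A\b;            % c(1) = a0, c(2) = a1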

Quantification of Error of Linear Regression

Recall that the sum of the squares of the residuals is defined as:

Sr = Σ ei^2 = Σ (yi - a0 - a1 xi)^2,  i = 1, ..., n    (9)

St is the sum of the squares of the discrepancies between the data and a single
estimate of the measure of central tendency, the mean:

St = Σ (yi - ȳ)^2,  i = 1, ..., n    (10)
[Figure: the data points (x1, y1), ..., (xn, yn) with the regression line; each residual is Ei = yi - a0 - a1 xi, and Sr = Σ (yi - a0 - a1 xi)^2.]

[Figure: the same data points with the mean line y = ȳ; the spread about the mean gives St = Σ (yi - ȳ)^2.]
St is the magnitude of the residual error associated with the dependent
variable prior to regression. After performing the regression, we can compute
Sr, the sum of the squares of the residuals around the regression line.

The difference between the two quantities, St - Sr, quantifies the improvement or
error reduction due to describing the data in terms of a straight line rather than
as an average value. Because the magnitude of this quantity is scale
dependent, the difference is normalized to St to yield

r^2 = (St - Sr) / St

r^2 is called the coefficient of determination and r is the correlation coefficient.

For a perfect fit, Sr = 0 and r^2 = 1, signifying that the line explains 100% of the
variability of the data. For r^2 = 0, Sr = St and the fit represents no improvement.
An alternative formulation for r that is more convenient for computer
implementation is

r = (n Σ xi yi - (Σ xi)(Σ yi)) / ( sqrt(n Σ xi^2 - (Σ xi)^2) * sqrt(n Σ yi^2 - (Σ yi)^2) )

function [a, r2] = linregr(x,y)
% linregr: linear regression curve fitting
% [a, r2] = linregr(x,y): Least squares fit of straight
% line to data by solving the normal equations
% input:
% x = independent variable
% y = dependent variable
% output:
% a = vector of slope, a(1), and intercept, a(2)
% r2 = coefficient of determination
n = length(x);
if length(y)~=n, error('x and y must be same length');
end
x = x(:); y = y(:); % convert to column vectors
sx = sum(x); sy = sum(y);
sx2 = sum(x.*x); sxy = sum(x.*y); sy2 = sum(y.*y);
a(1) = (n*sxy-sx*sy)/(n*sx2-sx^2);
a(2) = sy/n-a(1)*sx/n;
r2 = ((n*sxy-sx*sy)/sqrt(n*sx2-sx^2)/sqrt(n*sy2-sy^2))^2;
% create plot of data and best fit line
xp = linspace(min(x),max(x),2);
yp = a(1)*xp+a(2);
plot(x,y,'o',xp,yp)
grid on
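
A brief usage sketch of linregr (the data values here are hypothetical, chosen only
for illustration); it returns the slope/intercept vector and r^2 and also plots the
data with the best-fit line:

>> x = [10 20 30 40 50 60 70 80];
>> y = [25 70 380 550 610 1220 830 1450];
>> [a, r2] = linregr(x,y)   % a(1) = slope, a(2) = intercept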
LINEARIZATION OF NONLINEAR RELATIONSHIPS
Not all data can be explained by a linear relationship to an independent
variable; exponential, power, and saturation-type models are common examples.

In such cases, linearize the fitting equation: transform the nonlinear model into a
form that is linear in its parameters, fit the transformed data with a straight line,
and then convert the line's coefficients back into the parameters of the original
model, as sketched below.
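
As one concrete instance (the specific nonlinear models from the original slides are
not reproduced here, so the exponential form y = alpha*exp(beta*x) is assumed purely
for illustration): taking the natural logarithm gives ln(y) = ln(alpha) + beta*x,
which is linear in x and can be fit with polyfit or linregr:

% Hypothetical data assumed to follow y = alpha*exp(beta*x)
x = [1 2 3 4 5];
y = [2.7 7.3 20.1 54.6 148.4];
% Linearize: ln(y) = ln(alpha) + beta*x, then fit a straight line
p = polyfit(x, log(y), 1);      % p(1) = beta, p(2) = ln(alpha)
beta = p(1);
alpha = exp(p(2));
% Plot the data against the recovered exponential model
xx = linspace(min(x), max(x));
plot(x, y, 'o', xx, alpha*exp(beta*xx))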
Polynomial Regression
The linear least-squares regression can be extended to fit a set of n data
points (x1, y1), …, (xn, yn) with an mth-degree polynomial of the form

y = am x^m + … + a2 x^2 + a1 x + a0 + e

with a total error

E = Σ [yi - (am xi^m + … + a2 xi^2 + a1 xi + a0)]^2,  i = 1, ..., n

The coefficients am, …, a2, a1, a0 are determined such that E is minimized. A
necessary condition for E to attain a minimum is that its partial derivative with
respect to each of these coefficients vanishes, that is:

∂E/∂ak = 0,  k = 0, 1, …, m
Manipulation of these equations yields a system of m + 1 linear equations to be
solved for am, …, a2, a1, a0:

a0 Σ xi^k + a1 Σ xi^(k+1) + … + am Σ xi^(k+m) = Σ xi^k yi,  k = 0, 1, …, m
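
A minimal sketch of assembling and solving this general (m + 1)-by-(m + 1) system in
MATLAB, in the same summation style as the QuadraticRegression function shown later
(the function name PolyRegression is illustrative and not part of the original notes):

function a = PolyRegression(x, y, m)
% PolyRegression fits an mth-degree polynomial by least squares and
% returns a = [a0 a1 ... am] (coefficients in ascending powers).
x = x(:); y = y(:);                  % work with column vectors
A = zeros(m+1); b = zeros(m+1,1);
for k = 0:m                          % one normal equation per coefficient
    for j = 0:m
        A(k+1,j+1) = sum(x.^(k+j)); % matrix entry: sum of xi^(k+j)
    end
    b(k+1) = sum((x.^k).*y);        % right-hand side: sum of xi^k * yi
end
a = (A\b)';                          % solve the linear system for the coefficients
end

For m = 1 this reproduces the linear-regression normal equations (5)-(6), and for
m = 2 it yields the same 3-by-3 system solved by QuadraticRegression below.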

Quadratic Least-Squares Regression

The objective is to fit a set of n data points (x1, y1), …, (xn, yn) with a second-
degree polynomial

y = a2 x^2 + a1 x + a0 + e

with the total error

E = Σ [yi - (a2 xi^2 + a1 xi + a0)]^2,  i = 1, ..., n
The coefficients a2, a1, a0 are determined by solving a system of three linear
equations:

n a0 + (Σ xi) a1 + (Σ xi^2) a2 = Σ yi
(Σ xi) a0 + (Σ xi^2) a1 + (Σ xi^3) a2 = Σ xi yi
(Σ xi^2) a0 + (Σ xi^3) a1 + (Σ xi^4) a2 = Σ xi^2 yi

The user-defined function QuadraticRegression uses the quadratic least-
squares regression approach to find the second-degree polynomial that best
fits a set of data. The coefficients a2, a1, a0 are found by writing the equations
above in matrix form and applying the built-in backslash “\” operator in MATLAB.
The function also produces a plot of the data and the best quadratic polynomial fit.

function [a2, a1, a0] = QuadraticRegression(x,y)
%
% QuadraticRegression uses quadratic least-squares approximation to fit a
% data by a 2nd-degree polynomial in the form y = a2*x^2 + a1*x + a0.
%
% [a2, a1, a0] = QuadraticRegression(x,y), where
%
% x, y are n-dimensional row or column vectors of data,
%
% a2, a1 and a0 are the coefficients that describe the quadratic fit.
%
n = length(x);
Sumx = sum(x); Sumy = sum(y);
Sumx2 = sum(x.^2); Sumx3 = sum(x.^3); Sumx4 = sum(x.^4);
Sumxy = sum(x.*y); Sumx2y = sum(x.*x.*y);
% Form the coefficient matrix and the vector of right-hand sides
A = [n Sumx Sumx2;Sumx Sumx2 Sumx3;Sumx2 Sumx3 Sumx4];
b = [Sumy;Sumxy;Sumx2y];
w = A\b; % Solve for the coefficients
a2 = w(3); a1 = w(2); a0 = w(1);
% Plot the data and the quadratic fit
xx = linspace(x(1),x(end)); % Generate 100 points for plotting purposes
p = zeros(100,1); % Pre-allocate
for i = 1:100
p(i) = a2*xx(i)^2 + a1*xx(i) + a0; % Calculate 100 points
end
plot(x,y,'o')
hold on
plot(xx,p)
end
MATLAB Built-In Function polyfit
A brief description of the MATLAB built-in function polyfit is given as:
POLYFIT Fit polynomial to data.
P = POLYFIT(X,Y,N) finds the coefficients of a polynomial P(X) of
degree N that fits the data Y best in a least-squares sense. P is
a row vector of length N + 1 containing the polynomial
coefficients in descending powers :
P(1)*X^N + P(2)*X^(N-1) + ⋯ + P(N)*X + P(N + 1).

EXAMPLE

Using the polyfit function, find and plot the second-degree polynomial that
best fits the data x = 0, 0.3, 0.6, 0.9, 1.2 and y = 3.6, 4.8, 5.9, 7.6, 10.9.
Apply the user-defined function QuadraticRegression and compare the results.
Solution
>> x = 0:0.3:1.2; y = [3.6 4.8 5.9 7.6 10.9];
>> P = polyfit(x,y,2)
P = 3.8095 1.2286 3.7657 % Coefficients of the second-deg polynomial fit
>> xi = linspace(0,1.2); % Generate 100 points for plotting purposes
>> yi = polyval(P,xi); % Evaluate the polynomial at these points
>> plot(xi,yi,x,y,'o')

Execution of QuadraticRegression yields:


>> [a2, a1, a0] = QuadraticRegression(x,y)
a2 =
3.8095
a1 =
1.2286
a0 =
3.7657
The coefficients of the second-degree polynomial are precisely those returned
by polyfit.

