Lectures On Curve Fitting With Matlab: Mattie/Lectures/Curvefitting PDF
1 Curve fitting
General question: How to approximate a given function with a class of simpler functions.
3. Non-linear LSQ
4. Advanced ...
    x0  x1  x2  ...  xn
    y0  y1  y2  ...  yn

Table 1: The given data points.
The table may consist of some measured data or values yk of a given function at the given
xk-points.
If we know, or if there is reason to believe, that the y-values represent values of a smooth
function (by smooth we mean a function that is at least continuous and has sufficiently
many derivatives for the application), it may be reasonable to look for a function in a
given class of functions that passes through all the data points. This is called interpolation.
In case the measurements are inaccurate, or there are other reasons, such as a large
amount of data, it is often more reasonable to look for the trend of the data instead, and
thus give up the requirement that the model function pass exactly through the data points.
Instead of interpolation we then usually look for a least squares approximation.
Typically the class of simpler functions is taken to be polynomials but other “basis func-
tions” are also possible. For instance periodic data is better approximated by trigono-
metric polynomials.
In addition to linear approximations, where problem parameters appear linearly, there are
natural non-linear models, whose solution requires non-linear optimization techniques.
Interpolation task:
Given n + 1 x- and y-values (Table 1), find a polynomial p of degree at most n, that
satisfies
p(xk ) = yk , k = 0, . . . , n.
>> coeff=polyfit(xdata,ydata,n); % Coefficients of interpolation polynomial, n=length(xdata)-1
>> p=polyval(coeff,xev); % Values of polynomial at the points
>> plot(xdata,ydata,'o',xev,p) % Plot data and interpolation polynomial
The reader is asked to run the commands in Matlab or Octave. Check that the polyno-
mial function passes through the datapoints.
Giving the parameter n of polyfit smaller values, one would get polynomials which
would give lower degree LSQ-approximations to the data. We will return to these in a
moment.
Now we will open the “black box”, showing how to reduce the interpolation problem to
that of solving a linear system of equations. This will also give us basic tools and ideas
for handling more general LSQ-problems beyond polynomials.
Let p(x) = a0 + a1 x + . . . + an xn .
Note: Now we use the more commonly adopted order of polynomial representation. We
will try to make it clear, which order is used, and why we need to use the commands
fliplr, flipud every once in a while.
There are n + 1 unknown coefficients a0, a1, . . . , an, and the known data points give us n + 1
equations p(xk) = yk, k = 0, . . . , n. So it's reasonable to hope for a unique solution.
Let’s start with an example:
    A = | 1  x0  x0^2 |        | y0 |        | a0 |
        | 1  x1  x1^2 | ,  y = | y1 | ,  c = | a1 | .
        | 1  x2  x2^2 |        | y2 |        | a2 |
>> x=[-2;-1;3];y=[1;-2;5]; % Note: y must be a column vector for A\y below
>> A=[ones(size(x)) x x.^2]
A =
1 -2 4
1 -1 1
1 3 9
>> c=A\y
c =
-3.1000
-0.1500
0.9500
How do we evaluate the polynomial at an arbitrary set of points? Well, we have just been
taught the use of polyval, which is of course available here. So polyval(flipud(c),xev)
would do it for the vector xev of points of evaluation. But as we are heading towards
cases beyond polynomials, let’s instead just use standard matrix algebra to turn a linear
combination of vectors into matrix multiplication.
Vandermonde matrix
The structure of the matrix on the right - call it V - is the same as that of our A in the
above Matlab-session, except the x-datavector is replaced by the (usually much longer)
vector t of evaluation points:
[ column 1: ones | column 2: t-points | column 3: (t-points)^2 ].
It is now obvious how to form the interpolation polynomial in the general case of degree
n where we have n + 1 datapoints x0 , . . . , xn
The matrix that is needed for both solving the polynomial coefficients and for computing
the values at selected points has the name Vandermonde matrix, and its general form is:
    V = | 1  x0  ...  x0^n |
        | 1  x1  ...  x1^n |
        | :   :         :  |
        | 1  xn  ...  xn^n |
Note: The above matrix is square; it is non-singular as long as the x-datapoints are
distinct, as discussed below. This is the matrix used to solve for the coefficients (model
parameters) of the interpolation polynomial.
A similar matrix formed replacing the x-data column by the (long) vector of evaluation
points (t-column in the example) is used to compute values of the (polynomial) model at
selected points.
Here’s a continuation of our example showing the evaluation:
Matlab has the function vander, which forms the above matrix with its columns in
the reversed order, as discussed. Try for instance vander(1:5) . (vander accepts its
argument as row or column vector.)
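As a quick sketch (the loop-based construction and variable names are ours, not from the notes), one can build the Vandermonde matrix column by column, lowest degree first, and check it against Matlab's vander with its columns flipped:

```matlab
% Build the Vandermonde matrix by hand, lowest power first
x = [-2; -1; 3];            % the x-data from the session above
n = length(x) - 1;          % degree of the interpolation polynomial
V = ones(length(x), n+1);   % first column: ones
for j = 1:n
    V(:, j+1) = x.^j;       % column j+1 holds x.^j
end
% vander puts the highest power in the first column, hence fliplr
W = fliplr(vander(x));
disp(max(abs(V(:) - W(:)))) % the two matrices agree exactly
```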
Returning to the data of the previous Matlab session:
The reason for Matlab’s “flipping” the columns is the way Matlab presents polynomials
as the vector of coefficients, starting with the most significant one, as discussed above.
Matlab has the function polyval for evaluating a polynomial with the coefficient
vector c. BUT: For the above reason, the coefficient vector has to be “flipped” to start
with the highest degree.
The session continues:
>> t=(-2.5:.1:3.5)'; % Vector of points where polynomial will be evaluated.
>> V=[ones(size(t)) t t.^2];
>> p=V*c;
>> q=polyval(flipud(c),t); % c is a column vector, hence ``ud''
>> max(abs(p-q)) % Difference of the two computations
ans =
8.8818e-16 % ... is of order round off unit
>> eps
ans =
2.2204e-16 % ``machine epsilon''
In mathematics, the basic questions are the existence and uniqueness of a solution. Here
everything boils down to the question of non-singularity of the Vandermonde matrix.
As everyone knows, there is a unique solution provided det(V) ≠ 0 for the Vandermonde
matrix V . It is possible to derive a nice formula for the determinant as a product of
differences of the xk -points, thus showing that it’s different from zero as long as the
xk -points are distinct.
However, we don’t need to do this exercise, as the existence can be proved by a clever
direct construction due to Lagrange (or an alternative one by Newton), and uniqueness
follows from basic properties of polynomials.
All details are shown here: interpolation16.pdf
Thus the method of Lagrange mentioned above, or suitable variations of it, is a better
way to construct the interpolation polynomial.
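For illustration, here is a minimal sketch of Lagrange's construction (the function name lagrange_eval and the product-form evaluation are ours, not from the notes): it evaluates the interpolation polynomial directly from the data, without solving any linear system.

```matlab
function p = lagrange_eval(xdata, ydata, t)
% Evaluate the Lagrange interpolation polynomial at the points t.
% p(t) = sum_k ydata(k)*L_k(t), where L_k equals 1 at xdata(k)
% and 0 at the other datapoints.
n = length(xdata);
p = zeros(size(t));
for k = 1:n
    Lk = ones(size(t));                % build the basis polynomial L_k
    for j = [1:k-1, k+1:n]
        Lk = Lk .* (t - xdata(j)) / (xdata(k) - xdata(j));
    end
    p = p + ydata(k) * Lk;
end
end
```

With the data of the earlier session, lagrange_eval([-2 -1 3], [1 -2 5], 0) returns the constant term -3.1 of the polynomial computed there, as it must.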
2 Briefly: small errors in data cause large errors in results.
Exercises
It is instructive to write the code for the Vandermonde matrix; alternatively you can use
the function vander.
Your first test can be the help example above.
Continue writing a script where you evaluate the polynomial at a reasonably dense set of
points on the interval [min(xdata)-.25, max(xdata)+.25]. Plot the data and the inter-
polation polynomial. Pay attention to the polynomial passing through the datapoints.
Also, pay attention to the order of the coefficient vector/columns of the Vandermonde
matrix.
The term least squares (LSQ) refers to solving an overdetermined system of equations
or an inexactly specified data-fitting task so that the residual errors are minimized in the
Euclidean norm (i.e. the sum of squares of the residuals is minimized). In the case of
an overdetermined linear system it means finding a solution vector c that minimizes the
residual ||y − Ac||, where A is an m × n matrix with m > n and y is the datavector of y-
values. Think of A as a matrix which depends on the xdata (like the Vandermonde
matrix).
Let’s start again by “black box” computing techniques using our old friends polyfit and
polyval.
Recall the calling sequence: c=polyfit(xdata,ydata,n). In case of interpolation, the
degree n = length(xdata) − 1, which gives an exact solution, i.e. makes the residual
error 0. Choosing n smaller we will get the best approximation in the LSQ sense among
polynomials of degree ≤ n.
Here’s a script you can write, run and modify on Matlab or Octave.
Example 1 .
%% Example 1 on LSQ
% example1LSQ.m
clear;close all;format compact
%%
x=linspace(-pi,pi,20); % x-data
y=sin(x); % y-data
c1=polyfit(x,y,1); % coeffs of LSQ-line (name it p1)
c4=polyfit(x,y,4); % coeffs of LSQ-polynomial of deg 4 (p4)
t=linspace(-4,4); % Evaluation points (100)
y1=polyval(c1,t); % Values of polynomial p1 at t-points
y4=polyval(c4,t); % Values of polynomial p4 at t-points
plot(x,y,'o',t,y1,t,y4);grid on;shg
title('LSQ-polynomials')
legend('data','LSQline degree n=1','LSQpoly degree n=4')
1.2.2 Linear LSQ-fit, general basis functions
Choose basis functions φ0, . . . , φn and form the design matrix X with entries
X(i, j) = φj (xi ), i = 0, . . . , m, j = 0, . . . , n.
In case of polynomial approximation the basis functions are the monomials: φj (t) = tj,
and the design matrix X consists of the first n columns of the Vandermonde matrix (the
last n of Matlab’s vander). We know already that the columns of a Vandermonde matrix
are linearly independent on the interval under consideration.
Our assumption m > n means that the design matrix X has more rows than columns, thus
the system of (approximate) equations: Xc ≈ y is overdetermined.
If we multiply the equation from the left by X T , we get the system of equations called
the normal equations:
(X T X)c = X T y,
whose solution c is the LSQ-solution. This can be proved by an orthogonality argument
or by using partial derivatives w.r.t. the parameters. We will skip it here, references ...
***ref***Fo-Ma-Mo p. 201-202, neat, short.
The normal equations are not a reliable way with large data, as the condition number
cond(X T X) is of the order cond(X)2 . In the case of a low degree polynomial the matrix
X T X is small, though, for instance in case of an LSQ-line it is only 2 × 2, no matter how
much data we have. But certainly even a small ill-conditioned matrix can produce poor
results.
The good news is that Matlab’s backslash (\) does the job for us. So the syntax
of solving the approximate equation is the same as solving a square system of linear
equations: c=X\y
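As a small check (a sketch with made-up data; variable names are ours), the backslash solution agrees with polyfit when the design matrix holds the monomial columns:

```matlab
% LSQ line through slightly scattered data: minimize ||y - X*c||
x = (0:5)';                            % x-data (column vector)
y = [0.1; 1.1; 1.9; 3.2; 3.9; 5.1];    % y-data
X = [ones(size(x)) x];                 % design matrix for the model a + b*x
c = X \ y;                             % LSQ solution [a; b]
cp = polyfit(x, y, 1);                 % same fit; polyfit returns [b a]
disp(max(abs(c - flipud(cp(:)))))      % difference is at round-off level
```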
In fact, this algorithm does not just solve the “normal equations” (X T X)c = X T y. Instead
it works in the numerically more reliable way of using the so-called QR-factorization, which
makes a difference especially with large problems. See **ref**Moler LSQ 5.5. The QR-
Factorization.
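Here is a sketch of what the backslash does in spirit (data and variable names are illustrative): the economy-size QR factorization reduces the overdetermined system to a square triangular one.

```matlab
% Solve the LSQ problem via QR: X = Q*R with Q'*Q = I, R upper triangular.
% Then ||y - X*c|| is minimized by solving the square system R*c = Q'*y.
x = (0:5)'; y = [0.1; 1.1; 1.9; 3.2; 3.9; 5.1];
X = [ones(size(x)) x];      % design matrix for an LSQ line
[Q, R] = qr(X, 0);          % economy-size QR: Q is m-by-n, R is n-by-n
c_qr = R \ (Q' * y);        % triangular solve
c_bs = X \ y;               % Matlab's backslash for comparison
disp(max(abs(c_qr - c_bs))) % agreement up to round-off
```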
Example 2 Find a function model of the form F (x) = a cos πx+b sin πx to approximate
the data
x = -1 -.5 0 .5 1
y = -1 0 1 2 1
in the LSQ-sense.
Plot the data and the LSQ-approximation.
Perhaps the model needs some adjustment. Try to include a constant term:
G(x) = a + b cos πx + c sin πx.
Include the plot of G in your figure.
Solution:
It looks like a constant term is needed, let’s do it.
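The session itself is not reproduced here, but it can be sketched as follows (a minimal sketch; variable names are ours): first the two-parameter model F, then the model G with the constant term.

```matlab
% Example 2: LSQ fit of F(x) = a*cos(pi*x) + b*sin(pi*x), then G with a constant
x = (-1:.5:1)'; y = [-1; 0; 1; 2; 1];
XF = [cos(pi*x) sin(pi*x)];                 % design matrix for F
cF = XF \ y;                                % LSQ coefficients [a; b]
XG = [ones(size(x)) cos(pi*x) sin(pi*x)];   % design matrix for G
cG = XG \ y;                                % LSQ coefficients [a; b; c]
t = linspace(-1.25, 1.25, 200)';            % dense evaluation points
F = [cos(pi*t) sin(pi*t)] * cF;
G = [ones(size(t)) cos(pi*t) sin(pi*t)] * cG;
plot(x, y, 'o', t, F, t, G); grid on
legend('data', 'F: a cos + b sin', 'G: with constant term')
```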
Here’s a link to the above Matlab-session run in the “Live editor” mode and exported
into pdf: example2LSQsincosLIVE.pdf
Exercises
x = -1 -.5 0 .5 1
y = -1 0 1 2 1
Example 6