2 Fitting
Module: Fitting
Pre-requisites
Learning goals
To be able to fit given data to a given polynomial and plot the fit.
Fitting
In this module, we use the term fitting to describe the process of finding
the best function that describes the given data. There is more than one
way of defining what the best function is, and different definitions can lead
to different results. However, we do not discuss these details here. Further,
we also do not discuss the mathematical ideas behind the notion of the design
matrix. We refer the interested reader to books on linear regression.
Mol. % of silicon    Lattice constant (Angstroms)    Density (g/cm^3)
100                  5.434                           2.328
87.4                 5.454                           2.72
85.8                 5.461                           2.80
75.7                 5.473                           3.03
57.5                 5.518                           3.62
44.3                 5.549                           3.95
22.9                 5.593                           4.70
15.0                 5.620                           4.86
13.5                 5.613                           4.89
12.6                 5.626
0                    5.657                           5.323
Figure 1: The linear least squares fit for the experimental data on lattice
parameter variation with composition in Si-Ge alloy.
% Plot the experimental data (open circles) and the fit
plot(comp,a,'o',fitx,fity)
% Set the axes values
axis([0 100 5.4 5.7]);
% Label the axes and title the plot; also make the plot square
xlabel "Si (in mol%)"
ylabel "Lattice parameter (in Angstroms)"
title "Composition versus lattice parameter plot"
axis("square")
% Print the plot as an eps file
print -depsc VegardsLawGOCT.eps
Note that the GNU Octave command used for fitting is polyfit. It takes
three arguments: the first is the independent variable (say, x); the second
is the dependent variable (say, y); and the third is the order of the fit.
Setting the third argument to unity gives a linear fit in the least squares
sense. Setting it to n fits the data to an n-th order polynomial, namely,
$y = a_n x^n + a_{n-1} x^{n-1} + \dots + a_1 x + a_0$, and the coefficients
$a_n$ through $a_0$ are returned to the user in that order (highest power
first). In the script above, the coefficient vector is stored as z.
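As a self-contained illustration of polyfit and polyval, the sketch below fits a straight line to the composition and lattice constant columns of the table above. The variable names comp, a, z, fitx and fity follow the plotting commands above. Typing the data inline (instead of loading it from a file) and the 10 mol% grid spacing are assumptions made only so that the snippet runs on its own.

```octave
% Composition (mol% Si) and lattice constant (Angstroms), typed in from
% the table (assumption: inline data instead of a data file)
comp = [100 87.4 85.8 75.7 57.5 44.3 22.9 15.0 13.5 12.6 0];
a    = [5.434 5.454 5.461 5.473 5.518 5.549 5.593 5.620 5.613 5.626 5.657];

% Linear least squares fit: z(1) is the slope, z(2) is the intercept
z = polyfit(comp, a, 1);

% Evaluate the fitted line at equal intervals (grid spacing is arbitrary)
fitx = 0:10:100;
fity = polyval(z, fitx);

printf("slope = %g, intercept = %g\n", z(1), z(2));
```

Changing the third argument of polyfit from 1 to 2 would return three coefficients (for a quadratic) in the same highest-power-first order, and polyval would evaluate them without any other change.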
In the previous section, we discussed fitting a polynomial to the given data. Sometimes, however, the exact form of the polynomial is known, and we wish to fit
the data to that particular polynomial and not just to any polynomial. For example, consider the variation of specific heat with temperature. Depending on
the mechanisms by which a material absorbs heat, the temperature dependence of specific heat will be different. In general, at very low temperatures,
the specific heat has the following dependence on temperature: $C_p = aT + bT^3$,
where a and b are constants. At high temperatures, the specific heat has the
following dependence on temperature: $C_p = a + bT + cT^2 + dT^{-2}$, where
a, b, c and d are constants. More information on the variation of specific
heat with temperature is available in Swalin, listed in the References and further
reading section.
In Table 3 we show the data on the variation of specific heat with temperature
for copper. Note that the data here is for the temperature range 0 to 20 K
which is different from the one that was discussed in the plotting module.
Since the temperature range in question is 0 to 20 K, it can be considered
as the low temperature regime for copper. Unlike in the earlier case, we
know that at low temperatures the variation of the specific heat with
temperature follows a specific polynomial, namely $C_p = aT + bT^3$. We
therefore want to fit the data to this given polynomial and identify the
constants a and b. In this section, we show how to carry out such a
polynomial fit to the given data.
The following GNU Octave script fits the given specific heat data to a polynomial of the type $aT + bT^3$.
Temperature (K)    Specific heat
1                  7.43 x 10^-4
2                  1.77 x 10^-3
3                  3.37 x 10^-3
4                  5.82 x 10^-3
5                  9.43 x 10^-3
6                  1.45 x 10^-2
7                  2.13 x 10^-2
8                  3.01 x 10^-2
9                  4.14 x 10^-2
10                 5.55 x 10^-2
12                 9.36 x 10^-2
14                 1.49 x 10^-1
16                 2.25 x 10^-1
18                 3.28 x 10^-1
20                 4.62 x 10^-1
Figure 2: The polynomial fit for the experimental data on specific heat variation with temperature in copper at low temperatures (in the range 0-20 K).
axis("square")
print -depsc CuSpHeatFit.eps
Running this script shows that the given data fits the polynomial
$C_p = -2.2460 \times 10^{-4}\, T + 5.7411 \times 10^{-5}\, T^3$. The plot
generated using this script is shown in Fig. 2.
Note the use of the so-called design matrix. This is a matrix in which, for
each data point, the independent variable is put in the required polynomial
form: each row corresponds to one data point and each column to one term of
the polynomial. Together with the known values of the dependent variable, it
defines a matrix equation for the unknown coefficients. The least squares
solution of this matrix equation gives the coefficients that are most
consistent with the given data.
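The design matrix idea can be sketched for the low temperature model $C_p = aT + bT^3$ as follows. This is a minimal illustration, not necessarily the script used to produce Fig. 2: the data is the copper table above typed inline (an assumption, so that the snippet needs no data file), and the names X, coef, t and Cfit are chosen here for illustration.

```octave
% Temperature (K) and specific heat of copper, typed in from the table
% (assumption: inline data instead of a data file)
T  = [1 2 3 4 5 6 7 8 9 10 12 14 16 18 20]';
Cp = [7.43e-4 1.77e-3 3.37e-3 5.82e-3 9.43e-3 1.45e-2 2.13e-2 3.01e-2 ...
      4.14e-2 5.55e-2 9.36e-2 1.49e-1 2.25e-1 3.28e-1 4.62e-1]';

% Design matrix: one row per data point, one column per term of the
% model Cp = a*T + b*T^3 (note: no constant column, no T^2 column)
X = [T T.^3];

% Least squares solution of X*coef = Cp; coef(1) is a, coef(2) is b
coef = X \ Cp;

% Evaluate the fitted model at equal intervals for plotting
t = (0:0.5:20)';
Cfit = [t t.^3] * coef;

printf("a = %g, b = %g\n", coef(1), coef(2));
```

With these table values the snippet gives a of about -2.25 x 10^-4 and b of about 5.74 x 10^-5. polyfit cannot be used directly here, because it would also fit the constant and quadratic terms that this model leaves out.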
Self-assessment questions
1. What is the GNU Octave command for fitting given data to a polynomial of the type $y = a_3 x^3 + a_2 x^2 + a_1 x + a_0$?
2. What is the GNU Octave command to calculate values using a given fit
(that is obtained using polyfit)?
3. To fit given data to the polynomial $y = a + b x^5$ one can use polyfit.
True or false?
Exercises
Problem 1. Linear fit
Consider the consumption of copper as a function of temperature in a
reaction. The data is as shown in Table 3. Assume that the consumption is related to temperature through an Arrhenius type of relation,
namely, $h = A \exp\left(-\frac{Q}{RT}\right)$,
where R is the universal gas constant, T is
the absolute temperature, Q is the activation energy for the reaction,
h is the copper consumption (in microns), and A is the pre-exponential
constant. From the given data, estimate Q and A.
Problem 2. Polynomial fit
The data for linear thermal expansion of cubic BN is given in Table 4.
Fit the data to a polynomial and plot the data and the fit.
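Table 3: Consumption of copper as a function of temperature.
Temperature (K)    Copper consumption (microns)
473                0.26
493                0.35
513                0.42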
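Table 4: Linear thermal expansion of cubic BN as a function of temperature.
Temperature (K)    $\Delta a / a_0 \times 10^{3}$
77                 -0.38 ± 0.05
200                -0.28 ± 0.05
297                 0.00
535                +0.44 ± 0.14
668                +0.86 ± 0.14
785                +1.458 ± 0.14
889                +2.02 ± 0.14
1008               +2.66 ± 0.14
1061               +2.96 ± 0.14
1137               +3.60 ± 0.14
1205               +3.90 ± 0.14
1289               +4.40 ± 0.14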
Temperature (K)    Specific heat (J/mol/K)
300                24.45
350                24.88
400                25.25
450                25.61
500                25.91
550                26.21
600                26.48
650                26.73
700                26.99
800                27.48
900                28.04
1000               28.66
1100               29.48
1200               30.53
1250               31.12
1300               32.16
Table 5: The variation of specific heat with temperature in copper in the high
temperature regime (300-1300 K). The data is taken from Heat capacity of
reference materials: Cu and W, G. K. White and S. J. Collocott, J. Phys.
Chem. Ref. Data, 13, pp. 1251-1257, 1984.
Problem 3. Non-linear, polynomial fit
The data for the specific heat of copper at constant pressure in the
temperature range 300-1300 K is given in Table 5. It is known that in this
range, the specific heat of copper is described by the polynomial
$C_p(T) = 22.61 + 6.27 \times 10^{-3}\, T$. Verify that this indeed is so.
Solution to exercises
Solution to Problem 1.
Consider the expression $h = A \exp\left(-\frac{Q}{RT}\right)$. Taking the
logarithm on both sides, one obtains $\log(h) = \log(A) - \frac{Q}{RT}$.
Thus, one can see that a plot of log(h) versus (1/T) is a straight line
with slope $-Q/R$. The slope, multiplied by $-R$ (where R is the universal
gas constant), gives the activation energy (in J/mol), and the intercept,
when exponentiated, gives the pre-exponential constant A. The GNU Octave
script below produces this Arrhenius plot and carries out the fit. By
executing the script, we calculate the activation energy to be 24.252 kJ/mol
and the pre-exponential constant to be 125.87 microns. The Arrhenius plot
that we have fit the data to, along with the data, is shown in Fig. 3.
# CuConsumption.oct
#
# Copyright (C) 2011 Prita Pant and M P Gururajan
#
# This program is free software; you can redistribute it and/or modify
# it under the terms of GNU General Public License as published by
# the Free Software Foundation; either version 3 of the License, or (at
# your option) any later version.
#
# This program is distributed in the hope that it will be useful, but
# WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
# General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program; if not, write to the Free Software Foundation,
# Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301, USA
#
#
% Load the given data from the data file
X = load("CuConsumptionData.dat");
% Read the first column of the data as temperature
T = X(:,1);
% Read the second column of the data as dh
dh = X(:,2);
% Make the relevant x and y parameters; we would like to plot the logarithm
% of dh as a function of (1/T).
x = 1./T;
y = log(dh);
% Use polyfit to fit the data in the least square sense; note that the
% third parameter, which is unity, is what tells Octave that we are
% looking for a linear fit to the data
z = polyfit(x,y,1);
% Use the fit to generate data at equal intervals
fitx = 0.0018:0.0001:0.0022;
fity = polyval(z,fitx);
% Plot the experimental data and the fit
plot(x,y,'o',fitx,fity)
% Set the axes values
axis([0.0018 0.0022 -1.5 0]);
% Label the axes and title the plot; also make the plot square
xlabel "1/T"
ylabel "ln(dh)"
title "Arrhenius plot"
axis("square")
% Print the plot as an eps file
print -depsc CuArrheniusPlot.eps
% From the linear fit data, one can now calculate the activation energy
% in J/mol (the slope of the fit is -Q/R) and the pre-exponential constant
-z(1)*8.314
exp(z(2))
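The same calculation can be checked without the data file. The sketch below types in the three data points of the copper consumption table (assumed here to be temperature in K and consumption in microns) and repeats the linearised fit. The names R, Q and A are introduced here for readability and are not part of the script above.

```octave
% Temperature (K) and copper consumption (microns), typed in from the
% exercise table (assumption: inline data instead of a data file)
T  = [473 493 513];
dh = [0.26 0.35 0.42];

% Linearise the Arrhenius relation: ln(dh) = ln(A) - (Q/R)*(1/T)
x = 1 ./ T;
y = log(dh);
z = polyfit(x, y, 1);

R = 8.314;        % universal gas constant in J/(mol K)
Q = -z(1) * R;    % activation energy in J/mol (slope is -Q/R)
A = exp(z(2));    % pre-exponential constant in microns

printf("Q = %.1f J/mol, A = %.2f microns\n", Q, A);
```

The values obtained this way should agree with the 24.252 kJ/mol and 125.87 microns quoted in the solution.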
Solution to Problem 2.
# ThermalExpansionBN.oct
#
# Copyright (C) 2011 Prita Pant and M P Gururajan
#
# This program is free software; you can redistribute it and/or modify
# it under the terms of GNU General Public License as published by
# the Free Software Foundation; either version 3 of the License, or (at
# your option) any later version.
#
# This program is distributed in the hope that it will be useful, but
# WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
# General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program; if not, write to the Free Software Foundation,
# Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301, USA
#
#
% Load the given data from the data file
X = load("BNalpha.dat");
% Read the first column of the data as temperature
T = X(:,1);
% Read the second column of the data as linear thermal expansion (epsilon)
epsilon = X(:,2);
% Scale the thermal expansion values by a factor of 1000
epsilon = epsilon.*1000;
% Use polyfit to fit the data in the least square sense; note that the
% third parameter, which is two, is what tells Octave that we are
% looking for a quadratic (second order polynomial) fit to the data
z = polyfit(T,epsilon,2);
% Use the fit to generate data at equal intervals
fitx = 0:10:1300;
fity = polyval(z,fitx);
% Plot the experimental data and the fit
plot(T,epsilon,'o',fitx,fity)
% Set the axes values
axis([0 1300 -1000 5000]);
% Label the axes and title the plot; also make the plot square
xlabel "Temperature (in K)"
ylabel "Linear thermal expansion"
title "Temperature versus linear thermal expansion plot"
axis("square")
% Print the plot as an eps file
print -depsc ThermalExpansionBN.eps
Figure 4: The polynomial fit for the experimental data on linear thermal
expansion variation with temperature in Boron Nitride in the temperature
range 77-1289 K.
Solution to Problem 3.
Using the script below, we see that the data can be fit to the polynomial
$C_p(T) = 22.342 + 6.8475 \times 10^{-3}\, T$, as opposed to the given
expression, namely, $C_p(T) = 22.61 + 6.27 \times 10^{-3}\, T$. Thus, we see
that the given data does fit the known $C_p$ expression quite well.
# Solution3.oct
#
# Copyright (C) 2011 Prita Pant and M P Gururajan
#
# This program is free software; you can redistribute it and/or modify
# it under the terms of GNU General Public License as published by
# the Free Software Foundation; either version 3 of the License, or (at
# your option) any later version.
#
# This program is distributed in the hope that it will be useful, but
# WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
# General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program; if not, write to the Free Software Foundation,
# Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301, USA
#
#
% Load the given data from the data file
S = load("CuCpHighTVariation.dat");
% Get the temperature and specific heat from the data file
T = S(:,1);
Cp = S(:,2);
% Plot the experimental data with open circles
plot(T,Cp,'o')
hold on
% Generate the design matrix
X = [ones(size(T)) T];
% Calculate the coefficients a and b which are consistent with the given data
a = X\Cp
% Using the obtained a and b values, generate data points; t must be a
% column vector so that the design matrix has one row per point
t = (300:1:1300)';
Y = [ones(size(t)) t]*a;
% Plot the fit
plot(t,Y,'.-')
% Label the axes
xlabel ("Temperature (in K)");
ylabel ("Specific heat (in J/mol/K)");
% Make the plot square and save the output file
axis("square")
print -depsc CuHighTSpHeatFit.eps
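One way to put a number on "quite well" is to compare the residuals of the fitted line and of the given expression against the data. The sketch below does this with the Table 5 values typed inline (an assumption, so that it runs without the data file). The names c_fit, c_given, rms_fit and rms_given are introduced here for illustration. It also shows that, for this straight-line model, polyfit with order 1 solves the same problem as the design matrix.

```octave
% Temperature (K) and specific heat (J/mol/K), typed in from Table 5
% (assumption: inline data instead of a data file)
T  = [300 350 400 450 500 550 600 650 700 800 900 1000 1100 1200 1250 1300]';
Cp = [24.45 24.88 25.25 25.61 25.91 26.21 26.48 26.73 26.99 27.48 ...
      28.04 28.66 29.48 30.53 31.12 32.16]';

% Least squares line from the design matrix [1 T]
X = [ones(size(T)) T];
c_fit = X \ Cp;

% Coefficients of the expression given in the problem statement
c_given = [22.61; 6.27e-3];

% Root mean square residual of each line against the data
rms_fit   = sqrt(mean((Cp - X*c_fit).^2));
rms_given = sqrt(mean((Cp - X*c_given).^2));

% polyfit with order 1 returns the same line, highest power first
p = polyfit(T, Cp, 1);

printf("fit:   %.3f + %.4e T, rms residual %.3f\n", c_fit(1), c_fit(2), rms_fit);
printf("given: %.3f + %.4e T, rms residual %.3f\n", c_given(1), c_given(2), rms_given);
```

By construction the least squares line has the smaller residual; how much larger the residual of the given expression is indicates how closely the data follows it.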
Figure 5: The polynomial fit for the experimental data on specific heat variation with temperature in copper at temperatures in the range 300-1300 K.
publishers, 1972.
2. Advanced Engineering Mathematics, E. Kreyszig, 8th edition, John Wiley and Sons, 1999.
3. Elementary Numerical Analysis, K. E. Atkinson, 3rd edition, Wiley
India, 2003.