Curve and Data Fitting
Curve Fitting
Chapter Outline
5.1 Introduction
5.2 Least Square Method
5.3 Fitting of Linear Curves
5.4 Fitting of Quadratic Curves
5.5 Fitting of Exponential and Logarithmic Curves
5.1 Introduction
Curve fitting is the process of finding the ‘best-fit’ curve for a given set of data. It is
the representation of the relationship between two variables by means of an algebraic
equation. On the basis of this mathematical equation, predictions can be made in many
statistical problems.
Suppose a set of n points of values (x1, y1), (x2, y2), …, (xn, yn) of the two variables
x and y are given. These values are plotted on a rectangular coordinate system, i.e.,
the xy-plane. The resulting set of points is known as a scatter diagram (Fig. 5.1).
The scatter diagram exhibits the trend and it is possible to visualize a smooth curve
approximating the data. Such a curve is known as an approximating curve.
[Fig. 5.1: two scatter diagrams with axes x and y]
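5.2 Least Square Method
Let y = f(x) be the approximating curve. For a data point P(xi, yi), let Q be the point
on the curve with the same abscissa xi, so that QP = yi - f(xi).
The distance QP is known as deviation, error, or residual and is denoted by di. It may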
be positive, negative, or zero depending upon whether P lies above, below, or on the
curve. Similar residuals or errors corresponding to the remaining (n – 1) points may be
obtained. The sum of squares of residuals, denoted by E, is given as
\[ E = \sum_{i=1}^{n} d_i^2 = \sum_{i=1}^{n} [y_i - f(x_i)]^2 \]
If E = 0 then all the n points will lie on y = f(x). If E ≠ 0, f(x) is chosen such that E is
minimum, i.e., the best fitting curve to the set of points is that for which E is minimum.
This method is known as the least square method. This method does not attempt to
determine the form of the curve y = f (x) but it determines the values of the parameters
of the equation of the curve.
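The quantity E can be illustrated with a minimal Python sketch. The function name `sum_squared_residuals` and the four data points are hypothetical and chosen only for illustration; the sketch compares E for two candidate straight lines.

```python
from typing import Callable, Sequence


def sum_squared_residuals(f: Callable[[float], float],
                          xs: Sequence[float],
                          ys: Sequence[float]) -> float:
    """Return E = sum of [y_i - f(x_i)]^2 over all data points."""
    return sum((y - f(x)) ** 2 for x, y in zip(xs, ys))


# Hypothetical data lying close to the line y = 1 + 2x (not from the text).
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.1, 2.9, 5.2, 6.8]

# E for two candidate curves; the smaller E indicates the better fit.
E_good = sum_squared_residuals(lambda x: 1 + 2 * x, xs, ys)
E_poor = sum_squared_residuals(lambda x: 2 + x, xs, ys)
print(E_good, E_poor)
```

Of the two candidates, the one with the smaller E is the better fit in the least squares sense.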
5.3 Fitting of Linear Curves
Let (xi, yi), i = 1, 2, …, n be the set of n values and let the relation between x and y be
y = a + bx. The constants a and b are selected such that the straight line is the best fit to
the data.
The residual at x = xi is
\[ d_i = y_i - f(x_i) = y_i - (a + bx_i), \quad i = 1, 2, \ldots, n \]
\[
\begin{aligned}
E &= \sum_{i=1}^{n} d_i^2 \\
  &= \sum_{i=1}^{n} [y_i - (a + bx_i)]^2 \\
  &= \sum_{i=1}^{n} (y_i - a - bx_i)^2
\end{aligned}
\]
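For E to be minimum,
(i) $\partial E / \partial a = 0$
\[
\begin{aligned}
\sum_{i=1}^{n} 2(y_i - a - bx_i)(-1) &= 0 \\
\sum_{i=1}^{n} (y_i - a - bx_i) &= 0 \\
\sum_{i=1}^{n} y_i &= a\sum_{i=1}^{n} 1 + b\sum_{i=1}^{n} x_i \\
\sum y &= na + b\sum x
\end{aligned}
\]
(ii) $\partial E / \partial b = 0$
\[
\begin{aligned}
\sum_{i=1}^{n} 2(y_i - a - bx_i)(-x_i) &= 0 \\
\sum_{i=1}^{n} (x_i y_i - ax_i - bx_i^2) &= 0 \\
\sum_{i=1}^{n} x_i y_i &= a\sum_{i=1}^{n} x_i + b\sum_{i=1}^{n} x_i^2 \\
\sum xy &= a\sum x + b\sum x^2
\end{aligned}
\]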
These two equations are known as normal equations. These equations can be solved
simultaneously to give the best values of a and b. The best fitting straight line is
obtained by substituting the values of a and b in the equation y = a + bx.
Example 1
Solution
\[ \sum y = na + b\sum x \tag{1} \]
\[ \sum xy = a\sum x + b\sum x^2 \tag{2} \]
Here, n = 6
x y x^2 xy
1 2.4 1 2.4
2 3 4 6
3 3.6 9 10.8
4 4 16 16
6 5 36 30
8 6 64 48
Example 2
Fit a straight line to the following data. Also, estimate the value of y at
x = 2.5.
x 0 1 2 3 4
y 1 1.8 3.3 4.5 6.3
Solution
\[ \sum y = na + b\sum x \tag{1} \]
\[ \sum xy = a\sum x + b\sum x^2 \tag{2} \]
Here, n = 5
x y x^2 xy
0 1 0 0
1 1.8 1 1.8
2 3.3 4 6.6
3 4.5 9 13.5
4 6.3 16 25.2
At x = 2.5,
y (2.5) = 0.72 + 1.33 (2.5) = 4.045
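The arithmetic behind this estimate can be reconstructed from the table. The block below is a worked sketch: the column sums are computed here from the tabulated values, and a and b follow from the two normal equations.

```latex
\begin{aligned}
\sum x &= 10, \quad \sum y = 16.9, \quad \sum x^2 = 30, \quad \sum xy = 47.1 \\
16.9 &= 5a + 10b \\
47.1 &= 10a + 30b \\
b &= \frac{5(47.1) - 10(16.9)}{5(30) - 10^2} = \frac{66.5}{50} = 1.33 \\
a &= \frac{16.9 - 1.33(10)}{5} = 0.72 \\
y(2.5) &= 0.72 + 1.33(2.5) = 4.045
\end{aligned}
```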
Example 3
Solution
\[ \sum Y = na + b\sum P \tag{1} \]
\[ \sum PY = a\sum P + b\sum P^2 \tag{2} \]
Here, n = 6
P     Y      P^2     PY
100   0.45   10000   45
120   0.55   14400   66
140   0.60   19600   84
160   0.70   25600   112
180   0.80   32400   144
200   0.85   40000   170
ΣP = 900   ΣY = 3.95   ΣP^2 = 142000   ΣPY = 621
Example 4
Fit a straight line to the following data. Also, estimate the value of y at
x = 70.
x 71 68 73 69 67 65 66 67
y 69 72 70 70 68 67 68 64
Solution
Since the values of x and y are large, we shift the origin for x and y to 69 and 67,
respectively.
Let X = x - 69 and Y = y - 67
Let the straight line to be fitted to the data be
Y = a + bX
The normal equations are
\[ \sum Y = na + b\sum X \tag{1} \]
\[ \sum XY = a\sum X + b\sum X^2 \tag{2} \]
Here, n = 8
x    y    X    Y    X^2   XY
71   69    2    2    4     4
68   72   −1    5    1    −5
73   70    4    3   16    12
69   70    0    3    0     0
67   68   −2    1    4    −2
65   67   −4    0   16     0
66   68   −3    1    9    −3
67   64   −2   −3    4     6
ΣX = −6   ΣY = 12   ΣX^2 = 54   ΣXY = 12
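Shifting the origin changes the arithmetic but not the fitted line. The sketch below checks this for the data of this example; the helper `fit_line` is a hypothetical name introduced here, and the back-substitution y = 67 + a + b(x - 69) is an assumption that follows from X = x - 69 and Y = y - 67.

```python
def fit_line(xs, ys):
    """Return (a, b) of the least squares line y = a + b*x via the normal equations."""
    n = len(xs)
    sx, sy = sum(xs), sum(ys)
    sxx = sum(x * x for x in xs)
    sxy = sum(x * y for x, y in zip(xs, ys))
    b = (n * sxy - sx * sy) / (n * sxx - sx * sx)
    a = (sy - b * sx) / n
    return a, b


x = [71, 68, 73, 69, 67, 65, 66, 67]
y = [69, 72, 70, 70, 68, 67, 68, 64]

# Fit in the shifted variables X = x - 69, Y = y - 67.
X = [xi - 69 for xi in x]
Y = [yi - 67 for yi in y]
A, B = fit_line(X, Y)

# Undo the shift: y - 67 = A + B*(x - 69), so y = (67 + A - 69*B) + B*x.
a = 67 + A - 69 * B
b = B

# The same line is obtained by fitting the original values directly.
a_direct, b_direct = fit_line(x, y)

estimate = a + b * 70  # estimated y at x = 70
print(round(a, 4), round(b, 4), round(estimate, 4))
```

Working with the shifted values keeps the sums small, which is why the device is convenient for hand calculation.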
Example 5
Fit a straight line to the following data taking x as the dependent variable.
x 1 3 4 6 8 9 11 14
y 1 2 4 4 5 7 8 9
Solution
If x is considered the dependent variable and y the independent variable, the equation
of the straight line to be fitted to the data is
x = a + by
\[ \sum x = na + b\sum y \tag{1} \]
\[ \sum xy = a\sum y + b\sum y^2 \tag{2} \]
Here, n = 8
x    y   y^2   xy
1    1    1     1
3    2    4     6
4    4   16    16
6    4   16    24
8    5   25    40
9    7   49    63
11   8   64    88
14   9   81   126
Σx = 56   Σy = 40   Σy^2 = 256   Σxy = 364
Substituting these values in Eqs (1) and (2),
56 = 8a + 40b …(3)
364 = 40 a + 256b …(4)
Solving Eqs (3) and (4),
a = − 0.5
b = 1.5
Hence, the required equation of the straight line is
x = - 0.5 + 1.5 y
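The choice of dependent variable matters: the line of x on y is in general not the line of y on x solved for x. A minimal Python sketch, using a hypothetical helper `fit_line` and the data of this example, shows both fits side by side.

```python
def fit_line(us, vs):
    """Return (a, b) of the least squares line v = a + b*u via the normal equations."""
    n = len(us)
    su, sv = sum(us), sum(vs)
    suu = sum(u * u for u in us)
    suv = sum(u * v for u, v in zip(us, vs))
    b = (n * suv - su * sv) / (n * suu - su * su)
    a = (sv - b * su) / n
    return a, b


x = [1, 3, 4, 6, 8, 9, 11, 14]
y = [1, 2, 4, 4, 5, 7, 8, 9]

# x as the dependent variable: x = a + b*y (the case treated in this example).
a_xy, b_xy = fit_line(y, x)

# y as the dependent variable: y = a + b*x (shown only for comparison).
a_yx, b_yx = fit_line(x, y)

print(a_xy, b_xy)
print(a_yx, b_yx)
```

The first fit minimises the squared horizontal deviations and the second the squared vertical deviations, so the two lines coincide only when the points are exactly collinear.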
Example 6
W 50 70 100 120
P = mW + c = c + mW
P    W     W^2     PW
12   50    2500    600
15   70    4900    1050
21   100   10000   2100
25   120   14400   3000
ΣP = 73   ΣW = 340   ΣW^2 = 31800   ΣPW = 6750
EXERCISE 5.1
x 0 5 10 15 20 25
y 12 15 17 22 24 30
t°C 19 25 30 36 40 45 50
R 76 77 79 80 82 83 85
[Ans.: y = 19 + 9.7x]
V 60 65 70 75 80 85 90
5.4 Fitting of Quadratic Curves
Let (xi, yi), i = 1, 2, …, n be the set of n values and let the relation between x and y be
$y = a + bx + cx^2$. The constants a, b, and c are selected such that the parabola is the
best fit to the data. The residual at x = xi is
\[ d_i = y_i - f(x_i) = y_i - (a + bx_i + cx_i^2) \]
\[
\begin{aligned}
E &= \sum_{i=1}^{n} d_i^2 \\
  &= \sum_{i=1}^{n} [y_i - (a + bx_i + cx_i^2)]^2 \\
  &= \sum_{i=1}^{n} (y_i - a - bx_i - cx_i^2)^2
\end{aligned}
\]
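For E to be minimum,
(i) $\partial E / \partial a = 0$
\[
\begin{aligned}
\sum_{i=1}^{n} 2(y_i - a - bx_i - cx_i^2)(-1) &= 0 \\
\sum_{i=1}^{n} (y_i - a - bx_i - cx_i^2) &= 0 \\
\sum_{i=1}^{n} y_i &= a\sum_{i=1}^{n} 1 + b\sum_{i=1}^{n} x_i + c\sum_{i=1}^{n} x_i^2 \\
\sum y &= na + b\sum x + c\sum x^2
\end{aligned}
\]
(ii) $\partial E / \partial b = 0$
\[
\begin{aligned}
\sum_{i=1}^{n} 2(y_i - a - bx_i - cx_i^2)(-x_i) &= 0 \\
\sum_{i=1}^{n} (x_i y_i - ax_i - bx_i^2 - cx_i^3) &= 0 \\
\sum_{i=1}^{n} x_i y_i &= a\sum_{i=1}^{n} x_i + b\sum_{i=1}^{n} x_i^2 + c\sum_{i=1}^{n} x_i^3 \\
\sum xy &= a\sum x + b\sum x^2 + c\sum x^3
\end{aligned}
\]
(iii) $\partial E / \partial c = 0$
\[
\begin{aligned}
\sum_{i=1}^{n} 2(y_i - a - bx_i - cx_i^2)(-x_i^2) &= 0 \\
\sum_{i=1}^{n} (x_i^2 y_i - ax_i^2 - bx_i^3 - cx_i^4) &= 0 \\
\sum_{i=1}^{n} x_i^2 y_i &= a\sum_{i=1}^{n} x_i^2 + b\sum_{i=1}^{n} x_i^3 + c\sum_{i=1}^{n} x_i^4 \\
\sum x^2 y &= a\sum x^2 + b\sum x^3 + c\sum x^4
\end{aligned}
\]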
These equations are known as normal equations. These equations can be solved
simultaneously to give the best values of a, b, and c. The best fitting parabola is obtained by
substituting the values of a, b, and c in the equation $y = a + bx + cx^2$.
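The three normal equations form a 3 × 3 linear system in a, b, and c. The sketch below is one possible way to set it up and solve it in plain Python. The names `solve_linear_system` and `fit_parabola`, the choice of Gaussian elimination, and the sample data are assumptions made for illustration.

```python
def solve_linear_system(matrix, rhs):
    """Solve a small square system by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [list(row) + [r] for row, r in zip(matrix, rhs)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("normal equations are singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    sol = [0.0] * n
    for r in range(n - 1, -1, -1):  # back substitution
        tail = sum(aug[r][c] * sol[c] for c in range(r + 1, n))
        sol[r] = (aug[r][n] - tail) / aug[r][r]
    return sol


def fit_parabola(xs, ys):
    """Return (a, b, c) of the least squares parabola y = a + b*x + c*x^2."""
    s = [sum(x ** k for x in xs) for k in range(5)]  # s[k] = sum of x^k, s[0] = n
    t = [sum((x ** k) * y for x, y in zip(xs, ys)) for k in range(3)]  # sum of x^k * y
    matrix = [[s[i + j] for j in range(3)] for i in range(3)]
    a, b, c = solve_linear_system(matrix, t)
    return a, b, c


# Hypothetical data lying exactly on y = 1 - 2x + 3x^2.
xs = [0, 1, 2, 3, 4]
ys = [1 - 2 * x + 3 * x * x for x in xs]
a, b, c = fit_parabola(xs, ys)
print(a, b, c)
```

At least three distinct x values are needed; otherwise the system is singular and the sketch raises an error.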
Example 1
Estimate y(2.4).
Solution
\[ \sum y = na + b\sum x + c\sum x^2 \tag{1} \]
\[ \sum xy = a\sum x + b\sum x^2 + c\sum x^3 \tag{2} \]
\[ \sum x^2 y = a\sum x^2 + b\sum x^3 + c\sum x^4 \tag{3} \]
Here, n = 4
x   y     x^2   x^3   x^4   xy     x^2 y
1   1.7    1     1     1    1.7    1.7
2   1.8    4     8    16    3.6    7.2
3   2.3    9    27    81    6.9   20.7
4   3.2   16    64   256   12.8   51.2
Σx = 10   Σy = 9   Σx^2 = 30   Σx^3 = 100   Σx^4 = 354   Σxy = 25   Σx^2 y = 80.8
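The remaining steps can be sketched from the sums in the table. Substituting them in Eqs (1), (2), and (3) gives three linear equations; the values of a, b, and c below were obtained by solving that system and can be checked by substitution.

```latex
\begin{aligned}
9 &= 4a + 10b + 30c \\
25 &= 10a + 30b + 100c \\
80.8 &= 30a + 100b + 354c \\
a &= 2, \quad b = -0.5, \quad c = 0.2 \\
y &= 2 - 0.5x + 0.2x^2 \\
y(2.4) &= 2 - 0.5(2.4) + 0.2(2.4)^2 = 2 - 1.2 + 1.152 = 1.952
\end{aligned}
```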
Example 2
Solution
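Let the equation of the least squares quadratic curve be $y = a + bx + cx^2$. The normal
equations are
\[ \sum y = na + b\sum x + c\sum x^2 \tag{1} \]
\[ \sum xy = a\sum x + b\sum x^2 + c\sum x^3 \tag{2} \]
\[ \sum x^2 y = a\sum x^2 + b\sum x^3 + c\sum x^4 \tag{3} \]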
x   y     x^2   x^3   x^4   xy     x^2 y
0   1      0     0     0    0       0
1   1.8    1     1     1    1.8     1.8
2   1.3    4     8    16    2.6     5.2
3   2.5    9    27    81    7.5    22.5
4   6.3   16    64   256   25.2   100.8
Σx = 10   Σy = 12.9   Σx^2 = 30   Σx^3 = 100   Σx^4 = 354   Σxy = 37.1   Σx^2 y = 130.3
Example 3
y 5 12 26 60 97
Also, estimate y at x = 6.
Solution
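Let the equation of the parabola be $y = a + bx + cx^2$. The normal equations are
\[ \sum y = na + b\sum x + c\sum x^2 \tag{1} \]
\[ \sum xy = a\sum x + b\sum x^2 + c\sum x^3 \tag{2} \]
\[ \sum x^2 y = a\sum x^2 + b\sum x^3 + c\sum x^4 \tag{3} \]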
x   y    x^2   x^3   x^4   xy    x^2 y
1   5     1     1     1     5      5
2   12    4     8    16    24     48
3   26    9    27    81    78    234
4   60   16    64   256   240    960
5   97   25   125   625   485   2425
Σx = 15   Σy = 200   Σx^2 = 55   Σx^3 = 225   Σx^4 = 979   Σxy = 832   Σx^2 y = 3672
Example 4
Solution
Let X = x - 5
Y = y - 10
Let the equation of the parabola be $Y = a + bX + cX^2$.
The normal equations are
\[ \sum Y = na + b\sum X + c\sum X^2 \tag{1} \]
\[ \sum XY = a\sum X + b\sum X^2 + c\sum X^3 \tag{2} \]
\[ \sum X^2 Y = a\sum X^2 + b\sum X^3 + c\sum X^4 \tag{3} \]
Here, n = 9
x   y    X    Y    X^2   X^3   X^4   XY   X^2 Y
1   2    −4   −8   16    −64   256   32   −128
2   6    −3   −4    9    −27    81   12    −36
3   7    −2   −3    4     −8    16    6    −12
4   8    −1   −2    1     −1     1    2     −2
5   10    0    0    0      0     0    0      0
6   11    1    1    1      1     1    1      1
7   11    2    1    4      8    16    2      4
8   10    3    0    9     27    81    0      0
9   9     4   −1   16     64   256   −4    −16
Example 5
Solution
\[ \sum y = na + b\sum x^2 \tag{1} \]
\[ \sum x^2 y = a\sum x^2 + b\sum x^4 \tag{2} \]
Here, n = 5
x   y      x^2   x^4   x^2 y
1   1.8     1     1      1.8
2   5.1     4    16     20.4
3   8.9     9    81     80.1
4   14.1   16   256    225.6
5   19.8   25   625    495
Σy = 49.7   Σx^2 = 55   Σx^4 = 979   Σx^2 y = 822.9
Example 6
Solution
\[ \sum xy = a\sum x^2 + b\sum x^3 \tag{1} \]
\[ \sum x^2 y = a\sum x^3 + b\sum x^4 \tag{2} \]
x y x^2 x^3 x^4 xy x^2 y
1 2.51 1 1 1 2.51 2.51
2 5.82 4 8 16 11.64 23.28
3 9.93 9 27 81 29.79 89.37
4 14.84 16 64 256 59.36 237.44
5 20.55 25 125 625 102.75 513.75
6 27.06 36 216 1296 162.36 974.16
EXERCISE 5.2
x −2 −1 0 1 2
2. Fit a curve $y = ax + bx^2$ to the following data:
x −2 −1 0 1 2
3. Fit a parabola $y = a + bx + cx^2$ to the following data:
x 0 2 5 10
y 4 7 6.4 −6
4. Fit a curve $y = a_0 + a_1 x + a_2 x^2$ for the given data:
x 3 5 7 9 11 13
y 2 3 4 6 5 8
5.5 Fitting of Exponential and Logarithmic Curves
Let (xi, yi), i = 1, 2, …, n be the set of n values and let the relation between x and y be
$y = ab^x$.
Taking logarithm on both the sides of the equation $y = ab^x$,
\[ \log_e y = \log_e a + x\log_e b \]
Putting $\log_e y = Y$, $\log_e a = A$, $x = X$ and $\log_e b = B$,
\[ Y = A + BX \]
This is a linear equation in X and Y. The normal equations are
\[ \sum Y = nA + B\sum X \]
\[ \sum XY = A\sum X + B\sum X^2 \]
Solving these equations, A and B, and, hence, a and b can be found. The best fitting
exponential curve is obtained by substituting the values of a and b in the equation
$y = ab^x$.
Similarly, the best fitting curves for the relations $y = ax^b$ and $y = ae^{bx}$ can be
obtained.
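The log transformation reduces the exponential fit to a straight-line fit. A minimal Python sketch follows; the function names `fit_line` and `fit_exponential` and the sample data are hypothetical, and the sketch assumes every y value is positive so that the logarithm exists.

```python
import math


def fit_line(xs, ys):
    """Return (a, b) of the least squares line y = a + b*x via the normal equations."""
    n = len(xs)
    sx, sy = sum(xs), sum(ys)
    sxx = sum(x * x for x in xs)
    sxy = sum(x * y for x, y in zip(xs, ys))
    b = (n * sxy - sx * sy) / (n * sxx - sx * sx)
    a = (sy - b * sx) / n
    return a, b


def fit_exponential(xs, ys):
    """Return (a, b) of y = a * b**x, fitted as Y = A + B*X with Y = log(y), X = x."""
    if any(y <= 0 for y in ys):
        raise ValueError("every y must be positive to take logarithms")
    Y = [math.log(y) for y in ys]
    A, B = fit_line(xs, Y)
    return math.exp(A), math.exp(B)  # a = e^A, b = e^B


# Hypothetical data lying exactly on y = 3 * 2^x.
a, b = fit_exponential([0, 1, 2, 3], [3, 6, 12, 24])
print(a, b)
```

Note that this procedure minimises the squared residuals of log y, not of y itself, so the result can differ slightly from a direct nonlinear least squares fit.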
Example 1
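Solution
\[ y = ab^x \]
Taking logarithm on both the sides,
\[ \log_e y = \log_e a + x\log_e b \]
Putting $\log_e y = Y$, $\log_e a = A$, $x = X$ and $\log_e b = B$,
\[ Y = A + BX \]
The normal equations are
\[ \sum Y = nA + B\sum X \tag{1} \]
\[ \sum XY = A\sum X + B\sum X^2 \tag{2} \]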
Here, n = 8
x y X Y X^2 XY
1 1 1 0.0000 1 0.0000
2 1.2 2 0.1823 4 0.3646
3 1.8 3 0.5878 9 1.7634
4 2.5 4 0.9163 16 3.6652
5 3.6 5 1.2809 25 6.4045
6 4.7 6 1.5476 36 9.2856
7 6.6 7 1.8871 49 13.2097
8 9.1 8 2.2083 64 17.6664
b = 1.3828
Example 2
Fit a curve of the form $y = ab^x$ to the following data by the method of
least squares:
x 1 2 3 4 5 6 7
Solution
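\[ y = ab^x \]
Taking logarithm on both the sides,
\[ \log_e y = \log_e a + x\log_e b \]
Putting $\log_e y = Y$, $\log_e a = A$, $x = X$ and $\log_e b = B$,
\[ Y = A + BX \]
The normal equations are
\[ \sum Y = nA + B\sum X \tag{1} \]
\[ \sum XY = A\sum X + B\sum X^2 \tag{2} \]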
x   y     X   Y        X^2   XY
1   87    1   4.4659    1     4.4659
2   97    2   4.5747    4     9.1494
3   113   3   4.7274    9    14.1822
4   129   4   4.8598   16    19.4392
5   202   5   5.3083   25    26.5415
6   195   6   5.2730   36    31.6380
7   193   7   5.2627   49    36.8389
ΣX = 28   ΣY = 34.4718   ΣX^2 = 140   ΣXY = 142.2551
Substituting these values in Eqs (1) and (2),
34.4718 = 7A + 28 B ...(3)
142.2551 = 28 A + 140 B ...(4)
Solving Eqs (3) and (4),
A = 4.3006
B = 0.156
$\log_e a = A$
$\log_e a = 4.3006$
a = 73.744
$\log_e b = B$
$\log_e b = 0.156$
b = 1.1688
Hence, the required curve is
\[ y = 73.744\,(1.1688)^x \]
Example 3
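Solution
\[ y = ax^b \]
Taking logarithm on both the sides,
\[ \log_e y = \log_e a + b\log_e x \]
Putting $\log_e y = Y$, $\log_e a = A$, $b = B$ and $\log_e x = X$,
\[ Y = A + BX \]
The normal equations are
\[ \sum Y = nA + B\sum X \tag{1} \]
\[ \sum XY = A\sum X + B\sum X^2 \tag{2} \]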
Here, n = 5
x    y     X        Y        X^2      XY
20   22    2.9957   3.0910   8.9742    9.2597
16   41    2.7726   3.7136   7.6873   10.2963
10   120   2.3026   4.7875   5.3019   11.0237
11   89    2.3979   4.4886   5.7499   10.7632
14   56    2.6391   4.0254   6.9648   10.6234
ΣX = 13.1079   ΣY = 20.1061   ΣX^2 = 34.6781   ΣXY = 51.9663
Substituting these values in Eqs (1) and (2),
20.1061 = 5A + 13.1079 B …(3)
Example 4
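Solution
\[ y = ae^{bx} \]
Taking logarithm on both the sides,
\[ \log_e y = \log_e a + bx\log_e e = \log_e a + bx \]
Putting $\log_e y = Y$, $\log_e a = A$, $b = B$ and $x = X$,
\[ Y = A + BX \]
The normal equations are
\[ \sum Y = nA + B\sum X \tag{1} \]
\[ \sum XY = A\sum X + B\sum X^2 \tag{2} \]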
Here, n = 5
x y X Y X^2 XY
1 115 1 4.7449 1 4.7449
3 105 3 4.6539 9 13.9617
5 95 5 4.5539 25 22.7695
7 85 7 4.4427 49 31.0989
9 80 9 4.3820 81 39.438
a = 120.2653
and
b = B = −0.0469
Hence, the required equation of the curve is
\[ y = 120.2653\,e^{-0.0469x} \]
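The values of a and b can be checked numerically from the tabulated x and y. The sketch below uses hypothetical helper names (`fit_line`, `fit_natural_exponential`) and full-precision logarithms, so its output should agree with the values above only up to rounding.

```python
import math


def fit_line(xs, ys):
    """Return (a, b) of the least squares line y = a + b*x via the normal equations."""
    n = len(xs)
    sx, sy = sum(xs), sum(ys)
    sxx = sum(x * x for x in xs)
    sxy = sum(x * y for x, y in zip(xs, ys))
    b = (n * sxy - sx * sy) / (n * sxx - sx * sx)
    a = (sy - b * sx) / n
    return a, b


def fit_natural_exponential(xs, ys):
    """Return (a, b) of y = a * exp(b*x), fitted as log(y) = log(a) + b*x."""
    Y = [math.log(y) for y in ys]  # requires every y > 0
    A, B = fit_line(xs, Y)
    return math.exp(A), B  # a = e^A, b = B


x = [1, 3, 5, 7, 9]
y = [115, 105, 95, 85, 80]
a, b = fit_natural_exponential(x, y)
print(round(a, 2), round(b, 4))
```

Here only the intercept is exponentiated, since b enters the transformed equation directly as the slope.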
Example 5
Fit the exponential curve $y = ae^{bx}$ to the following data:
x 0 2 4 6 8
y 150 63 28 12 5.6
Solution
\[ y = ae^{bx} \]
Taking logarithm on both the sides,
\[ \log_e y = \log_e a + bx\log_e e = \log_e a + bx \]
Putting $\log_e y = Y$, $\log_e a = A$, $b = B$ and $x = X$,
\[ Y = A + BX \]
The normal equations are
\[ \sum Y = nA + B\sum X \tag{1} \]
\[ \sum XY = A\sum X + B\sum X^2 \tag{2} \]
Here, n = 5
x y X Y X^2 XY
0 150 0 5.0106 0 0
2 63 2 4.1431 4 8.2862
4 28 4 3.3322 16 13.3288
6 12 6 2.4849 36 14.9094
8 5.6 8 1.7228 64 13.7824
Example 6
The pressure and volume of a gas are related by the equation $PV^{\gamma} = c$.
Fit this curve to the following data:
P 0.5 1.0 1.5 2.0 2.5 3.0
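Solution
\[ PV^{\gamma} = c \]
Taking logarithm on both the sides,
\[ \log_e P + \gamma\log_e V = \log_e c \]
\[ \log_e V = \frac{1}{\gamma}\log_e c - \frac{1}{\gamma}\log_e P \]
Putting $\log_e V = y$, $\frac{1}{\gamma}\log_e c = a$, $\log_e P = x$, $-\frac{1}{\gamma} = b$,
\[ y = a + bx \]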
P     V      x         y         x^2      xy
0.5   1.62   −0.6931    0.4824   0.4804   −0.3343
1.0   1.00    0         0        0         0
1.5   0.75    0.4055   −0.2877   0.1644   −0.1166
2.0   0.62    0.6931   −0.4780   0.4804   −0.3313
2.5   0.52    0.9163   −0.6539   0.8396   −0.5992
3.0   0.46    1.0986   −0.7765   1.2069   −0.8531
Σx = 2.4204   Σy = −1.7137   Σx^2 = 3.1717   Σxy = −2.2345
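The remaining steps, solving for a and b and converting back to γ and c, can be sketched in Python. The helper `fit_line` is a hypothetical name, and the back-substitution γ = −1/b, c = e^(aγ) is an assumption that follows from the substitutions b = −1/γ and a = (1/γ) log c.

```python
import math


def fit_line(xs, ys):
    """Return (a, b) of the least squares line y = a + b*x via the normal equations."""
    n = len(xs)
    sx, sy = sum(xs), sum(ys)
    sxx = sum(x * x for x in xs)
    sxy = sum(x * y for x, y in zip(xs, ys))
    b = (n * sxy - sx * sy) / (n * sxx - sx * sx)
    a = (sy - b * sx) / n
    return a, b


P = [0.5, 1.0, 1.5, 2.0, 2.5, 3.0]
V = [1.62, 1.00, 0.75, 0.62, 0.52, 0.46]

# Transformed variables: x = log P, y = log V, so that y = a + b*x.
x = [math.log(p) for p in P]
y = [math.log(v) for v in V]
a, b = fit_line(x, y)

# Undo the substitutions b = -1/gamma and a = (1/gamma) * log c.
gamma = -1.0 / b
c = math.exp(a * gamma)
print(round(gamma, 4), round(c, 4))
```

Because the intercept a is close to zero for these data, c is sensitive to rounding in the tabulated logarithms, so small differences from a hand calculation are to be expected.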
EXERCISE 5.3
x 2 3 4 5 6
[Ans.: $y = 100\,(1.2)^x$]
x 0 2 4
y 5.012 10 31.62
[Ans.: $y = 4.642e^{0.46x}$]
x 1 2 3 4
[Ans.: $y = 2.227x^{2.09}$]
4. Estimate γ by fitting the ideal gas law $PV^{\gamma} = c$ to the following data:
V 50 30 20 15 10 5
[Ans.: γ = 1.504]
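Points to Remember
(i) The normal equations for the parabola $y = a + bx + cx^2$ are
\[ \sum y = na + b\sum x + c\sum x^2 \]
\[ \sum xy = a\sum x + b\sum x^2 + c\sum x^3 \]
\[ \sum x^2 y = a\sum x^2 + b\sum x^3 + c\sum x^4 \]
(ii) The normal equations for the curve $y = a + bx^2$ are
\[ \sum y = na + b\sum x^2 \]
\[ \sum x^2 y = a\sum x^2 + b\sum x^4 \]
(iii) The normal equations for the curve $y = ax + bx^2$ are
\[ \sum xy = a\sum x^2 + b\sum x^3 \]
\[ \sum x^2 y = a\sum x^3 + b\sum x^4 \]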
Fitting of Exponential and Logarithmic Curves
For the curve $y = ab^x$, taking logarithm on both the sides of the equation,
\[ \log_e y = \log_e a + x\log_e b \]
Putting $\log_e y = Y$, $\log_e a = A$, $x = X$ and $\log_e b = B$, the normal equations are
\[ \sum Y = nA + B\sum X \]
\[ \sum XY = A\sum X + B\sum X^2 \]
Similarly, the best fitting curves for the relations $y = ax^b$ and $y = ae^{bx}$ can be
obtained.