
Curve Fitting

Applications of numerical methods in science and engineering often involve fitting a curve to data obtained from experiments. In 1601, the German astronomer Johannes Kepler formulated what is now known as the third law of planetary motion,

T = C x^{3/2},   C = 0.199769,

a formula whose coefficient C can be recovered from observational data by the method of least squares. Here T is the orbital period measured in days, while x is the distance from the planet to the sun measured in millions of kilometers.
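As a quick check (using Earth's mean distance to the sun, x ≈ 149.6 million kilometers, a value not given in the original notes): T ≈ 0.199769 · (149.6)^{3/2} ≈ 365.5 days, close to one Earth year.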

Least Squares Line:


Sometimes the data to be approximated are obtained with equipment that is reliable only to three or fewer significant digits of accuracy, in which case interpolation cannot be used successfully. Instead, the best linear approximation of the form y = f(x) = Ax + B is determined; it passes near (though not necessarily through) the data points.

Let {(x_k, y_k)}_{k=1}^{N} be the given data, where the abscissas x_k, k = 1, 2, …, N, are distinct, and let y = f(x) = Ax + B be the least-squares line that approximates the data. Then the quantities

e_k = f(x_k) − y_k,   1 ≤ k ≤ N,

are called the errors, deviations, or residuals.
There are several norms that measure how far the least-squares line y = Ax + B lies from the data. Three norms commonly defined for this purpose are:

Maximum error:          E_∞(f) = max_{1 ≤ k ≤ N} |f(x_k) − y_k|

Average error:          E_1(f) = (1/N) ∑_{k=1}^{N} |f(x_k) − y_k|

Root-mean-square error: E_2(f) = ( (1/N) ∑_{k=1}^{N} |f(x_k) − y_k|² )^{1/2}

Following is an example showing the difference between these errors:

Example:
Consider the linear function y = −1.6x + 8.6 approximating the data
(−1, 10), (0, 9), (1, 7), (2, 5), (3, 4), (4, 3), (5, 0), and (6, −1), as shown in the
following table:
x_k    y_k     f(x_k) = −1.6x_k + 8.6    |e_k|       |e_k|²
−1     10.0           10.2                0.2         0.04
 0      9.0            8.6                0.4         0.16
 1      7.0            7.0                0.0         0.00
 2      5.0            5.4                0.4         0.16
 3      4.0            3.8                0.2         0.04
 4      3.0            2.2                0.8         0.64
 5      0.0            0.6                0.6         0.36
 6     −1.0           −1.0                0.0         0.00
                                     Sum = 2.6   Sum = 1.40

which shows that

E_∞(f) = 0.8
E_1(f) = 2.6/8 = 0.325
E_2(f) = (1.40/8)^{1/2} ≈ 0.41833

(Note that the average and root-mean-square errors divide by the number of points, N = 8.)
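These three norms are easy to check numerically. The following is a minimal Python sketch (not part of the original notes) that recomputes E_∞, E_1, and E_2 for the data and the trial line above:

```python
import math

# Data points and the trial line f(x) = -1.6x + 8.6 from the example above.
points = [(-1, 10.0), (0, 9.0), (1, 7.0), (2, 5.0),
          (3, 4.0), (4, 3.0), (5, 0.0), (6, -1.0)]

def f(x):
    return -1.6 * x + 8.6

residuals = [abs(f(x) - y) for x, y in points]
N = len(points)

E_max = max(residuals)                                # maximum error E_inf
E_avg = sum(residuals) / N                            # average error E_1
E_rms = math.sqrt(sum(r * r for r in residuals) / N)  # root-mean-square error E_2

print(E_max, E_avg, E_rms)  # 0.8  0.325  0.41833...
```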
Remark: A best-fitting line can be determined by minimizing any of the errors discussed earlier. However, it is much easier to minimize E_2(f) computationally, since the sum of squared residuals is a differentiable function of A and B.

The least-squares line is the line y = f(x) = Ax + B that minimizes E_2(f) for the given set of points. That is, if {(x_k, y_k)}_{k=1}^{N} is the set of points with distinct abscissas {x_k}, then to determine the corresponding least-squares line it suffices to minimize

N (E_2(f))² = ∑_{k=1}^{N} |f(x_k) − y_k|².

Following is the theorem that determines the choice of 𝐴 and 𝐵 for the least-
squares line.
Theorem:
Let {(x_k, y_k)}_{k=1}^{N}, with distinct abscissas {x_k}, be the set of points to be approximated by the least-squares line y = f(x) = Ax + B. Then A and B can be determined by solving the following system (called the normal equations):

(∑_{k=1}^{N} x_k²) A + (∑_{k=1}^{N} x_k) B = ∑_{k=1}^{N} x_k y_k

(∑_{k=1}^{N} x_k) A + N B = ∑_{k=1}^{N} y_k

Proof:
Consider

E(A, B) = ∑_{k=1}^{N} (A x_k + B − y_k)² = ∑_{k=1}^{N} d_k²,

where d_k is the vertical distance from the data point (x_k, y_k) to the corresponding point on the least-squares line. Since E(A, B) is a function of the two variables A and B, it is minimized by setting ∂E(A, B)/∂A and ∂E(A, B)/∂B equal to zero, i.e.

∂E(A, B)/∂A = ∑_{k=1}^{N} 2(A x_k + B − y_k)(x_k) = 2 ∑_{k=1}^{N} (A x_k² + B x_k − x_k y_k) = 0

∂E(A, B)/∂B = ∑_{k=1}^{N} 2(A x_k + B − y_k) = 2 ∑_{k=1}^{N} (A x_k + B − y_k) = 0

which shows that

A ∑_{k=1}^{N} x_k² + B ∑_{k=1}^{N} x_k − ∑_{k=1}^{N} x_k y_k = 0

and

A ∑_{k=1}^{N} x_k + N B − ∑_{k=1}^{N} y_k = 0,

which leads to the required system.
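Since this is a 2×2 linear system, it can also be solved in closed form (a standard elimination step, spelled out here for convenience rather than taken from the original notes):

A = ( N ∑_{k=1}^{N} x_k y_k − ∑_{k=1}^{N} x_k ∑_{k=1}^{N} y_k ) / ( N ∑_{k=1}^{N} x_k² − (∑_{k=1}^{N} x_k)² )

B = ( ∑_{k=1}^{N} x_k² ∑_{k=1}^{N} y_k − ∑_{k=1}^{N} x_k ∑_{k=1}^{N} x_k y_k ) / ( N ∑_{k=1}^{N} x_k² − (∑_{k=1}^{N} x_k)² )

The common denominator is nonzero whenever the x_k are distinct and N ≥ 2.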

Example:
Consider the data points (−1,10), (0,9), (1,7), (2,5), (3,4), (4,3), (5,0), (6, −1)
that are to be approximated by the least-squares line y = f(x) = Ax + B. To
determine the coefficients A and B, the required sums are tabulated below:

x_k       y_k       x_k²      x_k y_k
−1        10.0        1        −10
 0         9.0        0          0
 1         7.0        1          7
 2         5.0        4         10
 3         4.0        9         12
 4         3.0       16         12
 5         0.0       25          0
 6        −1.0       36         −6
Sum = 20  Sum = 37  Sum = 92  Sum = 25


which leads to the following system of normal equations:
92𝐴 + 20𝐵 = 25
20𝐴 + 8𝐵 = 37

The values of A and B can be calculated as:


A = −540/336 ≈ −1.6071429

B = 2904/336 ≈ 8.6428571

Thus the required least-squares line is obtained as:

y = −1.6071429x + 8.6428571

[Graph: the data points together with the fitted least-squares line.]
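The coefficients can be verified numerically. Below is a minimal NumPy sketch (not part of the original notes) that assembles and solves the normal equations for this data set:

```python
import numpy as np

# Data points from the example above.
x = np.array([-1, 0, 1, 2, 3, 4, 5, 6], dtype=float)
y = np.array([10, 9, 7, 5, 4, 3, 0, -1], dtype=float)
N = len(x)

# Normal equations:
#   (sum x_k^2) A + (sum x_k) B = sum x_k y_k
#   (sum x_k)   A +      N   B = sum y_k
M = np.array([[np.sum(x**2), np.sum(x)],
              [np.sum(x),    N]])
rhs = np.array([np.sum(x * y), np.sum(y)])

A, B = np.linalg.solve(M, rhs)
print(A, B)  # -1.6071428...  8.6428571...
```

As a cross-check, np.polyfit(x, y, 1) returns the same pair of coefficients.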
