Least-Squares Method
Least-Squares Basics
Any solution must satisfy the first and third equations, which cannot both
be true. A system of equations with no solution is called inconsistent. What
is the meaning of a system with no solutions? Perhaps the coefficients are
slightly inaccurate. In many cases, the number of equations is greater than
the number of unknown variables, making it unlikely that a solution can
satisfy all the equations.
If we choose this “closeness” to mean close in Euclidean distance, there is a
straightforward algorithm for finding the closest x̄. This special x̄ will be
called the least squares solution. We can get a better picture of the failure
of the system to have a solution by writing it in a different way: the matrix
form of the system is Ax = b.
To work with perpendicularity, recall that two vectors are at right angles to
one another if their dot product is zero. For two m-dimensional column
vectors u and v, we can write the dot product solely in terms of matrix
multiplication by

u · v = uᵀv.

The vectors u and v are perpendicular, or orthogonal, if uᵀv = 0, using
ordinary matrix multiplication.
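As a quick numerical illustration, the dot product written as a matrix product uᵀv detects orthogonality; this is a minimal sketch with assumed vectors, using NumPy.

```python
import numpy as np

# Two 3-dimensional column vectors (illustrative values, not from the text).
u = np.array([[1.0], [2.0], [2.0]])
v = np.array([[2.0], [1.0], [-2.0]])

# The dot product written purely as matrix multiplication: u^T v is 1x1.
dot = (u.T @ v).item()
print(dot)  # 1*2 + 2*1 + 2*(-2) = 0, so u and v are orthogonal
```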
Now we return to our search for a formula for x̄. We have established that the
residual b − Ax̄ must be perpendicular to the columns of A; in terms of matrix
multiplication,

Aᵀ(b − Ax̄) = 0.

This gives a system of equations that defines the least squares solution,

AᵀAx̄ = Aᵀb,

called the normal equations.
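Solving the normal equations takes only a few lines of NumPy. This is a minimal sketch; the inconsistent 3×2 system below is an assumed example, not necessarily the one discussed in the text.

```python
import numpy as np

# An assumed inconsistent system: 3 equations, 2 unknowns.
A = np.array([[1.0, 1.0],
              [1.0, -1.0],
              [1.0, 1.0]])
b = np.array([2.0, 1.0, 3.0])

# Form and solve the normal equations A^T A xbar = A^T b.
xbar = np.linalg.solve(A.T @ A, A.T @ b)

# The same solution via NumPy's built-in least squares routine.
xbar_np, *_ = np.linalg.lstsq(A, b, rcond=None)
print(xbar, xbar_np)  # both give the least squares solution
```

For larger or ill-conditioned problems, `np.linalg.lstsq` is preferable in practice, since it avoids forming AᵀA explicitly.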
To measure our success at fitting the data, we calculate the residual of the
least squares solution x̄ as

r = b − Ax̄.

If the residual is the zero vector, then we have solved the original system
Ax = b exactly. If not, the Euclidean length of the residual vector is a
backward error measure of how far x̄ is from being a solution.
There are at least three ways to express the size of the residual. The
Euclidean length of the vector,

||r||2 = √(r1² + r2² + ⋯ + rm²),

the squared error,

SE = r1² + r2² + ⋯ + rm²,

and the root mean squared error,

RMSE = √(SE/m),

are all used to measure the error of the least squares solution. The three
expressions are closely related; namely,

||r||2 = √SE and RMSE = √(SE/m) = ||r||2/√m,

so finding the x̄ that minimizes one minimizes all. For Example 1,
SE = (0.5)² + 0² + (−0.5)² = 0.5, the 2-norm of the error is
||r||2 = √0.5 ≈ 0.707, and RMSE = √(0.5/3) = 1/√6 ≈ 0.408.
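The three error measures can be checked in a few lines of NumPy; a minimal sketch using the Example 1 residual components (0.5, 0, −0.5) from the SE computation above.

```python
import numpy as np

r = np.array([0.5, 0.0, -0.5])  # residual vector from Example 1
m = len(r)

se = np.sum(r**2)        # squared error
norm2 = np.sqrt(se)      # Euclidean length ||r||_2 = sqrt(SE)
rmse = np.sqrt(se / m)   # root mean squared error sqrt(SE/m)

print(se, norm2, rmse)   # 0.5, ~0.707, ~0.408
```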
Example 2
Solve the least squares problem:
We can evaluate the fit by using the statistics defined earlier. The residuals at
the data points are
STEP 2. Force the model to fit the data. Substitute the data points into
the model. Each data point creates an equation whose unknowns are
the parameters, such as c1 and c2 in the line model. This results in a
system Ax = b, where x contains the unknown parameters.
STEP 3. Solve the normal equations. The least squares solution for the
parameters will be found as the solution to the system of normal
equations AᵀAx = Aᵀb.
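The steps above can be sketched in NumPy for the line model y = c1 + c2t; the data points below are assumptions for illustration, not taken from the text.

```python
import numpy as np

# STEP 1 (implied): choose the line model y = c1 + c2*t.
# Assumed data points for this sketch.
t = np.array([1.0, 2.0, 3.0])
y = np.array([2.0, 3.0, 5.0])

# STEP 2: substituting each data point into the model gives one equation
# per point, i.e. the system A c = b with one row [1, t_i] per point.
A = np.column_stack([np.ones_like(t), t])
b = y

# STEP 3: solve the normal equations A^T A c = A^T b.
c = np.linalg.solve(A.T @ A, A.T @ b)
print(c)  # parameters c1, c2 of the best line
```

For these assumed points the solution is c1 = 1/3, c2 = 1.5, i.e. the best line y = 1/3 + 1.5t.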
Example 4
Find the best line and the best parabola for the four data points (−1, 1),
(0, 0), (1, 0), (2, −2) in the figure below.
In accordance with the preceding procedure, we will follow three steps:
(1) Choose the model y = c1 + c2t as before. (2) Forcing the model to fit the
data yields the system

c1 − c2 = 1
c1 = 0
c1 + c2 = 0
c1 + 2c2 = −2.

(3) Solving the normal equations for the coefficients c1 and c2 results in the
best line y = c1 + c2t = 0.2 − 0.9t. The residuals at the data points are
−0.1, −0.2, 0.7, and −0.4, and the error statistics are SE = 0.7,
||r||2 = √0.7 ≈ 0.837, and RMSE = √(0.7/4) ≈ 0.418.
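The residuals and error statistics of the best line follow directly from the earlier definitions; a short NumPy sketch using the four data points and the coefficients c1 = 0.2, c2 = −0.9 from the text.

```python
import numpy as np

# Data points of Example 4 and the best-line parameters from the text.
t = np.array([-1.0, 0.0, 1.0, 2.0])
y = np.array([1.0, 0.0, 0.0, -2.0])
c1, c2 = 0.2, -0.9

r = y - (c1 + c2 * t)        # residuals r_i = y_i - (c1 + c2 t_i)
se = np.sum(r**2)            # squared error
rmse = np.sqrt(se / len(r))  # root mean squared error
print(r, se, rmse)           # residuals -0.1, -0.2, 0.7, -0.4; SE = 0.7
```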
Next, we extend this example by keeping the same four data points but
changing the model. Set y = c1 + c2t + c3t², and substituting the data points
yields the system

c1 − c2 + c3 = 1
c1 = 0
c1 + c2 + c3 = 0
c1 + 2c2 + 4c3 = −2.

Solving the normal equations for the coefficients results in the best parabola
y = c1 + c2t + c3t² = 0.45 − 0.65t − 0.25t². The residuals at the data points
are 0.15, −0.45, 0.45, and −0.15.
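As a numerical check, both fits of Example 4 can be reproduced by solving the normal equations in NumPy with the four data points from the text.

```python
import numpy as np

# The four data points of Example 4.
t = np.array([-1.0, 0.0, 1.0, 2.0])
y = np.array([1.0, 0.0, 0.0, -2.0])

# Best line: design matrix with columns [1, t].
A1 = np.column_stack([np.ones_like(t), t])
line = np.linalg.solve(A1.T @ A1, A1.T @ y)

# Best parabola: design matrix with columns [1, t, t^2].
A2 = np.column_stack([np.ones_like(t), t, t**2])
parab = np.linalg.solve(A2.T @ A2, A2.T @ y)

print(line)   # c1 = 0.2, c2 = -0.9
print(parab)  # c1 = 0.45, c2 = -0.65, c3 = -0.25
```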