
Least Squares Fitting of Data

David Eberly
Geometric Tools, LLC
http://www.geometrictools.com/
Copyright © 1998-2014. All Rights Reserved.
Created: July 15, 1999
Last Modified: February 9, 2008
Contents

1 Linear Fitting of 2D Points of Form (x, f(x))
2 Linear Fitting of nD Points Using Orthogonal Regression
3 Planar Fitting of 3D Points of Form (x, y, f(x, y))
4 Hyperplanar Fitting of nD Points Using Orthogonal Regression
5 Fitting a Circle to 2D Points
6 Fitting a Sphere to 3D Points
7 Fitting an Ellipse to 2D Points
  7.1 Distance From Point to Ellipse
  7.2 Minimization of the Energy Function
8 Fitting an Ellipsoid to 3D Points
  8.1 Distance From Point to Ellipsoid
  8.2 Minimization of the Energy Function
9 Fitting a Paraboloid to 3D Points of the Form (x, y, f(x, y))
This document describes some algorithms for fitting 2D or 3D point sets by linear or quadratic structures using least squares minimization.
1 Linear Fitting of 2D Points of Form (x, f(x))

This is the usual introduction to least squares fit by a line when the data represents measurements where the y-component is assumed to be functionally dependent on the x-component. Given a set of samples $\{(x_i, y_i)\}_{i=1}^m$, determine $A$ and $B$ so that the line $y = Ax + B$ best fits the samples in the sense that the sum of the squared errors between the $y_i$ and the line values $Ax_i + B$ is minimized. Note that the error is measured only in the y-direction.
Define $E(A, B) = \sum_{i=1}^m [(Ax_i + B) - y_i]^2$. This function is nonnegative and its graph is a paraboloid whose vertex occurs when the gradient satisfies $\nabla E = (0, 0)$. This leads to a system of two linear equations in $A$ and $B$ which can be easily solved. Precisely,
$$(0, 0) = \nabla E = 2 \sum_{i=1}^m [(Ax_i + B) - y_i](x_i, 1)$$
and so
$$\begin{bmatrix} \sum_{i=1}^m x_i^2 & \sum_{i=1}^m x_i \\ \sum_{i=1}^m x_i & \sum_{i=1}^m 1 \end{bmatrix} \begin{bmatrix} A \\ B \end{bmatrix} = \begin{bmatrix} \sum_{i=1}^m x_i y_i \\ \sum_{i=1}^m y_i \end{bmatrix} .$$
The solution provides the least squares solution y = Ax + B.
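For readers who want to experiment, here is a minimal sketch of this solve in Python with NumPy. The function name fit_line and the use of numpy.linalg.solve are illustrative choices, not part of the code at the web site.

import numpy as np

def fit_line(x, y):
    # Least squares fit of y = A*x + B with the error measured only in y.
    # Solves the 2x2 normal equations displayed above.
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    m = float(len(x))
    M = np.array([[np.sum(x * x), np.sum(x)],
                  [np.sum(x), m]])
    rhs = np.array([np.sum(x * y), np.sum(y)])
    A, B = np.linalg.solve(M, rhs)
    return A, B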
2 Linear Fitting of nD Points Using Orthogonal Regression

It is also possible to fit a line using least squares where the errors are measured orthogonally to the proposed line rather than measured vertically. The following argument holds for sample points and lines in n dimensions. Let the line be $L(t) = tD + A$ where $D$ is unit length. Define $X_i$ to be the sample points; then
$$X_i = A + d_i D + p_i D_i^\perp$$
where $d_i = D \cdot (X_i - A)$ and $D_i^\perp$ is some unit length vector perpendicular to $D$ with appropriate coefficient $p_i$. Define $Y_i = X_i - A$. The vector from $X_i$ to its projection onto the line is
$$Y_i - d_i D = p_i D_i^\perp .$$
The squared length of this vector is $p_i^2 = (Y_i - d_i D)^2$. The energy function for the least squares minimization is $E(A, D) = \sum_{i=1}^m p_i^2$. Two alternate forms for this function are
$$E(A, D) = \sum_{i=1}^m \left[ Y_i^T \left( I - D D^T \right) Y_i \right]$$
and
$$E(A, D) = D^T \left[ \sum_{i=1}^m \left( (Y_i \cdot Y_i) I - Y_i Y_i^T \right) \right] D = D^T M(A) D .$$
Using the first form of $E$ in the previous equation, take the derivative with respect to $A$ to get
$$\frac{\partial E}{\partial A} = -2 \left( I - D D^T \right) \sum_{i=1}^m Y_i .$$
This partial derivative is zero whenever $\sum_{i=1}^m Y_i = 0$, in which case $A = (1/m) \sum_{i=1}^m X_i$ (the average of the sample points).
Given $A$, the matrix $M(A)$ is determined in the second form of the energy function. The quantity $D^T M(A) D$ is a quadratic form whose minimum is the smallest eigenvalue of $M(A)$. This can be found by standard eigensystem solvers. A corresponding unit length eigenvector $D$ completes our construction of the least squares line.
For n = 2, if $A = (a, b)$, then matrix $M(A)$ is given by
$$M(A) = \left[ \sum_{i=1}^m (x_i - a)^2 + \sum_{i=1}^m (y_i - b)^2 \right] \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} - \begin{bmatrix} \sum_{i=1}^m (x_i - a)^2 & \sum_{i=1}^m (x_i - a)(y_i - b) \\ \sum_{i=1}^m (x_i - a)(y_i - b) & \sum_{i=1}^m (y_i - b)^2 \end{bmatrix} .$$
For n = 3, if $A = (a, b, c)$, then matrix $M(A)$ is given by
$$M(A) = \delta \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix} - \begin{bmatrix} \sum_{i=1}^m (x_i - a)^2 & \sum_{i=1}^m (x_i - a)(y_i - b) & \sum_{i=1}^m (x_i - a)(z_i - c) \\ \sum_{i=1}^m (x_i - a)(y_i - b) & \sum_{i=1}^m (y_i - b)^2 & \sum_{i=1}^m (y_i - b)(z_i - c) \\ \sum_{i=1}^m (x_i - a)(z_i - c) & \sum_{i=1}^m (y_i - b)(z_i - c) & \sum_{i=1}^m (z_i - c)^2 \end{bmatrix}$$
where
$$\delta = \sum_{i=1}^m (x_i - a)^2 + \sum_{i=1}^m (y_i - b)^2 + \sum_{i=1}^m (z_i - c)^2 .$$
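A sketch of the whole construction in Python with NumPy follows. It relies on numpy.linalg.eigh returning eigenvalues in ascending order; the function and variable names are illustrative.

import numpy as np

def fit_line_orthogonal(points):
    # Orthogonal regression line through nD points (rows of 'points').
    # Returns (A, D): A is the average of the points and D is a unit
    # length eigenvector of M(A) for its smallest eigenvalue.
    X = np.asarray(points, dtype=float)
    A = X.mean(axis=0)
    Y = X - A
    # M(A) = (sum of |Y_i|^2) I - sum of Y_i Y_i^T
    M = np.sum(Y * Y) * np.eye(X.shape[1]) - Y.T @ Y
    eigenvalues, eigenvectors = np.linalg.eigh(M)
    D = eigenvectors[:, 0]  # eigh sorts eigenvalues in ascending order
    return A, D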
3 Planar Fitting of 3D Points of Form (x, y, f(x, y))

The assumption is that the z-component of the data is functionally dependent on the x- and y-components. Given a set of samples $\{(x_i, y_i, z_i)\}_{i=1}^m$, determine $A$, $B$, and $C$ so that the plane $z = Ax + By + C$ best fits the samples in the sense that the sum of the squared errors between the $z_i$ and the plane values $Ax_i + By_i + C$ is minimized. Note that the error is measured only in the z-direction.

Define $E(A, B, C) = \sum_{i=1}^m [(Ax_i + By_i + C) - z_i]^2$. This function is nonnegative and its graph is a hyperparaboloid whose vertex occurs when the gradient satisfies $\nabla E = (0, 0, 0)$. This leads to a system of three linear equations in $A$, $B$, and $C$ which can be easily solved. Precisely,
$$(0, 0, 0) = \nabla E = 2 \sum_{i=1}^m [(Ax_i + By_i + C) - z_i](x_i, y_i, 1)$$
and so
$$\begin{bmatrix} \sum_{i=1}^m x_i^2 & \sum_{i=1}^m x_i y_i & \sum_{i=1}^m x_i \\ \sum_{i=1}^m x_i y_i & \sum_{i=1}^m y_i^2 & \sum_{i=1}^m y_i \\ \sum_{i=1}^m x_i & \sum_{i=1}^m y_i & \sum_{i=1}^m 1 \end{bmatrix} \begin{bmatrix} A \\ B \\ C \end{bmatrix} = \begin{bmatrix} \sum_{i=1}^m x_i z_i \\ \sum_{i=1}^m y_i z_i \\ \sum_{i=1}^m z_i \end{bmatrix} .$$
The solution provides the least squares solution z = Ax + By + C.
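A sketch of this solve, in the same style as the line fit of Section 1; names are again illustrative.

import numpy as np

def fit_plane(x, y, z):
    # Least squares fit of z = A*x + B*y + C with the error measured only in z.
    # Solves the 3x3 normal equations displayed above.
    x, y, z = (np.asarray(v, dtype=float) for v in (x, y, z))
    m = float(len(x))
    M = np.array([[np.sum(x * x), np.sum(x * y), np.sum(x)],
                  [np.sum(x * y), np.sum(y * y), np.sum(y)],
                  [np.sum(x), np.sum(y), m]])
    rhs = np.array([np.sum(x * z), np.sum(y * z), np.sum(z)])
    A, B, C = np.linalg.solve(M, rhs)
    return A, B, C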
4 Hyperplanar Fitting of nD Points Using Orthogonal Regression

It is also possible to fit a plane using least squares where the errors are measured orthogonally to the proposed plane rather than measured vertically. The following argument holds for sample points and hyperplanes in n dimensions. Let the hyperplane be $N \cdot (X - A) = 0$ where $N$ is a unit length normal to the hyperplane and $A$ is a point on the hyperplane. Define $X_i$ to be the sample points; then
$$X_i = A + \lambda_i N + p_i N_i^\perp$$
where $\lambda_i = N \cdot (X_i - A)$ and $N_i^\perp$ is some unit length vector perpendicular to $N$ with appropriate coefficient $p_i$. Define $Y_i = X_i - A$. The vector from $X_i$ to its projection onto the hyperplane is $\lambda_i N$. The squared length of this vector is $\lambda_i^2 = (N \cdot Y_i)^2$. The energy function for the least squares minimization is $E(A, N) = \sum_{i=1}^m \lambda_i^2$.
Two alternate forms for this function are
$$E(A, N) = \sum_{i=1}^m \left[ Y_i^T \left( N N^T \right) Y_i \right]$$
and
$$E(A, N) = N^T \left[ \sum_{i=1}^m Y_i Y_i^T \right] N = N^T M(A) N .$$
Using the first form of $E$ in the previous equation, take the derivative with respect to $A$ to get
$$\frac{\partial E}{\partial A} = -2 \left( N N^T \right) \sum_{i=1}^m Y_i .$$
This partial derivative is zero whenever $\sum_{i=1}^m Y_i = 0$, in which case $A = (1/m) \sum_{i=1}^m X_i$ (the average of the sample points).
Given $A$, the matrix $M(A)$ is determined in the second form of the energy function. The quantity $N^T M(A) N$ is a quadratic form whose minimum is the smallest eigenvalue of $M(A)$. This can be found by standard eigensystem solvers. A corresponding unit length eigenvector $N$ completes our construction of the least squares hyperplane.
For n = 3, if $A = (a, b, c)$, then matrix $M(A)$ is given by
$$M(A) = \begin{bmatrix} \sum_{i=1}^m (x_i - a)^2 & \sum_{i=1}^m (x_i - a)(y_i - b) & \sum_{i=1}^m (x_i - a)(z_i - c) \\ \sum_{i=1}^m (x_i - a)(y_i - b) & \sum_{i=1}^m (y_i - b)^2 & \sum_{i=1}^m (y_i - b)(z_i - c) \\ \sum_{i=1}^m (x_i - a)(z_i - c) & \sum_{i=1}^m (y_i - b)(z_i - c) & \sum_{i=1}^m (z_i - c)^2 \end{bmatrix} .$$
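A sketch of the hyperplane construction follows; it differs from the orthogonal line fit only in the matrix $M(A)$ and in which eigenvector is interpreted as direction versus normal. Names are illustrative.

import numpy as np

def fit_hyperplane_orthogonal(points):
    # Orthogonal regression hyperplane through nD points (rows of 'points').
    # Returns (A, N): A is the average of the points and N is a unit
    # length eigenvector of M(A) = sum Y_i Y_i^T for its smallest eigenvalue.
    X = np.asarray(points, dtype=float)
    A = X.mean(axis=0)
    Y = X - A
    M = Y.T @ Y
    eigenvalues, eigenvectors = np.linalg.eigh(M)
    N = eigenvectors[:, 0]  # smallest eigenvalue first
    return A, N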
5 Fitting a Circle to 2D Points

Given a set of points $\{(x_i, y_i)\}_{i=1}^m$, $m \ge 3$, fit them with a circle $(x - a)^2 + (y - b)^2 = r^2$ where $(a, b)$ is the circle center and $r$ is the circle radius. An assumption of this algorithm is that not all the points are collinear. The energy function to be minimized is
$$E(a, b, r) = \sum_{i=1}^m (L_i - r)^2$$
where $L_i = \sqrt{(x_i - a)^2 + (y_i - b)^2}$. Take the partial derivative with respect to $r$ to obtain
$$\frac{\partial E}{\partial r} = -2 \sum_{i=1}^m (L_i - r) .$$
Setting equal to zero yields
$$r = \frac{1}{m} \sum_{i=1}^m L_i .$$
Take the partial derivative with respect to $a$ to obtain
$$\frac{\partial E}{\partial a} = 2 \sum_{i=1}^m (L_i - r) \frac{\partial L_i}{\partial a} = -2 \sum_{i=1}^m \left[ (x_i - a) + r \frac{\partial L_i}{\partial a} \right]$$
and take the partial derivative with respect to $b$ to obtain
$$\frac{\partial E}{\partial b} = 2 \sum_{i=1}^m (L_i - r) \frac{\partial L_i}{\partial b} = -2 \sum_{i=1}^m \left[ (y_i - b) + r \frac{\partial L_i}{\partial b} \right] .$$
Setting these two derivatives equal to zero yields
$$a = \frac{1}{m} \sum_{i=1}^m x_i + r \frac{1}{m} \sum_{i=1}^m \frac{\partial L_i}{\partial a}$$
and
$$b = \frac{1}{m} \sum_{i=1}^m y_i + r \frac{1}{m} \sum_{i=1}^m \frac{\partial L_i}{\partial b} .$$
Replacing $r$ by its equivalent from $\partial E/\partial r = 0$ and using $\partial L_i/\partial a = (a - x_i)/L_i$ and $\partial L_i/\partial b = (b - y_i)/L_i$, we get two nonlinear equations in $a$ and $b$:
$$a = \bar{x} + \bar{L} \bar{L}_a =: F(a, b)$$
$$b = \bar{y} + \bar{L} \bar{L}_b =: G(a, b)$$
where
$$\bar{x} = \frac{1}{m} \sum_{i=1}^m x_i , \quad \bar{y} = \frac{1}{m} \sum_{i=1}^m y_i , \quad \bar{L} = \frac{1}{m} \sum_{i=1}^m L_i , \quad \bar{L}_a = \frac{1}{m} \sum_{i=1}^m \frac{a - x_i}{L_i} , \quad \bar{L}_b = \frac{1}{m} \sum_{i=1}^m \frac{b - y_i}{L_i} .$$
Fixed point iteration can be applied to solving these equations: $a_0 = \bar{x}$, $b_0 = \bar{y}$, and $a_{i+1} = F(a_i, b_i)$ and $b_{i+1} = G(a_i, b_i)$ for $i \ge 0$. Warning: I have not analyzed the convergence properties of this algorithm. In a few experiments it seems to converge just fine.
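A sketch of the fixed point iteration follows. The iteration cap and tolerance are my own illustrative stopping criteria, and the degenerate case of a sample point coinciding with the current center estimate (some $L_i = 0$) is not handled.

import numpy as np

def fit_circle(x, y, iterations=100, tolerance=1e-12):
    # Fixed point iteration a_{i+1} = F(a_i, b_i), b_{i+1} = G(a_i, b_i),
    # starting at the averages (xbar, ybar).
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    a, b = x.mean(), y.mean()
    for _ in range(iterations):
        L = np.sqrt((x - a) ** 2 + (y - b) ** 2)
        a_next = x.mean() + L.mean() * ((a - x) / L).mean()  # F(a, b)
        b_next = y.mean() + L.mean() * ((b - y) / L).mean()  # G(a, b)
        converged = abs(a_next - a) < tolerance and abs(b_next - b) < tolerance
        a, b = a_next, b_next
        if converged:
            break
    r = np.sqrt((x - a) ** 2 + (y - b) ** 2).mean()  # r = Lbar
    return a, b, r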
6 Fitting a Sphere to 3D Points

Given a set of points $\{(x_i, y_i, z_i)\}_{i=1}^m$, $m \ge 4$, fit them with a sphere $(x - a)^2 + (y - b)^2 + (z - c)^2 = r^2$ where $(a, b, c)$ is the sphere center and $r$ is the sphere radius. An assumption of this algorithm is that not all the points are coplanar. The energy function to be minimized is
$$E(a, b, c, r) = \sum_{i=1}^m (L_i - r)^2$$
where $L_i = \sqrt{(x_i - a)^2 + (y_i - b)^2 + (z_i - c)^2}$. Take the partial derivative with respect to $r$ to obtain
$$\frac{\partial E}{\partial r} = -2 \sum_{i=1}^m (L_i - r) .$$
Setting equal to zero yields
$$r = \frac{1}{m} \sum_{i=1}^m L_i .$$
Take the partial derivative with respect to $a$ to obtain
$$\frac{\partial E}{\partial a} = 2 \sum_{i=1}^m (L_i - r) \frac{\partial L_i}{\partial a} = -2 \sum_{i=1}^m \left[ (x_i - a) + r \frac{\partial L_i}{\partial a} \right] ,$$
take the partial derivative with respect to $b$ to obtain
$$\frac{\partial E}{\partial b} = 2 \sum_{i=1}^m (L_i - r) \frac{\partial L_i}{\partial b} = -2 \sum_{i=1}^m \left[ (y_i - b) + r \frac{\partial L_i}{\partial b} \right] ,$$
and take the partial derivative with respect to $c$ to obtain
$$\frac{\partial E}{\partial c} = 2 \sum_{i=1}^m (L_i - r) \frac{\partial L_i}{\partial c} = -2 \sum_{i=1}^m \left[ (z_i - c) + r \frac{\partial L_i}{\partial c} \right] .$$
Setting these three derivatives equal to zero yields
$$a = \frac{1}{m} \sum_{i=1}^m x_i + r \frac{1}{m} \sum_{i=1}^m \frac{\partial L_i}{\partial a} ,$$
$$b = \frac{1}{m} \sum_{i=1}^m y_i + r \frac{1}{m} \sum_{i=1}^m \frac{\partial L_i}{\partial b} ,$$
and
$$c = \frac{1}{m} \sum_{i=1}^m z_i + r \frac{1}{m} \sum_{i=1}^m \frac{\partial L_i}{\partial c} .$$
Replacing $r$ by its equivalent from $\partial E/\partial r = 0$ and using $\partial L_i/\partial a = (a - x_i)/L_i$, $\partial L_i/\partial b = (b - y_i)/L_i$, and $\partial L_i/\partial c = (c - z_i)/L_i$, we get three nonlinear equations in $a$, $b$, and $c$:
$$a = \bar{x} + \bar{L} \bar{L}_a =: F(a, b, c)$$
$$b = \bar{y} + \bar{L} \bar{L}_b =: G(a, b, c)$$
$$c = \bar{z} + \bar{L} \bar{L}_c =: H(a, b, c)$$
where
$$\bar{x} = \frac{1}{m} \sum_{i=1}^m x_i , \quad \bar{y} = \frac{1}{m} \sum_{i=1}^m y_i , \quad \bar{z} = \frac{1}{m} \sum_{i=1}^m z_i , \quad \bar{L} = \frac{1}{m} \sum_{i=1}^m L_i ,$$
$$\bar{L}_a = \frac{1}{m} \sum_{i=1}^m \frac{a - x_i}{L_i} , \quad \bar{L}_b = \frac{1}{m} \sum_{i=1}^m \frac{b - y_i}{L_i} , \quad \bar{L}_c = \frac{1}{m} \sum_{i=1}^m \frac{c - z_i}{L_i} .$$
Fixed point iteration can be applied to solving these equations: $a_0 = \bar{x}$, $b_0 = \bar{y}$, $c_0 = \bar{z}$, and $a_{i+1} = F(a_i, b_i, c_i)$, $b_{i+1} = G(a_i, b_i, c_i)$, and $c_{i+1} = H(a_i, b_i, c_i)$ for $i \ge 0$. Warning: I have not analyzed the convergence properties of this algorithm. In a few experiments it seems to converge just fine.
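The corresponding sketch for the sphere; as with the circle, the stopping criteria are illustrative choices.

import numpy as np

def fit_sphere(x, y, z, iterations=100, tolerance=1e-12):
    # Fixed point iteration on (a, b, c) starting at the averages.
    x, y, z = (np.asarray(v, dtype=float) for v in (x, y, z))
    a, b, c = x.mean(), y.mean(), z.mean()
    for _ in range(iterations):
        L = np.sqrt((x - a) ** 2 + (y - b) ** 2 + (z - c) ** 2)
        a_next = x.mean() + L.mean() * ((a - x) / L).mean()  # F(a, b, c)
        b_next = y.mean() + L.mean() * ((b - y) / L).mean()  # G(a, b, c)
        c_next = z.mean() + L.mean() * ((c - z) / L).mean()  # H(a, b, c)
        converged = max(abs(a_next - a), abs(b_next - b),
                        abs(c_next - c)) < tolerance
        a, b, c = a_next, b_next, c_next
        if converged:
            break
    r = np.sqrt((x - a) ** 2 + (y - b) ** 2 + (z - c) ** 2).mean()
    return a, b, c, r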
7 Fitting an Ellipse to 2D Points

Given a set of points $\{X_i\}_{i=1}^m$, $m \ge 3$, fit them with an ellipse $(X - U)^T R^T D R (X - U) = 1$ where $U$ is the ellipse center, $R$ is an orthonormal matrix representing the ellipse orientation, and $D$ is a diagonal matrix whose diagonal entries represent the reciprocal of the squares of the half-lengths of the axes of the ellipse. An axis-aligned ellipse with center at the origin has equation $(x/a)^2 + (y/b)^2 = 1$. In this setting, $U = (0, 0)$, $R = I$ (the identity matrix), and $D = \mathrm{diag}(1/a^2, 1/b^2)$. The energy function to be minimized is
$$E(U, R, D) = \sum_{i=1}^m L_i^2$$
where $L_i$ is the distance from $X_i$ to the ellipse with the given parameters.

This problem is more difficult than that of fitting circles. The distance $L_i$ requires finding roots to a quartic polynomial. While there are closed form formulas for the roots of a quartic, these formulas are not easily manipulated algebraically or differentiated to produce an algorithm such as the one for a circle. The approach instead is to use an iterative minimizer to compute the minimum of $E$.
7.1 Distance From Point to Ellipse

It is sufficient to solve this problem when the ellipse is axis-aligned. Any other ellipse can be rotated and translated to an axis-aligned ellipse centered at the origin, and the distance can be measured in that system. The basic idea can be found in Graphics Gems IV (an article by John Hart on computing distance between point and ellipsoid).

Let $(u, v)$ be the point in question. Let the ellipse be $(x/a)^2 + (y/b)^2 = 1$. The closest point $(x, y)$ on the ellipse to $(u, v)$ must occur so that $(x - u, y - v)$ is normal to the ellipse. Since an ellipse normal is $\nabla\left((x/a)^2 + (y/b)^2\right) = (2x/a^2, 2y/b^2)$, the orthogonality condition implies that $u - x = t x/a^2$ and $v - y = t y/b^2$ for some $t$. Solving yields $x = a^2 u/(t + a^2)$ and $y = b^2 v/(t + b^2)$. Replacing in the ellipse equation yields
$$\left( \frac{au}{t + a^2} \right)^2 + \left( \frac{bv}{t + b^2} \right)^2 = 1 .$$
Multiplying through by the denominators yields the quartic polynomial
$$F(t) = (t + a^2)^2 (t + b^2)^2 - a^2 u^2 (t + b^2)^2 - b^2 v^2 (t + a^2)^2 = 0 .$$
The largest root $\bar{t}$ of the polynomial corresponds to the closest point on the ellipse.
The largest root can be found by a Newton's iteration scheme. If $(u, v)$ is inside the ellipse, then $t_0 = 0$ is a good initial guess for the iteration. If $(u, v)$ is outside the ellipse, then $t_0 = \max\{a, b\} \sqrt{u^2 + v^2}$ is a good initial guess. The iteration itself is
$$t_{i+1} = t_i - F(t_i)/F'(t_i), \quad i \ge 0 .$$
Some numerical issues need to be addressed. For $(u, v)$ near the coordinate axes, the algorithm is ill-conditioned and you need to handle those cases separately. Also, if $a$ and $b$ are large, then $F(t_i)$ can be quite large. In these cases you might consider uniformly scaling the data to $O(1)$ as floating point numbers first, computing the distance, and then rescaling to get the distance in the original coordinates.
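A sketch of the Newton scheme for the axis-aligned case follows. The fixed iteration count is an illustrative choice, and the ill-conditioned cases near the coordinate axes that the paragraph above warns about are not handled.

import math

def distance_point_to_ellipse(a, b, u, v, iterations=64):
    # Newton's iteration for the largest root of the quartic F(t) above.
    def F(t):
        return ((t + a * a) ** 2 * (t + b * b) ** 2
                - (a * u) ** 2 * (t + b * b) ** 2
                - (b * v) ** 2 * (t + a * a) ** 2)

    def dF(t):
        return (2 * (t + a * a) * (t + b * b) ** 2
                + 2 * (t + a * a) ** 2 * (t + b * b)
                - 2 * (a * u) ** 2 * (t + b * b)
                - 2 * (b * v) ** 2 * (t + a * a))

    inside = (u / a) ** 2 + (v / b) ** 2 <= 1.0
    t = 0.0 if inside else max(a, b) * math.hypot(u, v)  # initial guess
    for _ in range(iterations):
        t = t - F(t) / dF(t)
    x = a * a * u / (t + a * a)  # closest point on the ellipse
    y = b * b * v / (t + b * b)
    return math.hypot(x - u, y - v)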
7.2 Minimization of the Energy Function

TO BE WRITTEN LATER. (The code at the web site uses a variation on Powell's direction set method.)
8 Fitting an Ellipsoid to 3D Points

Given a set of points $\{X_i\}_{i=1}^m$, $m \ge 3$, fit them with an ellipsoid $(X - U)^T R^T D R (X - U) = 1$ where $U$ is the ellipsoid center and $R$ is an orthonormal matrix representing the ellipsoid orientation. The matrix $D$ is a diagonal matrix whose diagonal entries represent the reciprocal of the squares of the half-lengths of the axes of the ellipsoid. An axis-aligned ellipsoid with center at the origin has equation $(x/a)^2 + (y/b)^2 + (z/c)^2 = 1$. In this setting, $U = (0, 0, 0)$, $R = I$ (the identity matrix), and $D = \mathrm{diag}(1/a^2, 1/b^2, 1/c^2)$. The energy function to be minimized is
$$E(U, R, D) = \sum_{i=1}^m L_i^2$$
where $L_i$ is the distance from $X_i$ to the ellipsoid with the given parameters.

This problem is more difficult than that of fitting spheres. The distance $L_i$ requires finding roots to a sixth degree polynomial. There are no closed formulas for the roots of such polynomials. The approach instead is to use an iterative minimizer to compute the minimum of $E$.
8.1 Distance From Point to Ellipsoid

It is sufficient to solve this problem when the ellipsoid is axis-aligned. Any other ellipsoid can be rotated and translated to an axis-aligned ellipsoid centered at the origin, and the distance can be measured in that system. The basic idea can be found in Graphics Gems IV (an article by John Hart on computing distance between point and ellipsoid).

Let $(u, v, w)$ be the point in question. Let the ellipsoid be $(x/a)^2 + (y/b)^2 + (z/c)^2 = 1$. The closest point $(x, y, z)$ on the ellipsoid to $(u, v, w)$ must occur so that $(x - u, y - v, z - w)$ is normal to the ellipsoid. Since an ellipsoid normal is $\nabla\left((x/a)^2 + (y/b)^2 + (z/c)^2\right) = (2x/a^2, 2y/b^2, 2z/c^2)$, the orthogonality condition implies that $u - x = t x/a^2$, $v - y = t y/b^2$, and $w - z = t z/c^2$ for some $t$. Solving yields $x = a^2 u/(t + a^2)$, $y = b^2 v/(t + b^2)$, and $z = c^2 w/(t + c^2)$. Replacing in the ellipsoid equation yields
$$\left( \frac{au}{t + a^2} \right)^2 + \left( \frac{bv}{t + b^2} \right)^2 + \left( \frac{cw}{t + c^2} \right)^2 = 1 .$$
Multiplying through by the denominators yields the sixth degree polynomial
$$F(t) = (t + a^2)^2 (t + b^2)^2 (t + c^2)^2 - a^2 u^2 (t + b^2)^2 (t + c^2)^2 - b^2 v^2 (t + a^2)^2 (t + c^2)^2 - c^2 w^2 (t + a^2)^2 (t + b^2)^2 = 0 .$$
The largest root $\bar{t}$ of the polynomial corresponds to the closest point on the ellipsoid.
The largest root can be found by a Newton's iteration scheme. If $(u, v, w)$ is inside the ellipsoid, then $t_0 = 0$ is a good initial guess for the iteration. If $(u, v, w)$ is outside the ellipsoid, then $t_0 = \max\{a, b, c\} \sqrt{u^2 + v^2 + w^2}$ is a good initial guess. The iteration itself is
$$t_{i+1} = t_i - F(t_i)/F'(t_i), \quad i \ge 0 .$$
Some numerical issues need to be addressed. For $(u, v, w)$ near the coordinate planes, the algorithm is ill-conditioned and you need to handle those cases separately. Also, if $a$, $b$, and $c$ are large, then $F(t_i)$ can be quite large. In these cases you might consider uniformly scaling the data to $O(1)$ as floating point numbers first, computing the distance, and then rescaling to get the distance in the original coordinates.
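The corresponding sketch for the ellipsoid differs from the ellipse case only in the polynomial and its derivative; the same caveats about the fixed iteration count and the unhandled ill-conditioned cases apply.

import math

def distance_point_to_ellipsoid(a, b, c, u, v, w, iterations=64):
    # Newton's iteration for the largest root of the sixth degree F(t) above.
    def F(t):
        qa, qb, qc = t + a * a, t + b * b, t + c * c
        return (qa * qa * qb * qb * qc * qc
                - (a * u) ** 2 * qb * qb * qc * qc
                - (b * v) ** 2 * qa * qa * qc * qc
                - (c * w) ** 2 * qa * qa * qb * qb)

    def dF(t):
        qa, qb, qc = t + a * a, t + b * b, t + c * c
        return (2 * qa * qb * qb * qc * qc + 2 * qa * qa * qb * qc * qc
                + 2 * qa * qa * qb * qb * qc
                - (a * u) ** 2 * (2 * qb * qc * qc + 2 * qb * qb * qc)
                - (b * v) ** 2 * (2 * qa * qc * qc + 2 * qa * qa * qc)
                - (c * w) ** 2 * (2 * qa * qb * qb + 2 * qa * qa * qb))

    inside = (u / a) ** 2 + (v / b) ** 2 + (w / c) ** 2 <= 1.0
    t = 0.0 if inside else max(a, b, c) * math.sqrt(u * u + v * v + w * w)
    for _ in range(iterations):
        t = t - F(t) / dF(t)
    x = a * a * u / (t + a * a)  # closest point on the ellipsoid
    y = b * b * v / (t + b * b)
    z = c * c * w / (t + c * c)
    return math.sqrt((x - u) ** 2 + (y - v) ** 2 + (z - w) ** 2)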
8.2 Minimization of the Energy Function

TO BE WRITTEN LATER. (The code at the web site uses a variation on Powell's direction set method.)
9 Fitting a Paraboloid to 3D Points of the Form (x, y, f(x, y))

Given a set of samples $\{(x_i, y_i, z_i)\}_{i=1}^m$ and assuming that the true values lie on a paraboloid
$$z = f(x, y) = p_1 x^2 + p_2 xy + p_3 y^2 + p_4 x + p_5 y + p_6 = P \cdot Q(x, y)$$
where $P = (p_1, p_2, p_3, p_4, p_5, p_6)$ and $Q(x, y) = (x^2, xy, y^2, x, y, 1)$, select $P$ to minimize the sum of squared errors
$$E(P) = \sum_{i=1}^m (P \cdot Q_i - z_i)^2$$
where $Q_i = Q(x_i, y_i)$. The minimum occurs when the gradient of $E$ is the zero vector,
$$\nabla E = 2 \sum_{i=1}^m (P \cdot Q_i - z_i) Q_i = 0 .$$
Some algebra converts this to a system of 6 equations in 6 unknowns:
$$\left( \sum_{i=1}^m Q_i Q_i^T \right) P = \sum_{i=1}^m z_i Q_i .$$
The product $Q_i Q_i^T$ is a product of the $6 \times 1$ matrix $Q_i$ with the $1 \times 6$ matrix $Q_i^T$, the result being a $6 \times 6$ matrix.
Define the $6 \times 6$ symmetric matrix $A = \sum_{i=1}^m Q_i Q_i^T$ and the $6 \times 1$ vector $B = \sum_{i=1}^m z_i Q_i$. The choice for $P$ is the solution to the linear system of equations $AP = B$. The entries of $A$ and $B$ indicate summations over the appropriate product of variables. For example, $s(x^3 y) = \sum_{i=1}^m x_i^3 y_i$:
$$\begin{bmatrix}
s(x^4) & s(x^3 y) & s(x^2 y^2) & s(x^3) & s(x^2 y) & s(x^2) \\
s(x^3 y) & s(x^2 y^2) & s(x y^3) & s(x^2 y) & s(x y^2) & s(xy) \\
s(x^2 y^2) & s(x y^3) & s(y^4) & s(x y^2) & s(y^3) & s(y^2) \\
s(x^3) & s(x^2 y) & s(x y^2) & s(x^2) & s(xy) & s(x) \\
s(x^2 y) & s(x y^2) & s(y^3) & s(xy) & s(y^2) & s(y) \\
s(x^2) & s(xy) & s(y^2) & s(x) & s(y) & s(1)
\end{bmatrix}
\begin{bmatrix} p_1 \\ p_2 \\ p_3 \\ p_4 \\ p_5 \\ p_6 \end{bmatrix}
=
\begin{bmatrix} s(z x^2) \\ s(z x y) \\ s(z y^2) \\ s(z x) \\ s(z y) \\ s(z) \end{bmatrix}$$
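A sketch of this solve follows. Rather than accumulating the sums $s(\cdot)$ entry by entry, it forms the $m \times 6$ matrix whose rows are $Q_i^T$ and lets matrix products produce $A$ and $B$; the names are illustrative.

import numpy as np

def fit_paraboloid(x, y, z):
    # Builds A = sum Q_i Q_i^T and B = sum z_i Q_i, then solves A P = B.
    x, y, z = (np.asarray(v, dtype=float) for v in (x, y, z))
    Q = np.column_stack([x * x, x * y, y * y, x, y, np.ones_like(x)])
    A = Q.T @ Q  # 6x6 symmetric matrix of the sums s(...)
    B = Q.T @ z  # 6x1 vector of the sums s(z...)
    return np.linalg.solve(A, B)  # P = (p1, p2, p3, p4, p5, p6)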