
COMP 558 lecture 15, Nov. 3, 2010
We have spent several lectures on how to process images and detect features such as edges and boxes, and estimate image transformations such as translations. Throughout the rest of this course, we will use these image measurements to estimate the parameters of some of the models that we saw in the first part of the course, namely models of external and internal parameters, and scene geometry. We will use several estimation techniques. Today we look at three popular ones: least squares, Hough transforms, and RANSAC.
The ordering of material in these notes is slightly different from the slides. In these notes, the vanishing point example is presented only at the end.
Least squares line fitting
If you have taken a statistics course, then you have seen the following version of the line fitting problem. You have two data variables $x$ and $y$ and you want to fit a linear relationship between samples $(x_i, y_i)$ such that

$$ y_i = m x_i + b . \qquad (1) $$
Typically this model does not fit the data exactly because of noise or other variability and, in particular, the variability is in $y$ only; e.g. the $x$ variable might be chosen by the experimenter (and is not random) and so only the $y$ is random. In this case, one assumes a model

$$ y_i = m x_i + b + n_i $$

where the $n_i$ are noise. To fit the line, you find the $m$ and $b$ that minimize

$$ \sum_i (y_i - m x_i - b)^2 = \sum_i n_i^2 . $$
To do so, you take the partial derivatives with respect to $m$ and $b$ and set them to 0. This gives you two linear equations in the two unknowns $m$ and $b$, with coefficients that depend on the data $x_i$ and $y_i$. You can easily solve these equations to get $m$ and $b$.
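For reference, solving these two normal equations gives the standard closed form (a textbook result, stated here for completeness):

$$ m = \frac{\sum_i (x_i - \bar{x})(y_i - \bar{y})}{\sum_i (x_i - \bar{x})^2} , \qquad b = \bar{y} - m \bar{x} , $$

where $\bar{x}$ and $\bar{y}$ are the sample means of the $x_i$ and the $y_i$.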
Let's look at a slightly different version of the problem which is more common in vision. Again we have a set of points $(x_i, y_i)$ in a 2D plane and we want to fit a line to these points. However, rather than taking our noise or error to be in the $y$ direction only, we take the error to be the distance from $(x_i, y_i)$ to a line

$$ x \cos\theta + y \sin\theta = r \qquad (2) $$

where $\theta$ is the direction of the normal to the line and $r$ is the perpendicular distance from the origin to the line in direction $\theta$. Note Eq. (2) is a more general equation for a line than Eq. (1) since Eq. (2) allows for lines of the form $x = \text{constant}$.
The perpendicular distance from any point $(x_i, y_i)$ in the plane to such a line is

$$ | x_i \cos\theta + y_i \sin\theta - r | . $$
We want to find the $\theta$ and $r$ that minimize the sum of squared distances. (This is called total least squares.)
How do we solve for $\theta$ and $r$? We cannot just use the same method as above, i.e. try to minimize

$$ \sum_i (x_i \cos\theta + y_i \sin\theta - r)^2 $$
directly, because you have the non-linearity of the cosine and sine. (As you vary $\theta$, for any fixed $r$, the sum of squares expression is not a simple quadratic as before.) Instead, we find the values of $a$, $b$ and $r$ that minimize

$$ \sum_i (x_i a + y_i b - r)^2 \qquad (3) $$

subject to the (non-linear) constraint that $a^2 + b^2 = 1$, that is, $(a, b)$ is a unit vector. This means we have a constrained minimization problem.
Taking the derivative of (3) with respect to $r$ and setting it to 0 gives

$$ a \sum_i x_i + b \sum_i y_i = r \sum_i 1 . $$
Letting $\bar{x} = \sum_i x_i / \sum_i 1$ and $\bar{y} = \sum_i y_i / \sum_i 1$ be the sample means, we get

$$ a \bar{x} + b \bar{y} = r , $$

which says that the sample mean $(\bar{x}, \bar{y})$ falls on the solution line. Substituting $r$ into (3) gives

$$ \sum_i \big( (x_i - \bar{x}) a + (y_i - \bar{y}) b \big)^2 . $$
Recall we are trying to minimize this expression, subject to $a^2 + b^2 = 1$. But we can write this expression differently, namely define an $n \times 2$ matrix $A$ whose rows are $(x_i - \bar{x},\; y_i - \bar{y})$. The expression becomes $(a, b)\, A^T A\, (a, b)^T$. For $a^2 + b^2 = 1$, the expression is minimized by the eigenvector of $A^T A$ having the smaller eigenvalue. (Note both eigenvalues are non-negative since $v^T A^T A v \ge 0$ for any $v$.) Thus, our solution amounts to considering all the lines that pass through $(\bar{x}, \bar{y})$, and seeing which perpendicular direction minimizes the sum of squared distances to the points $(x_i, y_i)$.
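As a concrete illustration, here is a minimal NumPy sketch of the eigenvector solution just described (the function name and interface are my own, not from the notes):

import numpy as np

def fit_line_tls(points):
    # Total least squares line fit: returns (a, b, r) with
    # a*x + b*y = r and a^2 + b^2 = 1.
    points = np.asarray(points, dtype=float)
    mean = points.mean(axis=0)       # (x_bar, y_bar), which lies on the line
    A = points - mean                # n x 2 matrix with rows (x_i - x_bar, y_i - y_bar)
    # (a, b) is the eigenvector of A^T A with the smaller eigenvalue;
    # eigh returns eigenvalues in ascending order, so take column 0.
    _, V = np.linalg.eigh(A.T @ A)
    a, b = V[:, 0]
    r = a * mean[0] + b * mean[1]    # from a*x_bar + b*y_bar = r
    return a, b, r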
Inliers and outliers
One problem with least squares techniques is that they (implicitly) assume that the noise distribution is the same for all points. In many situations, though, the situation is more complicated, namely there are some points that obey the model (called inliers) and other points that do not (called outliers). In the line fitting problem, if we are penalizing all points by the squared distances to a candidate line, the outliers can have a huge penalty. This can drive the estimated solution away from the correct solution (that is, the correct solution for the inliers).
Let's now look at a few other methods that are more robust to outliers and that are popular in computer vision, namely the Hough transform and RANSAC.
Hough transform
For each sample point $(x_i, y_i)$ we are penalizing each line $(\theta, r)$ by adding a penalty that is the squared distance from the point to that line. There are other possibilities, though. We could penalize points by their squared distance from the line (or their absolute distance) up to some distance, say $d_{\max}$, and then beyond this distance have a constant error. There are many methods based on this idea and they fall under the name robust statistics. The simplest such approach is to give zero error up to some margin distance and then constant error beyond that. We would then try to find the $(\theta, r)$ that minimizes this error. For example, one could use the simple algorithm:
for each line (theta, r)        // need to choose a sampling (quantization)
    count the number of points (x_i, y_i) that fall outside distance tau from the line
return the (theta, r) pair with the smallest count
The Hough transform does essentially what I just described except that it counts (votes for)
points within the margin, rather than counting the bad points that lie outside the margin.
for each (x_i, y_i){
    for each theta{             // need to choose a sampling
        r := round(x_i cos(theta) + y_i sin(theta))
        vote for (r, theta)
    }
}
return the (theta, r) pair with the most votes
Minor technical point: for a given $(x_i, y_i)$, the line $(r, \theta)$ is identical to the line $(-r, \theta + \pi)$. Thus the votes would always have a certain two-fold symmetry. One would choose the solution with $r > 0$.
The transformation from a set of points $(x_i, y_i)$ to a histogram of votes in the model parameter space (in this case, a 2D grid $(r, \theta)$) is known as the Hough Transform.
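A minimal Python sketch of this voting loop (sampling choices and names are illustrative; restricting $\theta$ to $[0, \pi)$ with a signed $r$ is an equivalent way to break the two-fold symmetry noted above):

import numpy as np

def hough_lines(points, n_theta=180):
    # Vote in (theta, r) space for lines through the given 2D points.
    points = np.asarray(points, dtype=float)
    r_max = int(np.ceil(np.hypot(points[:, 0], points[:, 1]).max()))
    thetas = np.linspace(0.0, np.pi, n_theta, endpoint=False)
    votes = np.zeros((n_theta, 2 * r_max + 1), dtype=int)  # r in [-r_max, r_max]
    for x, y in points:
        for j, theta in enumerate(thetas):
            r = int(round(x * np.cos(theta) + y * np.sin(theta)))
            votes[j, r + r_max] += 1                        # vote for (r, theta)
    j, k = np.unravel_index(votes.argmax(), votes.shape)
    return thetas[j], k - r_max                             # pair with the most votes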
The advantage of Hough is that the outliers have very little effect on the estimate. As long as the outliers don't conspire to vote on some other particular model, the votes of the outliers will be spread out over the $(r, \theta)$ space. Of course, if there really are two good line models present to explain the data, then you will get two peaks and you would need to choose between them, or just take both and conclude there are two models.
RANSAC (Random Sample Consensus)
Let's now consider a second approach which is often used in computer vision when there are lots of outliers and when the number of parameters of the model is more than two. We will see examples later where the number of parameters is 7 or 8, for example, and you want to estimate these parameters precisely. In this case, a Hough transform approach just doesn't work. For now, you can just think of the same problem as above, namely fitting a line model to a set of points.
We have seen that least squares tries to use all of the data to fit a model. This next approach (RANSAC) is the extreme opposite. It fits a model using the minimal number of points possible. For a line, the minimal number of points is 2.
The RANSAC algorithm for line fitting is roughly as follows. (There are several versions.) Sample a large number of point pairs. For each point pair, fit a line model, namely find the unique line passing through the two points. Check all the other points and see how many of them lie near the line, where near means within some distance threshold. These points are called the consensus set for that line model. Repeat this some number of times. Then, choose the model that has the largest consensus set. Use least squares to fit a model to this consensus set and terminate.
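Here is a rough Python sketch of one such version (the trial count, the threshold tau, and all names are illustrative choices, not prescribed by the notes):

import numpy as np

def ransac_line(points, n_trials=100, tau=1.0, seed=None):
    # Returns the largest consensus set found over n_trials 2-point samples.
    rng = np.random.default_rng(seed)
    points = np.asarray(points, dtype=float)
    best = np.zeros(len(points), dtype=bool)
    for _ in range(n_trials):
        i, j = rng.choice(len(points), size=2, replace=False)
        dx, dy = points[j] - points[i]
        norm = np.hypot(dx, dy)
        if norm == 0.0:
            continue                     # degenerate pair; skip this trial
        a, b = -dy / norm, dx / norm     # unit normal of the line through the pair
        r = a * points[i, 0] + b * points[i, 1]
        inliers = np.abs(points[:, 0] * a + points[:, 1] * b - r) < tau
        if inliers.sum() > best.sum():
            best = inliers               # largest consensus set so far
    return points[best]

The final step described above, a least squares fit to the winning consensus set, could reuse a total least squares routine such as the fit_line_tls sketch from earlier.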
At first glance, this algorithm seems a bit nutty, since your intuition is that even if you happen to sample only inliers, each of the points has noise and so the line that passes through these points will be very sensitive to this noise. Surely, you might think, it is better to randomly choose 3 points (or maybe 4?) since you could get a better fit.
The problem with this intuition is that the more points you sample to fit a line, the greater the likelihood that one of the sample points will be an outlier (and if that happens then your fit will be garbage). That means that you are going to have to sample more times in order to find pairs of points that are both inliers.
Let $w$ be the probability that a randomly chosen point is an inlier, so $0 < w < 1$. Suppose we need $n$ points to fit a model; in the case of a line, $n = 2$. Drawing $n$ points is called a trial.
The probability of all $n$ points in a trial being inliers is $w^n$, so the probability that at least one of the $n$ points in a trial is an outlier is $p_o = 1 - w^n$.
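A standard consequence, worth making explicit (this calculation is not in the notes, but it follows directly from $p_o$): $k$ independent trials all contain an outlier with probability $(1 - w^n)^k$, so to get at least one all-inlier trial with probability $P$ you need roughly $k \ge \log(1 - P) / \log(1 - w^n)$ trials. For example:

import math

def trials_needed(w, n, P=0.99):
    # Trials needed so that, with probability at least P, some trial
    # consists entirely of inliers (each point is an inlier with probability w).
    return math.ceil(math.log(1.0 - P) / math.log(1.0 - w ** n))

print(trials_needed(0.5, 2))   # fitting a line with half the points outliers: 17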
RANSAC has a number of parameters that need to be chosen. There are thresholds: you need to decide when a point is close enough to a model to consider it part of the consensus set, and how big the consensus set should be in order for you to consider the model. These parameters are related to each other, which makes matters a bit trickier.
Suppose you have a threshold for deciding whether a point is sufficiently near a model, and you would like to estimate $p$, which is the percentage of points that come within that distance of the true model. For each trial, if all $n$ points are inliers, then when you fit a model to these points, your model should be close to the true model. In turn, of the remaining $N - n$ points, roughly $p(N - n)$ should be inliers, and thus the consensus set should have size roughly $p(N - n)$. I say roughly because the fitted model is based on noisy data.
On the other hand, if one or more of the $n$ points is an outlier, then the fitted model will be junk (e.g. consider the case $n = 2$) and the consensus set will be very small, in particular, much smaller than $pN$. So, after carrying out many trials, we will see that most trials produce a small consensus set, but that some trials produce a large consensus set. The percentage of trials that have a large consensus set will be roughly the percentage of trials in which all samples belong to the model. But the latter is $p^n$, so this gives us an estimate for $p$.
Estimating a vanishing point
Suppose you have run a Canny edge detector or some other edge detector, and so you have a set of image points and orientations $(x, y, \theta)$, where $\theta$ is the direction of the gradient of the image intensity, i.e. perpendicular to the edge. Each such triplet defines a line

$$ (x - x_v,\; y - y_v) \cdot (\cos\theta, \sin\theta) = 0 $$

where $(x_v, y_v)$ is the vanishing point.
Estimating the location of the vanishing point requires estimating the intersection of such lines. This problem would be trivial to solve except that: (1) only some subset of the lines in the 3D scene will be parallel to each other, and so we don't know which image edges to use, and (2) the estimated values of $(x, y, \theta)$ typically are noisy.
Notice that this problem resembles the problem we discussed above of fitting a line to a set of points. There, (1) we talked about inliers and outliers, namely points that belonged to a line and those that did not, and (2) the positions of the points that belonged to a line were noisy. I argued that when the percentage of outliers was non-negligible, a least squares fit on all the data didn't make much sense, and suggested instead that we use a method such as the Hough transform or RANSAC. The same idea holds for estimating vanishing points.
Hough transform approach 1
First assume that the vanishing point lies within the limited field of view of the image. (This is a very strong assumption and in general we don't want to make it. But it is a good place to start.)
Given $(x_i, y_i, \theta_i)$, our model for a line through $(x_i, y_i)$ and perpendicular to $(\cos\theta_i, \sin\theta_i)$ is

$$ (x - x_i,\; y - y_i) \cdot (\cos\theta_i, \sin\theta_i) = 0 . $$

Here is a sketch of an algorithm we could try:
for each edge element (x_i, y_i, theta_i){
    if | cos(theta_i) | < 1/sqrt(2)      // line closer to horizontal: loop over x
        for x in 1:Nx
            y = round( (y_i sin(theta_i) + (x_i - x) cos(theta_i)) / sin(theta_i) )
            if (1 <= y <= Ny)
                count(x,y)++
            endif
        endfor
    else                                 // line closer to vertical: loop over y
        for y in 1:Ny
            x = round( (x_i cos(theta_i) + (y_i - y) sin(theta_i)) / cos(theta_i) )
            if (1 <= x <= Nx)
                count(x,y)++
            endif
        endfor
}
find the (x,y) that maximizes count(x,y)   // there may be several peaks
This algorithm will be very sensitive to noise, and so to make it work you would need to do more. For example, to account for noise in the $\theta_i$ estimate, you could add a loop over a small range of angles centered at $\theta_i$. You could also weight the vote by the inverse distance from $(x_i, y_i)$, since the errors in $\theta_i$ would lead to bigger errors in the location of the line as you move away from $(x_i, y_i)$.
Hough transform approach 2
A second approach is to take pairs of edges $(x_i, y_i, \theta_i)$ and $(x_j, y_j, \theta_j)$, compute the intersection of their lines, and then vote directly on the intersection points.
for each pair of edge elements (x_i, y_i, theta_i) and (x_j, y_j, theta_j)
    if theta_i != theta_j {
        compute the intersection (xv, yv) of the lines containing these edges
        count( round(xv), round(yv) )++
    }
find the (xv, yv) that maximizes count(xv, yv)
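The intersection step is a small 2-by-2 linear solve, since each edge element constrains the vanishing point through $x_v \cos\theta_i + y_v \sin\theta_i = x_i \cos\theta_i + y_i \sin\theta_i$. A minimal Python sketch (the function name is mine):

import numpy as np

def intersect_edge_lines(edge_i, edge_j):
    # Intersect the lines through two edge elements (x, y, theta), each line
    # having normal (cos theta, sin theta). Returns None for (near-)parallel lines.
    (xi, yi, ti), (xj, yj, tj) = edge_i, edge_j
    M = np.array([[np.cos(ti), np.sin(ti)],
                  [np.cos(tj), np.sin(tj)]])
    if abs(np.linalg.det(M)) < 1e-9:
        return None                      # theta_i == theta_j (mod pi)
    rhs = np.array([xi * np.cos(ti) + yi * np.sin(ti),
                    xj * np.cos(tj) + yj * np.sin(tj)])
    return np.linalg.solve(M, rhs)       # (xv, yv)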
Parameterizing $(x_v, y_v)$ using a unit sphere
One key limitation of the above methods is that often the vanishing point does not lie within the image domain. The first algorithm explicitly assumed that the vanishing point was within the range of a fixed $N_x \times N_y$ grid. The second assumed the intersection point is represented by an element in the count matrix. Neither algorithm can handle the case that the vanishing point is at infinity.
The classic solution to this problem is to consider an image projection sphere, rather than a
projection plane. (For example, the back of the eyeball is roughly a sphere.) Place a unit sphere
at the center of projection and parameterize each ray that arrives at the camera center by where
the ray intersects the unit sphere. (This unit sphere of directions is sometimes called the Gaussian
sphere.)
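In code, the mapping onto the Gaussian sphere is just a normalization of the ray direction. The sketch below assumes image coordinates centered on the principal point and a known focal length f, neither of which is specified in the notes:

import numpy as np

def to_gaussian_sphere(x, y, f):
    # Map the image point (x, y) on the plane z = f to the point where its
    # ray through the center of projection meets the unit sphere.
    v = np.array([x, y, f], dtype=float)
    return v / np.linalg.norm(v)   # directions with z near 0 lie near the equator,
                                   # i.e. vanishing points far from the image center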
RANSAC approach
How would you use RANSAC to detect vanishing points? As in Hough approach 2, take two edges and intersect their lines to get a model $(x_v, y_v)$. Then for each of the remaining edges, check the distance from $(x_v, y_v)$ to the line defined by that edge. If the distance is small enough, then the edge is consistent with that model, etc.
At first glance, you might think that this method avoids the problems of the Hough transform, namely that any (finite) vanishing point can be represented and tested. Note, however, that if the vanishing point is far from the origin, then the distance from the vanishing point to any line that is defined by an edge in the image will most likely be very large. This will tend to (incorrectly) favor vanishing points that lie closer to the optical axis. One could reduce this bias again by using the hemisphere, and considering distances on the hemisphere rather than on the projection plane.