
Point Pattern Matching

Pattern Recognition 2017/2018


Marc van Kreveld
Point sets
• Two types:
– actual point set, e.g. epicenters of earthquakes, burglary
locations
– sampling from higher-dimensional features, e.g. a
LiDAR scan of a building
Point pattern matching
• Matching a point set to a model
• Matching parts of a point set to a model
– RANSAC
– Hough transform
• Matching two point patterns to each other
– exact: translation, scaling, rotation
– with imprecision (noise)
– with outliers
Matching a point set to a model
• Examples of models:
– circle (boundary)
– disk (with interior)
– line (in 2D or 3D)
– plane (in 3D)
– rectangle (in 2D or 3D)
– more complex shape
(template)
Matching a point set to a model
• Dimensions of models
(number of parameters to
describe one uniquely):
– circle: 3
– disk: 3
– line in 2D: 2
– line in 3D: 4
– plane in 3D: 3
– rectangle in 2D: 4 (axis-parallel)
or 5 (arbitrarily oriented)
Matching a point set to a model
• Often, the more
parameters a model
has, the more complex
it is to find the model
in the point set
Matching a point set to a circle
• Need a measure of fit that can be optimized over all
possible circles
– Min max distance (bottleneck)
from all points to circle
– Min sum distance (min sum)
– Min sum distance squared

• All three versions give rise to different geometric
optimization problems.
• Min max is most sensitive to outliers
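The three measures of fit can be compared on a small example. The sketch below only evaluates the measures for one given circle; it does not optimize over all circles. The function name `circle_fit_costs` and the sample points are assumptions made for illustration.

```python
import math

def circle_fit_costs(points, center, radius):
    """Return (max, sum, sum of squares) of the point-to-circle distances."""
    cx, cy = center
    # Distance from a point to the circle boundary: | |p - c| - radius |
    dists = [abs(math.hypot(x - cx, y - cy) - radius) for x, y in points]
    return max(dists), sum(dists), sum(d * d for d in dists)

# Four points on the unit circle plus one point at distance 3 from the center.
pts = [(1, 0), (0, 1), (-1, 0), (0, -1), (3, 0)]
bottleneck, total, total_sq = circle_fit_costs(pts, (0, 0), 1.0)
```

For this circle the single far point determines the bottleneck value completely, which illustrates why min max reacts most strongly to outliers.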
The rectangle problem
• Given a set P of points, “is” it a rectangle?

[Figure: three example point sets, labeled yes, no, no]
The rectangle problem
• It appears that about 30% of the facets visible
in a LiDAR point cloud of an urban scene can be
considered rectangles
The rectangle problem
• Two conditions:
– It should not have extra stuff (outside)
– It should not have missing stuff (inside)
The rectangle problem
• Formalization of the two conditions:
– no extra stuff outside → no point outside at all
– no missing stuff inside → take the union of radius-r disks
centered on the points, and require that this union
contains the rectangle completely
r is a parameter related to the assumed density of the
points
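The two formal conditions can be tested approximately for an axis-parallel rectangle. This is a rough sketch, not the algorithm from the slides: the coverage condition is only checked on a sample grid with spacing `step`, so it can miss small uncovered regions. The function name and the parameters `rect` and `step` are assumptions.

```python
def looks_like_rectangle(points, rect, r, step):
    """Approximate test of the two conditions for an axis-parallel rectangle.

    rect = (xmin, ymin, xmax, ymax); coverage is only checked on a sample grid.
    """
    xmin, ymin, xmax, ymax = rect
    # Condition 1: no point lies outside the rectangle
    if any(not (xmin <= x <= xmax and ymin <= y <= ymax) for x, y in points):
        return False
    # Condition 2 (approximate): every sample location lies in some radius-r disk
    nx = int(round((xmax - xmin) / step))
    ny = int(round((ymax - ymin) / step))
    for i in range(nx + 1):
        for j in range(ny + 1):
            sx, sy = xmin + i * step, ymin + j * step
            if not any((sx - x) ** 2 + (sy - y) ** 2 <= r * r for x, y in points):
                return False
    return True

full = [(x, y) for x in range(5) for y in range(4)]            # dense 5 x 4 grid of points
holed = [p for p in full if p not in ((2, 1), (2, 2))]         # same set with a hole
ok_full = looks_like_rectangle(full, (0, 0, 4, 3), 0.8, 0.25)
ok_holed = looks_like_rectangle(holed, (0, 0, 4, 3), 0.8, 0.25)
ok_extra = looks_like_rectangle(full + [(7, 1)], (0, 0, 4, 3), 0.8, 0.25)
```

The dense set passes, the set with a hole fails the coverage condition, and the set with an extra point fails the first condition.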
The rectangle problem
• If we know the orientation, we can determine a
tightly fitting rectangle with that orientation uniquely
(if any rectangle works, a tightly fitting one will)
→ Algorithmic idea: rotate a tightly fitting
rectangle around the point set and test the
two conditions
The rectangle problem

[Figure: the union of the r-disks centered at the points, and the trace of the
corners of the tightly fitting rectangle]
The rectangle problem
• The algorithm can be made to run in O(n log n) time,
when there are n points
The disk problem
• Since a disk is rotation-invariant and has only three
parameters, is the equivalent problem easier for a
disk?
The disk problem
• Algorithmic approach:
– Compute the union of the r-disks centered at points
– Take the vertices of this union and make them “red” points
→ there is a disk with all “blue” points inside and no extra
stuff if and only if there is a red-blue separating circle
(under a few reasonable assumptions)
– Red-blue point separation by a circle can be formulated as
a 3D linear programming problem
The disk problem
• Red-blue point separation by a circle
– Map every point (x, y) into 3D to the point $(x, y, x^2 + y^2)$
– Search for a plane with all
red 3D points above and all
blue 3D points below it;
this is the geometric-dual
formulation of standard
linear programming
The disk problem
• Red-blue point separation by a circle
– Geometrically, all points are put onto the unit paraboloid
U: $z = x^2 + y^2$
– Every plane in 3D either:
• does not intersect U (or only touches it in a single point)
• is vertical and intersects U
in a parabola
• is not vertical and intersects U in
a shape whose vertical projection
onto the xy-plane is a circle
– Conversely, for every circle in the
xy-plane there is a unique plane in
3D whose projected intersection
with U is that circle
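The correspondence between circles and planes can be made concrete. Expanding $(x-a)^2 + (y-b)^2 < r^2$ gives $x^2 + y^2 < 2ax + 2by + r^2 - a^2 - b^2$, so a point lies inside the circle exactly when its lifted image lies below the plane $z = 2ax + 2by + (r^2 - a^2 - b^2)$. The function names in this sketch are assumptions chosen for illustration.

```python
def lift(p):
    """Map (x, y) onto the paraboloid U: z = x^2 + y^2."""
    x, y = p
    return (x, y, x * x + y * y)

def circle_to_plane(center, radius):
    """Coefficients (A, B, C) of the plane z = A*x + B*y + C belonging to the circle."""
    a, b = center
    return (2 * a, 2 * b, radius * radius - a * a - b * b)

def below_plane(q, plane):
    """True if the 3D point q lies strictly below the plane."""
    A, B, C = plane
    x, y, z = q
    return z < A * x + B * y + C

plane = circle_to_plane((1, 2), 3)
inside = below_plane(lift((2, 2)), plane)     # point inside the circle
outside = below_plane(lift((5, 2)), plane)    # point outside the circle
on_circle = below_plane(lift((4, 2)), plane)  # point exactly on the circle
```

Separating red from blue points by a circle then amounts to finding coefficients (A, B, C) that put all lifted blue points below and all lifted red points above the plane, which is a set of linear constraints in three unknowns.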
The disk problem
• Computing a disk with all n points inside and no
empty areas …
– It is known that the union of n unit disks can be computed
in O(n log n) time and has O(n) vertices
– It is known that linear programming in 3D with linearly
many constraints can be solved in linear time

• … can be solved in O(n log n) time
Matching parts of a point set to a model
• Two popular methods: RANSAC and the Hough
transform
• Used for 3D reconstruction, in particular, to find
many points close to a plane in 3D, yielding facets
of buildings
Points clustered by planes
RANSAC
• RANdom SAmple Consensus: method that can be
used to detect planes (and other shapes) in point sets
– randomized
– assumes a model defined by few points
line (2D): defined by 2 points
plane: defined by 3 points
sphere: defined by 4 points
vertical cylinder: defined by 3 points
RANSAC
• Simplest case: 2D point set, we want to find a line
with most points on (or near to) it
– points on/near this line are called inliers, they support
the line
– other points are outliers, they do not support the line
RANSAC
1. Choose a threshold distance d
2. For #iterations do
– Choose 2 points at random, make line L through them
– For each point q in P, test if q lies within distance d from L
If yes, increase the support of L by 1
– If L has higher support than the highest-support line
found so far, remember L and its support
3. Return the remembered line as the line with most points near it
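The three steps can be sketched in a few lines of Python. This is a minimal illustration under some assumptions: points are distinct 2D tuples, the line is stored in normalized form a*x + b*y + c = 0, and the names `line_through`, `ransac_line` and the `seed` parameter are hypothetical.

```python
import math
import random

def line_through(p, q):
    """Return (a, b, c) with a*x + b*y + c = 0 and a^2 + b^2 = 1."""
    (x1, y1), (x2, y2) = p, q
    a, b = y2 - y1, x1 - x2
    norm = math.hypot(a, b)
    return a / norm, b / norm, -(a * x1 + b * y1) / norm

def ransac_line(points, d, iterations, seed=0):
    """Return the best line found and its support (number of points within d)."""
    rng = random.Random(seed)
    best_line, best_support = None, -1
    for _ in range(iterations):
        p, q = rng.sample(points, 2)          # two distinct sample points
        a, b, c = line_through(p, q)
        # With a normalized line, |a*x + b*y + c| is the distance to the line
        support = sum(1 for x, y in points if abs(a * x + b * y + c) <= d)
        if support > best_support:
            best_line, best_support = (a, b, c), support
    return best_line, best_support

inliers = [(i, i) for i in range(10)]                    # ten points on y = x
outliers = [(0, 5), (3, 9), (7, 1), (9, 2), (2, 8)]
line, support = ransac_line(inliers + outliers, d=0.01, iterations=200)
```

With 10 of 15 points on the line, a sample of two inliers is drawn with high probability within 200 iterations, so the returned support is the number of collinear points.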
RANSAC
• For testing whether a point q supports a line L, we
do not actually compute the distance from q to L
• Instead, we generate two lines at distance d from L
• Then we test for each point whether it lies below
the upper and above the lower line
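A small sketch of this strip test, assuming a non-vertical line written as y = m*x + c. The two parallel lines at perpendicular distance d have intercepts c ± d*sqrt(1 + m^2), so they are computed once per candidate line and each point then needs only two comparisons. The function names are assumptions.

```python
import math

def strip_bounds(m, c, d):
    """Intercepts of the two lines parallel to y = m*x + c at distance d."""
    offset = d * math.sqrt(1.0 + m * m)   # vertical shift that gives perpendicular distance d
    return c - offset, c + offset

def supports(q, m, lower_c, upper_c):
    """True if q lies above the lower line and below the upper line."""
    x, y = q
    return m * x + lower_c <= y <= m * x + upper_c

lower_c, upper_c = strip_bounds(1.0, 0.0, 1.0)   # line y = x, threshold d = 1
near = supports((0, 1.4), 1.0, lower_c, upper_c)
far = supports((0, 1.5), 1.0, lower_c, upper_c)
```

Vertical lines need a separate case (or a different line representation) in this formulation.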
RANSAC
• How large should the threshold distance d be?
• How many iterations should we do to have a high
probability of finding the line with highest support?
→ the threshold distance is related to the measurement error
(~5 cm) and the flatness of the surface
→ the number of iterations depends on the inlier-outlier ratio
and on the probability with which we want to find the best line
RANSAC, iterations
• Suppose we want to have 95% probability, p=0.95,
of finding the best line
• Suppose there are k points on the line (inliers) and
n points in total
• Then the probability of choosing 2 points on the
line is $(k/n)^2$
• The probability of never selecting 2 points on the
line in r iterations is $(1 - (k/n)^2)^r$
• The probability of finding the line in r iterations is
$1 - (1 - (k/n)^2)^r$
RANSAC, iterations
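• So we want $1 - (1 - (k/n)^2)^r > p$
$(1 - (k/n)^2)^r < 1 - p$
$\log \left( (1 - (k/n)^2)^r \right) < \log (1 - p)$
$r \log (1 - (k/n)^2) < \log (1 - p)$
$r > \log (1 - p) / \log (1 - (k/n)^2)$
(the inequality flips in the last step because $\log (1 - (k/n)^2)$ is negative)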
RANSAC, iterations
• Examples:
– if 10% of the points lie on the line and we want to find it
with 95% certainty, we need nearly 300 iterations
– if 5% of the points lie on the line and we want to find it
with 95% certainty, we need nearly 1200 iterations
– if 10% of the points lie on the line and we want to find it
with 90% certainty, we need nearly 230 iterations
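These numbers follow directly from the bound r > log(1 - p) / log(1 - (k/n)^2). A short sketch that evaluates it; the function name and the `sample_size` parameter (2 for a line) are assumptions added for illustration.

```python
import math

def ransac_iterations(p, inlier_fraction, sample_size=2):
    """Smallest integer r with 1 - (1 - w^s)^r > p, where w is the inlier fraction."""
    good_sample = inlier_fraction ** sample_size      # probability of an all-inlier sample
    bound = math.log(1.0 - p) / math.log(1.0 - good_sample)
    return math.floor(bound) + 1                      # strict inequality: r > bound

r_a = ransac_iterations(0.95, 0.10)   # 10% inliers, 95% certainty
r_b = ransac_iterations(0.95, 0.05)   # 5% inliers, 95% certainty
r_c = ransac_iterations(0.90, 0.10)   # 10% inliers, 90% certainty
```

The three results are 299, 1197 and 230 iterations, matching the rounded figures in the examples.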
Improving RANSAC
• Sample points not too far apart
• Use surface normals of points to test if a sample
makes sense, and when deciding on support
• Collect points in a plane if they form a
“connected component”
Improving RANSAC
• For facades and roofs in cities, sample triples of
points no further than 50 m apart → increases the
probability of finding planes; fewer iterations needed
• Do not sample too close either: inaccuracy of points
in the same plane will lead to a wrong angle / normal
Improving RANSAC
• For each point we can estimate a surface normal
using, say, 12 nearest neighbors
– Use Principal Component Analysis and let the normal be
the third component (eigenvector with smallest
eigenvalue)
– or, fit a plane through these 12 points plus the point itself,
and use the normal of that plane
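The PCA idea can be illustrated in 2D, where the normal of a best-fit line plays the role of the surface normal and the eigenvectors of the 2×2 covariance matrix have a closed form. This is a 2D analogue chosen to stay within the standard library, not the 3D procedure itself; the function name is an assumption.

```python
import math

def estimate_normal_2d(neighbors):
    """Unit normal of the best-fit line through 2D points, via PCA."""
    n = len(neighbors)
    mx = sum(x for x, _ in neighbors) / n
    my = sum(y for _, y in neighbors) / n
    sxx = sum((x - mx) ** 2 for x, _ in neighbors)
    syy = sum((y - my) ** 2 for _, y in neighbors)
    sxy = sum((x - mx) * (y - my) for x, y in neighbors)
    # Angle of the principal axis (eigenvector with the largest eigenvalue)
    theta = 0.5 * math.atan2(2.0 * sxy, sxx - syy)
    # The normal is perpendicular to it (eigenvector with the smallest eigenvalue)
    return (-math.sin(theta), math.cos(theta))

n_flat = estimate_normal_2d([(0, 0), (1, 0), (2, 0), (3, 0)])   # points on a horizontal line
n_diag = estimate_normal_2d([(0, 0), (1, 1), (2, 2), (3, 3)])   # points on y = x
```

In 3D the same computation uses the 3×3 covariance matrix of the neighbors, typically with a numerical eigenvalue routine.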
Improving RANSAC
• When performing RANSAC and we know normals:
– delete a sample of three points immediately if their
normals deviate much from the plane they define
– let only points with correct normals support a tested plane
(say, angle deviation at most 20 degrees), on top of the
requirement of being close to the plane
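A sketch of the combined support test, assuming unit-length normals whose sign is arbitrary and a plane given as a unit normal plus an offset. The function names and the plane representation are assumptions; the 20 degree default is the value mentioned above.

```python
import math

def normal_agrees(point_normal, plane_normal, max_angle_deg=20.0):
    """True if two unit normals differ by at most max_angle_deg, ignoring sign."""
    dot = abs(sum(a * b for a, b in zip(point_normal, plane_normal)))
    angle = math.degrees(math.acos(min(1.0, dot)))
    return angle <= max_angle_deg

def supports_plane(point, point_normal, plane, d, max_angle_deg=20.0):
    """plane = (unit normal (a, b, c), offset e) for a*x + b*y + c*z + e = 0."""
    normal, e = plane
    dist = abs(sum(a * x for a, x in zip(normal, point)) + e)
    # Both requirements: close to the plane and a compatible normal
    return dist <= d and normal_agrees(point_normal, normal, max_angle_deg)

ground = ((0.0, 0.0, 1.0), 0.0)   # the plane z = 0
```

For example, a point 2 cm above the plane z = 0 supports it when its normal is tilted by 10 degrees, but not when the normal is tilted by 30 degrees or points sideways.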
Improving RANSAC

Four nearest neighbors, point itself included
The brown point supports the line (correct normal)
but the green point does not (wrong normal)
Improving RANSAC
• Collect points in a plane if they form a “connected
component”
on the same plane
Iterated RANSAC
• After finding the plane with the most points, remove the
points from the set and remember them as a cluster
• Then continue and find
more planes, until no
plane seems to have
sufficient support
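The iteration can be sketched for lines in 2D instead of planes in 3D, which keeps the example short. The names `best_line`, `iterated_ransac` and the parameters `min_support` and `seed` are assumptions; points are assumed to be hashable tuples.

```python
import math
import random

def best_line(points, d, iterations, rng):
    """One RANSAC run: return the inliers of the best-supported line found."""
    best = []
    for _ in range(iterations):
        (x1, y1), (x2, y2) = rng.sample(points, 2)
        a, b = y2 - y1, x1 - x2
        norm = math.hypot(a, b)
        if norm == 0:                       # identical sample points define no line
            continue
        c = -(a * x1 + b * y1)
        inliers = [(x, y) for x, y in points if abs(a * x + b * y + c) <= d * norm]
        if len(inliers) > len(best):
            best = inliers
    return best

def iterated_ransac(points, d, iterations, min_support, seed=0):
    """Repeatedly extract the best line until support drops below min_support."""
    rng = random.Random(seed)
    remaining, clusters = list(points), []
    while len(remaining) >= 2:
        inliers = best_line(remaining, d, iterations, rng)
        if len(inliers) < min_support:
            break
        clusters.append(inliers)            # remember the points as a cluster
        inlier_set = set(inliers)
        remaining = [p for p in remaining if p not in inlier_set]
    return clusters, remaining

line1 = [(i, 0) for i in range(20)]
line2 = [(i, i + 50) for i in range(15)]
noise = [(3, 20), (7, 30), (12, 10), (16, 25), (1, 35)]
clusters, rest = iterated_ransac(line1 + line2 + noise, 0.1, 200, 10)
```

On this input the two lines are extracted as clusters and the five scattered points remain unclustered.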

Points not in clusters are black
Why are the outlines of planar regions black?
Iterated RANSAC
• The remaining points are
– vegetation
– curved surfaces (cars, domes)
– traffic signs, lamp posts, mailboxes, garbage bins, bicycles,
drainage pipes, …
– points on planes whose normal was incorrect (possibly,
close to corners)
– points on very small or largely occluded planes
– points inside buildings measured through windows
• These points may still help for reconstruction
Hough transform
• The Hough transform is an alternative to RANSAC
and can also give a plane close to many points
• It discretizes the set of all lines by a grid; points give
a count to all grid cells whose lines come near that
point

[Figure: grid of cells with vote counts; axes: slope and intercept]
Hough transform
• The middle of the cell with the highest number
gives slope and intercept of a line with many points
close by
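A minimal sketch of the voting scheme for lines y = m*x + c. As a simplification, each point votes for exactly one intercept cell per slope column (the cell nearest to c = y - m*x) rather than for all cells whose lines come near the point. The function name and grid parameters are assumptions.

```python
from collections import Counter

def hough_lines(points, slopes, c_min, c_step, c_count):
    """Vote in a (slope, intercept) grid; return slope, intercept and count of the best cell."""
    votes = Counter()
    for x, y in points:
        for i, m in enumerate(slopes):
            c = y - m * x                      # intercept of the line with slope m through the point
            j = round((c - c_min) / c_step)    # index of the nearest intercept cell
            if 0 <= j < c_count:
                votes[(i, j)] += 1
    (i, j), count = votes.most_common(1)[0]
    return slopes[i], c_min + j * c_step, count

slopes = [-5 + 0.5 * k for k in range(21)]         # slope cells centered at -5, -4.5, ..., 5
pts = [(x, 2 * x + 1) for x in range(10)]          # ten points on y = 2x + 1
best_m, best_c, best_votes = hough_lines(pts, slopes, -5.0, 0.5, 21)
```

The cell centered at slope 2 and intercept 1 collects all ten votes; every other cell receives at most one.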

[Figure: grid of cells with vote counts; axes: slope and intercept]
Hough transform
• In 3D, the grid for representing all planes is 3-
dimensional (and may take up a lot of storage space)

[Figure: 3D grid; one axis is the intercept with the z-axis]
Matching two point patterns, exact
• Matching point sets A and B exactly
– under translation
– under translation and scaling
– under translation and rotation
– under translation, scaling and rotation
Matching two point patterns, exact
• If two point sets match exactly, then their centers of
mass coincide
→ translate A so that its center of mass lies at the
origin, and do the same for B
• For the translation-only case, we just need to check
whether all points of A and B lie at the same
locations
O(n log n) time
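The translation-only test can be written with exact rational arithmetic, so that equality of coordinates is meaningful. A minimal sketch: the function names are assumptions, and sorting the centered points gives the O(n log n) bound.

```python
from fractions import Fraction

def centered(points):
    """Translate points so their center of mass is at the origin; return them sorted."""
    n = len(points)
    cx = sum(Fraction(x) for x, _ in points) / n
    cy = sum(Fraction(y) for _, y in points) / n
    return sorted((Fraction(x) - cx, Fraction(y) - cy) for x, y in points)

def match_under_translation(A, B):
    """True if B is an exact translate of A (as sets of points)."""
    return len(A) == len(B) and centered(A) == centered(B)

A = [(0, 0), (2, 1), (5, 3)]
B = [(15, -1), (10, -4), (12, -3)]       # A translated by (10, -4), in a different order
B_bad = [(15, -1), (10, -4), (12, -2)]   # one point moved
```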


Matching two point patterns, exact
• The case translation and scaling:
– the centers of mass should coincide
– the points of A and B furthest from the centers of mass
should be equally far away


• Translate A and B so that their centers of mass
lie at the origin
• Scale B so that the point furthest from
the origin lies as far away as in A
• Check if the point sets coincide
O(n log n) time
Matching two point patterns, exact
• The case translation, scaling, and rotation in the plane:
– After the translation and scaling parts, if the furthest point
from the origin in A and in B are unique, then rotate to make
them coincide, and then test the point sets
– If the furthest points are not unique, there may be more
rotations to test (if there are different numbers of furthest
points in A and B, then no exact match can exist)
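The planar case can be sketched with complex numbers, where a rotation is a multiplication by a unit complex number. This sketch assumes at least two distinct points per set, compares with a small tolerance `tol` because floating-point rotation is not exact, and uses a simple quadratic-time comparison instead of an O(n log n) one. All names are assumptions.

```python
def normalize(points):
    """Center at the origin and scale so that the furthest point has distance 1."""
    zs = [complex(x, y) for x, y in points]
    c = sum(zs) / len(zs)
    zs = [z - c for z in zs]
    far = max(abs(z) for z in zs)
    return [z / far for z in zs]

def same_set(P, Q, tol):
    """Greedy one-to-one check that every point of P has a partner in Q within tol."""
    unused = list(Q)
    for p in P:
        hit = next((q for q in unused if abs(p - q) <= tol), None)
        if hit is None:
            return False
        unused.remove(hit)
    return True

def match_similarity(A, B, tol=1e-9):
    """True if B equals A up to translation, scaling and rotation (within tol)."""
    if len(A) != len(B):
        return False
    a, b = normalize(A), normalize(B)
    far_a = [z for z in a if abs(z) >= 1 - tol]
    far_b = [z for z in b if abs(z) >= 1 - tol]
    if len(far_a) != len(far_b):        # different numbers of furthest points: no match
        return False
    # Try every rotation that maps a furthest point of B onto one furthest point of A
    return any(same_set(a, [z * (far_a[0] / w) for z in b], tol) for w in far_b)

A = [(0, 0), (4, 0), (0, 2), (1, 5)]
B = [(3, -1), (3, 7), (-1, -1), (-7, 1)]       # A rotated by 90 degrees, scaled by 2, translated
B_bad = [(3, -1), (3, 7), (-1, -1), (-7, 2)]   # one point moved
```

In this example A happens to have two furthest points, so more than one candidate rotation is tried.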

Matching two point patterns, exact
• The case translation, scaling, and rotation in 3D:
– After the translation and scaling parts, if the furthest
point from the origin in A and in B are unique, then make
them coincide
– Assume b is the furthest point in B. Then we can still
rotate in 3D around the axis ob to make the point sets
coincide. Choose the point furthest from ob in B and the
one furthest from the
corresponding axis in A,
and make them coincide
Matching two point patterns, exact
• The case translation, scaling, and rotation:
In practice, the two point sets will never coincide
exactly when rotation is involved, due to rounding in
storage and computation when rotating
Matching two point patterns
• When there is imprecision (noise, geometric
deviations), we want to match each point of A to
within a neighborhood (ε-disk) of a unique point
in B with a transformation
Matching two point patterns
• Note that this is not the same as finding a
transformation so that the Hausdorff distance
from A to B is at most the imprecision value ε
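The difference shows up already without any transformation. In the sketch below two points of A are both close to the same point of B, so the directed Hausdorff distance is small, yet no one-to-one matching within ε exists. The function names are assumptions, and the brute-force matching check is only meant for tiny examples.

```python
import math
from itertools import permutations

def directed_hausdorff(A, B):
    """Largest distance from a point of A to its nearest point of B."""
    return max(min(math.dist(a, b) for b in B) for a in A)

def one_to_one_within(A, B, eps):
    """True if A can be matched to distinct points of B, each pair within eps."""
    return any(all(math.dist(a, b) <= eps for a, b in zip(A, perm))
               for perm in permutations(B, len(A)))

A = [(0, 0), (0.1, 0)]
B = [(0, 0), (5, 5)]
h = directed_hausdorff(A, B)                 # both points of A are near (0, 0)
matched = one_to_one_within(A, B, 0.2)       # but only one of them can use (0, 0)
```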

Matching two point patterns
• Within this class of problems, deciding whether a
transformation T exists that maps A close to B:
– The properties for the exact case no longer hold
– Translation-only is easier than the rest (obviously)
– Using squares for an ε-neighborhood is easier than
using ε-disks
– Hausdorff (many-to-one) or one-to-one-matching may
be computationally easier, depending on the precise
problem setting
Example application: grid maps

Taken from: D. Eppstein, M. van Kreveld, B. Speckmann, and F. Staals:
Improved Grid Map Layout by Point Set Matching. International Journal of
Computational Geometry and Applications 25(2), 101-122 (2015)
Grid maps
• Use the centroids of the regions (48 United States)
and the centers of a 6x8 grid as point sets A and B
• Find a one-to-one matching between them:
– no transformation (identity I), $L_1$-distances
– translation only, $L_1$-distances
– translation only, $L_2^2$-distances (squared Euclidean)
• Using no transformation, I, just takes a reasonable
fixed placement and scaling of A amidst B (same
center of mass, same bounding box, …)
Grid maps
• All n × n distances are used between a point in A and a
point in B, and a minimum weight matching is used
– with I, this is it
– with translation only, $L_1$-distances or $L_2^2$-distances, we
optimize over all possible translations (i.e., we determine
the translation that minimizes the weight of the matching)
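For the case without transformation, the minimum weight matching can be illustrated by brute force on a tiny instance. This enumerates all assignments and is only usable for very small n; it is not one of the polynomial-time matching algorithms. The function names and sample coordinates are assumptions.

```python
from itertools import permutations

def l1(a, b):
    """L1 (Manhattan) distance between two 2D points."""
    return abs(a[0] - b[0]) + abs(a[1] - b[1])

def min_weight_matching(A, B, dist=l1):
    """Brute-force minimum-weight one-to-one matching; returns (weight, assignment)."""
    best = None
    for perm in permutations(range(len(B))):
        w = sum(dist(a, B[j]) for a, j in zip(A, perm))
        if best is None or w < best[0]:
            best = (w, perm)
    return best

centroids = [(0.1, 0.2), (0.9, 0.1), (0.2, 0.8), (1.1, 1.2)]   # stand-ins for region centroids
grid = [(0, 0), (1, 0), (0, 1), (1, 1)]                        # centers of a 2 x 2 grid
weight, assignment = min_weight_matching(centroids, grid)
```

Here `assignment[i]` is the index of the grid cell assigned to the i-th centroid.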
Grid maps
• Computation for I is $O(n^2 \log^3 n)$ time
– $L_1$-distances minimum matching is faster than Euclidean
• Computation for translation and $L_1$ is $O(n^6 \log^3 n)$ time
– It can be shown that there is an optimum with a horizontal and
a vertical alignment of pairs from A and B
• Computation for translation and $L_2^2$ is $O(n^3)$ time
– Translation is trivial because it is known that the centers of mass
must coincide to minimize the matching, using this measure
– The rest is a minimum weight matching problem in a graph with
n vertices and $n^2$ edges
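The statement that the centers of mass must coincide for squared Euclidean distances can be checked with a short derivation. The notation is introduced here for illustration and is an assumption: σ is any one-to-one matching from A to B, t is the translation applied to A, and bars denote centers of mass.

```latex
\begin{align*}
f(t) &= \sum_{i=1}^{n} \lVert a_i + t - b_{\sigma(i)} \rVert^2, \\
\nabla f(t) &= 2 \sum_{i=1}^{n} \bigl( a_i + t - b_{\sigma(i)} \bigr) = 0
\quad\Longrightarrow\quad
t = \frac{1}{n} \sum_{i=1}^{n} b_{\sigma(i)} - \frac{1}{n} \sum_{i=1}^{n} a_i
  = \bar{b} - \bar{a}.
\end{align*}
```

Because σ is a bijection, the sum over the matched points of B is the sum over all of B, so the optimal translation is the same for every matching and can be fixed before the matching is computed.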
Grid maps
• Evaluation of the results by:
– sum of all 48 $L_1$ distances
– sum of all 48 $L_2$ distances
– sum of all 48 $L_2^2$ distances
– number of relative orientations between pairs of states
that are preserved (California is southwest of Colorado,
etc.)
– number of adjacencies between states that also appear in
the grid
Grid maps
• What if the number of regions differs from the
number of grid cells (points)?
E.g., the 33 boroughs of London on a 6x6 grid
Grid maps
• Option 1: choose a suitable subgrid beforehand
• Option 2: let the matching algorithm decide which
cells of the grid to use
Outliers
• Not an issue in RANSAC
• Possibly an issue for the rectangle problem and the
circle problem

The point set is not a rectangle,
but if one point may be ignored
as an outlier, then it is
The rectangle problem with outliers
• Suppose we allow one outlier
• Brute force approach: try every point as the outlier
and run the existing algorithm on the rest

• Running time becomes n × O(n log n) = $O(n^2 \log n)$
• Generalization to k outliers gives (n choose k) times
O(n log n) = $O(n^{k+1} \log n)$ time
The rectangle problem with outliers
• Observe: one outlier must lie on the convex hull
(not necessarily true for 2 outliers), so we only need to try
the hull points as the outlier
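The candidate outliers can be found with a standard convex hull routine. The sketch below uses Andrew's monotone chain, a well-known O(n log n) method; it is a technique chosen here for illustration, and it returns hull vertices only (points in the interior of a hull edge are dropped).

```python
def cross(o, a, b):
    """Cross product of the vectors o->a and o->b; positive for a left turn."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

def convex_hull(points):
    """Andrew's monotone chain; returns hull vertices in counterclockwise order."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts
    lower, upper = [], []
    for p in pts:                                   # build the lower hull left to right
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    for p in reversed(pts):                         # build the upper hull right to left
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    return lower[:-1] + upper[:-1]                  # endpoints are shared, drop duplicates

pts = [(0, 0), (4, 0), (4, 3), (0, 3), (1, 1), (2, 2), (3, 1), (6, 1.5)]
candidates = convex_hull(pts)   # only these points need to be tried as the single outlier
```

In this example the far point (6, 1.5) is among the five hull vertices, while the three interior points are never tried.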
The rectangle problem with outliers
• We can also run the algorithm where one side of
the rectangle ignores one point of the set; such an
algorithm still runs in O(n log n) time
The rectangle problem with outliers
• A generalization to k outliers is possible: we can
choose how many points each side excludes
• However, we are not completely fair: the outlier
should not contribute to the union of disks either
This is much harder to handle efficiently
Outliers
• In general, allowing many outliers can be handled
– exactly, but the brute force approach is very slow
– exactly, when the geometry helps to limit what points
can be outliers
– heuristically (ideally after studying
the geometry), but this can easily
miss the best solution
(for example, incrementally
removing outliers one by one)
Summary
• In point pattern matching, we have complete
matching or partial matching between two point
sets, or complete or partial matching between a
point set and a shape
• We may need to deal with noise, especially in the
case with two point sets
• We may have to deal with outliers, especially in the
case with complete matching
