How a Robot Sees
The Laplacian

\nabla^2 I = \frac{\partial^2 I}{\partial x^2} + \frac{\partial^2 I}{\partial y^2}

is linear, but 2nd order.
C18 2013 9 / 38
Which to choose?
The gradient magnitude is attractive because it is first order in the
derivatives. Differentiation enhances noise, and the 2nd derivatives in
the Laplacian operator introduce even more.
The Laplacian is attractive because it is linear, which means it can be
implemented by a succession of fast linear operations (effectively
matrix operations, as we are dealing with a pixelated image).
Both approaches have been used.
For both methods we need to consider
how to compute gradients, and
how to suppress noise
so that insignificant variations in pixel intensity are not flagged up as
edges.
This will involve spatial convolution of the image with an impulse
response function that smooths the image and computes the
gradients.
Preamble: spatial convolution
You are familiar with the 1D convolution integral in the time domain
between an input signal i(t) and impulse response function h(t):

o(t) = i(t) * h(t) = \int_{-\infty}^{+\infty} i(t - \tau)\, h(\tau)\, d\tau = \int_{-\infty}^{+\infty} i(\tau)\, h(t - \tau)\, d\tau.

The second equality reminds us that convolution commutes:
i(t) * h(t) = h(t) * i(t). It also associates.
In the frequency domain we would write O(s) = H(s)I(s).
Now in a continuous 2D domain, the spatial convolution integral is

o(x, y) = i(x, y) * h(x, y) = \int_{-\infty}^{+\infty} \int_{-\infty}^{+\infty} i(x - a, y - b)\, h(a, b)\, da\, db.

In the spatial domain you'll often see h(x, y) referred to as the
point spread function or the convolution mask.
Spatial convolution /ctd
For pixelated images I(x, y) we need a discrete convolution

O(x, y) = I(x, y) * h(x, y) = \sum_i \sum_j I(x - i, y - j)\, h(i, j)

for x, y ranging over the image width and height respectively, and i, j
ensuring that access is made to any and all non-zero entries in h.
Many authors rewrite the convolution by replacing h(i, j) with
\bar{h}(i, j) = h(-i, -j):

O(x, y) = \sum_i \sum_j I(x - i, y - j)\, h(i, j) = \sum_i \sum_j I(x + i, y + j)\, h(-i, -j) = \sum_i \sum_j I(x + i, y + j)\, \bar{h}(i, j).

This looks more like the expression for a cross-correlation but,
confusingly, it is still called a convolution.
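As an illustrative sketch (mine, not from the slides), the discrete convolution above can be written directly in NumPy; the explicit mask flip makes the convolution/correlation distinction visible:

```python
import numpy as np

def conv2d(I, h):
    """Discrete 2D convolution O(x,y) = sum_ij I(x-i, y-j) h(i,j),
    implemented as correlation with the flipped mask
    hbar(i,j) = h(-i,-j). 'Same'-sized output, zero padding."""
    kh, kw = h.shape
    ph, pw = kh // 2, kw // 2
    Ipad = np.pad(I, ((ph, ph), (pw, pw)))
    hbar = h[::-1, ::-1]            # flip: convolution via correlation
    O = np.zeros(I.shape, dtype=float)
    for y in range(I.shape[0]):
        for x in range(I.shape[1]):
            O[y, x] = np.sum(Ipad[y:y + kh, x:x + kw] * hbar)
    return O
```

Convolving an impulse image with h reproduces h unflipped, which is a quick sanity check that the flip is in the right place.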
Computing partial derivatives using convolution
We can approximate \partial I/\partial x at image pixel (x, y)
using a central difference

\partial I/\partial x \approx (1/2) \{[I(x+1, y) - I(x, y)] + [I(x, y) - I(x-1, y)]\} = (1/2) \{I(x+1, y) - I(x-1, y)\}.

Writing this as a proper convolution we would set

h(-1) = +1/2, \quad h(0) = 0, \quad h(1) = -1/2,

D(x, y) = I(x, y) * h(x) = \sum_{i=-1}^{1} I(x - i, y)\, h(i).

Notice how the proper mask is reversed from what we might naively
expect from the expression.
Computing partial derivatives using convolution /ctd
Now, as ever,

\partial I/\partial x \approx (1/2)[I(x+1, y) - I(x-1, y)].

Writing this as a sort of correlation,

\bar{h}(-1) = -1/2, \quad \bar{h}(0) = 0, \quad \bar{h}(1) = +1/2,

D(x, y) = I(x, y) \star \bar{h}(x) = \sum_{i=-1}^{1} I(x + i, y)\, \bar{h}(i).

Note how we can just lay this mask directly on the pixels to be
multiplied and summed ...
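A minimal sketch of this central-difference gradient (my own, assuming border columns are simply left at zero):

```python
import numpy as np

def x_gradient(I):
    """dI/dx by central differences, i.e. correlation with the mask
    hbar = [-1/2, 0, +1/2] laid directly on the pixels.
    Border columns are left at zero."""
    D = np.zeros(I.shape, dtype=float)
    D[:, 1:-1] = 0.5 * (I[:, 2:] - I[:, :-2])
    return D
```

On a horizontal intensity ramp I(x, y) = x the interior gradient comes out exactly 1, as it should.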
Example Results
The actual image used is
grey-level, not colour
x-gradient image y-gradient image
In 2 dimensions ...
As before, one imagines the flipped correlation-like mask centred on
a pixel, and the sum of products computed.
Often a 2D mask is separable, in that it can be broken up into two
separate 1D convolutions in x and y:

O = h_{2d} * I = f_y * g_x * I.

The computational complexity is lower but intermediate storage is
required, and so for a small mask it may be cheaper to use it directly.
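To illustrate separability (an illustrative sketch, not the lecture's code), a mask h_{2d} = outer(f_y, g_x) can be applied as two 1D passes:

```python
import numpy as np

def separable_conv(I, fy, gx):
    """Apply the separable mask h2d = outer(fy, gx) as two 1D
    convolutions: gx along each row, then fy down each column."""
    tmp = np.apply_along_axis(lambda row: np.convolve(row, gx, mode='same'), 1, I)
    return np.apply_along_axis(lambda col: np.convolve(col, fy, mode='same'), 0, tmp)
```

For a (2k+1)-tap mask on an N x N image this costs O(N^2 k) multiplies instead of O(N^2 k^2) for the direct 2D mask.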
Example result Laplacian (non-directional)
The actual image used is
grey-level, not colour
Laplacian
Noise and smoothing
Differentiation enhances noise: the edge appears clear enough in the
image, but less so in the gradient map.
If we knew the noise spectrum, we might find an optimal brickwall
filter g(x, y) \leftrightarrow G(s) to suppress noise edges outside the signal edge
band. We would apply this g to the image (g * I) before finding
derivatives.
But the sharper the cut-off in spatial-frequency, the wider the spatial
support of g(x, y) has to be (it tends to an Infinite Impulse Response
(IIR) filter). This is expensive to compute.
Conversely, if g(x, y) has finite size (an FIR filter) it will not be
band-limited, spatially blurring the signal edges.
Can we compromise spread in space and spatial-frequency in some
optimal way?
Compromise in space and spatial-frequency
Suppose our IR function is h(x), and h \leftrightarrow H is a Fourier transform
pair.
Define the spreads in space and spatial-frequency as \Delta X and \Delta\Omega, where

\Delta X^2 = \frac{\int (x - x_m)^2\, h^2(x)\, dx}{\int h^2(x)\, dx} \quad \text{with} \quad x_m = \frac{\int x\, h^2(x)\, dx}{\int h^2(x)\, dx}

\Delta\Omega^2 = \frac{\int (\omega - \omega_m)^2\, H^2(\omega)\, d\omega}{\int H^2(\omega)\, d\omega} \quad \text{with} \quad \omega_m = \frac{\int \omega\, H^2(\omega)\, d\omega}{\int H^2(\omega)\, d\omega}

Now vary h to minimize the product of the spreads U = \Delta X\, \Delta\Omega.
An uncertainty principle indicates that U_{min} = 1/2 when

h(x) = \text{a Gaussian function} = \frac{1}{\sqrt{2\pi}\,\sigma} \exp\left(-\frac{x^2}{2\sigma^2}\right).
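As a quick check (mine, not on the slide), substituting the Gaussian into the spread definitions above does give the bound:

```latex
h^2(x) \propto e^{-x^2/\sigma^2}
  \;\Rightarrow\; \Delta X^2 = \tfrac{\sigma^2}{2},
\qquad
H(\omega) \propto e^{-\sigma^2\omega^2/2}
  \;\Rightarrow\; H^2(\omega) \propto e^{-\sigma^2\omega^2}
  \;\Rightarrow\; \Delta\Omega^2 = \tfrac{1}{2\sigma^2},
\qquad
U = \Delta X\,\Delta\Omega
  = \sqrt{\tfrac{\sigma^2}{2}}\sqrt{\tfrac{1}{2\sigma^2}}
  = \tfrac{1}{2}.
```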
2.3 The Canny Edge Detector (JF Canny 1986)
2D Gaussian smoothing is applied to the image

S(x, y) = G(x, y) * I = \frac{1}{2\pi\sigma^2} \exp\left(-\frac{x^2 + y^2}{2\sigma^2}\right) * I(x, y)

but separated into two 1D masks

S(x, y) = \frac{1}{\sqrt{2\pi}\,\sigma} \exp\left(-\frac{x^2}{2\sigma^2}\right) * \frac{1}{\sqrt{2\pi}\,\sigma} \exp\left(-\frac{y^2}{2\sigma^2}\right) * I(x, y).

Then 1st derivatives in x and y are found by two further convolutions

D_x(x, y) = \partial S/\partial x = h_x * S \qquad D_y(x, y) = \partial S/\partial y = h_y * S.

Interestingly, Canny's decision to use a Gaussian is made from a
separate study of how to maximize robustness, localization, and
uniqueness in edge detection.
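The smoothing and derivative steps can be sketched as follows (my own illustrative code, using sampled, normalised Gaussian masks and a 3-sigma truncation radius as assumptions):

```python
import numpy as np

def gauss1d(sigma, radius=None):
    """Sampled 1D Gaussian mask, normalised to sum to 1."""
    if radius is None:
        radius = max(1, int(3 * sigma))   # assumed 3-sigma truncation
    x = np.arange(-radius, radius + 1)
    g = np.exp(-x**2 / (2.0 * sigma**2))
    return g / g.sum()

def smooth_and_gradients(I, sigma=1.0):
    """Separable Gaussian smoothing S = Gx * Gy * I, then
    central-difference gradients of the smoothed image."""
    g = gauss1d(sigma)
    S = np.apply_along_axis(lambda r: np.convolve(r, g, mode='same'), 1,
                            I.astype(float))
    S = np.apply_along_axis(lambda c: np.convolve(c, g, mode='same'), 0, S)
    Dx = np.zeros_like(S)
    Dy = np.zeros_like(S)
    Dx[:, 1:-1] = 0.5 * (S[:, 2:] - S[:, :-2])
    Dy[1:-1, :] = 0.5 * (S[2:, :] - S[:-2, :])
    return S, Dx, Dy
```

On a constant image the interior smoothed values are unchanged and the gradients vanish, which is a useful check of the mask normalisation.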
Canny/ Implementation
Step 1: Gaussian smooth the image, S = G * I.
This is separable: S = G_x * G_y * I.
Often 5x1 or 7x1 masks.
Step 2: Compute gradients \partial S/\partial x, \partial S/\partial y
using derivative masks.
Canny/ Implementation
Step 3: Non-maximal suppression
a. Use \partial S/\partial x and \partial S/\partial y to find the gradient magnitude and direction at
a pixel.
b. Track along the direction to marked
positions on the neighbourhood perimeter.
c. Linearly interpolate the gradient magnitude
at those positions using magnitudes at the arrowed
pixels.
d. If the gradient magnitude at the central pixel is
greater than both interpolated values,
declare an edgel, and fit a parabola to find
(x, y) to sub-pixel acuity along the edge
direction.
Step 4:
Store each edgel's sub-pixel position (x, y); gradient strength; and
orientation (a full 360 degrees, as light/dark is distinguished).
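Step 3 might be sketched as below. This is my own simplified version: it steps to the nearest neighbouring pixel along the gradient direction instead of linearly interpolating on the 3x3 perimeter, and omits the parabola fit for sub-pixel position:

```python
import numpy as np

def non_max_suppress(Dx, Dy):
    """Keep a pixel as an edgel only if its gradient magnitude beats
    both neighbours along the gradient direction (nearest-neighbour
    lookup; no interpolation or sub-pixel fit)."""
    mag = np.hypot(Dx, Dy)
    edgels = np.zeros(mag.shape, dtype=bool)
    H, W = mag.shape
    for y in range(1, H - 1):
        for x in range(1, W - 1):
            m = mag[y, x]
            if m == 0:
                continue
            dx = int(round(Dx[y, x] / m))   # unit step along the gradient,
            dy = int(round(Dy[y, x] / m))   # rounded to a neighbour pixel
            if m > mag[y + dy, x + dx] and m > mag[y - dy, x - dx]:
                edgels[y, x] = True
    return edgels
```

On a blurred vertical edge only the column of maximum gradient magnitude survives, thinning the response to one pixel width.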
Canny/ Implementation: Hysterical Thresholding
Step 5: Link edges together into strings, using orientation to aid the
search for neighbouring edgels.
Step 6: Perform thresholding with hysteresis, by running along the
linked strings.
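Step 6 might be sketched as below. The data layout (a list of strings, each a list of edgel indices, plus a strength lookup) is hypothetical, chosen only to make the hysteresis rule concrete:

```python
def hysteresis(strings, strength, t_low, t_high):
    """Keep a linked string only if some edgel on it exceeds t_high;
    then retain all of that string's edgels above t_low.
    `strings`: list of lists of edgel ids; `strength`: id -> gradient
    magnitude (hypothetical layout)."""
    kept = []
    for s in strings:
        if any(strength[e] >= t_high for e in s):
            kept.append([e for e in s if strength[e] >= t_low])
    return kept
```

The point of the two thresholds is that a weak edgel survives if it is linked to a strong one, so broken contours are avoided without lowering the threshold globally.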
Example of Edgel and String Outputs
2.4 From Strings to Straight Lines
Split algorithm for lines:
1. Fit a straight line to all edgels in the string.
2. If the RMS error is less than a threshold, accept
and stop.
3. Otherwise find the point of highest curvature
on the edge string and split it into two. Repeat
from 1 for each substring.
Merge algorithm for lines:
1. Fit straight lines to each pair of
consecutive edgels in a string.
2. Compute the RMS error for each potential
merger of an adjacent pair of lines into a
single line, and find the pair for which
the RMS error is minimum.
3. If the RMS error is less than a threshold, merge
and repeat from 2.
Actually fitting lines and finding that RMS error
Consider n edgels that have been linked into an edge (sub-)string.
Fitting a line requires orthogonal regression.
See B14, or the variant here ...
Find

\min E(a, b, c) = \sum_{i=1}^{n} (a x_i + b y_i + c)^2 \quad \text{subject to} \quad a^2 + b^2 = 1.

Hence a 2-dof problem.
From edge elements to lines /ctd
Now \min E(a, b, c) = \min \sum_{i=1}^{n} (a x_i + b y_i + c)^2 requires
\partial E/\partial c = 0:

2 \sum_{i=1}^{n} (a x_i + b y_i + c) = 0 \quad \Rightarrow \quad c = -(a \bar{x} + b \bar{y}).

Therefore E becomes

E = \sum_{i=1}^{n} \left[ a(x_i - \bar{x}) + b(y_i - \bar{y}) \right]^2 = ( a \; b )\, U^\top U \begin{pmatrix} a \\ b \end{pmatrix}

where the 2nd-moment matrix is

U^\top U = \begin{pmatrix} \sum_{i=1}^{n} x_i^2 - n\bar{x}^2 & \sum_{i=1}^{n} x_i y_i - n\bar{x}\bar{y} \\ \sum_{i=1}^{n} x_i y_i - n\bar{x}\bar{y} & \sum_{i=1}^{n} y_i^2 - n\bar{y}^2 \end{pmatrix}.

The solution (a, b) is given by the unit eigenvector of U^\top U corresponding to
the smaller eigenvalue. (Eigen-decomposition or SVD.)
The RMS error is given by the eigenvalue itself.
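A sketch of this fit (mine, not the lecture's code). Note one assumption: since the smaller eigenvalue equals the sum of squared perpendicular residuals, I report sqrt(lambda_min / n) as the RMS figure:

```python
import numpy as np

def fit_line(xs, ys):
    """Orthogonal regression: minimise sum_i (a x_i + b y_i + c)^2
    subject to a^2 + b^2 = 1. (a, b) is the unit eigenvector of the
    2nd-moment matrix U^T U for its smaller eigenvalue, and
    c = -(a*xbar + b*ybar)."""
    xbar, ybar = xs.mean(), ys.mean()
    dx, dy = xs - xbar, ys - ybar
    UtU = np.array([[dx @ dx, dx @ dy],
                    [dx @ dy, dy @ dy]])
    w, V = np.linalg.eigh(UtU)        # eigenvalues in ascending order
    a, b = V[:, 0]                    # eigenvector of the smaller eigenvalue
    c = -(a * xbar + b * ybar)
    rms = np.sqrt(max(w[0], 0.0) / len(xs))   # guard tiny negative roundoff
    return a, b, c, rms
```

For exactly collinear edgels the smaller eigenvalue is zero, so the residuals a x_i + b y_i + c all vanish.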
Example of String and Line Outputs
Problems using 1D image structure for Geometry
Computing edges makes the feature map sparse but interpretable.
Much of the salient information is retained.
If the camera motion is known, feature matching is a 1D problem, for
which edges are very well suited (see Lecture 4).
However, matching is much harder when
the camera motion is unknown: this is
the Aperture problem.
End points are unstable, hence line
matching is largely uncertain. (Indeed
only line orientation is useful for detailed
geometrical work.)
In general, matching requires the unambiguity of 2D image features
or corners.
Corners are defined as sharp peaks in the 2D autocorrelation of
local patches in the image.
2.5 Corner detection: preamble on auto-correlation
Suppose that we are interested in
correlating a (2n + 1)^2 pixel patch
at (x, y) in image I with a similar
patch displaced from it by (u, v).
We would write the correlation between the patches as

C_{uv}(x, y) = \sum_{i=-n}^{n} \sum_{j=-n}^{n} I(x + i, y + j)\, I(x + u + i, y + v + j).

As we keep (x, y) fixed, but change (u, v), we build up the
auto-correlation surface around (x, y).
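The surface can be built exactly as the formula says (an illustrative sketch of mine, which assumes the displaced patches stay inside the image):

```python
import numpy as np

def autocorr_surface(I, x, y, n=2, r=2):
    """Auto-correlation C_uv around pixel (x, y): correlate the
    (2n+1)^2 patch at (x, y) with patches displaced by (u, v),
    for u, v in [-r, r]. No bounds checking at the image border."""
    patch = I[y - n:y + n + 1, x - n:x + n + 1]
    C = np.zeros((2 * r + 1, 2 * r + 1))
    for v in range(-r, r + 1):
        for u in range(-r, r + 1):
            shifted = I[y + v - n:y + v + n + 1, x + u - n:x + u + n + 1]
            C[v + r, u + r] = np.sum(patch * shifted)
    return C
```

For a lone bright pixel the surface is an isolated peak at (u, v) = (0, 0), the extreme case of the corner-like behaviour described next.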
Preamble: Auto-correlation
A reminder of the correlation:

C_{uv}(x, y) = \sum_{i=-n}^{n} \sum_{j=-n}^{n} I(x + i, y + j)\, I(x + u + i, y + v + j).

Here are the surfaces around 3 different patches:
Plain corner / Straight edge / 3-way corner
A pixel in a uniform region will have a flat autocorrelation.
A pixel on an edge will have a ridge-like autocorrelation, but
a pixel at a corner has a peak.
Results ...
So a simple corner detector might have
the following steps at each pixel:
a. Determine the auto-correlation
C_{uv}(x, y) around the pixel position
(x, y).
b. Find positions (x, y) where C_{uv}(x, y) is
maximum in two directions.
c. If C_{uv}(x, y) > threshold, mark as a
corner.
There is an expression which can be computed more cheaply than
C_{uv} and which gives comparable qualitative results.
This is the sum of squared differences

E_{uv}(x, y) = \sum_{i=-n}^{+n} \sum_{j=-n}^{+n} \left[ I(x + u + i, y + v + j) - I(x + i, y + j) \right]^2.
Harris Corner Detector (Chris Harris, 1987)
Earlier we estimated the gradient from the pixel values.
Now assume we know the gradients, and estimate a pixel
difference using a 1st order Taylor expansion ...

I(x + u, y + v) - I(x, y) \approx u\, \frac{\partial I(x, y)}{\partial x} + v\, \frac{\partial I(x, y)}{\partial y}

So the sum of squared differences can be approximated as

E_{uv}(x, y) = \sum_{i=-n}^{+n} \sum_{j=-n}^{+n} \left( I(x + u + i, y + v + j) - I(x + i, y + j) \right)^2
\approx \sum_i \sum_j \left( u\, \frac{\partial I}{\partial x} + v\, \frac{\partial I}{\partial y} \right)^2
= \sum_i \sum_j \left[ u^2 \left(\frac{\partial I}{\partial x}\right)^2 + 2uv\, \frac{\partial I}{\partial x} \frac{\partial I}{\partial y} + v^2 \left(\frac{\partial I}{\partial y}\right)^2 \right]

Note, \partial I/\partial x etc are computed at the relevant (x + i, y + j).
Harris Corner Detector /ctd
The double sum over i, j is replaced by a convolution with the window

W = \begin{pmatrix} 1&1&1&1&1 \\ 1&1&1&1&1 \\ 1&1&1&1&1 \\ 1&1&1&1&1 \\ 1&1&1&1&1 \end{pmatrix}

E_{uv}(x, y) = \sum_i \sum_j \left[ u^2 \left(\frac{\partial I}{\partial x}\right)^2 + 2uv\, \frac{\partial I}{\partial x} \frac{\partial I}{\partial y} + v^2 \left(\frac{\partial I}{\partial y}\right)^2 \right]
= u^2 \left( W * \left(\frac{\partial I}{\partial x}\right)^2 \right) + 2uv \left( W * \frac{\partial I}{\partial x}\frac{\partial I}{\partial y} \right) + v^2 \left( W * \left(\frac{\partial I}{\partial y}\right)^2 \right)
= ( u \; v ) \begin{pmatrix} p & r \\ r & q \end{pmatrix} \begin{pmatrix} u \\ v \end{pmatrix}

where

p(x, y) = W * (\partial I/\partial x)^2
q(x, y) = W * (\partial I/\partial y)^2
r(x, y) = W * (\partial I/\partial x)(\partial I/\partial y)
Harris Corner Detector /ctd
We can introduce smoothing by replacing W with a Gaussian G, so
that

E_{uv}(x, y) = ( u \; v ) \begin{pmatrix} p & r \\ r & q \end{pmatrix} \begin{pmatrix} u \\ v \end{pmatrix}

where

p(x, y) = G * (\partial I/\partial x)^2
q(x, y) = G * (\partial I/\partial y)^2
r(x, y) = G * (\partial I/\partial x)(\partial I/\partial y)

The quantities p, q and r computed at each (x, y) define the shape of
the auto-correlation function E_{uv}(x, y) at (x, y).
Harris Corner Detector /ctd
Recall that

E_{uv}(x, y) = ( u \; v ) \begin{pmatrix} p & r \\ r & q \end{pmatrix} \begin{pmatrix} u \\ v \end{pmatrix}

Now (u, v) defines a direction, so an estimate of E_{uv} a unit distance
away from (x, y) along a vector direction n is then

E = \frac{n^\top S n}{n^\top n} \quad \text{where} \quad S = \begin{pmatrix} p & r \\ r & q \end{pmatrix}

Now recall (eg, from the A1 Eng Comp notes) that if \lambda_1 and \lambda_2 are the
larger and smaller eigenvalues of S, respectively, then

\lambda_2 \le \frac{n^\top S n}{n^\top n} \le \lambda_1, \quad \text{so} \quad \lambda_2 \le E \le \lambda_1.
/ctd Harris Corner Detector
This allows a classification of image structure:
Both \lambda_1, \lambda_2 \approx 0: the autocorrelation is small in all directions;
the image must be flat.
\lambda_1 \gg 0, \lambda_2 \approx 0: the autocorrelation is high in just one direction:
a 1D edge.
\lambda_1 \gg 0, \lambda_2 \gg 0: the autocorrelation is high in all directions: a
2D corner.
Harris' original interest score was later modified by Harris and
Stephens:

S_{Harris} = \frac{\lambda_1 \lambda_2}{\lambda_1 + \lambda_2} \qquad S_{HS} = \lambda_1 \lambda_2 - \kappa\, (\lambda_1 + \lambda_2)^2

\kappa is a positive constant, 0 \le \kappa \le 1/4, which decreases the response to
edges, sometimes called the edge-phobia.
With \kappa > 0 edges give negative scores, and corners positive. Scores close to zero
indicate flat surfaces or T features which are intermediary between edges and
corners. The size of \kappa determines how much edges are penalised.
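Putting the pieces together, the Harris-Stephens response can be sketched as below (my own illustrative code). Two assumptions: a flat 3x3 window W stands in for the Gaussian, and kappa = 0.04 is a commonly used value, not one fixed by the slides. Since S_{HS} = lambda_1 lambda_2 - kappa (lambda_1 + lambda_2)^2 = det(S) - kappa trace(S)^2, no eigen-decomposition is needed:

```python
import numpy as np

def harris_response(I, kappa=0.04):
    """Harris-Stephens score det(S) - kappa*trace(S)^2 at every pixel,
    with p, q, r formed by smoothing the gradient products with a
    flat 3x3 window (borders left at zero)."""
    I = I.astype(float)
    Ix = np.zeros_like(I)
    Iy = np.zeros_like(I)
    Ix[:, 1:-1] = 0.5 * (I[:, 2:] - I[:, :-2])   # central differences
    Iy[1:-1, :] = 0.5 * (I[2:, :] - I[:-2, :])

    def box(A):
        """Correlate with the 3x3 all-ones window W."""
        B = np.zeros_like(A)
        B[1:-1, 1:-1] = sum(A[1 + dy:A.shape[0] - 1 + dy,
                              1 + dx:A.shape[1] - 1 + dx]
                            for dy in (-1, 0, 1) for dx in (-1, 0, 1))
        return B

    p, q, r = box(Ix * Ix), box(Iy * Iy), box(Ix * Iy)
    return p * q - r * r - kappa * (p + q) ** 2   # det - kappa*trace^2
```

The classification above is visible directly in the output: a flat image scores zero everywhere, a straight step edge scores at or below zero, and a corner of a bright square scores positive.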
Examples
Summary
In this lecture we have considered:
As enabling techniques, convolution and correlation.
How to smooth imagery and detect gradients.
How to recover 1D structure (edges) in an image; in particular, we
have developed the Canny edge detector.
How to join edgels into strings, and to fit lines.
How to recover 2D structure (corners) in an image, and have
developed the Harris corner detector.