Como Observa Un Robot (How a Robot Sees)

This document provides an overview of computer vision techniques for feature detection, including edge detection and corner detection. It discusses how cameras capture photometric information and how step changes in image irradiance can reveal scene geometry. It describes common image operations used for feature detection, such as convolution, gradient calculation, and the Laplacian. The document focuses on the Canny edge detection algorithm, outlining an implementation that involves Gaussian smoothing, gradient calculation, non-maximal suppression, and hysteresis thresholding to link edges.


C18 Computer Vision


David Murray
[email protected]
www.robots.ox.ac.uk/dwm/Courses/4CV
Michaelmas 2013
Computer Vision: This time ...
1. Introduction; imaging geometry; camera calibration
2. Salient feature detection: edges, lines and corners
3. Recovering 3D from two images I: epipolar geometry.
4. Recovering 3D from two images II: stereo correspondence algorithms; triangulation.
Lecture 2
2.1 Cameras as photometric devices (just a note)
2.2 Image convolution
2.3 Edge detection
2.4 Edges to strings, strings to lines
2.5 Corner detection
2.1 Cameras as photometric devices
In Lecture 1, we considered the camera as a geometric abstraction
grounded on the rectilinear propagation of light.
But cameras are also photometric devices.
Important to consider the way image formation depends on:
the nature of the scene surface (reflecting, absorbing ...)
the relative orientations of surface, light source and camera
the power and spectral properties of the source
the spectral properties of the imaging system.
The important overall outcome (eg, Forsyth & Ponce, p62) is that image irradiance is proportional to the scene radiance.
A relief! This means the image really can tell us about the scene.
Cameras as photometric devices /ctd
But the study of photometry (often called physics-based vision) requires detailed models of the reflectance properties of the scene and of the imaging process itself, the sort of models that underpin photo-realistic graphics too.
Eg, understanding how light scatters on water droplets allowed this image to be de-fogged:
[Figure: original and de-fogged images]
Can we avoid going into such detail? ... Yes, by considering aspects of scene geometry that are revealed in step changes in image irradiance.
Step irradiance changes are due to ...
1. Changes in scene radiance. Natural (eg shadows) or deliberately introduced via artificial illumination: light stripes are generated either by a mask or from a laser.
2. Changes in the scene reflectance at sudden changes in surface orientation. These arise at the intersection of two surfaces, and thus represent geometrical entities fixed on the object in the scene.
3. Changes in reflectance properties due to changes in surface albedo. The reflectance properties are scaled by a changing albedo arising from surface markings. Again these are fixed to the object.
Feature detection
We are after step spatial changes in image irradiance because
(i) they are likely to be tied to scene geometry; and
(ii) they are likely to be salient (have high info content)
A simple classification of changes in image irradiance I(x, y) is into areas that, locally, have:
1D structure → Edge Detectors
2D structure → Corner Detectors
2.2 Image operations for Feature detection
Feature detection should be a local operation, working without
knowledge of higher geometrical entities or objects ...
We should use pixel values I(x, y) and derivatives $\partial I/\partial x$, $\partial I/\partial y$, and so on.
It would be useful to have a non-directional combination of these, so that a feature map of a rotated image is identical to the rotated feature map of the original image: F(RI) = RF(I).
Considering edge detection, two possibilities are:
Search for maxima in the gradient magnitude $\sqrt{(\partial I/\partial x)^2 + (\partial I/\partial y)^2}$ (1st order, but non-linear).
Search for zeroes in the Laplacian $\nabla^2 I = \partial^2 I/\partial x^2 + \partial^2 I/\partial y^2$ (linear, but 2nd order).
Which to choose?
The gradient magnitude is attractive because it is first order in the derivatives. Differentiation enhances noise, and the 2nd derivatives in the Laplacian operator introduce even more.
The Laplacian is attractive because it is linear, which means it can be implemented by a succession of fast linear operations, effectively matrix operations as we are dealing with a pixelated image.
Both approaches have been used.
For both methods we need to consider
how to compute gradients, and
how to suppress noise
so that insignificant variations in pixel intensity are not flagged up as edges.
This will involve spatial convolution of the image with an impulse response function that smooths the image and computes the gradients.
Preamble: spatial convolution
You are familiar with the 1D convolution integral in the time domain between an input signal i(t) and impulse response function h(t):
$o(t) = i(t) * h(t) = \int_{-\infty}^{+\infty} i(t - \tau)\, h(\tau)\, d\tau = \int_{-\infty}^{+\infty} i(\tau)\, h(t - \tau)\, d\tau$
The second equality reminds us that convolution commutes: $i(t) * h(t) = h(t) * i(t)$. It also associates.
In the frequency domain we would write O(s) = H(s)I(s).
Now in a continuous 2D domain, the spatial convolution integral is
$o(x, y) = i(x, y) * h(x, y) = \int_{-\infty}^{+\infty} \int_{-\infty}^{+\infty} i(x - a, y - b)\, h(a, b)\, da\, db$
In the spatial domain you'll often see h(x, y) referred to as the point spread function or the convolution mask.
Spatial convolution /ctd
For pixelated images I(x, y) we need a discrete convolution
$O(x, y) = I(x, y) * h(x, y) = \sum_i \sum_j I(x - i, y - j)\, h(i, j)$
for x, y ranging over the image width and height respectively, and i, j ensuring that access is made to any and all non-zero entries in h.
Many authors rewrite the convolution by replacing h(i, j) with $\tilde{h}(i, j) = h(-i, -j)$:
$O(x, y) = \sum_i \sum_j I(x - i, y - j)\, h(i, j) = \sum_i \sum_j I(x + i, y + j)\, h(-i, -j) = \sum_i \sum_j I(x + i, y + j)\, \tilde{h}(i, j)$
This looks more like the expression for a cross-correlation but, confusingly, it is still called a convolution.
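As a quick numerical check (a Python/NumPy sketch, not from the slides), convolution with a mask h gives the same output as correlation with the flipped mask $\tilde{h}(i, j) = h(-i, -j)$:

```python
import numpy as np
from scipy.signal import convolve2d, correlate2d

# Verify that convolving with h equals correlating with the flipped
# mask h~(i, j) = h(-i, -j), using the x-derivative mask of the next
# slide; the random array stands in for a grey-level image.
I = np.random.rand(64, 64)
h = np.array([[0.5, 0.0, -0.5]])               # h(-1) = +1/2, h(1) = -1/2

conv = convolve2d(I, h, mode='same')
corr = correlate2d(I, h[:, ::-1], mode='same')  # correlate with flipped h
assert np.allclose(conv, corr)
```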
Computing partial derivatives using convolution
We can approximate $\partial I/\partial x$ at image pixel (x, y) using a central difference:
$\partial I/\partial x \approx \tfrac{1}{2}\left\{[I(x+1, y) - I(x, y)] + [I(x, y) - I(x-1, y)]\right\} = \tfrac{1}{2}\left\{I(x+1, y) - I(x-1, y)\right\}$
Writing this as a proper convolution we would set
$h(-1) = +\tfrac{1}{2}, \quad h(0) = 0, \quad h(1) = -\tfrac{1}{2}$
$D(x, y) = I(x, y) * h(x) = \sum_{i=-1}^{1} I(x - i, y)\, h(i)$
Notice how the proper mask is reversed from what we might naively expect from the expression.
Computing partial derivatives using convolution /ctd
Now, as ever,
$\partial I/\partial x \approx \tfrac{1}{2}[I(x+1, y) - I(x-1, y)]$
Writing this as a sort of correlation:
$\tilde{h}(-1) = -\tfrac{1}{2}, \quad \tilde{h}(0) = 0, \quad \tilde{h}(1) = +\tfrac{1}{2}$
$D(x, y) = I(x, y) \otimes \tilde{h}(x) = \sum_{i=-1}^{1} I(x + i, y)\, \tilde{h}(i)$
Note how we can just lay this mask directly on the pixels to be multiplied and summed ...
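This correlation view maps directly onto code; a minimal sketch, with a random array standing in for the grey-level image:

```python
import numpy as np
from scipy.ndimage import correlate1d

# x- and y-gradients as correlations with h~ = [-1/2, 0, +1/2], laid
# directly on the pixels as described above.
I = np.random.rand(100, 100)
h_tilde = np.array([-0.5, 0.0, 0.5])
Dx = correlate1d(I, h_tilde, axis=1)   # axis 1 runs along x (columns)
Dy = correlate1d(I, h_tilde, axis=0)   # axis 0 runs along y (rows)
```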
Example Results
The actual image used is
grey-level, not colour
[Figure: x-gradient image and y-gradient image]
In 2 dimensions ...
As before, one imagines the flipped correlation-like mask centred on a pixel, and the sum of products computed.
Often a 2D mask is separable, in that it can be broken up into two separate 1D convolutions in x and y:
$O = h_{2d} * I = f_y * g_x * I$
The computational complexity is lower but intermediate storage is required, and so for a small mask it may be cheaper to use it directly.
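A sketch of separability in practice, using a small 1D Gaussian for both passes (the sigma and mask radius are arbitrary illustrative choices):

```python
import numpy as np
from scipy.ndimage import convolve1d

# A separable 2D Gaussian smooth: one 1D pass along x, then one along y.
def gauss1d(sigma, radius):
    x = np.arange(-radius, radius + 1)
    g = np.exp(-x**2 / (2.0 * sigma**2))
    return g / g.sum()                 # normalise to unit sum

I = np.random.rand(100, 100)
g = gauss1d(sigma=1.0, radius=3)       # a 7x1 mask
S = convolve1d(convolve1d(I, g, axis=1), g, axis=0)   # g_x then f_y
```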
Example result: Laplacian (non-directional)
The actual image used is
grey-level, not colour
[Figure: Laplacian of the image]
Noise and smoothing
Differentiation enhances noise: the edge appears clear enough in the image, but less so in the gradient map.
If we knew the noise spectrum, we might find an optimal brick-wall filter $g(x, y) \leftrightarrow G(s)$ to suppress noise edges outside the signal edge band. We would apply this g to the image ($g * I$) before finding derivatives.
But the sharper the cut-off in spatial-frequency, the wider the spatial support of g(x, y) has to be (it tends to an Infinite Impulse Response (IIR) filter). This is expensive to compute.
Conversely, if g(x, y) has finite size (an FIR filter) it will not be band-limited, spatially blurring the signal edges.
Can we compromise spread in space and spatial-frequency in some optimal way?
Compromise in space and spatial-frequency
Suppose our IR function is h(x) and h ↔ H is a Fourier transform pair.
Define the spreads in space and spatial-frequency as X and $\Omega$, where
$X^2 = \frac{\int (x - x_m)^2\, h^2(x)\, dx}{\int h^2(x)\, dx}$ with $x_m = \frac{\int x\, h^2(x)\, dx}{\int h^2(x)\, dx}$
$\Omega^2 = \frac{\int (\omega - \omega_m)^2\, H^2(\omega)\, d\omega}{\int H^2(\omega)\, d\omega}$ with $\omega_m = \frac{\int \omega\, H^2(\omega)\, d\omega}{\int H^2(\omega)\, d\omega}$
Now vary h to minimize the product of the spreads $U = X\Omega$.
An uncertainty principle indicates that $U_{\min} = 1/2$ when
$h(x) = \text{a Gaussian function} = \frac{1}{\sqrt{2\pi}\,\sigma} \exp\left(-\frac{x^2}{2\sigma^2}\right)$
2.3 The Canny Edge Detector (JF Canny 1986)
2D Gaussian smoothing is applied to the image:
$S(x, y) = G(x, y) * I = \frac{1}{2\pi\sigma^2} \exp\left(-\frac{x^2 + y^2}{2\sigma^2}\right) * I(x, y)$
But separated into two separable masks:
$S(x, y) = \frac{1}{\sqrt{2\pi}\,\sigma} \exp\left(-\frac{x^2}{2\sigma^2}\right) * \frac{1}{\sqrt{2\pi}\,\sigma} \exp\left(-\frac{y^2}{2\sigma^2}\right) * I(x, y)$
Then 1st derivatives in x and y are found by two further convolutions:
$D_x(x, y) = \partial S/\partial x = h_x * S \qquad D_y(x, y) = \partial S/\partial y = h_y * S$
Interestingly, Canny's decision to use a Gaussian came from a separate study of how to maximize robustness, localization, and uniqueness in edge detection.
Canny/ Implementation
Step 1: Gaussian smooth the image: $S = G * I$. This is separable: $S = G_x * G_y * I$. Often 5×1 or 7×1 masks are used.
Step 2: Compute the gradients $\partial S/\partial x$, $\partial S/\partial y$ using derivative masks.
Canny/ Implementation
Step 3: Non-maximal suppression
a. Use $\partial S/\partial x$ and $\partial S/\partial y$ to find the gradient magnitude and direction at a pixel.
b. Track along the direction to marked positions on the neighbourhood perimeter.
c. Linearly interpolate the gradient magnitude at those positions using the magnitudes at the arrowed pixels.
d. If the gradient magnitude at the central pixel is greater than both interpolated values, declare an edgel, and fit a parabola to find (x, y) to sub-pixel acuity along the edge direction.
Step 4: Store each edgel's sub-pixel position (x, y), gradient strength, and orientation (full 360°, as light/dark is distinguished).
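A simplified sketch of steps 3a-3d: where the slides interpolate the gradient magnitude along the true gradient direction and fit a parabola for sub-pixel acuity, this sketch quantises the direction to the nearest of four neighbour pairs and skips the sub-pixel fit (an assumed simplification for brevity):

```python
import numpy as np

def non_max_suppress(Dx, Dy):
    mag = np.hypot(Dx, Dy)                       # gradient magnitude
    theta = np.arctan2(Dy, Dx)                   # gradient direction
    q = (np.round(theta / (np.pi / 4)) % 4).astype(int)  # 4 direction bins
    offs = {0: (0, 1), 1: (1, 1), 2: (1, 0), 3: (1, -1)} # (dy, dx) per bin
    edgels = np.zeros_like(mag, dtype=bool)
    for y in range(1, mag.shape[0] - 1):
        for x in range(1, mag.shape[1] - 1):
            dy, dx = offs[q[y, x]]
            # keep only pixels that beat both neighbours along the gradient
            if mag[y, x] > mag[y + dy, x + dx] and mag[y, x] > mag[y - dy, x - dx]:
                edgels[y, x] = True
    return edgels, mag, theta
```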
Canny/ Implementation: Hysteresis Thresholding
Step 5: Link edgels together into strings, using orientation to aid the search for neighbouring edgels.
Step 6: Perform thresholding with hysteresis, by running along the linked strings.
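As a usage note (not the course implementation), the whole pipeline of steps 1-6 is packaged in OpenCV; the filename and thresholds below are illustrative values:

```python
import cv2

# Gaussian smoothing, gradients, non-maximal suppression and hysteresis
# thresholding in one call ('image.png' and 50/150 are illustrative).
img = cv2.imread('image.png', cv2.IMREAD_GRAYSCALE)
edges = cv2.Canny(img, 50, 150)   # low and high hysteresis thresholds
```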
Example of Edgel and String Outputs
2.4 From Strings to Straight Lines
Split algorithm for lines (see the sketch after these lists):
1. Fit a straight line to all edgels in the string.
2. If the RMS error is less than a threshold, accept and stop.
3. Otherwise find the point of highest curvature on the edge string and split into two. Repeat from 1 for each substring.
Merge algorithm for lines:
1. Fit straight lines to each pair of consecutive edgels in a string.
2. Compute the RMS error for each potential merger of an adjacent pair of lines into a single line, and find the pair for which the RMS error is minimum.
3. If the RMS error is less than a threshold, merge and repeat from 2.
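A minimal sketch of the split algorithm. The orthogonal line fit is the one detailed on the next slide (done here via SVD), and "point of highest curvature" is approximated by the edgel furthest from the fitted line, an assumed simplification:

```python
import numpy as np

def fit_line(pts):
    centroid = pts.mean(axis=0)
    _, _, Vt = np.linalg.svd(pts - centroid)
    normal = Vt[-1]                        # direction of least variance
    d = (pts - centroid) @ normal          # signed orthogonal distances
    return normal, centroid, np.sqrt(np.mean(d**2)), np.abs(d)

def split(pts, rms_thresh):
    normal, centroid, rms, d = fit_line(pts)
    if rms < rms_thresh or len(pts) < 3:   # step 2: accept and stop
        return [(normal, centroid)]
    k = int(np.clip(np.argmax(d), 1, len(pts) - 2))  # step 3: split point
    return split(pts[:k + 1], rms_thresh) + split(pts[k:], rms_thresh)
```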
Actually fitting lines and finding that RMS error
Consider n edgels that have been linked into an edge (sub-)string.
Fitting a line requires orthogonal regression.
See B14, or the variant here ...
Find
$\min E(a, b, c) = \sum_{i=1}^{n} (a x_i + b y_i + c)^2 \quad \text{subject to} \quad a^2 + b^2 = 1$
Hence a 2-dof problem.
From edge elements to lines /ctd
Now $\min E(a, b, c) = \min \sum_{i=1}^{n} (a x_i + b y_i + c)^2$ requires $\partial E/\partial c = 0$:
$2 \sum_{i=1}^{n} (a x_i + b y_i + c) = 0 \;\Rightarrow\; c = -(a\bar{x} + b\bar{y})$
Therefore E becomes
$E = \sum_{i=1}^{n} \left[a(x_i - \bar{x}) + b(y_i - \bar{y})\right]^2 = \begin{pmatrix} a & b \end{pmatrix} U^{\top}U \begin{pmatrix} a \\ b \end{pmatrix}$
where the 2nd-moment matrix is
$U^{\top}U = \begin{pmatrix} \sum_{i=1}^{n} x_i^2 - n\bar{x}^2 & \sum_{i=1}^{n} x_i y_i - n\bar{x}\bar{y} \\ \sum_{i=1}^{n} x_i y_i - n\bar{x}\bar{y} & \sum_{i=1}^{n} y_i^2 - n\bar{y}^2 \end{pmatrix}$
The solution (a, b) is given by the unit eigenvector of $U^{\top}U$ corresponding to the smaller eigenvalue. (Eigen-decomposition or SVD.)
The RMS error is given by the eigenvalue itself.
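The same fit in code: a sketch that builds $U^{\top}U$ from centred edgels exactly as above (pts is an assumed (n, 2) array of linked edgel positions):

```python
import numpy as np

def orthogonal_fit(pts):
    xb, yb = pts.mean(axis=0)
    dx, dy = pts[:, 0] - xb, pts[:, 1] - yb
    UtU = np.array([[dx @ dx, dx @ dy],       # sum x^2 - n xbar^2, etc.
                    [dx @ dy, dy @ dy]])
    lam, vec = np.linalg.eigh(UtU)            # eigenvalues ascending
    a, b = vec[:, 0]                          # eigenvector of smaller one
    c = -(a * xb + b * yb)
    return a, b, c, lam[0]                    # lam[0] is the minimum of E
```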
Example of String and Line Outputs
Problems using 1D image structure for Geometry
Computing edges makes the feature map sparse but interpretable. Much of the salient information is retained.
If the camera motion is known, feature matching is a 1D problem, for which edges are very well suited (see Lecture 4).
However, matching is much harder when the camera motion is unknown: this is the aperture problem.
End points are unstable, hence line matching is largely uncertain. (Indeed only line orientation is useful for detailed geometrical work.)
In general, matching requires the unambiguity of 2D image features, or corners.
Corners are defined as sharp peaks in the 2D autocorrelation of local patches in the image.
2.5 Corner detection: preamble on auto-correlation
Suppose that we are interested in correlating a $(2n+1)^2$ pixel patch at (x, y) in image I with a similar patch displaced from it by (u, v).
We would write the correlation between the patches as
$C_{uv}(x, y) = \sum_{i=-n}^{n} \sum_{j=-n}^{n} I(x+i, y+j)\, I(x+u+i, y+v+j)$
As we keep (x, y) fixed, but change (u, v), we build up the auto-correlation surface around (x, y).
Preamble: Auto-correlation
Reminder, correlation:
$C_{uv}(x, y) = \sum_{i=-n}^{n} \sum_{j=-n}^{n} I(x+i, y+j)\, I(x+u+i, y+v+j)$
Here are the surfaces around 3 different patches:
[Figure: auto-correlation surfaces for a plain corner, a straight edge, and a 3-way corner]
A pixel in a uniform region will have a flat autocorrelation.
A pixel on an edge will have a ridge-like autocorrelation, but
a pixel at a corner has a peak.
Results ...
So a simple corner detector might have the following steps at each pixel:
a. Determine the auto-correlation $C_{uv}(x, y)$ around the pixel position (x, y).
b. Find positions (x, y) where $C_{uv}(x, y)$ is maximum in two directions.
c. If $C_{uv}(x, y) >$ threshold, mark as a corner.
There is an expression which can be computed more cheaply than $C_{uv}$ and which gives comparable qualitative results.
This is the sum of squared differences
$E_{uv}(x, y) = \sum_{i=-n}^{+n} \sum_{j=-n}^{+n} \left[I(x+u+i, y+v+j) - I(x+i, y+j)\right]^2$
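A brute-force sketch that builds this SSD surface around one pixel (the patch half-size n and shift range m are illustrative choices):

```python
import numpy as np

def ssd_surface(I, x, y, n=3, m=5):
    patch = I[y - n:y + n + 1, x - n:x + n + 1]
    E = np.zeros((2 * m + 1, 2 * m + 1))
    for v in range(-m, m + 1):
        for u in range(-m, m + 1):
            shifted = I[y + v - n:y + v + n + 1, x + u - n:x + u + n + 1]
            E[v + m, u + m] = np.sum((shifted - patch) ** 2)
    return E   # near-flat in a uniform region; a valley along an edge;
               # rising in every direction away from (u, v) = 0 at a corner
```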
Harris Corner Detector (Chris Harris, 1987)
Earlier we estimated the gradient from the pixel values. Now assume we know the gradients, and estimate a pixel difference using a 1st order Taylor expansion ...
$I(x+u, y+v) - I(x, y) \approx u\, \frac{\partial I(x, y)}{\partial x} + v\, \frac{\partial I(x, y)}{\partial y}$
So the sum of squared differences can be approximated as
$E_{uv}(x, y) = \sum_{i=-n}^{+n} \sum_{j=-n}^{+n} \left(I(x+u+i, y+v+j) - I(x+i, y+j)\right)^2 \approx \sum_{ij} \left[u \frac{\partial I}{\partial x} + v \frac{\partial I}{\partial y}\right]^2 = \sum_{ij} \left[u^2 \left(\frac{\partial I}{\partial x}\right)^2 + 2uv \frac{\partial I}{\partial x}\frac{\partial I}{\partial y} + v^2 \left(\frac{\partial I}{\partial y}\right)^2\right]$
Note, $\partial I/\partial x$ etc are computed at the relevant (x+i, y+j).
Harris Corner Detector /ctd
The double sum over i, j is replaced by a convolution with the all-ones mask
$W = \begin{pmatrix} 1&1&1&1&1 \\ 1&1&1&1&1 \\ 1&1&1&1&1 \\ 1&1&1&1&1 \\ 1&1&1&1&1 \end{pmatrix}$
$E_{uv}(x, y) = \sum_{ij} \left(u^2 \left(\frac{\partial I}{\partial x}\right)^2 + 2uv \frac{\partial I}{\partial x}\frac{\partial I}{\partial y} + v^2 \left(\frac{\partial I}{\partial y}\right)^2\right) = u^2 \left(W * \left(\frac{\partial I}{\partial x}\right)^2\right) + 2uv \left(W * \frac{\partial I}{\partial x}\frac{\partial I}{\partial y}\right) + v^2 \left(W * \left(\frac{\partial I}{\partial y}\right)^2\right) = \begin{pmatrix} u & v \end{pmatrix} \begin{pmatrix} p & r \\ r & q \end{pmatrix} \begin{pmatrix} u \\ v \end{pmatrix}$
where
$p(x, y) = W * (\partial I/\partial x)^2 \qquad q(x, y) = W * (\partial I/\partial y)^2 \qquad r(x, y) = W * (\partial I/\partial x)(\partial I/\partial y)$
Harris Corner Detector /ctd
We can introduce smoothing by replacing W with a Gaussian G, so that
$E_{uv}(x, y) = \begin{pmatrix} u & v \end{pmatrix} \begin{pmatrix} p & r \\ r & q \end{pmatrix} \begin{pmatrix} u \\ v \end{pmatrix}$
where
$p(x, y) = G * (\partial I/\partial x)^2 \qquad q(x, y) = G * (\partial I/\partial y)^2 \qquad r(x, y) = G * (\partial I/\partial x)(\partial I/\partial y)$
The quantities p, q and r computed at each (x, y) define the shape of the auto-correlation function $E_{uv}(x, y)$ at (x, y).
Harris Corner Detector /ctd
Recall that
$E_{uv}(x, y) = \begin{pmatrix} u & v \end{pmatrix} \begin{pmatrix} p & r \\ r & q \end{pmatrix} \begin{pmatrix} u \\ v \end{pmatrix}$
Now (u, v) defines a direction, so an estimate of $E_{uv}$ a unit distance away from (x, y) along a vector direction n is then
$E \approx \frac{n^{\top} S n}{n^{\top} n} \quad \text{where} \quad S = \begin{pmatrix} p & r \\ r & q \end{pmatrix}$
Now recall (eg, from A1 Eng Comp notes) that if $\lambda_1$ and $\lambda_2$ are the larger and smaller eigenvalues of S, respectively, then
$\lambda_2 \le \frac{n^{\top} S n}{n^{\top} n} \le \lambda_1 \quad \Rightarrow \quad \lambda_2 \le E \le \lambda_1$
Harris Corner Detector /ctd
This allows a classification of image structure:
Both $\lambda_1, \lambda_2 \approx 0$: the autocorrelation is small in all directions, so the image must be flat.
$\lambda_1 \gg 0$, $\lambda_2 \approx 0$: the autocorrelation is high in just one direction: a 1D edge.
$\lambda_1 \gg 0$, $\lambda_2 \gg 0$: the autocorrelation is high in all directions: a 2D corner.
Harris' original interest score was later modified by Harris and Stephens:
$S_{\text{Harris}} = \frac{\lambda_1 \lambda_2}{\lambda_1 + \lambda_2} \qquad S_{\text{HS}} = \lambda_1 \lambda_2 - \kappa(\lambda_1 + \lambda_2)^2$
$\kappa$ is a positive constant, $0 < \kappa < 1$, which decreases the response to edges, sometimes called the edge-phobia. With $\kappa > 0$ edges give negative scores, and corners positive. Scores close to zero indicate flat surfaces or T features which are intermediary between edges and corners. The size of $\kappa$ determines how much edges are penalised.
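Pulling the slide's quantities together, a sketch of the Harris-Stephens score map; kappa = 0.04 is a conventional choice rather than a value from the slides:

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def harris(I, sigma=1.0, kappa=0.04):
    Iy, Ix = np.gradient(I.astype(float))   # central-difference gradients
    p = gaussian_filter(Ix * Ix, sigma)     # p = G * (dI/dx)^2
    q = gaussian_filter(Iy * Iy, sigma)     # q = G * (dI/dy)^2
    r = gaussian_filter(Ix * Iy, sigma)     # r = G * (dI/dx)(dI/dy)
    det, tr = p * q - r * r, p + q          # = l1*l2 and l1 + l2 of S
    return det - kappa * tr**2              # S_HS = l1*l2 - kappa*(l1+l2)^2
```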
Examples
Summary
In this lecture we have considered:
As enabling techniques, we have described convolution and correlation.
Described how to smooth imagery and detect gradients.
Described how to recover 1D structure (edges) in an image; in particular, we have developed the Canny edge detector.
Considered how to join edgels into strings, and to fit lines.
Described how to recover 2D structure (corners) in an image, and developed the Harris corner detector.