
4.8 Localized feature extraction

Two main areas are covered here. The traditional approaches aim to derive local features by
measuring specific image properties. The main target has been to estimate curvature: peaks
of local curvature are corners, and analysing an image by its corners is especially suited to
images of artificial objects. The second area includes more modern approaches that improve
performance by using region or patch-based analysis. We shall start with the more established
curvature-based operators, before moving to the patch or region-based analysis.
4.8.1 Detecting image curvature (corner extraction)
4.8.1.1 Definition of curvature
Edges are perhaps the low-level image features that are most obvious to human vision. They
preserve significant features, so we can usually recognize what an image contains from its
edge-detected version. However, there are other low-level features that can be used in computer
vision. One important feature is curvature. Intuitively, we can consider curvature as the rate of
change in edge direction. This rate of change characterizes the points in a curve; points where the
edge direction changes rapidly are corners, whereas points where there is little change in edge
direction correspond to straight lines. Such extreme points are very useful for shape description
and matching, since they represent significant information with reduced data.
Curvature is normally defined by considering a parametric form of a planar curve. The
parametric contour $v(t) = x(t)U_x + y(t)U_y$ describes the points in a continuous curve as the
endpoints of the position vector. Here, the values of $t$ define an arbitrary parameterization, and the
unit vectors are again $U_x = [1\;0]$ and $U_y = [0\;1]$. Changes in the position vector are given by
the tangent vector function of the curve $v(t)$. That is, $\dot{v}(t) = \dot{x}(t)U_x + \dot{y}(t)U_y$. This vectorial
expression has a simple intuitive meaning. If we think of the trace of the curve as the motion
of a point and $t$ is related to time, the tangent vector defines the instantaneous motion. At
any moment, the point moves with a speed given by $|\dot{v}(t)| = \sqrt{\dot{x}^2(t) + \dot{y}^2(t)}$ in the direction
$\varphi(t) = \tan^{-1}\left(\dot{y}(t)/\dot{x}(t)\right)$. The curvature at a point $v(t)$ describes the changes in the direction
$\varphi(t)$ with respect to changes in arc length. That is,

$\kappa(t) = \dfrac{d\varphi(t)}{ds}$    (4.41)
where $s$ is arc length along the edge itself. Here $\varphi$ is the angle of the tangent to the curve.
That is, $\varphi = \theta \pm 90^{\circ}$, where $\theta$ is the gradient direction defined in Equation 4.13. That is, if
we apply an edge detector operator to an image, we have for each pixel a gradient direction
value that represents the normal direction to each point in a curve. The tangent to a curve is
given by an orthogonal vector. Curvature is given with respect to arc length because a curve
parameterized by arc length maintains a constant speed of motion. Thus, curvature represents
changes in direction for constant displacements along the curve. By considering the chain rule,
we have

$\kappa(t) = \dfrac{d\varphi(t)}{dt}\,\dfrac{dt}{ds}$    (4.42)
The differential $ds/dt$ defines the change in arc length with respect to the parameter $t$. If we
again consider the curve as the motion of a point, this differential defines the instantaneous
change in distance with respect to time. That is, the instantaneous speed. Thus,

$ds/dt = |\dot{v}(t)| = \sqrt{\dot{x}^2(t) + \dot{y}^2(t)}$    (4.43)

and

$dt/ds = 1\Big/\sqrt{\dot{x}^2(t) + \dot{y}^2(t)}$    (4.44)

By considering that $\varphi(t) = \tan^{-1}\left(\dot{y}(t)/\dot{x}(t)\right)$, the curvature at a point $v(t)$ in Equation 4.42
is given by

$\kappa(t) = \dfrac{\dot{x}(t)\ddot{y}(t) - \dot{y}(t)\ddot{x}(t)}{\left(\dot{x}^2(t) + \dot{y}^2(t)\right)^{3/2}}$    (4.45)
This relationship is called the curvature function and it is the standard measure of curvature for
planar curves (Apostol, 1966). An important feature of curvature is that it relates the derivative
of a tangential vector to a normal vector. This can be explained by the simplified Serret–Frenet
equations (Goetz, 1970) as follows. We can express the tangential vector in polar form as

$\dot{v}(t) = |\dot{v}(t)|\left(\cos\varphi(t) + j\sin\varphi(t)\right)$    (4.46)

If the curve is parameterized by arc length, then $|\dot{v}(t)|$ is constant. Thus, the derivative of a
tangential vector is simply given by

$\ddot{v}(t) = |\dot{v}(t)|\left(-\sin\varphi(t) + j\cos\varphi(t)\right)\dfrac{d\varphi(t)}{dt}$    (4.47)

Since we are using a normal parameterization, then $d\varphi(t)/dt = d\varphi(t)/ds$. Thus, the derivative of the
tangential vector can be written as

$\ddot{v}(t) = \kappa(t)\,n(t)$    (4.48)

where $n(t) = |\dot{v}(t)|\left(-\sin\varphi(t) + j\cos\varphi(t)\right)$ defines the direction of $\ddot{v}(t)$, while the curvature
$\kappa(t)$ defines its modulus. The derivative of the normal vector is given by
$\dot{n}(t) = |\dot{v}(t)|\left(-\cos\varphi(t) - j\sin\varphi(t)\right)d\varphi(t)/ds$, which can be written as

$\dot{n}(t) = -\kappa(t)\dot{v}(t)$    (4.49)

Clearly, $n(t)$ is normal to $\dot{v}(t)$. Therefore, for each point in the curve, there is a pair of orthogonal
vectors $\dot{v}(t)$ and $n(t)$ whose moduli are proportionally related by the curvature.
In general, the curvature of a parametric curve is computed by evaluating Equation 4.45.
For a straight line, for example, the second derivatives $\ddot{x}(t)$ and $\ddot{y}(t)$ are zero, so the curvature
function is nil. For a circle of radius $r$, we have $\dot{x}(t) = r\cos(t)$ and $\dot{y}(t) = -r\sin(t)$, so that
$\ddot{x}(t) = -r\sin(t)$, $\ddot{y}(t) = -r\cos(t)$ and $\kappa(t) = 1/r$. However, for curves in digital images, the
derivatives must be computed from discrete data. This can be done in three main ways. The
most obvious approach is to calculate curvature by directly computing the difference between
angular direction of successive edge pixels in a curve. A second approach is to derive a measure
of curvature changes in image intensity. Finally, a measure of curvature can be obtained by
correlation.
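
As a quick numerical check of Equation 4.45, the short sketch below (an illustration of the formula, not code from this text) evaluates the curvature function for a sampled circle using finite-difference estimates of the derivatives; away from the endpoints the magnitude should be close to 1/r, with the sign indicating the sense in which the circle is traversed. The function name curvature_parametric is ours, chosen for this example.

import numpy as np

def curvature_parametric(x, y):
    # First and second derivatives with respect to the parameter t,
    # estimated by central finite differences, then Equation 4.45.
    dx, dy = np.gradient(x), np.gradient(y)
    ddx, ddy = np.gradient(dx), np.gradient(dy)
    return (dx * ddy - dy * ddx) / (dx ** 2 + dy ** 2) ** 1.5

# Circle of radius 5: the curvature magnitude should be close to 1/5 = 0.2.
t = np.linspace(0, 2 * np.pi, 400)
x, y = 5 * np.cos(t), 5 * np.sin(t)
kappa = curvature_parametric(x, y)
print(np.abs(kappa[10:-10]).mean())   # approximately 0.2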

4.8.1.2 Computing differences in edge direction


Perhaps the easiest way to compute curvature in digital images is to measure the angular change
along the curve’s path. This approach was considered in early corner detection techniques
(Bennett and MacDonald, 1975; Groan and Verbeek, 1978; Kitchen and Rosenfeld, 1982)
and it merely computes the difference in edge direction between connected pixels forming
a discrete curve. That is, it approximates the derivative in Equation 4.41 as the difference
between neighbouring pixels. As such, curvature is simply given by

$\kappa(t) = \varphi_{t+1} - \varphi_{t-1}$    (4.50)

where the sequence $\ldots,\varphi_{t-1},\varphi_{t},\varphi_{t+1},\varphi_{t+2},\ldots$ represents the gradient direction of a sequence of
pixels defining a curve segment. Gradient direction can be obtained as the angle given by an edge
detector operator. Alternatively, it can be computed by considering the position of pixels in the
sequence. That is, by defining $\varphi_t = \tan^{-1}\left((y_{t-1} - y_{t+1})/(x_{t-1} - x_{t+1})\right)$, where $(x_t, y_t)$ denotes pixel $t$ in
the sequence. Since edge points are only defined at discrete points, this angle can only take eight
values, so the computed curvature is very ragged. This can be smoothed out by considering the
difference in mean angular direction of $n$ pixels on the leading and trailing curve segments. That is,

$\kappa_n(t) = \dfrac{1}{n}\sum_{i=1}^{n}\varphi_{t+i} - \dfrac{1}{n}\sum_{i=-n}^{-1}\varphi_{t+i}$    (4.51)
The average also gives some immunity to noise, and it can be replaced by a weighted average
if Gaussian smoothing is required. The number of pixels considered, the value of $n$, defines
a compromise between accuracy and noise sensitivity. Notice that filtering techniques may
also be used to reduce the quantization effect when angles are obtained by an edge detection
operator. As we have already discussed, the level of filtering is related to the size
of the template (as in Section 3.4.3).
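
As a brief illustration of Equations 4.50 and 4.51, the following sketch (ours, not the book's Code 4.12 or 4.16) computes the averaged angular difference along an already-extracted, connected sequence of curve points, with the angles obtained from pixel positions; note the explicit wrapping of the angular difference into [−π, π], which the equations leave implicit.

import numpy as np

def curvature_by_angle_difference(xs, ys, n=3):
    # Curvature along a connected pixel sequence (Equations 4.50 and 4.51):
    # angles phi_t are estimated from neighbouring positions, and the means of
    # n leading and n trailing angles are differenced to reduce quantization.
    xs, ys = np.asarray(xs, dtype=float), np.asarray(ys, dtype=float)
    phi = np.arctan2(ys[:-2] - ys[2:], xs[:-2] - xs[2:])   # phi_t from positions
    kappa = np.zeros(len(phi))
    for t in range(n, len(phi) - n):
        lead = phi[t + 1:t + n + 1].mean()
        trail = phi[t - n:t].mean()
        kappa[t] = np.angle(np.exp(1j * (lead - trail)))   # wrap into [-pi, pi]
    return kappa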
To compute angular differences, we need to determine connected edges. This can easily be
implemented with the code already developed for hysteresis thresholding in the Canny edge
operator. To compute the difference of points in a curve, the connect routine (Code 4.12)
only needs to be arranged to store the difference in edge direction between connected points.
Code 4.16 shows an implementation for curvature detection. First, edges and magnitudes are
determined. Curvature is only detected at edge points. As such, we apply maximal suppression.
The function Cont returns a matrix containing the connected neighbour pixels of each edge.
Each edge pixel is connected to one or two neighbours. The matrix Next stores only the
direction of consecutive pixels in an edge. We use a value of −1 to indicate that there is no
connected neighbour. The function NextPixel obtains the position of a neighbouring pixel
by taking the position of a pixel and the direction of its neighbour. The curvature is computed
as the difference in gradient direction of connected neighbour pixels.
The result of applying this form of curvature detection to an image is shown in Figure 4.37.
Figure 4.37(a) contains the silhouette of an object; Figure 4.37(b) is the curvature obtained
by computing the rate of change of edge direction. In this figure, curvature is defined only
at the edge points. Here, by its formulation the measurement of curvature gives just a
thin line of differences in edge direction, which can be seen to track the perimeter points of
the shapes (at points where there is measured curvature). The brightest points are those with
greatest curvature. To show the results, we have scaled the curvature values to use 256 intensity
values. The estimates of corner points could be obtained by a uniformly thresholded version of
Figure 4.37(b), well in theory anyway!

Figure 4.37 Curvature detection by difference: (a) image; (b) detected corners

Unfortunately, as can be seen, this approach does not provide reliable results. It is essentially a
reformulation of a first order edge detection process and presupposes that the corner information
lies within the threshold data (and uses no corner structure in detection). One of the major
difficulties with this approach is that measurements of angle can be severely affected by
quantization error and accuracy is limited (Bennett and MacDonald, 1975), a factor which will
return to plague us later when we study methods for describing shapes.

4.8.1.3 Measuring curvature by changes in intensity (differentiation)


As an alternative way of measuring curvature, we can derive the curvature as a function of
changes in image intensity. This derivation can be based on the measure of angular changes
in the discrete image. We can represent the direction at each image point as the function
$\varphi(x,y)$. Thus, according to the definition of curvature, we should compute the change in these
direction values normal to the image edge (i.e. along the curves in an image). The curve at
an edge can be locally approximated by the points given by the parametric line defined by
$x(t) = x + t\cos\varphi(x,y)$ and $y(t) = y + t\sin\varphi(x,y)$. Thus, the curvature is given by the
change in the function $\varphi(x,y)$ with respect to $t$. That is,

$\kappa_{\varphi}(x,y) = \dfrac{\partial\varphi(x,y)}{\partial t} = \dfrac{\partial\varphi(x,y)}{\partial x}\dfrac{\partial x(t)}{\partial t} + \dfrac{\partial\varphi(x,y)}{\partial y}\dfrac{\partial y(t)}{\partial t}$    (4.52)

where $\partial x(t)/\partial t = \cos\varphi$ and $\partial y(t)/\partial t = \sin\varphi$. By considering the definition of the
gradient angle, we have that the tangent direction at a point in a line is given by
$\varphi(x,y) = \tan^{-1}\left(Mx/(-My)\right)$. From this geometry we can observe that

$\cos\varphi(x,y) = -My\Big/\sqrt{Mx^2 + My^2} \quad\text{and}\quad \sin\varphi(x,y) = Mx\Big/\sqrt{Mx^2 + My^2}$    (4.53)

By differentiation of $\varphi(x,y)$ and by considering these definitions, we obtain

$\kappa_{\varphi}(x,y) = \dfrac{1}{\left(Mx^2 + My^2\right)^{3/2}}\left(My^2\dfrac{\partial Mx}{\partial x} - MxMy\dfrac{\partial My}{\partial x} + Mx^2\dfrac{\partial My}{\partial y} - MxMy\dfrac{\partial Mx}{\partial y}\right)$    (4.54)

This defines a forward measure of curvature along the edge direction. We can use an alternative
direction to measure curvature: we can differentiate backwards, in the direction of $-\varphi(x,y)$,
giving $\kappa_{-\varphi}(x,y)$. In this case we consider that the curve is given by $x(t) = x + t\cos(-\varphi(x,y))$
and $y(t) = y + t\sin(-\varphi(x,y))$. Thus,

$\kappa_{-\varphi}(x,y) = \dfrac{1}{\left(Mx^2 + My^2\right)^{3/2}}\left(My^2\dfrac{\partial Mx}{\partial x} - MxMy\dfrac{\partial My}{\partial x} - Mx^2\dfrac{\partial My}{\partial y} + MxMy\dfrac{\partial Mx}{\partial y}\right)$    (4.55)

Two further measures can be obtained by considering the forward and a backward differential
along the normal. These differentials cannot be related to the actual definition of curvature, but
can be explained intuitively. If we consider that curves are more than one pixel wide, differenti-
ation along the edge will measure the difference in gradient angle between the interior and
exterior borders of a wide curve. In theory, the tangent angle should be the same. However, in dis-
crete images there is a change due to the measures in a window. If the curve is a straight line, then
the interior and exterior borders are the same. Thus, gradient direction normal to the edge does not
change locally. As we bend a straight line, we increase the difference between the curves defining
the interior and exterior borders. Thus, we expect the measure of gradient direction to change.
That is, if we differentiate along the normal direction, we maximize detection of gross curvature.
The value $\kappa_{\perp\varphi}(x,y)$ is obtained when $x(t) = x + t\sin\varphi(x,y)$ and $y(t) = y + t\cos\varphi(x,y)$.
In this case,

$\kappa_{\perp\varphi}(x,y) = \dfrac{1}{\left(Mx^2 + My^2\right)^{3/2}}\left(Mx^2\dfrac{\partial My}{\partial x} - MxMy\dfrac{\partial Mx}{\partial x} - MxMy\dfrac{\partial My}{\partial y} + My^2\dfrac{\partial Mx}{\partial y}\right)$    (4.56)

In a backward formulation along a normal direction to the edge, we obtain

$\kappa_{-\perp\varphi}(x,y) = \dfrac{1}{\left(Mx^2 + My^2\right)^{3/2}}\left(-Mx^2\dfrac{\partial My}{\partial x} + MxMy\dfrac{\partial Mx}{\partial x} - MxMy\dfrac{\partial My}{\partial y} + My^2\dfrac{\partial Mx}{\partial y}\right)$    (4.57)

This was originally used by Kass et al. (1988) as a means to detect line terminations, as
part of a feature extraction scheme called snakes (active contours), which are covered in
Chapter 6. Code 4.17 shows an implementation of the four measures of curvature. The function
Gradient is used to obtain the gradient of the image and to obtain its derivatives. The output
image is obtained by applying the function according to the selection of parameter op.
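
The listing below is a minimal sketch of the forward measure $\kappa_{\varphi}$ of Equation 4.54 (not the book's Code 4.17; the function name and the Gaussian pre-smoothing parameter are our own choices). The other three measures of Equations 4.55–4.57 follow by changing the signs of the corresponding terms.

import numpy as np
from scipy.ndimage import gaussian_filter

def curvature_kappa_phi(image, sigma=1.0):
    # Forward curvature measure of Equation 4.54 from image intensity.
    # Gaussian pre-smoothing tames the noise amplified by the second-order
    # differentiation of the image.
    P = gaussian_filter(image.astype(float), sigma)
    My, Mx = np.gradient(P)                  # rows vary with y, columns with x
    dMx_dy, dMx_dx = np.gradient(Mx)
    dMy_dy, dMy_dx = np.gradient(My)
    num = (My ** 2 * dMx_dx - Mx * My * dMy_dx
           + Mx ** 2 * dMy_dy - Mx * My * dMx_dy)
    den = (Mx ** 2 + My ** 2) ** 1.5
    out = np.zeros_like(num)
    np.divide(num, den, out=out, where=den > 1e-12)   # avoid division by zero
    return out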
Let us see how the four functions for estimating curvature from image intensity perform for
the image given in Figure 4.37(a). In general, points where the curvature is large are highlighted
by each function. Different measures of curvature (Figure 4.38) highlight differing points on the
feature boundary. All measures appear to offer better performance than that derived by refor-
mulating hysteresis thresholding (Figure 4.37b), although there is little discernible performance
advantage between the directions of differentiation. As the results in Figure 4.38 suggest,
detecting curvature directly from an image is not a totally reliable way of determining curvature,
and hence corner information. This is in part due to the higher order of the differentiation
process. (Also, scale has not been included within the analysis.)

Figure 4.38 Comparing image curvature detection operators: (a) $\kappa_{\varphi}$; (b) $\kappa_{-\varphi}$; (c) $\kappa_{\perp\varphi}$; (d) $\kappa_{-\perp\varphi}$

4.8.1.4 Moravec and Harris detectors


In the previous section, we measured curvature as the derivative of the function $\varphi(x,y)$ along
a particular direction. Alternatively, a measure of curvature can be obtained by considering
changes along a particular direction in the image P itself. This is the basic idea of Moravec's
corner detection operator. This operator computes the average change in image intensity when
a window is shifted in several directions. That is, for a pixel with coordinates $(x,y)$, and a
window of size $2w+1$, we have:

$E_{u,v}(x,y) = \displaystyle\sum_{i=-w}^{w}\sum_{j=-w}^{w}\left(P_{x+i,y+j} - P_{x+i+u,y+j+v}\right)^2$    (4.58)

This equation approximates the autocorrelation function in the direction (u, v). A measure of
curvature is given by the minimum value of Euv x y obtained by considering the shifts (u, v)
in the four main directions. That is, by (1,0), (0,−1), (0,1) and (−1,0). The minimum is chosen
because it agrees with the following two observations. First, if the pixel is in an edge defining a
straight line, Euv x y is small for a shift along the edge and large for a shift perpendicular to
the edge. In this case, we should choose the small value since the curvature of the edge is small.
Secondly, if the edge defines a corner, then all the shifts produce a large value. Thus, if we again
choose the minimum, this value indicates high curvature. The main problem with this approach
is that it considers only a small set of possible shifts. This problem is solved in the Harris
corner detector (Harris and Stephens, 1988) by defining an analytic expression for the autocor-
relation. This expression can be obtained by considering the local approximation of intensity
changes.
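
Before moving to that analytic form, a direct (and deliberately unoptimized) sketch of the Moravec measure in Equation 4.58 might look as follows; the function name and window handling are our own, not from this text.

import numpy as np

def moravec(image, w=1):
    # Minimum of E_{u,v} (Equation 4.58) over the four principal shifts.
    P = image.astype(float)
    rows, cols = P.shape
    response = np.zeros_like(P)
    shifts = [(1, 0), (0, -1), (0, 1), (-1, 0)]
    for x in range(w + 1, rows - w - 1):
        for y in range(w + 1, cols - w - 1):
            window = P[x - w:x + w + 1, y - w:y + w + 1]
            E = [np.sum((window - P[x - w + u:x + w + 1 + u,
                                    y - w + v:y + w + 1 + v]) ** 2)
                 for u, v in shifts]
            response[x, y] = min(E)
    return response

Corners are then taken as local maxima of this response above a threshold.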
We can consider that the points $P_{x+i,y+j}$ and $P_{x+i+u,y+j+v}$ define a vector $(u,v)$ in the
image. Thus, in a similar fashion to the development given in Equation 4.58, the increment
in the image function between the points can be approximated by the directional derivative
$u\,\partial P_{x+i,y+j}/\partial x + v\,\partial P_{x+i,y+j}/\partial y$. Thus, the intensity at $P_{x+i+u,y+j+v}$ can be approximated as

$P_{x+i+u,y+j+v} = P_{x+i,y+j} + \dfrac{\partial P_{x+i,y+j}}{\partial x}u + \dfrac{\partial P_{x+i,y+j}}{\partial y}v$    (4.59)

This expression corresponds to the first three terms of the Taylor expansion around $P_{x+i,y+j}$ (an
expansion to first order). If we consider this approximation in Equation 4.58, we have:

$E_{u,v}(x,y) = \displaystyle\sum_{i=-w}^{w}\sum_{j=-w}^{w}\left(\dfrac{\partial P_{x+i,y+j}}{\partial x}u + \dfrac{\partial P_{x+i,y+j}}{\partial y}v\right)^2$    (4.60)

By expansion of the squared term (and since $u$ and $v$ are independent of the summations), we
obtain:

$E_{u,v}(x,y) = A(x,y)u^2 + 2C(x,y)uv + B(x,y)v^2$    (4.61)

where

$A(x,y) = \displaystyle\sum_{i=-w}^{w}\sum_{j=-w}^{w}\left(\dfrac{\partial P_{x+i,y+j}}{\partial x}\right)^2 \qquad B(x,y) = \displaystyle\sum_{i=-w}^{w}\sum_{j=-w}^{w}\left(\dfrac{\partial P_{x+i,y+j}}{\partial y}\right)^2$

$C(x,y) = \displaystyle\sum_{i=-w}^{w}\sum_{j=-w}^{w}\dfrac{\partial P_{x+i,y+j}}{\partial x}\dfrac{\partial P_{x+i,y+j}}{\partial y}$    (4.62)

That is, the summation of the squared components of the gradient for all the pixels
in the window. In practice, this average can be weighted by a Gaussian function to make the
measure less sensitive to noise (i.e. by filtering the image data). To measure the curvature
at a point (x, y), it is necessary to find the vector (u, v) that minimizes Euv x y given in
Equation 4.61. In a basic approach, we can recall that the minimum is obtained when the
window is displaced in the direction of the edge. Thus, we can consider that $u = \cos\varphi(x,y)$
and $v = \sin\varphi(x,y)$. These values were defined in Equation 4.53. Accordingly, the minimum
value that defines curvature is given by

$\kappa_{u,v}(x,y) = \min E_{u,v}(x,y) = \dfrac{A(x,y)\,My^2 + 2C(x,y)\,Mx\,My + B(x,y)\,Mx^2}{Mx^2 + My^2}$    (4.63)

In a more sophisticated approach, we can consider the form of the function $E_{u,v}(x,y)$. We can
observe that this is a quadratic function, so it has two principal axes. We can rotate the function
such that its axes have the same direction as the axes of the coordinate system. That is, we
rotate the function $E_{u,v}(x,y)$ to obtain

$F_{u,v}(x,y) = \alpha(x,y)u^2 + \beta(x,y)v^2$    (4.64)

The values of $\alpha$ and $\beta$ are proportional to the autocorrelation function along the principal axes.
Accordingly, if the point $(x,y)$ is in a region of constant intensity, both values are small. If the
point defines a straight border in the image, then one value is large and the other is small. If the
point defines an edge with high curvature, both values are large. Based on these observations, a
measure of curvature is defined as

$\kappa_k(x,y) = \alpha\beta - k(\alpha + \beta)^2$    (4.65)

The first term in this equation makes the measure large when the values of $\alpha$ and $\beta$ increase.
The second term is included to decrease the values in flat borders. The parameter $k$ must be
selected to control the sensitivity of the detector. The higher the value, the more sensitive the
computed curvature will be to changes in the image (and therefore to noise).
In practice, to compute $\kappa_k(x,y)$ it is not necessary to compute explicitly the values of
$\alpha$ and $\beta$; the curvature can be measured from the coefficients of the quadratic expression
in Equation 4.61. This can be derived by considering the matrix forms of Equations 4.61
and 4.64. If we define the vector $D^T = [u\;\;v]$, then Equations 4.61 and 4.64 can be
written as

$E_{u,v}(x,y) = D^T M D \quad\text{and}\quad F_{u,v}(x,y) = D^T Q D$    (4.66)

where $^T$ denotes the transpose and where

$M = \begin{bmatrix} A(x,y) & C(x,y) \\ C(x,y) & B(x,y) \end{bmatrix} \quad\text{and}\quad Q = \begin{bmatrix} \alpha & 0 \\ 0 & \beta \end{bmatrix}$    (4.67)

To relate Equations 4.61 and 4.64, we consider that $F_{u,v}(x,y)$ is obtained by rotating $E_{u,v}(x,y)$
by a transformation $R$ that rotates the axes defined by $D$. That is,

$F_{u,v}(x,y) = (RD)^T M (RD)$    (4.68)

This can be arranged as

$F_{u,v}(x,y) = D^T R^T M R D$    (4.69)

By comparison with Equation 4.66, we have:

$Q = R^T M R$    (4.70)

This is a well-known equation of linear algebra and it means that $Q$ is an orthogonal
decomposition of $M$. The diagonal elements of $Q$ are called the eigenvalues. We can use
Equation 4.70 to obtain the value of $\alpha\beta$, which defines the first term in Equation 4.65, by
considering the determinants of the matrices. That is, $\det(Q) = \det(R^T)\det(M)\det(R)$. Since
$R$ is a rotation matrix, $\det(R^T)\det(R) = 1$, thus

$\alpha\beta = A(x,y)B(x,y) - C(x,y)^2$    (4.71)

which defines the first term in Equation 4.65. We can also use Equation 4.70 to obtain the value
of $\alpha + \beta$, which defines the second term in Equation 4.65, by taking the trace of the matrices
on each side of the equation. Thus, we have:

$\alpha + \beta = A(x,y) + B(x,y)$    (4.72)

By substituting Equations 4.71 and 4.72 into Equation 4.65, we obtain the curvature measure

$\kappa_k(x,y) = A(x,y)B(x,y) - C(x,y)^2 - k\left(A(x,y) + B(x,y)\right)^2$    (4.73)
Figure 4.39 Curvature via the Harris operator: (a) $\kappa_{u,v}(x,y)$; (b) $\kappa_k(x,y)$

The measure $\kappa_k(x,y)$ (Figure 4.39b) produces more contrast between lines with low and high
curvature than $\kappa_{u,v}(x,y)$. The reason is the inclusion of the second term in Equation 4.73. In
general, the measure of correlation is not only useful to compute curvature; this technique has
much wider application in finding points for matching pairs of images.
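
A compact sketch of the resulting detector is given below: A, B and C are Gaussian-weighted sums of the squared gradient components (Equation 4.62), and the response is that of Equation 4.73. The function name, the smoothing scale and the value of k are illustrative choices, not prescriptions from this text.

import numpy as np
from scipy.ndimage import gaussian_filter

def harris_response(image, sigma=1.5, k=0.04):
    # Harris corner measure of Equation 4.73: AB - C^2 - k(A + B)^2.
    P = image.astype(float)
    My, Mx = np.gradient(P)                  # rows vary with y, columns with x
    A = gaussian_filter(Mx * Mx, sigma)      # weighted sum of (dP/dx)^2
    B = gaussian_filter(My * My, sigma)      # weighted sum of (dP/dy)^2
    C = gaussian_filter(Mx * My, sigma)      # weighted sum of (dP/dx)(dP/dy)
    return A * B - C * C - k * (A + B) ** 2

Corner points can then be selected as local maxima of the response above a threshold.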

4.8.2 Modern approaches: region/patch analysis


4.8.2.1 Scale invariant feature transform
The scale invariant feature transform (SIFT) (Lowe, 1999, 2004) aims to resolve many of the
practical problems in low-level feature extraction and their use in matching images. The earlier
Harris operator is sensitive to changes in image scale and as such is unsuited to matching images
of differing size. SIFT involves two stages: feature extraction and description. The description
stage concerns use of the low-level features in object matching, and this will be considered
later. Low-level feature extraction within the SIFT approach selects salient features in a manner
invariant to image scale (feature size) and rotation, and with partial invariance to change in
illumination. Further, the formulation reduces the probability of poor extraction due to occlusion
clutter and noise. It also shows how many of the techniques considered previously can be
combined and capitalized on, to good effect.
First, the difference of Gaussians operator is applied to an image to identify features of
potential interest. The formulation aims to ensure that feature selection does not depend on
feature size (scale) or orientation. The features are then analysed to determine location and
scale before the orientation is determined by local gradient direction. Finally, the features are
transformed into a representation that can handle variation in illumination and local shape
distortion. Essentially, the operator uses local information to refine the information delivered
by standard operators. The detail of the operations is best left to the source material (Lowe,
1999, 2004), for it is beyond the level or purpose here. As such, we shall concentrate on
principle only.
The features detected for the Lena image are illustrated in Figure 4.40. Here, the major
features detected are shown by white lines, where the length reflects magnitude and the direction
reflects the feature’s orientation. These are the major features, which include the rim of the
hat, face features and the boa. The minor features are the smaller white lines: the ones shown
here are concentrated around a background feature. In the full set of features detected at all
scales in this image, there are many more of the minor features, concentrated particularly in the
textured regions of the image (Figure 4.43). Later, we shall see how this can be used within
shape extraction, but the purpose here is to show the basic low-level features extracted by this new
technique.

Figure 4.40 Detecting features with the SIFT operator: (a) original image; (b) output points with magnitude and direction


In the first stage, the difference of Gaussians for an image P is computed in the manner of
Equation 4.28 as

$D(x,y,\sigma) = \left(g(x,y,k\sigma) - g(x,y,\sigma)\right) * P = L(x,y,k\sigma) - L(x,y,\sigma)$    (4.74)
The function L is a scale-space function which can be used to define smoothed images at
different scales. Note again the influence of scale-space in the more modern techniques. Rather
than any difficulty in locating zero-crossing points, the features are the maxima and minima of
the function. Candidate keypoints are then determined by comparing each point in the function
with its immediate neighbours. The process then proceeds to analysis between the levels of
scale, given appropriate sampling of the scale-space. This then implies comparing a point with
its eight neighbours at that scale and with the nine neighbours in each of the adjacent scales,
to determine whether it is a minimum or a maximum, as well as image resampling to ensure
comparison between the different scales.
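
A minimal sketch of this first stage is given below: a small stack of Gaussian-smoothed images is built, adjacent levels are differenced as in Equation 4.74, and candidate keypoints are taken as extrema of their 3 × 3 × 3 neighbourhood in position and scale. It ignores the octave resampling and careful scale-space sampling used by the full method, and the parameter values and function name are our own assumptions.

import numpy as np
from scipy.ndimage import gaussian_filter, maximum_filter, minimum_filter

def dog_candidate_keypoints(image, sigma=1.6, k=2 ** 0.5, levels=5):
    # Difference of Gaussians (Equation 4.74) at successive scales, then
    # candidate keypoints as 3x3x3 extrema over position and scale.
    P = image.astype(float)
    L = np.stack([gaussian_filter(P, sigma * k ** i) for i in range(levels)])
    D = L[1:] - L[:-1]
    is_max = D == maximum_filter(D, size=3)
    is_min = D == minimum_filter(D, size=3)
    extrema = (is_max | is_min)[1:-1]        # interior scales only
    scale, rows, cols = np.nonzero(extrema)
    return list(zip(scale + 1, rows, cols))

The low-contrast and poorly localized candidates would then be removed, as described next.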
To filter the candidate points to reject those which are the result of low local contrast (low
edge strength) or which are poorly localized along an edge, a function is derived by local
curve fitting, which indicates local edge strength and stability as well as location. Uniform
thresholding then removes the keypoints with low contrast. Those that have poor localization,
i.e. their position is likely to be influenced by noise, can be filtered by considering the ratio of
curvature along an edge to that perpendicular to it, in a manner following the Harris operator in
Section 4.8.1.4, by thresholding the ratio of Equations 4.71 and 4.72.
To characterize the filtered keypoint features at each scale, the gradient magnitude is calculated
in exactly the manner of Equations 4.12 and 4.13 as

$M_{SIFT}(x,y) = \sqrt{\left(L(x+1,y) - L(x-1,y)\right)^2 + \left(L(x,y+1) - L(x,y-1)\right)^2}$    (4.75)

$\theta_{SIFT}(x,y) = \tan^{-1}\left(\dfrac{L(x,y+1) - L(x,y-1)}{L(x+1,y) - L(x-1,y)}\right)$    (4.76)

The peak of the histogram of the orientations around a keypoint is then selected as the local
direction of the feature. This can be used to derive a canonical orientation, so that the resulting
descriptors are invariant with rotation. As such, this contributes to the process which aims to
reduce sensitivity to camera viewpoint and to non-linear change in image brightness (linear
changes are removed by the gradient operations) by analysing regions in the locality of the
selected viewpoint. The main description (Lowe, 2004) considers the technique’s basis in much
greater detail, and outlines factors important to its performance, such as the need for sampling
and performance in noise.
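
As an illustration of this step only, the sketch below histograms the gradient orientations of Equation 4.76, weighted by the magnitudes of Equation 4.75, in a square window around a keypoint and returns the peak bin; the window radius and number of bins are assumptions, and the full method uses Gaussian weighting and peak interpolation that are omitted here.

import numpy as np

def keypoint_orientation(L, row, col, radius=8, bins=36):
    # Dominant gradient orientation around a keypoint; assumes the keypoint
    # lies at least radius + 1 pixels from the image border.
    patch = L[row - radius - 1:row + radius + 2, col - radius - 1:col + radius + 2]
    dx = patch[1:-1, 2:] - patch[1:-1, :-2]      # L(x+1, y) - L(x-1, y)
    dy = patch[2:, 1:-1] - patch[:-2, 1:-1]      # L(x, y+1) - L(x, y-1)
    magnitude = np.hypot(dx, dy)                 # Equation 4.75
    orientation = np.arctan2(dy, dx)             # Equation 4.76, in (-pi, pi]
    hist, edges = np.histogram(orientation, bins=bins,
                               range=(-np.pi, np.pi), weights=magnitude)
    peak = int(hist.argmax())
    return 0.5 * (edges[peak] + edges[peak + 1])  # centre of the peak bin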
As shown in Figure 4.41, the technique can certainly operate well, and scale is illustrated by
applying the operator to the original image and to one at half the resolution. In all, 601 keypoints
are determined in the original resolution image and 320 keypoints at half the resolution. By
inspection, the major features are retained across scales (a lot of minor regions in the leaves
disappear at lower resolution), as expected. Alternatively, the features can be filtered further
by magnitude, or even direction (if appropriate). If you want more than results to convince
you, implementations are available for Windows and Linux (http://www.cs.ubc.ca/spider/lowe/
research.html): a feast for a developer. These images were derived using siftWin32, version 4.
Figure 4.41 SIFT feature detection at different scales: (a) original image; (b) keypoints at full resolution; (c) keypoints at half resolution

4.8.2.2 Saliency
The new saliency operator (Kadir and Brady, 2001) was also motivated by the need to extract
robust and relevant features. In the approach, regions are considered salient if they are simultane-
ously unpredictable both in some feature space and in scale-space. Unpredictability (rarity) is determined
in a statistical sense, generating a space of saliency values over position and scale, as a basis
for later understanding. The technique aims to be a generic approach to scale and saliency,
compared with conventional methods, because both are defined independently of a particular basis
morphology, meaning that the operator is not based on a particular geometric feature such as a blob, edge or
corner. The technique operates by determining the entropy (a measure of rarity) within patches
at scales of interest, and the saliency is a weighted summation of where the entropy peaks. The
new method has practical capability in that it can be made invariant to rotation, translation,
non-uniform scaling and uniform intensity variations, and robust to small changes in viewpoint.
An example result of processing the image in Figure 4.42(a) is shown in Figure 4.42(b), where the
200 most salient points are shown circled, and the radius of the circle is indicative of the scale.
Many of the points are around the walking subject and others highlight significant features in the
background, such as the waste bins, the tree or the time index.

Figure 4.42 Detecting features by saliency: (a) original image; (b) top 200 saliency matches circled

An example use of saliency was within an approach to learn and recognize object class models
(such as faces, cars or animals) from unlabelled and unsegmented cluttered scenes, irrespective
of their overall size (Fergus
et al., 2003). For further study and application, descriptions and Matlab binaries are available
from Kadir’s website (http://www.robots.ox.ac.uk/∼timork/).

4.8.2.3 Other techniques and performance issues


There has been a recent comprehensive performance review (Mikolajczyk and Schmid, 2005),
comparing established and new patch-based operators. The techniques that were compared
included SIFT, differential derivatives by differentiation, cross-correlation for matching, and
a gradient location and orientation-based histogram (an extension to SIFT, which performed
well); the saliency approach was not included. The criterion used for evaluation concerned the
number of correct matches, and the number of false matches, between feature points selected by
the techniques. The matching process was between an original image and one of the same scene
when subject to one of six image transformations. The image transformations covered practical
effects that can change image appearance, and were: rotation, scale change, viewpoint change,
image blur, JPEG compression, and illumination. For some of these there were two scene types
available, which allowed for separation of understanding of scene type and transformation. The
study observed that, within its analysis, ‘the SIFT-based descriptors perform best’, but this is
a complex topic and selection of technique is often application dependent. Note that there is
further interest in performance evaluation, and in invariance to higher order changes in viewing
geometry, such as invariance to affine and projective transformation.
