4.01 08 2022 - FeatureDescriptors

This document discusses local feature detection and description. It covers: 1) detection of interest points using techniques like Harris corner detection and the Laplacian of Gaussian; 2) description of image patches around detected points using descriptors like SIFT that are invariant to transformations; 3) matching descriptors between images using distance measures such as Euclidean distance, refined with the ratio test, to find correspondences. Local feature detection and description allows matching of image regions that is robust to changes in viewpoint, illumination, and other transformations.


Local features: main components

1) Detection: Identify the interest points.

2) Description: Extract a vector feature descriptor surrounding each interest point: $\mathbf{x}_1 = [x_1^{(1)}, \ldots, x_d^{(1)}]$

3) Matching: Determine correspondence between descriptors in two views: $\mathbf{x}_2 = [x_1^{(2)}, \ldots, x_d^{(2)}]$
Image transformations
• Geometric
  – Rotation
  – Scale
• Photometric
  – Intensity change
Invariance and equivariance
• We want corner locations to be invariant to photometric transformations and equivariant to geometric transformations
  – Invariance: the image is transformed and corner locations do not change
  – Equivariance: if we have two transformed versions of the same image, features should be detected in corresponding locations
  – (Sometimes "invariant" and "equivariant" are both referred to as "invariant")
  – (Sometimes "equivariant" is called "covariant")
Harris detector: Invariance properties – Image translation
• Derivatives and the window function are equivariant
• ⇒ Corner location is equivariant w.r.t. translation
Harris detector: Invariance properties – Image rotation
• The second moment ellipse rotates, but its shape (i.e. its eigenvalues) remains the same
• ⇒ Corner location is equivariant w.r.t. image rotation
Harris detector: Invariance properties – Affine intensity change: $I \rightarrow aI + b$
• Only derivatives are used ⇒ invariance to the intensity shift $I \rightarrow I + b$
• Intensity scaling $I \rightarrow aI$ rescales the response $R$, so corners near a fixed threshold can appear or disappear

[Figure: response R vs. x (image coordinate) before and after intensity scaling, with a fixed threshold line]

⇒ Partially invariant to affine intensity change
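A short worked note on why the invariance is only partial (this derivation is mine, following the standard Harris definitions): under $I \rightarrow aI + b$ the derivatives become $I_x \rightarrow aI_x$ and $I_y \rightarrow aI_y$, so the offset $b$ vanishes. The second moment matrix $M$, built from products of derivatives, scales as $M \rightarrow a^2 M$; its eigenvalues scale as $\lambda_i \rightarrow a^2 \lambda_i$, and the response $R = \lambda_1 \lambda_2 - k(\lambda_1 + \lambda_2)^2$ scales as $R \rightarrow a^4 R$. Local maxima of $R$ stay in the same places, so corner locations are preserved, but a fixed threshold on $R$ is not, which is exactly the partial invariance illustrated above.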


Harris Detector: Invariance Properties – Scaling

[Figure: a corner at the original scale; the same corner enlarged, where each small window sees a nearly straight edge]

• When the image is scaled up, the corner is spread across many windows, and all points are classified as edges

⇒ Neither invariant nor equivariant to scaling
Scale invariant detection
Suppose you're looking for corners.

• Key idea: find the scale that gives a local maximum of f
  – in both position and scale
  – One definition of f: the Harris operator

Lindeberg et al., 1996
Slide from Tinne Tuytelaars
Gaussian pyramid

Image by cmglee, CC BY-SA 3.0


Implementation
• Instead of computing f for larger and larger windows, we can implement using a fixed window size with a Gaussian pyramid
• (sometimes we need to create in-between levels, e.g. a ¾-size image)
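A minimal sketch of the pyramid construction, assuming NumPy/SciPy (the function name and the choice of sigma are illustrative, not from the slides):

    import numpy as np
    from scipy import ndimage

    def gaussian_pyramid(image, num_levels=5, sigma=1.0):
        # Blur with a Gaussian, then downsample by 2, repeatedly
        levels = [np.asarray(image, dtype=np.float64)]
        for _ in range(num_levels - 1):
            blurred = ndimage.gaussian_filter(levels[-1], sigma)
            levels.append(blurred[::2, ::2])  # keep every other row and column
        return levels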
Another common definition of f
• The Laplacian of Gaussian (LoG):

  $\nabla^2 g = \frac{\partial^2 g}{\partial x^2} + \frac{\partial^2 g}{\partial y^2}$

  (very similar to a Difference of Gaussians (DoG), i.e. a Gaussian minus a slightly smaller Gaussian)
Laplacian of Gaussian
• "Blob" detector

[Figure: image convolved with the LoG kernel; blobs show up as maxima and minima of the response]

• Find maxima and minima of the LoG operator in space and scale
Scale selection
• At what scale does the Laplacian achieve a maximum response for a binary circle of radius r?

[Figure: binary circle image and its Laplacian response]
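For reference, the standard answer from Lindeberg's scale-selection analysis: the scale-normalized Laplacian response to a binary circle of radius $r$ peaks at $\sigma = r/\sqrt{2}$, so the detected scale tracks the size of the structure.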
Characteristic scale
• We define the characteristic scale as the scale that produces the peak of the Laplacian response

[Figure: Laplacian response vs. scale, with the peak marked at the characteristic scale]

T. Lindeberg (1998). "Feature detection with automatic scale selection." International Journal of Computer Vision 30(2): 77-116.
Find local maxima in 3D position-scale space
• Compute $L_{xx}(\sigma) + L_{yy}(\sigma)$ at each level of a scale pyramid (e.g. σ = 2, 3, 4, 5) and find points that are local maxima in both position and scale ⇒ list of (x, y, s)

K. Grauman, B. Leibe
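A hedged sketch of this search, reusing log_response (and the imports) from the earlier snippet: build a response stack over a few scales and keep points that are maxima in a 3x3x3 neighborhood of position and scale (the threshold value is illustrative):

    def find_scale_space_extrema(image, sigmas, threshold=0.01):
        # Stack of |LoG| responses, one slice per scale
        stack = np.stack([np.abs(log_response(image, s)) for s in sigmas])
        # A point survives if it equals the max over its 3x3x3 neighborhood
        local_max = ndimage.maximum_filter(stack, size=3)
        peaks = (stack == local_max) & (stack > threshold)
        scale_idx, ys, xs = np.nonzero(peaks)
        return [(x, y, sigmas[i]) for i, y, x in zip(scale_idx, ys, xs)]  # list of (x, y, s)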
Note: The LoG and DoG operators are both rotation equivariant (covariant)
Local features: main components

1) Detection: Identify the interest points.

2) Description: Extract a vector feature descriptor surrounding each interest point: $\mathbf{x}_1 = [x_1^{(1)}, \ldots, x_d^{(1)}]$

3) Matching: Determine correspondence between descriptors in two views: $\mathbf{x}_2 = [x_1^{(2)}, \ldots, x_d^{(2)}]$

Kristen Grauman
Feature descriptors
We know how to detect good points. Next question: how to match them?

[Figure: corresponding points in two views, joined by a "?"]

Answer: come up with a descriptor for each point, then find similar descriptors between the two images
Feature descriptors
We know how to detect good points. Next question: how to match them?

Lots of possibilities:
– Simple option: match square windows around the point
– State-of-the-art approach: SIFT
  • David Lowe, UBC http://www.cs.ubc.ca/~lowe/keypoints/
Invariance vs. discriminability
• Invariance:
  – The descriptor shouldn't change even if the image is transformed
• Discriminability:
  – The descriptor should be highly unique for each point
Rotation invariance for feature descriptors
• Find the dominant orientation of the image patch
  – E.g., given by $\mathbf{x}_{\max}$, the eigenvector of $H$ corresponding to $\lambda_{\max}$ (the larger eigenvalue)
  – Or simply the orientation of the (smoothed) gradient
  – Rotate the patch according to this angle (a sketch follows below)

Figure by Matthew Brown
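A minimal sketch of the smoothed-gradient option (patch is a 2D float array; the sigma value and the use of scipy.ndimage.rotate are my assumptions):

    import numpy as np
    from scipy import ndimage

    def dominant_orientation(patch, sigma=4.5):
        # Gradient of the Gaussian-smoothed patch, evaluated at the center
        gx = ndimage.gaussian_filter(patch, sigma, order=(0, 1))  # d/dx
        gy = ndimage.gaussian_filter(patch, sigma, order=(1, 0))  # d/dy
        cy, cx = patch.shape[0] // 2, patch.shape[1] // 2
        return np.arctan2(gy[cy, cx], gx[cy, cx])

    def rotate_to_canonical(patch):
        # Rotate so the dominant gradient direction maps to a fixed angle
        angle = dominant_orientation(patch)
        return ndimage.rotate(patch, np.degrees(angle), reshape=False, mode='nearest')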


Multiscale Oriented PatcheS descriptor
• Take a 40x40 square window around the detected feature
  – Scale to 1/5 size (using prefiltering)
  – Rotate to horizontal
  – Sample an 8x8 square window centered at the feature
  – Intensity-normalize the window by subtracting the mean and dividing by the standard deviation in the window

Adapted from slide by Matthew Brown
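A hedged sketch of these steps (assumes the 40x40 window has already been rotated to the dominant orientation; the prefilter sigma is my choice):

    import numpy as np
    from scipy import ndimage

    def mops_descriptor(window40):
        # Prefilter before downsampling to avoid aliasing
        blurred = ndimage.gaussian_filter(np.asarray(window40, float), sigma=2.0)
        patch8 = blurred[2::5, 2::5]           # sample an 8x8 grid (1/5 size)
        patch8 = patch8 - patch8.mean()        # subtract the mean
        return patch8 / (patch8.std() + 1e-8)  # divide by the standard deviation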


Detections at multiple scales
Scale Invariant Feature Transform
Basic idea:
• Take a 16x16 square window around the detected feature
• Compute the edge orientation (angle of the gradient minus 90°) for each pixel
• Throw out weak edges (threshold the gradient magnitude)
• Create a histogram of the surviving edge orientations

[Figure: angle histogram over 0 to 2π]

Adapted from slide by David Lowe


SIFT descriptor
Full version:
• Divide the 16x16 window into a 4x4 grid of cells (the 2x2 case is shown in the figure)
• Compute an orientation histogram for each cell
• 16 cells x 8 orientations = 128-dimensional descriptor (a simplified sketch follows below)

Adapted from slide by David Lowe
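A simplified sketch of this structure (no Gaussian weighting, trilinear binning, or the clip-and-renormalize step of full SIFT; the magnitude threshold is illustrative and assumes intensities roughly in [0, 1]):

    import numpy as np

    def sift_like_descriptor(window16, num_bins=8, mag_threshold=0.05):
        # window16: 16x16 patch; result: 4x4 cells x 8 orientations = 128-d
        gy, gx = np.gradient(np.asarray(window16, float))
        mag = np.hypot(gx, gy)
        ang = np.arctan2(gy, gx) % (2 * np.pi)   # orientations in [0, 2*pi)
        mag[mag < mag_threshold] = 0             # throw out weak edges
        cells = []
        for i in range(0, 16, 4):
            for j in range(0, 16, 4):            # one histogram per 4x4 cell
                hist, _ = np.histogram(ang[i:i+4, j:j+4], bins=num_bins,
                                       range=(0, 2 * np.pi),
                                       weights=mag[i:i+4, j:j+4])
                cells.append(hist)
        desc = np.concatenate(cells)
        return desc / (np.linalg.norm(desc) + 1e-8)  # unit-normalize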


Properties of SIFT
Extraordinarily robust matching technique:
– Can handle changes in viewpoint
  • Up to about 60 degrees of out-of-plane rotation
– Can handle significant changes in illumination
  • Sometimes even day vs. night (shown in the figure)
– Fast and efficient; can run in real time
– Lots of code available
  • http://people.csail.mit.edu/albert/ladypack/wiki/index.php/Known_implementations_of_SIFT
Feature matching
Given a feature in I1, how do we find the best match in I2?
1. Define a distance function that compares two descriptors
2. Test all the features in I2 and find the one with the minimum distance (a brute-force sketch follows below)
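A minimal brute-force sketch of these two steps with L2 distance (descs1 and descs2 are N1 x D and N2 x D arrays; the function name is mine):

    import numpy as np

    def match_features(descs1, descs2):
        # Pairwise L2 distances, shape (N1, N2)
        d = np.linalg.norm(descs1[:, None, :] - descs2[None, :, :], axis=2)
        best = np.argmin(d, axis=1)                   # best match in I2 per feature in I1
        return best, d[np.arange(len(descs1)), best]  # indices and their distances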
Feature distance
How do we define the difference between two features f1, f2?
– Simple approach: L2 distance, $\|f_1 - f_2\|$
– This can give small distances for ambiguous (incorrect) matches

[Figure: f1 in image I1 matched to a visually similar but incorrect f2 in I2]
Feature distance
How do we define the difference between two features f1, f2?
• Better approach: ratio distance = $\|f_1 - f_2\| / \|f_1 - f_2'\|$
  – f2 is the best SSD match to f1 in I2
  – f2' is the 2nd-best SSD match to f1 in I2
  – This gives large values for ambiguous matches (a sketch follows below)

[Figure: f1 in I1; the best match f2 and the second-best match f2' in I2]
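A sketch of the ratio test on top of the same pairwise distances as in the matcher above (the 0.8 acceptance threshold follows Lowe's commonly cited value, but treat it as an assumption):

    def ratio_test_matches(descs1, descs2, ratio=0.8):
        d = np.linalg.norm(descs1[:, None, :] - descs2[None, :, :], axis=2)
        order = np.argsort(d, axis=1)
        best, second = order[:, 0], order[:, 1]         # nearest and 2nd-nearest in I2
        rows = np.arange(len(descs1))
        keep = d[rows, best] / d[rows, second] < ratio  # reject ambiguous matches
        return [(int(i), int(best[i])) for i in rows[keep]]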
Feature distance
• Does using the ratio distance instead of SSD change the best match for a given feature in image 1? (No: the nearest neighbor is the same either way; the ratio only changes which matches survive thresholding.)
Feature matching example

58 matches (thresholded by ratio score)


We'll deal with outliers later.
Feature matching example

51 matches (thresholded by ratio score)


Evaluating the results
How can we measure the performance of a feature matcher?

[Figure: three candidate matches with feature distances 50, 75, and 200]
True/false positives
How can we measure the performance of a feature matcher?

[Figure: the same candidate matches, with feature distances 50, 75, and 200, labeled as true or false matches]

The distance threshold affects performance:
– True positives = # of detected matches that are correct
  • Suppose we want to maximize these; how do we choose the threshold?
– False positives = # of detected matches that are incorrect
  • Suppose we want to minimize these; how do we choose the threshold?
Evaluating the results
How can we measure the performance of a feature matcher?

  true positive rate (recall) = # true positives / # matching features (positives)
  false positive rate = # false positives / # unmatched features (negatives)

[Figure: a single operating point at false positive rate 0.1 and true positive rate 0.7]
Evaluating the results
How can we measure the performance of a feature matcher?

ROC curve ("Receiver Operating Characteristic"):
• Plot the true positive rate (recall) against the false positive rate (1 - specificity) as the distance threshold varies
• Single number: Area Under the Curve (AUC), e.g. AUC = 0.87; 1 is the best

[Figure: ROC curve from (0, 0) to (1, 1), passing through the point (0.1, 0.7)]

where
  true positive rate = # true positives / # matching features (positives)
  false positive rate = # false positives / # unmatched features (negatives)
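A sketch of tracing the ROC curve by sweeping the distance threshold (is_correct flags which candidate matches are true matches, e.g. from ground truth; the AUC is the trapezoidal area under the curve):

    import numpy as np

    def roc_curve(distances, is_correct):
        order = np.argsort(distances)          # accept matches from smallest distance up
        correct = np.asarray(is_correct)[order]
        tp = np.cumsum(correct)                # true positives at each threshold
        fp = np.cumsum(~correct)               # false positives at each threshold
        tpr = tp / max(correct.sum(), 1)       # recall
        fpr = fp / max((~correct).sum(), 1)    # 1 - specificity
        return fpr, tpr

    # Toy example with made-up labels for the slide's three distances
    fpr, tpr = roc_curve(np.array([50, 75, 200]), np.array([True, False, False]))
    auc = np.trapezoid(tpr, fpr)               # np.trapz in older NumPy; 1 is the best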
