Comparison
http://people.csail.mit.edu/albert/ladypack/wiki/index.php/Known_implementations_of_SIFT
SIFT Computation – Steps
(1) Scale-space extrema detection
– Extract scale and rotation invariant interest points (i.e.,
keypoints).
(2) Keypoint localization
– Determine location and scale for each interest point.
– Eliminate “weak” keypoints
(3) Orientation assignment
– Assign one or more orientations to each keypoint.
(4) Keypoint descriptor
– Use local image gradients at the selected scale.
D. Lowe, “Distinctive Image Features from Scale-Invariant Keypoints”, International Journal
of Computer Vision, 60(2):91-110, 2004.
1. Scale-space Extrema Detection
• Harris-Laplace
Find local maxima of:
– Harris detector in space (x, y)
– LoG in scale
• SIFT
Find local maxima of:
– Hessian in space (x, y)
– DoG in scale
1. Scale-space Extrema Detection
(cont’d)
• DoG images are grouped by octaves (i.e., doubling of σ0)
• Fixed number of levels per octave
$D(x, y, \sigma) = L(x, y, k\sigma) - L(x, y, \sigma)$
where
$L(x, y, \sigma) = G(x, y, \sigma) * I(x, y)$
[Figure: octaves start at σ0, 2σ0, and 2²σ0; the image is down-sampled between octaves]
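The DoG definition above can be sketched for a single octave as follows. This is a minimal illustration, not a full SIFT pyramid: the function names, the default σ0 = 1.6, the 3σ kernel radius, and the number of levels are assumptions, and down-sampling between octaves is left out.

```python
import numpy as np

def gaussian_kernel_1d(sigma):
    # Truncate the kernel at 3 sigma (an assumption) and normalize to sum 1.
    radius = max(1, int(np.ceil(3.0 * sigma)))
    x = np.arange(-radius, radius + 1, dtype=float)
    k = np.exp(-(x ** 2) / (2.0 * sigma ** 2))
    return k / k.sum()

def gaussian_blur(image, sigma):
    # L(x, y, sigma) = G(x, y, sigma) * I(x, y), as a separable convolution.
    k = gaussian_kernel_1d(sigma)
    radius = len(k) // 2
    padded = np.pad(image, radius, mode="edge")
    rows = np.apply_along_axis(lambda r: np.convolve(r, k, mode="valid"), 1, padded)
    return np.apply_along_axis(lambda c: np.convolve(c, k, mode="valid"), 0, rows)

def dog_octave(image, sigma0=1.6, levels=3):
    # k is chosen so that k**levels == 2, i.e. sigma doubles across the octave.
    k = 2.0 ** (1.0 / levels)
    blurred = [gaussian_blur(image, sigma0 * k ** i) for i in range(levels + 1)]
    # D(x, y, sigma) = L(x, y, k sigma) - L(x, y, sigma)
    return [blurred[i + 1] - blurred[i] for i in range(levels)]
```

A constant image gives DoG responses of zero at every level, which is a quick sanity check on the kernel normalization.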
1. Scale-space Extrema Detection
(cont’d)
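[Figure: levels within an octave run from σ0 to k^s σ0, with k^s = 2]
Offset: $\Delta X = (x - x_i,\ y - y_i,\ \sigma - \sigma_i)$
$D(\Delta X) = D(X_i) + \frac{\partial D^T(X_i)}{\partial X}\,\Delta X + \frac{1}{2}\,\Delta X^T\,\frac{\partial^2 D(X_i)}{\partial X^2}\,\Delta X$
$\frac{\partial D(\Delta X)}{\partial \Delta X} = 0$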
2. Keypoint Localization
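$\frac{\partial D(X_i)}{\partial X} + \frac{\partial^2 D(X_i)}{\partial X^2}\,\Delta X = 0 \;\Rightarrow\; \Delta X = -\left(\frac{\partial^2 D(X_i)}{\partial X^2}\right)^{-1} \frac{\partial D(X_i)}{\partial X}$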
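A minimal sketch of solving for the offset ΔX, assuming the derivatives are estimated with central finite differences on a 3 x 3 x 3 neighborhood of DoG samples. The function name and the [scale, y, x] indexing are assumptions made for illustration.

```python
import numpy as np

def subpixel_offset(cube):
    """cube: 3x3x3 DoG samples indexed [scale, y, x], centered on the candidate X_i."""
    c = cube[1, 1, 1]
    # Gradient of D at X_i, in (x, y, sigma) order.
    g = 0.5 * np.array([
        cube[1, 1, 2] - cube[1, 1, 0],
        cube[1, 2, 1] - cube[1, 0, 1],
        cube[2, 1, 1] - cube[0, 1, 1],
    ])
    # Second derivatives (entries of the 3x3 Hessian of D).
    dxx = cube[1, 1, 2] - 2 * c + cube[1, 1, 0]
    dyy = cube[1, 2, 1] - 2 * c + cube[1, 0, 1]
    dss = cube[2, 1, 1] - 2 * c + cube[0, 1, 1]
    dxy = 0.25 * (cube[1, 2, 2] - cube[1, 2, 0] - cube[1, 0, 2] + cube[1, 0, 0])
    dxs = 0.25 * (cube[2, 1, 2] - cube[2, 1, 0] - cube[0, 1, 2] + cube[0, 1, 0])
    dys = 0.25 * (cube[2, 2, 1] - cube[2, 0, 1] - cube[0, 2, 1] + cube[0, 0, 1])
    H = np.array([[dxx, dxy, dxs],
                  [dxy, dyy, dys],
                  [dxs, dys, dss]])
    # Delta X = -H^{-1} g, obtained by setting the derivative of the expansion to zero.
    offset = -np.linalg.solve(H, g)
    # Interpolated value of D at the refined extremum.
    value = c + 0.5 * g @ offset
    return offset, value
```

For samples drawn from an exact quadratic, the finite differences are exact and the returned offset is the true location of the extremum.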
(r = α/β, where α and β are the larger and smaller eigenvalues of the 2 x 2 Hessian of D)
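The ratio r is typically used to reject edge-like keypoints. The sketch below follows the test described in Lowe's paper, which keeps a keypoint only if Tr(H)²/Det(H) < (r+1)²/r and suggests r = 10. Neither the test nor that value appears on the slide, and the function name and 3 x 3 patch layout are assumptions.

```python
import numpy as np

def passes_edge_test(patch, r=10.0):
    """patch: 3x3 DoG samples indexed [y, x] around the keypoint."""
    c = patch[1, 1]
    dxx = patch[1, 2] - 2 * c + patch[1, 0]
    dyy = patch[2, 1] - 2 * c + patch[0, 1]
    dxy = 0.25 * (patch[2, 2] - patch[2, 0] - patch[0, 2] + patch[0, 0])
    trace = dxx + dyy
    det = dxx * dyy - dxy ** 2
    if det <= 0:
        # Curvatures of opposite sign: not a well-defined extremum.
        return False
    # Equivalent to requiring an eigenvalue ratio alpha / beta below r.
    return trace ** 2 / det < (r + 1) ** 2 / r
```

The trace and determinant give the eigenvalue ratio without computing the eigenvalues explicitly.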
$m(x, y) = \sqrt{(L(x+1, y) - L(x-1, y))^2 + (L(x, y+1) - L(x, y-1))^2}$
$\theta(x, y) = \operatorname{atan2}\big(L(x, y+1) - L(x, y-1),\ L(x+1, y) - L(x-1, y)\big)$
[Figure: orientation histogram over the range 0 to 2π]
• Histogram entries are weighted by (i) gradient magnitude and (ii) a
Gaussian function with σ equal to 1.5 times the scale of the keypoint.
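The weighting described above can be sketched as below. The function name, the 36-bin default (the value used in Lowe's paper, not stated on the slide), and the window radius of about 3σ are assumptions.

```python
import numpy as np

def orientation_histogram(L, cx, cy, scale, bins=36):
    """L: smoothed image indexed [y, x]; (cx, cy): keypoint position in pixels."""
    sigma = 1.5 * scale                  # Gaussian weight sigma = 1.5 x keypoint scale
    radius = int(round(3 * sigma))       # window size is an assumption
    hist = np.zeros(bins)
    h, w = L.shape
    for y in range(max(1, cy - radius), min(h - 1, cy + radius + 1)):
        for x in range(max(1, cx - radius), min(w - 1, cx + radius + 1)):
            dx = L[y, x + 1] - L[y, x - 1]
            dy = L[y + 1, x] - L[y - 1, x]
            m = np.hypot(dx, dy)                       # gradient magnitude
            theta = np.arctan2(dy, dx) % (2 * np.pi)   # orientation in [0, 2*pi)
            weight = np.exp(-((x - cx) ** 2 + (y - cy) ** 2) / (2 * sigma ** 2))
            b = int(theta / (2 * np.pi) * bins) % bins
            hist[b] += weight * m
    return hist
```

On a horizontal intensity ramp every gradient points along +x, so all of the weight falls into the first bin.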
3. Orientation Assignment (cont’d)
• Assign canonical orientation at peak of smoothed
histogram (fit parabola to better localize peak).
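The parabola fit can be sketched as follows: a parabola through the peak bin and its two neighbors gives a sub-bin shift. The function name is hypothetical, and the result is returned as a fractional bin index rather than an angle.

```python
import numpy as np

def refine_peak(hist):
    """Return the fractional bin index of the histogram peak."""
    n = len(hist)
    i = int(np.argmax(hist))
    # The histogram is circular, so neighbors wrap around.
    left, center, right = hist[(i - 1) % n], hist[i], hist[(i + 1) % n]
    denom = left - 2 * center + right
    # Vertex of the parabola through the three samples.
    shift = 0.0 if denom == 0 else 0.5 * (left - right) / denom
    return (i + shift) % n
```

Multiplying the returned index by the bin width converts it to an orientation.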
[Figure: orientation histogram over 0 to 2π (8 bins)]
4. Keypoint Descriptor (cont’d)
1. Take a 16 x 16 window around the detected interest point.
2. Divide it into a 4 x 4 grid of cells.
3. Compute a histogram (8 bins) in each cell.
16 histograms x 8 orientations = 128 features
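The three steps above can be sketched as below, assuming the gradient magnitudes and orientations of the 16 x 16 window are already computed. This simplified version uses hard binning and omits the Gaussian weighting, rotation to the canonical orientation, and partial voting. The function name is hypothetical.

```python
import numpy as np

def simple_descriptor(mag, theta, cells=4, bins=8):
    """mag, theta: 16x16 gradient magnitudes and orientations in [0, 2*pi]."""
    size = mag.shape[0] // cells                      # 4 pixels per cell side
    bin_idx = (theta / (2 * np.pi) * bins).astype(int) % bins
    desc = np.zeros((cells, cells, bins))
    for cy in range(cells):
        for cx in range(cells):
            ys = slice(cy * size, (cy + 1) * size)
            xs = slice(cx * size, (cx + 1) * size)
            # One magnitude-weighted orientation histogram per cell.
            desc[cy, cx] = np.bincount(bin_idx[ys, xs].ravel(),
                                       weights=mag[ys, xs].ravel(),
                                       minlength=bins)
    return desc.ravel()                               # 4 * 4 * 8 = 128 values
```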
4. Keypoint Descriptor (cont’d)
• Each histogram entry is weighted by (i) gradient magnitude
and (ii) a Gaussian function with σ equal to 0.5 times the
width of the descriptor window.
4. Keypoint Descriptor (cont’d)
• Partial Voting: distribute histogram entries into adjacent bins
(i.e., additional robustness to shifts)
– Each entry is added to the adjacent bins, multiplied by a weight of 1-d,
where d is its distance from the bin center (in units of bin spacing).
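A minimal sketch of partial voting along the orientation dimension only. The function name and the bin-center convention (bin b is centered at (b + 0.5) bin widths) are assumptions. In the full descriptor the same 1-d weighting would also apply across the spatial cells.

```python
import numpy as np

def soft_vote(hist, theta, magnitude):
    """Split one gradient sample between the two nearest orientation bins."""
    bins = len(hist)
    # Position measured in bin widths, relative to the bin centers.
    pos = theta / (2 * np.pi) * bins - 0.5
    lo = int(np.floor(pos))
    d = pos - lo                                     # distance to the lower bin center
    hist[lo % bins] += magnitude * (1 - d)           # nearer bin gets the larger share
    hist[(lo + 1) % bins] += magnitude * d
    return hist
```

A sample that falls exactly between two bin centers contributes half of its magnitude to each, and the total added always equals the magnitude.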
4. Keypoint Descriptor (cont’d)
• Descriptor depends on two main parameters:
(1) number of orientations r
(2) n x n array of orientation histograms
The descriptor has rn² features.
• SIFT: r = 8, n = 4, giving 128 features
4. Keypoint Descriptor (cont’d)
• Non-linear illumination changes:
– Saturation affects gradient magnitudes more
than orientations
– Threshold entries to be no larger than 0.2 and
renormalize to unit length
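The thresholding step can be sketched as below. The function name and the small epsilon that guards against division by zero are assumptions.

```python
import numpy as np

def normalize_descriptor(desc, clip=0.2):
    # Normalize to unit length (robust to affine illumination changes).
    v = desc / max(np.linalg.norm(desc), 1e-12)
    # Limit the influence of large gradient magnitudes.
    v = np.minimum(v, clip)
    # Renormalize to unit length.
    return v / max(np.linalg.norm(v), 1e-12)
```

After clipping, a single dominant entry carries less weight relative to the rest of the descriptor than it did in the input.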
Robustness to viewpoint changes
• Match features after random change in image scale and
orientation, with 2% image noise, and affine distortion.
• Find nearest neighbor in database of 30,000 features.
• Additional robustness can be achieved using affine invariant region detectors.
Distinctiveness
• Vary size of database of features, with 30 degree affine
change, 2% image noise.
• Measure % correct for single nearest neighbor match.
Matching SIFT features
• Given a feature in I1, how to find the best
match in I2?
1. Define distance function that compares two
descriptors.
2. Test all the features in I2, find the one with
min distance.
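The two steps above can be sketched with SSD as the distance function and a brute-force search. The function names are hypothetical, and descriptors are assumed to be stored one per row.

```python
import numpy as np

def ssd_matrix(F1, F2):
    """SSD between every descriptor in F1 (m x d) and every descriptor in F2 (n x d)."""
    diff = F1[:, None, :] - F2[None, :, :]
    return np.sum(diff ** 2, axis=2)

def nearest_neighbor_matches(F1, F2):
    d = ssd_matrix(F1, F2)
    idx = np.argmin(d, axis=1)                 # best match in I2 for each feature in I1
    return idx, d[np.arange(len(F1)), idx]
```

This tests all features in I2 for each feature in I1, so the cost grows with the product of the two feature counts.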
[Figure: feature f1 in image I1 and a candidate match f2 in image I2]
Matching SIFT features (cont’d)
• Accept a match if SSD(f1,f2) < t
• How do we choose t?
Matching SIFT features (cont’d)
• A better distance measure is the following:
– SSD(f1, f2) / SSD(f1, f2’)
• f2 is best SSD match to f1 in I2
• f2’ is 2nd best SSD match to f1 in I2
[Figure: feature f1 in I1 with best match f2 and second-best match f2' in I2]
Matching SIFT features (cont’d)
• Accept a match if SSD(f1, f2) / SSD(f1, f2’) < t
• t=0.8 has given good results in object
recognition.
– 90% of false matches were eliminated.
– Less than 5% of correct matches were discarded
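A minimal sketch of the ratio test, using the SSD ratio as written on the slide. The function name is hypothetical, and the second set of features must contain at least two descriptors.

```python
import numpy as np

def ratio_test_matches(F1, F2, t=0.8):
    """Return (i, j) pairs where F2[j] is an unambiguous best match for F1[i]."""
    d = np.sum((F1[:, None, :] - F2[None, :, :]) ** 2, axis=2)   # SSD matrix
    order = np.argsort(d, axis=1)
    best, second = order[:, 0], order[:, 1]
    rows = np.arange(len(F1))
    # SSD(f1, f2) / SSD(f1, f2'); the epsilon avoids division by zero.
    ratio = d[rows, best] / np.maximum(d[rows, second], 1e-12)
    return [(int(i), int(best[i])) for i in np.flatnonzero(ratio < t)]
```

A feature whose best and second-best matches are nearly equally distant gives a ratio close to 1 and is rejected as ambiguous.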
Matching SIFT features (cont’d)
• How to evaluate the performance of a feature
matcher?
[Figure: candidate matches with feature distances 50, 75, and 200]
Matching SIFT features (cont’d)
• Threshold t affects # of correct/false matches
[Figure: matches at distances 50, 75, and 200, each labeled as a true or false match]
• ROC Curve
- Generated by computing (FP, TP) for different thresholds.
[Figure: ROC curve with TP rate on the vertical axis]
http://en.wikipedia.org/wiki/Receiver_operating_characteristic
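One way to generate the ROC points is sketched below, assuming ground-truth labels are available for each candidate match. The function name is hypothetical, and a match is accepted when its distance is below the threshold t.

```python
import numpy as np

def roc_points(distances, is_true_match, thresholds):
    """Return one (FP rate, TP rate) pair per threshold."""
    distances = np.asarray(distances, dtype=float)
    labels = np.asarray(is_true_match, dtype=bool)
    points = []
    for t in thresholds:
        accepted = distances < t
        tp = np.sum(accepted & labels)         # correct matches that were accepted
        fp = np.sum(accepted & ~labels)        # false matches that were accepted
        tpr = tp / max(labels.sum(), 1)
        fpr = fp / max((~labels).sum(), 1)
        points.append((float(fpr), float(tpr)))
    return points
```

Raising t accepts more matches, so both rates can only stay the same or increase, which traces the curve from (0, 0) toward (1, 1).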
Applications of SIFT
• Object recognition
• Object categorization
• Location recognition
• Robot localization
• Image retrieval
• Image panoramas
Object Recognition
Object Models
Object Categorization
Location recognition
Robot Localization
Map continuously built over time
Image retrieval – Example 1
…
> 5000 images
22 correct matches
Image retrieval – Example 2
…
> 5000 images
33 correct matches
Image panoramas from an unordered image set
Variations of SIFT features
• PCA-SIFT
• SURF
• GLOH
SIFT Steps - Review
(1) Scale-space extrema detection
– Extract scale and rotation invariant interest points (i.e.,
keypoints).
(2) Keypoint localization
– Determine location and scale for each interest point.
– Eliminate “weak” keypoints
(3) Orientation assignment
– Assign one or more orientations to each keypoint.
(4) Keypoint descriptor
– Use local image gradients at the selected scale.
D. Lowe, “Distinctive Image Features from Scale-Invariant Keypoints”, International Journal
of Computer Vision, 60(2):91-110, 2004.
Cited 9589 times (as of 3/7/2011)
PCA-SIFT
• Steps 1-3 are the same; Step 4 is modified.
• Take a 41 x 41 patch at the given scale,
centered at the keypoint, and normalized to
a canonical direction.
Yan Ke and Rahul Sukthankar, “PCA-SIFT: A More Distinctive Representation for Local
Image Descriptors”, Computer Vision and Pattern Recognition, 2004
PCA-SIFT
• Instead of using weighted histograms,
concatenate the horizontal and vertical
gradients (39 x 39) into a long vector.
• Normalize vector to unit length.
2 x 39 x 39 = 3042 vector
PCA-SIFT
• Reduce the dimensionality of the vector using
Principal Component Analysis (PCA)
– e.g., from 3042 to 36
$I'_{K \times 1} = A_{K \times N}\, I_{N \times 1}$
[Figure: PCA maps the N x 1 gradient vector to a K x 1 descriptor]
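The projection can be sketched as below, assuming the matrix A is learned from a set of training gradient vectors via SVD. The function names are hypothetical, and the mean subtraction is the usual PCA convention rather than something stated on the slide.

```python
import numpy as np

def fit_pca(X, k):
    """X: one N-dimensional training vector per row. Returns the mean and A (k x N)."""
    mean = X.mean(axis=0)
    # Rows of Vt are the principal directions, ordered by decreasing variance.
    _, _, Vt = np.linalg.svd(X - mean, full_matrices=False)
    return mean, Vt[:k]

def project(A, mean, v):
    # I' = A I: reduce an N x 1 vector to K x 1.
    return A @ (v - mean)
```

For PCA-SIFT the slide's example corresponds to N = 3042 and K = 36.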
Herbert Bay, Tinne Tuytelaars, and Luc Van Gool, “SURF: Speeded Up Robust Features”,
European Computer Vision Conference (ECCV), 2006.
Integral Image
• The integral image IΣ(x,y) of an image I(x, y) represents the
sum of all pixels in I(x,y) of a rectangular region formed by
(0,0) and (x,y).
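A minimal sketch of the integral image and of the constant-time box sum it enables, which is what makes box filters cheap at any size. The function names and the inclusive corner convention are assumptions.

```python
import numpy as np

def integral_image(I):
    # S[y, x] = sum of I over the rectangle from (0, 0) to (x, y), inclusive.
    return I.cumsum(axis=0).cumsum(axis=1)

def box_sum(S, y0, x0, y1, x1):
    """Sum of I[y0:y1+1, x0:x1+1] using at most four lookups in S."""
    total = S[y1, x1]
    if y0 > 0:
        total -= S[y0 - 1, x1]
    if x0 > 0:
        total -= S[y1, x0 - 1]
    if y0 > 0 and x0 > 0:
        total += S[y0 - 1, x0 - 1]
    return total
```

The cost of `box_sum` does not depend on the size of the rectangle.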
SURF
Using box filters:
$H^{SURF}_{approx} = \begin{bmatrix} \hat{L}_{xx} & \hat{L}_{xy} \\ \hat{L}_{yx} & \hat{L}_{yy} \end{bmatrix}$
SURF: Speeded Up Robust Features
(cont’d)
• Instead of using a different measure for selecting the
location and scale of interest points (e.g., Hessian and
DoG in SIFT), SURF uses the determinant of $H^{SURF}_{approx}$
to find both.
$\det(H^{SURF}_{approx}) = \hat{L}_{xx}\hat{L}_{yy} - (0.9\,\hat{L}_{xy})^2$
SURF: Speeded Up Robust Features
(cont’d)
• Once interest points have been localized both in space
and scale, the next steps are:
(1) Orientation assignment
(2) Keypoint descriptor
SURF: Speeded Up Robust Features
(cont’d)
• Orientation assignment
– Circular neighborhood of radius 6σ around the interest point
(σ = the scale at which the point was detected)
– Haar wavelets with side length = 4σ give the x response and y response
(dx, dy); responses are weighted with a Gaussian
[Figure: responses plotted as points (dx, dy), with a sliding angular window of 60°]
– More discriminatory!
SURF: Speeded Up Robust Features