IT5409 Ch4 Part2: Feature Extraction and Matching

The document discusses feature detection and image matching in computer vision. It describes two types of features that can be extracted from images: global features that describe the entire image and local features that describe local patches or keypoints. Some examples of global and local features are provided. The document also discusses different approaches for extracting features, including color histograms, texture features using first-order statistics and co-occurrence matrices, and how features can be used for image matching and applications in computer vision.


Computer Vision
Chapter 4: Feature detection and image matching

Plan
• Edges
‒ Detection
‒ Linking
• Feature extraction
‒ Global features
‒ Local features
‒ Matching
• Applications


Local features vs Global features


• Two types of features are extracted from the image:
‒ local and global features (descriptors)
• Global features
‒ Describe the image as a whole to generalize the entire object
‒ Include contour representations, shape descriptors, and texture features
• Examples: Invariant Moments (Hu, Zernike), Histogram of Oriented Gradients (HOG), PHOG, and Co-HOG, …
• Local features:
‒ Local features describe the image patches (keypoints in the image) of an object
‒ They represent the texture/color in an image patch
• Examples: SIFT, SURF, LBP, BRISK, MSER and FREAK, …

Feature extraction
• Global features
• Color / Shape / Texture
• Local features


Global features?
How to distinguish these objects?

Types of features
• Contour representation, Shape features
• Color descriptors
• Texture features


Color features
• Histogram (figure: a 256-bin intensity histogram vs. a 16-bin intensity histogram)

Distance / Similarity
• L1 or L2 (Euclidean) distances are often used

$$d_{L1}(H, G) = \sum_{i=1}^{N} |h_i - g_i|$$

• Histogram intersection

$$d_{\cap}(H, G) = \frac{\sum_{i=1}^{N} \min(h_i, g_i)}{\sum_{i=1}^{N} g_i}$$
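A minimal NumPy sketch of these two measures (the function names are illustrative; h and g are assumed to be histograms of equal length):

```python
import numpy as np

def l1_distance(h, g):
    """L1 distance between two histograms."""
    return np.abs(h - g).sum()

def histogram_intersection(h, g):
    """Histogram intersection, normalized by G; 1.0 means identical."""
    return np.minimum(h, g).sum() / g.sum()
```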


Advantages of histogram
• Invariant to basic geometric transformations:
‒ Rotation
‒ Translation
‒ Zoom (Scaling)

Some inconveniences
• The similarity between colors in adjacent bins is not taken into account

• The spatial distribution of pixel values is not considered: 2 different images may have the same histogram


Some inconveniences
• Background effect: d(I1,I2) ? d(I1, I3)

• Color representation dependency (color space), device dependency, …

Texture features
• A texture can be defined as
‒ a region with variation of intensity
‒ as a spatial organization of pixels



Texture features
• There are several methods for analyzing textures:
‒ First-order statistics
• Statistics on the histogram

‒ Co-occurrence matrices
• Searching for patterns

‒ Frequency analysis
• Gabor filters
‒…
• The most difficult part is to find a good representation (good parameters) for each texture

Texture features


First order statistics


• Histogram-based: mean, variance, skewness, kurtosis, energy, entropy, ...

h(i): the number of pixels with gray level i
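A small sketch of these histogram statistics in NumPy (the function name and the 256-level assumption are illustrative):

```python
import numpy as np

def first_order_stats(image, levels=256):
    """First-order texture statistics from the gray-level histogram h(i)."""
    h, _ = np.histogram(image, bins=levels, range=(0, levels))
    p = h / h.sum()                      # normalized histogram p(i)
    i = np.arange(levels)
    mean = (i * p).sum()
    var = ((i - mean) ** 2 * p).sum()
    skewness = ((i - mean) ** 3 * p).sum() / var ** 1.5
    kurtosis = ((i - mean) ** 4 * p).sum() / var ** 2 - 3
    energy = (p ** 2).sum()
    entropy = -(p[p > 0] * np.log2(p[p > 0])).sum()
    return mean, var, skewness, kurtosis, energy, entropy
```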

First order statistics


• Skewness vs. kurtosis (red: normal distribution with mean = 5, variance = 4)


GLCM (Gray Level Co-occurrence Matrices)

 The idea is to identify gray levels that repeat themselves at a given distance and in a given direction
 Co-occurrence matrices (Haralick)
 Matrix of size Ng x Ng
 Ng is the number of gray levels in the image (e.g., 256, giving a 256x256 matrix)
 We often reduce that number to 8x8, 16x16 or 32x32
 Many matrices, one for each distance and direction
 Distance: 1, 2, 3 (, 4, …)
 Direction: 0°, 45°, 90°, 135° (, …)
 Processing time can be very long

GLCM

$$CM_{d,\theta}(c_i, c_j) = \frac{\mathrm{card}(\{(p_1, p_2) \mid I(p_1) = c_i,\ I(p_2) = c_j,\ N_{d,\theta}(p_1, p_2) = \mathrm{true}\})}{\mathrm{card}(\{(p_1, p_2) \mid N_{d,\theta}(p_1, p_2) = \mathrm{true}\})}$$

$N_{d,\theta}(p_1, p_2) = \mathrm{true}$ iff $p_2$ is a neighbor of $p_1$ at distance $d$ in direction $\theta$


GLCM
 Example on how to compute these matrices:

Image:
1 4 4 3
4 2 3 2
1 2 1 4
1 2 2 3

Matrix for distance=1 and direction=0° (row = first gray level, column = second):
  1 2 3 4
1 ? ? ? ?
2 ? ? ? ?
3 ? ? ? ?
4 ? ? ? ?

We loop over the image and for each pair of pixels following the given distance and orientation, we increment the co-occurrence matrix

GLCM
 Example on how to compute these matrices:

Image:
1 4 4 3
4 2 3 2
1 2 1 4
1 2 2 3

Matrix for distance=1 and direction=0°:
  1 2 3 4
1 0 0 0 1
2 0 0 0 0
3 0 0 0 0
4 0 0 0 0

Pair of neighboring pixels (1,4)


GLCM
 Example on how to compute these matrices:

Image:
1 4 4 3
4 2 3 2
1 2 1 4
1 2 2 3

Matrix for distance=1 and direction=0°:
  1 2 3 4
1 0 0 0 1
2 0 0 0 0
3 0 0 0 0
4 0 0 0 1

Pair of neighboring pixels (4,4)

GLCM
 Example on how to compute these matrices:

Image:
1 4 4 3
4 2 3 2
1 2 1 4
1 2 2 3

Matrix for distance=1 and direction=0°:
  1 2 3 4
1 0 0 0 1
2 0 0 1 0
3 0 1 0 0
4 0 1 1 1

...and so on (after two rows of the image)


GLCM
 Example on how to compute these matrices (final):

Image:
1 4 4 3
4 2 3 2
1 2 1 4
1 2 2 3

Matrix for distance=1 and direction=0°:
  1 2 3 4
1 0 2 0 2
2 1 1 2 0
3 0 1 0 0
4 0 1 1 1

Matrix for distance=1 and direction=45°:
  1 2 3 4
1 0 2 1 0
2 1 1 0 0
3 0 0 0 1
4 0 2 1 0

…and so on for each matrix (several matrices at the end)
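A direct NumPy transcription of this loop, which reproduces the worked example above (the displacement convention used for 45° is the one that matches the slide's result):

```python
import numpy as np

def glcm(image, levels, d=(0, 1)):
    """Co-occurrence matrix for displacement d = (d_row, d_col).
    d=(0, 1): distance 1, direction 0 deg; d=(1, 1): distance 1, direction 45 deg."""
    m = np.zeros((levels, levels), dtype=np.int64)
    rows, cols = image.shape
    dr, dc = d
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dr, c + dc
            if 0 <= r2 < rows and 0 <= c2 < cols:
                m[image[r, c], image[r2, c2]] += 1   # increment for each pixel pair
    return m

img = np.array([[1, 4, 4, 3],
                [4, 2, 3, 2],
                [1, 2, 1, 4],
                [1, 2, 2, 3]]) - 1     # shift gray levels 1..4 to indices 0..3
print(glcm(img, 4, d=(0, 1)))          # matrix for distance=1, direction=0 deg
print(glcm(img, 4, d=(1, 1)))          # matrix for distance=1, direction=45 deg
```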

GLCM
• Most important/popular parameters computed from a GLCM:

$$\mathrm{Energy} = \sum_i \sum_j CM_d^2(i, j)$$
(minimal when all elements are equal)

$$\mathrm{Entropy} = -\sum_i \sum_j CM_d(i, j) \log(CM_d(i, j))$$
(a measure of chaos; maximal when all elements are equal)

$$\mathrm{Contrast} = \sum_i \sum_j (i - j)^2\, CM_d(i, j)$$
(small values when the big elements are near the main diagonal)

$$\mathrm{IDM} = \sum_i \sum_j \frac{CM_d(i, j)}{1 + (i - j)^2}$$
(IDM, the inverse differential moment, has small values when the big elements are far from the main diagonal)
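A sketch of these four parameters, assuming cm is a co-occurrence matrix like the ones computed above (normalized to probabilities first):

```python
import numpy as np

def glcm_features(cm):
    """Energy, entropy, contrast and IDM of a co-occurrence matrix."""
    p = cm / cm.sum()                               # normalize to probabilities
    i, j = np.indices(p.shape)
    energy = (p ** 2).sum()
    entropy = -(p[p > 0] * np.log(p[p > 0])).sum()  # skip empty cells (log 0)
    contrast = ((i - j) ** 2 * p).sum()
    idm = (p / (1 + (i - j) ** 2)).sum()
    return energy, entropy, contrast, idm
```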


GLCM
• Haralick features:
‒ For each GLCM, we can compute up to 14 (13) parameters characterizing the texture, of which the most important are: mean, variance, energy, inertia, entropy, inverse differential moment
‒ Ref: http://haralick.org/journals/TexturalFeatures.pdf

Invariances
• Rotation?
‒ Average on all directions

• Scaling?
‒ Multi-resolutions



Texture features comparison

Source: William Robson Schwartz et al. Evaluation of Feature Descriptors for Texture Classification. 2012 JEI

Shape features
• Contour-based features
‒ Chain coding, polygon approximation, geometric parameters, angular profile, surface, perimeter, …
• Region-based features:
‒ Invariant moments, …


Shape features


Examples: Freeman chain coding



Examples: angular profile, …


Examples: Image moments

• Moments:

$$M_{p,q} = \sum_{x}\sum_{y} x^p\, y^q\, I(x, y)$$

For a binary region $D$: $M_{0,0}$ = area of the region $D$, and $(M_{1,0}/M_{0,0},\ M_{0,1}/M_{0,0})$ = centroid of $D$

• Central moments (computed about the centroid):

Invariant to translation


Invariant moments (Hu's moments)

Invariant to translation, scale, rotation, and reflection; the last moment changes sign under image reflection

Examples: Hu's moments


6 images and their Hu Moments

https://www.learnopencv.com/wp-content/uploads/2018/12/HuMoments-Shape-Matching.png
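Hu moments are available directly in OpenCV; a minimal sketch (the input file name is hypothetical, and the log rescaling is a common trick to compress their dynamic range for comparison):

```python
import cv2
import numpy as np

img = cv2.imread("shape.png", cv2.IMREAD_GRAYSCALE)    # hypothetical input file
_, binary = cv2.threshold(img, 128, 255, cv2.THRESH_BINARY)
hu = cv2.HuMoments(cv2.moments(binary)).flatten()       # 7 invariant moments
hu_log = -np.sign(hu) * np.log10(np.abs(hu))            # rescale for comparison
print(hu_log)
```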



Shape Context

https://www2.eecs.berkeley.edu/Research/Projects/CS/vision/shape/sc_digits.html


Examples: PHOG

PHOG:
Pyramid Histogram of Oriented Gradients

Source: http://www.robots.ox.ac.uk/~vgg/research/caltech/phog.html


Feature extraction
• Global features
• Local features
• Interest point detector
• Local descriptor


Why local features?


• Image matching: a challenging problem

Source: CS131 - Juan Carlos Niebles and Ranjay Krishna



Image matching

by swashford

by Diva Sian

by scgbt

Slide credit: Steve Seitz

Harder Still?

NASA Mars Rover images

Slide credit: Steve Seitz



Answer Below (Look for tiny colored squares)

NASA Mars Rover images with SIFT feature matches


(Figure by Noah Snavely)

Slide credit: Steve Seitz


• Recognition of specific objects/scenes

Sivic and Zisserman, 2003

D. Lowe 2002



Motivation for using local features


• Global representations have major limitations
• Instead, describe and match only local regions
• Increased robustness to
‒ Occlusions
‒ Articulation
‒ Intra-category variations

Source: CS131 - Juan Carlos Niebles and Ranjay Krishna

Local features and alignment


• We need to match (align) images
• Global methods are sensitive to occlusion, lighting, and parallax effects, so look for local features that match well.
• How would you do it by eye?

[Darya Frolova and Denis Simakov]


Local features and alignment


• Detect feature points in both images
• Find corresponding pairs

[Darya Frolova and Denis Simakov]


Local features and alignment


• Detect feature points in both images
• Find corresponding pairs
• Use these pairs to align images

[Darya Frolova and Denis Simakov]


Local features for image matching

1. Find a set of distinctive keypoints
2. Define a region around each keypoint
3. Extract and normalize the region content
4. Compute a local descriptor from the normalized region (e.g., color)
5. Match local descriptors with a similarity measure: accept if $d(f_A, f_B) \le T$

Slide credit: Bastian Leibe

Source: Jim Little, Lowe: features, UBC.


Local features
• Objectives:
• Look for similar objects/regions
• Partial query: look for pictures that contain sunflowers

• Solution:
• Describing local regions
• Adding spatial constraints if needed

Local feature extraction


• Local features: how to determine image patches / local regions?
‒ Dividing into patches with a regular grid: without knowledge about image content
‒ Keypoint detection or image segmentation: based on the content of the image


Common Requirements
• Problem 1:
‒ Detect the same point independently in both images

No chance to match!
We need a repeatable detector!
• Problem 2:
‒ For each point correctly recognize the corresponding one

We need a reliable and distinctive descriptor!


Slide credit: Darya Frolova, Denis Simakov

Invariance: Geometric Transformations

Slide credit: Steve Seitz



Invariance: Geometric Transformations

Levels of Geometric Invariance

Slide credit: Steve Seitz


Invariance: Photometric Transformations

• Often modeled as a linear transformation:
‒ Scaling + Offset

Slide credit: Tinne Tuytelaars


Requirements
• Region extraction needs to be repeatable and accurate
‒ Invariant to translation, rotation, scale changes
‒ Robust or covariant to out-of-plane (affine) transformations
‒ Robust to lighting variations, noise, blur, quantization

• Locality: Features are local, therefore robust to occlusion and clutter

• Quantity: We need a sufficient number of regions to cover the object

• Distinctiveness: The regions should contain “interesting” structure

• Efficiency: Close to real-time performance

Main questions
• Where will the interest points come from?
‒ What are salient features that we’ll detect in multiple
views?
• How to describe a local region?
• How to establish correspondences, i.e.,
compute matches?



Feature extraction
• Global features
• Local features
• Interest point detector
• Local descriptor
• Matching


Interest points: why and where?

Yarbus eye tracking


Source: Derek Hoiem, Computer Vision, University of Illinois.


Same image with different questions


Interest points: why and where?


• Where will the interest points come from?



Keypoint Localization
• Goals:
‒ Repeatable detection
‒ Precise localization
‒ Interesting content
 Look for two-dimensional signal changes

Slide credit: Bastian Leibe

Finding Corners

• Key property:
‒ In the region around a corner, image gradient has two or
more dominant directions
• Corners are repeatable and distinctive
C. Harris and M. Stephens. “A Combined Corner and Edge Detector.” Proceedings of the 4th Alvey Vision Conference, 1988.

Slide credit: Svetlana Lazebnik


Corners as distinctive interest points


• Design criteria
‒ We should easily recognize the point by looking through a
small window (locality)
‒ Shifting the window in any direction should give a large
change in intensity (good localization)

“flat” region: no change in all directions
“edge”: no change along the edge direction
“corner”: significant change in all directions

Slide credit: Alyosha Efros

Corners versus edges

Large, Large → Corner
Small, Large → Edge
Small, Small → Nothing (flat)

(the pair refers to the two eigenvalues of the gradient structure, introduced below)


Harris detector formulation

Change of intensity for the shift [u, v]:

$$E(u, v) = \sum_{x,y} w(x, y)\,[I(x + u, y + v) - I(x, y)]^2$$

with the window function $w(x, y)$, the shifted intensity $I(x + u, y + v)$ and the intensity $I(x, y)$.

Window function w(x, y): 1 in window, 0 outside; or a Gaussian

Source: R. Szeliski

Corner Detection by Auto-correlation

Change in appearance of window w(x, y) for shift [u, v]:

$$E(u, v) = \sum_{x,y} w(x, y)\,[I(x + u, y + v) - I(x, y)]^2$$

(figure: image I(x, y), window w(x, y), and the error surface E(u, v); E(0, 0) is the value at zero shift)


Corner Detection by Auto-correlation

Change in appearance of window w(x, y) for shift [u, v]:

$$E(u, v) = \sum_{x,y} w(x, y)\,[I(x + u, y + v) - I(x, y)]^2$$

(figure: the same surface, highlighting E(3, 2), the change for a shift of [3, 2])

Corner Detection by Auto-correlation

Change in appearance of window w(x, y) for shift [u, v]:

$$E(u, v) = \sum_{x,y} w(x, y)\,[I(x + u, y + v) - I(x, y)]^2$$

We want to discover how E behaves for small shifts.

But this is very slow to compute naively:
O(window_width² × shift_range² × image_width²)
O(11² × 11² × 600²) = 5.2 billion of these


Corner Detection by Auto-correlation

Change in appearance of window w(x, y) for shift [u, v]:

$$E(u, v) = \sum_{x,y} w(x, y)\,[I(x + u, y + v) - I(x, y)]^2$$

We want to discover how E behaves for small shifts.

But we know the response in E that we are looking for: a strong peak. → Approximation

The local quadratic approximation of E(u, v) in the neighborhood of (0, 0) is given by the second-order Taylor expansion:

$$E(u, v) \approx E(0, 0) + [u\ v] \begin{bmatrix} E_u(0, 0) \\ E_v(0, 0) \end{bmatrix} + \frac{1}{2}\,[u\ v] \begin{bmatrix} E_{uu}(0, 0) & E_{uv}(0, 0) \\ E_{uv}(0, 0) & E_{vv}(0, 0) \end{bmatrix} \begin{bmatrix} u \\ v \end{bmatrix}$$

Notation: $E_u$, $E_v$, … denote partial derivatives of E.


We are looking for a strong peak

$$E(u, v) \approx E(0, 0) + [u\ v] \begin{bmatrix} E_u(0, 0) \\ E_v(0, 0) \end{bmatrix} + \frac{1}{2}\,[u\ v] \begin{bmatrix} E_{uu}(0, 0) & E_{uv}(0, 0) \\ E_{uv}(0, 0) & E_{vv}(0, 0) \end{bmatrix} \begin{bmatrix} u \\ v \end{bmatrix}$$

Ignore the function value (set to 0); ignore the first derivatives (set to 0); just look at the shape of the second-derivative term.

Corner Detection: Mathematics

Second-order Taylor expansion of E(u, v) about (0, 0):

$$E(u, v) \approx E(0, 0) + [u\ v] \begin{bmatrix} E_u(0, 0) \\ E_v(0, 0) \end{bmatrix} + \frac{1}{2}\,[u\ v] \begin{bmatrix} E_{uu}(0, 0) & E_{uv}(0, 0) \\ E_{uv}(0, 0) & E_{vv}(0, 0) \end{bmatrix} \begin{bmatrix} u \\ v \end{bmatrix}$$

$$E_u(u, v) = \sum_{x,y} 2\,w(x, y)\,[I(x + u, y + v) - I(x, y)]\,I_x(x + u, y + v)$$

$$E_{uu}(u, v) = \sum_{x,y} 2\,w(x, y)\,I_x(x + u, y + v)\,I_x(x + u, y + v) + \sum_{x,y} 2\,w(x, y)\,[I(x + u, y + v) - I(x, y)]\,I_{xx}(x + u, y + v)$$

$$E_{uv}(u, v) = \sum_{x,y} 2\,w(x, y)\,I_y(x + u, y + v)\,I_x(x + u, y + v) + \sum_{x,y} 2\,w(x, y)\,[I(x + u, y + v) - I(x, y)]\,I_{xy}(x + u, y + v)$$


Corner Detection: Mathematics

Evaluating the expansion at (0, 0):

$$E(0, 0) = 0, \qquad E_u(0, 0) = 0, \qquad E_v(0, 0) = 0$$

$$E_{uu}(0, 0) = \sum_{x,y} 2\,w(x, y)\,I_x(x, y)\,I_x(x, y)$$

$$E_{vv}(0, 0) = \sum_{x,y} 2\,w(x, y)\,I_y(x, y)\,I_y(x, y)$$

$$E_{uv}(0, 0) = \sum_{x,y} 2\,w(x, y)\,I_x(x, y)\,I_y(x, y)$$

Corner Detection: Mathematics

$$E(u, v) = \sum_{x,y} w(x, y)\,[I(x + u, y + v) - I(x, y)]^2$$

$$E(u, v) \approx [u\ v] \begin{bmatrix} \sum_{x,y} w(x, y)\,I_x^2(x, y) & \sum_{x,y} w(x, y)\,I_x(x, y)\,I_y(x, y) \\ \sum_{x,y} w(x, y)\,I_x(x, y)\,I_y(x, y) & \sum_{x,y} w(x, y)\,I_y^2(x, y) \end{bmatrix} \begin{bmatrix} u \\ v \end{bmatrix}$$


Harris detector formulation

• This measure of change can be approximated by:

$$E(u, v) \approx [u\ v]\, M \begin{bmatrix} u \\ v \end{bmatrix}$$

where M is a 2×2 matrix computed from image derivatives (the gradient with respect to x times the gradient with respect to y), summed over the image region we are checking for a corner:

$$M = \sum_{x,y} w(x, y) \begin{bmatrix} I_x^2 & I_x I_y \\ I_x I_y & I_y^2 \end{bmatrix}$$

Slide credit: Rick Szeliski

What does this matrix reveal?

• First, let’s consider an axis-aligned corner:

$$M = \begin{bmatrix} \sum I_x^2 & \sum I_x I_y \\ \sum I_x I_y & \sum I_y^2 \end{bmatrix} = \begin{bmatrix} \lambda_1 & 0 \\ 0 & \lambda_2 \end{bmatrix}$$

(figure: image I and its derivative products Ix, Iy, IxIy)

Slide credit: David Jacobs


What does this matrix reveal?

• First, let’s consider an axis-aligned corner:

$$M = \begin{bmatrix} \sum I_x^2 & \sum I_x I_y \\ \sum I_x I_y & \sum I_y^2 \end{bmatrix} = \begin{bmatrix} \lambda_1 & 0 \\ 0 & \lambda_2 \end{bmatrix}$$

• This means:
‒ Dominant gradient directions align with the x or y axis
‒ If either λ is close to 0, then this is not a corner, so look for locations where both are large.

• What if we have a corner that is not aligned with the image axes?

Slide credit: David Jacobs

General case

• Since M is symmetric, we have the eigenvalue decomposition

$$M = R^{-1} \begin{bmatrix} \lambda_1 & 0 \\ 0 & \lambda_2 \end{bmatrix} R$$

• We can visualize M as an ellipse, with axis lengths determined by the eigenvalues and orientation determined by the rotation matrix R: the direction of the fastest change has axis length $(\lambda_{max})^{-1/2}$, the direction of the slowest change $(\lambda_{min})^{-1/2}$.

Adapted from Darya Frolova, Denis Simakov


Interpreting the eigenvalues

• Classification of image points using the eigenvalues of M:

‒ “Corner”: λ1 and λ2 are large, λ1 ~ λ2; E increases in all directions
‒ “Edge”: λ2 >> λ1 (or λ1 >> λ2)
‒ “Flat” region: λ1 and λ2 are small; E is almost constant in all directions

Slide credit: Kristen Grauman

Corner response function

$$\theta = \det(M) - \alpha\,\mathrm{trace}(M)^2 = \lambda_1 \lambda_2 - \alpha\,(\lambda_1 + \lambda_2)^2$$

‒ “Corner”: θ > 0
‒ “Edge”: θ < 0
‒ “Flat” region: |θ| small

• Fast approximation
‒ Avoids computing the eigenvalues
‒ α: constant (0.04 to 0.06)

Slide credit: Kristen Grauman


Window Function w(x, y)

$$M = \sum_{x,y} w(x, y) \begin{bmatrix} I_x^2 & I_x I_y \\ I_x I_y & I_y^2 \end{bmatrix}$$

• Option 1: uniform window
‒ Sum over a square window (w = 1 in window, 0 outside):

$$M = \sum_{x,y} \begin{bmatrix} I_x^2 & I_x I_y \\ I_x I_y & I_y^2 \end{bmatrix}$$

‒ Problem: not rotation invariant

• Option 2: smooth with a Gaussian
‒ The Gaussian already performs a weighted sum:

$$M = g(\sigma) * \begin{bmatrix} I_x^2 & I_x I_y \\ I_x I_y & I_y^2 \end{bmatrix}$$

‒ Result is rotation invariant

Slide credit: Bastian Leibe

Summary: Harris Detector [Harris88]

• Compute the second moment matrix (autocorrelation matrix):

$$M(\sigma_I, \sigma_D) = g(\sigma_I) * \begin{bmatrix} I_x^2(\sigma_D) & I_x I_y(\sigma_D) \\ I_x I_y(\sigma_D) & I_y^2(\sigma_D) \end{bmatrix}$$

1. Image derivatives Ix, Iy
2. Squares of derivatives Ix², Iy², IxIy
3. Gaussian filter g(σI): g(Ix²), g(Iy²), g(IxIy)

• Compute the corner response
4. Cornerness function (two strong eigenvalues):

$$\theta = \det[M(\sigma_I, \sigma_D)] - \alpha\,[\mathrm{trace}(M(\sigma_I, \sigma_D))]^2 = g(I_x^2)\,g(I_y^2) - [g(I_x I_y)]^2 - \alpha\,[g(I_x^2) + g(I_y^2)]^2$$

5. Perform non-maximum suppression

C. Harris and M. Stephens. “A Combined Corner and Edge Detector.” Proceedings of the 4th Alvey Vision Conference: pages 147–151, 1988. Slide credit: Krystian Mikolajczyk
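A compact NumPy/SciPy sketch following these five steps (parameter values are illustrative; OpenCV's cv2.cornerHarris wraps the same computation):

```python
import numpy as np
from scipy.ndimage import gaussian_filter, sobel, maximum_filter

def harris_corners(img, sigma_d=1.0, sigma_i=2.0, alpha=0.05):
    """Harris corner detection following the five steps above."""
    img = gaussian_filter(img.astype(float), sigma_d)
    ix, iy = sobel(img, axis=1), sobel(img, axis=0)        # 1. image derivatives
    ixx, iyy, ixy = ix * ix, iy * iy, ix * iy              # 2. squares of derivatives
    sxx = gaussian_filter(ixx, sigma_i)                    # 3. Gaussian weighting g(sigma_I)
    syy = gaussian_filter(iyy, sigma_i)
    sxy = gaussian_filter(ixy, sigma_i)
    theta = sxx * syy - sxy**2 - alpha * (sxx + syy)**2    # 4. cornerness function
    # 5. non-maximum suppression plus a relative threshold
    peaks = (theta == maximum_filter(theta, size=7)) & (theta > 0.01 * theta.max())
    return np.argwhere(peaks)                              # (row, col) corner locations
```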


Harris Detector: Workflow

Slide adapted from Darya Frolova, Denis Simakov



Harris Detector: Workflow


Compute corner responses θ

Slide adapted from Darya Frolova, Denis Simakov




Harris Detector: Workflow


Take points where θ > threshold

Slide adapted from Darya Frolova, Denis Simakov



Harris Detector: Workflow


Take only the local maxima of θ, where θ > threshold

Slide adapted from Darya Frolova, Denis Simakov




Harris Detector: Workflow


Resulting Harris points

Slide adapted from Darya Frolova, Denis Simakov



Harris Detector – Responses [Harris88]

Effect: a very precise corner detector.

Slide credit: Krystian Mikolajczyk


Harris Detector: Properties


• Translation invariance?

Slide credit: Kristen Grauman


Harris Detector: Properties


• Translation invariance
• Rotation invariance?

The ellipse rotates, but its shape (i.e., the eigenvalues) remains the same

Corner response θ is invariant to image rotation

Slide credit: Kristen Grauman


Harris Detector: Properties


• Translation invariance
• Rotation invariance
• Scale invariance?

(figure: a corner shown at a larger scale) All points will be classified as edges!
Not invariant to image scale!

Slide credit: Kristen Grauman

Scale invariance: how to?


• Exhaustive search
• Invariance
• Robustness



Exhaustive search
• Multi-scale approach

Slide adapted from T. Tuytelaars ECCV 2006 tutorial


Invariance
• Extract patch from each image individually

Slide adapted from T. Tuytelaars ECCV 2006 tutorial



Automatic scale selection

• Solution:
‒ Design a function on the region which is “scale invariant” (the same for corresponding regions, even if they are at different scales)
Example: average intensity. For corresponding regions (even of different sizes) it will be the same.
‒ For a point in one image, we can consider it as a function of region size (patch width)

(figure: f plotted against region size for Image 1 and for Image 2 at scale 1/2)

Automatic scale selection

• Common approach:
Take a local maximum of this function.
Observation: the region size for which the maximum is achieved should be invariant to image scale.
Important: this scale-invariant region size is found in each image independently!

(figure: f peaks at region size s1 in Image 1 and at s2 in Image 2, where Image 2 is at scale 1/2)


Automatic Scale Selection

$$f(I_{i_1 \ldots i_m}(x, \sigma)) = f(I_{i_1 \ldots i_m}(x', \sigma'))$$

Same operator responses if the patch contains the same image up to a scale factor.

K. Grauman, B. Leibe

Example
Function responses for increasing scale (scale signature)

$$f(I_{i_1 \ldots i_m}(x, \sigma)) \qquad f(I_{i_1 \ldots i_m}(x', \sigma'))$$

K. Grauman, B. Leibe


Scale Invariant Detection

• A “good” function for scale detection has one stable sharp peak

(figure: three response curves f vs. region size; a single sharp peak is good, flat or multi-peaked responses are bad)

• For usual images: a good function would be one which responds to contrast (sharp local intensity change)

What is a useful signature function?

• Functions for determining scale: f = Kernel * Image

Kernels:

$$L = \sigma^2\,(G_{xx}(x, y, \sigma) + G_{yy}(x, y, \sigma)) \quad \text{(Laplacian)}$$

$$DoG = G(x, y, k\sigma) - G(x, y, \sigma) \quad \text{(Difference of Gaussians)}$$

where the Gaussian is

$$G(x, y, \sigma) = \frac{1}{2\pi\sigma^2}\, e^{-\frac{x^2 + y^2}{2\sigma^2}}$$

Note: both kernels are invariant to scale and rotation
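A sketch of a scale signature built from the scale-normalized Laplacian, assuming SciPy (the point coordinates and the sigma schedule are illustrative):

```python
import numpy as np
from scipy.ndimage import gaussian_laplace

def scale_signature(img, x, y, sigmas):
    """Scale-normalized LoG response at pixel (x, y) over increasing sigma."""
    img = img.astype(float)
    return np.array([s**2 * gaussian_laplace(img, s)[y, x] for s in sigmas])

# the characteristic scale is where the magnitude of the response peaks:
# sigmas = 1.6 ** np.arange(10)
# s_star = sigmas[np.argmax(np.abs(scale_signature(img, 100, 120, sigmas)))]
```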


What is a useful signature function?


• Laplacian-of-Gaussian = “blob” detector

Source: K. Grauman, B. Leibe


Characteristic scale
• We define the characteristic scale as the scale that produces the peak of the Laplacian response

T. Lindeberg (1998). “Feature detection with automatic scale selection.” IJCV 30(2): pp. 77–116.
Source: Lana Lazebnik


Laplacian-of-Gaussian (LoG)

• Interest points: local maxima in the scale space of $L_{xx}(\sigma) + L_{yy}(\sigma)$ → a list of $(x, y, \sigma)$

Source: K. Grauman, B. Leibe

Example: Scale-space blob detector

Source: Lana Lazebnik


Example: Scale-space blob detector

Source: Lana Lazebnik

Example: Scale-space blob detector

Source: Lana Lazebnik




Alternative approach
Approximate LoG with Difference-of-Gaussian (DoG).

Source: Ruye Wang

Alternative approach
• Approximate LoG with Difference-of-Gaussian (DoG):
‒ 1. Blur the image with a σ Gaussian kernel
‒ 2. Blur the image with a kσ Gaussian kernel
‒ 3. Subtract the result of 1. from the result of 2.
Small k gives a closer approximation to LoG, but usually we want to build a scale space quickly out of this; k = 1.6 gives an appropriate scale space (k = sqrt(2) is also used). A sketch of this construction follows.

Source: K. Grauman, B. Leibe
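A minimal DoG sketch with SciPy, following the three steps above (the sigma and k values are illustrative):

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def dog(img, sigma=1.6, k=1.6):
    """Difference of Gaussians: blur with k*sigma and with sigma, then subtract."""
    img = img.astype(float)
    return gaussian_filter(img, k * sigma) - gaussian_filter(img, sigma)

# a DoG stack over one octave (illustrative):
# stack = np.dstack([dog(img, sigma=1.6 * 1.6**i) for i in range(4)])
```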


Find local maxima in the position-scale space of DoG

(figure: from the input image, a stack of DoG images at scales σ, kσ, k²σ, … is built by subtracting adjacent blurred images; finding maxima across position and scale yields a list of (x, y, s))

Harris-Laplacian
• Harris-Laplacian¹: find the local maximum of:
‒ the Harris corner detector in space (image coordinates x, y)
‒ the Laplacian in scale

¹ K. Mikolajczyk, C. Schmid. “Indexing Based on Scale Invariant Interest Points.” ICCV 2001
² D. Lowe. “Distinctive Image Features from Scale-Invariant Keypoints.” IJCV 2004


Scale Invariant Detectors

• Harris-Laplacian¹: find the local maximum of:
‒ the Harris corner detector in space (image coordinates x, y)
‒ the Laplacian in scale

• SIFT (D. Lowe)²: find the local maximum of:
‒ the Difference of Gaussians in space and scale

¹ K. Mikolajczyk, C. Schmid. “Indexing Based on Scale Invariant Interest Points.” ICCV 2001
² D. Lowe. “Distinctive Image Features from Scale-Invariant Keypoints.” IJCV 2004

DoG (SIFT) keypoint Detector


• DoG at multiple octaves
• Extrema detection in scale space
• Keypoint localization
‒ Interpolation
‒ Removing unstable points
• Orientation assignment


DoG (SIFT) Detector

• DoG at multiple octaves:

$$D(\sigma) = (G(x, y, k\sigma) - G(x, y, \sigma)) * I$$

computed from the stack of blurred images $G(k^2\sigma) * I$, $G(k\sigma) * I$, $G(\sigma) * I$, …

DoG (SIFT) Detector

• Scale-Space Extrema: choose all extrema within a 3x3x3 neighborhood across the DoG levels $D(k^2\sigma)$, $D(k\sigma)$, $D(\sigma)$

X is selected if it is larger or smaller than all 26 neighbors (its 8 neighbors in the current image and 9 neighbors each in the scales above and below)


DoG (SIFT) Detector

• Orientation assignment
‒ Create histogram of local
gradient directions at selected
scale
‒ Assign canonical orientation at
peak of smoothed histogram
• Each keypoint specifies stable 2D coordinates (x, y, scale, orientation)
If there are 2 major orientations, use both.

Example of keypoint detection


(a) 233x189 image
(b) 832 DoG extrema
(c) 729 left after peak value threshold
(d) 536 left after testing the ratio of principal curvatures (removing edge responses)


DoG (SIFT) Detector

A SIFT keypoint : {x, y, scale, dominant orientation}

Source: Distinctive Image Features from Scale-Invariant Keypoints – IJCV 2004
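Detecting such keypoints is a one-liner in OpenCV builds that include SIFT (cv2.SIFT_create in recent versions); the input file name below is hypothetical:

```python
import cv2

img = cv2.imread("scene.jpg", cv2.IMREAD_GRAYSCALE)   # hypothetical input file
sift = cv2.SIFT_create()
keypoints, descriptors = sift.detectAndCompute(img, None)
kp = keypoints[0]
# each keypoint carries exactly the tuple above:
print(kp.pt, kp.size, kp.angle)        # (x, y), scale, dominant orientation
print(descriptors.shape)               # N x 128 descriptor matrix
```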



Scale Invariant Detectors


• Experimental evaluation of detectors w.r.t. scale
change
Repeatability rate = # correspondences / # possible correspondences

K.Mikolajczyk, C.Schmid. “Indexing Based on Scale Invariant Interest Points”. ICCV 2001

Slide credit: CS131 -Juan Carlos Niebles and Ranjay Krishna


Many existing detectors available


• Hessian & Harris [Beaudet ‘78], [Harris ‘88]

• Laplacian, DoG [Lindeberg ‘98], [Lowe ‘99]

• Harris-/Hessian-Laplace [Mikolajczyk & Schmid ‘01]

• Harris-/Hessian-Affine [Mikolajczyk & Schmid ‘04]

• EBR and IBR [Tuytelaars & Van Gool ‘04]

• MSER [Matas ‘02]

• Salient Regions [Kadir & Brady ‘01]

• Others…

• Those detectors have become a basic building block for many recent applications in Computer Vision.

Slide credit: Bastian Leibe

Feature extraction
• Global features
• Local features
• Interest point detector
• Local descriptor
• Matching



Local Descriptor
• Compact, good representation for local information

• Invariant to
‒ Geometric transformations: rotation, translation, scaling, …
‒ Camera viewpoint change
‒ Illumination
• Examples
‒ SIFT, SURF (Speeded Up Robust Features), PCA-SIFT, …
‒ LBP, BRISK, MSER and FREAK, …

Invariant local features


• Image content is transformed into local feature coordinates that
are invariant to translation, rotation, scale, and other imaging
parameters

Following slides credit: CVPR 2003 Tutorial on Recognition and Matching Based on Local Invariant Features David Lowe



Advantages of invariant local features


• Locality:
‒ features are local, so robust to occlusion and clutter (no prior
segmentation)
• Distinctiveness:
‒ individual features can be matched to a large database of objects
• Quantity:
‒ many features can be generated for even small objects
• Efficiency:
‒ close to real-time performance
• Extensibility:
‒ can easily be extended to wide range of differing feature types,
with each adding robustness

Becoming rotation invariant


• We are given a keypoint and its scale from DoG

• We will select a characteristic orientation for the keypoint (based on the most prominent gradient there)

• We will describe all features relative to this orientation


SIFT descriptor formation

• Use the blurred image associated with the keypoint’s scale
• Take image gradients over the keypoint neighborhood
• To become rotation invariant, rotate the gradient directions AND locations by (−keypoint orientation)
‒ Now we’ve cancelled out rotation and have gradients expressed at locations relative to the keypoint orientation θ
‒ We could also have just rotated the whole image by −θ, but that would be slower

Source: Distinctive Image Features from Scale-Invariant Keypoints – IJCV 2004
http://campar.in.tum.de/twiki/pub/Chair/TeachingWs13TDCV/feature_descriptors.pdf


SIFT descriptor formation

• Using precise gradient locations is fragile. We’d like to allow some “slop” in the image and still produce a very similar descriptor
• Using a Gaussian filter: to avoid sudden changes in the descriptor with small changes in the position of the window, and to give less emphasis to gradients that are far from the center of the descriptor, as these are most affected by misregistration errors
• Create an array of orientation histograms (a 4x4 array is shown; each histogram covers orientations from 0 to 2π)

SIFT: Distinctive Image Features from Scale-Invariant Keypoints – IJCV 2004
Image: Ashish A Gupta, PhD thesis 2013


SIFT descriptor formation

• Put the rotated gradients into their local orientation histograms
‒ A local orientation histogram has n bins (e.g., 8, covering 0 to 2π)
‒ To avoid boundary effects, in which the descriptor abruptly changes as a sample shifts smoothly from one histogram to another or from one orientation to another:
• interpolation is used to distribute the value of each gradient sample into adjacent histogram bins
• gradient contributions are also scaled down for gradients far from the bin center, with a weight of 1 − d, where d is the distance of the sample from the central value of the bin
• The SIFT authors found that the best results were obtained with 8 orientation bins per histogram and a 4x4 histogram array → a SIFT descriptor is a vector of 128 values

Image: Ashish A Gupta, PhD thesis 2013


SIFT descriptor formation

• Adding robustness to illumination changes:
‒ The descriptor is made of gradients (differences between pixels)
• → already invariant to changes in brightness (e.g., adding 10 to all image pixels yields the exact same descriptor)
‒ A higher-contrast photo will increase the magnitude of gradients linearly
• → to correct for contrast changes, normalize the vector (scale to length 1.0)
‒ Very large image gradients are usually from unreliable 3D illumination effects (glare, etc.)
• → to reduce their effect, clamp all values in the vector to be ≤ 0.2 (an experimentally tuned value), then normalize the vector again
→ The result is a vector which is fairly invariant to illumination changes

SIFT: Distinctive Image Features from Scale-Invariant Keypoints – IJCV 2004
Image: Ashish A Gupta, PhD thesis 2013
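These three normalization steps transcribe directly into NumPy (a sketch; v is assumed to be a raw 128-D descriptor):

```python
import numpy as np

def normalize_sift(v, clamp=0.2):
    """SIFT illumination normalization: unit length, clamp, renormalize."""
    v = v / np.linalg.norm(v)      # correct for linear contrast changes
    v = np.minimum(v, clamp)       # suppress unreliable large gradients (glare, etc.)
    return v / np.linalg.norm(v)   # renormalize to unit length
```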


SIFT
• Extraordinarily robust matching technique
‒ Can handle changes in viewpoint: up to about 60 degrees of out-of-plane rotation
‒ Can handle significant changes in illumination
• Sometimes even day vs. night (below)
‒ Fast and efficient: can run in real time

Steve Seitz

Sensitivity to number of histogram orientations



Feature stability to noise

• Match features after random change in image scale & orientation,


with differing levels of image noise
• Find nearest neighbor in database of 30,000 features

David G. Lowe, "Distinctive image features from scale-invariant keypoints," IJCV, 60, 2 (2004), pp. 91-110


Feature stability to affine change

• Match features after random change in image scale & orientation,


with 2% image noise, and affine distortion
• Find nearest neighbor in database of 30,000 features

68
4/25/2023

Distinctiveness of features

• Vary size of database of features, with 30 degree affine change,


2% image noise
• Measure % correct for single nearest neighbor match

SIFT Keypoint Descriptor: summary

1. Blur the image using the scale of the keypoint (scale invariance)
2. Compute gradients with respect to the keypoint orientation (rotation invariance)
3. Compute orientation histograms in 8 directions over 4x4 sample regions

Source: Distinctive Image Features from Scale-Invariant Keypoints – IJCV 2004
http://campar.in.tum.de/twiki/pub/Chair/TeachingWs13TDCV/feature_descriptors.pdf


Other detectors and descriptors

Popular features: SURF, HOG, SIFT
http://campar.in.tum.de/twiki/pub/Chair/TeachingWs13TDCV/feature_descriptors.pdf

Summary of some local features:
http://www.cse.iitm.ac.in/~vplab/courses/CV_DIP/PDF/Feature_Detectors_and_Descriptors.pdf

Feature extraction
• Global features
• Local features
• Interest point detector
• Local descriptor
• Matching



Feature matching
Given a feature in I1, how to find the best match in I2?
1. Define distance function that compares two descriptors
• Use L1, L2, cosine, Mahalanobis,… distance

2. Test all the features in I2, find the one with min distance

OpenCV (a minimal brute-force sketch follows):
- Brute force matching
- FLANN matching: Fast Library for Approximate Nearest Neighbors [Muja and Lowe, 2009]

Marius Muja and David G. Lowe. Fast approximate nearest neighbors with automatic algorithm configuration. In VISAPP (1), pages 331–340, 2009
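A brute-force matching sketch with OpenCV (the file names are hypothetical):

```python
import cv2

img1 = cv2.imread("query.jpg", cv2.IMREAD_GRAYSCALE)   # hypothetical input files
img2 = cv2.imread("scene.jpg", cv2.IMREAD_GRAYSCALE)
sift = cv2.SIFT_create()
kp1, des1 = sift.detectAndCompute(img1, None)
kp2, des2 = sift.detectAndCompute(img2, None)

# L2 distance on SIFT descriptors; crossCheck keeps only mutual best matches
bf = cv2.BFMatcher(cv2.NORM_L2, crossCheck=True)
matches = sorted(bf.match(des1, des2), key=lambda m: m.distance)
vis = cv2.drawMatches(img1, kp1, img2, kp2, matches[:30], None)
```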


Feature matching
• How to define the difference between two features f1, f2?
‒ Simple approach: use only the distance value d(f1, f2)
• → can give a good score to very ambiguous matches

‒ Better approaches: add additional constraints
• Ratio of distances
• Spatial constraints between neighboring pixels
• Fitting the transformation, then refining the matches (RANSAC)


Feature matching
• Simple approach: use the distance value d(f1, f2)
 can give a good score to very ambiguous matches

(figure: feature f1 in image I1 and candidate match f2 in image I2)

Feature matching
• Better approach: ratio of distances = d(f1, f2) / d(f1, f2')
‒ f2 is the best match to f1 in I2
‒ f2' is the 2nd-best (SSD) match to f1 in I2
‒ An ambiguous/bad match will have a ratio close to 1
‒ Look for unique matches, which have a low ratio (a sketch follows)
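Lowe's ratio test in OpenCV, reusing des1/des2 from the previous sketch (0.8 is the threshold suggested in the SIFT paper):

```python
bf = cv2.BFMatcher(cv2.NORM_L2)
# keep a match only if it is clearly better than the second-best candidate
good = [m for m, n in bf.knnMatch(des1, des2, k=2)
        if m.distance < 0.8 * n.distance]
```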


Ratio of distances is reliable for matching

David G. Lowe, "Distinctive image features from scale-invariant keypoints," IJCV, 60, 2 (2004), pp. 91-110

Feature matching
• Better approach: spatial constraints between neighboring pixels

Source: from slides of Valérie Gouet-Brunet


Feature matching
• Better approach: fitting the transformation (RANSAC algorithm)
‒ Fitting a 2D affine transformation matrix
• Six variables
– each point correspondence gives two equations
– → at least three points are needed
• Least squares
‒ RANSAC: refinement of matches
• compute the error of each match under the fitted transformation and keep the inliers (a sketch follows)
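A RANSAC fit of the six-variable affine model with OpenCV, reusing kp1/kp2 and the good matches from the ratio test (the reprojection threshold is illustrative):

```python
import numpy as np

src = np.float32([kp1[m.queryIdx].pt for m in good])
dst = np.float32([kp2[m.trainIdx].pt for m in good])
# 2x3 affine matrix A estimated with RANSAC; mask flags the inlier matches
A, mask = cv2.estimateAffine2D(src, dst, method=cv2.RANSAC,
                               ransacReprojThreshold=5.0)
inliers = [m for m, keep in zip(good, mask.ravel()) if keep]   # refined matches
```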

Evaluating the results


How can we measure the performance of a feature matcher?

(figure: three candidate matches with feature distances 50, 75, and 200; the best match is the one with the smallest feature distance)


True/false positives
(figure: the matches at feature distances 50 and 75 are true matches; the match at distance 200 is a false match)

The distance threshold affects performance
‒ True positives = # of detected matches that are correct
• Suppose we want to maximize these: how to choose the threshold?
‒ False positives = # of detected matches that are incorrect
• Suppose we want to minimize these: how to choose the threshold?

Image matching
• How to define the distance between 2 images I1, I2?
‒ Using global features: easy
d(I1, I2) = d(feature of I1, feature of I2)

‒ Using local features:
• Voting strategy
• Solving an optimization problem (time consuming)
• Building a “global” feature from local features: BoW (bag-of-words, bag-of-features), VLAD, …


Voting strategy
(figure: a region selected in an input image is queried against images in a database; the similarity between 2 images is based on the number of matches)

Source: modified from slides of Valérie Gouet-Brunet

Optimization problem
• Transportation problem (Earth Mover's Distance)

I1: {(r_i, w_i), i = 1..N} (provider)
I2: {(r'_j, w_j), j = 1..M} (consumer)
d(I1, I2) = ?

$$d(I_1, I_2) = \min \sum_i \sum_j f_{ij}\, d(r_i, r'_j)$$

subject to

$$f_{ij} \ge 0, \qquad \sum_i f_{ij} \le w_j, \qquad \sum_j f_{ij} \le w_i, \qquad \sum_i \sum_j f_{ij} = \min\Big(\sum_i w_i, \sum_j w_j\Big)$$

With the optimal flow $f^*$, the distance is the normalized cost:

$$d_{EMD}(I_1, I_2) = \frac{\sum_i \sum_j f_{ij}^*\, d(r_i, r'_j)}{\sum_i \sum_j f_{ij}^*}$$

http://vellum.cz/~mikc/oss-projects/CarRecognition/doc/dp/node29.html


Bag-of-words
• Local feature ~~ a word
• An image ~~ a document
• Apply a technique for textual document representation:
vector model


Visual Vocabulary

1. Extract local features from a set of images
2. Build a visual vocabulary (dictionary) using a clustering method
3. An image is represented by a bag of words → it can be represented by a tf.idf vector


Bag of words: outline


1. Extract features
2. Learn “visual vocabulary”
3. Quantize features using visual vocabulary
4. Represent images by frequencies of “visual words” (a sketch follows)
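A compact sketch of this pipeline with SciPy k-means, assuming train_descriptor_list is a list of per-image SIFT descriptor arrays (the name and vocabulary size are illustrative):

```python
import numpy as np
from scipy.cluster.vq import kmeans2, vq

# steps 1-2: learn the visual vocabulary by clustering all training descriptors
all_desc = np.vstack(train_descriptor_list).astype(float)   # N x 128
vocab, _ = kmeans2(all_desc, k=500, minit='++')             # 500 visual words

def bow_histogram(desc, vocab):
    """Steps 3-4: quantize descriptors to visual words and count frequencies."""
    words, _ = vq(desc.astype(float), vocab)
    hist, _ = np.histogram(words, bins=np.arange(len(vocab) + 1))
    return hist / hist.sum()
```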


Applications



Object detection/recognition/search

Sivic and Zisserman, 2003

Lowe 2002
Rothganger et al. 2003

Object detection/recognition



Application: Image Panoramas

Slide credit: Darya Frolova, Denis Simakov



Application: Image Panoramas

Slide credit: Darya Frolova, Denis Simakov

• Procedure:
‒ Detect feature points in both images
‒ Find corresponding pairs
‒ Use these pairs to align the images


Automatic mosaicing

http://www.cs.ubc.ca/~mbrown/autostitch/autostitch.html


Wide baseline stereo

[Image from T. Tuytelaars ECCV 2006 tutorial]



CBIR (Content-based image retrieval)


CBIR: partial retrieval

Source: http://www-rocq.inria.fr/imedia


CBIR: BoW with SIFT + histogram

(figure: system pipeline; translated labels: SIFT feature set; 20 images/class; 5 images/class; image set stored in the DB; visual dictionary; compute the weight vector for each image; ~80 images/class; ~60 images/class; database of SIFT image features represented as weight vectors; test image set)

Source: graduation thesis (ĐATN) – Phạm Xuân Trường K52 - BK

CBIR: BoW with SIFT + histogram

Source: graduation thesis – Phạm Xuân Trường K52 - BK


CBIR: BoW with SIFT + histogram

Source: graduation thesis – Phạm Xuân Trường K52 - BK

References
• Lectures 5, 6: CS131 - Juan Carlos Niebles and Ranjay Krishna, Stanford Vision and Learning Lab
• Vision par Ordinateur, Alain Boucher, IFI


Thank you for your attention!