06 Features
Lecture 6
Outline
• Features introduction
• Key-points
• Histogram of Oriented gradients (HOG)
• Scale-Invariant Feature Transform (SIFT)
What is a Feature?
• Information extracted from an image/video.
• Hand-crafted
• Learned
• We can define a function
• Takes an image/video as an input
• Produces one or more numbers as output
• Hand-crafted features
• Feature engineering
• Learned features
• Automatically learned
Types of Features
• Global features
• Extracted from the entire image
• Examples: template (the image itself), HOG, etc.
• Region-based features
• Extracted from a smaller window of the image.
• Applying global method to a specific image region.
• Local features
• Describe a pixel, and the vicinity around a specific pixel.
• A local feature always refers to a specific pixel location.
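As a toy illustration of a feature as a function from an image to one or more numbers, the sketch below defines a hypothetical hand-crafted feature (mean, minimum, and maximum intensity) and applies it once globally and once to a region. The function names and the tiny image are assumptions made for illustration only.

```python
def intensity_feature(image):
    """Toy hand-crafted feature: (mean, min, max) intensity of the input."""
    values = [v for row in image for v in row]
    return (sum(values) / len(values), min(values), max(values))

def crop(image, top, left, height, width):
    """Return the sub-window of the image starting at (top, left)."""
    return [row[left:left + width] for row in image[top:top + height]]

image = [[0, 0, 9, 9],
         [0, 0, 9, 9],
         [3, 3, 6, 6],
         [3, 3, 6, 6]]

# Global feature: computed from the entire image.
global_feature = intensity_feature(image)
# Region-based feature: the same global method applied to a smaller window.
region_feature = intensity_feature(crop(image, 0, 2, 2, 2))
print(global_feature, region_feature)
```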
Uses of Features
• Features can be used for many computer vision problems.
• Detection.
• Recognition.
• Tracking.
• Stereo estimation.
• Different problems call for different types of features.
• Different features make different assumptions about the images.
• That is why there are many different types of features.
Uses of Features: Matching
Kristen Grauman
Compactness and Efficiency
• We want the representation to be as small and as fast as possible
– Much smaller than a whole image
2. Define a region around each key-point
3. Extract and normalize the region content
4. Compute a local descriptor from the normalized region
5. Match local descriptors: $d(f_A, f_B) < T$
[Figure: matching key-point regions in two images, with descriptors $f_A$ and $f_B$]
K. Grauman, B. Leibe
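The matching step can be sketched as follows. This is a minimal illustration, assuming Euclidean distance for d and a nearest-neighbor search; the slides only state that a pair is accepted when the descriptor distance falls below a threshold T. The function names and example descriptors are hypothetical.

```python
import math

def euclidean(fa, fb):
    """Euclidean distance d(f_A, f_B) between two descriptors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(fa, fb)))

def match_descriptors(desc_a, desc_b, threshold):
    """For each descriptor in desc_a, find its nearest neighbor in desc_b.

    Returns (index_a, index_b, distance) for pairs with distance < threshold.
    """
    matches = []
    for i, fa in enumerate(desc_a):
        dists = [euclidean(fa, fb) for fb in desc_b]
        j = min(range(len(dists)), key=dists.__getitem__)
        if dists[j] < threshold:
            matches.append((i, j, dists[j]))
    return matches

desc_a = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [5.0, 5.0, 5.0]]
desc_b = [[0.0, 0.9, 0.1], [1.0, 0.1, 0.0]]
matches = match_descriptors(desc_a, desc_b, threshold=0.5)
print(matches)
```

The third descriptor in `desc_a` has no close counterpart, so the threshold rejects it.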
• A corner point can be recognized in a window
• Shifting a window in any direction should give a large change in intensity
• LOCALIZING and UNDERSTANDING shapes…
Basic Idea in Corner Detection
• Recognize corners by looking at a small window.
• Shifting the window in any direction should give a large change in intensity.
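$$
f = \begin{bmatrix} f_1 & f_2 & f_3 \\ f_4 & f_5 & f_6 \\ f_7 & f_8 & f_9 \end{bmatrix} \text{ (image)}, \qquad
h = \begin{bmatrix} h_1 & h_2 & h_3 \\ h_4 & h_5 & h_6 \\ h_7 & h_8 & h_9 \end{bmatrix} \text{ (kernel)}
$$
$$
f \otimes h = f_1 h_1 + f_2 h_2 + f_3 h_3 + f_4 h_4 + f_5 h_5 + f_6 h_6 + f_7 h_7 + f_8 h_8 + f_9 h_9
$$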
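The sum of elementwise products above can be sketched in a few lines. Plain Python lists are used so the example runs without dependencies; the function names and the box kernel are assumptions chosen for illustration.

```python
def correlate_window(f, h):
    """f1*h1 + f2*h2 + ... for an image window f and a kernel h of equal size."""
    return sum(fv * hv
               for f_row, h_row in zip(f, h)
               for fv, hv in zip(f_row, h_row))

def correlate(image, kernel):
    """Apply the kernel at every position where it fits entirely inside the image."""
    kh, kw = len(kernel), len(kernel[0])
    out = []
    for r in range(len(image) - kh + 1):
        row = []
        for c in range(len(image[0]) - kw + 1):
            window = [img_row[c:c + kw] for img_row in image[r:r + kh]]
            row.append(correlate_window(window, kernel))
        out.append(row)
    return out

image = [[1, 2, 3, 4],
         [5, 6, 7, 8],
         [9, 10, 11, 12],
         [13, 14, 15, 16]]
box = [[1, 1, 1], [1, 1, 1], [1, 1, 1]]  # hypothetical 3x3 kernel of ones
result = correlate(image, box)
print(result)
```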
Corner Detection by Auto-correlation
Change in appearance of window w(x,y) for shift [u,v]:
$$E(u,v) = \sum_{x,y} w(x,y)\,\big[I(x+u,\,y+v) - I(x,y)\big]^2$$
[Figure: image I(x, y) with window w(x, y), next to the error surface E(u, v); the points E(0,0) and E(3,2) are marked for the unshifted and the shifted window.]
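A direct way to see how E(u, v) behaves is to evaluate it on small synthetic patches. The sketch below assumes the usual sum-of-squared-differences form, E(u, v) = Σ w(x, y) [I(x+u, y+v) − I(x, y)]², with w equal to 1 inside a square window and 0 outside. The three test images (flat, edge, corner) are hypothetical.

```python
def ssd_error(image, top, left, size, u, v):
    """E(u, v) for a size x size window whose top-left pixel is (top, left).

    u shifts along x (columns), v shifts along y (rows); w = 1 inside the window.
    """
    total = 0
    for y in range(top, top + size):
        for x in range(left, left + size):
            diff = image[y + v][x + u] - image[y][x]
            total += diff * diff
    return total

flat = [[5] * 8 for _ in range(8)]
edge = [[0] * 4 + [10] * 4 for _ in range(8)]            # vertical edge
corner = [[10 if (x >= 4 and y >= 4) else 0 for x in range(8)]
          for y in range(8)]

for name, img in (("flat", flat), ("edge", edge), ("corner", corner)):
    e_x = ssd_error(img, 2, 2, 4, 1, 0)   # shift along x
    e_y = ssd_error(img, 2, 2, 4, 0, 1)   # shift along y
    print(name, e_x, e_y)
```

Only the corner patch changes for shifts in both directions; the edge patch changes only when shifting across the edge, and the flat patch never changes.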
Corner detection
Three different cases
As a surface
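Corner Detection by Auto-correlation
Change in appearance of window w(x,y) for shift [u,v], approximated for small shifts by a first-order Taylor expansion of I:
$$E(u,v) \approx \sum_{x,y} w(x,y)\,\big[I_x u + I_y v\big]^2, \qquad I_x = \frac{\partial I}{\partial x},\; I_y = \frac{\partial I}{\partial y}$$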
Recall: Taylor series expansion
• A function f can be represented by an infinite series of its derivatives at a single point a:
$$f(x) = \sum_{n=0}^{\infty} \frac{f^{(n)}(a)}{n!}\,(x-a)^n$$
[Figure (Wikipedia): approximation of $f(x) = e^x$ centered at $x = 0$]
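As a quick numerical check of the expansion, the sketch below sums the first terms of the Taylor series of e^x around a = 0, where every derivative equals 1, so the n-th term is x^n / n!. The function name is a placeholder.

```python
import math

def taylor_exp(x, n_terms):
    """Partial sum of the Taylor series of e^x around a = 0."""
    return sum(x ** n / math.factorial(n) for n in range(n_terms))

# More terms give a better approximation of e^1.
for n_terms in (1, 2, 4, 8, 12):
    approx = taylor_exp(1.0, n_terms)
    print(n_terms, approx, abs(approx - math.e))
```

Corner detection keeps only the first-order terms, which is accurate only for small shifts.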
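Corner Detection by Auto-correlation
Change in appearance of window w(x,y) for shift [u,v]:
$$E(u,v) \approx \begin{bmatrix} u & v \end{bmatrix} M \begin{bmatrix} u \\ v \end{bmatrix}, \qquad M = \sum_{x,y} w(x,y) \begin{bmatrix} I_x^2 & I_x I_y \\ I_x I_y & I_y^2 \end{bmatrix}$$
Notation: $I_x \Leftrightarrow \frac{\partial I}{\partial x}$, $I_y \Leftrightarrow \frac{\partial I}{\partial y}$, $I_x I_y \Leftrightarrow \frac{\partial I}{\partial x}\frac{\partial I}{\partial y}$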
James Hays
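The quadratic approximation can be checked numerically. The sketch below builds M for one window from per-pixel derivatives and compares [u v] M [u v]^T with the exact sum of squared differences. Central differences and a uniform window (w = 1) are assumptions; on the hypothetical linear ramp image used here the approximation happens to be exact.

```python
def second_moment_matrix(image, top, left, size):
    """M = sum over the window of [[Ix^2, Ix*Iy], [Ix*Iy, Iy^2]], with w = 1."""
    sxx = sxy = syy = 0.0
    for y in range(top, top + size):
        for x in range(left, left + size):
            ix = (image[y][x + 1] - image[y][x - 1]) / 2.0  # central difference
            iy = (image[y + 1][x] - image[y - 1][x]) / 2.0
            sxx += ix * ix
            sxy += ix * iy
            syy += iy * iy
    return [[sxx, sxy], [sxy, syy]]

def quadratic_form(m, u, v):
    """[u v] M [u v]^T."""
    return m[0][0] * u * u + 2.0 * m[0][1] * u * v + m[1][1] * v * v

def exact_error(image, top, left, size, u, v):
    """Sum of squared differences between the shifted and unshifted window."""
    return sum((image[y + v][x + u] - image[y][x]) ** 2
               for y in range(top, top + size)
               for x in range(left, left + size))

ramp = [[2 * x + 3 * y for x in range(8)] for y in range(8)]
m = second_moment_matrix(ramp, 2, 2, 4)
approx = quadratic_form(m, 1, 1)
exact = exact_error(ramp, 2, 2, 4, 1, 1)
print(m, approx, exact)
```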
Harris corner detection
1) Compute M matrix for each window to recover a cornerness score 𝐶.
Note: We can find M purely from the per-pixel image derivatives!
2) Threshold to find pixels which give large corner response 𝐶 > threshold.
3) Find the local maxima pixels, i.e., non-maximal suppression.
C. Harris and M. Stephens. "A Combined Corner and Edge Detector." Proceedings of the 4th Alvey Vision Conference, pages 147-151, 1988.
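We want to compute M at each pixel.
0. Input image 𝐼.
1. Compute image derivatives 𝐼𝑥 and 𝐼𝑦 (optionally, blur first).
2. Compute 𝑀 components as squares and products of derivatives (𝐼𝑥², 𝐼𝑦², 𝐼𝑥𝐼𝑦).
[Figure: input image 𝐼, derivative images 𝐼𝑥 and 𝐼𝑦, and a panel labeled 𝑅]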
Harris Detector: Steps
• Compute corner response 𝐶
• Find points with large corner response: 𝐶 > threshold
• Take only the points of local maxima of 𝐶
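The steps above can be sketched end to end on a tiny synthetic image. The slides do not give the formula for C, so this sketch assumes the commonly used response C = det(M) − k·trace(M)² with k = 0.05, central-difference derivatives, a uniform 3x3 window, and 3x3 non-maximal suppression. All names, parameter values, and the test image are illustrative.

```python
def harris_response(image, radius=1, k=0.05):
    """Corner response C = det(M) - k * trace(M)^2 at every pixel (assumed form)."""
    h, w = len(image), len(image[0])
    ix = [[0.0] * w for _ in range(h)]
    iy = [[0.0] * w for _ in range(h)]
    for y in range(1, h - 1):              # derivatives; border pixels stay 0
        for x in range(1, w - 1):
            ix[y][x] = (image[y][x + 1] - image[y][x - 1]) / 2.0
            iy[y][x] = (image[y + 1][x] - image[y - 1][x]) / 2.0
    c = [[0.0] * w for _ in range(h)]
    for y in range(radius, h - radius):
        for x in range(radius, w - radius):
            sxx = sxy = syy = 0.0          # components of M over the window
            for dy in range(-radius, radius + 1):
                for dx in range(-radius, radius + 1):
                    gx, gy = ix[y + dy][x + dx], iy[y + dy][x + dx]
                    sxx += gx * gx
                    sxy += gx * gy
                    syy += gy * gy
            c[y][x] = (sxx * syy - sxy * sxy) - k * (sxx + syy) ** 2
    return c

def detect_corners(c, threshold):
    """Keep pixels with C > threshold that are maxima of their 3x3 neighborhood."""
    corners = []
    for y in range(1, len(c) - 1):
        for x in range(1, len(c[0]) - 1):
            if c[y][x] <= threshold:
                continue
            neighborhood = [c[y + dy][x + dx]
                            for dy in (-1, 0, 1) for dx in (-1, 0, 1)]
            if c[y][x] == max(neighborhood):
                corners.append((y, x))
    return corners

# Bright quadrant on a dark background: a single corner near (5, 5).
image = [[10 if (x >= 5 and y >= 5) else 0 for x in range(10)]
         for y in range(10)]
response = harris_response(image)
corners = detect_corners(response, threshold=1000.0)
print(corners)
```

With this response, flat regions score zero, edges score negative, and only the corner scores strongly positive.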
Histogram of Oriented Gradients
Edges
HOG: Human Detection
Image Histogram - revisit
image:
0 1 1 2 4 6
2 1 0 0 2 4
5 2 0 0 4 2
1 1 2 4 1 0
[Figure: image histogram over intensity values 0 to 6]
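For the 4x6 example image above, the histogram simply counts how many pixels take each intensity value. A minimal sketch:

```python
from collections import Counter

image = [[0, 1, 1, 2, 4, 6],
         [2, 1, 0, 0, 2, 4],
         [5, 2, 0, 0, 4, 2],
         [1, 1, 2, 4, 1, 0]]

counts = Counter(v for row in image for v in row)
# One bin per intensity value 0..6.
histogram = [counts[v] for v in range(7)]
print(histogram)  # [6, 6, 6, 0, 4, 1, 1]
```

The bins sum to 24, the number of pixels in the image.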
Histograms of Oriented Gradients
• Given an image I, and a pixel location (i,j).
• We want to compute the HOG feature for that pixel.
• The main operations can be described as a sequence of five steps.
[Figure: image with pixel (i,j) marked and the block around it]
• Step 1: Extract a square window (called “block”) of some size.
• Step 2: Divide block into a square grid of sub-blocks (called “cells”) (2x2 grid in our example, resulting in four cells).
• Step 3: Compute orientation histogram of each cell.
• Step 4: Concatenate the four histograms.
• Step 5: Let vector v be the concatenation of the four histograms from step 4. Normalize v. Here we have three options for how to do it:
• Option 1: Divide v by its Euclidean norm.
• Option 2: Divide v by its L1 norm (the L1 norm is the sum of all absolute values of v).
• Option 3:
• Divide v by its Euclidean norm.
• In the resulting vector, clip any value over 0.2.
• Then, renormalize the resulting vector by dividing again by its Euclidean norm.
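The three normalization options can be sketched as small functions. Returning an all-zero vector unchanged is an assumption added here to avoid division by zero; the function names and the short example vector are hypothetical (a real HOG vector would have 36 entries).

```python
import math

def l2_normalize(v):
    """Option 1: divide v by its Euclidean norm."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v] if norm > 0 else list(v)

def l1_normalize(v):
    """Option 2: divide v by its L1 norm (sum of absolute values)."""
    norm = sum(abs(x) for x in v)
    return [x / norm for x in v] if norm > 0 else list(v)

def l2_clip_normalize(v, clip=0.2):
    """Option 3: L2-normalize, clip values over `clip`, then L2-normalize again."""
    clipped = [min(x, clip) for x in l2_normalize(v)]
    return l2_normalize(clipped)

v = [3.0, 4.0, 0.0, 0.0]
opt1 = l2_normalize(v)
opt2 = l1_normalize(v)
opt3 = l2_clip_normalize(v)
print(opt1, opt2, opt3)
```

Option 3 limits the influence of a few very strong gradients: in the example both non-zero entries are clipped to 0.2 and end up equal after renormalization.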
Histogram of Oriented Gradients
Image gradients: $\frac{d}{dx} I$ and $\frac{d}{dy} I$
Histogram of Oriented Gradients
Summary of HOG Computation
• Step 1: Extract a square window (called “block”) of some size around the pixel location of interest.
• Step 2: Divide block into a square grid of sub-blocks (called “cells”) (2x2 grid in our example, resulting in four cells).
• Step 3: Compute orientation histogram of each cell.
• Step 4: Concatenate the four histograms.
• Step 5: Normalize the concatenated vector v using one of the three options described previously.
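The five steps can be put together in a short sketch. Several details are assumptions not fixed by the summary: central-difference gradients, unsigned angles in [0, 180), votes weighted by gradient magnitude with no interpolation between bins, and Euclidean normalization in step 5. Function names and the ramp test image are hypothetical.

```python
import math

def cell_histogram(image, top, left, cell, bins=9):
    """Step 3: orientation histogram of one cell (unsigned angles, 0 to 180 degrees)."""
    hist = [0.0] * bins
    bin_width = 180.0 / bins
    for y in range(top, top + cell):
        for x in range(left, left + cell):
            gx = (image[y][x + 1] - image[y][x - 1]) / 2.0
            gy = (image[y + 1][x] - image[y - 1][x]) / 2.0
            angle = math.degrees(math.atan2(gy, gx)) % 180.0
            # Each pixel votes with its gradient magnitude (an assumption).
            hist[int(angle // bin_width) % bins] += math.hypot(gx, gy)
    return hist

def hog_block(image, i, j, cell=8, grid=2, bins=9):
    """Steps 1-5 for the block centered on pixel (i, j)."""
    half = cell * grid // 2
    top, left = i - half, j - half                 # step 1: block window
    v = []
    for cy in range(grid):                         # step 2: grid of cells
        for cx in range(grid):
            v.extend(cell_histogram(image, top + cy * cell,
                                    left + cx * cell, cell, bins))  # steps 3-4
    norm = math.sqrt(sum(x * x for x in v))        # step 5: Euclidean norm
    return [x / norm for x in v] if norm > 0 else v

# Horizontal intensity ramp: every gradient points along x (angle 0).
image = [[float(x) for x in range(20)] for _ in range(20)]
feature = hog_block(image, 10, 10)
print(len(feature), feature[:9])
```

With 2x2 cells and 9 bins the result has 36 entries, and on the ramp image all the weight falls into the first bin of each cell.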
Histograms of Oriented Gradients
• Parameters and design options:
• Angles range from 0 to 180 or from 0 to 360 degrees?
• In the Dalal & Triggs paper, a range of 0 to 180 degrees is used, and HOGs are used for detection of pedestrians.
• Number of orientation bins.
• Usually 9 bins, each bin covering 20 degrees.
• Cell size.
• Cells of size 8x8 pixels are often used.
• Block size.
• Blocks of size 2x2 cells (16x16 pixels) are often used.
• Usually a HOG feature has 36 dimensions.
• 4 cells * 9 orientation bins.
HOG
SIFT
Scale Invariant Feature Transform (SIFT)
• Lowe, D., 2004, IJCV
Automatic Scale Selection
• Function responses for increasing scale (scale signature)
[Figure: response of some function $f(I_{i_1 \dots i_m}(x, \sigma))$ plotted against scale σ, shown over several slides as the scale increases]
What Is A Useful Signature Function f ?
(Laplacian of Gaussian)
Earl F. Glynn
“Blob” detector is common for corners
• Laplacian (2nd derivative) of Gaussian (LoG)
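A scale signature can be sketched by evaluating a LoG response at one pixel for several scales and keeping the scale with the strongest response. The sketch assumes the scale-normalized form σ²∇²G, which the slides do not spell out, and evaluates the filter only at the blob center to stay short. The disk image and the list of scales are hypothetical.

```python
import math

def scale_normalized_log(dx, dy, sigma):
    """sigma^2 times the Laplacian of a 2-D Gaussian, evaluated at offset (dx, dy)."""
    r2 = dx * dx + dy * dy
    s2 = sigma * sigma
    return (r2 - 2.0 * s2) / (2.0 * math.pi * s2 * s2) * math.exp(-r2 / (2.0 * s2))

def response_at(image, cy, cx, sigma):
    """Filter response at pixel (cy, cx): sum of image values times kernel values."""
    return sum(value * scale_normalized_log(x - cx, y - cy, sigma)
               for y, row in enumerate(image)
               for x, value in enumerate(row))

def scale_signature(image, cy, cx, scales):
    """Absolute response for each scale (maxima and minima both count)."""
    return [abs(response_at(image, cy, cx, s)) for s in scales]

# Bright disk of radius 8 centered in a 41x41 image.
size, radius = 41, 8
center = size // 2
disk = [[1.0 if (x - center) ** 2 + (y - center) ** 2 <= radius ** 2 else 0.0
         for x in range(size)] for y in range(size)]

scales = [2.0, 3.0, 4.0, 6.0, 8.0, 12.0]
signature = scale_signature(disk, center, center, scales)
best_scale = scales[signature.index(max(signature))]
print(signature, best_scale)
```

For an ideal disk of radius r the continuous response peaks at σ = r/√2 (about 5.7 here), so the sampled scale closest to that value wins.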
[Figure: scale space; the function response across scales yields a list of (x, y, s)]
CAP5415 - Lecture 9
Alternative kernel
Approximate LoG with Difference-of-Gaussian (DoG).
1. Blur image with σ Gaussian kernel
2. Blur image with kσ Gaussian kernel
3. Subtract 2. from 1.
[Figure: image blurred with σ − image blurred with kσ = DoG image]
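The three steps can be sketched with a small separable Gaussian blur. The kernel radius of 3σ, border handling by clamping, and the value k = √2 are assumptions made for illustration; the function names are hypothetical.

```python
import math

def gaussian_kernel(sigma):
    """Normalized 1-D Gaussian kernel truncated at about 3 sigma."""
    radius = max(1, int(math.ceil(3 * sigma)))
    weights = [math.exp(-(i * i) / (2.0 * sigma * sigma))
               for i in range(-radius, radius + 1)]
    total = sum(weights)
    return [w / total for w in weights]

def blur(image, sigma):
    """Separable Gaussian blur; out-of-range coordinates are clamped to the border."""
    k = gaussian_kernel(sigma)
    r = len(k) // 2
    h, w = len(image), len(image[0])
    rows = [[sum(k[i + r] * image[y][min(max(x + i, 0), w - 1)]
                 for i in range(-r, r + 1)) for x in range(w)] for y in range(h)]
    return [[sum(k[i + r] * rows[min(max(y + i, 0), h - 1)][x]
                 for i in range(-r, r + 1)) for x in range(w)] for y in range(h)]

def difference_of_gaussians(image, sigma, k=math.sqrt(2)):
    """Blur with sigma, blur with k*sigma, then subtract the second from the first."""
    narrow = blur(image, sigma)
    wide = blur(image, k * sigma)
    return [[a - b for a, b in zip(row_a, row_b)]
            for row_a, row_b in zip(narrow, wide)]

# A single bright pixel shows the center-surround shape of the DoG.
impulse = [[0.0] * 21 for _ in range(21)]
impulse[10][10] = 1.0
dog = difference_of_gaussians(impulse, sigma=1.0)
print(dog[10][10], dog[10][13])
```

The response is positive at the center and negative in the surrounding ring, and a constant image gives a response of zero.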
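Scale-space
Find local maxima in position-scale space of DoG
[Figure: the input image is blurred at successive levels (labeled k, 2k, …); adjacent levels are subtracted to form DoG images; finding maxima/minima across position and scale gives a list of (x, y, s)]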
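Finding extrema in position-scale space can be sketched as a scan over a stack of DoG images. Comparing each value with its 26 neighbors (8 in the same level, 9 in each adjacent level) and ignoring responses at or below a threshold are assumptions of this sketch; the tiny stack is made up.

```python
def scale_space_extrema(stack, threshold=0.0):
    """stack[s][y][x] holds DoG responses; returns a list of (x, y, s).

    A point is kept when it is strictly larger or strictly smaller than all
    26 neighbors in position and scale.
    """
    found = []
    for s in range(1, len(stack) - 1):
        for y in range(1, len(stack[s]) - 1):
            for x in range(1, len(stack[s][y]) - 1):
                value = stack[s][y][x]
                if abs(value) <= threshold:
                    continue
                neighbors = [stack[s + ds][y + dy][x + dx]
                             for ds in (-1, 0, 1)
                             for dy in (-1, 0, 1)
                             for dx in (-1, 0, 1)
                             if (ds, dy, dx) != (0, 0, 0)]
                if value > max(neighbors) or value < min(neighbors):
                    found.append((x, y, s))
    return found

# Three 5x5 levels with one maximum and one minimum in the middle level.
stack = [[[0.0] * 5 for _ in range(5)] for _ in range(3)]
stack[1][2][3] = 5.0
stack[1][2][1] = -4.0
keypoints = scale_space_extrema(stack)
print(keypoints)
```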
Results: Difference-of-Gaussian
• Larger circles = larger scale
• Descriptors with maximal scale response
SIFT Orientation Normalization
• Compute gradient orientation histogram
• Select dominant orientation ϴ
• Normalize: rotate to fixed orientation
[Figure: orientation histogram over angles from 0 to 2π]
T. Tuytelaars, B. Leibe
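Selecting the dominant orientation can be sketched as building a magnitude-weighted histogram of gradient angles over the patch and taking the center of the largest bin. The 36 bins, central differences, and the absence of any smoothing or peak interpolation are assumptions; the function name and the ramp patch are hypothetical.

```python
import math

def dominant_orientation(patch, bins=36):
    """Return (theta, histogram); theta is the center of the strongest bin, in radians.

    Angles follow image coordinates: x to the right, y downward.
    """
    hist = [0.0] * bins
    for y in range(1, len(patch) - 1):
        for x in range(1, len(patch[0]) - 1):
            gx = (patch[y][x + 1] - patch[y][x - 1]) / 2.0
            gy = (patch[y + 1][x] - patch[y - 1][x]) / 2.0
            angle = math.atan2(gy, gx) % (2.0 * math.pi)
            b = int(angle / (2.0 * math.pi) * bins) % bins
            hist[b] += math.hypot(gx, gy)
    peak = max(range(bins), key=hist.__getitem__)
    return (peak + 0.5) * 2.0 * math.pi / bins, hist

# Diagonal ramp: every gradient points at 45 degrees.
patch = [[float(x + y) for x in range(8)] for y in range(8)]
theta, hist = dominant_orientation(patch)
print(math.degrees(theta))
```

Rotating the patch by −θ before computing the descriptor is what makes the descriptor independent of the original orientation.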
SIFT descriptor formation
• Compute on local 16 x 16 window around detection.
• Rotate and scale window according to discovered orientation ϴ and scale σ (gain invariance).
• Compute gradients weighted by a Gaussian of variance half the window (for smooth falloff).
James Hays
SIFT descriptor formation
• 4x4 array of gradient orientation histograms weighted by gradient magnitude.
• Bin into 8 orientations x 4x4 array = 128 dimensions.
[Figure: shows only a 2x2 array; the actual descriptor uses 4x4]
James Hays
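The 4x4 x 8 layout can be sketched as below. This is a simplified illustration, not the full SIFT descriptor: it assumes the window has already been rotated and scaled, omits the Gaussian weighting and any interpolation between bins, and uses central differences. The function name and the ramp patch are hypothetical.

```python
import math

def sift_like_descriptor(patch, grid=4, cell=4, bins=8):
    """Build a grid*grid*bins vector from a (grid*cell + 2)-pixel square patch.

    The extra 1-pixel border lets central differences be computed for every
    pixel of the 16x16 window.
    """
    desc = [0.0] * (grid * grid * bins)
    for y in range(1, grid * cell + 1):
        for x in range(1, grid * cell + 1):
            gx = (patch[y][x + 1] - patch[y][x - 1]) / 2.0
            gy = (patch[y + 1][x] - patch[y - 1][x]) / 2.0
            angle = math.atan2(gy, gx) % (2.0 * math.pi)
            b = int(angle / (2.0 * math.pi) * bins) % bins
            cy, cx = (y - 1) // cell, (x - 1) // cell
            # Vote into the histogram of the cell, weighted by gradient magnitude.
            desc[(cy * grid + cx) * bins + b] += math.hypot(gx, gy)
    norm = math.sqrt(sum(d * d for d in desc))
    return [d / norm for d in desc] if norm > 0 else desc

# 18x18 horizontal ramp: 16x16 window plus border, all gradients at angle 0.
patch = [[float(x) for x in range(18)] for _ in range(18)]
descriptor = sift_like_descriptor(patch)
print(len(descriptor), descriptor[:8])
```

The result has 4 x 4 x 8 = 128 entries, matching the dimension count on the slide.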
SIFT Descriptor Extraction
James Hays
Review: Local Descriptors
• Most features can be thought of as
• templates,
• histograms (counts),
• or combinations
• The ideal descriptor should be
– Robust and Distinctive
– Compact and Efficient
• Most available descriptors focus on edge/gradient information
– Capture texture information
– Color rarely used