CVI Week 2 1 Pre Note

The document discusses feature detection and extraction in computer vision, focusing on traditional visual descriptors like points, patches, edges, and contours, as well as deep learning-based approaches. It highlights the SIFT algorithm and its steps, advantages, and applications, while also contrasting shallow and deep learning architectures. The document emphasizes the importance of learning feature hierarchies through deep learning for improved performance in various domains.


Feature Detection

and Deep Learning


Akila Subasinghe
School of Computer Science
University of Birmingham

[Computer Vision and Imaging]


Feature Detection/Extraction

Deep Learning-based Features

Nvidia to train 100,000 developers on deep learning AI


Outline
▪ Traditional visual feature descriptors
□ Points and patches
□ Edges and contours
□ Lines

▪ Deep learning-based visual feature extraction
□ How to?
□ Convolutional Neural Networks
□ Applications of DL4CV
Traditional visual feature descriptors

Szeliski, R. (2022). Computer vision: algorithms and applications. Springer Nature.




Traditional visual feature descriptors
Points and patches
Patches with gradients in at least two (significantly) different orientations are the easiest to localize


Traditional visual feature descriptors
Points and patches
Adaptive non-maximal suppression (ANMS)
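As an illustration, ANMS can be sketched in a few lines of numpy (the keypoint coordinates, response values, and the robustness constant `c_robust` below are made-up examples, not from the slides): each keypoint's suppression radius is its distance to the nearest significantly stronger keypoint, and the points with the largest radii are kept, which spreads detections evenly over the image.

```python
import numpy as np

def anms(xy, strength, n_keep, c_robust=0.9):
    """Adaptive non-maximal suppression: keep the n_keep points whose
    distance to the nearest significantly-stronger point is largest."""
    xy = np.asarray(xy, float)
    strength = np.asarray(strength, float)
    n = len(strength)
    radii = np.full(n, np.inf)
    for i in range(n):
        # points whose (robustified) strength dominates point i
        stronger = strength > strength[i] / c_robust
        stronger[i] = False
        if stronger.any():
            d = np.linalg.norm(xy[stronger] - xy[i], axis=1)
            radii[i] = d.min()
    # keep the points with the largest suppression radii
    return np.argsort(-radii)[:n_keep]

# toy data: four well-separated corners plus a weak point crowded next to one
pts = np.array([[0, 0], [10, 0], [0, 10], [10, 10], [1, 0]])
resp = np.array([9.0, 8.0, 7.0, 6.0, 5.0])
kept = anms(pts, resp, n_keep=4)
```

Here the weak point crowded next to a strong corner gets a small suppression radius and is dropped, while the four well-separated corners survive.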


Traditional visual feature descriptors
Points and patches
Scale invariance



Traditional visual feature descriptors
Points and patches
Rotational invariance

Traditional visual feature descriptors
Points and patches
Affine invariance



Traditional visual feature descriptors
Points and patches
multi-scale oriented patches (MOPS)



SIFT: Pet Example
SIFT Algorithm

1. Scale-space peak/feature selection
2. Keypoint localization
3. Orientation assignment
4. Keypoint descriptor
5. Keypoint matching (not strictly part of SIFT)
SIFT Step 1: Constructing the Scale Space

 Use multiple scales (octaves), each a downsampled version of the image

 At each octave, use Gaussian blurring at increasing σ to create progressively smoothed versions of the same image
SIFT Step 1: Difference of Gaussians (DoG)

 Subtract adjacent blurred images within each octave to obtain difference-of-Gaussian (DoG) images
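The construction above can be sketched in plain numpy (the σ schedule, the number of octaves, and the 3σ kernel truncation are illustrative assumptions, not SIFT's exact constants): blur at increasing σ within an octave, subtract adjacent blurs to get DoG images, then halve the resolution for the next octave.

```python
import numpy as np

def gaussian_blur(img, sigma):
    """Separable Gaussian blur with reflective padding (plain numpy)."""
    r = int(3 * sigma)                       # truncate the kernel at 3*sigma
    x = np.arange(-r, r + 1)
    k = np.exp(-x**2 / (2 * sigma**2))
    k /= k.sum()
    pad = np.pad(img, r, mode='reflect')
    # filter rows, then columns (separability of the Gaussian)
    tmp = np.apply_along_axis(lambda v: np.convolve(v, k, mode='valid'), 1, pad)
    return np.apply_along_axis(lambda v: np.convolve(v, k, mode='valid'), 0, tmp)

def scale_space_and_dog(img, sigmas=(1.0, 1.6, 2.56), n_octaves=2):
    """Blur at increasing sigma within each octave, downsample by 2 between
    octaves, and take differences of adjacent blurs (DoG)."""
    octaves, dogs = [], []
    for _ in range(n_octaves):
        blurred = [gaussian_blur(img, s) for s in sigmas]
        octaves.append(blurred)
        dogs.append([b1 - b0 for b0, b1 in zip(blurred, blurred[1:])])
        img = img[::2, ::2]                  # next octave: half resolution
    return octaves, dogs

img = np.random.default_rng(0).random((32, 32))
octaves, dogs = scale_space_and_dog(img)
```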
SIFT Algorithm

1. Scale-space peak/feature selection
2. Keypoint localization
3. Orientation assignment
4. Keypoint descriptor
5. Keypoint matching (not strictly part of SIFT)
SIFT Step 2: Keypoint Localization

 Detect maxima and minima of the difference-of-Gaussian images in scale space

 Each point is compared to its 8 neighbours in the current image and 9 neighbours each in the scales above and below (26 neighbours in total)
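A minimal numpy sketch of this 26-neighbour comparison on a toy three-level DoG stack (real SIFT additionally refines the location with a quadratic fit and discards low-contrast and edge responses):

```python
import numpy as np

def local_extrema_3d(dogs):
    """Given 3 stacked DoG images (shape 3 x H x W), mark interior pixels of
    the middle image that are strictly greater (or strictly smaller) than all
    26 neighbours: 8 in-plane plus 9 in each adjacent scale."""
    d = np.asarray(dogs, float)
    assert d.shape[0] == 3
    centre = d[1, 1:-1, 1:-1]
    is_max = np.ones_like(centre, bool)
    is_min = np.ones_like(centre, bool)
    for s in range(3):
        for dy in (-1, 0, 1):
            for dx in (-1, 0, 1):
                if s == 1 and dy == 0 and dx == 0:
                    continue  # skip the centre pixel itself
                nb = d[s, 1 + dy:d.shape[1] - 1 + dy, 1 + dx:d.shape[2] - 1 + dx]
                is_max &= centre > nb
                is_min &= centre < nb
    return is_max | is_min

# toy stack with a single bright blob in the middle scale
stack = np.zeros((3, 7, 7))
stack[1, 3, 3] = 5.0
ext = local_extrema_3d(stack)
```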
SIFT Algorithm

1. Scale-space peak/feature selection
2. Keypoint localization
3. Orientation assignment
4. Keypoint descriptor
5. Keypoint matching (not strictly part of SIFT)
SIFT Step 3: Orientation Assignment

 Assign an orientation to each keypoint:
1. Calculate the gradient magnitude and orientation around the keypoint
2. Build a histogram of orientations, weighted by gradient magnitude; the peak gives the keypoint's dominant orientation
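A minimal numpy sketch of the two sub-steps (SIFT additionally Gaussian-weights the contributions, smooths the histogram, and interpolates the peak; the 8×8 ramp patch is a toy input):

```python
import numpy as np

def orientation_histogram(patch, n_bins=36):
    """Gradient magnitude/orientation over a patch, and a 36-bin orientation
    histogram weighted by magnitude; the peak bin gives the dominant
    orientation."""
    gy, gx = np.gradient(patch.astype(float))      # gradients along y then x
    mag = np.hypot(gx, gy)
    ang = np.degrees(np.arctan2(gy, gx)) % 360.0   # orientation in [0, 360)
    hist, _ = np.histogram(ang, bins=n_bins, range=(0, 360), weights=mag)
    dominant = (np.argmax(hist) + 0.5) * (360.0 / n_bins)  # peak bin centre
    return hist, dominant

# toy patch: intensity increasing left-to-right -> gradient points along +x
patch = np.tile(np.arange(8.0), (8, 1))
hist, theta = orientation_histogram(patch)
```

For the ramp patch the gradient points along +x everywhere, so all the histogram mass lands in the first 10° bin.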
SIFT Step 3: Orientation Assignment
SIFT Algorithm

1. Scale-space peak/feature selection
2. Keypoint localization
3. Orientation assignment
4. Keypoint descriptor
5. Keypoint matching (not strictly part of SIFT)
SIFT Step 4: Keypoint Descriptor

% Convert to grayscale; MATLAB's feature detectors expect a single-channel image
I = rgb2gray(imread('image.jpg'));
points = detectSIFTFeatures(I);                   % SIFT keypoints
[features, valid] = extractFeatures(I, points);   % 128-D descriptors
SIFT advantages.
 Locality: features are local, so robust to
occlusion and clutter (no prior segmentation)
 Distinctiveness: individual features can be
matched to a large database of objects
 Quantity: many features can be generated for
even small objects
 Efficiency: close to real-time performance
 Extensibility: can easily be extended to a wide range of differing feature types
SIFT Mapping in Action…
Traditional visual feature descriptors
Points and patches
Applications: Large-scale matching and retrieval



Traditional visual feature descriptors
Edges and contours



Traditional visual feature descriptors
Edges and contours
Sobel edge detector/operator


https://en.wikipedia.org/wiki/Sobel_operator
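A minimal numpy sketch using the kernels from the page above (the small correlation helper and the step-edge test image are illustrative):

```python
import numpy as np

# Sobel kernels for horizontal and vertical gradients
KX = np.array([[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]], float)
KY = KX.T

def conv2_valid(img, k):
    """2-D 'valid' correlation with a 3x3 kernel (plain numpy)."""
    h, w = img.shape
    out = np.zeros((h - 2, w - 2))
    for i in range(3):
        for j in range(3):
            out += k[i, j] * img[i:h - 2 + i, j:w - 2 + j]
    return out

def sobel_magnitude(img):
    gx = conv2_valid(img, KX)   # response to horizontal intensity change
    gy = conv2_valid(img, KY)   # response to vertical intensity change
    return np.hypot(gx, gy)

# vertical step edge -> strong response along the step, zero elsewhere
img = np.zeros((6, 6))
img[:, 3:] = 1.0
mag = sobel_magnitude(img)
```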
Traditional visual feature descriptors
Edges and contours
Canny edge detector
1. Apply a Gaussian filter to smooth the image in order to remove noise
2. Find the intensity gradients of the image
3. Apply gradient magnitude thresholding or lower-bound cut-off suppression to get rid of spurious responses to edge detection
4. Apply a double threshold to determine potential edges
5. Track edges by hysteresis: finalize the detection of edges by suppressing all other edges that are weak and not connected to strong edges

https://en.wikipedia.org/wiki/Canny_edge_detector
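Steps 4 and 5 can be sketched in numpy on a toy gradient-magnitude array (the thresholds are arbitrary; steps 1–3 are omitted):

```python
import numpy as np

def double_threshold_hysteresis(mag, low, high):
    """Label strong (>= high) and weak (>= low) pixels, then keep weak
    pixels only if 8-connected to a strong one (grown iteratively)."""
    strong = mag >= high
    weak = (mag >= low) & ~strong
    edges = strong.copy()
    changed = True
    while changed:  # grow strong edges into connected weak pixels
        grown = edges.copy()
        grown[1:, :] |= edges[:-1, :]
        grown[:-1, :] |= edges[1:, :]
        grown[:, 1:] |= edges[:, :-1]
        grown[:, :-1] |= edges[:, 1:]
        grown[1:, 1:] |= edges[:-1, :-1]
        grown[:-1, :-1] |= edges[1:, 1:]
        grown[1:, :-1] |= edges[:-1, 1:]
        grown[:-1, 1:] |= edges[1:, :-1]
        new_edges = edges | (grown & weak)
        changed = bool((new_edges != edges).any())
        edges = new_edges
    return edges

# weak pixels touching the strong one survive; the isolated weak one does not
mag = np.array([[0.0, 0.4, 0.9, 0.4, 0.0, 0.0],
                [0.0, 0.0, 0.0, 0.0, 0.0, 0.4]])
edges = double_threshold_hysteresis(mag, low=0.3, high=0.8)
```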
Traditional visual feature descriptors
Edges and contours
Contour detection


Edge and Contour Detection with OpenCV and Python


Traditional visual feature descriptors
(straight) Lines and vanishing points



Traditional visual feature descriptors
(straight) Lines and vanishing points
Hough transforms: having edges “vote” for plausible line locations

https://en.wikipedia.org/wiki/Hough_transform
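A minimal numpy sketch of the voting scheme (the 1-pixel × 1-degree accumulator resolution and the toy point set are assumptions): each edge point votes for every line ρ = x·cosθ + y·sinθ passing through it, and collinear points pile their votes into the same accumulator cell.

```python
import numpy as np

def hough_lines(edge_points, shape, n_theta=180):
    """Accumulate votes: every edge point (y, x) votes for all lines
    rho = x*cos(theta) + y*sin(theta) that pass through it."""
    h, w = shape
    diag = int(np.ceil(np.hypot(h, w)))              # max possible |rho|
    thetas = np.deg2rad(np.arange(n_theta))          # theta = 0..179 degrees
    acc = np.zeros((2 * diag + 1, n_theta), int)     # rho in [-diag, diag]
    for y, x in edge_points:
        rhos = np.round(x * np.cos(thetas) + y * np.sin(thetas)).astype(int)
        acc[rhos + diag, np.arange(n_theta)] += 1
    return acc

# ten collinear points on the horizontal line y = 2 in a 10x10 image
pts = [(2, x) for x in range(10)]
acc = hough_lines(pts, shape=(10, 10))
```

All ten points on the line y = 2 vote for the cell (ρ = 2, θ = 90°), which collects the maximum vote count.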

Traditional Image Categorization: Training phase

Training Images + Training Labels → Image Features → Classifier Training → Trained Classifier

Traditional Image Categorization: Testing phase


Training: Training Images + Training Labels → Image Features → Classifier Training → Trained Classifier

Testing: Test Image → Image Features → Trained Classifier → Prediction (e.g. “Outdoor”)

Features have been the key…

Hand-crafted:
SIFT [Lowe IJCV 04], HOG [Dalal and Triggs CVPR 05], DPM [Felzenszwalb et al. PAMI 10], Color Descriptor [Van De Sande et al. PAMI 10], SPM [Lazebnik et al. CVPR 06]
What about learning the features?
• Learn a feature hierarchy all the way from pixels to classifier
• Each layer extracts features from the output of the previous layer
• Layers have (nearly) the same structure
• Train all layers jointly (“end-to-end”)

Image/Video Pixels → Layer 1 → Layer 2 → Layer 3 → Simple Classifier
Learning Feature Hierarchy
Goal: Learn useful higher-level features from images
Feature representation, from input data upward:
Pixels → 1st layer: “Edges” → 2nd layer: “Object parts” → 3rd layer: “Objects”
[Lee et al., ICML 2009; CACM 2011]

Slide credit: Rob Fergus


Learning Feature Hierarchy
• Better performance

• Other domains (unclear how to hand-engineer features):
– Kinect
– Video
– Multispectral

Slide credit: Rob Fergus


“Shallow” vs. “Deep” architectures


Slide credit: Yann LeCun


“Shallow” vs. “Deep” architectures

Layer 1 … Layer k

Slide adapted from: Yann LeCun



Why deep learning?


https://towardsdatascience.com/what-is-deep-learning-and-how-does-it-work-2ce44bb692ac
Types of Learning & History
▪ Brain
▪ Supervised learning
▪ Unsupervised learning
▪ Modern architectures

https://towardsdatascience.com/supervised-vs-unsupervised-learning-in-2-minutes-72dad148f242 https://medium.com/analytics-vidhya/cnns-architectures-lenet-alexnet-vgg-googlenet-resnet-and-more-666091488df5

General learning types


▪ Supervised learning
□ Learn to predict an output when given an input vector.
▪ Reinforcement learning
□ Learn to select an action to maximize payoff.
▪ Unsupervised learning
□ Discover a good internal representation of the input.
□ Self-supervised learning

Courtesy: Hinton & LeCun
A brief history

FCN, U-Net, YOLO, Generative Adversarial Network (GAN), Mask R-CNN, Capsule Network, Graph Neural Network (GNN), Vision Transformer (ViT), Neural Radiance Field (NeRF) … today
Basic definition
Deep Learning → Deep Neural Network, inspired by the human brain (demo video)

Image courtesy: https://medium.com/autonomous-agents/mathematical-foundation-for-activation-functions-in-artificial-neural-networks-a51c9dd7c089


Basic definition
• Nonlinear
• Can approximate any continuous function to arbitrary accuracy, given sufficiently many hidden units

Figure from Christopher Bishop


Basic definition
• Activations: a_j = Σ_i w_ji x_i + b_j (weighted sums of the inputs)

• Nonlinear activation function h (e.g. sigmoid, ReLU): z_j = h(a_j)

Figure from Christopher Bishop
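These definitions amount to a few lines of numpy (the input, weight, and bias values below are arbitrary toy numbers):

```python
import numpy as np

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

def relu(a):
    return np.maximum(0.0, a)

# a unit's activation: weighted sum of inputs plus bias, then nonlinearity h
x = np.array([0.5, -1.0, 2.0])   # inputs
w = np.array([0.1, 0.4, -0.2])   # weights
b = 0.05                         # bias
a = w @ x + b                    # activation (pre-nonlinearity)
z_sig = sigmoid(a)               # sigmoid unit output
z_relu = relu(a)                 # ReLU unit output
```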


Basic definition
• Layer 2: a_k = Σ_j w_kj z_j + b_k, applied to the layer-1 outputs z_j

• Layer 3 (final)

• Outputs: e.g. sigmoid (binary) or softmax (multiclass)

• Putting everything together: the network is a composition of layers, y(x) = output(h(h(x · W1) · W2) · W3)
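Putting everything together as a minimal numpy sketch (the layer sizes and the random, untrained weights are arbitrary): each layer is an affine map followed by a nonlinearity, with a softmax on the final layer.

```python
import numpy as np

def softmax(a):
    e = np.exp(a - a.max())   # subtract max for numerical stability
    return e / e.sum()

def forward(x, params):
    """Two hidden layers with ReLU, softmax output: a composition of
    layer-wise affine maps and nonlinearities."""
    W1, b1, W2, b2, W3, b3 = params
    z1 = np.maximum(0, W1 @ x + b1)    # layer 1
    z2 = np.maximum(0, W2 @ z1 + b2)   # layer 2
    return softmax(W3 @ z2 + b3)       # layer 3 (final): class probabilities

rng = np.random.default_rng(0)
dims = [4, 8, 8, 3]   # input -> hidden -> hidden -> 3 classes
params = []
for din, dout in zip(dims, dims[1:]):
    params += [rng.normal(0, 0.5, (dout, din)), np.zeros(dout)]

y = forward(rng.normal(size=4), params)
```

The output is a valid probability distribution over the three classes, even with untrained weights; training adjusts the weights so the distribution matches the labels.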
Basic definition
• Lots of hidden layers
• Depth = power (usually)
Weights to learn at every layer!

Figure from http://neuralnetworksanddeeplearning.com/chap5.html


The way to learn? Gradient descent


Credit: Andrew Ng, Alexei Efros, Samuel Velasco/Quanta Magazine


https://towardsdatascience.com/a-visual-explanation-of-gradient-descent-methods-momentum-adagrad-rmsprop-adam-f898b102325c
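A minimal numpy sketch of the update rule w ← w − η·∇f(w), applied to a toy quadratic whose minimum is known in closed form:

```python
import numpy as np

def gradient_descent(grad, w0, lr=0.1, n_steps=100):
    """Repeatedly step against the gradient: w <- w - lr * grad(w)."""
    w = np.asarray(w0, float)
    for _ in range(n_steps):
        w = w - lr * grad(w)
    return w

# minimise f(w) = (w0 - 3)^2 + (w1 + 1)^2, whose gradient is 2*(w - [3, -1])
grad = lambda w: 2 * (w - np.array([3.0, -1.0]))
w_star = gradient_descent(grad, w0=[0.0, 0.0])
```

With a fixed learning rate the iterates converge geometrically to the minimizer (3, −1); in deep networks the same rule is applied to all the layer weights at once, with the gradient computed by backpropagation.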
