Object Detection

Outline

• Introduction to Object Detection


– Difference between Object Detection and
Image Classification
• Machine Learning-based Object Detection
• Deep Learning-based Object Detection
– RCNN
– Faster RCNN
– YOLO
• Evaluating a detector
Introduction to Object Detection
Image Classification
• A computer vision problem: classify an image into one of a set of predefined categories
• Example
– Animals (cat, dog, lion, tiger, etc.)
– Color (red, yellow, blue, etc.)

Digit classification (MNIST)


Object recognition (Caltech-101)
Object Detection
• The task of localizing the objects in an image and identifying the class of each
Bounding Box
• A bounding box describes the spatial location of an object.
• Rectangular
• Representation
– (x, y) coordinates of the upper-left corner and the lower-right corner of the rectangle
• Alternate representation
– (x, y) coordinates of the bounding box center, plus the width and height of the box.
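The two representations carry the same information, so one can be converted into the other. A minimal sketch, assuming boxes are plain tuples and that the function names `corners_to_center` and `center_to_corners` are illustrative choices (they do not come from the slides):

```python
def corners_to_center(box):
    """Convert (x1, y1, x2, y2) corners to (cx, cy, width, height)."""
    x1, y1, x2, y2 = box
    return ((x1 + x2) / 2.0, (y1 + y2) / 2.0, x2 - x1, y2 - y1)


def center_to_corners(box):
    """Convert (cx, cy, width, height) back to (x1, y1, x2, y2) corners."""
    cx, cy, w, h = box
    return (cx - w / 2.0, cy - h / 2.0, cx + w / 2.0, cy + h / 2.0)


# Upper-left corner (10, 20), lower-right corner (50, 80).
center_box = corners_to_center((10, 20, 50, 80))
corner_box = center_to_corners(center_box)
```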
Image Classification vs Object Detection

• Image classification
– Classifies the object in the image
– Doesn’t localize the object in the image
• Object detection
– Localizes and classifies the object

[Figure: Classification vs. Object Detection, each labeling the image CAR]
Traditional Machine Learning-based Object Detection
Traditional Machine Learning Classification
• Sliding Window-based Object Detection
– Bounding boxes of different scales are slid across the image
– Each bounding box is sent to an image classifier

Roth et al. On-line Conservative Learning


Sliding Window (www.pyimagesearch.com/2015/03/23/sliding-windows-for-object-detection-with-python-and-opencv/)
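The sliding-window idea can be sketched as a generator of candidate boxes that are each handed to a classifier. This is only an illustration: the stride, the window sizes and the `classify` callback are assumptions, and a real classifier would look at the image pixels inside the box rather than at the box coordinates.

```python
def sliding_windows(image_w, image_h, window_sizes, stride):
    """Yield (x1, y1, x2, y2) boxes for every window size and position."""
    for win_w, win_h in window_sizes:
        for y in range(0, image_h - win_h + 1, stride):
            for x in range(0, image_w - win_w + 1, stride):
                yield (x, y, x + win_w, y + win_h)


def detect(image_w, image_h, window_sizes, stride, classify):
    """Send each window to `classify` (hypothetical callback) and keep hits.

    `classify` returns a label, or None when the window holds no object.
    """
    detections = []
    for box in sliding_windows(image_w, image_h, window_sizes, stride):
        label = classify(box)
        if label is not None:
            detections.append((box, label))
    return detections


# Two scales on an 8 x 8 image: four 4 x 4 windows plus one 8 x 8 window.
boxes = list(sliding_windows(8, 8, [(4, 4), (8, 8)], 4))
```

The number of windows grows quickly with image size, number of scales and smaller strides, which is why every extra classifier call matters for this approach.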
Sliding Window-based Object Detection

• Training Step (Similar to Image Classification)
– Image set → Feature set → Training classification model
– Label set → Training classification model

• Testing Step
– Image → Bounding Box → Feature → Trained classification model → Predicted Label
Sliding Window-based Object Detection

• Feature Extraction
– Descriptive features are extracted from the image
– The result is an image representation that keeps relevant information and discards irrelevant information
– Increases discrimination between image classes
– Accounts for variations within the same image class
– HOG, FAST, SIFT, etc.
• Feature classifiers
– Predict labels using the extracted features
– KNN, SVM etc
Histogram-of-Oriented Gradients

• Multiple steps are required to extract HOG descriptors from an image
• Step 1 : Preprocessing
• The input image should be resized to a fixed size
Histogram-of-Oriented Gradients

• Step 2 : Calculate the Gradient or Edge Images
• Calculate the horizontal gradient image $g_x$ and the vertical gradient image $g_y$ using the Sobel filter
• Calculate the magnitude and orientation of the gradients
• $g = \sqrt{g_x^2 + g_y^2}$
• $\theta = \arctan\left(\frac{g_y}{g_x}\right)$

[Figure: gradient images $g_x$, $g_y$ and magnitude $g$]
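The magnitude and orientation formulas can be sketched in plain Python. Two assumptions are made here to keep the example dependency-free: simple central differences stand in for the Sobel filter, and `atan2` folded into the range 0 to 180 degrees stands in for the arctan of the ratio (this avoids division by zero and gives unsigned orientations).

```python
import math


def gradients(image):
    """Return per-pixel gradient magnitude and orientation (degrees, 0-180).

    `image` is a list of rows of grayscale values. Border pixels reuse the
    nearest valid neighbor.
    """
    h, w = len(image), len(image[0])
    magnitude = [[0.0] * w for _ in range(h)]
    orientation = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            # Horizontal gradient g_x: change along x (left to right).
            gx = image[y][min(x + 1, w - 1)] - image[y][max(x - 1, 0)]
            # Vertical gradient g_y: change along y (top to bottom).
            gy = image[min(y + 1, h - 1)][x] - image[max(y - 1, 0)][x]
            magnitude[y][x] = math.hypot(gx, gy)
            orientation[y][x] = math.degrees(math.atan2(gy, gx)) % 180.0
    return magnitude, orientation


# A vertical edge: intensity changes only along x.
vertical_edge = [[0, 0, 10, 10]] * 3
mag, ang = gradients(vertical_edge)
```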
Histogram-of-Oriented Gradients

• Step 3 : Calculate Histogram of Gradients in 8×8 cells
• The image patch is divided into a grid of 8×8-pixel cells
• Gradient magnitude and orientation are calculated for the pixels in each cell
Histogram-of-Oriented Gradients

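• Create a histogram for each cell using the gradient magnitudes and orientations
• The histogram contains 9 bins corresponding to the angles 0, 20, 40 … 160.
• Each gradient magnitude is added to the bin for its orientation, and the bin values are summed over the cell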
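A minimal sketch of the 9-bin histogram for one cell. The slides do not say how a magnitude is assigned when the orientation falls between two bin angles; splitting it proportionally between the two nearest bins, as done here, is an assumption based on common HOG descriptions.

```python
def cell_histogram(magnitudes, angles, n_bins=9, bin_width=20.0):
    """Accumulate gradient magnitudes into orientation bins 0, 20, ..., 160.

    `magnitudes` and `angles` are flat lists holding one value per pixel of
    the cell; angles are in degrees.
    """
    hist = [0.0] * n_bins
    for mag, ang in zip(magnitudes, angles):
        pos = (ang % 180.0) / bin_width   # fractional bin position
        lower = int(pos) % n_bins         # bin at or below the angle
        upper = (lower + 1) % n_bins      # next bin (160 wraps around to 0)
        frac = pos - int(pos)
        hist[lower] += mag * (1.0 - frac)
        hist[upper] += mag * frac
    return hist


# 10 degrees lies halfway between bins 0 and 20; 40 degrees hits bin 40.
hist = cell_histogram([2.0, 4.0], [10.0, 40.0])
```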


Histogram-of-Oriented Gradients

• Step 4: Block normalization
• Compute histogram over 16 x 16 block
– One 16 x 16 block = Four 8 x 8 cells
– One 8 x 8 cell: 9-bin histogram
– One 16 x 16 block: four concatenated 9-bin histograms (36 x 1 vector)
Histogram-of-Oriented Gradients
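• Normalization of the 36 x 1 vector
– Makes the vector invariant to the overall scale of the gradient values (for example, changes in lighting)
– Divide each vector element by the L2 norm of the full vector

• Compute the 36 x 1 vector for every block position over the entire image and concatenate the results
– For example, a 64 x 128 image with the block moved in steps of 8 pixels has 7 x 15 = 105 block positions, giving a 105 x 36 = 3780 x 1 vector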
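The normalization step and the final descriptor length can be checked with a short sketch. The 64 x 128 image size and the block step of one cell (8 pixels) are assumptions; they are the usual HOG settings and reproduce the 3780 figure. The small `eps` term is also an assumption, added to avoid division by zero for an all-zero block.

```python
import math


def l2_normalize(vector, eps=1e-6):
    """Divide every element by the L2 norm of the full vector."""
    norm = math.sqrt(sum(v * v for v in vector) + eps * eps)
    return [v / norm for v in vector]


def hog_length(image_w, image_h, cell=8, block_cells=2, n_bins=9):
    """Length of the HOG descriptor when blocks move one cell at a time."""
    cells_x, cells_y = image_w // cell, image_h // cell
    blocks_x = cells_x - block_cells + 1
    blocks_y = cells_y - block_cells + 1
    values_per_block = block_cells * block_cells * n_bins  # 36 by default
    return blocks_x * blocks_y * values_per_block


unit = l2_normalize([3.0, 4.0])
length = hog_length(64, 128)
```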
Reading
https://learnopencv.com/histogram-of-oriented-gradients/
Feature Classifiers
• K Nearest Neighbor (KNN)
• Simple classification algorithm
• Classifies based on a similarity measure between a test
feature vector and training set of feature vectors
Feature Classifiers
• Steps
– Select the number of neighbors (K) needed to classify
– Compute distance between test feature vector and every
feature vector in the training set
– Identify the K nearest neighbors
– Assign the test feature vector to the majority class among its K nearest neighbors
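The steps above can be sketched in a few lines. Euclidean distance is assumed as the distance measure (the slides do not name one), and the function name `knn_predict` and the toy data are illustrative only.

```python
import math
from collections import Counter


def knn_predict(train_features, train_labels, query, k=3):
    """Classify `query` by majority vote among its k nearest training vectors."""
    # Distance between the test vector and every training vector.
    distances = [
        (math.dist(features, query), label)
        for features, label in zip(train_features, train_labels)
    ]
    # Keep the k nearest neighbors.
    nearest = sorted(distances, key=lambda pair: pair[0])[:k]
    # Majority class among those neighbors.
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


train_features = [(0, 0), (0, 1), (1, 0), (5, 5), (6, 5)]
train_labels = ["cat", "cat", "cat", "dog", "dog"]
prediction = knn_predict(train_features, train_labels, (5.5, 5), k=3)
```

In a detector, the feature vectors would be descriptors such as HOG extracted from each sliding window.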
Deep Learning-based Object Detection
Sliding Window-based Deep Learning Object Detection
• Training Step (Similar to Image Classifier)
– Image set → Training deep learning model with feature extraction and classification
– Label set → Training deep learning model with feature extraction and classification

• Testing Step
– Image → Bounding Box → Trained deep learning model with feature extraction and classification → Predicted Label
Sliding Window-based Deep Learning Object Detection
CNN-based Object Detection

https://www.upgrad.com/blog/basic-cnn-architecture/
Sliding Window-based Deep Learning Object Detection
• CNNs provide state-of-the-art detection accuracy
• They are computationally expensive
• A sliding window approach with varying scales of bounding boxes is therefore not practical
• Research in CNN-based object detection
– Reduce computational complexity
– Real-time object detection
– State-of-the-art detection accuracy
RCNN-based Object Detection
• R-CNN uses an object proposal algorithm called selective search
• Selective search reduces the number of bounding boxes fed to the classifier to about 2000 region proposals
• Selective search uses features such as texture, intensity and colour to identify possible locations of objects.
• Each proposal is fed into the CNN-based classifier.

Girshick et al. Rich feature hierarchies for accurate object detection and semantic segmentation, CVPR 2014
RCNN-based Object Detection
RCNN-based Object Detection
• Problems with R-CNN
• About 2000 region proposals are given to the CNN for every image
• About 47 seconds per image
• Selective search is a fixed, predefined algorithm rather than a learned one; hand-specified features are used to identify regions.
Fast RCNN
• RCNN
– About 2000 region proposals are given as input to the CNN
– The CNN detects the objects in these regions
– The convolution operation is done about 2000 times per image
• Fast RCNN
– The input image is given directly to the CNN
– Region proposals are mapped onto the CNN feature map
– Each region proposal is reshaped to a fixed size and given to the object detection layer
– The convolution operation is done only once per image
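In the Fast R-CNN paper, the step that brings each region to a fixed size is called RoI pooling: the region of the feature map is divided into a fixed grid of bins and the maximum of each bin is kept. A minimal single-channel sketch, assuming the region is already given in feature-map coordinates, covers at least one cell in each direction, and that a 2 x 2 output is wanted (all illustrative choices):

```python
def ceil_div(a, b):
    """Integer division rounded up."""
    return -(-a // b)


def roi_max_pool(feature_map, roi, out_h=2, out_w=2):
    """Max-pool region roi = (x1, y1, x2, y2) to a fixed out_h x out_w grid.

    x2 and y2 are exclusive. `feature_map` is a list of rows (one channel).
    """
    x1, y1, x2, y2 = roi
    h, w = y2 - y1, x2 - x1
    pooled = []
    for i in range(out_h):
        # Row range of bin i (floor for the start, ceiling for the end).
        r0 = y1 + (i * h) // out_h
        r1 = y1 + ceil_div((i + 1) * h, out_h)
        row = []
        for j in range(out_w):
            c0 = x1 + (j * w) // out_w
            c1 = x1 + ceil_div((j + 1) * w, out_w)
            row.append(max(feature_map[r][c]
                           for r in range(r0, r1)
                           for c in range(c0, c1)))
        pooled.append(row)
    return pooled


# 4 x 4 feature map with values 0..15 in row-major order.
feature_map = [[4 * r + c for c in range(4)] for r in range(4)]
pooled = roi_max_pool(feature_map, (0, 0, 4, 4))
```

Regions of different sizes all come out as the same fixed-size grid, so one shared feature map can serve every proposal.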

Girshick. Fast R-CNN, ICCV 2015


Fast RCNN
Faster RCNN
• RCNN and Fast RCNN use selective search to find region proposals
• Faster RCNN uses a deep learning network (the region proposal network) to identify the region proposals
– The entire image is given as input to a convolutional network, which generates a convolutional feature map.
– The region proposal network estimates the region proposals from the feature map.
– The predicted region proposals are reshaped and given to the object detection layer.

Ren et al. Faster R-CNN: Towards real-time object detection with region
proposal networks, NIPS 2015
Faster RCNN

Ghoury et al. Real-Time Diseases Detection of Grape and Grape Leaves using Faster R-CNN and SSD MobileNet Architectures, ICATCES 2019
Comparison
YOLO
• RCNN-based methods use region proposals to identify objects.
• In YOLO, a single convolutional network directly predicts the bounding boxes and the class probabilities for these boxes.
• Each image is split into an S×S grid
• M bounding boxes are considered in each grid cell
• For each bounding box, YOLO predicts a class label and the bounding box representation.
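The grid idea can be made concrete with a short sketch. It counts the boxes predicted per image and finds the grid cell that contains a box center (in the original YOLO paper, that cell is the one responsible for the object). The values S = 7, M = 2 and the 448 x 448 image size are example settings, not values taken from the slides.

```python
def num_candidate_boxes(S, M):
    """Total boxes predicted per image: M boxes in each of the S x S cells."""
    return S * S * M


def cell_for_center(center, image_w, image_h, S):
    """Return (row, col) of the grid cell that contains a box center."""
    cx, cy = center
    # Clamp so that a center on the right or bottom edge stays in the grid.
    col = min(int(cx / image_w * S), S - 1)
    row = min(int(cy / image_h * S), S - 1)
    return row, col


total = num_candidate_boxes(7, 2)
cell = cell_for_center((224, 100), 448, 448, 7)
```

Because the number of boxes is fixed by S and M, a single forward pass produces every prediction, with no separate region proposal step.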
YOLO

https://pjreddie.com/darknet/yolo/
YOLO Architecture
YOLO
• YOLO is faster than the region-proposal-based methods above.
• Its detection accuracy is lower than theirs.

https://www.oreilly.com/library/view/reinforcement-learning-with/9781788835725/786aac81-77a7-437e-9a75-64925d7940ca.xhtml
Evaluating a Detector

Slides source: R. Girshick, Object detection, deep learning, and R-CNNs, UW CSE 455
Detection

[Figure: test image with ‘person’ detector predictions scored 0.9, 0.6 and 0.2, shown against the ground truth ‘person’ boxes]
Evaluating a Detection

https://www.linkedin.com/pulse/which-worse-false-positive-false-negative-miha-mozina-phd/
Intersection Over Union
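• The IoU measure is used to decide whether a detection counts as a true positive or a false positive
• IoU = area of overlap / area of union of the predicted box and the ground truth box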

https://www.pyimagesearch.com/2016/11/07/intersection-over-union-iou-for-object-detection/
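Intersection over union divides the area shared by two boxes by the area covered by either of them. A minimal sketch, assuming boxes are (x1, y1, x2, y2) corner tuples with continuous coordinates:

```python
def iou(box_a, box_b):
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    # Corners of the overlapping rectangle (empty if the boxes are disjoint).
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    intersection = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)

    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - intersection
    return intersection / union if union > 0 else 0.0


# Two 10 x 10 boxes that share a 5 x 5 corner: 25 / (100 + 100 - 25).
overlap = iou((0, 0, 10, 10), (5, 5, 15, 15))
```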
Intersection Over Union

https://towardsdatascience.com/map-mean-average-precision-might-confuse-you-5956f1bfa9e2
Precision and Recall
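• Precision: the fraction of detections that are correct, $\text{Precision} = \frac{TP}{TP + FP}$

• Recall: the fraction of ground truth objects that are detected, $\text{Recall} = \frac{TP}{TP + FN}$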
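Precision is TP / (TP + FP) and recall is TP / (TP + FN). The sketch below counts true and false positives by matching scored detections to ground truth boxes. The IoU threshold of 0.5 and the greedy, highest-score-first matching are assumptions (a common convention, not something the slides specify), and the boxes and scores are made-up example data.

```python
def iou(a, b):
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def precision_recall(detections, ground_truth, iou_threshold=0.5):
    """detections: list of (score, box); ground_truth: list of boxes."""
    matched = set()  # ground truth boxes already claimed by a detection
    tp = fp = 0
    for _, box in sorted(detections, key=lambda d: d[0], reverse=True):
        best_iou, best_idx = 0.0, None
        for idx, gt_box in enumerate(ground_truth):
            overlap = iou(box, gt_box)
            if idx not in matched and overlap > best_iou:
                best_iou, best_idx = overlap, idx
        if best_idx is not None and best_iou >= iou_threshold:
            matched.add(best_idx)   # true positive
            tp += 1
        else:
            fp += 1                 # false positive
    fn = len(ground_truth) - len(matched)  # missed objects
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


ground_truth = [(0, 0, 10, 10), (20, 20, 30, 30)]
detections = [(0.9, (0, 0, 10, 10)),
              (0.6, (50, 50, 60, 60)),
              (0.2, (21, 21, 30, 30))]
precision, recall = precision_recall(detections, ground_truth)
```

Here two of the three detections match a ground truth box and both ground truth boxes are found, so precision is 2/3 and recall is 1.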
