Fast Methods For Deep Learning Based Object Detection

This document summarizes problems with the R-CNN object detection method and introduces Fast R-CNN and Faster R-CNN as improved methods. R-CNN training is slow and requires extracting deep learning features for each object proposal. Fast R-CNN improves on this by only extracting features once per image and using ROI pooling to classify and regress proposals. Faster R-CNN further speeds up detection by adding a Region Proposal Network to generate proposals, removing the need for an external proposal method. It enables end-to-end training of the whole system.

Uploaded by

seul alone

Fast Methods for Deep Learning based Object Detection
R-CNN: Problems

● Training is a multi-stage pipeline.

○ R-CNN first finetunes a ConvNet on object proposals using log loss.
○ Then, it fits SVMs to ConvNet features. These SVMs act as object detectors, replacing the softmax
classifier learnt by fine-tuning.
○ In the third training stage, bounding-box regressors are learned.
● Training is expensive in space and time.
○ For SVM and bounding-box regressor training, features are extracted from each object proposal in
each image and written to disk.
○ With very deep networks, such as VGG16, this process takes 2.5 GPU-days for the 5k images of the
VOC07 trainval set. These features require hundreds of gigabytes of storage.
● Object detection is slow.
○ At test-time, features are extracted from each object proposal in each test image.
○ Detection with VGG16 takes 47s / image (on a GPU).
Fast R-CNN

● Only calculate features once.

● RoI pooling layer extracts a fixed-length feature vector for each proposal.
● Classify and regress bounding boxes with a multi-task loss for end-to-end
training.
Fast R-CNN: ROI Pooling
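A minimal sketch of RoI max pooling on a single-channel feature map, in plain Python. It shows how proposals of different sizes are reduced to the same fixed-size grid. The function name `roi_max_pool`, the 2x2 output size, and the floor/ceil bin boundaries are illustrative assumptions, not necessarily what the slides' figures or the reference implementation use.

```python
import math

def roi_max_pool(feature_map, roi, out_h=2, out_w=2):
    """Max-pool one RoI of a 2-D feature map into a fixed out_h x out_w grid.

    feature_map: list of rows (H x W) of numbers.
    roi: (x0, y0, x1, y1) in feature-map cells, start inclusive, end exclusive.
    """
    x0, y0, x1, y1 = roi
    roi_h, roi_w = y1 - y0, x1 - x0
    pooled = []
    for i in range(out_h):
        # Floor for the bin start and ceil for the bin end, so the bins
        # cover the whole RoI even when it does not divide evenly.
        ys = y0 + math.floor(i * roi_h / out_h)
        ye = y0 + math.ceil((i + 1) * roi_h / out_h)
        row = []
        for j in range(out_w):
            xs = x0 + math.floor(j * roi_w / out_w)
            xe = x0 + math.ceil((j + 1) * roi_w / out_w)
            row.append(max(feature_map[y][x]
                           for y in range(ys, ye)
                           for x in range(xs, xe)))
        pooled.append(row)
    return pooled

# A 4 x 6 feature map with values 0..23, and two RoIs of different sizes.
fm = [[r * 6 + c for c in range(6)] for r in range(4)]
print(roi_max_pool(fm, (1, 0, 6, 3)))  # 3 x 5 RoI -> 2 x 2 output
print(roi_max_pool(fm, (0, 0, 2, 4)))  # 4 x 2 RoI -> 2 x 2 output
```

Both calls return a 2x2 grid, which is what lets the fully connected layers after the pooling step take a constant-length input.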
Fast R-CNN

● Instead of SVM + bounding box regression:

○ SoftMax classifier output
○ Bounding box regression output
● Multi-task training:
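The multi-task objective can be sketched for a single RoI as a classification log loss plus a smooth-L1 box-regression loss, with the box term switched off for background RoIs. This is a plain-Python illustration under assumptions: the function names, the weight `lam`, and the convention that class 0 is background are chosen here for the example.

```python
import math

def smooth_l1(x):
    """Smooth L1 penalty: quadratic near zero, linear for large errors."""
    ax = abs(x)
    return 0.5 * x * x if ax < 1.0 else ax - 0.5

def multi_task_loss(class_probs, true_class, pred_box, target_box, lam=1.0):
    """Classification log loss plus box-regression loss for one RoI.

    class_probs: softmax output over K+1 classes (index 0 = background).
    pred_box / target_box: 4 regression offsets for the true class.
    """
    cls_loss = -math.log(class_probs[true_class])
    # Background RoIs have no ground-truth box, so they add no box loss.
    if true_class == 0:
        return cls_loss
    loc_loss = sum(smooth_l1(p - t) for p, t in zip(pred_box, target_box))
    return cls_loss + lam * loc_loss

# Foreground RoI: both terms contribute.
print(multi_task_loss([0.5, 0.5], 1, [0, 0, 0, 0], [0.5, 0, 0, 2.0]))
# Background RoI: only the classification term contributes.
print(multi_task_loss([0.5, 0.5], 0, [0, 0, 0, 0], [1, 1, 1, 1]))
```

Because both terms are summed into one scalar, a single backward pass updates the classifier, the regressor, and the shared convolutional layers together.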
Fast R-CNN

● Advantages
○ Training is single-stage, using a multi-task loss
○ Training can update all network layers
○ No disk storage is required for feature caching
○ More accurate: 66.9 mAP vs. 66.0 mAP for R-CNN
○ Faster training: 9.5h vs. 84h (8.8x)
○ Faster test time per image: 0.32s vs. 47s (146x)
● Problem
○ These test times do not include region proposals.
○ Test time with region proposals: 2s vs. 50s (25x)
● Solution
○ Make the CNN do region proposals too!
Faster R-CNN
● Faster R-CNN: Towards Real-Time Object Detection
with Region Proposal Networks (2015)
○ Shaoqing Ren, Kaiming He, Ross Girshick, Jian Sun
● Insert a Region Proposal Network (RPN) after the
last convolutional layer.
● RPN trained to produce region proposals directly;
no need for external region proposals!
● After RPN, use RoI Pooling and a downstream
classifier and bbox regressor just like Fast R-CNN.
Faster R-CNN: RPN
● Slide a small window on the already computed
feature map (FREE!).
● Build a small network for:
○ Classifying object or not-object, and
○ Regressing bbox locations
● Position of the sliding window provides
localization information with reference to the
image.
● Box regression provides finer localization
information with reference to this sliding
window
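The two sources of localization described above (window position, then box regression relative to it) can be sketched as follows. The stride of 16, the three scales, the three aspect ratios, and all function names are assumptions chosen for illustration; the offset parameterization (centre shifts scaled by anchor size, log-space width and height) is the commonly used one.

```python
import math

def window_centre(col, row, stride=16):
    """Map a feature-map cell to image coordinates (centre of the cell)."""
    return (col * stride + stride / 2, row * stride + stride / 2)

def anchors_at(cx, cy, scales=(128, 256, 512), ratios=(0.5, 1.0, 2.0)):
    """Return (cx, cy, w, h) reference boxes centred on one window position."""
    boxes = []
    for s in scales:
        for r in ratios:
            # Keep the area at s * s while setting h / w = r.
            w = s / math.sqrt(r)
            h = s * math.sqrt(r)
            boxes.append((cx, cy, w, h))
    return boxes

def decode(anchor, deltas):
    """Apply regression offsets (tx, ty, tw, th) to a reference box."""
    ax, ay, aw, ah = anchor
    tx, ty, tw, th = deltas
    return (ax + tx * aw, ay + ty * ah, aw * math.exp(tw), ah * math.exp(th))

cx, cy = window_centre(col=2, row=3)           # coarse location from the window
anchor = anchors_at(cx, cy)[4]                 # scale 256, ratio 1.0
print(decode(anchor, (0.1, -0.2, 0.0, 0.0)))   # finer location from regression
```

Zero offsets return the reference box unchanged, so the regressor only has to learn small corrections around each window position.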
Faster R-CNN: Training
● In the paper: Ugly pipeline
○ Use alternating optimization to train RPN, then Fast
R-CNN with RPN proposals, etc.
○ More complex than it has to be
● Since publication: Joint training!
○ One network, four losses
■ RPN classification (anchor good / bad)
■ RPN regression (anchor -> proposal)
■ Fast R-CNN classification (over classes)
■ Fast R-CNN regression (proposal -> box)
How Many Anchors Do We Need?
How Many Proposals Do We Need?

● Fast R-CNN used 2000 proposals from selective search.

● Faster R-CNN needs only 300 proposals from the RPN.
● RPN is better than selective search
○ Deep learning vs. classical computer vision
○ Optimized for this task
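One way to cut scored RPN outputs down to a few hundred proposals is greedy non-maximum suppression followed by a top-N cap. The slide does not say how the 300 proposals are selected, so the NMS step, the 0.7 overlap threshold, and the function names below are assumptions for illustration.

```python
def iou(a, b):
    """Intersection over union of two (x0, y0, x1, y1) boxes."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = ((a[2] - a[0]) * (a[3] - a[1])
             + (b[2] - b[0]) * (b[3] - b[1]) - inter)
    return inter / union if union > 0 else 0.0

def top_proposals(boxes, scores, top_n=300, nms_thresh=0.7):
    """Greedy NMS over scored boxes, then keep at most top_n survivors.

    Returns indices into `boxes`, highest score first.
    """
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        # Keep a box only if it does not overlap a better box too much.
        if all(iou(boxes[i], boxes[k]) <= nms_thresh for k in keep):
            keep.append(i)
            if len(keep) == top_n:
                break
    return keep

boxes = [(0, 0, 10, 10), (0, 1, 10, 11), (20, 20, 30, 30)]
scores = [0.9, 0.8, 0.7]
print(top_proposals(boxes, scores))           # second box is suppressed
print(top_proposals(boxes, scores, top_n=1))  # cap at one proposal
```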
How Much Data Do We Need?
Also Read:
R-FCN: Object Detection via Region-based Fully
Convolutional Networks
https://arxiv.org/abs/1605.06409
Another Approach For Speeding Up Proposals
Just Don’t Do It
Just RPN From Faster R-CNN

● Much faster than Faster R-CNN!

● But RPN had only object/not object classifier.
Add Classification!

● What about accuracy?

● How well does it handle different object scales?
Add More Scales!
Add More classifiers
SSD: Single Shot MultiBox Detector
Why Does Stride Matter?
● Smaller stride means more scanned
windows.
● Handles close objects better.
○ Need to have enough default boxes to do
accurate matching in each.
● Handles small objects better.
○ Better IoU with objects.
○ More positive windows per object.
● Too small a stride is bad
○ Too many windows means too many false
positives to filter.
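The trade-off can be made concrete by counting scanned windows for a few strides. This is a rough sketch: the 300-pixel square input, the strides, and the helper name `num_windows` are hypothetical values picked for the example.

```python
def num_windows(image_size, stride, boxes_per_location=1):
    """Count scanned windows for a square image at a given stride."""
    # One window position per stride step along each axis (ceil division).
    per_axis = -(-image_size // stride)
    return per_axis * per_axis * boxes_per_location

for stride in (32, 16, 8):
    print(stride, num_windows(300, stride))
```

Halving the stride roughly quadruples the number of windows, so the gain in coverage of small and closely spaced objects comes with about four times as many candidates to score and filter.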
Improving Accuracy

● Object detection data is unbalanced

○ 1-30 positive (object) boxes per image.
○ 8,000-25,000 negative (background) boxes per image, each a potential false positive.
● Solution
○ Resample negatives at a fixed positive-to-negative ratio (1:3)
● Not all negatives are equal!
○ Some are harder than others
● Better Solution
○ Hard negative mining: keep the worst-misclassified negatives (highest loss) at the same fixed ratio.
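A minimal sketch of hard negative mining at a 1:3 ratio, assuming each candidate box already has a classification loss and a positive/negative label. The function name and input format are illustrative assumptions, not taken from any particular detector's code.

```python
def hard_negative_mining(losses, is_positive, neg_pos_ratio=3):
    """Select all positives plus the highest-loss negatives at a fixed ratio.

    losses: per-box classification loss; is_positive: per-box bool label.
    Returns the sorted indices of the boxes that contribute to training.
    """
    positives = [i for i, pos in enumerate(is_positive) if pos]
    negatives = [i for i, pos in enumerate(is_positive) if not pos]
    # Hardest negatives first: the ones the classifier gets most wrong.
    negatives.sort(key=lambda i: losses[i], reverse=True)
    num_neg = neg_pos_ratio * len(positives)
    return sorted(positives + negatives[:num_neg])

losses = [0.2, 0.9, 0.1, 0.8, 0.05, 0.7, 0.3]
labels = [True, False, False, False, False, False, False]
print(hard_negative_mining(losses, labels))  # one positive, three hard negatives
```

Compared with sampling negatives at random, this spends the fixed negative budget on the background boxes that currently look most like objects.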
Improving Accuracy

● Not enough data?

● Solution: Data augmentation
○ Random horizontal flip
○ Random crop
○ Random color distortion
○ Random expansion
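For detection, geometric augmentations must transform the bounding boxes together with the pixels. Below is a small sketch of the random horizontal flip only; the image-as-list-of-rows representation, the `(x0, y0, x1, y1)` box format, and the function name are assumptions made to keep the example dependency-free.

```python
import random

def random_horizontal_flip(image, boxes, p=0.5, rng=random):
    """Flip an image (list of pixel rows) and its boxes with probability p.

    boxes: list of (x0, y0, x1, y1) in pixel coordinates.
    """
    if rng.random() >= p:
        return image, boxes
    width = len(image[0])
    flipped_image = [row[::-1] for row in image]
    # Mirror the x-coordinates; x0 and x1 swap roles so x0 < x1 still holds.
    flipped_boxes = [(width - x1, y0, width - x0, y1)
                     for (x0, y0, x1, y1) in boxes]
    return flipped_image, flipped_boxes

image = [[1, 2, 3, 4]]
boxes = [(0, 0, 1, 1)]
print(random_horizontal_flip(image, boxes, p=1.0))  # always flips
```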
How Much Does It Help?
Also Read:
YOLO9000: Better, Faster, Stronger
https://arxiv.org/abs/1612.08242
Speed/accuracy factors in object detectors

● Algorithm: Faster R-CNN / SSD / R-FCN / YOLO / ...

● Backbone: VGG16 / ResNet / MobileNet / etc…
● Input size
● Many other hyperparameters...
Speed/accuracy trade-offs for modern convolutional object
detectors (Google)
Frameworks

● Caffe
○ Faster R-CNN: https://github.com/rbgirshick/py-faster-rcnn
○ SSD: https://github.com/weiliu89/caffe/tree/ssd
● TensorFlow Object Detection API:
○ https://github.com/tensorflow/models/tree/master/research/object_detection
● Detectron:
○ https://github.com/facebookresearch/Detectron
● Many more re-implementations in different languages...
Honorable mentions

● VGG16: https://arxiv.org/abs/1409.1556
● ResNet: https://arxiv.org/abs/1512.03385
● Inception-ResNet: https://arxiv.org/abs/1602.07261
● ResNeXt: https://arxiv.org/abs/1611.05431
● Xception: https://arxiv.org/abs/1610.02357
● DenseNet: https://arxiv.org/abs/1608.06993
● MobileNet: https://arxiv.org/abs/1704.04861
● SqueezeNet: https://arxiv.org/abs/1602.07360
Looking for brilliant researchers

[email protected]
