R-CNN (Object Detection) - A Beginners Guide To One of The Most - by Sharif Elfouly - Medium

The document summarizes the R-CNN object detection system. It describes how R-CNN works by first generating region proposals from an input image using selective search. Each proposal is then fed into a pretrained CNN to extract features. These features are classified and bounded using independently trained SVMs and a bounding box regressor. The outputs are post-processed using non-maximum suppression to generate the final detected objects and bounding boxes. R-CNN pioneered the use of region-based object detection but had limitations that were addressed by later methods like Fast R-CNN.

11/24/2020 R-CNN (Object Detection).

A beginners guide to one of the most… | by Sharif Elfouly | Medium

R-CNN (Object Detection)

A beginners guide to one of the most fundamental concepts in object detection.

Sharif Elfouly Jul 16, 2019 · 5 min read

When the paper “Rich feature hierarchies for accurate object detection and semantic
segmentation” came out of UC Berkeley in 2014, no one could have predicted its impact.
Five years later it has nearly 9000 citations. In this paper, the authors introduced a
concept that underlies many modern object detection networks: combining region
proposals with CNNs. They called this method R-CNN.

This is the first entry in a three-part series covering R-CNN, Fast R-CNN and Faster
R-CNN. You will need to understand the intuition behind the concepts here to
follow the later articles. Remember, knowing the fundamentals well is much
more important than “half-knowing” modern state-of-the-art approaches.

R-CNN System

The R-CNN system

The problem the R-CNN system tries to solve is locating objects in an image (object
detection). How would you approach this? You could start with a sliding window
approach: go over the whole image with rectangles of different sizes and look at
each of those smaller images in a brute-force manner. The problem is that you end up
with a giant number of smaller images to look at. Luckily, other smart people have
developed algorithms that choose those regions more cleverly, the so-called region
proposals. To simplify this concept:

Region proposals are just smaller parts of the original image that we think could contain
the objects we are searching for.

Region proposals
There are different region proposal algorithms we can choose from. These are “normal”
algorithms that work out of the box; we don’t have to train them. In the case
of this paper, the authors use the selective search method to generate region proposals.
I found a very good and detailed explanation of how the algorithm works here. But keep in
mind:

The R-CNN is agnostic to the region proposal method.

You can choose any method you like and it would work either way.

Selective search produces around 2000 region proposals per image that we will have to
look at. This sounds like a big number, but it’s still very small compared to the
brute-force sliding window approach.
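
To get a feeling for the difference, here is a small sketch that counts how many windows a brute-force scan would visit. The image size, window sizes and stride are made-up values for illustration; they are not taken from the paper.

```python
def count_sliding_windows(image_w, image_h, window_sizes, stride):
    """Count the windows a brute-force sliding-window scan would visit."""
    total = 0
    for win_w, win_h in window_sizes:
        if win_w > image_w or win_h > image_h:
            continue  # this window does not fit inside the image
        steps_x = (image_w - win_w) // stride + 1
        steps_y = (image_h - win_h) // stride + 1
        total += steps_x * steps_y
    return total


# Hypothetical setup: a 500x375 image, three square window sizes, stride of 4 pixels.
windows = count_sliding_windows(500, 375, [(64, 64), (128, 128), (256, 256)], stride=4)
print(windows, "sliding windows vs. roughly 2000 region proposals")
```

Even with only three window sizes the count is well above 2000, and it grows quickly once you add more sizes, more aspect ratios or a smaller stride.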

CNN
In the next step, we take each region proposal and create a feature vector that
represents it in a much lower-dimensional form, using a Convolutional Neural
Network (CNN).

AlexNet

The authors use AlexNet as a feature extractor. Don’t forget it’s 2014 and AlexNet is
still kind of state-of-the-art (Oh, how times have changed…).

One question we need to answer:

If we only use AlexNet as a feature extractor, how do we train this thing?

Well, this is one fundamental issue with the R-CNN system. You can’t train the whole
system in one go (this will be solved by the Fast R-CNN system). Instead, you need to
train every part independently. That means AlexNet was trained beforehand on a
classification task. After the training, the authors removed the last softmax layer, so
the last layer is now the fully connected 4096-dimensional one. This means that our
features are 4096-dimensional.

Another important thing to keep in mind is that the input to AlexNet always has the
same shape (227, 227, 3). The region proposals have different shapes though. Many of
them are smaller or larger than the required size, so we need to resize every region
proposal.
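
A minimal sketch of that resizing step, assuming a plain stretch of the crop to 227 x 227 with nearest-neighbour sampling. The function name `crop_and_warp` and the list-of-rows image format are my own choices for illustration, and details such as padding the box with surrounding context are left out.

```python
def crop_and_warp(image, box, out_size=227):
    """Crop `box` from `image` and stretch it to out_size x out_size.

    `image` is a list of rows, each row a list of pixels.
    `box` is (x1, y1, x2, y2) with x2 and y2 exclusive.
    Nearest-neighbour sampling is used; the aspect ratio is not preserved.
    """
    x1, y1, x2, y2 = box
    crop_w, crop_h = x2 - x1, y2 - y1
    warped = []
    for out_y in range(out_size):
        src_y = y1 + (out_y * crop_h) // out_size  # nearest source row
        row = image[src_y]
        warped.append(
            [row[x1 + (out_x * crop_w) // out_size] for out_x in range(out_size)]
        )
    return warped


# Toy 40x30 "image" whose pixels store their own coordinates as (x, y, 0).
image = [[(x, y, 0) for x in range(40)] for y in range(30)]
patch = crop_and_warp(image, (5, 10, 25, 20))
print(len(patch), len(patch[0]), len(patch[0][0]))
```

Whatever the shape of the proposal, the result always has the fixed shape the CNN expects.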

To summarize the task of the CNN:

Input and output of the CNN

SVM

We created feature vectors from the region proposals. Now we need to classify those
feature vectors: we want to detect what class of object each feature vector represents.
For this, we use SVM classifiers. We have one SVM for each object class and we use
them all. This means that for one feature vector we get n outputs, where n is the
number of different object classes we want to detect. Each output is a confidence score:
how confident are we that this particular feature vector represents this class?
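
Here is a small sketch of that scoring step, assuming linear SVMs that are already trained. The weights, biases and the 3-dimensional feature vector are invented toy values; real R-CNN features have 4096 dimensions.

```python
def svm_scores(feature, svms):
    """Return one confidence score per class for a single feature vector.

    `svms` maps a class name to the (weights, bias) of a trained linear SVM.
    """
    scores = {}
    for class_name, (weights, bias) in svms.items():
        scores[class_name] = sum(w * f for w, f in zip(weights, feature)) + bias
    return scores


# Made-up weights for three classes, purely for illustration.
toy_svms = {
    "cat": ([1.0, -1.0, 0.0], -0.5),
    "dog": ([-1.0, 1.0, 0.0], 0.0),
    "car": ([0.0, 0.0, 2.0], -1.0),
}
scores = svm_scores([2.0, 0.5, 0.25], toy_svms)
best_class = max(scores, key=scores.get)
print(scores, best_class)
```

One feature vector goes in and n scores come out, one per class.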

The thing that confused me when I read this paper for the first time was how those
different SVMs are trained. Well, we train them on feature vectors created by AlexNet.
That means we have to wait until the CNN is fully trained before we can train the
SVMs, so the stages cannot be trained in parallel. Because we know during training
which feature vector represents which class, we can easily train the different SVMs in
a supervised way.

To summarize:
1. We created different image proposals from one image.

2. Then we created a feature vector from those proposals using the CNN.

3. In the end we classified each feature vector with the SVM’s for each object class.
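
The three steps above can be sketched as a single function. Everything passed in (`propose_regions`, `extract_features`, `classifiers`) is a hypothetical placeholder for a component that is chosen or trained separately; the stubs at the bottom only exist so the example runs.

```python
def detect(image, propose_regions, extract_features, classifiers):
    """Run the three stages on one image.

    Returns a list of (box, class_name, score) triples, one per
    proposal and class. All callables are supplied by the caller.
    """
    detections = []
    for box in propose_regions(image):            # step 1: region proposals
        feature = extract_features(image, box)    # step 2: CNN feature vector
        for class_name, score_fn in classifiers.items():
            # step 3: one SVM score per object class
            detections.append((box, class_name, score_fn(feature)))
    return detections


# Stub components, invented for illustration only.
stub_proposals = lambda image: [(0, 0, 10, 10), (5, 5, 20, 20)]
stub_features = lambda image, box: [float(box[2] - box[0])]
stub_classifiers = {"cat": lambda f: f[0] - 12.0, "dog": lambda f: 12.0 - f[0]}

results = detect(None, stub_proposals, stub_features, stub_classifiers)
print(results)
```

Note that nothing here is trained jointly: each stage just consumes the output of the previous one.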

The output:

Now we have region proposals that are scored for every object class. How do we bring
them all back to the image? We use something called greedy non-maximum
suppression. This is a fancy term for the following concept:

We reject a region (image proposal) if it has an intersection-over-union (IoU) overlap with a
higher scoring selected region larger than a learned threshold.

For overlapping regions, we keep the proposal with the higher score (calculated by the
SVM) and reject the others. We do this step for each object class independently.
Afterwards we only keep regions with a score higher than 0.5.
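
A minimal sketch of greedy non-maximum suppression for a single class. Boxes are assumed to be (x1, y1, x2, y2) corner coordinates, and the IoU threshold of 0.3 is just an illustrative default, not a value taken from the paper.

```python
def iou(a, b):
    """Intersection-over-union of two boxes given as (x1, y1, x2, y2)."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def greedy_nms(boxes, scores, iou_threshold=0.3):
    """Return the indices of the boxes kept for ONE class, best score first."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    kept = []
    for i in order:
        # Keep box i only if it does not overlap too much with any
        # higher-scoring box that was already selected.
        if all(iou(boxes[i], boxes[k]) <= iou_threshold for k in kept):
            kept.append(i)
    return kept


boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (20, 20, 30, 30)]
scores = [0.9, 0.8, 0.7]
kept = greedy_nms(boxes, scores)
print(kept)
```

In this toy example the second box overlaps heavily with the first one and has a lower score, so it is rejected; the third box does not overlap with anything and survives. To handle several classes you would call `greedy_nms` once per class.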

Bounding Box Regressor (optional)

I want to mention the Bounding Box Regressor at the end because it is not a
fundamental building block of the R-CNN system. It’s a great idea though, and the
authors found that it improves the average precision by about 3%. So how does it work?

When you train the Bounding Box Regressor, each training example pairs a region
proposal, given by its center, width and height in pixels, with its label, the
ground-truth bounding box. The goal as stated in the paper is:

Our goal is to learn a transformation that maps a proposed box P to a ground-truth box G.
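
To make this transformation concrete, here is a sketch of the parameterization as I understand it from the paper: the center is shifted by an offset scaled with the proposal's width and height, and the width and height are rescaled in log-space. Boxes are (center_x, center_y, width, height); the function names are my own, and the part that actually predicts the offsets from CNN features is left out.

```python
import math


def regression_targets(p, g):
    """Targets (t_x, t_y, t_w, t_h) that map proposal p onto ground truth g.

    Both boxes are (center_x, center_y, width, height) in pixels.
    """
    px, py, pw, ph = p
    gx, gy, gw, gh = g
    return ((gx - px) / pw, (gy - py) / ph, math.log(gw / pw), math.log(gh / ph))


def apply_transform(p, d):
    """Apply predicted offsets d = (d_x, d_y, d_w, d_h) to proposal p."""
    px, py, pw, ph = p
    dx, dy, dw, dh = d
    return (pw * dx + px, ph * dy + py, pw * math.exp(dw), ph * math.exp(dh))


proposal = (50.0, 50.0, 20.0, 40.0)
ground_truth = (55.0, 48.0, 30.0, 36.0)
targets = regression_targets(proposal, ground_truth)
recovered = apply_transform(proposal, targets)
print(targets, recovered)
```

If the regressor predicted the targets perfectly, applying them to the proposal would give back the ground-truth box, which is what the round trip above checks.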

Conclusion
Another interesting discovery made in this paper is that it is highly effective to
pre-train the CNN on a task with a lot of data (for example image classification) and
then fine-tune the network for the actual task, which here is object detection.

I really believe that it is necessary to understand the concepts presented in this
paper to fully grasp more modern approaches in object detection. I hope this article
helped you understand those ideas and, more importantly, helped you get an intuition
for the different parts of the R-CNN system. The next part of this series will be about
Fast R-CNN, which builds directly on top of the R-CNN system.

Thank you for reading and keep up the learning!

If you want more and want to stay up to date, you can find me here:

@elfouly_sharif

GitHub
