RetinaNet Model for object detection explanation



Introduction:-
Lately, the RetinaNet model for object detection has been a buzzword in the deep learning community.

And why should it not be? Object detection is a tremendously important field in computer vision. People are using object detection methods for autonomous driving, video surveillance, medical applications, and many other business problems.

Table of Contents:-
1. What is RetinaNet
2. Why RetinaNet was needed
3. Architecture of RetinaNet
   - Backbone Network
   - Subnetwork for object Classification
   - Subnetwork for object Regression
4. Focal Loss
5. Final Notes

In this article, I'll introduce you to the architecture of the RetinaNet model and how it works. Cherry on top? In the next article, we'll build a "Face mask detector" using RetinaNet to help us in this ongoing pandemic.

What is RetinaNet Model:-

The Facebook AI Research (FAIR) team introduced the RetinaNet model with the aim of tackling dense and small object detection. For this reason, it has become a popular object detection model for use with aerial and satellite imagery as well.

RetinaNet makes two key improvements over existing single-stage object detection models:

- Feature Pyramid Networks (FPN)
- Focal Loss

Need of RetinaNet Model:-

Both classic one-stage detection methods, like boosted detectors and DPM, and more recent methods like SSD evaluate roughly 10^4 to 10^5 candidate locations per image, but only a few of these locations contain objects (i.e., foreground) while the rest are just background. This leads to a class imbalance problem.

This class imbalance turns out to be the central cause of the inferior performance of one-stage detectors compared to two-stage detectors.

Hence, the researchers introduced the RetinaNet model with a concept called Focal Loss to compensate for the class imbalance and inconsistencies of single-shot object detectors like YOLO and SSD when dealing with extreme foreground-background imbalance.

Architecture of RetinaNet Model:-

In essence, we can break down the RetinaNet architecture into the following 3 components:

1. Backbone Network (i.e., bottom-up pathway + top-down pathway with lateral connections, e.g., ResNet + FPN)
2. Sub-network for object Classification
3. Sub-network for object Regression


Figure 1: RetinaNet Model Architecture (Source)

For better understanding, let's look at each component of the architecture separately –

The Backbone Network:-

Bottom-up pathway – The bottom-up pathway (e.g., ResNet) is used for feature extraction. It computes feature maps at several scales, irrespective of the input image size.

Top-down pathway with lateral connections – The top-down pathway upsamples the spatially coarser feature maps from higher pyramid levels, and the lateral connections merge the top-down layers with the bottom-up layers of the same spatial size.

Higher-level feature maps have lower resolution but are semantically stronger, and are therefore more suitable for detecting larger objects; conversely, lower-level feature maps have higher resolution and are hence better at detecting smaller objects.

So, by combining the top-down pathway and its lateral connections with the bottom-up pathway, which does not require much extra computation, every level of the resulting feature pyramid can be both semantically and spatially strong.

Hence this architecture is scale-invariant and can provide better performance in terms of both speed and accuracy.
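
To make this concrete, here is a minimal PyTorch sketch of a single top-down merge step; the layer names and channel defaults are my own assumptions for illustration, not the paper's code:

```python
import torch.nn as nn
import torch.nn.functional as F

class FPNMerge(nn.Module):
    """One top-down merge step of an FPN (illustrative sketch)."""

    def __init__(self, bottom_up_channels, fpn_channels=256):
        super().__init__()
        # 1x1 lateral connection: project the bottom-up map to the FPN width
        self.lateral = nn.Conv2d(bottom_up_channels, fpn_channels, kernel_size=1)
        # 3x3 conv to smooth the merged map (reduces upsampling artifacts)
        self.smooth = nn.Conv2d(fpn_channels, fpn_channels, kernel_size=3, padding=1)

    def forward(self, top_down, bottom_up):
        # Upsample the coarser top-down map to the bottom-up spatial size
        upsampled = F.interpolate(top_down, size=bottom_up.shape[-2:], mode="nearest")
        # Merge by element-wise addition with the lateral projection
        merged = upsampled + self.lateral(bottom_up)
        return self.smooth(merged)
```

Applying this step from the top of the pyramid downwards yields feature maps that are both high-resolution and semantically strong.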

Sub-network for object Classification:-

A fully convolutional network (FCN) is attached to each FPN level for object classification. As shown in the diagram above, this subnetwork consists of 3×3 convolutional layers with 256 filters, followed by a final 3×3 convolutional layer with K×A filters. Hence the output feature map is of size W×H×KA, where W and H are proportional to the width and height of the input feature map, and K and A are the number of object classes and anchor boxes respectively.

Finally, the researchers used a sigmoid layer (not softmax) for object classification.

The reason the last convolutional layer has KA filters is that, if there are A anchor box proposals for each position in the feature map, then each anchor box can be classified into any of K classes. So the output feature map has KA channels (filters).
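
As a rough PyTorch sketch of what this description implies (my own illustrative code, not the authors' implementation), the classification subnetwork could look like this:

```python
import torch.nn as nn

def make_class_subnet(in_channels=256, num_classes=80, num_anchors=9):
    """Classification head shared across FPN levels (illustrative sketch).
    Output has K*A channels: one score per class per anchor at each position."""
    layers = []
    for _ in range(4):  # a stack of 3x3 convs with 256 filters each
        layers += [nn.Conv2d(in_channels, 256, kernel_size=3, padding=1),
                   nn.ReLU(inplace=True)]
        in_channels = 256
    # Final 3x3 conv with K*A filters; sigmoid (not softmax) gives per-class scores.
    # In training one would usually keep raw logits and fold the sigmoid into the loss.
    layers += [nn.Conv2d(256, num_classes * num_anchors, kernel_size=3, padding=1),
               nn.Sigmoid()]
    return nn.Sequential(*layers)
```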


Sub-network for object Regression:-

The regression subnetwork is attached to each feature map of the FPN in parallel to the classification subnetwork. Its design is identical to that of the classification subnet, except that the last 3×3 convolutional layer has 4A filters, resulting in an output feature map of size W×H×4A.

The last convolutional layer has 4A filters because, in order to localize objects, the regression subnetwork produces 4 numbers for each anchor box, predicting the relative offset (in terms of center coordinates, width, and height) between the anchor box and the ground-truth box. Therefore, the output feature map of the regression subnet has 4A channels.
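
A matching sketch for the regression head (same assumptions as above) differs only in its final layer, which emits 4 offsets per anchor and uses no activation:

```python
import torch.nn as nn

def make_box_subnet(in_channels=256, num_anchors=9):
    """Box regression head (illustrative sketch). Output has 4*A channels:
    (dx, dy, dw, dh) offsets for each anchor at each position."""
    layers = []
    for _ in range(4):
        layers += [nn.Conv2d(in_channels, 256, kernel_size=3, padding=1),
                   nn.ReLU(inplace=True)]
        in_channels = 256
    # Final 3x3 conv with 4*A filters; raw (linear) outputs, no activation
    layers.append(nn.Conv2d(256, 4 * num_anchors, kernel_size=3, padding=1))
    return nn.Sequential(*layers)
```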

By now we have a fair idea of the RetinaNet architecture. Now let's look at the most discussed topic of the RetinaNet model for object detection: Focal Loss.

Focal Loss:-
Focal Loss (FL) is an improved version of Cross-Entropy Loss (CE) that tries to handle the class imbalance problem by assigning more weight to hard or easily misclassified examples (e.g., background with noisy texture, a partial object, or the object of our interest) and down-weighting easy examples (e.g., clear background).

So Focal Loss reduces the loss contribution from easy examples and increases the importance of correcting misclassified examples.

Focal loss is an extension of the cross-entropy loss function that down-weights easy examples and focuses training on hard negatives. To achieve this, the researchers added a modulating factor (1 − p_t)^γ to the cross-entropy loss, with a tunable focusing parameter γ ≥ 0.

The RetinaNet object detection method uses an α-balanced variant of the focal loss, where α = 0.25 and γ = 2 work best.


Figure 2: Focal loss vs. probability of the ground-truth class (Source)

So one can define the focal loss as –

FL(p_t) = −α_t (1 − p_t)^γ log(p_t)

The focal loss is visualized for several values of γ ∈ [0, 5]; refer to Figure 2.
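
As a minimal sketch of the formula above (assuming binary targets and sigmoid probabilities; a numerically hardened version would work with logits instead), the α-balanced focal loss can be written as:

```python
import torch

def focal_loss(probs, targets, alpha=0.25, gamma=2.0):
    """alpha-balanced focal loss: FL(p_t) = -alpha_t * (1 - p_t)^gamma * log(p_t).
    probs: sigmoid probabilities in (0, 1); targets: 0/1 tensor of the same shape."""
    # p_t: the predicted probability of the ground-truth class
    p_t = torch.where(targets == 1, probs, 1 - probs)
    # alpha_t: weight positives by alpha, negatives by (1 - alpha)
    alpha_t = torch.where(targets == 1,
                          torch.full_like(probs, alpha),
                          torch.full_like(probs, 1 - alpha))
    # Modulating factor (1 - p_t)^gamma down-weights easy examples
    loss = -alpha_t * (1 - p_t) ** gamma * torch.log(p_t.clamp(min=1e-8))
    return loss.mean()
```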

Focal Loss characteristics:-

Note the following properties of the focal loss:

1. When an example is misclassified and p_t is small, the modulating factor is near 1 and the loss is unaffected.
2. As p_t → 1, the factor goes to 0 and the loss for well-classified examples is down-weighted.
3. The focusing parameter γ smoothly adjusts the rate at which easy examples are down-weighted.
4. As γ is increased, the effect of the modulating factor likewise increases. (After a lot of experiments and trials, the researchers found γ = 2 to work best.)

Note:- When γ = 0, FL is equivalent to CE (shown as the blue curve in Figure 2).
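
A quick numeric check of the modulating factor (1 − p_t)^γ for γ = 2 illustrates properties 1 and 2:

```python
# Modulating factor (1 - p_t)^gamma for gamma = 2 (illustrative)
for p_t in (0.1, 0.5, 0.9, 0.99):
    print(f"p_t = {p_t}: factor = {(1 - p_t) ** 2:.4f}")
# p_t = 0.1  -> 0.8100 (hard example: loss nearly unchanged)
# p_t = 0.99 -> 0.0001 (easy example: loss cut by 10,000x)
```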

You can read about Focal Loss in detail in this article, where I've talked about the evolution of cross-entropy into Focal Loss, the need for Focal Loss, and a comparison of Focal Loss with cross-entropy.


And, cherry on top, I've used a couple of examples to explain why Focal Loss is better than cross-entropy.

End Points:-
RetinaNet is a powerful model that uses a Feature Pyramid Network with ResNet as its backbone.

In general, RetinaNet is a good choice for starting an object detection project, particularly if you need to get good results quickly. In the next article, we'll build a solution using the RetinaNet model.

If you've enjoyed this article, leave a few claps; it will encourage me to explore further machine learning topics 🙂

References:-
https://arxiv.org/abs/1605.06409
https://arxiv.org/pdf/1708.02002.pdf
https://developers.arcgis.com/python/guide/how-retinanet-works/
https://analyticsindiamag.com/what-is-retinanet-ssd-focal-loss/
https://github.com/fizyr/keras-retinanet
https://www.freecodecamp.org/news/object-detection-in-colab-with-fizyr-retinanet-efed36ac4af3/
https://deeplearningcourses.com/
https://blog.zenggyu.com/en/post/2018-12-05/retinanet-explained-and-demystified/

Article Credit:-
Name:- Praveen Kumar
Founder:- TowardsMachineLearning.Org
