Digging Into Sample Assignment Methods For Object Detection
Hiroto Honda
- homepage: https://hirotomusiker.github.io/
from: [H1]
How Object Detection Works
Example of 2-stage Detector [H1][3]: Faster R-CNN [1] + Feature Pyramid Network [2]
Object Detectors Decomposed
[Figure: an object detector decomposed into backbone, neck, and RoI head]
- 1-stage (single-shot) detector: backbone + neck
- 2-stage detector: backbone + neck + RoI head
from: [H1]
Region Proposal Network
from: [H1]
[Figure: RPN input and a visualization of an objectness channel (corresponding to one of three anchors)]
Anchors
from: [H1]
Anchors on Each Grid Cell
from: [H1]
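The anchors placed on each grid cell can be sketched as follows. This is a minimal illustration, not any library's implementation; the scale and aspect-ratio values are placeholders, and the ratio is taken as w/h here:

```python
import itertools

def make_anchors(scales, ratios):
    """Anchor (w, h) pairs for one grid cell: every scale x aspect-ratio
    combination, keeping each anchor's area equal to scale**2 (ratio = w/h)."""
    anchors = []
    for s, r in itertools.product(scales, ratios):
        w = s * r ** 0.5
        h = s / r ** 0.5
        anchors.append((w, h))
    return anchors

# e.g. 2 scales x 3 ratios -> 6 anchors per grid cell
make_anchors([32, 64], [0.5, 1.0, 2.0])
```

The same (w, h) set is replicated at every grid cell of the feature map, centered on that cell.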
IoU (Intersection over Union) of boxes A and B:
IoU = |A ∩ B| / |A ∪ B|
[Figure: example box pairs with IoU = 0.95 (high overlap) and IoU = 0.15 (low overlap)]
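The IoU definition above can be written as a small function, a minimal sketch assuming boxes in [x1, y1, x2, y2] corner format:

```python
def iou(box_a, box_b):
    """Intersection over Union of two [x1, y1, x2, y2] boxes."""
    # Intersection rectangle (empty if the boxes do not overlap)
    x1 = max(box_a[0], box_b[0])
    y1 = max(box_a[1], box_b[1])
    x2 = min(box_a[2], box_b[2])
    y2 = min(box_a[3], box_b[3])
    inter = max(0.0, x2 - x1) * max(0.0, y2 - y1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0
```

Identical boxes give IoU = 1, disjoint boxes give 0, and partial overlap falls in between.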
IoU Matrix for Anchor-GT Matching
The IoU matrix holds the IoU between every anchor (at each position) and every GT box. Each anchor is labeled by its maximum IoU over the GT boxes:
- foreground: IoU ≥ T1 (matched with a GT box)
- background: IoU < T2
- ignored: T2 ≤ IoU < T1
T1 and T2: predefined threshold values
from: [H1]
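The threshold rule above can be sketched as a small function. This is a simplified illustration (real RPN implementations also force-match the best anchor for each GT box, which is omitted here):

```python
def assign_anchors(iou_matrix, t1, t2):
    """Label each anchor from its max IoU over GT boxes.
    iou_matrix[a][g] = IoU(anchor a, GT box g).
    Returns (labels, matched_gt): label 1 = foreground, 0 = background,
    -1 = ignored (T2 <= IoU < T1); matched_gt[a] is the best GT index or None."""
    labels, matched_gt = [], []
    for ious in iou_matrix:
        best_gt = max(range(len(ious)), key=lambda g: ious[g])
        best_iou = ious[best_gt]
        if best_iou >= t1:
            labels.append(1)           # foreground: classification + regression loss
            matched_gt.append(best_gt)
        elif best_iou < t2:
            labels.append(0)           # background: classification loss only
            matched_gt.append(None)
        else:
            labels.append(-1)          # ignored: no loss
            matched_gt.append(None)
    return labels, matched_gt
```

With T1 = 0.7 and T2 = 0.3, an anchor whose best IoU is 0.8 becomes foreground, 0.4 is ignored, and 0.05 is background.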
Sample Assignment of RPN
Box Regression After Sample Assignment
Δx = (x - xa) / wa
Δy = (y - ya) / ha
Δw = log(w / wa)
Δh = log(h / ha)
from: [H1]
RPN learns relative size and location between GT boxes and anchors
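The four regression targets above can be computed directly, a minimal sketch assuming (center x, center y, width, height) box format:

```python
import math

def encode_box(gt, anchor):
    """Regression targets (dx, dy, dw, dh) from an anchor to its matched GT box.
    Both boxes are given as (center x, center y, width, height)."""
    x, y, w, h = gt
    xa, ya, wa, ha = anchor
    return ((x - xa) / wa,      # dx: center offset, normalized by anchor width
            (y - ya) / ha,      # dy: center offset, normalized by anchor height
            math.log(w / wa),   # dw: log scale ratio
            math.log(h / ha))   # dh: log scale ratio
```

A perfectly matching anchor yields (0, 0, 0, 0); the log scale makes size errors symmetric for growing and shrinking.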
RetinaNet / EfficientDet
[Figure: RetinaNet: backbone features (C2-C5), FPN neck with lateral additions (P2-P7), RetinaNetHead (stem, cls_subnet -> cls_score)]
EfficientDet
[Figure: EfficientDet: input image (BGR, H, W), EfficientNet backbone (C2-C5), BiFPN neck (P2-P7), RetinaNetHead (stem, cls_subnet -> cls_score)]
Sample Assignment of RetinaNet and EfficientDet
Same as RPN, except that the number of anchors and the IoU threshold values differ [3]. Each anchor is labeled by its maximum IoU with the GT boxes:
- foreground: IoU ≥ T1 (matched with a GT box)
- background: IoU < T2
- ignored: T2 ≤ IoU < T1 [RetinaNet only]
T1 and T2: predefined threshold values
YOLO v1 / v2 / v3 / v4 / v5
[Figure: darknet53 backbone with YOLO Layers on P3-P5]
v3 assignment:
- each GT box is matched with its single max-IoU anchor (e.g. the anchor with IoU 0.98 in the figure), which becomes foreground
- ignored: IoU between prediction and GT > T1 (T1: predefined threshold value)
v4 / v5 assignment:
- foreground (v4: IoU > T1, v5: box w, h ratio < Ta): objectness = 1, regression target
- background (v4: IoU ≤ T1, v5: box w, h ratio ≥ Ta): objectness = 0, no regression loss
YOLOv5 assigns three feature points for one target center -> higher recall
Target assignment differs greatly between the YOLO versions, so which one is the best?
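The v5 shape-based rule mentioned above can be sketched as follows. This is a simplified illustration; the default Ta value is an assumption (YOLOv5's `anchor_t` hyperparameter is commonly 4.0), and the real implementation vectorizes this over all anchors:

```python
def v5_shape_match(gt_wh, anchor_wh, ta=4.0):
    """YOLOv5-style shape matching: an anchor is a candidate for a GT box
    when neither the width nor the height ratio exceeds the threshold Ta,
    in either direction (GT/anchor or anchor/GT)."""
    rw = gt_wh[0] / anchor_wh[0]
    rh = gt_wh[1] / anchor_wh[1]
    # Worst-case ratio across both axes and both directions
    ratio = max(rw, 1.0 / rw, rh, 1.0 / rh)
    return ratio < ta
```

Because the test is a ratio rather than an IoU, one GT box can match several anchors of similar shape, which raises recall.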
“Anchor-Free” Detectors
- Assign all the grid cells that fall into the GT box, but only at the appropriate scale
- A 'center-ness' score is additionally used to suppress low-quality predictions far from the GT center
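The center-ness score mentioned above (as defined in FCOS) can be computed from a location's distances to the four sides of its GT box:

```python
import math

def centerness(l, t, r, b):
    """FCOS center-ness from a location's distances to the GT box's
    left/top/right/bottom sides: 1 at the box center, approaching 0
    as the location nears an edge."""
    return math.sqrt((min(l, r) / max(l, r)) * (min(t, b) / max(t, b)))
```

Multiplying this score into the classification confidence down-weights predictions from cells far from the GT center, so NMS removes them.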
Objects as Points (CenterNet)
IoU threshold = mean(IoUs) + std(IoUs)
[Figure: anchors split into positives, background (negative), and ignored]
- multiple anchors can be assigned to one GT
- high recall, but low-quality positives are included
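The adaptive threshold above can be sketched in a few lines. A minimal sketch: the use of the population standard deviation (`statistics.pstdev`) is an assumption about which std convention is meant:

```python
import statistics

def adaptive_iou_threshold(ious):
    """Adaptive positive/negative boundary for one GT box:
    the mean plus the standard deviation of its candidate anchors' IoUs.
    Anchors whose IoU exceeds this value become positives."""
    return statistics.mean(ious) + statistics.pstdev(ious)
```

Because the threshold adapts per GT box, easy objects (tight IoU distribution) and hard objects (spread-out IoUs) each get a sensible number of positives instead of a single fixed cut-off.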
Conclusion