
2020 20th International Conference on Control, Automation and Systems (ICCAS 2020)

Oct. 13-16, 2020; BEXCO, Busan, Korea

Performance Indicator Survey for Object Detection


Inho Park*, Sungho Kim
Department of Electronic Engineering, Yeungnam University, KyeungSan, 38541, Korea
(Tel: +82-053-810-4379; E-mail: [email protected], [email protected])

Abstract: In recent image processing, deep learning has been used beyond the object recognition problem, in areas such as object detection and semantic segmentation, and detection based on classic techniques has also been performed in various ways. These technologies are applied in systems such as factory automation, automatic target recognition (ATR), and autonomous driving. Object detection is performed over various categories, such as people, vehicles, and animals, and it operates in various situations involving different object sizes, image sizes, distance ranges from near to remote, changing environments, etc. To analyze these situations, indicators need to be used appropriately. Moreover, when researchers develop an object detection algorithm, it cannot be demonstrated without evaluation indicators. It is therefore important to know the performance indicators of object detection, of which many are in use. This paper introduces the performance indicators of object detection. The main purpose of the survey is to help researchers find the proper performance indicator for object detection, and to help them compare detection results across different algorithms exactly and effectively.

Keywords: Object Detection, Performance Indicator.

1. INTRODUCTION

In every research field, it is important to know about the performance indicators, because they are the basis for judging an algorithm correctly and for comparing whether an algorithm is better than a previous one. The performance indicators used in object detection vary.
Nowadays, many state-of-the-art object detection algorithms also contain object classification [1-3]. In this paper, performance indicators for object detection are divided into two parts: single class object detection and multiple class object detection, where the detection of multiple classes additionally includes object classification indicators.
The remainder of this paper is organized as follows: Section 2 introduces related work on object detection and performance indicators. Section 3 explains the performance indicators, divided into two parts: single class object detection indicators and multiple class object detection indicators. The paper is concluded in Section 4.

2. RELATED WORK

As many deep learning networks have been released recently, the boundary between object detection and recognition has weakened. Detectors like SSD [1], Faster R-CNN [2], and the recently released YOLOv4 [3] produce not only object bounding boxes but also classifications; including classification is the recent trend in object detectors.
Obviously, object detection means computer technology, in image processing or computer vision, that finds objects in an image. But object detection for multi-class objects sometimes includes object classification [1-3, 5]. In that case, we also need to consider classification indicators.
There are many indicators. Recall, precision, and F1-score are commonly used in object detection [4]. The PR curve is also important in many papers [4, 5]; the interpolated area under the PR curve can be used to compare one algorithm, or one class, with another. When researchers build an algorithm, it is very important to use performance indicators in the right place.

3. PERFORMANCE INDICATOR

The performance indicators for object detection are divided into two parts: single class object detection and multiple class object detection, where multiple class object detection sometimes contains object classification. Accordingly, three parts are written: single class object detection indicators, multiple class object detection indicators, and time/FPS. The multiple class part builds on the single class part.

3.1 Single Class Object Detection indicator

TP, FP, FN, TN: True Positive, False Positive, False Negative, True Negative.

- True Positive (TP): An object is detected, and the detection is true.
- False Positive (FP): An object is detected, but the detection is false.
- False Negative (FN): An object is not detected, and the non-detection is false (the object should have been detected).
- True Negative (TN): An object is not detected, and the non-detection is true.

FP corresponds to the false alarms and FN to the misses. The detection result depends on the IoU (Intersection over Union). For a positive result, the IoU between a predicted box and a ground truth box must be bigger than a threshold value, which is an adjustable parameter. In object detection, TN is usually not considered, because networks generate many default bounding boxes that do not match any ground truth box and thus belong to TN; SSD [1], for example, generates 8732 default bounding boxes. Taking TN into account would therefore make the detection result look poor and make it difficult to decide which algorithm is better. After the detection process, the box with the highest IoU score becomes the TP, and adjacent boxes with smaller scores become FP. Non-Maximum Suppression (NMS) [2, 6] can remove such FP, although some may remain after NMS. The COCO detection challenge [14, 15], an open challenge for proving detector performance, uses three kinds of IoU thresholds to measure AP (AP is explained later in this section): AP 0.5, AP 0.75, and AP 0.5:0.05:0.95 are used, where the last is the average of the APs over the 10 IoU thresholds from 0.5 to 0.95 at 0.05 intervals.

IoU = \frac{Area(B_p \cap B_{gt})}{Area(B_p \cup B_{gt})}   (1)
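As a minimal illustration of Eq. (1), the following Python sketch (an assumed helper, not code from the paper) computes the IoU of two axis-aligned boxes given as (x1, y1, x2, y2) corners and applies the adjustable threshold that decides whether a detection can count as a TP.

def iou(box_a, box_b):
    # Corners of the intersection rectangle.
    x1 = max(box_a[0], box_b[0])
    y1 = max(box_a[1], box_b[1])
    x2 = min(box_a[2], box_b[2])
    y2 = min(box_a[3], box_b[3])
    inter = max(0.0, x2 - x1) * max(0.0, y2 - y1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0

# A predicted box versus a ground truth box at IoU threshold 0.5:
pred, gt = (10, 10, 50, 50), (20, 20, 60, 60)
print(iou(pred, gt))          # about 0.391
print(iou(pred, gt) >= 0.5)   # False: not a positive match at this threshold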
PPV, NPV, FDR, FOR: The denominator component is based on the detection result [7].

- Positive Predictive Value (PPV, precision): Among positive results, those correctly detected.
- Negative Predictive Value (NPV): Among negative results, those correctly undetected.
- False Discovery Rate (FDR): Among positive results, those incorrectly detected.
- False Omission Rate (FOR): Among negative results, those incorrectly undetected.

PPV = precision = \frac{TP}{TP + FP}   (2)

FDR = 1 - precision = \frac{FP}{TP + FP}   (3)

PPV is the same as the precision that often appears in the precision-recall curve. NPV and FOR contain TN in the denominator, so they are not considered in object detection.

TPR, FPR, TNR, FNR: The denominator component is based on the boxes that should (or should not) be detected [7].

- True Positive Rate (TPR, recall, sensitivity): Among the boxes that should be detected (ground truth), those that are True Positives.
- False Positive Rate (FPR, 1 - specificity): Among the boxes that should not be detected, those that are False Positives.
- True Negative Rate (TNR, specificity): Among the boxes that should not be detected, those that are True Negatives.
- False Negative Rate (FNR, 1 - sensitivity, miss rate): Among the boxes that should be detected (ground truth), those that are False Negatives.

TPR = recall = sensitivity = \frac{TP}{TP + FN}   (4)

FNR = 1 - sensitivity = \frac{FN}{TP + FN}   (5)

TPR, called 'recall' or 'sensitivity', is usually used in the detection field, and FPR is often used for the TPR/FPR (ROC) graph; but these belong to the binary classification problem. Because of the TN issue above, the object detection field ignores FPR and TNR and considers TPR and FNR for performance indication.

F1-Score: The merit of the F1-score is that it combines recall and precision [5, 7]. In a case with TP = 5, FP = 95, FN = 0, the precision is 0.05 and the recall is 1. The recall alone suggests that the detector performs well, but that is misleading, because FP is so large. The F1-score considers both recall and precision; Eq. (6) defines it, and the F1-score of this case is about 0.0952.

F1\text{-}score = \frac{2 \times precision \times recall}{precision + recall}   (6)
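Eqs. (2), (4), and (6) can be checked on the example above with a few lines of Python (illustrative only):

def precision_recall_f1(tp, fp, fn):
    # Eq. (2): denominator is everything the detector declared positive.
    precision = tp / (tp + fp) if tp + fp else 0.0
    # Eq. (4): denominator is everything that should have been detected.
    recall = tp / (tp + fn) if tp + fn else 0.0
    # Eq. (6): harmonic mean of precision and recall.
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1

print(precision_recall_f1(5, 95, 0))  # (0.05, 1.0, 0.0952...)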
Accuracy: The accuracy of detection does not consider TN. The denominator covers all detection and ground truth boxes, and the numerator is the TP, which is correctly detected.

Accuracy = \frac{TP}{TP + FP + FN}   (7)

PR Curve: The PR curve is the Precision-Recall curve. In this paper, we introduce two kinds of curves for precision and recall. In the first graph, the x-axis is recall and the y-axis is precision; in the second graph, the x-axis is 1 - precision and the y-axis is recall. The first curve descends to the right along the x-axis (Fig. 1), while the second curve rises to the right. [8] uses the PR curve and [9] uses the recall versus 1 - precision curve to show a performance result, respectively.
Fig. 1 shows a detection result for the class 'car'. There are 10 test images with 8 objects to detect per image, and the IoU threshold was 0.5. For drawing the graph, refer to [10, 13]. The AP measurement method is the 11-point interpolation, which is explained in the next part. The PR curve is used more often than the recall versus 1 - precision curve.
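To make the construction of such a curve concrete, the following Python sketch (a simplified, hypothetical setup, not the paper's exact data) sorts detections by confidence and accumulates TP/FP to produce the precision and recall points of a PR curve, following the general procedure described in [13].

# Each detection: (confidence, is_true_positive); the TP/FP flag comes from
# IoU matching against the ground truth at the chosen threshold (here 0.5).
detections = [(0.95, True), (0.90, True), (0.80, False), (0.70, True),
              (0.60, False), (0.50, True), (0.30, False)]
num_gt = 8  # total ground truth objects, i.e. TP + FN

detections.sort(key=lambda d: d[0], reverse=True)
tp = fp = 0
precisions, recalls = [], []
for _, is_tp in detections:
    tp += is_tp
    fp += not is_tp
    precisions.append(tp / (tp + fp))  # Eq. (2) at this cutoff
    recalls.append(tp / num_gt)        # Eq. (4) at this cutoff
print(list(zip(recalls, precisions)))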

[Fig. 1 Precision-Recall curve]

Average Precision (AP): Average Precision is the Area Under the Curve (AUC) of the PR curve. The AP is important for understanding the PR curve, since it gives a proper single value for evaluating the performance that the graph shows. There are two methods to measure AP: 1) 11-point interpolation AP [10], and 2) all-points interpolation AP. The 11-point interpolation AP needs less calculation than the all-points interpolation AP: it considers only 11 points on the x-axis, while the all-points interpolation considers every point. The 11-point interpolation divides the PR graph's x-axis into the levels [0, 0.1, 0.2, ..., 1], i.e., 10 sections. The PASCAL VOC Challenge [10] used the 11-point interpolation AP, so this AP became conventional. [13] explains the drawing process of the PR curve and the calculation of the AP value in detail. Additionally, COCO detection [14, 15] classifies AP by object size into small, medium, and large (unit: pixel; small: area < 32x32; medium: 32x32 <= area < 96x96; large: area >= 96x96).
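The 11-point interpolation can be written in a few lines. In this sketch (illustrative, reusing the hypothetical precisions and recalls lists from the PR curve example above), the interpolated precision at each recall level r is the maximum precision observed at any recall >= r.

def ap_11_point(recalls, precisions):
    # 11-point interpolated AP in the style of PASCAL VOC [10].
    ap = 0.0
    for r in [i / 10 for i in range(11)]:  # recall levels 0.0, 0.1, ..., 1.0
        # Interpolated precision: best precision at recall >= r (0 if none).
        p_interp = max((p for rec, p in zip(recalls, precisions) if rec >= r),
                       default=0.0)
        ap += p_interp / 11
    return ap

print(ap_11_point(recalls, precisions))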
3.2 Multiple Class Object Detection indicator

The multiple class object detection indicators are an extension of the single class ones. The contents of Section 3.1 are included for each class, and the additional considerations are as follows.

Confusion Matrix/Table: The confusion matrix/table is, strictly speaking, a classification indicator that makes performance easy to analyze. But traditional detectors have an additional classification function after the detection process, so we consider the confusion matrix in this section. Some state-of-the-art deep learning-based detectors also have a classification function inside the network [1], [2], but most of them do not use the confusion matrix as a performance indicator, because their FP and FN are calculated as the detection result of a single class and are not related to different classes. [5] uses the confusion matrix as a classification performance indicator after the detection process. Fig. 2 is an example for a classifier with 5 classes: the y-axis is the predicted result after the classifying process, and the x-axis is the ground truth.

[Fig. 2 Confusion Matrix]

Macro-Average precision, recall, F1-Score: If the number of samples in each class is balanced, the macro-average form can be used. k is the number of classes [11].

- Macro-Average precision

precision_M = \frac{1}{k} \sum_{i=1}^{k} \frac{TP_i}{TP_i + FP_i}   (8)

- Macro-Average recall

recall_M = \frac{1}{k} \sum_{i=1}^{k} \frac{TP_i}{TP_i + FN_i}   (9)

- Macro-Average F1-Score

F1\text{-}score_M = \frac{2 \times precision_M \times recall_M}{precision_M + recall_M}   (10)

Micro-Average precision, recall, F1-Score: If the number of samples in each class is unbalanced, the micro-average form can be used. The equations are as follows.

- Micro-Average precision

precision_\mu = \frac{\sum_{i=1}^{k} TP_i}{\sum_{i=1}^{k} (TP_i + FP_i)}   (11)

- Micro-Average recall

recall_\mu = \frac{\sum_{i=1}^{k} TP_i}{\sum_{i=1}^{k} (TP_i + FN_i)}   (12)

- Micro-Average F1-Score

F1\text{-}score_\mu = \frac{2 \times precision_\mu \times recall_\mu}{precision_\mu + recall_\mu}   (13)
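A compact Python sketch of Eqs. (8)-(13), using assumed per-class (TP, FP, FN) counts, shows how the two averages differ: the macro form averages the per-class ratios, while the micro form sums the counts first.

def macro_micro(counts):
    # counts: list of (TP, FP, FN) per class.
    k = len(counts)
    macro_p = sum(tp / (tp + fp) for tp, fp, fn in counts) / k  # Eq. (8)
    macro_r = sum(tp / (tp + fn) for tp, fp, fn in counts) / k  # Eq. (9)
    sum_tp = sum(tp for tp, fp, fn in counts)
    micro_p = sum_tp / sum(tp + fp for tp, fp, fn in counts)    # Eq. (11)
    micro_r = sum_tp / sum(tp + fn for tp, fp, fn in counts)    # Eq. (12)
    return macro_p, macro_r, micro_p, micro_r

# Two imbalanced classes: the micro scores are pulled toward the large class.
print(macro_micro([(90, 10, 5), (5, 5, 10)]))

With these particular counts the sum of FP happens to equal the sum of FN, so the micro precision and recall coincide; the discussion below explains why that always holds in classification but not in detection.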

In the image classification field, an important feature of the micro-average is that the sum of FP equals the sum of FN, so the micro recall, precision, and F1-score are all the same. Fig. 3 shows a simple example of a classifier: blue points are class A, yellow points are class B, black points are class C, and green points are class D. The ground truth of the classifier is drawn with red lines, and the classification result with blue lines. In this case, the FP of B is the same as the FN of A, which can be seen in the blue area of Fig. 3; for the same reason, the yellow, black, and green areas show a similar characteristic. As a result, the sum of FP equals the sum of FN. It can also be explained with the confusion matrix: in Fig. 2, a box predicted as class 2 whose actual class is 1 is an FP of class 2 and an FN of class 1. For the same reason, all boxes not included in TP or TN are an FP of some class and an FN of another class, so the same result is obtained as in the previous description.
On the other hand, the micro-average score in object detection is a little different from the classification scene: the micro-average F1-score, precision, and recall differ from each other. Because the FP and FN are not related to a different class, they are calculated as the detection result of a single class. Moreover, a background class is usually not considered in the inference stage and is often omitted from the measurement. As a result, the sum of FP and the sum of FN are not equal, and the per-class results of precision, recall, and F1-score do not affect each other.
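The classification property can be verified numerically. This sketch builds an assumed confusion matrix (rows: predicted class, columns: ground truth, following Fig. 2) and checks that the total FP equals the total FN, so the micro precision equals the micro recall.

# Confusion matrix of a 3-class classifier: cm[predicted][actual].
cm = [[8, 2, 0],
      [1, 7, 3],
      [0, 1, 9]]
k = len(cm)
tp = [cm[i][i] for i in range(k)]
fp = [sum(cm[i]) - cm[i][i] for i in range(k)]                       # row sums minus diagonal
fn = [sum(cm[j][i] for j in range(k)) - cm[i][i] for i in range(k)]  # column sums minus diagonal
# Every off-diagonal cell is simultaneously an FP of the predicted class
# and an FN of the actual class, so the totals must match.
print(sum(fp), sum(fn))  # 7 7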

[Fig. 3 Example of a class classifier]

mAP: The mean Average Precision (mAP) is the multiple class form of AP, and it considers the mean over the APs of every class: after obtaining the AP of each class, the APs are simply averaged [1-3]. mAP does not reflect a class imbalance problem; it only calculates the mean of the APs.

Overall Accuracy: The Overall Accuracy is the sum of the TP of every class divided by all cases. It is useful when the number of samples in each class is unbalanced.

O.A = \frac{\sum_{i=1}^{k} TP_i}{\sum_{i=1}^{k} (TP_i + FP_i + FN_i)}   (14)

Mean Accuracy: The Mean Accuracy is the mean of the accuracy of each class. It is useful when the number of samples in each class is balanced.

Mean Accuracy = \frac{1}{k} \sum_{i=1}^{k} \frac{TP_i}{TP_i + FP_i + FN_i}   (15)
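A short sketch of mAP and of Eqs. (14) and (15), with assumed per-class APs and (TP, FP, FN) counts, makes the three multi-class summaries easy to compare side by side.

# Assumed values for a 3-class detector (illustrative only).
ap_per_class = [0.82, 0.55, 0.91]
counts = [(90, 10, 5), (5, 5, 10), (40, 8, 2)]  # (TP, FP, FN) per class

map_score = sum(ap_per_class) / len(ap_per_class)             # mean of per-class APs
overall_acc = (sum(tp for tp, fp, fn in counts)
               / sum(tp + fp + fn for tp, fp, fn in counts))  # Eq. (14)
mean_acc = sum(tp / (tp + fp + fn) for tp, fp, fn in counts) / len(counts)  # Eq. (15)
print(map_score, overall_acc, mean_acc)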


3.3 Time and FPS

Detection time or FPS is an important indicator. If a mobile machine uses a detector, the detector's usefulness is tied to its detection time, and ATR and self-driving cars also require a fast detection time, because in these cases an immediate reaction is important for the machine. The elapsed time or FPS is determined by the network parameters, the network structure, the hardware environment, etc. If the hardware environments of two machines are different, the time and FPS are also not the same; only under the same environmental conditions does the comparison of time/FPS become useful information for comparing networks.
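Measured end to end, FPS is simply the number of processed frames divided by the wall-clock time. A minimal sketch, assuming a detect(frame) function and a frames list that are not defined here:

import time

def measure_fps(detect, frames):
    # Average FPS of a detector over a batch of frames on one machine.
    start = time.perf_counter()
    for frame in frames:
        detect(frame)  # hypothetical detector call
    elapsed = time.perf_counter() - start
    return len(frames) / elapsed

# fps = measure_fps(my_detector, test_frames)
# Compare such numbers only when both detectors ran on identical hardware.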

4. RESULT

There are many performance indicators, and those that are frequently used have been listed here. It is important not only to arrange the performance indicators for object detection, but also to use them in the right place. The class imbalance problem is frequently considered in the object detection scene [12]; as a result, the micro-average precision, recall, and F1-score and the Overall Accuracy reflect class imbalance more than the macro-average precision, recall, and F1-score and the Mean Accuracy. If it is possible to check the number of classes, the speed of each network, and the undetected items, it will be easy to construct an algorithm that derives the desired results.

ACKNOWLEDGEMENT

This study was supported by the Agency for Defense Development (UD200005FD).
This research was supported by the Basic Science Research Program through the National Research Foundation of Korea (NRF) funded by the Ministry of Education (NRF-2018R1D1A3B07049069).
This paper was supported by a Korea Institute for Advancement of Technology (KIAT) grant funded by the Korea Government (MOTIE) (P0008473, HRD Program for Industrial Innovation).

REFERENCES
[1] Wei Liu, et al., "SSD: Single Shot MultiBox Detector", Computer Vision - ECCV 2016, pp. 21-37, 2016.
[2] Shaoqing Ren, et al., "Faster R-CNN: Towards
Real-Time Object Detection with Region
Proposal Networks”, IEEE Transactions on
Pattern Analysis and Machine Intelligence, Vol.
39, No. 6, pp. 1137-1149, 2017.
[3] Alexey Bochkovskiy, et al., “YOLOv4: Optimal
Speed and Accuracy of Object Detection”,
arXiv:2004.10934, 2020.
[4] Peter A. Flach, Meelis Kull,
“Precision-Recall-Gain Curves: PR Analysis
Done Right”, Advances in Neural Information
Processing Systems 28 (NIPS 2015), 2015.
[5] Junwei Han, et al., “Efficient, simultaneous
detection of multi-class geospatial targets based
on visual saliency modeling and discriminative
learning of sparse coding”, ISPRS Journal of
Photogrammetry and Remote Sensing, Vol. 89,
pp. 37-48, 2014.
[6] Navaneeth Bodla, et al., "Soft-NMS - Improving Object Detection With One Line of Code", The IEEE International Conference on Computer Vision (ICCV), pp. 5561-5569, 2017.
[7] Goutte C., Gaussier E., “A Probabilistic
Interpretation of Precision, Recall and F-Score,
with Implication for Evaluation.”, Advances in
Information Retrieval. ECIR, 2005.
[8] Gong Cheng, et al., “Scalable multi-class
geospatial object detection in
high-spatial-resolution remote sensing images”,
2014 IEEE Geoscience and Remote Sensing
Symposium, 2014
[9] Wanceng Zhang, et al., "A generic
discriminative part-based model for geospatial
object detection in optical remote sensing
images”, ISPRS Journal of Photogrammetry and
Remote Sensing, Vol. 99, pp. 30-44, 2015.
[10] Mark Everingham, et al., “The PASCAL Visual
Object Classes (VOC) Challenge”, International
Journal of Computer Vision, Vol. 88, pp. 303-338, 2010.
[11] Vincent Van Asch, “Macro- and micro-averaged
evaluation measures [[BASIC DRAFT]]”, 2013.
[12] Kemal Oksuz, et al., "Imbalance Problems in Object Detection: A Review", IEEE Transactions on Pattern Analysis and Machine Intelligence (Early Access), 2020.
[13] Rafael Padilla, et al., "A Survey on Performance Metrics for Object-Detection Algorithms", 2020 International Conference on Systems, Signals and Image Processing (IWSSIP), 2020.
[14] “COCO 2020 Object Detection Task”,
https://cocodataset.org/#detection-2020
[15] Tsung-Yi Lin, et al., "Microsoft COCO: Common Objects in Context", Computer Vision - ECCV 2014, pp. 740-755, 2014.

