Performance Indicator Survey For Object Detection
Abstract: In recent image processing, deep learning has been used beyond the object recognition problem, in tasks such as object detection and semantic segmentation, and detection based on classic techniques has also been performed in many variants. These technologies are applied in systems such as factory automation, automatic target recognition (ATR), and autonomous driving. Object detection is performed over various categories, such as people, vehicles, and animals, and operates in varied situations: different object sizes, image sizes, distance ranges from near to remote, changing environments, etc. Analyzing these situations requires appropriate indicators, and without evaluation indicators a new detection algorithm cannot be demonstrated. It is therefore important to know the performance indicators of object detection, of which many are in use. This paper introduces the performance indicators of object detection. The main purpose of the survey is to help researchers find the proper performance indicator for object detection and to compare detection results across different algorithms exactly and effectively.
Fig. 1 Precision-Recall curve
Fig. 2 Confusion Matrix
Average Precision (AP): Average Precision is the Area Under the Curve (AUC) of the PR curve. The AP is important for understanding the PR curve, since it summarizes the graph in a single value suitable for evaluating an algorithm's performance. There are two methods to measure AP: 1) 11-point interpolation AP [10], and 2) all-point interpolation AP. The 11-point interpolation AP needs less calculation than the all-point interpolation AP: it only considers 11 points on the x-axis, dividing the PR graph's x-axis into [0, 0.1, 0.2, ..., 1] (10 equal sections), whereas all-point interpolation considers every point on the x-axis. The PASCAL VOC Challenge [10] used the 11-point interpolation AP, so this form of AP became conventional. [13] explains the drawing process of the PR curve and the calculation of the AP value in detail. Additionally, COCO detection [14, 15] classifies AP by object size into small, medium, and large (unit: pixel; small: area < 32x32; medium: 32x32 < area < 96x96; large: 96x96 < area).
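For concreteness, the two AP computations can be sketched as follows. This is a minimal NumPy illustration of the interpolation schemes described above, not code from the surveyed papers; the function names are ours, and the recall array is assumed to be sorted in increasing order.

```python
import numpy as np

def ap_11_point(recall, precision):
    """11-point interpolated AP: average the maximum precision at or
    to the right of each recall threshold 0.0, 0.1, ..., 1.0."""
    recall, precision = np.asarray(recall), np.asarray(precision)
    ap = 0.0
    for t in np.linspace(0.0, 1.0, 11):
        mask = recall >= t
        ap += (precision[mask].max() if mask.any() else 0.0) / 11.0
    return ap

def ap_all_points(recall, precision):
    """All-point interpolation: exact area under the precision envelope."""
    r = np.concatenate(([0.0], np.asarray(recall, dtype=float), [1.0]))
    p = np.concatenate(([0.0], np.asarray(precision, dtype=float), [0.0]))
    # Make precision monotonically non-increasing from right to left
    for i in range(len(p) - 2, -1, -1):
        p[i] = max(p[i], p[i + 1])
    # Sum rectangle areas where recall changes
    idx = np.where(r[1:] != r[:-1])[0]
    return float(np.sum((r[idx + 1] - r[idx]) * p[idx + 1]))
```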
3.2 Multiple Class Object Detection Indicators
Multiple class object detection is the extension of the single class case: the contents of 3.1 apply to each class, and the additional considerations are as follows.

Confusion Matrix/Table: The confusion matrix/table is, strictly speaking, a classification indicator that makes performance easy to analyze. But traditional detectors run an additional classification step after the detection process, so we consider the confusion matrix in this section. Some state-of-the-art deep learning-based detectors also have a classification function inside the network [1], [2], yet most of them do not use the confusion matrix as a performance indicator, because their FP and FN are calculated as the detection result of a single class and are not related to the other classes. [5] uses the confusion matrix as a classification performance indicator after the detection process. Fig. 2 is an example for a classifier with 5 classes: the y-axis is the predicted result after the classifying process, and the x-axis is the ground truth.

Macro-Average precision, recall, F1-Score: If the number of samples in each class is balanced, the Macro-Average form can be used; k is the number of classes [11].

- Macro-Average precision
$$\text{precision}_M = \frac{\sum_{i=1}^{k}\frac{TP_i}{TP_i+FP_i}}{k} \qquad (8)$$
- Macro-Average recall
$$\text{recall}_M = \frac{\sum_{i=1}^{k}\frac{TP_i}{TP_i+FN_i}}{k} \qquad (9)$$
- Macro-Average F1-Score
$$\text{F1-score}_M = \frac{2\times\text{precision}_M\times\text{recall}_M}{\text{precision}_M+\text{recall}_M} \qquad (10)$$

Micro-Average precision, recall, F1-Score: If the number of samples in each class is unbalanced, the Micro-Average form can be used. The equations are as follows.

- Micro-Average precision
$$\text{precision}_\mu = \frac{\sum_{i=1}^{k} TP_i}{\sum_{i=1}^{k}(TP_i+FP_i)} \qquad (11)$$
- Micro-Average recall
$$\text{recall}_\mu = \frac{\sum_{i=1}^{k} TP_i}{\sum_{i=1}^{k}(TP_i+FN_i)} \qquad (12)$$
- Micro-Average F1-Score
$$\text{F1-score}_\mu = \frac{2\times\text{precision}_\mu\times\text{recall}_\mu}{\text{precision}_\mu+\text{recall}_\mu} \qquad (13)$$
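As an illustration of equations (8)-(13), the following minimal sketch (our own, assuming per-class TP/FP/FN counts are already available and every denominator is nonzero) computes both averaging forms:

```python
import numpy as np

def macro_micro_scores(tp, fp, fn):
    """tp, fp, fn: arrays of per-class counts (length k)."""
    tp, fp, fn = map(np.asarray, (tp, fp, fn))

    # Macro-average: score each class, then average the scores (Eqs. 8-10)
    prec_M = (tp / (tp + fp)).mean()
    rec_M = (tp / (tp + fn)).mean()
    f1_M = 2 * prec_M * rec_M / (prec_M + rec_M)

    # Micro-average: pool the counts over classes, then score once (Eqs. 11-13)
    prec_u = tp.sum() / (tp.sum() + fp.sum())
    rec_u = tp.sum() / (tp.sum() + fn.sum())
    f1_u = 2 * prec_u * rec_u / (prec_u + rec_u)

    return (prec_M, rec_M, f1_M), (prec_u, rec_u, f1_u)
```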
In the image classification field, an important feature of the Micro-Average is that the sum of FP equals the sum of FN, so micro-averaged Recall, Precision, and F1-Score are all the same. Fig. 3 shows a simple classifier example: blue points are class A, yellow points class B, black points class C, and green points class D. The red lines are the ground truth of the classifier, and the blue lines are the classification result. In this case, the FP of B is the same as the FN of A, a property visible in the blue area of Fig. 3; for the same reason, the yellow, black, and green areas show the same characteristic. As a result, the sum of FP equals the sum of FN. This can also be explained with the Confusion Matrix: in Fig. 2, a box predicted as class 2 whose actual class is 1 is an FP of class 2 and an FN of class 1. For the same reason, every box not counted as TP or TN is an FP of some class and an FN of another class, so the same result is obtained as in the previous description.

On the other hand, the micro-average score in object detection differs somewhat from the classification scene: micro-averaged F1-score, precision, and recall differ from one another. Because FP and FN are not related to the other classes, they are calculated as the detection result of a single class. Moreover, the background class is usually not considered at the inference stage and is often omitted from the measurement. As a result, the sum of FP and the sum of FN are not equal, and the precision, recall, and F1-score results of different classes do not affect each other.
Fig. 3 Simple example of a classifier with four classes (A-D)
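The FP/FN balance described above can be checked numerically from any classification confusion matrix. A small sketch (our illustration; the matrix values are arbitrary, with rows as predictions and columns as ground truth, following Fig. 2):

```python
import numpy as np

# Rows: predicted class, columns: actual class (as in Fig. 2).
cm = np.array([[5, 2, 0],
               [1, 7, 3],
               [0, 1, 6]])

tp = np.diag(cm)
fp = cm.sum(axis=1) - tp   # predicted as class i but actually another class
fn = cm.sum(axis=0) - tp   # actually class i but predicted as another class

# Every off-diagonal entry is an FP of one class and an FN of another,
# so the totals match and micro precision equals micro recall.
assert fp.sum() == fn.sum()
print(fp.sum(), fn.sum())  # 7 7
```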
mAP: mean Average Precision (mAP) extends AP to the multiple class case by considering the mean of every class's AP: after the AP of each class is obtained, the values are simply averaged [1-3]. mAP does not reflect the class imbalance problem; it only calculates the mean of the APs.
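Given the per-class APs, mAP is just their unweighted mean. Reusing the hypothetical `ap_11_point` helper from the sketch in 3.1:

```python
import numpy as np

def mean_average_precision(per_class_curves):
    """per_class_curves: list of (recall, precision) arrays, one per class."""
    aps = [ap_11_point(r, p) for r, p in per_class_curves]
    return float(np.mean(aps))  # unweighted mean: ignores class imbalance
```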
Overall Accuracy: Overall Accuracy is the sum of the TP of every class divided by the number of all cases. It is useful when the number of samples in each class is unbalanced.

$$\text{O.A} = \frac{\sum_{i=1}^{k} TP_i}{\sum_{i=1}^{k}(TP_i+FP_i+FN_i)} \qquad (14)$$

Mean Accuracy: Mean Accuracy is the mean of each class's accuracy. It is useful when the number of samples in each class is balanced.

$$\text{M.A} = \frac{1}{k}\sum_{i=1}^{k}\frac{TP_i}{TP_i+FP_i+FN_i} \qquad (15)$$
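A minimal sketch of equations (14) and (15), under the same assumptions as before (per-class counts available, nonzero denominators):

```python
import numpy as np

def overall_and_mean_accuracy(tp, fp, fn):
    """tp, fp, fn: per-class counts (length k), detection-style
    (no TN term; the background class is not counted)."""
    tp, fp, fn = map(np.asarray, (tp, fp, fn))
    oa = tp.sum() / (tp + fp + fn).sum()     # Eq. 14: pooled over classes
    ma = (tp / (tp + fp + fn)).mean()        # Eq. 15: mean of per-class accuracies
    return oa, ma
```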
3.3 Time and FPS
Detection time or FPS is an important indicator. If a mobile machine uses a detector, the detector's usefulness is tied to its detection time; ATR and self-driving cars likewise require fast detection, since in these cases an immediate reaction of the machine is important. The elapsed time or FPS is determined by the network parameters, the network structure, the hardware environment, etc. If the hardware environments of two machines differ, their time and FPS will also differ; under the same environmental conditions, the time/FPS comparison becomes useful information for comparing networks.
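Timing is usually measured by running the detector over a set of images and dividing the image count by the elapsed wall-clock time. A minimal sketch (the function name, warm-up policy, and callable interface are our assumptions, not from the paper):

```python
import time

def measure_fps(detector, images, warmup=10):
    """Average FPS of `detector` (any callable) over `images`."""
    for img in images[:warmup]:      # warm-up runs (caches, JIT, GPU init)
        detector(img)
    start = time.perf_counter()
    for img in images:
        detector(img)
    elapsed = time.perf_counter() - start
    return len(images) / elapsed     # frames per second
```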
4. RESULT
Micro-Average precision, recall, F1-score and Overall Accuracy reflect class imbalance more than Macro-Average precision, recall, F1-score and Mean Accuracy.

ACKNOWLEDGEMENT
This study was supported by the Agency for Defense Development (UD200005FD). This research was supported by the Basic Science Research Program through the National Research Foundation of Korea (NRF) funded by the Ministry of Education (NRF-2018R1D1A3B07049069). This paper was supported by a Korea Institute for Advancement of Technology (KIAT) grant funded by the Korea Government (MOTIE) (P0008473, HRD Program for Industrial Innovation).
REFERENCES
[1] Wei Liu, et al., “SSD: Single Shot MultiBox Detector”, Computer Vision - ECCV 2016, pp. 21-37, 2016.
[2] Shaoqing Ren, et al., “Faster R-CNN: Towards
Real-Time Object Detection with Region
Proposal Networks”, IEEE Transactions on
Pattern Analysis and Machine Intelligence, Vol.
39, No. 6, pp. 1137-1149, 2017.
[3] Alexey Bochkovskiy, et al., “YOLOv4: Optimal
Speed and Accuracy of Object Detection”,
arXiv:2004.10934, 2020.
[4] Peter A. Flach, Meelis Kull,
“Precision-Recall-Gain Curves: PR Analysis
Done Right”, Advances in Neural Information
Processing Systems 28 (NIPS 2015), 2015.
[5] Junwei Han, et al., “Efficient, simultaneous
detection of multi-class geospatial targets based
on visual saliency modeling and discriminative
learning of sparse coding”, ISPRS Journal of
Photogrammetry and Remote Sensing, Vol. 89,
pp. 37-48, 2014.
[6] Navaneeth Bodla, et al., “Soft-NMS -
Improving Object Detection With One Line of
Code”, The IEEE International Conference on
Computer Vision (ICCV), pp. 5561-5569, 2017.
[7] Goutte C., Gaussier E., “A Probabilistic
Interpretation of Precision, Recall and F-Score,
with Implication for Evaluation.”, Advances in
Information Retrieval. ECIR, 2005.
[8] Gong Cheng, et al., “Scalable multi-class
geospatial object detection in
high-spatial-resolution remote sensing images”,
2014 IEEE Geoscience and Remote Sensing
Symposium, 2014.
[9] Wanceng Zhang, et al., “A generic
discriminative part-based model for geospatial
object detection in optical remote sensing
images”, ISPRS Journal of Photogrammetry and
Remote Sensing, Vol. 99, pp. 30-44, 2015.
[10] Mark Everingham, et al., “The PASCAL Visual
Object Classes (VOC) Challenge”, International
Journal of Computer Vision, Vol. 88, pp. 303-338, 2010.
[11] Vincent Van Asch, “Macro- and micro-averaged
evaluation measures [[BASIC DRAFT]]”, 2013.
[12] Kemal Oksuz, et al., “Imbalance Problems in Object Detection: A Review”, IEEE Transactions on Pattern Analysis and Machine Intelligence (Early Access), 2020.
[13] Rafael Padilla, et al. “A survey on Performance
Metrics for Object-Detection Algorithms”, 2020
International Conference on Systems, Signals
and Image Processing (IWSSIP), 2020.
[14] “COCO 2020 Object Detection Task”,
https://cocodataset.org/#detection-2020
[15] Tsung-Yi Lin, et al., “Microsoft COCO: Common Objects in Context”, Computer Vision - ECCV 2014, pp. 740-755, 2014.