A Survey on Performance Metrics for Object-Detection Algorithms
Rafael Padilla, Sergio L. Netto, Eduardo A. B. da Silva
PEE, COPPE, Federal University of Rio de Janeiro, P.O. Box 68504, RJ, 21945-970, Brazil
{rafael.padilla, sergioln, eduardo}@smt.ufrj.br
I. INTRODUCTION
Object detection is an extensively studied topic in the field of computer vision. Different approaches have been employed to solve the growing need for accurate object-detection models [1]. The Viola-Jones framework [2], for instance, became popular due to its successful application to the face-detection problem [3], and was later applied to different subtasks such as pedestrian [4] and car [5] detection. More recently, with the popularization of convolutional neural networks (CNN) [6]–[9] and GPU-accelerated deep-learning frameworks, object-detection algorithms started being developed from a new perspective [10], [11]. Works such as Overfeat [12], R-CNN [13], Fast R-CNN [14], Faster R-CNN [15], R-FCN [16], SSD [17] and YOLO [18]–[20] greatly raised the performance standards in the field. World-famous competitions such as the PASCAL VOC Challenge [21], COCO [22], the ImageNet Object Detection Challenge [23], and the Google Open Images Challenge [24] have as their top object-detection algorithms methods inspired by the aforementioned works. Differently from algorithms such as Viola-Jones, CNN-based detectors are flexible enough to be trained with several (hundreds or even a few thousand) classes.

Fig. 1: Examples of detections performed by YOLO [20] in different datasets. (a) PASCAL VOC; (b) personal dataset; (c) COCO. Besides the bounding-box coordinates of a detected object, the output also includes the confidence level and its class.

A detector outcome is commonly composed of a list of bounding boxes, confidence levels and classes, as seen in Figure 1. However, the standard output-file format varies considerably among detection algorithms. Bounding-box detections are mostly represented by their top-left and bottom-right coordinates (x_ini, y_ini, x_end, y_end), with a notable exception being the YOLO [18]–[20] algorithm, which differs from the others by describing each bounding box by its center coordinates, width and height, all normalized by the image dimensions (x_center/image_width, y_center/image_height, box_width/image_width, box_height/image_height).
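As an illustration of the two conventions, the short sketch below converts a box from absolute corner coordinates to the YOLO-style normalized center format; the function name and the example values are ours and do not correspond to any particular detector's API.

```python
def corners_to_yolo(x_ini, y_ini, x_end, y_end, img_w, img_h):
    """Convert absolute (x_ini, y_ini, x_end, y_end) corners into YOLO-style
    (x_center, y_center, width, height), all normalized by the image size."""
    box_w = x_end - x_ini
    box_h = y_end - y_ini
    x_center = x_ini + box_w / 2.0
    y_center = y_ini + box_h / 2.0
    return x_center / img_w, y_center / img_h, box_w / img_w, box_h / img_h

# Example: a 100x200 box with top-left corner at (50, 40) in a 640x480 image.
print(corners_to_yolo(50, 40, 150, 240, 640, 480))
```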
Different challenges, competitions, and hackathons [21], [23]–[27] attempt to assess the performance of object detections in specific scenarios by using real-world annotated images [28]–[30]. In these events, participants are given a non-annotated testing image set in which objects have to be detected by their proposed works. Some competitions provide their own (or third-party) source code, allowing the participants to evaluate their algorithms on an annotated validation image set before submitting their testing-set detections. In the end, each team sends a list of bounding-box coordinates with their respective classes and (sometimes) their confidence levels to be evaluated.
In most competitions, the average precision (AP) and its derivations are the metrics adopted to assess the detections and thus rank the teams. The PASCAL VOC dataset [31] and challenge [21] provide their own source code to measure the AP and the mean AP (mAP) over all object classes. The City Intelligence Hackathon [27] uses the source code distributed in [32] to rank the participants, also based on AP and mAP. The ImageNet Object Localization challenge [23] does not recommend any code to compute its evaluation metric, but provides a pseudo-code explaining it. The Open Images 2019 [24] and Google AI Open Images [26] challenges use mAP, referencing a tool to evaluate the results [33], [34]. The Lyft 3D Object Detection for Autonomous Vehicles challenge [25] does not reference any external tool, but uses the AP averaged over 10 different thresholds, the so-called AP@50:5:95 metric.
This work reviews the most popular metrics used to evaluate object-detection algorithms, including their main concepts, pointing out their differences, and establishing a comparison between different implementations. In order to present its main contributions, this work is divided into the following topics: Section II explains the main performance metrics employed in the field of object detection and how the AP metric can produce ambiguous results; Section III describes some of the best-known object-detection challenges and their employed performance metrics, whereas Section IV presents a project implementing the AP metric to be used with any annotation format.
II. MAIN PERFORMANCE METRICS

Among the different annotated datasets used by object-detection challenges and the scientific community, the most common metric used to measure the accuracy of the detections is the AP. Before examining the variations of the AP, we should review some concepts that are shared among them. The most basic ones are defined below:
• True positive (TP): A correct detection of a ground-truth bounding box;
• False positive (FP): An incorrect detection of a nonexistent object or a misplaced detection of an existing object;
• False negative (FN): An undetected ground-truth bounding box.
It is important to note that, in the object-detection context, a true-negative (TN) result does not apply, as there is an infinite number of bounding boxes that should not be detected within any given image.
The above definitions require establishing what a “correct detection” and an “incorrect detection” are. A common way to do so is using the intersection over union (IOU), a measurement based on the Jaccard index, a coefficient of similarity between two sets of data [35]. In the object-detection scope, the IOU measures the overlapping area between the predicted bounding box Bp and the ground-truth bounding box Bgt divided by the area of their union, that is

J(Bp, Bgt) = IOU = area(Bp ∩ Bgt) / area(Bp ∪ Bgt),   (1)

as illustrated in Figure 2.

Fig. 2: Intersection Over Union (IOU).

By comparing the IOU with a given threshold t, we can classify a detection as being correct or incorrect: if IOU ≥ t, the detection is considered correct; if IOU < t, it is considered incorrect.
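For concreteness, a minimal Python sketch of Equation (1) is given below, assuming boxes represented as (x_ini, y_ini, x_end, y_end) tuples in absolute pixel coordinates; the function names are illustrative and do not come from any of the evaluation tools referenced above.

```python
def iou(box_p, box_gt):
    """Intersection over union (Jaccard index) of two boxes,
    each given as (x_ini, y_ini, x_end, y_end)."""
    # Corners of the intersection rectangle.
    x_left = max(box_p[0], box_gt[0])
    y_top = max(box_p[1], box_gt[1])
    x_right = min(box_p[2], box_gt[2])
    y_bottom = min(box_p[3], box_gt[3])
    if x_right <= x_left or y_bottom <= y_top:
        return 0.0  # the boxes do not overlap
    inter = (x_right - x_left) * (y_bottom - y_top)
    area_p = (box_p[2] - box_p[0]) * (box_p[3] - box_p[1])
    area_gt = (box_gt[2] - box_gt[0]) * (box_gt[3] - box_gt[1])
    return inter / (area_p + area_gt - inter)


def is_correct_detection(box_p, box_gt, t=0.5):
    """Classify a detection as correct (True) or incorrect (False)
    by comparing its IOU with a ground-truth box against the threshold t."""
    return iou(box_p, box_gt) >= t
```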
Since, as stated above, true negatives (TN) are not used in object-detection frameworks, one refrains from using any metric that is based on the TN, such as the TPR, FPR and ROC curves [36]. Instead, the assessment of object-detection methods is mostly based on the precision P and recall R concepts, respectively defined as

P = TP / (TP + FP) = TP / (all detections),   (2)

R = TP / (TP + FN) = TP / (all ground truths).   (3)

Precision is the ability of a model to identify only relevant objects: it is the percentage of correct positive predictions. Recall is the ability of a model to find all relevant cases (all ground-truth bounding boxes): it is the percentage of correct positive predictions among all given ground truths.
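Given the TP, FP and FN counts, Equations (2) and (3) reduce to two ratios; the helper below is a small sketch of that computation (the guard against empty denominators is our own convention, not part of the original definitions).

```python
def precision_recall(tp, fp, fn):
    """Precision = TP / (TP + FP), i.e. TP over all detections, and
    recall = TP / (TP + FN), i.e. TP over all ground truths."""
    precision = tp / (tp + fp) if (tp + fp) > 0 else 0.0
    recall = tp / (tp + fn) if (tp + fn) > 0 else 0.0
    return precision, recall

# Example: 6 TPs, 16 FPs and 9 missed ground truths give P ≈ 0.2727 and R = 0.4.
print(precision_recall(6, 16, 9))
```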
The precision × recall curve can be seen as a trade-off between precision and recall for different confidence values associated with the bounding boxes generated by a detector. If the confidence of a detector is such that its FP count is low, the precision will be high. However, in this case, many positives may be missed, yielding a high FN count and thus a low recall. Conversely, if one accepts more positives, the recall will increase, but the FP count may also increase, decreasing the precision. A good object detector should find all ground-truth objects (FN = 0 ≡ high recall) while identifying only relevant objects (FP = 0 ≡ high precision). Therefore, a particular object detector can be considered good if its precision stays high as its recall increases, which means that if the confidence threshold varies, the precision and recall will both remain high. Hence, a high area under the curve (AUC) tends to indicate both high precision and high recall. Unfortunately, in practical cases, the precision × recall plot is often a zigzag-like curve, posing challenges to an accurate measurement of its AUC. This is circumvented by processing the precision × recall curve in order to remove the zigzag behavior prior to AUC estimation. There are basically two approaches to do so: the 11-point interpolation and the all-point interpolation.
In the 11-point interpolation, the shape of the precision × recall curve is summarized by averaging the maximum precision values at a set of 11 equally spaced recall levels [0, 0.1, 0.2, ..., 1], as given by

AP11 = (1/11) Σ_{R ∈ {0, 0.1, ..., 0.9, 1}} Pinterp(R),   (4)

where

Pinterp(R) = max_{R̃ : R̃ ≥ R} P(R̃).   (5)

In this definition of AP, instead of using the precision P(R) observed at each recall level R, the AP is obtained by considering the maximum precision Pinterp(R) over all recall values greater than or equal to R.
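The 11-point interpolation of Equations (4) and (5) can be sketched as follows, assuming paired lists of precision and recall values (one entry per detection, accumulated in decreasing order of confidence); this is an illustrative implementation, not the PASCAL VOC reference code.

```python
def ap_11_point(recalls, precisions):
    """11-point interpolated AP: average, over R in {0, 0.1, ..., 1}, of the
    maximum precision among all points with recall >= R (Eqs. 4 and 5)."""
    ap = 0.0
    for i in range(11):
        r = i / 10.0
        # Precision values of all points whose recall is >= the current level r.
        candidates = [p for p, rec in zip(precisions, recalls) if rec >= r]
        ap += max(candidates) if candidates else 0.0
    return ap / 11.0
```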
In the all-point interpolation, instead of interpolating only 11 equally spaced points, one interpolates through all points in such a way that

APall = Σ_n (R_{n+1} − R_n) Pinterp(R_{n+1}),   (6)

where

Pinterp(R_{n+1}) = max_{R̃ : R̃ ≥ R_{n+1}} P(R̃).   (7)

In this case, instead of using the precision observed at only a few points, the AP is obtained by interpolating the precision at each recall level, taking the maximum precision whose recall value is greater than or equal to R_{n+1}.
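A corresponding sketch of the all-point interpolation of Equations (6) and (7) is shown below; the backward pass that builds the precision envelope is a common implementation strategy, and the boundary points added at R = 0 and R = 1 are our assumption.

```python
def ap_all_point(recalls, precisions):
    """All-point interpolated AP (Eqs. 6 and 7): area under the precision x recall
    curve after replacing each precision by the maximum precision achieved at any
    recall greater than or equal to it."""
    # Append boundary points so the first and last recall steps are covered.
    r = [0.0] + list(recalls) + [1.0]
    p = [0.0] + list(precisions) + [0.0]
    # Backward pass: p[i] becomes the max precision over all points with recall >= r[i].
    for i in range(len(p) - 2, -1, -1):
        p[i] = max(p[i], p[i + 1])
    # Sum of (R_{n+1} - R_n) * P_interp(R_{n+1}) over consecutive recall values.
    return sum((r[i + 1] - r[i]) * p[i + 1] for i in range(len(r) - 1))
```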
The mean AP (mAP) is a metric used to measure the accuracy of object detectors over all classes in a specific database. The mAP is simply the average AP over all classes [15], [17], that is

mAP = (1/N) Σ_{i=1}^{N} APi,   (8)

with APi being the AP of the i-th class and N the total number of classes being evaluated.
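Equation (8) is a plain average of per-class AP values; a trivial sketch is given below, assuming the APs have already been computed with one of the interpolations above (the class names and values are made up for illustration).

```python
def mean_average_precision(ap_per_class):
    """mAP: average of the AP values over all N evaluated classes (Eq. 8)."""
    return sum(ap_per_class.values()) / len(ap_per_class)

# Hypothetical per-class APs; real values depend on detector, dataset and IOU threshold.
print(mean_average_precision({"person": 0.72, "car": 0.65, "dog": 0.58}))
```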
A. A Practical Example

Fig. 3: Example of 24 detections (red boxes) performed by an object detector aiming to detect 15 ground-truth objects (green boxes) belonging to the same class.

As stated previously, the AP is calculated individually for each class. In the example shown in Figure 3, the boxes represent detections (red boxes identified by a letter - A, B, ..., Y) and the ground truth (green boxes) of a given class. The percentage value drawn next to each red box represents the detection confidence for that object class. In order to evaluate the precision and recall of the 24 detections among the 15 ground-truth boxes distributed over seven images, an IOU threshold t needs to be established. In this example, let us consider as a TP any detection box having IOU ≥ 30%. Note that each value of IOU threshold yields a different AP metric, and thus the threshold used must always be indicated.

Table I presents the detections ordered by their confidence levels. For each detection, if its area overlaps a ground truth by 30% or more (IOU ≥ 30%), the TP column is set to 1; otherwise it is set to 0 and the detection is considered an FP. Some detectors can output multiple detections overlapping a single ground truth (e.g. detections D and E in Image 2; G, H and I in Image 3). For those cases the detection with the highest confidence is considered a TP and the others are considered FPs, as applied by the PASCAL VOC 2012 challenge. The columns acc TP and acc FP accumulate the total amount of TPs and FPs along all the detections above the corresponding confidence level. Figure 4 depicts the calculated precision and recall values for this case.

TABLE I: Computation of Precision and Recall Values for IOU threshold = 30%

detection  confidence  TP  FP  acc TP  acc FP  precision  recall
R          95%         1   0   1       0       1          0.0666
Y          95%         0   1   1       1       0.5        0.0666
J          91%         1   0   2       1       0.6666     0.1333
A          88%         0   1   2       2       0.5        0.1333
U          84%         0   1   2       3       0.4        0.1333
C          80%         0   1   2       4       0.3333     0.1333
M          78%         0   1   2       5       0.2857     0.1333
F          74%         0   1   2       6       0.25       0.1333
D          71%         0   1   2       7       0.2222     0.1333
B          70%         1   0   3       7       0.3        0.2
H          67%         0   1   3       8       0.2727     0.2
P          62%         1   0   4       8       0.3333     0.2666
E          54%         1   0   5       8       0.3846     0.3333
X          48%         1   0   6       8       0.4285     0.4
N          45%         0   1   6       9       0.4        0.4
T          45%         0   1   6       10      0.375      0.4
K          44%         0   1   6       11      0.3529     0.4
Q          44%         0   1   6       12      0.3333     0.4
V          43%         0   1   6       13      0.3157     0.4
I          38%         0   1   6       14      0.3        0.4
L          35%         0   1   6       15      0.2857     0.4
S          23%         0   1   6       16      0.2727     0.4
G          18%         1   0   7       16      0.3043     0.4666
O          14%         0   1   7       17      0.2916     0.4666

Fig. 4: Precision × Recall curve with values calculated for each detection in Table I.

As mentioned above, each interpolation method yields a different AP result (Figure 5), as given by

AP11 = (1/11)(1 + 0.6666 + 0.4285 + 0.4285 + 0.4285) = 26.84%.

Fig. 6: Precision × Recall curves of points from Table I applying interpolation with all points.
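The AP11 value above can be reproduced numerically; the sketch below transcribes only the TP/FP outcomes of Table I, rebuilds the accumulated precision and recall columns, and applies the 11-point interpolation of Equations (4) and (5).

```python
# 1 = TP, 0 = FP for detections R, Y, J, A, U, C, M, F, D, B, H, P,
# E, X, N, T, K, Q, V, I, L, S, G, O (Table I, IOU threshold = 30%).
tps = [1, 0, 1, 0, 0, 0, 0, 0, 0, 1, 0, 1,
       1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0]
n_ground_truths = 15

acc_tp = acc_fp = 0
recalls, precisions = [], []
for tp in tps:
    acc_tp += tp
    acc_fp += 1 - tp
    precisions.append(acc_tp / (acc_tp + acc_fp))
    recalls.append(acc_tp / n_ground_truths)

# 11-point interpolation: maximum precision at each recall level R in {0, 0.1, ..., 1}.
ap11 = sum(
    max([p for p, r in zip(precisions, recalls) if r >= level / 10.0], default=0.0)
    for level in range(11)
) / 11.0
print(f"AP11 = {ap11:.2%}")  # AP11 = 26.84%
```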