
Gaussian YOLOv3: An Accurate and Fast Object Detector Using Localization Uncertainty for Autonomous Driving

Jiwoong Choi^1, Dayoung Chun^1, Hyun Kim^2, and Hyuk-Jae Lee^1
^1 Seoul National University, ^2 Seoul National University of Science and Technology
{jwchoi, jjeonda}@capp.snu.ac.kr, [email protected], [email protected]

Abstract

The use of object detection algorithms is becoming increasingly important in autonomous vehicles, and object detection at high accuracy and a fast inference speed is essential for safe autonomous driving. A false positive (FP) from a false localization during autonomous driving can lead to fatal accidents and hinder safe and efficient driving. Therefore, a detection algorithm that can cope with mislocalizations is required in autonomous driving applications. This paper proposes a method for improving the detection accuracy while supporting a real-time operation by modeling the bounding box (bbox) of YOLOv3, the most representative of the one-stage detectors, with a Gaussian parameter and redesigning the loss function. In addition, this paper proposes a method for predicting the localization uncertainty that indicates the reliability of the bbox. By using the predicted localization uncertainty during the detection process, the proposed schemes can significantly reduce the FP and increase the true positive (TP), thereby improving the accuracy. Compared to a conventional YOLOv3, the proposed algorithm, Gaussian YOLOv3, improves the mean average precision (mAP) by 3.09 and 3.5 on the KITTI and Berkeley Deep Drive (BDD) datasets, respectively. Nevertheless, the proposed algorithm is capable of real-time detection at faster than 42 frames per second (fps) and shows a higher accuracy than previous approaches with a similar fps. Therefore, the proposed algorithm is the most suitable for autonomous driving applications.

1. Introduction

In recent years, deep learning has been actively applied in various fields including computer vision [9], autonomous driving [5], and social network services [15]. The development of sensors and GPUs along with deep learning algorithms has accelerated research into autonomous vehicles based on artificial intelligence. An autonomous vehicle with self-driving capability and without driver intervention must accurately detect cars, pedestrians, traffic signs, traffic lights, etc. in real time to ensure safe and correct control decisions [25]. To detect such objects, various sensors such as cameras, light detection and ranging (Lidar), and radio detection and ranging (Radar) are generally used in autonomous vehicles [27]. Among these various types of sensors, a camera sensor can accurately identify the object type based on texture and color features and is more cost-effective [24] than the other sensors. In particular, deep-learning based object detection using camera sensors is becoming more important in autonomous vehicles because it achieves a better level of accuracy than humans in terms of object detection, and it has consequently become an essential method [11] in autonomous driving systems.

An object detection algorithm for autonomous vehicles should satisfy the following two conditions. First, a high detection accuracy for road objects is required. Second, a real-time detection speed is essential for a rapid response of the vehicle controller and a reduced latency. Deep-learning based object detection algorithms, which are indispensable in autonomous vehicles, can be classified into two categories: two-stage and one-stage detectors. Two-stage detectors, e.g., Fast R-CNN [8], Faster R-CNN [22], and R-FCN [4], conduct a first stage of region proposal generation, followed by a second stage of object classification and bbox regression. These methods generally show a high accuracy but have the disadvantage of a slow detection speed and lower efficiency. One-stage detectors, e.g., SSD [17] and YOLO [19], conduct object classification and bbox regression concurrently without a region proposal stage. These methods generally have a fast detection speed and high efficiency but a low accuracy. In recent years, to take advantage of both types of method and to compensate for their respective disadvantages, object detectors combining various schemes have been widely studied [1, 11, 29, 28, 16]. MS-CNN [1], a two-stage detector, improves the detection speed by conducting detection on various intermediate network layers. SINet [11], also a two-stage detector, enables fast detection using a scale-insensitive network.
CFENet [29], a one-stage detector, uses a comprehensive feature enhancement module based on SSD to improve the detection accuracy. RefineDet [28], also a one-stage detector, improves the detection accuracy by applying an anchor refinement module and an object detection module. Another one-stage detector, RFBNet [16], applies a receptive field block to improve the accuracy. However, at an input resolution of 512 × 512 or higher, which is widely applied in object detection algorithms for achieving a high accuracy, previous studies [1, 11, 29, 28] have been unable to meet a real-time detection speed of above 30 fps, which is a prerequisite for self-driving applications. Even though real-time detection is possible in [16], it is difficult to apply to autonomous driving due to its low accuracy. This indicates that these previous schemes are incomplete in terms of the trade-off between accuracy and detection speed and consequently have limited applicability to self-driving systems.

In addition, one of the most critical problems of most conventional deep-learning based object detection algorithms is that, whereas the bbox coordinates (i.e., localization) of the detected object are known, the uncertainty of the bbox result is not. Thus, conventional object detectors cannot prevent mislocalizations (i.e., FPs) because they output deterministic bbox results without information regarding their uncertainty. In autonomous driving, an FP denotes an incorrect bbox detection on something that is not the ground truth (GT), or an inaccurate bbox detection on the GT, whereas a TP denotes an accurate bbox detection on the GT. An FP is extremely dangerous during autonomous driving because it causes excessive reactions such as unexpected braking, which can reduce the stability and efficiency of driving and lead to a fatal accident [18, 23], as well as confusion in determining an accurate object detection. In other words, it is extremely important to predict the uncertainty of the detected bboxes and to consider this factor along with the objectness score and class scores in order to reduce the FP and prevent autonomous driving accidents. For this reason, various studies have been conducted on predicting uncertainty in deep learning. Kendall et al. [12] proposed a modeling method for uncertainty prediction using a Bayesian neural network. Feng et al. [6] proposed a method for predicting uncertainty by applying Kendall et al.'s scheme [12] to 3D vehicle detection using a Lidar sensor. However, the methods proposed by Kendall et al. [12] and Feng et al. [6] only predict the level of uncertainty and do not utilize this factor in actual applications. Choi et al. [2] proposed a method for predicting uncertainty in real time using a Gaussian mixture model and applied it to an autonomous driving application. However, it was applied to the steering angle rather than object detection, and a complicated distribution is modeled, which increases the computational complexity. He et al. [10] proposed an approach for predicting uncertainty and utilized it for object detection. However, because they focused on a two-stage detector, their method cannot support a real-time operation, and a bbox overlap problem remains, so it is unsuitable for self-driving applications.

To overcome the problems of previous object detection studies, this paper proposes a novel object detection algorithm suitable for autonomous driving based on YOLOv3 [21]. YOLOv3 can detect multiple objects with a single inference, and its detection speed is therefore extremely fast; in addition, by applying a multi-stage detection method, it can complement the low accuracy of YOLO [19] and YOLOv2 [20]. Based on these advantages, YOLOv3 is suitable for autonomous driving applications, but it generally achieves a lower accuracy than a two-stage method. It is therefore essential to improve the accuracy while maintaining a real-time object detection capability. To achieve this goal, the present paper proposes a method for improving the detection accuracy by modeling the bbox coordinates of YOLOv3, which only outputs deterministic values, as Gaussian parameters (i.e., the mean and variance), and by redesigning the loss function of the bbox. Through this Gaussian modeling, a localization uncertainty for the bbox regression task in YOLOv3 can be estimated. Furthermore, to further improve the detection accuracy, a method for reducing the FP and increasing the TP by utilizing the predicted localization uncertainty of the bbox during the detection process is proposed. This study is therefore the first attempt to model the localization uncertainty in YOLOv3 and to utilize this factor in a practical manner. As a result, the proposed Gaussian YOLOv3 can cope with mislocalizations in autonomous driving applications. In addition, because the proposed method is modeled only in the bbox of the YOLOv3 detection layer (i.e., the output layer), the additional computation cost is negligible, and the proposed algorithm consequently maintains a real-time detection speed of over 42 fps with an input resolution of 512 × 512 despite the significant improvements in performance. Compared to the baseline algorithm (i.e., YOLOv3), the proposed Gaussian YOLOv3 improves the mAP by 3.09 and 3.5 on the KITTI [7] and BDD [26] datasets, respectively. In addition, the proposed algorithm reduces the FP by 41.40% and 40.62% and increases the TP by 7.26% and 4.3% on the KITTI and BDD datasets, respectively. As a result, in terms of the trade-off between accuracy and detection speed, the proposed algorithm is suitable for autonomous driving because it significantly improves the detection accuracy and addresses the mislocalization problem while supporting a real-time operation.

2. Background

Instead of the region proposal method used in two-stage detectors, YOLO [19] detects objects by dividing an image into grid units.
[Figure 1: (a) Network architecture of YOLOv3 and (b) attributes of its prediction feature map. Each prediction box consists of the bbox coordinates (tx, ty, tw, th), the objectness score Pobj, and the class scores P0, P1, ..., Pn.]

The feature map of the YOLO output layer is designed to output the bbox coordinates, the objectness score, and the class scores, and thus YOLO enables the detection of multiple objects with a single inference. Therefore, the detection speed is much faster than that of conventional methods. However, owing to the processing of the grid unit, localization errors are large and the detection accuracy is low, making it unsuitable for autonomous driving applications. To address these problems, YOLOv2 [20] has been proposed. YOLOv2 improves the detection accuracy compared to YOLO by using batch normalization for the convolution layers and by applying an anchor box, multi-scale training, and fine-grained features. However, the detection accuracy is still low for small or dense objects. Therefore, YOLOv2 is unsuitable for autonomous driving applications, where a high accuracy is required for dense road objects and small objects such as traffic signs and lights.

To overcome the disadvantages of YOLOv2, YOLOv3 [21] has been proposed. YOLOv3 consists of convolution layers, as shown in Figure 1a, and is constructed as a deep network for improved accuracy. YOLOv3 applies residual skip connections to solve the vanishing gradient problem of deep networks and uses an up-sampling and concatenation method that preserves fine-grained features for small object detection. Its most prominent feature is detection at three different scales, in a similar manner to a feature pyramid network [13], which allows YOLOv3 to detect objects of various sizes. In more detail, when an image with the three channels R, G, and B is input into the YOLOv3 network, as shown in Figure 1a, information on the object detection (i.e., bbox coordinates, objectness score, and class scores) is output from three detection layers. The predicted results of the three detection layers are combined and processed using non-maximum suppression, after which the final detection results are determined. Because YOLOv3 is a fully convolutional network consisting only of small-sized convolution filters of 1 × 1 and 3 × 3, like YOLOv2 [20], its detection speed is as fast as that of YOLO [19] and YOLOv2 [20]. Therefore, in terms of the trade-off between accuracy and speed, YOLOv3 is suitable for autonomous driving applications and is widely used in autonomous driving research [3]. However, in general, it still has a lower accuracy than a two-stage detector using a region proposal stage. To compensate for this drawback, taking advantage of the smaller complexity of YOLOv3 compared with a two-stage detector, a more efficient detector for autonomous driving applications can be designed by applying an additional method for improving accuracy to YOLOv3 [21]. The Gaussian modeling and loss function reconstruction of YOLOv3 proposed in this paper can improve the accuracy by reducing the influence of noisy data during training and can predict the localization uncertainty. In addition, the detection accuracy can be further enhanced by using this predicted localization uncertainty. A detailed description of the above aspects is provided in Section 3.
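To make the post-processing step described above concrete, the following is a minimal sketch of how predictions from the three detection layers are typically concatenated and filtered with greedy non-maximum suppression. It is an illustration under assumptions of my own (box format (x1, y1, x2, y2), an IOU threshold of 0.45, and the helper names), not the implementation used in the paper.

import numpy as np

def iou(box, boxes):
    # IOU between one box and an array of boxes, all in (x1, y1, x2, y2) format.
    x1 = np.maximum(box[0], boxes[:, 0])
    y1 = np.maximum(box[1], boxes[:, 1])
    x2 = np.minimum(box[2], boxes[:, 2])
    y2 = np.minimum(box[3], boxes[:, 3])
    inter = np.clip(x2 - x1, 0, None) * np.clip(y2 - y1, 0, None)
    area_a = (box[2] - box[0]) * (box[3] - box[1])
    area_b = (boxes[:, 2] - boxes[:, 0]) * (boxes[:, 3] - boxes[:, 1])
    return inter / (area_a + area_b - inter + 1e-9)

def nms(boxes, scores, iou_th=0.45):
    # Greedy non-maximum suppression; returns indices of the kept boxes.
    order = np.argsort(scores)[::-1]
    keep = []
    while order.size > 0:
        best = order[0]
        keep.append(int(best))
        rest = order[1:]
        order = rest[iou(boxes[best], boxes[rest]) < iou_th]
    return keep

def merge_scales(per_scale_boxes, per_scale_scores):
    # Predictions from the three detection layers are simply concatenated before NMS.
    boxes = np.concatenate(per_scale_boxes, axis=0)
    scores = np.concatenate(per_scale_scores, axis=0)
    keep = nms(boxes, scores)
    return boxes[keep], scores[keep]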
3. Gaussian YOLOv3

3.1. Gaussian modeling

As shown in Figure 1b, the prediction feature map of YOLOv3 [21] has three prediction boxes per grid cell, where each prediction box consists of the bbox coordinates (i.e., tx, ty, tw, and th), the objectness score, and the class scores. YOLOv3 outputs the objectness (i.e., whether an object is present in the bbox or not) and the class (i.e., the category of the object) as scores between zero and one, and an object is then detected based on the product of these two values. Unlike the objectness and class information, the bbox coordinates are output as deterministic coordinate values instead of scores, and thus the confidence of the detected bbox is unknown. Moreover, the objectness score does not reflect the reliability of the bbox well, so how uncertain the bbox result is remains unknown. In contrast, the uncertainty of the bbox, which is predicted by the proposed method, serves as a bbox score and can thus be used as an indicator of how uncertain the bbox is. The results for this are described in Section 4.1.

In YOLOv3, the bbox regression extracts the bbox center information (i.e., tx and ty) and the bbox size information (i.e., tw and th). Because there is only one correct answer (i.e., the GT) for the bbox of an object, complex modeling is not required for predicting the localization uncertainty. In other words, the uncertainty of the bbox can be modeled using a single Gaussian model for each of tx, ty, tw, and th. A single Gaussian model of output y for a given test input x, whose output consists of Gaussian parameters, is as follows:

p(y|x) = N(y; \mu(x), \Sigma(x)),   (1)

where \mu(x) and \Sigma(x) are the mean and variance functions, respectively.

To predict the uncertainty of the bbox, each of the bbox coordinates in the prediction feature map is modeled as a mean (µ) and a variance (Σ), as shown in Figure 2. The outputs for the bbox are µ̂_tx, Σ̂_tx, µ̂_ty, Σ̂_ty, µ̂_tw, Σ̂_tw, µ̂_th, and Σ̂_th.

[Figure 2: Components in the prediction box of the proposed algorithm: the Gaussian parameters of the bbox coordinates (µ_tx, Σ_tx, µ_ty, Σ_ty, µ_tw, Σ_tw, µ_th, Σ_th), the objectness score, and the class scores.]

Considering the structure of the detection layer in YOLOv3, the Gaussian parameters for tx, ty, tw, and th are preprocessed as follows:

\mu_{t_x} = \sigma(\hat{\mu}_{t_x}), \quad \mu_{t_y} = \sigma(\hat{\mu}_{t_y}), \quad \mu_{t_w} = \hat{\mu}_{t_w}, \quad \mu_{t_h} = \hat{\mu}_{t_h}   (2)

\Sigma_{t_x} = \sigma(\hat{\Sigma}_{t_x}), \quad \Sigma_{t_y} = \sigma(\hat{\Sigma}_{t_y}), \quad \Sigma_{t_w} = \sigma(\hat{\Sigma}_{t_w}), \quad \Sigma_{t_h} = \sigma(\hat{\Sigma}_{t_h})   (3)

\sigma(x) = \frac{1}{1 + \exp(-x)}.   (4)

The mean value of each coordinate in the detection layer is the predicted coordinate of the bbox, and each variance represents the uncertainty of that coordinate. µ_tx and µ_ty in (2) must represent the center coordinates of the bbox inside the grid cell, and they are thus processed into values between zero and one with the sigmoid function in (4). The variances of each coordinate in (3) are also processed into values between zero and one with the sigmoid function. In YOLOv3, the width and height information of the bbox is processed through tw, th, the bbox priors, and exponential functions [21]. In other words, µ_tw and µ_th in (2), which correspond to the tw and th of YOLOv3, are not passed through the sigmoid function because they can have both negative and positive values.

Single Gaussian modeling for predicting the uncertainty of the bbox is applied only to the bbox coordinates of the YOLOv3 detection layers shown in Figure 1a. Therefore, the overall computational complexity of the algorithm does not increase significantly. At a 512 × 512 input resolution with ten classes, YOLOv3 requires 99 × 10^9 FLOPs; after single Gaussian modeling of the bbox, 99.04 × 10^9 FLOPs are required. Thus, the penalty on the detection speed is extremely low because the computation cost increases by only 0.04% compared with that before the modeling. The related results are shown in Section 4.

3.2. Reconstruction of loss function

For training, YOLOv3 [21] uses the sum of squared error loss for the bbox and the binary cross-entropy loss for the objectness and class. Because the bbox coordinates are output as Gaussian parameters through the Gaussian modeling, the loss function of the bbox is redesigned as a negative log likelihood (NLL) loss, whereas the loss functions for the objectness and class are unchanged. The loss function redesigned for the bbox is as follows:

L_x = -\sum_{i=1}^{W} \sum_{j=1}^{H} \sum_{k=1}^{K} \gamma_{ijk} \log\big( N(x^G_{ijk} \mid \mu_{t_x}(x_{ijk}), \Sigma_{t_x}(x_{ijk})) + \varepsilon \big),   (5)

where L_x is the NLL loss of the t_x coordinate, and the others (i.e., L_y, L_w, and L_h) are defined in the same way with their respective parameters. W and H are the number of grid cells along the width and height, respectively, and K is the number of anchors. Moreover, µ_tx(x_ijk) denotes the t_x coordinate output by the detection layer of the proposed algorithm at the k-th anchor in the (i, j) grid cell. In addition, Σ_tx(x_ijk) is also an output of the detection layer, indicating the uncertainty of the t_x coordinate, and x^G_ijk is the GT of the t_x coordinate. The GT of the bbox is then computed as follows:

x^G_{ijk} = x^G \times W - i, \qquad y^G_{ijk} = y^G \times H - j   (6)

w^G_{ijk} = \log\!\left(\frac{w^G \times I_W}{A^w_k}\right), \qquad h^G_{ijk} = \log\!\left(\frac{h^G \times I_H}{A^h_k}\right),   (7)

where x^G, y^G, w^G, and h^G are the ratios of the GT bbox in the image, I_W and I_H are the width and height of the resized image, and A^w_k and A^h_k denote the width and height of the k-th anchor box prior, respectively.
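To make (2)-(7) concrete, the sketch below shows, under assumed tensor layouts and helper names of my own choosing (split_gaussian_params, gaussian_nll, encode_gt), how the raw detection-layer outputs could be turned into Gaussian parameters, how the per-coordinate NLL loss of (5) could be evaluated, and how a GT box given as image ratios could be encoded with (6) and (7). It is a sketch of the equations as written, not the authors' code.

import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def split_gaussian_params(pred, num_classes):
    # Assumed per-anchor layout (illustrative):
    # [mu_tx, var_tx, mu_ty, var_ty, mu_tw, var_tw, mu_th, var_th, obj, class_0 .. class_n]
    mu = np.array([sigmoid(pred[0]), sigmoid(pred[2]), pred[4], pred[6]])   # Eq. (2): tw/th means stay unbounded
    var = sigmoid(np.array([pred[1], pred[3], pred[5], pred[7]]))           # Eq. (3): all variances in (0, 1)
    obj = sigmoid(pred[8])
    cls = sigmoid(pred[9:9 + num_classes])
    return mu, var, obj, cls

def gaussian_nll(gt, mean, var, gamma, eps=1e-9):
    # Eq. (5) for one coordinate; gt, mean, var, gamma have shape (H, W, K),
    # and gamma is zero wherever no GT is assigned, so those cells drop out of the loss.
    pdf = np.exp(-0.5 * (gt - mean) ** 2 / var) / np.sqrt(2.0 * np.pi * var)
    return -np.sum(gamma * np.log(pdf + eps))

def encode_gt(x, y, w, h, grid_w, grid_h, anchor_w, anchor_h, img_w, img_h):
    # x, y, w, h are GT box ratios of the image; returns the responsible cell and targets.
    i = int(np.floor(x * grid_w))
    j = int(np.floor(y * grid_h))
    x_g = x * grid_w - i                   # Eq. (6): center offsets inside the cell
    y_g = y * grid_h - j
    w_g = np.log(w * img_w / anchor_w)     # Eq. (7): log-ratio of the box size to the anchor prior
    h_g = np.log(h * img_h / anchor_h)
    return (i, j), (x_g, y_g, w_g, h_g)

A full implementation would apply split_gaussian_params over every grid cell and anchor of the three detection layers and sum the four coordinate losses together with the unchanged objectness and class losses.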
In YOLOv3, the centroid of the bbox is calculated in grid units and the size of the bbox is calculated based on an anchor box; the GT is therefore processed accordingly for training.

\gamma_{ijk} = \frac{\omega_{scale} \times \delta^{obj}_{ijk}}{2}   (8)

\omega_{scale} = 2 - w^G \times h^G.   (9)

ω_scale in (8) is calculated based on the width and height ratios of the GT bbox in the image, as shown in (9), and provides different weights according to the object size during training. In addition, δ^obj_ijk in (8) is a parameter that includes a prediction in the loss only when one of the predefined anchors best matches the current object. It is assigned a value of one when the intersection over union (IOU) between the GT and the k-th anchor box in the (i, j) grid cell is the largest, and a value of zero when there is no appropriate GT. For numerical stability of the logarithmic function, ε is assigned a value of 10^-9.

Because YOLOv3 uses the sum of squared error loss for the bbox, it is unable to cope with noisy data during training. In contrast, the redesigned loss function of the bbox can apply a penalty to the loss through the uncertainty of inconsistent data during training; that is, the model can be trained by concentrating on consistent data. The redesigned loss function of the bbox therefore makes the model more robust to noisy data [12]. Through this loss attenuation [12], it is possible to improve the accuracy of the algorithm.

3.3. Utilization of localization uncertainty

The proposed Gaussian YOLOv3 can obtain the uncertainty of the bbox for every object detected in an image. Because this is not an uncertainty for the entire image, it is possible to apply the uncertainty to each detection result. YOLOv3 considers only the objectness score and class scores during object detection and cannot consider a bbox score during the detection process because the score information for the bbox coordinates is unknown. However, Gaussian YOLOv3 can output the localization uncertainty, which serves as the score of the bbox. Therefore, the localization uncertainty can be considered along with the objectness score and class scores during the detection process. The proposed algorithm applies the localization uncertainty to the detection criterion of YOLOv3 such that bboxes with a high uncertainty among the predicted results are filtered out during the detection process. In this way, predictions with a high confidence in the objectness, class, and bbox are finally selected. Thus, Gaussian YOLOv3 can reduce the FP and increase the TP, thereby improving the detection accuracy. The proposed detection criterion considering the localization uncertainty is as follows:

Cr. = \sigma(Object) \times \sigma(Class_i) \times (1 - Uncertainty_{aver}).   (10)

Cr. in (10) indicates the detection criterion for Gaussian YOLOv3, σ(Object) is the objectness score, and σ(Class_i) is the score of the i-th class. In addition, Uncertainty_aver, which is the localization uncertainty, indicates the average of the uncertainties of the predicted bbox coordinates. The localization uncertainty has a value between zero and one, like the objectness score and class scores, and the higher the localization uncertainty, the lower the confidence of the predicted bbox. The results of the proposed Gaussian YOLOv3 are described in Section 4.
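A per-box application of the criterion in (10) can be sketched as follows; the 0.5 threshold matches the default detection threshold mentioned in Section 4.3, while the function and variable names are illustrative assumptions rather than the authors' code.

import numpy as np

def detection_criterion(objectness, class_scores, coord_uncertainties):
    # Eq. (10): the localization uncertainty is the average of the four coordinate uncertainties.
    uncertainty_aver = float(np.mean(coord_uncertainties))
    return objectness * np.asarray(class_scores) * (1.0 - uncertainty_aver)

def keep_detection(objectness, class_scores, coord_uncertainties, thresh=0.5):
    # A box is kept only if its best class passes the combined criterion, so boxes with
    # confident objectness/class but uncertain coordinates are filtered out.
    scores = detection_criterion(objectness, class_scores, coord_uncertainties)
    best = int(np.argmax(scores))
    return scores[best] >= thresh, best, float(scores[best])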
4. Experimental Results

In the experiments, the KITTI dataset [7], which is commonly used in autonomous driving research, and the BDD dataset [26], which is the latest published autonomous driving dataset, are used. The KITTI dataset consists of three classes, car, cyclist, and pedestrian, and contains 7,481 images for training and 7,518 images for testing. Because there is no GT for the test images, the training and validation sets are made by randomly splitting the training set in half [25]. The BDD dataset consists of ten classes: bike, bus, car, motor, person, rider, traffic light, traffic sign, train, and truck. The ratio of the training, validation, and test sets is 7:1:2, and the test set is used for the performance evaluation in this paper. In general, the IOU threshold (TH) of the KITTI dataset is set to 0.7 for cars and 0.5 for cyclists and pedestrians [7], whereas the IOU TH of the BDD dataset is 0.75 for all classes [26]. In both YOLOv3 and Gaussian YOLOv3 training, the batch size is 64 and the learning rate is 0.0001. The anchor sizes are extracted using k-means clustering on each training set of KITTI and BDD, and the anchors used in the training and evaluation are shown in Table 1. The other studies are trained using the default settings of the official code of each algorithm. The experiments are conducted on an NVIDIA GTX 1080 Ti with CUDA 8.0 and cuDNN v7.

                           Anchor 0    Anchor 1    Anchor 2
KITTI training set
  First detection layer    (49,240)    (82,170)    (118,206)
  Second detection layer   (45,76)     (27,172)    (67,116)
  Third detection layer    (13,30)     (23,53)     (17,102)
BDD training set
  First detection layer    (73,175)    (141,178)   (144,291)
  Second detection layer   (32,97)     (57,64)     (92,109)
  Third detection layer    (7,10)      (14,24)     (27,43)

Table 1: Results of anchor boxes of the training sets.

4.1. Validation in utilizing localization uncertainty

Figure 3 shows the relationship between the IOU and the localization uncertainty of the bbox for the KITTI and BDD validation sets. These results are plotted for cars, which is the dominant class in both datasets, and the localization uncertainty is predicted using the proposed algorithm. To show the typical tendency, the IOU is divided into increments of 0.1, and the average value of the IOU and the average value of the localization uncertainty are calculated for each range and used as representative values. As shown in Figure 3, the IOU tends to increase as the localization uncertainty decreases in both datasets. A larger IOU indicates that the coordinates of the predicted bbox are closer to those of the GT. Based on these results, the localization uncertainty of the proposed algorithm effectively represents the confidence of the predicted bbox. It is therefore possible to cope with mislocalizations and improve the accuracy by utilizing the localization uncertainty predicted by the proposed algorithm.

[Figure 3: IOU versus localization uncertainty on the KITTI and BDD validation sets; the average IOU increases as the average localization uncertainty decreases for both datasets.]
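The per-bin averaging behind Figure 3 can be reproduced with a few lines; the sketch below assumes one IOU value (against the matched GT) and one predicted localization uncertainty per detected car, and only the 0.1 bin width comes from the description above.

import numpy as np

def binned_iou_vs_uncertainty(ious, uncertainties, step=0.1):
    # Average IOU and average localization uncertainty within IOU bins of width `step`,
    # giving one representative point per non-empty bin, as plotted in Figure 3.
    ious = np.asarray(ious)
    uncertainties = np.asarray(uncertainties)
    edges = np.arange(0.0, 1.0 + step, step)
    mean_iou, mean_unc = [], []
    for lo, hi in zip(edges[:-1], edges[1:]):
        mask = (ious >= lo) & (ious < hi)
        if mask.any():
            mean_iou.append(float(ious[mask].mean()))
            mean_unc.append(float(uncertainties[mask].mean()))
    return np.array(mean_iou), np.array(mean_unc)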
4.2. Performance evaluation of Gaussian YOLOv3

To demonstrate the superiority of the proposed algorithm, its performance (i.e., accuracy and detection speed) is compared with that of other studies [1, 11, 17, 28, 29, 16, 21]. In the experiment on the KITTI validation set, the other studies [1, 11, 17, 28, 16, 21] are trained and evaluated using the officially published code of each algorithm. In the case of CFENet [29], the result from the KITTI object detection leaderboard is used because the official code has not been published. In the experiment on the BDD test data, the results of SSD [17], CFENet [29], and RefineDet [28] on the BDD test set are specified in CFENet [29], and the results of these studies are therefore taken from [29], whereas the remaining comparative studies [1, 11, 16, 21] are trained and evaluated using the officially published codes, because these studies were not developed with the BDD dataset as a target and have therefore not been evaluated on it in previous work. For a fair comparison of the one-stage detectors, the input resolution is set as in CFENet [29]. The two-stage detectors use the default resolution of each officially published code. The official evaluation method of each dataset is used for the accuracy comparison, and the IOU TH is set to the values mentioned before. For the accuracy comparison, the mAP, which has been widely used in previous studies on object detection, is selected.

Table 2 shows the performance of the proposed algorithm and other methods on the KITTI validation set. The mAP of the proposed algorithm, Gaussian YOLOv3, improves by 3.09 compared to that of YOLOv3, and the detection speed is 43.13 fps, which enables real-time detection with only a slight difference from YOLOv3. Gaussian YOLOv3 is 3.93 fps faster than RFBNet [16], which has the fastest operation speed among the previous studies with the exception of YOLOv3, despite the mAP of Gaussian YOLOv3 outperforming that of RFBNet [16] by more than 10.17. In addition, although the mAP of Gaussian YOLOv3 at a 512 × 512 resolution is 1.81 lower than that of SINet [11], which has the highest accuracy among the previous methods, it is noteworthy that the fps of the proposed method is 1.8 times that of SINet [11]. Because there is a trade-off between accuracy and detection speed, for a fair comparison the input resolution of the proposed algorithm is changed and evaluated considering the fps of SINet [11]. The experimental results show that the mAP of Gaussian YOLOv3 at a 704 × 704 resolution, shown in the last row of Table 2, is 86.79 at 24.91 fps, and consequently Gaussian YOLOv3 outperforms SINet [11] in terms of both accuracy and detection speed.

Table 3 shows the performance of the proposed approach and other methods on the BDD test set. Gaussian YOLOv3 improves the mAP by 3.5 compared with YOLOv3, and the detection speed is 42.5 fps, which is almost the same as that of YOLOv3. In addition, Gaussian YOLOv3 is 3.5 fps faster than RFBNet [16], which has the fastest operation speed among the previous studies except for YOLOv3, despite the accuracy of Gaussian YOLOv3 outperforming that of RFBNet [16] by 3.9 mAP. Moreover, compared to CFENet [29], which has the highest accuracy among the previous methods, Gaussian YOLOv3 with a 736 × 736 input resolution, shown in the last row of Table 3, achieves an mAP higher by 1.7 and an operation speed faster by 1.5 fps, and consequently Gaussian YOLOv3 outperforms CFENet [29] in terms of both accuracy and detection speed.

Furthermore, on the COCO dataset [14], the AP of Gaussian YOLOv3 is 36.1, which is 3.1 higher than that of YOLOv3. In particular, the AP75 (i.e., the strict metric) of Gaussian YOLOv3 is 39.0, which is 4.6 higher than that of YOLOv3. These results indicate that the proposed algorithm outperforms YOLOv3 on a general dataset as well as on KITTI and BDD.

Based on these experimental results, because the proposed algorithm can significantly improve the accuracy with little penalty in speed compared to YOLOv3, Gaussian YOLOv3 is superior to the previous methods.
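For reference, the mAP reported in this section is the mean over classes of the average precision computed from ranked detections; a generic all-point-interpolation version is sketched below. The exact evaluation protocols of KITTI, BDD, and COCO differ in details such as difficulty levels and interpolation, so this is only an assumption-level illustration, not any of the official evaluation codes.

import numpy as np

def average_precision(scores, is_tp, num_gt):
    # scores: confidence of each detection of one class; is_tp: 1 if it matched a GT box, else 0.
    order = np.argsort(scores)[::-1]
    tp = np.cumsum(np.asarray(is_tp)[order])
    fp = np.cumsum(1 - np.asarray(is_tp)[order])
    recall = tp / max(num_gt, 1)
    precision = tp / np.maximum(tp + fp, 1)
    # All-point interpolation: make precision monotonically decreasing, then sum over recall steps.
    precision = np.maximum.accumulate(precision[::-1])[::-1]
    recall = np.concatenate(([0.0], recall))
    precision = np.concatenate(([precision[0] if len(precision) else 0.0], precision))
    return float(np.sum((recall[1:] - recall[:-1]) * precision[1:]))

# The mAP is then the mean of average_precision over all classes of the dataset.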
Table 2: Performance comparison using the KITTI validation set. E, M, and H refer to easy, moderate, and hard, respectively. Values are average precision (%).

                              Car                  Pedestrian              Cyclist
Detection algorithm      E      M      H       E      M      H       E      M      H     mAP (%)   FPS     Input size
MS-CNN [1]             92.54  90.49  79.23   87.46  81.34  72.49   90.13  87.59  81.11   84.71      8.13   1920×576
SINet [11]             99.11  90.59  79.77   88.09  79.22  70.30   94.41  86.61  80.68   85.42     23.98   1920×576
SSD [17]               88.37  87.84  79.15   50.33  48.87  44.97   48.00  52.51  51.52   61.29     28.93   512×512
RefineDet [28]         98.96  90.44  88.82   84.40  77.44  73.52   86.33  80.22  79.15   84.36     27.81   512×512
CFENet [29]            90.33  90.22  84.85     -      -      -       -      -      -       -        0.25   -
RFBNet [16]            87.41  88.35  83.41   65.85  61.30  57.71   74.46  72.73  69.75   73.44     39.20   512×512
YOLOv3 [21]            85.68  76.89  75.89   83.51  78.37  75.16   88.94  80.64  79.62   80.52     43.57   512×512
Gaussian YOLOv3        90.61  90.20  81.19   87.84  79.57  72.30   89.31  81.30  80.20   83.61     43.13   512×512
Gaussian YOLOv3        98.74  90.48  89.47   87.85  79.96  76.81   90.08  86.59  81.09   86.79     24.91   704×704

Table 3: Performance comparison using the BDD test set.

Detection algorithm    mAP (%)   FPS    Input size
MS-CNN [1]               5.7      6.0   1920×576
SINet [11]               9.0     18.2   1920×576
SSD [17]                14.1     23.1   512×512
RefineDet [28]          17.4     22.3   512×512
CFENet [29]             19.1     21.0   512×512
RFBNet [16]             14.5     39.0   512×512
YOLOv3 [21]             14.9     42.9   512×512
Gaussian YOLOv3         18.4     42.5   512×512
Gaussian YOLOv3         20.8     22.5   736×736

4.3. Visual and numerical evaluation of FP and TP

For a visual evaluation of Gaussian YOLOv3, Figures 4 and 5 show detection examples of the baseline and Gaussian YOLOv3 on the KITTI validation set and the BDD test set, respectively. The detection TH is 0.5, which is the default test TH of YOLOv3. The results in the first row of Figure 4 and in the first column of Figure 5 show that Gaussian YOLOv3 can detect objects that YOLOv3 cannot find, thereby increasing its TP. These positive results are obtained because the Gaussian modeling and loss function reconstruction of YOLOv3 proposed in this paper provide a loss attenuation effect during the learning process, so that the learning accuracy for the bbox is improved, which in turn enhances the performance of the objectness. Next, the results in the second row of Figure 4 and in the second column of Figure 5 show that Gaussian YOLOv3 can correct erroneous object detection results produced by YOLOv3. In addition, the results in the third row of Figure 4 and in the third column of Figure 5 show that Gaussian YOLOv3 can accurately detect the bbox of an object that is inaccurately localized by YOLOv3. Based on these results, Gaussian YOLOv3 can significantly reduce the FP and increase the TP, and consequently the driving stability and efficiency are improved and fatal accidents can be prevented.

[Figure 4: Detection results of the baseline and proposed algorithms on the KITTI validation set. The first column shows the detection results of YOLOv3, whereas the second column shows the detection results of Gaussian YOLOv3.]

[Figure 5: Detection results of the baseline and proposed algorithms on the BDD test set. The first and second rows show the detection results of YOLOv3 and Gaussian YOLOv3, respectively, and each color corresponds to a particular object class.]

For a numerical evaluation of the FP and TP of Gaussian YOLOv3, Table 4 shows the numbers of FPs and TPs for the baseline and Gaussian YOLOv3. The detection TH is the same as mentioned before. The KITTI and BDD validation sets are used to calculate the FP and TP because the GT is provided for the validation sets. For more accurate measurements, the FP and TP of the two datasets are calculated using the official evaluation code of BDD, because the KITTI official evaluation method does not count the FP when the bbox is within a certain size. For the KITTI and BDD validation sets, Gaussian YOLOv3 reduces the FP by 41.40% and 40.62%, respectively, compared to YOLOv3, and increases the TP by 7.26% and 4.3%, respectively. It should be noted that the reduction in the FP prevents unnecessary and unexpected braking, and the increase in the TP prevents fatal accidents caused by object detection errors. In conclusion, Gaussian YOLOv3 shows a better performance than YOLOv3 for both the FP and the TP, which are related to the safety of autonomous vehicles. Based on the results described in Sections 4.1, 4.2, and 4.3, the proposed algorithm outperforms previous studies and is the most suitable for autonomous driving applications.

Table 4: Numerical evaluation of FP and TP.

                         YOLOv3    Gaussian YOLOv3    Variation rate (%)
KITTI validation set
  # of FP                 1,681          985              -41.40
  # of TP                13,575       14,560              +7.26
  # of GT                17,607       17,607                0
BDD validation set
  # of FP                86,380       51,296              -40.62
  # of TP                57,261       59,724              +4.30
  # of GT               185,578      185,578                0
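The FP and TP counts in Table 4 come from the official BDD evaluation code; at their core, such counts reduce to greedy IOU matching between predictions and GT boxes of the same class at the thresholds given in Section 4, roughly as sketched below (my own simplification, not the evaluation code itself).

import numpy as np

def count_tp_fp(pred_boxes, pred_scores, gt_boxes, iou_fn, iou_th=0.75):
    # Greedy matching for one image and one class.
    # pred_boxes: (P, 4), pred_scores: (P,), gt_boxes: (G, 4);
    # iou_fn(box, boxes) returns the IOUs of one box against an array of boxes.
    order = np.argsort(pred_scores)[::-1]        # match the most confident predictions first
    matched = np.zeros(len(gt_boxes), dtype=bool)
    tp = fp = 0
    for p in order:
        if len(gt_boxes) == 0:
            fp += 1
            continue
        overlaps = iou_fn(pred_boxes[p], gt_boxes)
        best = int(np.argmax(overlaps))
        if overlaps[best] >= iou_th and not matched[best]:
            matched[best] = True
            tp += 1
        else:
            fp += 1
    return tp, fp

The iou helper from the sketch in Section 2 can be passed as iou_fn; unmatched GT boxes would count as false negatives.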
5. Conclusion

A high accuracy and a real-time detection speed of an object detection algorithm are extremely important for the safety and real-time control of autonomous vehicles. Various studies related to camera-based autonomous driving have been conducted, but they are unsatisfactory in terms of the trade-off between the accuracy and operation speed. For this reason, this paper proposes an object detection algorithm that achieves the best trade-off between accuracy and speed for autonomous driving. Through Gaussian modeling, loss function reconstruction, and the utilization of localization uncertainty, the proposed algorithm improves the accuracy, increases the TP, and significantly reduces the FP, while maintaining the real-time capability. Compared to the baseline, the proposed Gaussian YOLOv3 algorithm improves the mAP by 3.09 and 3.5 on the KITTI and BDD datasets, respectively. Furthermore, because the proposed algorithm has a higher accuracy than previous studies with a similar fps, it is excellent in terms of the trade-off between accuracy and detection speed. As a result, the proposed algorithm can significantly improve camera-based object detection systems for autonomous driving and is consequently expected to contribute significantly to the wide use of autonomous driving applications.

Acknowledgement

This work was supported by the National Research Foundation of Korea (NRF) grant funded by the Korea government (MSIT) (No. 2019R1F1A1057530) and "The Project of Industrial Technology Innovation" through the Ministry of Trade, Industry and Energy (MOTIE) (10082585, 2017).
References

[1] Zhaowei Cai, Quanfu Fan, Rogerio S Feris, and Nuno Vasconcelos. A unified multi-scale deep convolutional neural network for fast object detection. In European Conference on Computer Vision, pages 354-370. Springer, 2016.
[2] Sungjoon Choi, Kyungjae Lee, Sungbin Lim, and Songhwai Oh. Uncertainty-aware learning from demonstration using mixture density networks with sampling-free variance modeling. In 2018 IEEE International Conference on Robotics and Automation (ICRA), pages 6915-6922. IEEE, 2018.
[3] Aleksa Ćorović, Velibor Ilić, Siniša Durić, Malisa Marijan, and Bogdan Pavković. The real-time detection of traffic participants using YOLO algorithm. In 2018 26th Telecommunications Forum (TELFOR), pages 1-4. IEEE, 2018.
[4] Jifeng Dai, Yi Li, Kaiming He, and Jian Sun. R-FCN: Object detection via region-based fully convolutional networks. In Advances in Neural Information Processing Systems, pages 379-387, 2016.
[5] Xuerui Dai. HybridNet: A fast vehicle detection system for autonomous driving. Signal Processing: Image Communication, 70:79-88, 2019.
[6] Di Feng, Lars Rosenbaum, and Klaus Dietmayer. Towards safe autonomous driving: Capture uncertainty in the deep neural network for lidar 3D vehicle detection. In 2018 21st International Conference on Intelligent Transportation Systems (ITSC), pages 3266-3273. IEEE, 2018.
[7] Andreas Geiger, Philip Lenz, and Raquel Urtasun. Are we ready for autonomous driving? The KITTI vision benchmark suite. In 2012 IEEE Conference on Computer Vision and Pattern Recognition, pages 3354-3361. IEEE, 2012.
[8] Ross Girshick. Fast R-CNN. In Proceedings of the IEEE International Conference on Computer Vision, pages 1440-1448, 2015.
[9] Kaiming He, Xiangyu Zhang, Shaoqing Ren, and Jian Sun. Deep residual learning for image recognition. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 770-778, 2016.
[10] Yihui He, Xiangyu Zhang, Marios Savvides, and Kris Kitani. Softer-NMS: Rethinking bounding box regression for accurate object detection. arXiv preprint arXiv:1809.08545, 2018.
[11] Xiaowei Hu, Xuemiao Xu, Yongjie Xiao, Hao Chen, Shengfeng He, Jing Qin, and Pheng-Ann Heng. SINet: A scale-insensitive convolutional neural network for fast vehicle detection. IEEE Transactions on Intelligent Transportation Systems, 20(3):1010-1019, 2019.
[12] Alex Kendall and Yarin Gal. What uncertainties do we need in Bayesian deep learning for computer vision? In Advances in Neural Information Processing Systems, pages 5574-5584, 2017.
[13] Tsung-Yi Lin, Piotr Dollár, Ross Girshick, Kaiming He, Bharath Hariharan, and Serge Belongie. Feature pyramid networks for object detection. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 2117-2125, 2017.
[14] Tsung-Yi Lin, Michael Maire, Serge Belongie, James Hays, Pietro Perona, Deva Ramanan, Piotr Dollár, and C Lawrence Zitnick. Microsoft COCO: Common objects in context. In European Conference on Computer Vision, pages 740-755. Springer, 2014.
[15] Feng Liu, Bingquan Liu, Chengjie Sun, Ming Liu, and Xiaolong Wang. Deep learning approaches for link prediction in social network services. In International Conference on Neural Information Processing, pages 425-432. Springer, 2013.
[16] Songtao Liu, Di Huang, et al. Receptive field block net for accurate and fast object detection. In Proceedings of the European Conference on Computer Vision (ECCV), pages 385-400, 2018.
[17] Wei Liu, Dragomir Anguelov, Dumitru Erhan, Christian Szegedy, Scott Reed, Cheng-Yang Fu, and Alexander C Berg. SSD: Single shot multibox detector. In European Conference on Computer Vision, pages 21-37. Springer, 2016.
[18] Aarian Marshall. False positive: Self-driving cars and the agony of knowing what matters. WIRED Transportation, 2018.
[19] Joseph Redmon, Santosh Divvala, Ross Girshick, and Ali Farhadi. You only look once: Unified, real-time object detection. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 779-788, 2016.
[20] Joseph Redmon and Ali Farhadi. YOLO9000: Better, faster, stronger. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 7263-7271, 2017.
[21] Joseph Redmon and Ali Farhadi. YOLOv3: An incremental improvement. arXiv preprint arXiv:1804.02767, 2018.
[22] Shaoqing Ren, Kaiming He, Ross Girshick, and Jian Sun. Faster R-CNN: Towards real-time object detection with region proposal networks. In Advances in Neural Information Processing Systems, pages 91-99, 2015.
[23] Young-Woo Seo, Nathan Ratliff, and Chris Urmson. Self-supervised aerial images analysis for extracting parking lot structure. In Twenty-First International Joint Conference on Artificial Intelligence, 2009.
[24] Junqing Wei, Jarrod M Snider, Junsung Kim, John M Dolan, Raj Rajkumar, and Bakhtiar Litkouhi. Towards a viable autonomous driving research platform. In 2013 IEEE Intelligent Vehicles Symposium (IV), pages 763-770. IEEE, 2013.
[25] Bichen Wu, Forrest Iandola, Peter H Jin, and Kurt Keutzer. SqueezeDet: Unified, small, low power fully convolutional neural networks for real-time object detection for autonomous driving. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops, pages 129-137, 2017.
[26] Fisher Yu, Wenqi Xian, Yingying Chen, Fangchen Liu, Mike Liao, Vashisht Madhavan, and Trevor Darrell. BDD100K: A diverse driving video database with scalable annotation tooling. arXiv preprint arXiv:1805.04687, 2018.
[27] Chi Zhang, Yuehu Liu, Danchen Zhao, and Yuanqi Su. RoadView: A traffic scene simulator for autonomous vehicle simulation testing. In 17th International IEEE Conference on Intelligent Transportation Systems (ITSC), pages 1160-1165. IEEE, 2014.
[28] Shifeng Zhang, Longyin Wen, Xiao Bian, Zhen Lei, and Stan Z Li. Single-shot refinement neural network for object detection. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 4203-4212, 2018.
[29] Qijie Zhao, Yongtao Wang, Tao Sheng, and Zhi Tang. Comprehensive feature enhancement module for single-shot object detector. In Asian Conference on Computer Vision. Springer, 2018.
