
2018 15th Conference on Computer and Robot Vision

Tiny SSD: A Tiny Single-shot Detection Deep Convolutional Neural Network for
Real-time Embedded Object Detection

Alexander Wong, Mohammad Javad Shafiee, Francis Li, Brendan Chwyl


Dept. of Systems Design Engineering
University of Waterloo, DarwinAI
{a28wong, mjshafiee}@uwaterloo.ca, {francis, brendan}@darwinai.ca

Abstract—Object detection is a major challenge in computer vision, involving both object classification and object localization within a scene. While deep neural networks have been shown in recent years to yield very powerful techniques for tackling the challenge of object detection, one of the biggest challenges with enabling such object detection networks for widespread deployment on embedded devices is their high computational and memory requirements. Recently, there has been an increasing focus on exploring small deep neural network architectures for object detection that are more suitable for embedded devices, such as Tiny YOLO and SqueezeDet. Inspired by the efficiency of the Fire microarchitecture introduced in SqueezeNet and the object detection performance of the single-shot detection macroarchitecture introduced in SSD, this paper introduces Tiny SSD, a single-shot detection deep convolutional neural network for real-time embedded object detection that is composed of a highly optimized, non-uniform Fire sub-network stack and a non-uniform sub-network stack of highly optimized SSD-based auxiliary convolutional feature layers designed specifically to minimize model size while maintaining object detection performance. The resulting Tiny SSD possesses a model size of 2.3MB (∼26X smaller than Tiny YOLO) while still achieving an mAP of 61.3% on VOC 2007 (∼4.2% higher than Tiny YOLO). These experimental results show that very small deep neural network architectures can be designed for real-time object detection that are well-suited for embedded scenarios.

Keywords—object detection; deep neural network; embedded; real-time; single-shot

Figure 1. Tiny SSD results on the VOC test set. The bounding boxes, categories, and confidences are shown.

I. INTRODUCTION

Object detection can be considered a major challenge in computer vision, as it involves a combination of object classification and object localization within a scene (see Figure 1). The advent of modern advances in deep learning [7], [6] has led to significant advances in object detection, with the majority of research focusing on designing increasingly more complex object detection networks for improved accuracy, such as SSD [9], R-CNN [1], Mask R-CNN [2], and other extended variants of these networks [4], [8], [15]. Despite the fact that such object detection networks have shown object detection accuracies beyond what can be achieved by previous methods, such networks are often intractable for use in embedded applications due to computational and memory constraints. In fact, even faster variants of these networks such as Faster R-CNN [13] are only capable of single-digit frame rates on a high-end graphics processing unit (GPU). As such, more efficient deep neural networks for real-time embedded object detection are highly desired given the large number of operational scenarios that such networks would enable, ranging from smartphones to aerial drones.

Recently, there has been an increasing focus on exploring small deep neural network architectures for object detection that are more suitable for embedded devices. For example, Redmon et al. introduced YOLO [11] and YOLOv2 [12], which were designed with speed in mind and were able to achieve real-time object detection performance on a high-end Nvidia Titan X desktop GPU. However, the model sizes of YOLO and YOLOv2 remain very large (753 MB and 193 MB, respectively), making them too large from a memory perspective for most embedded devices. Furthermore, their object detection speed drops considerably when running on embedded chips [14]. To address this issue, Tiny YOLO [10] was introduced, where the network architecture was reduced considerably to greatly reduce model size (60.5 MB) as well as the number of floating-point operations required (just 6.97 billion operations), at a cost of object detection accuracy (57.1% on the twenty-category VOC 2007 test set). Similarly, Wu et al. introduced SqueezeDet [16], a fully convolutional neural network that leveraged the efficient Fire microarchitecture introduced in SqueezeNet [5] within an end-to-end object detection network architecture. Given that the Fire microarchitecture is highly efficient, the resulting SqueezeDet had a reduced model size specifically for the

purpose of autonomous driving. However, SqueezeDet has only been demonstrated for object detection with a limited number of object categories (only three), and thus its ability to handle a larger number of categories has not been demonstrated. As such, the design of highly efficient deep neural network architectures that are well-suited for real-time embedded object detection, while achieving improved object detection accuracy on a variety of object categories, remains a challenge worth tackling.

In an effort to achieve a fine balance between object detection accuracy and real-time embedded requirements (i.e., small model size and real-time embedded inference speed), we take inspiration from both the incredible efficiency of the Fire microarchitecture introduced in SqueezeNet [5] and the powerful object detection performance demonstrated by the single-shot detection macroarchitecture introduced in SSD [9]. The resulting network architecture achieved in this paper is Tiny SSD, a single-shot detection deep convolutional neural network designed specifically for real-time embedded object detection. Tiny SSD is composed of a non-uniform, highly optimized Fire sub-network stack, which feeds into a non-uniform sub-network stack of highly optimized SSD-based auxiliary convolutional feature layers, designed specifically to minimize model size while retaining object detection performance.

This paper is organized as follows. Section 2 describes the highly optimized Fire sub-network stack leveraged in the Tiny SSD network architecture. Section 3 describes the highly optimized sub-network stack of SSD-based convolutional feature layers used in the Tiny SSD network architecture. Section 4 presents experimental results that evaluate the efficacy of Tiny SSD for real-time embedded object detection. Finally, conclusions are drawn in Section 5.

Figure 2. An illustration of the Fire microarchitecture. The output of the previous layer is squeezed by a squeeze convolutional layer of 1 × 1 filters, which reduces the number of input channels to the 3 × 3 filters. The result of the squeeze convolutional layer is passed into the expand convolutional layer, which consists of both 1 × 1 and 3 × 3 filters.

II. OPTIMIZED FIRE SUB-NETWORK STACK

The overall network architecture of the Tiny SSD network for real-time embedded object detection is composed of two main sub-network stacks: i) a non-uniform Fire sub-network stack, and ii) a non-uniform sub-network stack of highly optimized SSD-based auxiliary convolutional feature layers, with the first sub-network stack feeding into the second sub-network stack. In this section, let us first discuss in detail the design philosophy behind the first sub-network stack of the Tiny SSD network architecture: the optimized Fire sub-network stack.

A powerful approach to designing smaller deep neural network architectures for embedded inference is to take a more principled approach and leverage architectural design strategies to achieve more efficient deep neural network microarchitectures [3], [5]. A very illustrative example of such a principled approach is the SqueezeNet [5] network architecture, where three key design strategies were leveraged:

1) reduce the number of 3 × 3 filters as much as possible,
2) reduce the number of input channels to 3 × 3 filters where possible, and
3) perform downsampling at a later stage in the network.

This principled design strategy led to what the authors referred to as the Fire module, which consists of a squeeze convolutional layer of 1 × 1 filters (which realizes the second design strategy of effectively reducing the number of input channels to 3 × 3 filters) that feeds into an expand convolutional layer comprised of both 1 × 1 filters and 3 × 3 filters (which realizes the first design strategy of effectively reducing the number of 3 × 3 filters). An illustration of the Fire microarchitecture is shown in Figure 2.

Inspired by the elegance and simplicity of the Fire microarchitecture design, we design the first sub-network stack of the Tiny SSD network architecture as a standard convolutional layer followed by a set of highly optimized Fire modules. One of the key challenges in designing this sub-network stack is to determine the ideal number of Fire modules, as well as the ideal microarchitecture of each of the Fire modules, to achieve a fine balance between object detection performance on one hand and model size and inference speed on the other. First, it was determined empirically that 10 Fire modules in the optimized Fire sub-network stack provided strong object detection performance. In terms of the ideal microarchitecture, the key design parameters of the Fire microarchitecture are the number of filters of each size (1 × 1 or 3 × 3) that form the microarchitecture. In the SqueezeNet network architecture that first introduced the Fire microarchitecture [5], the microarchitectures of the Fire modules are largely uniform, with many of the modules sharing the same microarchitecture configuration. In an effort to achieve more optimized Fire microarchitectures on a per-module basis, the number of filters of each size in each Fire module is optimized to have as few parameters as possible while still maintaining the overall object detection accuracy. As a result, the optimized Fire sub-network stack in the Tiny SSD network architecture is highly non-uniform in nature, yielding an optimal sub-network architecture configuration. Table I shows the overall architecture of the highly optimized Fire sub-network stack in Tiny SSD, along with the number of filters and the input size of each layer of the sub-network stack.
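As a concrete illustration of the Fire microarchitecture described above, consider the following minimal sketch. It is written in PyTorch purely for readability; the paper's implementation is in Caffe, and the class and argument names here are illustrative, not from the paper.

```python
# A minimal sketch of a Fire module, assuming the standard SqueezeNet
# formulation [5]: squeeze with 1x1 filters, then expand with parallel
# 1x1 and 3x3 branches whose outputs are concatenated.
import torch
import torch.nn as nn


class FireModule(nn.Module):
    def __init__(self, in_channels, squeeze, expand1x1, expand3x3):
        super().__init__()
        # Squeeze layer: 1x1 convolutions reduce the channel count fed
        # into the (more expensive) 3x3 expand filters.
        self.squeeze = nn.Conv2d(in_channels, squeeze, kernel_size=1)
        # Expand layer: parallel 1x1 and 3x3 branches.
        self.expand1x1 = nn.Conv2d(squeeze, expand1x1, kernel_size=1)
        self.expand3x3 = nn.Conv2d(squeeze, expand3x3, kernel_size=3, padding=1)
        self.relu = nn.ReLU(inplace=True)

    def forward(self, x):
        x = self.relu(self.squeeze(x))
        # Concatenate the two expand branches along the channel axis,
        # matching the Concat layers listed in Table I.
        return torch.cat(
            [self.relu(self.expand1x1(x)), self.relu(self.expand3x3(x))], dim=1
        )
```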

Table I. The optimized Fire sub-network stack of the Tiny SSD network architecture. The number of filters and the input size to each layer are reported for the convolutional layers and Fire modules. Each Fire module is reported in one row for a better representation. "x@S – y@E1 – z@E3" denotes x 1 × 1 filters in the squeeze convolutional layer, and y 1 × 1 filters and z 3 × 3 filters in the expand convolutional layer.

Type / Stride    Filter Shapes              Input Size
Conv1 / s2       3 × 3 × 57                 300 × 300
Pool1 / s2       3 × 3                      149 × 149
Fire1            15@S – 49@E1 – 53@E3       74 × 74
Concat1
Fire2            15@S – 54@E1 – 52@E3       74 × 74
Concat2
Pool3 / s2       3 × 3                      74 × 74
Fire3            29@S – 92@E1 – 94@E3       37 × 37
Concat3
Fire4            29@S – 90@E1 – 83@E3       37 × 37
Concat4
Pool5 / s2       3 × 3                      37 × 37
Fire5            44@S – 166@E1 – 161@E3     18 × 18
Concat5
Fire6            45@S – 155@E1 – 146@E3     18 × 18
Concat6
Fire7            49@S – 163@E1 – 171@E3     18 × 18
Concat7
Fire8            25@S – 29@E1 – 54@E3       18 × 18
Concat8
Pool9 / s2       3 × 3                      18 × 18
Fire9            37@S – 45@E1 – 56@E3       9 × 9
Concat9
Pool10 / s2      3 × 3                      9 × 9
Fire10           38@S – 41@E1 – 44@E3       4 × 4
Concat10
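To make the non-uniform configuration in Table I concrete, the sketch below assembles the stack from the FireModule sketch above, with the squeeze/expand filter counts taken directly from Table I. Note this is a simplified sequential view: in the full Tiny SSD, intermediate outputs (e.g., of Fire4 and Fire8) are also tapped by the detection predictors listed later in Table II, and activation/padding details are not specified by the paper.

```python
# A sketch of the Table I stack, assuming the FireModule class above.
import torch.nn as nn

# (squeeze, expand1x1, expand3x3) per Fire module, from Table I.
FIRE_CONFIGS = [
    (15, 49, 53), (15, 54, 52),       # Fire1, Fire2
    (29, 92, 94), (29, 90, 83),       # Fire3, Fire4
    (44, 166, 161), (45, 155, 146),   # Fire5, Fire6
    (49, 163, 171), (25, 29, 54),     # Fire7, Fire8
    (37, 45, 56), (38, 41, 44),       # Fire9, Fire10
]

def build_fire_stack() -> nn.Sequential:
    layers = [
        nn.Conv2d(3, 57, kernel_size=3, stride=2),   # Conv1 / s2
        nn.ReLU(inplace=True),
        nn.MaxPool2d(kernel_size=3, stride=2),       # Pool1 / s2
    ]
    in_channels = 57
    pools_after = {2, 4, 8, 9}  # Pool3, Pool5, Pool9, Pool10 in Table I
    for i, (s, e1, e3) in enumerate(FIRE_CONFIGS, start=1):
        layers.append(FireModule(in_channels, s, e1, e3))
        in_channels = e1 + e3  # the two expand branches are concatenated
        if i in pools_after:
            layers.append(nn.MaxPool2d(kernel_size=3, stride=2))
    return nn.Sequential(*layers)
```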
Figure 3. An illustration of the network architecture of the second sub-network stack of Tiny SSD. The outputs of three Fire modules and two auxiliary convolutional feature layers, all with highly optimized microarchitecture configurations, are combined together for object detection.

III. OPTIMIZED SUB-NETWORK STACK OF SSD-BASED CONVOLUTIONAL FEATURE LAYERS

In this section, let us discuss in detail the design philosophy behind the second sub-network stack of the Tiny SSD network architecture: the sub-network stack of highly optimized SSD-based auxiliary convolutional feature layers. One of the most widely-used and effective object detection network macroarchitectures in recent years has been the single-shot multibox detection (SSD) macroarchitecture [9]. The SSD macroarchitecture augments a base feature extraction network architecture with a set of auxiliary convolutional feature layers and convolutional predictors. The auxiliary convolutional feature layers are designed such that they decrease in size in a progressive manner, thus enabling the flexibility of detecting objects within a scene across different scales. Each of the auxiliary convolutional feature layers can then be leveraged to obtain either: i) a confidence score for an object category, or ii) a shape offset relative to default bounding box coordinates [9]. As a result, a number of object detections can be obtained per object category in a powerful, end-to-end, single-shot manner.

Inspired by the powerful object detection performance and multi-scale flexibility of the SSD macroarchitecture [9], the second sub-network stack of Tiny SSD is comprised of a set of auxiliary convolutional feature layers and convolutional predictors with highly optimized microarchitecture configurations (see Figure 3).

As with the Fire microarchitecture, a key challenge in designing this sub-network stack is to determine the ideal microarchitecture of each of the auxiliary convolutional feature layers and convolutional predictors to achieve a fine balance between object detection performance on one hand and model size and inference speed on the other. The key design parameters of the auxiliary convolutional feature layer microarchitecture are the number of filters that form the microarchitecture. As such, similar to the strategy taken for constructing the highly optimized Fire sub-network stack, the number of filters in each auxiliary convolutional feature layer is optimized to minimize the number of parameters while preserving the overall object detection accuracy of the full Tiny SSD network. As a result, the optimized sub-network stack of auxiliary convolutional feature layers in the Tiny SSD network architecture is highly non-uniform in nature, yielding an optimal sub-network architecture configuration. Table II shows the overall architecture of the optimized sub-network stack of the auxiliary convolutional feature layers within the Tiny SSD network architecture, along with the kernel size and input size of each layer.
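For illustration, below is a minimal sketch of what one pair of convolutional predictors from Table II computes, again in PyTorch with illustrative names (the paper's implementation is in Caffe). For a feature map with B default boxes per spatial position, the "loc" predictor outputs B × 4 box offsets and the "conf" predictor outputs B × C category scores. For example, the 84 output channels of Fire4-mbox-conf in Table II are consistent with B = 4 default boxes × C = 21 categories (the 20 VOC classes plus background), and the 16 channels of Fire4-mbox-loc with 4 boxes × 4 offsets.

```python
# A sketch of one SSD-style predictor pair attached to a tapped feature
# map (e.g., Fire4-mbox-loc / Fire4-mbox-conf in Table II). The 3x3
# predictor kernels follow Table II; all other details are assumptions.
import torch.nn as nn


class MultiboxHead(nn.Module):
    def __init__(self, in_channels, num_boxes, num_classes=21):
        super().__init__()
        self.num_classes = num_classes
        # Box-offset regressor: 4 coordinates per default box.
        self.loc = nn.Conv2d(in_channels, num_boxes * 4,
                             kernel_size=3, padding=1)
        # Category scorer: num_classes scores per default box.
        self.conf = nn.Conv2d(in_channels, num_boxes * num_classes,
                              kernel_size=3, padding=1)

    def forward(self, fmap):
        n = fmap.size(0)
        # (N, B*4, H, W) -> (N, H*W*B, 4), and likewise for scores, so that
        # predictions from all scales can be concatenated and decoded.
        loc = self.loc(fmap).permute(0, 2, 3, 1).reshape(n, -1, 4)
        conf = self.conf(fmap).permute(0, 2, 3, 1).reshape(n, -1, self.num_classes)
        return loc, conf
```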

Table II. The optimized sub-network stack of the auxiliary convolutional feature layers within the Tiny SSD network architecture. The kernel sizes and the input size to each convolutional layer are reported.

Type / Stride          Filter Shape      Input Size
Conv12-1 / s2          3 × 3 × 51        4 × 4
Conv12-2               3 × 3 × 46        4 × 4
Conv13-1               3 × 3 × 55        2 × 2
Conv13-2               3 × 3 × 85        2 × 2
Fire4-mbox-loc         3 × 3 × 16        37 × 37
Fire4-mbox-conf        3 × 3 × 84        37 × 37
Fire8-mbox-loc         3 × 3 × 24        18 × 18
Fire8-mbox-conf        3 × 3 × 126       18 × 18
Fire9-mbox-loc         3 × 3 × 24        9 × 9
Fire9-mbox-conf        3 × 3 × 126       9 × 9
Fire10-mbox-loc        3 × 3 × 24        4 × 4
Fire10-mbox-conf       3 × 3 × 126       4 × 4
Conv12-2-mbox-loc      3 × 3 × 24        2 × 2
Conv12-2-mbox-conf     3 × 3 × 126       2 × 2
Conv13-2-mbox-loc      3 × 3 × 16        1 × 1
Conv13-2-mbox-conf     3 × 3 × 84        1 × 1

IV. PARAMETER PRECISION OPTIMIZATION

In this section, let us discuss the parameter precision optimization strategy for Tiny SSD. For embedded scenarios where the computational and memory requirements are more strict, an effective strategy for reducing the computational and memory footprint of deep neural networks is to reduce the data precision of the parameters in the network. In particular, modern CPUs and GPUs have moved towards accelerated mixed-precision operations as well as better handling of reduced parameter precision, and thus the ability to take advantage of these factors can yield noticeable improvements for embedded scenarios. For Tiny SSD, the parameters are represented in half-precision floating-point format, leading to further model size reductions while having a negligible effect on object detection accuracy.
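As a sketch of this strategy, assuming a PyTorch-style model object (the paper does not specify the conversion tooling):

```python
import torch

def to_half_precision(model: torch.nn.Module) -> torch.nn.Module:
    # Cast parameters and buffers from 32-bit to 16-bit floats,
    # roughly halving the stored model size.
    return model.half()

# Back-of-the-envelope check against the reported numbers in Tables III/IV:
# 1.13M parameters x 2 bytes (FP16) ~= 2.26 MB, consistent with the 2.3MB
# model size reported for Tiny SSD (vs. ~4.5 MB at full FP32 precision).
```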


V. EXPERIMENTAL RESULTS AND DISCUSSION

To study the utility of Tiny SSD for real-time embedded object detection, we examine its model size, object detection accuracy, and computational cost on the VOC2007/2012 datasets. For evaluation purposes, the Tiny YOLO network [10] was used as a baseline reference comparison, given its popularity for embedded object detection and given that it was demonstrated to possess one of the smallest model sizes in the literature for object detection on the VOC2007/2012 datasets (only 60.5MB in size and requiring just 6.97 billion operations). The VOC2007/2012 datasets consist of natural images that have been annotated with 20 different types of objects, with illustrative examples shown in Figure 4. The tested deep neural networks were trained using the VOC2007/2012 training datasets, and the mean average precision (mAP) was computed on the VOC2007 test dataset to evaluate object detection accuracy.

A. Training Setup

The proposed Tiny SSD network was trained for 220,000 iterations in the Caffe framework with a training batch size of 24. RMSProp was utilized as the training policy, with the base learning rate set to 0.00001 and γ = 0.5.
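Expressed as an illustrative PyTorch analog of that setup (the actual training used Caffe's solver; mapping γ = 0.5 to a step-wise learning-rate decay factor, and the step interval itself, are assumptions since the paper does not report them):

```python
import torch

def make_training_policy(model: torch.nn.Module):
    # Base learning rate 1e-5 with batch size 24 over 220,000 iterations,
    # per Section V-A. The RMSProp smoothing constant is not reported.
    optimizer = torch.optim.RMSprop(model.parameters(), lr=1e-5)
    # gamma=0.5 halves the learning rate at each scheduled step; the
    # 50,000-iteration step interval is a placeholder, not from the paper.
    scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=50_000,
                                                gamma=0.5)
    return optimizer, scheduler
```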
Table III. Object detection accuracy results of Tiny SSD on the VOC 2007 test set. Tiny YOLO results are provided as a baseline comparison.

Model Name        Model Size    mAP (VOC 2007)
Tiny YOLO [10]    60.5MB        57.1%
Tiny SSD          2.3MB         61.3%

Table IV. Resource usage of Tiny SSD.

Model Name    Total Number of Parameters    Total Number of MACs
Tiny SSD      1.13M                         571.09M

B. Discussion

Table III shows the model size and the object detection accuracy of the proposed Tiny SSD network on the VOC 2007 test dataset, along with the model size and the object detection accuracy of Tiny YOLO. A number of interesting observations can be made. First, the resulting Tiny SSD possesses a model size of 2.3MB, which is ∼26X smaller than Tiny YOLO. The significantly smaller model size of Tiny SSD compared to Tiny YOLO illustrates its efficacy in greatly reducing the memory requirements of real-time embedded object detection. Second, it can be observed that the resulting Tiny SSD was still able to achieve an mAP of 61.3% on the VOC 2007 test dataset, which is ∼4.2% higher than that achieved using Tiny YOLO. Figure 5 shows several example object detection results produced by the proposed Tiny SSD compared to Tiny YOLO. It can be observed that Tiny SSD produces object detection results comparable to Tiny YOLO in some cases, while in other cases it outperforms Tiny YOLO by assigning more accurate category labels to detected objects. For example, in the first image case, Tiny SSD is able to detect the chair in the scene, while Tiny YOLO misses the chair. In the third image case, Tiny SSD is able to identify the dog in the scene, while Tiny YOLO detects two bounding boxes around the dog, with one of the bounding boxes incorrectly labeling it as a cat. This significant improvement in object detection accuracy when compared to Tiny YOLO illustrates the efficacy of Tiny SSD for providing more reliable embedded object detection performance. Furthermore, as seen in Table IV, Tiny SSD requires just 571.09 million MAC operations to perform inference, making it well-suited for real-time embedded object detection. These experimental results show that very small deep neural network architectures can be designed for real-time object detection that are well-suited for embedded scenarios.

Figure 4. Example images from the Pascal VOC dataset. The ground-truth bounding boxes and object categories are shown for each image.

Figure 5. Example object detection results produced by the proposed Tiny SSD compared to Tiny YOLO (columns: input image, Tiny YOLO, Tiny SSD). It can be observed that Tiny SSD produces object detection results comparable to Tiny YOLO in some cases, while in other cases it outperforms Tiny YOLO by assigning more accurate category labels to detected objects. This significant improvement in object detection accuracy when compared to Tiny YOLO illustrates the efficacy of Tiny SSD for providing more reliable embedded object detection performance.

VI. CONCLUSIONS

In this paper, a single-shot detection deep convolutional neural network called Tiny SSD is introduced for real-time embedded object detection. Composed of a highly optimized, non-uniform Fire sub-network stack and a non-uniform sub-network stack of highly optimized SSD-based auxiliary convolutional feature layers designed specifically to minimize model size while maintaining object detection performance, Tiny SSD possesses a model size that is ∼26X smaller than Tiny YOLO and requires just 571.09 million MAC operations, while still achieving an mAP that is ∼4.2% higher than Tiny YOLO on the VOC 2007 test dataset. These results demonstrate the efficacy of designing very small deep neural network architectures such as Tiny SSD for real-time object detection in embedded scenarios.

ACKNOWLEDGMENT

The authors thank the Natural Sciences and Engineering Research Council of Canada, the Canada Research Chairs Program, DarwinAI, and Nvidia for hardware support.

REFERENCES

[1] Ross Girshick, Jeff Donahue, Trevor Darrell, and Jitendra Malik. Rich feature hierarchies for accurate object detection and semantic segmentation. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 580–587, 2014.

[2] K. He, G. Gkioxari, P. Dollár, and R. Girshick. Mask R-CNN. In ICCV, 2017.

[3] Andrew G. Howard, Menglong Zhu, Bo Chen, Dmitry Kalenichenko, Weijun Wang, Tobias Weyand, Marco Andreetto, and Hartwig Adam. MobileNets: Efficient convolutional neural networks for mobile vision applications. arXiv preprint arXiv:1704.04861, 2017.

[4] Jonathan Huang, Vivek Rathod, Chen Sun, Menglong Zhu, Anoop Korattikara, Alireza Fathi, Ian Fischer, Zbigniew Wojna, Yang Song, Sergio Guadarrama, et al. Speed/accuracy trade-offs for modern convolutional object detectors. In IEEE CVPR, 2017.

[5] Forrest N. Iandola, Song Han, Matthew W. Moskewicz, Khalid Ashraf, William J. Dally, and Kurt Keutzer. SqueezeNet: AlexNet-level accuracy with 50x fewer parameters and <0.5MB model size. arXiv preprint arXiv:1602.07360, 2016.

[6] Alex Krizhevsky, Ilya Sutskever, and Geoffrey E. Hinton. ImageNet classification with deep convolutional neural networks. In NIPS, 2012.

[7] Yann LeCun, Yoshua Bengio, and Geoffrey Hinton. Deep learning. Nature, 2015.

[8] Tsung-Yi Lin, Piotr Dollár, Ross Girshick, Kaiming He, Bharath Hariharan, and Serge Belongie. Feature pyramid networks for object detection. In CVPR, volume 1, page 4, 2017.

[9] Wei Liu, Dragomir Anguelov, Dumitru Erhan, Christian Szegedy, Scott Reed, Cheng-Yang Fu, and Alexander C. Berg. SSD: Single shot multibox detector. In European Conference on Computer Vision, pages 21–37. Springer, 2016.

[10] J. Redmon. YOLO: Real-time object detection. https://pjreddie.com/darknet/yolo/, 2016.

[11] Joseph Redmon, Santosh Divvala, Ross Girshick, and Ali Farhadi. You only look once: Unified, real-time object detection. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 779–788, 2016.

[12] Joseph Redmon and Ali Farhadi. YOLO9000: Better, faster, stronger. arXiv preprint arXiv:1612.08242, 2016.

[13] Shaoqing Ren, Kaiming He, Ross Girshick, and Jian Sun. Faster R-CNN: Towards real-time object detection with region proposal networks. In Advances in Neural Information Processing Systems, pages 91–99, 2015.

[14] Mohammad Javad Shafiee, Brendan Chwyl, Francis Li, and Alexander Wong. Fast YOLO: A fast you only look once system for real-time embedded object detection in video. arXiv preprint arXiv:1709.05943, 2017.


[15] Abhinav Shrivastava, Abhinav Gupta, and Ross Girshick. Training region-based object detectors with online hard example mining. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 761–769, 2016.

[16] Bichen Wu, Forrest Iandola, Peter H. Jin, and Kurt Keutzer. SqueezeDet: Unified, small, low power fully convolutional neural networks for real-time object detection for autonomous driving. arXiv preprint arXiv:1612.01051, 2016.
