
Paper - Review - 2 - EfficientDet - Scalable and Efficient Object Detection

The document introduces EfficientDet, a scalable object detection model that achieves state-of-the-art accuracy with fewer parameters and computations than previous models. It builds upon EfficientNet and incorporates a weighted bi-directional feature pyramid network and compound scaling to scale the resolution, depth and width of the model components. Experiments show EfficientDet outperforms prior models on COCO and Pascal VOC datasets while being faster on GPUs and CPUs.

Uploaded by

Atul Verma

EfficientDet: Scalable and Efficient Object Detection

Problem Statement:
As one of the core applications in computer vision, object detection has become increasingly important in
scenarios that demand high accuracy but place tight limits on model size and latency, such as robotics and
driverless cars. Unfortunately, many current high-accuracy detectors are large and computationally
expensive.
Previous work aimed at better efficiency usually achieves it by sacrificing accuracy. Moreover, it tends to
target only a small range of resource requirements, whereas real-world applications span a wide range of
resource constraints.
Question: Is it possible to build a scalable detection architecture with both higher accuracy and better
efficiency across a wide spectrum of resource constraints?

Summary
EfficientDet was introduced to provide accurate and efficient object detectors that can also adapt to a wide
range of resource constraints. It builds upon previous work on scaling neural networks (EfficientNet) and
adds two major features:
❖ A weighted bi-directional feature pyramid network (BiFPN) for easy and fast multi-scale
feature fusion. It learns the importance of different input features and repeatedly applies
top-down and bottom-up multi-scale feature fusion.
❖ A new compound scaling method for simultaneous scaling of the resolution, depth, and width
for all backbone, feature network, and box/class prediction networks.
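The compound scaling idea can be illustrated with a short Python sketch. The formulas below are the scaling heuristics as we recall them from the EfficientDet paper; they are not stated in this review, so treat the constants (64, 1.35, 3, 512, 128) and the function name `efficientdet_scaling` as assumptions for illustration. The published configurations also round some of these values, so the output will not match the paper's table exactly.

```python
def efficientdet_scaling(phi: int) -> dict:
    """Scale feature network, heads, and input size from one coefficient phi.

    Assumed heuristics (recalled from the paper, unrounded):
      BiFPN width      = 64 * 1.35^phi
      BiFPN depth      = 3 + phi
      box/class depth  = 3 + floor(phi / 3)
      input resolution = 512 + 128 * phi
    """
    return {
        "bifpn_width": 64 * (1.35 ** phi),
        "bifpn_depth": 3 + phi,
        "head_depth": 3 + phi // 3,
        "input_resolution": 512 + 128 * phi,
    }


# Print the scaled configuration for the first few coefficients.
for phi in range(4):
    print(phi, efficientdet_scaling(phi))
```

A single coefficient drives every component at once, which is what makes the family of models (D0, D1, ...) easy to generate for different resource budgets.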
EfficientDet achieves state-of-the-art accuracy with far fewer parameters and FLOPs than previous object
detectors. It is also 3x to 8x faster on GPU/CPU than previous detectors, and it performs better on
semantic segmentation.

Model Architecture
The backbone networks used are ImageNet-pretrained EfficientNets. The proposed BiFPN serves as the feature
network, which takes level 3–7 features {P3, P4, P5, P6, P7} from the backbone network and repeatedly
applies top-down and bottom-up bidirectional feature fusion. These fused features are fed to a class and box
network to produce object class and bounding box predictions respectively. The class and box network
weights are shared across all levels of features.
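A minimal sketch of how a weighted fusion node might combine its inputs is given below. It assumes the "fast normalized fusion" rule as we recall it from the paper: each input gets a learnable scalar weight, the weights are clipped to be non-negative, and then divided by their sum plus a small epsilon. The function name, the plain-list representation of feature maps, and the epsilon value are hypothetical choices for illustration, and the inputs are assumed to be already resized to the same shape.

```python
def fast_normalized_fusion(features, weights, eps=1e-4):
    """Fuse equally sized feature vectors with non-negative normalized weights.

    features: list of feature maps, each flattened to a list of floats
    weights:  one (learnable) scalar per feature map
    """
    # Clip at zero (ReLU) so every input contributes non-negatively.
    clipped = [max(w, 0.0) for w in weights]
    # Normalize so the weights sum to roughly 1; eps avoids division by zero.
    total = sum(clipped) + eps
    normalized = [w / total for w in clipped]
    length = len(features[0])
    return [
        sum(n * f[i] for n, f in zip(normalized, features))
        for i in range(length)
    ]


# Two same-sized inputs, e.g. a resized top-down feature and a lateral feature.
p_top_down = [1.0, 1.0, 1.0]
p_lateral = [3.0, 3.0, 3.0]
print(fast_normalized_fusion([p_top_down, p_lateral], [1.0, 1.0]))
```

With equal weights the output is close to the plain average of the inputs; during training the weights would shift toward whichever input is more useful at that node.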

Loss Function & Training Hyper-Parameters


Each model is trained using the SGD optimizer with momentum 0.9 and weight decay 4e-5. The commonly used
focal loss is employed with α = 0.25 and γ = 1.5, and anchors use aspect ratios {1/2, 1, 2}.
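For reference, the focal loss with these settings can be sketched for a single binary prediction as follows. This is the standard formulation, FL(p_t) = -α_t (1 - p_t)^γ log(p_t), written in plain Python; the function name and the clamping constant are illustrative assumptions, and a real detector would apply the loss over all anchors and classes in a tensor library.

```python
import math


def focal_loss(p, y, alpha=0.25, gamma=1.5):
    """Focal loss for one binary prediction.

    p: predicted probability of the positive class, in [0, 1]
    y: ground-truth label, 1 (object) or 0 (background)
    """
    # p_t is the probability assigned to the true class.
    p_t = p if y == 1 else 1.0 - p
    # alpha_t balances positives against the far more numerous negatives.
    alpha_t = alpha if y == 1 else 1.0 - alpha
    # Clamp to avoid log(0).
    p_t = min(max(p_t, 1e-7), 1.0)
    # (1 - p_t)^gamma down-weights examples that are already well classified.
    return -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)


# A confident correct prediction contributes much less than a poor one.
print(focal_loss(0.9, 1), focal_loss(0.1, 1))
```

Setting γ = 0 and α = 0.5 reduces the expression to half of the ordinary cross-entropy, which is a quick way to sanity-check an implementation.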

Experiments and Performance


➢ EfficientDet is evaluated on the COCO 2017 detection dataset with 118K training images.
➢ EfficientDet-D7 achieves a mean average precision (mAP) of 52.2, exceeding the prior state-of-the-art
model by 1.5 points, while using 4x fewer parameters and 9.4x less computation.
➢ Under similar accuracy constraints, EfficientDet models are 2x-4x faster on GPU, and 5x-11x faster on CPU.
➢ On Pascal VOC 2012 semantic segmentation, EfficientDet outperforms DeepLabV3+, achieving 1.7% better
accuracy with 9.8x fewer FLOPs.

Scope for Improvement


➢ The scaling is heuristic-based and might not be optimal. A more systematic search over the scaling
parameters could yield better configurations.
➢ The optimization methods are applied exclusively to one-stage detectors; they could be extended to
two-stage detectors, which tend to be more flexible and accurate.
REVIEWED BY:

ATUL VERMA (19B090004)


ANIRUDHA SINGH MERTIA (190100016)
