Auxiliary Bounding Box Regression For Object Detection
https://doi.org/10.1007/s11220-020-00319-x
ORIGINAL PAPER
Abstract
Object detection in optical remote sensing imagery must deal with arbitrary orientations and complex object appearance, which remains a major challenge. To address this problem, the post-processing of bounding boxes (BBs) is evaluated and discussed for object detection applications. The proposed method is divided into two stages: the first stage thresholds BBs with respect to their confidence values, and the second stage performs area-based BB regression (BBR). In BBR, the area of each BB is estimated, and oversized and undersized BBs are removed with respect to the size of the objects being detected. The widely known region-based approaches RCNN, Fast-RCNN and Faster-RCNN are used for evaluation, and comparative analysis validates the proposed framework. The results show that the proposed post-processing is effective for each kind of region-based detector.
1 Introduction
Advanced developments in computer vision are the reason for the enhanced performance of object detection and localization [1–6]. Object detection plays a vital role in creating comfortable and secure environments; for example, vehicle detection is very useful for traffic control systems, surveillance, monitoring and management [1]. Among recent cutting-edge developments, Convolutional Neural Network (CNN) based techniques are the most promising, yet object detection remains a prominent open problem. Furthermore, bounding boxes (BBs) play a vital role in object detection tasks, and regression of these BBs is necessary to advance detection performance.

* Shahid Karim
[email protected]
1 Department of Computer Science, ILMA University, Karachi, Pakistan
2 School of Electronics and Information Engineering, Harbin Institute of Technology, Harbin 150001, China
3 School of Computer Science and Technology, Xidian University, Xi’an 710071, China
Bounding box regression (BBR) is a core stage in computer vision pipelines. In the literature, numerous BBR approaches have been proposed to improve detection performance. Firstly, it is important to locate the candidate regions inside the image, usually referred to as region proposals. A test image may contain multiple objects, and it is necessary to locate all the prominent objects it contains. Secondly, the resulting proposals must be refined for better detection performance. Among earlier approaches, the sliding window method is well known for generating such candidate regions in an image [7].
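The sliding-window scheme mentioned above can be sketched in a few lines. This is an illustrative example only; the window size, stride and (x, y, w, h) coordinate convention are our own assumptions, not details taken from [7]:

```python
def sliding_window_proposals(img_w, img_h, win_w, win_h, stride):
    """Enumerate candidate regions (x, y, w, h) by sliding a fixed-size
    window over the image with a fixed stride."""
    proposals = []
    for y in range(0, img_h - win_h + 1, stride):
        for x in range(0, img_w - win_w + 1, stride):
            proposals.append((x, y, win_w, win_h))
    return proposals

# e.g., a 256x256 image scanned with a 64x64 window and stride 64
boxes = sliding_window_proposals(256, 256, 64, 64, 64)
print(len(boxes))  # 16 windows (4 positions per axis)
```

In practice the window is slid at several scales and aspect ratios, which is why exhaustive sliding-window search generates so many candidates compared to learned proposal methods.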
Similarly, an optimal BBR approach was proposed for the Region-based CNN (RCNN), which enhanced object detection performance [8]. On the other hand, various causes of decreased detection accuracy have been discussed, and solutions suggested to overcome these problems [9]; the specific problems relate to occlusion, localization errors, object size, viewpoint, visibility and aspect ratio. More importantly, incorrect detections reduce performance, so the leading advantage of a BBR approach is to refine the detections. BBR is low-cost and well suited to localization applications: an optimal BBR is computationally inexpensive and can compensate for imperfect detections and localizations.
Our work mainly addresses the bounding box regression problem. Firstly, we analyze the detection performance of state-of-the-art algorithms such as RCNN, Fast-RCNN and Faster-RCNN. Then, we propose a bounding box regression approach based on thresholding to improve the detection results. Subsequently, we adopt an area-based bounding box regression approach, which substantially improves the detection results. Lastly, the proposed approach is compared with the default performance of state-of-the-art detection methods.
The remainder of the paper is arranged as follows: Sect. 2 elaborates on previous work; Sect. 3 presents the methodology; Sect. 4 presents the results and discussion; and Sect. 5 concludes our work with future directions.
2 Previous Work
Sensing and Imaging (2021) 22:5
[5 × 5] and so on. At the completion of the process, the best match is found among all the predictions [12].
In the last few years, two types of detection approaches have been proposed: region-based approaches and regression-based approaches. In region-based approaches, candidate regions are found at the pre-processing stage [7, 8, 13, 14]: the candidate regions are first classified and then refined for improvement. The main disadvantage of region-based approaches is the computationally expensive regression, which can be tolerated given their optimal detection accuracy. On the other hand, regression-based approaches do not utilize BBR at the pre-processing step, and improve detection performance through their own typical strategies [12, 15, 16]. In the You Only Look Once (YOLO) approach, which is specifically based on spatial constraints, the number of BBs is significantly reduced compared to the selective search (SS) [17] used in R-CNN [15]. The Single Shot MultiBox Detector (SSD) requires the ground truth and an input image at the starting stage; it then estimates a set of BBs (e.g., 4) with distinct aspect ratios at every location in the feature maps and predicts the confidence and offsets for every single BB [12].
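The SSD default-box idea described above can be illustrated as follows. The scale value, the particular aspect ratios and the normalized (cx, cy, w, h) encoding are assumptions made for this sketch, not values from the SSD paper or from this work:

```python
def default_boxes(fmap_size, scale, ratios=(1.0, 2.0, 0.5, 3.0)):
    """Generate SSD-style default boxes (cx, cy, w, h) in normalized
    image coordinates: one box per aspect ratio at every cell of a
    square fmap_size x fmap_size feature map."""
    boxes = []
    for i in range(fmap_size):
        for j in range(fmap_size):
            cx = (j + 0.5) / fmap_size  # cell center
            cy = (i + 0.5) / fmap_size
            for r in ratios:
                # width/height stretched by sqrt(r) so area stays ~scale^2
                boxes.append((cx, cy, scale * r ** 0.5, scale / r ** 0.5))
    return boxes

# an 8x8 feature map with 4 ratios yields 8 * 8 * 4 = 256 default boxes
print(len(default_boxes(8, 0.2)))  # 256
```

The detector then predicts a confidence and four coordinate offsets for each of these default boxes, which is the "confidence and offsets" step mentioned above.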
Furthermore, detection accuracy is measured by the mean average precision (mAP) at multiple thresholds [18, 19]. Optimal localization can be enforced, and unwanted mislocalization limited, by specifying an intersection over union (IoU) threshold: if a predicted box overlaps a ground truth by more than the given IoU, the localization/detection is accepted [19]. However, the IoU is uninformative for non-overlapping BBs, as it cannot measure the distance between boxes that do not overlap. To overcome this major issue, a Generalized Intersection over Union (GIoU) based regression method was introduced by incorporating GIoU as a loss into standard detection approaches [20]. The BBR loss is also a valuable metric for enhancing localization accuracy; localization variance, mostly observed relative to the ground truths, allows neighboring BBs to be merged during non-maximum suppression (NMS) [21]. By incorporating both metrics, IoU and BBR loss, detection performance can be optimized constructively [22]. In addition, the Distance-IoU (DIoU) was proposed to overcome the limitations of IoU and GIoU, and has been utilized in NMS to suppress redundant BBs [23]. The DIoU loss retains the benefit provided by the IoU loss while also reducing the distance between the two boxes, which makes target estimation more accurate [24].
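The IoU and GIoU quantities discussed above can be computed in a short sketch. This is an illustration of the published definitions, not code from this paper; the (x1, y1, x2, y2) corner convention is our assumption:

```python
def iou_and_giou(a, b):
    """IoU and GIoU for two axis-aligned boxes given as (x1, y1, x2, y2)."""
    # intersection rectangle (clamped to zero if the boxes do not overlap)
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    iou = inter / union
    # smallest enclosing box C; GIoU = IoU - |C \ union| / |C|
    cx1, cy1 = min(a[0], b[0]), min(a[1], b[1])
    cx2, cy2 = max(a[2], b[2]), max(a[3], b[3])
    c = (cx2 - cx1) * (cy2 - cy1)
    return iou, iou - (c - union) / c

# non-overlapping boxes: IoU saturates at 0, but GIoU still reflects distance
print(iou_and_giou((0, 0, 1, 1), (2, 0, 3, 1)))
```

For the two unit boxes separated by a gap, IoU is 0.0 while GIoU is negative (about -0.33 here), which is exactly the non-overlap limitation of IoU that GIoU was designed to fix.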
3 Methodology
Several methods for bounding box regression (BBR) have been contemplated in previous work. BBR is a fundamental requirement in object detection, localization, recognition and classification applications. Initially, we trained the detectors RCNN [8], Fast-RCNN [25] and Faster-RCNN [7] in the conventional way and then proceeded to post-processing. Our proposed method reduces the number of BBs in two steps: (1) thresholding and (2) area-based bounding box regression. The parameters of the selected detectors were left at their defaults: the number of epochs was set to 10, the batch size to 32 and the learning rate to 1e−6. These parameters are meaningful for further fine-tuning of the network. The flow diagram of our proposed framework is shown in Fig. 1.
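The two post-processing steps can be summarized as a minimal sketch. The confidence threshold and area range mirror values used later in the paper, while the (x, y, w, h, confidence) detection tuple format is our own assumption for illustration:

```python
def postprocess(detections, conf_thresh=0.9, area_min=100.0, area_max=150.0):
    """Two-stage post-processing on (x, y, w, h, confidence) tuples:
    (1) drop boxes below the confidence threshold,
    (2) drop boxes whose area w*h falls outside [area_min, area_max]."""
    kept = [d for d in detections if d[4] >= conf_thresh]
    return [d for d in kept if area_min <= d[2] * d[3] <= area_max]

dets = [(10, 10, 12, 10, 0.95),   # kept: confidence 0.95, area 120
        (40, 40, 12, 10, 0.60),   # dropped at stage 1 (low confidence)
        (70, 70, 30, 20, 0.99)]   # dropped at stage 2 (area 600, oversized)
print(postprocess(dets))  # [(10, 10, 12, 10, 0.95)]
```

Because both stages are simple filters over the detector's output list, they can be bolted onto any detector without retraining, which is the point of the framework.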
3.1 Dataset
Datasets of optical remote sensing imagery are motivating new developments; a recently developed example is the dataset for object detection in aerial images (DOTA) [26], which is useful for several remote sensing applications. It consists of many images of dissimilar objects such as vehicles, airplanes, oil storage tanks, playgrounds, tennis courts, swimming pools, and so on. Because it offers many images for multiple-object detection, we utilized DOTA for our proposed framework. The training set comprised 446 images with 5271 ROIs for airplane detection, 47 images with 1913 ROIs for oil tank detection, 500 images with 729 ROIs for playground detection, 103 images with 1091 ROIs for swimming pool detection, and 239 images with 2003 ROIs for tennis court detection. All training images were taken from DOTA except the playground images, which were taken from Google Earth.
3.2 Thresholding
There are several ways to optimize an object detection model, among which thresholding is very common. In the given object detection framework, score/confidence-based thresholding is utilized to enhance detection accuracy. The outcome of airplane detection without thresholding can be observed in Fig. 3b: there are several unwanted BBs whose confidence is less than 0.9, and the same happens for other objects in the detection process. In our experiments, the threshold was set to 0.9 to reduce unwanted false alarms. Several limits were analyzed, such as 0.75, 0.8 and 0.85, and the optimal results were found at a threshold of 0.9. Five BBs were eliminated by thresholding, as shown in Fig. 3c.
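The threshold sweep described above can be mimicked with hypothetical confidence scores. The scores below are invented for illustration; only the candidate thresholds (0.75, 0.80, 0.85, 0.90) come from the text:

```python
def count_kept(detections, thresh):
    """Count detections whose confidence meets the threshold."""
    return sum(1 for _, conf in detections if conf >= thresh)

# hypothetical (box_id, confidence) pairs from one test image
scores = [("bb%d" % i, c) for i, c in
          enumerate([0.98, 0.97, 0.95, 0.93, 0.91,
                     0.88, 0.84, 0.79, 0.72, 0.55])]

for t in (0.75, 0.80, 0.85, 0.90):
    print(t, count_kept(scores, t))
# with these invented scores, raising the threshold to 0.90 keeps the
# five strongest boxes and eliminates the five weakest
```

Sweeping the threshold this way, and picking the value that best trades false alarms against missed detections on a validation set, is the procedure the experiments describe.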
Fig. 3 The images are presented to describe the significance of the proposed framework: a original image, b airplane detection without any post-processing, c airplane detection with thresholding and d airplane detection with thresholding and area-based BBR
3.3 Area-Based Bounding Box Regression
Oversized and undersized BBs were removed by area-based BBR. Suppose the optimal results come from BBs with areas in the range of 100–150 square meters; then BBs with an area higher than 150 or lower than 100 square meters will be eliminated by area-based BBR. Lastly, area-based BBR was applied to reduce the BBs, and the outcomes were promising compared to the results obtained without thresholding and area-based BBR. The yellow boxes in Fig. 2 represent the required bounding boxes, while the red and green bounding boxes were eliminated by our proposed framework. The area of a bounding box is denoted by k, as shown in Fig. 2; k must lie in the specified range for the bounding box to be kept, otherwise the box is eliminated. The effect of the proposed method can be visualized from the difference in BBs between Fig. 3b, d: nine BBs were eliminated by area-based BBR, as shown in Fig. 3d.
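The area test on k can be sketched as follows. The 100–150 range comes from the example above, while the (x, y, w, h) box format and the label names are our own illustration:

```python
def classify_by_area(box, k_min=100.0, k_max=150.0):
    """Label a box by its area k = w * h: boxes with k in [k_min, k_max]
    are kept, while undersized and oversized boxes are eliminated."""
    x, y, w, h = box
    k = w * h
    if k < k_min:
        return "undersized"   # eliminated
    if k > k_max:
        return "oversized"    # eliminated
    return "kept"

print(classify_by_area((0, 0, 11, 11)))  # kept (k = 121)
print(classify_by_area((0, 0, 20, 20)))  # oversized (k = 400)
print(classify_by_area((0, 0, 5, 5)))    # undersized (k = 25)
```

The filter assumes all target objects in a scene have similar physical size, which is exactly the constraint noted in the conclusions.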
4 Results and Discussion
In this section, we evaluate and discuss the effects of the proposed framework on object detection. The large DOTA dataset was utilized to train the well-known region-based CNN models, and the performance of the proposed framework was analyzed. Firstly, we trained the object detectors RCNN, Fast-RCNN and Faster-RCNN on DOTA images; multiple detectors were selected to evaluate the effectiveness of the proposed method, and the comparative study leads towards better recommendations. Secondly, the proposed framework was implemented to enhance detection performance. The results associated with BBR are shown in Fig. 3, in which (a) is the original test image, (b) the detection result without post-processing, (c) the result after thresholding and (d) the result after area-based BBR; these results were obtained with Faster-RCNN. The analysis for airplane detection was conducted on a small test set of 35 test images containing 218 airplanes. The accuracy assessment was conducted on the test images, and the metrics before and after area-based BBR are given in Tables 1 and 2, respectively, where TP, FP and FN denote true positives, false positives and false negatives.
Similarly, optimal results were obtained for other objects after applying thresholding and area-based BBR, as shown in Fig. 4a–d. The accuracy assessment of the proposed framework is given in Table 1, which clearly shows that the proposed method enhances the recall and precision rates for object detection. To verify the improvement, we investigated the precision and recall rates for airplanes. The optimal detection results for oil tanks, playgrounds, tennis courts and swimming pools are shown in Fig. 4. These results confirm that the proposed approach reduces false alarms.
Fig. 4 The images are presented to describe the significance of the proposed framework after implementation of thresholding and area-based BBR for the detection of similar-sized objects individually, such as a oil tank, b playground, c swimming pool, and d tennis court
In total, fourteen unwanted BBs were cleared away, as shown in Fig. 3.
According to previous work, the performance of RCNN, Fast-RCNN and Faster-RCNN is promising, and each method is worthwhile for specific applications. The proposed methodology improves the performance of each detector individually: because it is a post-processing step, it is easy to adapt and can be applied to any kind of detection method at the post-processing stage, i.e., after the training stage of the detectors or after the detection process itself. Conclusively, it enhances detector accuracy after each single step, namely thresholding and area-based BBR.
Several methods for object detection are being analyzed to make improvements, and the core differences lie in the working strategies of the distinct methods. For instance, the key characteristic of SSD is that it adds feature maps, taken from distinct layers, on top of a YOLO-like architecture, which affects the speed and accuracy of SSD. Consequently, SSD takes more memory than YOLO due to the added feature maps, and the computational speed of YOLO is faster than that of Fast and Faster RCNN. In contrast, the accuracy and mAP of YOLO are lower than those of Fast and Faster RCNN [15].
Precision = TP / (TP + FP)    (1)

Recall = TP / (TP + FN)    (2)
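Eqs. (1) and (2) can be verified with a tiny sketch. The TP/FP/FN counts below are hypothetical illustrations, not the values reported in Tables 1 and 2:

```python
def precision_recall(tp, fp, fn):
    """Eqs. (1) and (2): precision = TP/(TP+FP), recall = TP/(TP+FN)."""
    return tp / (tp + fp), tp / (tp + fn)

# illustrative counts only: suppose 200 airplanes detected correctly,
# 10 false alarms, 18 missed (the test set contains 218 airplanes)
p, r = precision_recall(200, 10, 18)
print(round(p, 3), round(r, 3))  # 0.952 0.917
```

Removing false-alarm boxes via thresholding and area-based BBR lowers FP while (ideally) leaving TP untouched, which is why the post-processing raises precision without sacrificing recall.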
5 Conclusions
In this paper, we have presented a novel approach to reduce false positive alarms in the object detection and localization process. The proposed approach is based on thresholding and area-based BBR. According to our experiments, thresholding effectively reduces the unwanted boxes, and area-based BBR is even more effective than thresholding. The results show that the proposed method removes a substantial proportion of unwanted BBs. The widely known object detectors RCNN, Fast-RCNN and Faster-RCNN were employed for a comparative study of object detection. The only constraint of area-based BBR is that, to obtain optimal detection results, the objects being detected should be of similar size in all test images; if the object size changes significantly, it becomes difficult to reduce the BBs reliably.
Optimal solutions for further accuracy enhancement of object detection remain to be found and can be considered in future work. Furthermore, the proposed approach could be extended to the detection of objects of multiple sizes.
Acknowledgments This work was supported by the National Natural Science Foundation of China under Grant 61471148.
References
1. Tayara, H., Soo, K. G., & Chong, K. T. (2018). Vehicle detection and counting in high-resolution aerial images using convolutional regression neural network. IEEE Access, 6, 2220–2230.
2. Li, K., Cheng, G., Bu, S., & You, X. (2018). Rotation-insensitive and context-augmented object detection in remote sensing images. IEEE Transactions on Geoscience and Remote Sensing, 56(4), 2337–2348.
3. Koga, Y., Miyazaki, H., & Shibasaki, R. (2018). A CNN-based method of vehicle detection from aerial images using hard example mining. Remote Sensing, 10(1), 124.
4. Bazi, Y., & Melgani, F. (2018). Convolutional SVM networks for object detection in UAV imagery. IEEE Transactions on Geoscience and Remote Sensing, 56(6), 3107–3118.
5. ElMikaty, M., & Stathaki, T. (2018). Car detection in aerial images of dense urban areas. IEEE Transactions on Aerospace and Electronic Systems, 54(1), 51–63.
6. Qiu, S., Wen, G., Deng, Z., Liu, J., & Fan, Y. (2018). Accurate non-maximum suppression for object detection in high-resolution remote sensing images. Remote Sensing Letters, 9(3), 237–246.
7. Ren, S., He, K., Girshick, R., & Sun, J. (2015). Faster R-CNN: Towards real-time object detection with region proposal networks. In Advances in neural information processing systems (pp. 91–99).
8. Girshick, R., Donahue, J., Darrell, T., & Malik, J. (2014). Rich feature hierarchies for accurate object detection and semantic segmentation. In Proceedings of the IEEE conference on computer vision and pattern recognition (pp. 2–9). https://doi.org/10.1109/CVPR.2014.81.
9. Hoiem, D., Chodpathumwan, Y., & Dai, Q. (2012). Diagnosing error in object detectors. In European conference on computer vision (pp. 340–353).
10. Karim, S., Zhang, Y., Asif, M. R., & Ali, S. (2017). Comparative analysis of feature extraction methods in satellite imagery. Journal of Applied Remote Sensing, 11(4), 42618.
11. Karim, S., Zhang, Y., Ali, S., & Asif, M. R. (2018). An improvement of vehicle detection under shadow regions in satellite imagery. In Proceedings of SPIE, the International Society for Optical Engineering (Vol. 10615). https://doi.org/10.1117/12.2303518.
12. Liu, W., et al. (2016). SSD: Single shot multibox detector. In European conference on computer vision (pp. 21–37).
13. He, K., Zhang, X., Ren, S., & Sun, J. (2014). Spatial pyramid pooling in deep convolutional networks for visual recognition. In European conference on computer vision (pp. 346–361).
14. Dai, J., Li, Y., He, K., & Sun, J. (2016). R-FCN: Object detection via region-based fully convolutional networks. In Advances in neural information processing systems (pp. 379–387).
15. Redmon, J., Divvala, S., Girshick, R., & Farhadi, A. (2016). You only look once: Unified, real-time object detection. In Proceedings of the IEEE conference on computer vision and pattern recognition (pp. 779–788).
16. Najibi, M., Rastegari, M., & Davis, L. S. (2016). G-CNN: An iterative grid based object detector. In Proceedings of the IEEE conference on computer vision and pattern recognition (pp. 2369–2377).
17. Uijlings, J. R. R., Van De Sande, K. E. A., Gevers, T., & Smeulders, A. W. M. (2013). Selective search for object recognition. International Journal of Computer Vision, 104(2), 154–171.
18. Gidaris, S., & Komodakis, N. (2016). LocNet: Improving localization accuracy for object detection. In Proceedings of the IEEE conference on computer vision and pattern recognition (pp. 789–798).
19. Dickerson, N. L. (2017). Refining bounding-box regression for object localization. Dissertations and Theses, Paper 3940. https://doi.org/10.15760/etd.5824.
20. Rezatofighi, H., Tsoi, N., Gwak, J., Sadeghian, A., Reid, I., & Savarese, S. (2019). Generalized intersection over union: A metric and a loss for bounding box regression. In Proceedings of the IEEE conference on computer vision and pattern recognition (pp. 658–666).
21. He, Y., Zhu, C., Wang, J., Savvides, M., & Zhang, X. (2019). Bounding box regression with uncertainty for accurate object detection. In Proceedings of the IEEE conference on computer vision and pattern recognition (pp. 2888–2897).
22. Qian, X., Lin, S., Cheng, G., Yao, X., Ren, H., & Wang, W. (2020). Object detection in remote sensing images based on improved bounding box regression and multi-level features fusion. Remote Sensing, 12(1), 143.
23. Zheng, Z., Wang, P., Liu, W., Li, J., Ye, R., & Ren, D. (2020). Distance-IoU loss: Faster and better learning for bounding box regression. In AAAI (pp. 12993–13000).
24. Yuan, D., Chang, X., & He, Z. (2020). Accurate bounding-box regression with distance-IoU loss for visual tracking. arXiv:2007.01864.
25. Girshick, R. (2015). Fast R-CNN. In Proceedings of the IEEE international conference on computer vision (pp. 1440–1448).
26. Xia, G.-S., et al. (2018). DOTA: A large-scale dataset for object detection in aerial images. In Proceedings of CVPR.
27. Hariharan, B., Arbeláez, P., Girshick, R., & Malik, J. (2015). Hypercolumns for object segmentation and fine-grained localization. In Proceedings of the IEEE conference on computer vision and pattern recognition (pp. 447–456).