
Conference Paper · August 2021
DOI: 10.1109/ICCSCE52189.2021.9530877



Underwater Animal Detection Using YOLOV4

Mohamed Syazwan Asyraf Bin Rosli1, Iza Sazanita Isa1, Mohd Ikmal Fitri Maruzuki1, Siti Noraini Sulaiman1, Ibrahim Ahmad2
1School of Electrical Engineering, College of Engineering, Universiti Teknologi MARA,

Cawangan Pulau Pinang, Pulau Pinang, Malaysia


[email protected]
2Arca Biru Sdn Bhd (i-Kerpan),

Kedah Aquaculture Complex, Ayer Hitam, Kedah, Malaysia


[email protected]

Abstract—Underwater computer vision systems have been widely used for many underwater applications such as ocean exploration, biological research and monitoring the sustainability of underwater life. However, the underwater environment poses several challenges, such as water murkiness, dynamic backgrounds, low light and low visibility, which limit the ability to explore this area. To overcome these challenges, it is crucial to improve underwater vision systems so that they can efficiently adapt to varying environments. It is therefore of great significance to propose an efficient and precise underwater detector using YOLOv4, based on a deep learning algorithm. In this research, an open-source underwater dataset was used to investigate YOLOv4 performance based on the evaluation metrics of precision and processing speed (FPS). The results show that YOLOv4 is able to achieve a remarkable 97.96% mean average precision at 46.6 frames per second. This study shows that the YOLOv4 model is highly suitable for implementation in underwater vision systems, as it possesses the ability to accurately detect underwater objects in hazy and low-light environments.

Keywords—Underwater detection, computer vision, YOLOv4, mean average precision, real-time

I. INTRODUCTION

In recent years, various types of underwater vision systems have been developed for practical integration with underwater vehicles such as autonomous underwater vehicles (AUV) and remotely operated vehicles (ROV). The vast demand for such underwater vision systems is driven by the need for huge amounts of data collection. Furthermore, underwater vision systems are applicable to analysing and understanding in-depth criteria that range from the inspection of physical oceanography to the identification and counting of marine animals for biological research. Most modern underwater research is equipped with devices such as cameras that can withstand underwater pressure and corrosion and, most importantly, are waterproof. With this rapid system development, underwater vision systems utilise these devices together with numerous computer vision and artificial intelligence algorithms that help to accelerate practical research.

The integration of object detection with deep learning is one of the applications used in underwater vision systems. Object detection involves training a classifier to understand and learn semantic, high-level features in order to classify different images. Conceptually, object detection precisely estimates the desired object and locates the position of the object in each image [1].

From the literature, numerous developments have been accomplished and have produced excellent results despite underwater challenges such as variation in lighting and water murkiness. Through deep architectures with numerous features, deep learning models have the capability to achieve high performance, primarily in the area of computer vision and especially for underwater object detection.

Technically, modern machine learning utilises the Convolutional Neural Network (CNN) as its base network. Because traditional machine learning requires domain expertise and human intervention, many researchers favour deep learning for its flexibility and superior accuracy in certain applications [2] [3]. In addition, many comparative studies have proven that the performance of CNN-based deep learning surpasses traditional methods [4] [5]. The main reason is that object detection using deep learning with a CNN includes both classification and object localisation. The CNN also benefits the image classification approach, since it has the ability to learn by assigning weights and biases to various objects in the image. Generally, modern object detectors fall into two types: multistage detection and single-stage detection. The Region-based Convolutional Neural Network (R-CNN) [6] is the pioneering multistage detector, whilst Faster R-CNN [7] is the latest improvement of the model. Even though Faster R-CNN is able to achieve good accuracy, it has limited ability to reach sufficient speed for real-time implementation [8].

In recent years, You Only Look Once (YOLO) has risen to become one of the most famous architectures for single-stage detection. The architecture is famous for its efficiency, speed and accuracy [8] [9]. The YOLO architecture has four versions: YOLOv1 [10], YOLOv2 [11], YOLOv3 [12] and, most recently, YOLOv4 [13]. The latest improvements in YOLOv4 optimise both speed and accuracy, with CSPDarknet53 as its backbone. This backbone can enhance the learning capability of the CNN, helping to build a robust object detection model, especially for underwater computer vision. In addition, a block called Spatial Pyramid Pooling (SPP) was added to the backbone in order to increase the receptive field and capture the most significant features, which benefits objects with varying visibility.

In this paper, the single-stage detector YOLOv4 was trained and tested using an underwater dataset to justify the model's robustness in detecting objects under several challenges, including varying visibility. The dataset used was The Brackish Dataset, which is composed of 6 different classes of underwater animals [14]. This dataset is challenging, since it was recorded 9 metres below the surface of a brackish strait in the northern part of Denmark. The YOLOv4 model's performance is tested based on two major evaluations: mean Average Precision (mAP) and Frames Per Second (FPS). This paper is divided into several sections: Section II covers related research and implementation, while Section III elaborates on the method used to train and test the YOLOv4 model. Finally, Section IV describes the results and discussion of underwater detection based on speed and accuracy.

II. RELATED RESEARCH AND IMPLEMENTATION

Considering their high-efficiency performance in object detection, YOLO models have been implemented by many researchers for underwater detection, which involves more challenging environments, especially murky water and low-light surroundings. As an example, research by Xu et al. [15] utilised YOLOv3 for underwater fish detection in waterpower applications. The datasets used to train and test the model were very challenging, with high turbidity, high velocity and murky water, as the three datasets were recorded at marine and hydrokinetic energy projects and river hydropower projects. Training and testing of the model showed adequate results, with a mean average precision (mAP) of 53.92%. Apart from underwater animal detection, underwater computer vision has also been used for other underwater purposes. One of them is the detection of underwater pipeline leakage proposed by X. Zhao et al. [16]. That research used the YOLOv3 algorithm with a total of 900 three-channel images as the dataset to locate the oil spill point of an underwater pipeline. The trained model was able to achieve 77.5% leakage-point detection accuracy at 36 frames per second.

Meanwhile, M. Fulton et al. [17] proposed robotic detection of marine litter for an AUV system using several detection models, namely YOLOv2, Tiny-YOLO, Faster R-CNN and the Single Shot Detector (SSD). The research concluded that YOLOv2 strikes the best balance between detection accuracy and processing speed. Another piece of research adopting YOLO as the architecture for object detection, aimed at underwater sustainability, was proposed by Wu et al. [18]. In order to overcome challenges such as light absorption and low visibility in turbid waters, that research implemented YOLOv4 to detect underwater rubbish using an ROV. The YOLOv4 model was trained using 1120 images from 3 different sources, captured by phone, by the ROV and scraped from the internet. The study claimed that the trained model is "fast and effective", achieving an mAP of 82.7%. Furthermore, the proposed system was successfully implemented in hardware, an ROV, to detect rubbish underwater.

The YOLO algorithm has also been combined with other algorithms that help to enhance its capability in detecting underwater objects. Mohamed et al. [19] utilised YOLOv3 for a fish detection and tracking application in fish farms. In that study, pre-processing of the underwater images was executed using the Multi-Scale Retinex (MSR) algorithm, while an optical flow algorithm was used to track the fish. The results show that the model is able to track fish trajectories with the help of YOLO, compared to without YOLO. Another hybrid algorithm was proposed by A. Jalal et al. [20], combining optical flow and a Gaussian mixture model (GMM) with the YOLOv3 algorithm. The study revealed that the GMM and optical flow alone failed to produce an acceptable score for fish detection compared to YOLO. However, the study further enhanced the YOLO model and the score increased by around 5% of F1-score. All these proposed hybrid systems can achieve better accuracy, but they utilise relatively high computational power due to the complex mixture of algorithms. Therefore, such applications may result in poor real-time performance.

As aforementioned, several advantages have been highlighted proving the effectiveness of YOLO models for underwater implementation, especially in terms of detection precision and processing speed. The CNN-based feature-extraction layers lead to good results in precision, whilst the ability to achieve a high frame rate is due to the single-stage detection scheme. From the literature review of underwater computer vision applications and research, YOLOv3 is the most popular algorithm in the YOLO family. Its breakthrough in deep-learning-based computer vision has yielded many applications, especially underwater detection. The recent development in the YOLO family, YOLOv4, is still new, and its application underwater is still limited, especially for underwater animal detection. The motivation of this work is to study and utilise YOLOv4 in detecting underwater creatures using a challenging underwater dataset that tests YOLOv4's capability in terms of precision and real-time application.

III. METHODOLOGY

This section presents the acquisition and preparation of the underwater dataset for the YOLOv4 architecture, a detection model based on a Deep Convolutional Neural Network (DCNN). In addition, this section includes the evaluation made of YOLOv4 performance on the underwater dataset. The overall proposed work in this study is presented in Fig. 1.

Fig. 1. Overall proposed object detection of YOLOv4

A. Dataset Acquisition

In object detection, the majority of datasets used are in the form of images. A dataset contains a large number of images that are used to train an algorithm, with the goal of learning the detailed features in every image. Hence, the algorithm is able to find the most common or predictable patterns of the dataset as a whole. In this study, an underwater open-source dataset [14] was applied to build the YOLOv4 detection system for underwater application, and subsequently to investigate the model's performance within the scope of this challenging dataset.
The dataset was taken from The Brackish Dataset [14], which contains six underwater categories, namely big fish, jellyfish, crab, shrimp, small fish and starfish. With several environmental effects such as variation in luminosity, water murkiness and low resolution, this challenging dataset was recorded in a brackish strait in Limfjorden, which runs through Aalborg, Denmark. The dataset consists of 10,995 annotated files and 14,518 images extracted from the recorded videos. The dataset was separated into 80% for training, 10% for validation and 10% for testing. The features and description of the dataset are summarised in Table I.
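The 80/10/10 split described above can be sketched as follows. This is an illustrative sketch only (the function name, file-name pattern and shuffling seed are assumptions, not part of the paper):

```python
import random

def split_dataset(image_names, train=0.8, val=0.1, seed=42):
    """Shuffle the image list reproducibly, then split it 80/10/10."""
    names = sorted(image_names)
    random.Random(seed).shuffle(names)
    n = len(names)
    n_train = int(n * train)
    n_val = int(n * val)
    return (names[:n_train],                 # training set
            names[n_train:n_train + n_val],  # validation set
            names[n_train + n_val:])         # testing set

# With the 14,518 images of The Brackish Dataset this yields
# 11,614 / 1,451 / 1,453 images, close to the 11,614 / 1,452 / 1,452
# counts reported in Table I (the paper's exact rounding may differ).
```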
Since YOLO is a supervised learning algorithm, the dataset was annotated using the standardised YOLO bounding-box annotation format. Each image has its own annotation in a .txt file. The YOLO annotation format consists of 5 components: object-id, center x, center y, width and height. The object-id represents the class number, while center x and center y represent the coordinates of the center point of the bounding box. The width and height represent the size of the bounding box.

TABLE I. THE BRACKISH DATASET FEATURES AND DESCRIPTION

Dataset Feature      Description
Annotated Images     25,613 annotations; 10,995 images with annotations, while 3,523 images are without annotations (background only)
Number of Classes    6: Big fish, Jellyfish, Crab, Shrimp, Small fish and Starfish
Training             11,614 images
Validation           1,452 images
Testing              1,452 images

B. Deep Learning Framework and Training Platform

A neural network framework provides flexible APIs and configuration options for performance optimisation, and is designed to facilitate and speed up the training of deep learning models [21]. In this study, the neural network framework used was the open-source framework Darknet, which is written in C and CUDA. Using this framework also allows training and detection to be executed on a Graphics Processing Unit (GPU), which is faster than using a Central Processing Unit (CPU).

YOLOv4 was trained and tested using a Jupyter notebook in Google Colaboratory. "Colab", for short, is an open platform that allows users to write and execute Python, and it is widely used for machine learning since it provides free and powerful computing resources, including GPUs [22]. In this paper, a Tesla T4 was chosen as the GPU to train and test the YOLOv4 model. In addition, Google Drive was connected to Colab to allow the training weights to be saved at particular iterations. The overall process of the deep learning framework and training platform is depicted in Fig. 2.

Fig. 2. Overall process of the deep learning framework and training platform

C. YOLOv4 Model Execution

YOLOv4 is a single-stage detector whose network is separated into 4 sections, namely input, backbone, neck and dense prediction, as shown in Fig. 3. Since it is supervised learning, it requires labelled images with bounding boxes to be fed as input during training. The backbone of YOLOv4 is defined as the essential feature-extraction architecture. The backbone is still integrated with the original YOLOv3 backbone, Darknet53, but with an improvement that utilises Cross-Stage-Partial (CSP) [23] connections, after which the backbone is called CSPDarknet53.

Fig. 3. YOLOv4 network sections: input, backbone, neck and dense prediction

Next is the neck section, which mixes and combines the features from the backbone before they are fed onward for detection. YOLOv4's authors picked a modified version of the Path Aggregation Network (PANet) [24] as the neck of the architecture. Apart from that, YOLOv4 also adopts Spatial Pyramid Pooling (SPP) [25]. Before a feature moves to the fully connected layer for prediction, it needs to be flattened first. The final section, dense prediction, also known as the head, plays an important role in producing the final predictions and locating the bounding boxes. YOLOv4 deploys the same head as YOLOv3, where the network detects the bounding-box coordinates as well as a confidence score for the specific class.

Before training began, the YOLOv4 configuration file was modified in order to define several parameters used during training. The parameters set in the configuration file are listed in Table II. Training was set to run for 12000 iterations. In machine learning, to provide an unbiased evaluation of the final trained model, a test dataset is used to assess model performance. For testing the processing speed, the trained model was tested on a test video on which the FPS was assessed.

TABLE II. YOLOV4 TRAINING PARAMETERS CONFIGURATION

Batch Size       64
Subdivision      16
Width            416
Height           416
Momentum         0.949
Decay            0.0005
Learning Rate    0.001
Activation       Mish

This work was supported by grants from the Ministry of Higher Education (MOHE), Malaysia under the FRGS grant of 600-RMI/FRGS 5/3 (291/2019).
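The parameters in Table II correspond to fields of the Darknet configuration file. A sketch of the relevant `[net]` section, assuming the key names of the standard yolov4.cfg (with max_batches set to the 12000 iterations used in this study), would look like:

```ini
[net]
batch=64
subdivisions=16
width=416
height=416
momentum=0.949
decay=0.0005
learning_rate=0.001
max_batches=12000
```

Note that in Darknet the Mish activation is set per `[convolutional]` layer (`activation=mish`) rather than in `[net]`.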

D. Performance Evaluation

To evaluate the performance of the YOLO model, the evaluation criteria were measured and calculated based on five common evaluation metrics. The first two are Precision and Recall, as shown in Eq. (1) and Eq. (2) respectively. Precision indicates how many of all predicted instances of a particular class actually belong to that class. In addition, precision reflects the robustness of detection, where high precision means more true detections than false detections in the trained model. Meanwhile, recall determines the ability of the model to find all relevant instances in the dataset. High precision indicates a low number of false positives, and is generally correlated with a small number of false negatives for recall. Another evaluation used is the F1-score, as shown in Eq. (3), which represents the harmonic mean of precision and recall.

Precision = TP / (TP + FP)    (1)

Recall = TP / (TP + FN)    (2)

F1-Score = 2 × (Precision × Recall) / (Precision + Recall)    (3)

where TP, FP and FN are the numbers of true positives, false positives and false negatives respectively.

Next, the networks were also evaluated with mean average precision (mAP), as denoted in Eq. (4). The mAP for object detection is defined as the average of the AP calculated over all k classes involved. Finally, Frames Per Second (FPS), as in Eq. (5), expresses how fast the model can process the input in one second.

mAP = (1/k) × Σ_{i=1}^{k} AP_i    (4)

FPS = Number of Frames / Total Detection Time (s)    (5)

IV. RESULTS AND DISCUSSION

Overall, an excellent result was achieved for a single-stage deep-learning-based object detector. The results shown in Table III were obtained on the test dataset, which was considered challenging for the models. This dataset also provided an unbiased representation of how the trained YOLOv4 model reacts to "never seen" images. Fig. 4 shows the training curve of mAP versus iteration. The model started to converge with good performance at the 4000th iteration and reached a performance plateau at the 10000th iteration.

Fig. 4. mAP@0.5 versus iteration

TABLE III. YOLOV4 PERFORMANCE METRICS

Performance Metrics    Result (%)
Precision              94.00
Recall                 97.00
F1-Score               95.48
mAP@0.5                97.96

The results in Table III show the impressiveness of YOLOv4's performance on The Brackish Dataset. With 94% Precision, the network's backbone works efficiently. CSPDarknet53, which is based on DenseNet, helps to connect convolutional layers, with the benefit of enhancing feature propagation, and encourages the network to reuse features from previous layers. In addition, by passing unedited features from previous layers, it significantly improves feature learning and hence improves the classification of the underwater animals. Besides, the high recall value proves YOLOv4's capability in finding and extracting the relevant underwater animals in a brackish environment. Higher recall indicates higher sensitivity in YOLOv4, meaning the algorithm efficiently returns relevant results.

In object detection, mAP is one of the most crucial performance evaluations, as it calculates the average precision for each class at a given intersection-over-union (IoU) threshold. In this research, the threshold was set at 0.5 and produced an impressive result of 97.96%. This performance demonstrates that YOLOv4 performs well in locating the bounding boxes of underwater animals, since mAP compares the ground-truth bounding box to the detected box. For detection in the dense prediction phase, YOLOv4 uses Non-Max Suppression (NMS) with Distance-IoU (DIoU), called DIoU-NMS for short. NMS alone works by eliminating boxes that represent the same object using the IoU metric, but a problem arises with the confidence threshold when two overlapping boxes of different objects are suppressed into one box. With the help of DIoU, this issue can be solved, leading to better conservation of the correct detection boxes [26].
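The metrics in Eqs. (1)-(5) are straightforward to compute; the sketch below (function names are illustrative, not from the paper) reproduces the F1-score in Table III from the reported precision and recall:

```python
def precision(tp, fp):
    # Eq. (1): fraction of predicted boxes that are correct
    return tp / (tp + fp)

def recall(tp, fn):
    # Eq. (2): fraction of ground-truth objects that were found
    return tp / (tp + fn)

def f1_score(p, r):
    # Eq. (3): harmonic mean of precision and recall
    return 2 * p * r / (p + r)

def mean_average_precision(ap_per_class):
    # Eq. (4): average of the per-class average precisions
    return sum(ap_per_class) / len(ap_per_class)

def fps(num_frames, total_detection_time_s):
    # Eq. (5): frames processed per second
    return num_frames / total_detection_time_s

# With the Precision and Recall reported in Table III:
print(round(100 * f1_score(0.94, 0.97), 2))  # 95.48, matching Table III
```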
Several improvements made by YOLOv4's authors, especially in data augmentation, help to enrich YOLOv4's capability for underwater use. Mosaic [13] and CutMix [27] are the two data augmentation processes executed in this research. They help to expand the training set of The Brackish Dataset, which allows the underwater detection model to be exposed to different semantic situations. In addition, The Brackish Dataset contains small and distant underwater animals, such as crabs and starfish, so this challenge needs to be addressed. Since mosaic data augmentation combines four images into one, it helps the model concentrate less on the surrounding scene and improves its ability to focus on smaller objects.

Fig. 5 (a)-(d) shows image results demonstrating good performance in detecting underwater animals in a murky environment with low visibility, as well as in detecting objects across scales. The ability to detect objects at longer distances, which results in small-scale objects, proved to be a benefit of the up-sampling layers that are concatenated with previous layers, which helps preserve the fine-grained features of small instances [12].
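Mosaic augmentation as described above can be sketched with NumPy. This is a deliberately simplified illustration (a fixed 2×2 grid; the full YOLOv4 mosaic additionally rescales the tiles, samples a random centre point and remaps the bounding boxes):

```python
import numpy as np

def mosaic(images):
    """Combine four equally sized HxWxC images into one 2Hx2WxC mosaic.

    Simplified sketch of the mosaic idea; the real YOLOv4 version also
    applies random scaling/cropping and adjusts the box annotations.
    """
    a, b, c, d = images
    top = np.concatenate([a, b], axis=1)     # top-left | top-right
    bottom = np.concatenate([c, d], axis=1)  # bottom-left | bottom-right
    return np.concatenate([top, bottom], axis=0)

# Four dummy 416x416 RGB "images" produce one 832x832 mosaic.
tiles = [np.full((416, 416, 3), i, dtype=np.uint8) for i in range(4)]
print(mosaic(tiles).shape)  # (832, 832, 3)
```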

Fig. 5. Detection output (a)-(d): ability to detect underwater life and differentiate classes

Apart from the performance in classification and detection, the processing speed of underwater detection should also be highlighted. In this study, the trained YOLOv4 achieved a remarkable processing speed of 46.6 FPS. This is the major benefit of single-stage object detection, where only a single pass through the network is needed to predict the bounding boxes of the objects. This proves that it can work well for real-time applications.

V. CONCLUSION

In this research, YOLOv4 was trained and tested using an underwater dataset to investigate the model's robustness in detecting objects under several underwater challenges. With 97.96% mAP and 46.6 FPS, YOLOv4 proved to be excellent at feature extraction and locating bounding boxes, as well as applicable to real-time applications. In addition, YOLOv4 also performed well on the challenging underwater dataset with varying visibility and low-light conditions. A limitation of this research is the need to use multiple challenging underwater datasets to provide a better comparison of YOLOv4 performance. In future, this research can be implemented for autonomous underwater vehicle (AUV) and remotely operated vehicle (ROV) usage, benefiting ocean exploration, biological research and underwater sustainability.
ACKNOWLEDGMENT

The authors wish to thank the members of the Advanced Control System and Computing Research Group (ACSCRG), the Advanced Rehabilitation Engineering in Diagnostic and Monitoring Research Group (AREDiM) and the School of Electrical Engineering, College of Engineering, Universiti Teknologi MARA, Cawangan Pulau Pinang for providing assistance and guidance for the field work. The authors are grateful to the Research Management Institute (RMI) and Universiti Teknologi MARA (UiTM), Cawangan Pulau Pinang for administrative and financial support. This publication was made possible by grants from the Ministry of Higher Education (MOHE), Malaysia under the FRGS grant FRGS/1/2019/TK04/UITM/02/19.

REFERENCES

[1] Z. Q. Zhao, P. Zheng, S. T. Xu, and X. Wu, "Object Detection with Deep Learning: A Review," IEEE Trans. Neural Networks Learn. Syst., vol. 30, no. 11, pp. 3212–3232, 2019.
[2] N. O'Mahony et al., Deep Learning vs. Traditional Computer Vision, vol. 943. Springer International Publishing, 2020.
[3] A. Lee, "Comparing Deep Neural Networks and Traditional Vision Algorithms in Mobile Robotics," Pdfs.Semanticscholar.Org, 2015.
[4] K. Horak and R. Sablatnig, "Deep learning concepts and datasets for image recognition: overview 2019," 2019.
[5] P. Ding, Y. Zhang, P. Jia, and X. L. Chang, "A Comparison: Different DCNN Models for Intelligent Object Detection in Remote Sensing Images," Neural Process. Lett., vol. 49, no. 3, pp. 1369–1379, 2019.
[6] R. Girshick, J. Donahue, T. Darrell, and J. Malik, "Rich feature hierarchies for accurate object detection and semantic segmentation," in 2014 IEEE Conference on Computer Vision and Pattern Recognition, 2014, pp. 580–587.
[7] S. Ren, K. He, R. Girshick, and J. Sun, "Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks," IEEE Trans. Pattern Anal. Mach. Intell., vol. 39, no. 6, pp. 1137–1149, 2017.
[8] A. M. Algorry, A. G. García, and A. G. Wofmann, "Real-Time Object Detection and Classification of Small and Similar Figures in Image Processing," Proc. 2017 Int. Conf. Comput. Sci. Comput. Intell. (CSCI 2017), pp. 516–519, 2018.
[9] B. Benjdira, T. Khursheed, A. Koubaa, A. Ammar, and K. Ouni, "Car Detection using Unmanned Aerial Vehicles: Comparison between Faster R-CNN and YOLOv3," 2019 1st Int. Conf. Unmanned Veh. Syst. (UVS 2019), pp. 1–6, 2019.
[10] J. Redmon, S. Divvala, R. Girshick, and A. Farhadi, "You only look once: Unified, real-time object detection," Proc. IEEE Conf. Comput. Vis. Pattern Recognit., pp. 779–788, 2016.
[11] J. Redmon and A. Farhadi, "YOLO9000: Better, faster, stronger," Proc. 30th IEEE Conf. Comput. Vis. Pattern Recognit. (CVPR 2017), pp. 6517–6525, 2017.
[12] J. Redmon and A. Farhadi, "YOLOv3: An Incremental Improvement," arXiv:1804.02767, 2018.
[13] A. Bochkovskiy, C. Y. Wang, and H. Y. M. Liao, "YOLOv4: Optimal Speed and Accuracy of Object Detection," arXiv, 2020.
[14] M. Pedersen, J. B. Haurum, R. Gade, T. B. Moeslund, and N. Madsen, "Detection of Marine Animals in a New Underwater Dataset with Varying Visibility," in IEEE Conference on Computer Vision and Pattern Recognition Workshops, 2019, pp. 18–26.
[15] W. Xu and S. Matzner, "Underwater fish detection using deep learning for water power applications," Proc. 2018 Int. Conf. Comput. Sci. Comput. Intell. (CSCI 2018), pp. 313–318, 2018.
[16] X. Zhao, X. Wang, and Z. Du, "Research on Detection Method for the Leakage of Underwater Pipeline by YOLOv3," 2020 IEEE Int. Conf. Mechatronics Autom. (ICMA 2020), pp. 637–642, 2020.
[17] M. Fulton, J. Hong, M. J. Islam, and J. Sattar, "Robotic detection of marine litter using deep visual detection models," Proc. IEEE Int. Conf. Robot. Autom., pp. 5752–5758, 2019.
[18] Y. Wu, P. Shih, L. Chen, and H. Samani, "Towards Underwater Sustainability using ROV Equipped with Deep Learning System."
[19] H. E. D. Mohamed et al., "MSR-YOLO: Method to Enhance Fish Detection and Tracking in Fish Farms," Procedia Comput. Sci., vol. 170, pp. 539–546, 2020.
[20] A. Jalal, A. Salman, A. Mian, M. Shortis, and F. Shafait, "Fish detection and species classification in underwater environments using deep learning with temporal information," Ecol. Inform., vol. 57, p. 101088, 2020.
[21] A. Shatnawi, G. Al-Bdour, R. Al-Qurran, and M. Al-Ayyoub, "A comparative study of open source deep learning frameworks," 2018 9th Int. Conf. Inf. Commun. Syst. (ICICS 2018), pp. 72–77, 2018.
[22] E. Bisong, "Google Colaboratory," in Building Machine Learning and Deep Learning Models on Google Cloud Platform, Berkeley, CA: Apress, 2019, pp. 59–64.
[23] C. Y. Wang, H. Y. Mark Liao, Y. H. Wu, P. Y. Chen, J. W. Hsieh, and I. H. Yeh, "CSPNet: A new backbone that can enhance learning capability of CNN," IEEE Conf. Comput. Vis. Pattern Recognit. Workshops, pp. 1571–1580, 2020.
[24] S. Liu, L. Qi, H. Qin, J. Shi, and J. Jia, "Path Aggregation Network for Instance Segmentation," Proc. IEEE Conf. Comput. Vis. Pattern Recognit., pp. 8759–8768, 2018.
[25] K. He, X. Zhang, S. Ren, and J. Sun, "Spatial pyramid pooling in deep convolutional networks for visual recognition," Lect. Notes Comput. Sci., vol. 8691, part 3, pp. 346–361, 2014.
[26] Z. Zheng, P. Wang, W. Liu, J. Li, R. Ye, and D. Ren, "Distance-IoU loss: Faster and better learning for bounding box regression," arXiv, 2019.
[27] S. Yun, D. Han, S. Chun, S. J. Oh, J. Choe, and Y. Yoo, "CutMix: Regularization strategy to train strong classifiers with localizable features," Proc. IEEE Int. Conf. Comput. Vis., pp. 6022–6031, 2019.