
Underwater Object Detection using Image Enhancement and Deep Learning Models

Adane Nega Tarekegn, Faouzi Alaya Cheikh, and Mohib Ullah
Department of Computer Science, Norwegian University of Science and Technology, Gjøvik, Norway

Erik Tobias Sollesnes
USEA Ocean Data, Oslo, Norway ([email protected])

Cornelia Alexandru and George Suciu
Research & Development Department, BEIA Consult International, Bucharest, Romania

Saeed Nourizadeh Azar
R&D and AI Department, OBSS Technology, Istanbul, Turkey

Erdeniz Erol
Elkon Elektrik San. Tic. A.S., Istanbul, Turkey

2023 11th European Workshop on Visual Information Processing (EUVIP), ©2023 IEEE, DOI: 10.1109/EUVIP58404.2023.10323047

Abstract—Autonomous underwater vehicles (AUVs) are efficient robotic tools, offering a wide range of applications in ocean exploration and research, such as oceanographic mapping, environmental monitoring, and archaeology. Incorporating an automatic object detection system into AUVs can substantially improve their ability to perceive and recognize objects in complicated and often hazardous environments. Currently, underwater object detection relies on a man-in-the-loop approach, where AUVs capture vast amounts of data and save them in memory for offline processing. This study investigates the use of deep learning for automatic image preprocessing and object detection, evaluating and comparing three state-of-the-art YOLO (You Only Look Once) models: YOLOv8, YOLOv7, and YOLOv5. Extensive experiments were conducted using publicly available underwater image datasets, revealing that the pre-trained models attain superior performance on the Brackish dataset. YOLOv5 and YOLOv8 achieved the highest mean average precision (mAP) with a score of 99%, while YOLOv7 scored 89%. Furthermore, an underwater image enhancement algorithm is employed on the URPC2021 dataset, significantly improving detection accuracy with at least a 3% increase in mAP across all three models. In terms of inference speed, YOLOv5 demonstrated the highest frames per second (FPS) while maintaining comparable mAP and recall.

Keywords—Underwater robotics, AUV, underwater object detection, image enhancement, YOLOv8, YOLOv7, YOLOv5.

I. INTRODUCTION

The marine environment is a diverse and intricate part of the Earth's surface, serving a significant role in sustaining both the environment and human populations. It provides valuable minerals, oil, gas, and other aquatic resources, making it a target for marine exploration endeavours [1]. However, its harsh conditions hinder exploration through traditional means, rendering it the least explored environment. In recent years, the development of underwater robots, such as autonomous underwater vehicles (AUVs), has provided a great opportunity to explore and protect the resources beneath the water. AUVs come with various sensing devices, including underwater cameras, sonars, depth sensors, and lighting. They are also equipped with other payload devices that enable them to monitor underwater environments and carry out intricate underwater operations. These operations include capturing marine organisms, creating oceanographic maps, inspecting pipes and cables, conducting environmental surveillance, and exploring wrecks and archaeological sites. The flourishing growth of artificial intelligence (AI) has made intelligent systems indispensable for accomplishing these tasks, and they play an important role in the development of AUVs. Integrating an intelligent object detection system on board can significantly enhance the perception and recognition capability of AUVs.

Underwater object detection methods rely on either acoustic images or optical images [2]. Sonars and vision cameras are the key perception equipment used to identify and detect objects in underwater environments. In contrast to sonars, optical images captured by vision cameras offer higher resolution and a greater amount of detailed information [3]. Moreover, optical systems are more cost-effective to acquire. As a result, there is increasing interest in using optical systems for underwater target detection.

Traditionally, the task of detecting underwater objects was performed by a man-in-the-loop approach, where AUVs capture imaging data and store them in memory for offline processing by expert analysts [4]. However, there is an increasing demand for automatic underwater processing to enable on-the-fly decision-making and to extend mission times. Specifically, undersea exploration using automatic object detection has two advantages. Firstly, it allows AUVs to make real-time decisions based on the data they collect, where accurate detection and recognition of objects undersea is imperative, thereby saving considerable time and allowing longer surveys. Secondly, real-time underwater object detection can enable greater autonomy for AUVs, which otherwise perform preprogrammed missions. AI-powered AUVs are expected not only to collect data but also to perceive and react to the data they collect immediately (e.g., reinspection of interesting objects).

In the last few years, deep learning (DL) techniques have revolutionized the field of computer vision and have fuelled the practical application of underwater object detection. Villon et al. [5] compared a traditional approach (histogram of oriented gradients + support vector machine) with a deep learning method for coral reef fish detection, and their experimental analysis showed the superiority of the deep learning method for object detection underwater. In their
work, Wang et al. [6] introduced a deep learning architecture that incorporates convolutional encoding and decoding features to recognize objects underwater. Their framework utilizes a pre-trained convolutional model, AlexNet, initially trained on ImageNet, and transfers the knowledge of its first two layers to facilitate the underwater detection task. Similarly, another study [7] utilized deep convolutional networks, transfer learning, and data augmentation to develop a real-time fish detection and tracking framework for the video monitoring systems of AUVs. Recently, Fulton et al. [8] created a method for detecting litter in underwater environments using visual deep learning to tackle the issue of plastic debris pollution. The researchers assessed the effectiveness and precision of different deep learning models, such as Faster RCNN, SSD, YOLOv2, and Tiny-YOLO. Faster RCNN was found to have the best performance, although with a weaker inference time, while YOLOv2 achieved a good trade-off between speed and accuracy.

This paper aims to explore the use of the latest deep-learning techniques for automatic object detection in AUVs and to determine the most suitable algorithm for deployment on underwater images and videos. AUVs require real-time decision-making, where correct detection and classification of underwater objects is imperative, and detection based on classical computer vision is difficult and error-prone due to manually crafted feature extraction. The key highlights of our study are outlined in the following.

• An underwater image enhancement pipeline that incorporates colour correction, dehazing, and contrast enhancement is developed to enhance the quality of underwater images and improve detection accuracy.
• The performance of three state-of-the-art YOLO models (YOLOv5, YOLOv7, and YOLOv8) is evaluated for detecting marine objects in challenging underwater conditions.
• A comprehensive experimental study is conducted on three different underwater benchmark datasets to determine the most effective object detection approach for AUVs.

II. METHODS AND MATERIALS

A. Overall Framework

Fig.1 presents the overall framework of the underwater object detection network proposed in this study. Three publicly available underwater image datasets were employed to train the YOLO (You Only Look Once) models [9], providing a diverse range of images with varying lighting conditions, water depths, and underwater scenes. Initially, images undergo pre-processing via augmentation and an underwater image-enhancement pipeline. The resulting image, along with the original image, is then utilized as input data for the object-detection network. Three YOLO frameworks (namely, YOLOv5, YOLOv7, and YOLOv8) are used to detect and recognize objects. The performance of the trained models was evaluated on the test sets of the three publicly available underwater image datasets.

Fig.1. General framework of the object detection architecture.
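To make the data flow of Fig.1 concrete, the sketch below assembles the combined training pool of original and enhanced images described above. It is a minimal illustration, not the authors' code: the directory layout, the .jpg extension, and the assumption that each image has a YOLO-format .txt label file beside it are all hypothetical.

```python
from pathlib import Path
import shutil

def build_training_pool(raw_dir: str, enhanced_dir: str, out_dir: str) -> None:
    """Merge original and enhanced images (plus their YOLO-format .txt labels)
    into a single training pool, mirroring the data flow of Fig.1."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    for src_dir, tag in [(raw_dir, "raw"), (enhanced_dir, "enh")]:
        for img in sorted(Path(src_dir).glob("*.jpg")):
            shutil.copy(img, out / f"{tag}_{img.name}")
            label = img.with_suffix(".txt")  # assumed label location
            if label.exists():
                shutil.copy(label, out / f"{tag}_{label.name}")

build_training_pool("urpc2021/raw", "urpc2021/enhanced", "urpc2021/train_pool")
```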

B. Underwater Datasets

This paper employs three publicly available datasets for underwater object detection: the Underwater Robot Professional Contest 2021 (URPC2021) dataset [10], the Brackish dataset [11], and the Aquarium dataset [12]. Fig.2 shows the number of targets of each class in each dataset.

The URPC2021 dataset was created to evaluate the performance of underwater object detection algorithms. It consists of 8200 underwater images extracted from videos captured by an underwater ROV in natural environments. The dataset includes box-level annotations for four categories of objects: holothurian, echinus, starfish, and scallop. The echinus category is the most prevalent class, followed by starfish, holothurian, and scallop, as shown in Fig.2(a).

Fig.2. Statistical distribution of targets in each dataset: (a) URPC2021 dataset, and (b) Brackish dataset.

The Brackish dataset, created in 2019, is an underwater dataset comprising more than 14,000 frames. It was created by annotating real filmed underwater videos and encompasses six distinct classes of underwater objects: big fish, crab, jellyfish, shrimp, small fish, and starfish. The dataset was collected using three cameras mounted on the seabed, resulting in a diverse collection of images and viewpoints.

The Aquarium dataset is relatively smaller, consisting of only 638 images collected from two aquariums. However, it still contains multiple bounding boxes covering seven different classes of underwater objects: fish, jellyfish, penguin, puffin, shark, starfish, and stingray. The dataset was labelled for object detection.
C. Underwater Image Preprocessing

Unlike images taken at the surface, images captured underwater suffer from low visibility and colour distortion, caused by light scattering from particles in the water and by wavelength-dependent light absorption. Light absorption results in significant colour distortion and loss of image information, while light scattering produces haze effects, suppresses image details, and reduces image contrast [13]. Detecting underwater objects using cameras is challenging due to these negative effects, as well as other complex background interferences such as camera shake and non-uniform illumination, all of which affect real-time detection performance underwater. Fig.3 shows some low-quality underwater images taken from the URPC2021 dataset. Image (a) is a low-resolution underwater image. In image (b), a noticeable colour bias is present, and the overall style is dominated by green tones. Image (c) exhibits a haze effect caused by light scattering in underwater environments. The issue with image (d) lies in its low contrast and the presence of a colour cast.

Fig.3. Examples of low-quality underwater images from the URPC2021 dataset.

To improve the visual quality of such underwater images and to enhance detection accuracy, underwater image pre-processing, such as image enhancement or restoration, is an essential step [14]. In this study, underwater image enhancement techniques, namely dehazing, colour correction, and contrast enhancement, have been applied to remove the haze and colour cast from images [15][16]. The enhancement technique used in this paper is based on a single-image approach that enhances underwater images without requiring prior knowledge of light properties or imaging models [17][18]. The image enhancement module consists of a series of independent processing steps, designed to effectively correct the degraded images and enhance their quality for improved object recognition.
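As an illustration of such a pipeline, the sketch below chains gray-world white balancing (colour correction) and CLAHE (contrast enhancement) with OpenCV. This is a minimal sketch under stated assumptions rather than the exact method of [15]-[18]: the dehazing stage (e.g., a dark-channel-prior approach [17][18]) is only marked by a comment, and the input filename is hypothetical.

```python
import cv2
import numpy as np

def gray_world_white_balance(img):
    """Colour correction: scale each channel so its mean matches the global mean."""
    b, g, r = cv2.split(img.astype(np.float32))
    gray = (b.mean() + g.mean() + r.mean()) / 3.0
    b *= gray / (b.mean() + 1e-6)
    g *= gray / (g.mean() + 1e-6)
    r *= gray / (r.mean() + 1e-6)
    return np.clip(cv2.merge([b, g, r]), 0, 255).astype(np.uint8)

def enhance_contrast(img, clip=2.0, tiles=8):
    """Contrast enhancement: CLAHE applied to the luminance channel only."""
    lab = cv2.cvtColor(img, cv2.COLOR_BGR2LAB)
    l, a, b = cv2.split(lab)
    clahe = cv2.createCLAHE(clipLimit=clip, tileGridSize=(tiles, tiles))
    return cv2.cvtColor(cv2.merge([clahe.apply(l), a, b]), cv2.COLOR_LAB2BGR)

def enhance_underwater(img):
    img = gray_world_white_balance(img)
    # A dehazing step (e.g., dark-channel-prior transmission estimation) would go here.
    return enhance_contrast(img)

enhanced = enhance_underwater(cv2.imread("urpc2021_sample.jpg"))  # hypothetical file
```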
D. Underwater Object Detectors

The objective of a contemporary object detector is to identify both the location and the type of every object in an input image. State-of-the-art detectors consist of three primary components: a backbone that extracts features and generates a feature-map representation of the input image through a reliable image classifier; a neck that is connected to the backbone and functions as a feature aggregator, assembling feature maps from various stages of the backbone and integrating these multi-level features; and a head that predicts bounding boxes and performs classification [19], as shown in Fig.1.

The present study utilized YOLO architectures for underwater object detection. YOLO has gained widespread use as a real-time object detection system due to its exceptional speed and accuracy, resulting in its popularity in fields such as robotics, autonomous vehicles, and video surveillance. Specifically, three advanced YOLO detectors, YOLOv5, YOLOv7, and YOLOv8, were compared in this study. YOLOv5 is a single-stage object detection algorithm that is more efficient and versatile than its earlier iterations [20]; in evaluation on the MS COCO test-dev 2017 set, YOLOv5 achieved an AP of 50.7% at an image size of 640 pixels. Additionally, YOLOv5 is known for its ease of use, training, and deployment. YOLOv7 [21] is a more recent YOLO detector that incorporates several enhancements, including residual blocks, skip connections, and anchor boxes, to improve both accuracy and speed while reducing false positives. YOLOv8 [22], which was recently introduced by Ultralytics, claims to be the current leader in real-time object detection. It offers faster processing than previous versions of YOLO and supports further computer-vision tasks, such as instance segmentation and image classification.

E. Model Evaluation Measures

The standard metrics in object detection were used for evaluation and comparison of the models: precision, recall, the precision-recall curve, average precision (AP), and mean average precision (mAP) with intersection over union (IoU).

Precision is the fraction of correct detections among all the detections made by the model, while recall is the fraction of correct detections among all the true objects in the scene. Higher values of both metrics indicate better performance.

Intersection over Union (IoU) measures the overlap between the predicted bounding box and the ground-truth bounding box. An IoU value of 1.0 indicates a perfect overlap, while values closer to 0 indicate little to no overlap.

Average Precision (AP) measures the average precision across all recall values, with higher values indicating better performance. Mean Average Precision (mAP) is the average AP value across all object classes and is commonly used in object detection competitions to evaluate the overall performance of a model. Two variants are reported in this paper (a small sketch of the underlying computation follows this list):

• mAP@0.5: the average AP over all classes when the IoU threshold is set to 0.5.
• mAP@0.5:0.95: the average of mAP over different IoU thresholds, from 0.5 to 0.95 in steps of 0.05.
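The following minimal Python sketch, an illustration rather than the evaluation code used in the paper, shows how IoU is computed for axis-aligned boxes and how the two mAP variants relate to the IoU threshold:

```python
def iou(box_a, box_b):
    """Intersection over Union of two boxes given as (x1, y1, x2, y2)."""
    x1, y1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    x2, y2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0.0, x2 - x1) * max(0.0, y2 - y1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0

# A detection counts as a true positive at threshold t when iou(pred, gt) >= t.
# mAP@0.5 uses t = 0.5; mAP@0.5:0.95 averages over t = 0.5, 0.55, ..., 0.95.
print(iou((0, 0, 10, 10), (5, 5, 15, 15)))  # 25 / 175 ≈ 0.143
```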
III. EXPERIMENTS AND DISCUSSIONS

This section presents the implementation details and the experimental results of the proposed framework, along with detailed discussions.

A. Experimental details

This study utilized NTNU's IDUN computing cluster [23] for all experiments and implementations. The cluster comprises over 70 nodes and 90 general-purpose graphics processing units (GPGPUs); each node is equipped with at least 128 GB of main memory and two Intel Xeon CPUs and is connected to an InfiniBand network. Half of the nodes are fitted with two or more NVIDIA Tesla P100 or V100 GPGPUs. For training and testing the YOLO models, the study employed CUDA 11.7, the PyTorch 2.0 framework, Anaconda 3 with Jupyter Notebook, and Python 3.9.12.

The training epochs were set to 150 for all YOLO models. The YOLOv5 and YOLOv8 models were trained on input images of size 640 × 640, whereas YOLOv7 was trained on 416 × 416 input images. All three models used stochastic gradient descent (SGD) as the optimizer. The hyperparameters used for training and testing the models are summarized in Table 1. The image pre-processing method applied to the training dataset involves data preparation, noise reduction, augmentation, and image enhancement. In addition, we generated annotations in YOLO format, along
with configuring parameters within the pre-trained models.
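For reference, a YOLO-format annotation file stores one object per line: a class index followed by the box centre and size, normalized by the image dimensions. The converter below is a hypothetical helper for illustration; the paper does not detail its annotation tooling.

```python
def to_yolo_label(cls_id, x1, y1, x2, y2, img_w, img_h):
    """Convert a pixel-space box (x1, y1, x2, y2) to a YOLO-format line:
    'class x_center y_center width height', all normalized to [0, 1]."""
    xc = (x1 + x2) / 2.0 / img_w
    yc = (y1 + y2) / 2.0 / img_h
    w = (x2 - x1) / img_w
    h = (y2 - y1) / img_h
    return f"{cls_id} {xc:.6f} {yc:.6f} {w:.6f} {h:.6f}"

# One .txt file per image, one line per object:
print(to_yolo_label(1, 120, 80, 220, 180, 640, 480))  # "1 0.265625 0.270833 0.156250 0.208333"
```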
All three datasets (Aquarium, Brackish, and URPC2021) used in this experiment were split into training (80%), validation (10%), and testing (10%) sets.
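As a concrete illustration of the training configuration in Table 1 below, the snippet fine-tunes a pre-trained YOLOv8 model with the ultralytics Python API. The dataset YAML filename and the model variant are assumptions, since the paper does not publish its training scripts; YOLOv5 and YOLOv7 were trained analogously from their respective repositories.

```python
from ultralytics import YOLO

model = YOLO("yolov8s.pt")          # pre-trained weights; the exact variant is an assumption
model.train(
    data="brackish.yaml",           # hypothetical dataset config (image paths + class names)
    epochs=150,                     # Table 1
    imgsz=640,                      # Table 1 input shape
    batch=32,                       # Table 1 batch size
    lr0=0.001,                      # Table 1 learning rate
    weight_decay=0.001,             # Table 1 weight decay
    optimizer="SGD",                # SGD, as stated above
)
metrics = model.val(split="test")   # reports mAP@0.5 and mAP@0.5:0.95 on the test split
```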
Table 1. Parameters used in the experiment.

YOLO Model   Batch size   Epochs   Learning rate   Weight decay   Input shape
YOLOv5s      32           150      0.01            0.0005         640 × 640
YOLOv7x      16           150      0.01            0.0005         416 × 416
YOLOv8       32           150      0.001           0.001          640 × 640

B. Experimental Results

Before applying the underwater image enhancement algorithm, we conducted an experiment on the original datasets to identify the dataset that posed the greatest challenge for our object detection task. Our preliminary analysis of the Aquarium and Brackish datasets revealed that the YOLO models performed exceptionally well, exhibiting high precision, recall, and mean average precision (detailed performance scores can be found in Table 2). Conversely, the models yielded unsatisfactory results on the URPC2021 dataset. These poor results can be attributed to the significant challenges posed by the complex environments present in URPC2021, including low resolution, haze, motion blur, low contrast, and colour cast, as well as the frequency of smaller objects in a complex underwater environment. Fig.4 shows examples of original distorted images (Fig.4a) from the URPC2021 dataset alongside their corresponding enhanced versions (Fig.4b), obtained with the underwater image enhancement algorithm.

Fig.4. Sample underwater images from the URPC2021 dataset: (a) original images, (b) enhanced images.

To improve the recognition performance of the pre-trained YOLO models, a combined approach is employed, utilizing both the enhanced images and the original images as inputs for the object-detection network. Fig.5 illustrates the detection results in terms of mAP@0.5, comparing the performance with and without the image enhancement algorithm at 100 epochs. The models trained with image enhancement demonstrate superior performance compared to the models trained solely on the raw dataset. Notably, there is a significant increase of 3% in mAP for YOLOv5 and YOLOv7 and a 4% increase for YOLOv8.

Fig.5. Comparison of mAP@0.5 for 100 epochs. Training with image enhancement shows a greater improvement in mAP compared to training without image enhancement for (a) YOLOv5 and (b) YOLOv8.

Fig.6 visualizes detection results on the URPC2021 dataset, where the coloured bounding boxes represent the various targets detected by each YOLO model; for YOLOv5 and YOLOv8: echinus (red), holothurian (pink), starfish (yellow), and scallop (orange). YOLOv5 identified three bounding boxes for echinus, two for starfish, and one for holothurian, all with relatively high confidence scores. Despite the demonstrated improvement in performance from the image enhancement algorithm, the YOLOv7 and YOLOv8 models still face challenges in accurately detecting holothurians, resulting in missed detections. This is evident in Fig.6, where YOLOv5 successfully detects all classes, while YOLOv7 and YOLOv8 fail to detect holothurian (i.e., missed detections) and exhibit false detections on the URPC2021 dataset.

Fig.6. Detection outputs on the URPC2021 dataset: (a) YOLOv5, (b) YOLOv7, (c) YOLOv8.

The detection results on the Brackish dataset are visualized in Fig.7. All three models successfully detected all the target species (a shrimp, a starfish, and two crabs), with YOLOv5 and YOLOv8 showing nearly equal recognition performance.

Fig.7. Detection outputs on the Brackish dataset: (a) YOLOv5, (b) YOLOv7, (c) YOLOv8.

Fig.8 illustrates example results on the test split of the Aquarium dataset and shows the detection output of the three YOLO models. All models were able to detect the fish and stingrays, demonstrating their flexibility in modelling and predicting objects at various dimensions and scales. They were also able to handle the complexity of distinguishing between the underwater animals (such as fish and stingray) and the background, a significant challenge in underwater image analysis. Of the three models, YOLOv7 demonstrated the highest confidence value for detecting the stingray, at 0.99; YOLOv8 came second with a confidence value of 0.94, and YOLOv5 had the lowest confidence of 0.87.

Fig.8. Detection outputs on the Aquarium dataset: (a) YOLOv5, (b) YOLOv7, (c) YOLOv8.
The PR curves of all classes for the three YOLO models, along with the overall curve across classes, are depicted in Fig.9 to demonstrate their performance on the Brackish dataset. The overall curve was calculated by averaging the per-class results. YOLOv5 and YOLOv8 exhibited nearly equal performance with the largest area under the curve, indicating better detection results for all target classes, especially for 'crab' and 'starfish', with AP values of 99.5%.
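Each per-class AP in Fig.9 is the area under the corresponding PR curve. A minimal sketch of this computation (all-point interpolation; illustrative, not the authors' evaluation code):

```python
import numpy as np

def average_precision(recall, precision):
    """Area under a precision-recall curve (all-point interpolation)."""
    r = np.concatenate(([0.0], recall, [1.0]))
    p = np.concatenate(([0.0], precision, [0.0]))
    p = np.maximum.accumulate(p[::-1])[::-1]   # make precision non-increasing
    idx = np.where(r[1:] != r[:-1])[0]         # points where recall changes
    return float(np.sum((r[idx + 1] - r[idx]) * p[idx + 1]))

# mAP averages AP over classes; the overall curve in Fig.9 averages all classes.
print(average_precision(np.array([0.5, 1.0]), np.array([1.0, 0.8])))  # 0.9
```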
Fig.9. The precision-recall curves of (a) YOLOv5, (b) YOLOv7, and (c) YOLOv8 on the Brackish dataset.

The mAP@0.5 value for both YOLOv5 and YOLOv8 was approximately 99%. Conversely, YOLOv7 had the lowest area under the PR curve, resulting in a lower mAP value. Certain classes, like 'crab' and 'starfish', are generally easier to detect due to their prevalence on the seafloor. In contrast, the 'small fish' and 'jellyfish' classes pose greater difficulty for the models to learn because they can appear anywhere in the image with similar frequency and are relatively small.

Table 2 provides a detailed comparison of the three YOLO models on the original and enhanced versions of URPC2021, as well as on the Brackish and Aquarium datasets. The comparison encompasses accuracy, speed, and latency metrics, providing a detailed analysis of each model's performance across all the datasets.

Table 2. Performance results of the different YOLO models on the different datasets, including the enhanced URPC2021 dataset.

A closer look at the table shows that the detection results on the enhanced URPC2021 dataset are higher in terms of all accuracy metrics than those on the original dataset. This highlights the effectiveness of image enhancement in improving detection accuracy on URPC2021.

On the enhanced URPC2021 dataset and the Aquarium dataset, YOLOv7 achieved the highest values across all measured metrics. However, on the Brackish dataset, YOLOv7 performed worse than the other models. YOLOv5 and YOLOv8 exhibited excellent recognition performance on the Brackish dataset, outperforming YOLOv7 with the highest mAP@0.5 value of 99%. In terms of recall, the YOLOv5 model was superior to YOLOv7 and YOLOv8, achieving 98.5%, while YOLOv8 attained the highest mAP@0.5:0.95 value of 85.6%, as shown in Table 2. Although YOLOv8 is claimed to be state-of-the-art and might be expected to surpass previous YOLO versions, its detection results on the three underwater datasets were broadly similar to those of YOLOv5 across all evaluation metrics. However, since YOLOv8 research is still in progress, it is challenging to exploit its full potential.

In addition to the detection accuracy metrics, the computational complexity of the YOLO models is reported in terms of FPS (frames per second), GFLOPs (giga floating-point operations), and parameter count, as shown in Table 2. FLOPs and FPS respectively gauge the computational complexity and detection speed of a detector, while parameter count determines its deployability. In terms of GFLOPs, the YOLOv5 architecture has an estimated computational cost of 16 GFLOPs, outperforming YOLOv7 and YOLOv8 across all three datasets (Aquarium, Brackish, and URPC2021); that is, YOLOv5 requires less computational power for object detection than the later versions. Additionally, YOLOv5 excels in parameter count, making it more suitable for real-time detection and deployable on computing-constrained underwater vehicles such as AUVs. In terms of FPS, the YOLOv5 model achieved the highest rates, 135 FPS on the Aquarium dataset and 169 FPS on the URPC2021 dataset, when executed on a Tesla V100 GPU, whereas YOLOv7 had the lowest FPS on these two datasets. On the Brackish dataset, YOLOv8 outperformed the other models with the highest execution speed of 162 FPS, although it was comparatively slower on URPC2021.
encompasses various metrics such as accuracy, speed, and In general, YOLOv5 can be optimized to deliver
latency, providing a detailed analysis of each model's competitive detection performance while utilizing fewer
performance across these all the datasets. A closer look at the FLOPS and achieving higher speeds. This can make it a good
table, it is evident that the detection results on the URPC2021 choice for AUVs that possess limited computing capability
Table 2: Performance results of the different YOLO models on the different datasets, including the enhanced URPC dataset.
background. highlighting.

Authorized licensed use limited to: Visvesvaraya Technological University Belagavi. Downloaded on May 20,2024 at 06:20:34 UTC from IEEE Xplore. Restrictions apply.
and memory, as it meets their requirements effectively. [3] K. Liu and Y. Liang, “Enhancement of underwater optical images
Despite their slightly lower inference speed compared to based on background light estimation and improved adaptive
transmission fusion,” Opt. Express, 2021, doi: 10.1364/oe.428626.
YOLOv5, both YOLOv7 and YOLOv8 can provide
[4] G. S. Kumar, U. V. Painumgal, M. N. V. C. Kumar, and K. H. V.
preferable options for AUVs in terms of robustness and Rajesh, “Autonomous Underwater Vehicle for Vision Based
detection performance in complex underwater environments. Tracking,” 2018. doi: 10.1016/j.procs.2018.07.021.
IV. CONCLUSIONS

Achieving efficient recognition of objects underwater has been one of the main objectives of autonomous underwater vehicles (AUVs). This paper explored the use of vision-based deep learning algorithms for automatic object detection in AUVs on challenging underwater scenes. Three publicly available underwater datasets, Aquarium, Brackish, and URPC2021, were used to compare the performance of three detection algorithms: YOLOv5, YOLOv7, and YOLOv8. An underwater image enhancement pipeline was developed to improve and support the object detection task using these algorithms. The objective was to select the best algorithm or model to integrate as the target detection and recognition component of an AUV. The study demonstrates that YOLOv7 achieved superior accuracy in detecting underwater objects on the Aquarium dataset and the enhanced version of the URPC2021 dataset, achieving precision rates of 98% and 92%, respectively. However, it struggled with inference time when tested on the Tesla V100 GPU, resulting in slower execution speed than the other YOLO models. YOLOv8 shows a good balance of accuracy and speed, while YOLOv5 provides the best inference times on GPU. On the Aquarium and URPC2021 datasets, YOLOv8 and YOLOv5 achieved nearly equal performance in terms of precision, recall, mAP@0.5, and mAP@0.5:0.95, but YOLOv5 was the fastest algorithm, outperforming both YOLOv7 and YOLOv8 across all three datasets.

Overall, the study highlights the potential of vision-based deep learning algorithms in underwater object detection and the value of an image enhancement algorithm for improving system performance. The lack of high-quality underwater datasets and images remains a significant challenge for the development of underwater target detection. Future research efforts will focus on optimizing the most effective models by collecting a large and diverse set of underwater datasets and employing image enhancement techniques to improve the overall quality of underwater images, which is crucial for the practicality of the system in real-world applications.

ACKNOWLEDGMENT

This research work was supported by the ADRIATIC project (cooperAtion unDerwater foR effIcient operATions vehICles), co-funded by the MarTERA partners: the Romanian Executive Unit for Financing Higher Education, Research, Development and Innovation (UEFISCDI), the Scientific and Technological Research Council of Turkey (TÜBITAK), the Research Council of Norway (RCN), and the European Union.

REFERENCES

[1] Z. Liu, M. Ling, T. Zhu, and D. Xu, "Safety Analysis of Shrinkage Monitoring Equipment in Marine Resource Exploration," J. Coast. Res., 2020, doi: 10.2112/JCR-SI105-051.1.
[2] H. Ghafoor and Y. Noh, "An overview of next-generation underwater target detection and tracking: An integrated underwater architecture," IEEE Access, 2019, doi: 10.1109/ACCESS.2019.2929932.
[3] K. Liu and Y. Liang, "Enhancement of underwater optical images based on background light estimation and improved adaptive transmission fusion," Opt. Express, 2021, doi: 10.1364/oe.428626.
[4] G. S. Kumar, U. V. Painumgal, M. N. V. C. Kumar, and K. H. V. Rajesh, "Autonomous Underwater Vehicle for Vision Based Tracking," 2018, doi: 10.1016/j.procs.2018.07.021.
[5] S. Villon, M. Chaumont, G. Subsol, S. Villéger, T. Claverie, and D. Mouillot, "Coral reef fish detection and recognition in underwater videos by supervised machine learning: Comparison between deep learning and HOG+SVM methods," 2016, doi: 10.1007/978-3-319-48680-2_15.
[6] X. Wang, J. Ouyang, D. Li, and G. Zhang, "Underwater Object Recognition Based on Deep Encoding-Decoding Network," J. Ocean Univ. China, vol. 18, no. 2, pp. 376-382, Apr. 2019, doi: 10.1007/s11802-019-3858-x.
[7] X. Sun et al., "Transferring deep knowledge for object recognition in low-quality underwater videos," Neurocomputing, vol. 275, pp. 897-908, Jan. 2018, doi: 10.1016/j.neucom.2017.09.044.
[8] M. Fulton, J. Hong, M. J. Islam, and J. Sattar, "Robotic detection of marine litter using deep visual detection models," 2019, doi: 10.1109/ICRA.2019.8793975.
[9] J. Redmon, S. Divvala, R. Girshick, and A. Farhadi, "You only look once: Unified, real-time object detection," 2016, doi: 10.1109/CVPR.2016.91.
[10] Z. Liu, Y. Zhuang, P. Jia, C. Wu, H. Xu, and Z. Liu, "A Novel Underwater Image Enhancement Algorithm and an Improved Underwater Biological Detection Pipeline," J. Mar. Sci. Eng., 2022, doi: 10.3390/jmse10091204.
[11] A. Jesus, C. Zito, C. Tortorici, E. Roura, and G. DeMasi, "Underwater Object Classification and Detection: First results and open challenges," 2022, doi: 10.1109/OCEANSChennai45887.2022.9775417.
[12] Roboflow, "Underwater Object Detection Dataset," Kaggle, 2020. https://www.kaggle.com/datasets/slavkoprytula/aquarium-data-cots
[13] J. Y. Chiang and Y. C. Chen, "Underwater image enhancement by wavelength compensation and dehazing," IEEE Trans. Image Process., 2012, doi: 10.1109/TIP.2011.2179666.
[14] C. Li et al., "An Underwater Image Enhancement Benchmark Dataset and Beyond," IEEE Trans. Image Process., 2020, doi: 10.1109/TIP.2019.2955241.
[15] M. Afifi, B. Price, S. Cohen, and M. S. Brown, "When color constancy goes wrong: Correcting improperly white-balanced images," 2019, doi: 10.1109/CVPR.2019.00163.
[16] Y. Wang, W. Song, G. Fortino, L. Z. Qi, W. Zhang, and A. Liotta, "An Experimental-Based Review of Image Enhancement and Image Restoration Methods for Underwater Imaging," IEEE Access, 2019, doi: 10.1109/ACCESS.2019.2932130.
[17] Y. T. Peng, K. Cao, and P. C. Cosman, "Generalization of the Dark Channel Prior for Single Image Restoration," IEEE Trans. Image Process., 2018, doi: 10.1109/TIP.2018.2813092.
[18] P. Drews-Jr, E. Do Nascimento, F. Moraes, S. Botelho, and M. Campos, "Transmission estimation in underwater single images," 2013, doi: 10.1109/ICCVW.2013.113.
[19] T. Diwan, G. Anirudh, and J. V. Tembhurne, "Object detection using YOLO: challenges, architectural successors, datasets and applications," Multimed. Tools Appl., 2023, doi: 10.1007/s11042-022-13644-y.
[20] U. Nepal and H. Eslamiat, "Comparing YOLOv3, YOLOv4 and YOLOv5 for Autonomous Landing Spot Detection in Faulty UAVs," Sensors, 2022, doi: 10.3390/s22020464.
[21] C.-Y. Wang, A. Bochkovskiy, and H.-Y. M. Liao, "YOLOv7: Trainable bag-of-freebies sets new state-of-the-art for real-time object detectors," Jul. 2022. [Online]. Available: http://arxiv.org/abs/2207.02696
[22] J. Solawetz and Francesco, "What is YOLOv8? The Ultimate Guide," Roboflow, 2023.
[23] M. Själander, M. Jahre, G. Tufte, and N. Reissmann, "EPIC: An Energy-Efficient, High-Performance GPGPU Computing Research Infrastructure," pp. 1-6, 2019. [Online]. Available: http://arxiv.org/abs/1912.05848