Article
Modified Yolov3 for Ship Detection with Visible and
Infrared Images
Lena Chang 1 , Yi-Ting Chen 2, *, Jung-Hua Wang 2,3 and Yang-Lang Chang 4
1 Department of Communications, Navigation and Control Engineering, National Taiwan Ocean University,
Keelung 202301, Taiwan; [email protected]
2 Department of Electrical Engineering, National Taiwan Ocean University, Keelung 202301, Taiwan;
[email protected]
3 Department of Electrical Engineering, AI Research Center, National Taiwan Ocean University,
Keelung 202301, Taiwan
4 Department of Electrical Engineering, National Taipei University of Technology, Taipei 106344, Taiwan;
[email protected]
* Correspondence: [email protected]; Tel.: +886-02-2462-2192 (ext. 7214)
Abstract: As the demands for international marine transportation increase rapidly, effective port
management has become an important issue. Automatic ship recognition can facilitate the realization
of smart ports, and improve the efficiency of port operation and management. In order to take
into account the processing efficiency and detection accuracy at the same time, the study presented
an improved deep-learning network based on You only look once version 3 (Yolov3) for all-day
ship detection with visible and infrared images. Yolov3 network can simultaneously improve the
recognition ability of large and small objects through multiscale feature-extraction architecture.
Considering reducing computational time and network complexity with relatively competitive
detection accuracy, the study modified the architecture of Yolov3 by choosing an appropriate input
image size, fewer convolution filters, and detection scales. In addition, the reduced Yolov3 was further
modified with the spatial pyramid pooling (SPP) module to improve the network performance in
feature extraction. Therefore, the proposed modified network can achieve the purpose of multi-scale,
multi-type, and multi-resolution ship detection. In the study, a common self-built data set was
introduced, aiming to conduct all-day and real-time ship detection. The data set included a total of
5557 infrared and visible light images from six common ship types in northern Taiwan ports. The
experimental results on the data set showed that the proposed modified network architecture achieved
acceptable performance in ship detection, with the mean average precision (mAP) of 93.2%, processing
104 frames per second (FPS), and 29.2 billion floating point operations (BFLOPs). Compared with the
original Yolov3, the proposed method can increase mAP and FPS by about 5.8% and 8%, respectively,
while reducing BFLOPs by about 47.5%. Furthermore, the computational efficiency and detection
performance of the proposed approach have been verified in the comparative experiments with some
existing convolutional neural networks (CNNs). In conclusion, the proposed method can achieve
high detection accuracy with lower computational costs compared to other networks.
Keywords: ship detection; Yolov3; spatial pyramid pooling; infrared images; visible images
1. Introduction
With the dramatic increase in the demand for international maritime trade, effective
management of ports plays a pivotal role in many developing countries. In addition,
real-time monitoring of ships to provide safe coastal areas is also an important issue when
developing the fishery economy and maritime transportation. As computer vision and artificial
intelligence develop rapidly, intelligent surveillance systems have gradually been adopted
in various fields. Recently, ports are getting smarter through intelligent
navigation, automation, and reducing the need for manpower. For instance, the target
detection technology based on deep learning algorithms has attracted widespread attention
in the field of autonomous ship navigation and intelligent ship monitoring [1]. Moreover,
real-time detection of ships based on computer vision technology has greatly improved
port management and maritime inspections [2].
Ship detection plays an important but challenging role in the field of image recognition.
There are two types of data available for ship detection: radar images and optical
images. In general, radar images cover a wider range and optical images provide more
detailed information. In the literature, Synthetic Aperture Radar (SAR) imagery [3–5] and
optical images [6–9] have been widely used for ship detection methods. These studies
conducted experiments in different complex backgrounds for SAR images and optical
images, respectively. It was shown in [6] that the complex background will cause a lot of
false alarms and even increase the computational time. Therefore, it is difficult to develop a
suitable detection model for the complex ocean background characterized by rough sur-
faces, coastal areas, and river estuaries. Moreover, the utilization of SAR images is limited
by noise response and low resolution. For example, the resolution of SAR images degrades
the detection performance of small and densely distributed ships, especially for fishing
vessels moored in ports. Furthermore, due to the time-consuming image collection and
preprocessing, it is difficult to use remote sensing data to achieve real-time ship detection.
With the rapid development of digital cameras, intelligent video surveillance systems
are increasingly deployed in ports and coastal areas, which can be utilized for visible
ship target detection. Through video surveillance, the port management system can
automatically assign a suitable berthing position according to the ship detection results,
which reduces ship waiting time and improves the throughput of berthing areas. This not
only reduces the port operating cost but also improves the port service quality. Moreover,
ship detection also plays an important role in coastal defense. In order to ensure the safety
of coastal areas, the coast guard currently spends a lot of manpower in performing patrol
and defense tasks. With the aid of ship detection, the coast guard can instantly understand
the conditions of the coastal area. For example, ships smuggling or crossing the border can
be detected by the video surveillance systems along the coastline. Therefore, the study used
infrared and visible images for ship detection to monitor ships in the harbor day and night.
Clear and low-noise images are beneficial for subsequent object detection. However,
real-world images are inevitably affected by noise, which may originate from adverse
weather conditions, image acquisition chains, or image compression. These lead to the
degradation of the obtained visual image. This degradation can be canceled or at least
reduced by denoising preprocessing. In general, the image denoising methods can be
divided into spatial domain methods and transform domain methods, such as kernel
regression [10], nonlinear digital filters [11], and the most efficient denoising methods
based on first-generation [13] or second-generation [14] wavelets [12]. Such operations can
not only improve the quality of the image, but also improve the performance of subsequent
image processing (extraction of the desired information, prediction, classification, texture
analysis, and segmentation).
In recent years, there have been several studies on ship detection using optical im-
ages [15–17]. Most algorithms contain three common processes, including region selection,
feature extraction, and classification. Region selection [8] generally adopts the sliding-window
method to scan the entire image, which causes considerable computational redundancy
and thus increases processing time. Then, features of the target are extracted, which
will affect the performance of subsequent target detection. There are some well-known
feature extraction methods, such as local binary patterns (LBP) [15], the scale-invariant feature
transform (SIFT) [18], and the histogram of oriented gradients (HOG) [19], which need
to be manually designed to obtain valuable features. In addition, the establishment of
manual features relies too much on expert experience, and the generalization ability is
weak. Based on the extracted features, targets are mapped and classified using a classifier,
such as a support vector machine (SVM) [16,17,20] and Adaboost [6,21]. Most of the tradi-
tional ship detection methods were based on remote sensing data, which were captured
from a top-down view. Therefore, the handcrafted features can be defined according to
the ship’s aspect ratio, size, or scattering characteristics. In this paper, the ship images
were taken by the camera in ports from different side-view angles. Even for the same
ship type, different perspectives will lead to different ship characteristics. The traditional
methods are limited by the manually designed object features and templates. For ship
detection, the methods based on handcrafted features encounter bottlenecks in the case
of ship targets with multiple scales, multiple types, and multiple side views, or under
complex weather and ocean conditions [6,7]. When it is difficult to define object features by
hand programming, machine learning provides a feasible solution to learn features from a
large amount of observational data. Recently, computer vision based on deep learning and
convolutional neural networks (CNNs) have been widely used in various fields, especially
for object detection and classification. Semantic image features extracted by the deep CNNs
(DCNNs) are robust to morphological changes, image noise, and relative object positions
in visual images [22–25]. Therefore, this research was motivated to utilize an efficient deep
learning network to achieve automatic feature extraction for machine learning. Ships of
various sizes, shapes, and colors can be detected by deep learning methods with higher
detection accuracy than traditional methods. However, it remains a challenge in detecting
small or densely distributed ships, especially in ports.
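For illustration, the following is a minimal sketch of the kind of handcrafted-feature pipeline described above, combining HOG features, a linear SVM, and an exhaustive sliding-window scan. The window size, stride, training patches, and the use of scikit-image and scikit-learn are illustrative assumptions, not configurations evaluated in this paper.

```python
# Minimal sketch of a traditional ship detector: HOG features + linear SVM applied
# with a sliding window. Window size, stride, and training patches are assumptions.
import numpy as np
from skimage.feature import hog
from sklearn.svm import LinearSVC

WIN = (64, 128)   # (rows, cols) of the sliding window
STEP = 16         # sliding-window stride in pixels

def hog_feature(patch):
    # 9-bin HOG over 8x8 cells and 2x2 blocks (common default settings).
    return hog(patch, orientations=9, pixels_per_cell=(8, 8),
               cells_per_block=(2, 2), feature_vector=True)

def train_classifier(ship_patches, background_patches):
    X = np.array([hog_feature(p) for p in ship_patches + background_patches])
    y = np.array([1] * len(ship_patches) + [0] * len(background_patches))
    clf = LinearSVC(C=1.0)
    clf.fit(X, y)
    return clf

def detect(image, clf, threshold=0.5):
    # Exhaustive sliding-window scan: the computational redundancy criticized above.
    detections = []
    for r in range(0, image.shape[0] - WIN[0], STEP):
        for c in range(0, image.shape[1] - WIN[1], STEP):
            patch = image[r:r + WIN[0], c:c + WIN[1]]
            score = clf.decision_function([hog_feature(patch)])[0]
            if score > threshold:
                detections.append((r, c, WIN[0], WIN[1], score))
    return detections
```

Such a pipeline depends entirely on the manually chosen window, features, and classifier, which is precisely the limitation that motivates the learned features discussed next.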
However, many high-precision methods are computationally intensive. In recent years,
the deep learning methods implemented on GPU have accelerated the computing speed of
object detection [26–30]. Generally, there are two approaches for object detection based on
deep learning: the one-stage method and the two-stage method. The two-stage approaches
consist of two modules, a DCNN and a region proposal network. The representative two-stage
methods mainly include region-based CNN (R-CNN) [30], Fast R-CNN [31], and Faster
R-CNN [32]. For instance, Fan et al. [33] proposed the modified Faster R-CNN for ship
detection using Polarimetric SAR (PolSAR) data, which still had difficulty detecting inshore
small ships. The study [34] predicted ship navigation direction and detected dense ships
through a detection model with multi-scale rotational R-CNN. The research [35] proposed a
region of interest (ROI) method, which can achieve better small ship detection performance
in SAR images by combining SVM and Faster R-CNN. Dong et al. [36] adopted a multi-
angle box-based rotation in-sensitive structure of object detection to improve the R-CNN for
very-high-resolution (VHR) ship images. The computational efficiency is still insufficient
for real-time processing, even though the detection performance of the two-stage approach is
better than that of the traditional one. Subsequently, considering the requirement of fast
processing in real-time object detection, the one-stage method was proposed to directly
detect the category and position of the object by omitting the region proposal step. The main
one-stage representative methods are Single Shot Multibox Detector (SSD) [37], Yolo [38],
Yolov2 [39], Yolov3 [40], and Yolov4 [41]. In the literature, some studies have applied deep
learning methods to ship detection in SAR imagery. For example, Wang et al. [42] improved
the overall performance and detection accuracy on Sentinel-1 SAR images by using SSD
to perform transfer learning. Zhang et al. [43] proposed a grid CNN (G-CNN) approach
for real-time ship detection in SAR images, which had a faster detection performance by
meshing the input images. Furthermore, studies [44,45] have proposed improved Yolo-
based networks for ship tracking. Zhang et al. [45] addressed the problems of missed detections and
inaccurate localization by combining HOG and LBP features in a ship
detection method based on an optimized Yolo network. The study [44] realized the tracking
and detection of ships in monitored marine areas by improving Yolov3 architecture based
on Darknet.
In addition to the detection accuracy, improving the processing speed, reducing
the model complexity, and adapting the ship detection model to the actual hardware
conditions are of great significance to the system implementation. Considering the relatively
balanced detection performance in processing time and detection accuracy of the Yolov3
algorithm [40], this paper utilized the Yolov3 architecture for the ship detection method
by modifying the parameters and architecture of the network. In our previous study [46],
the concept of modifying Yolov3 parameters for ship detection was proposed based on
changing the input image size, the number of filters in the convolutional layer, and the
detection scale. Compared with [46], this study further modified the Yolov3 network by
using a spatial pyramid pooling (SPP) module to improve feature extraction. More complete
experiments have been conducted, such as selecting a more appropriate input image
size for ship detection and comparing the proposed approach with other deep learning
networks. In addition, the built dataset has been augmented and images of different
complex backgrounds, ship types, and target scales have been used to verify the ship
detection method in this paper. Experimental results showed that the proposed modified
network achieved low computational complexity and robustness in real-time ship detection.
The rest of the paper was organized as follows. The framework of Yolo networks
was given in Section 2. Section 3 described the details of the modified method. Section 4
presented the self-built ship data set and the experimental results. Finally, some conclusions
were drawn in Section 5.
IoU is often used to evaluate the accuracy of an object detector. If the IoU is greater
than the defined threshold, the prediction of a bounding box containing an object is
considered “correct”. IoU is useful when assigning anchor boxes during training dataset
preparation and when removing duplicate prediction boxes for the same object with the
non-maximum suppression algorithm. The default IoU threshold is usually set to 0.5,
which requires the predicted box and the ground truth to overlap in at least half of their combined area.
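As a concrete illustration of how IoU and non-maximum suppression work together, the sketch below computes IoU for axis-aligned boxes and greedily suppresses overlapping predictions. The (x1, y1, x2, y2) corner format and the greedy strategy are assumptions of this sketch, not details taken from a specific Yolo implementation.

```python
# IoU between two axis-aligned boxes given as (x1, y1, x2, y2) corners, and a greedy
# non-maximum suppression that keeps the highest-scoring box per object.
def iou(box_a, box_b):
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    return inter / union if union > 0 else 0.0

def nms(boxes, scores, iou_threshold=0.5):
    # Sort candidates by confidence, then greedily suppress overlapping boxes.
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    while order:
        best = order.pop(0)
        keep.append(best)
        order = [i for i in order if iou(boxes[best], boxes[i]) < iou_threshold]
    return keep
```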
Yolov1 is generally reported as having a faster network speed and less computation
time. However, its detection accuracy is lower than that of other popular algorithms,
such as SSD and Faster R-CNN. Compared with Yolov1, Yolov2 has significant improvements
in computational efficiency and detection performance. Many improvements were
proposed in Yolov2. The fully connected layers were replaced by convolutional layers,
and the concept of anchor boxes was introduced. To match objects of different shapes and
sizes, the anchors are usually set according to the size of the object in the training data set.
Instead of computing class probabilities for each cell as in Yolov1, the class probabilities are
calculated for each anchor box in Yolov2. In addition, the backbone network architecture of
Yolov2 is Darknet-19.
λ_coord is the weight of the coordinate error. s² is the number of grid cells per detection
layer and B is the number of bounding boxes in each grid cell. I_ij^obj indicates whether a
target lies in the j-th bounding box of the i-th grid cell. (x_i, y_i, h_i, w_i) and (x̂_i, ŷ_i, ĥ_i, ŵ_i)
represent the center coordinate, height, and width of the ground truth and predicted box,
respectively. The IoU error indicates the degree of overlap between the ground truth and
the predicted box, which is given by

$$\mathrm{Error}_{IoU}=\sum_{i=1}^{s^2}\sum_{j=1}^{B} I_{ij}^{obj}\left(C_i-\hat{C}_i\right)^2+\lambda_{noobj}\sum_{i=1}^{s^2}\sum_{j=1}^{B} I_{ij}^{noobj}\left(C_i-\hat{C}_i\right)^2 \quad (4)$$

λ_noobj is the confidence penalty when the prediction box does not contain an object.
C_i and Ĉ_i represent the true and predicted confidence, respectively. Classification error
represents the accuracy of classification. It can be defined as:

$$\mathrm{Error}_{cls}=\lambda_{coord}\sum_{i=1}^{s^2}\sum_{j=1}^{B} I_{ij}^{obj}\sum_{c\in classes}\left(p_i(c)-\hat{p}_i(c)\right)^2 \quad (5)$$

c represents the class to which the detected target belongs. p_i(c) and p̂_i(c) refer to the
true probability and predicted value of the target, respectively. Combining the above errors,
the loss function of Yolov3 is expressed as the sum of the coordinate, IoU, and classification errors:

$$\mathrm{Loss}=\mathrm{Error}_{coord}+\mathrm{Error}_{IoU}+\mathrm{Error}_{cls} \quad (6)$$
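The error terms above can also be written compactly in code. The following numpy sketch mirrors Equations (4) and (5) for a single detection layer with an s × s grid and B anchor boxes per cell; the tensor shapes and default weights are assumptions, and the coordinate error is assumed to be computed elsewhere.

```python
# Sketch of the Yolo-style error terms in Equations (4)-(6). obj_mask has shape
# (s, s, B) and marks the anchors responsible for a target; shapes and the default
# weights are illustrative assumptions.
import numpy as np

def iou_error(conf_true, conf_pred, obj_mask, lambda_noobj=0.5):
    noobj_mask = 1.0 - obj_mask
    return np.sum(obj_mask * (conf_true - conf_pred) ** 2) + \
           lambda_noobj * np.sum(noobj_mask * (conf_true - conf_pred) ** 2)

def cls_error(p_true, p_pred, obj_mask, lambda_coord=5.0):
    # p_true, p_pred: (s, s, B, num_classes); the lambda_coord weighting follows
    # Equation (5) as printed above.
    return lambda_coord * np.sum(obj_mask[..., None] * (p_true - p_pred) ** 2)

def total_loss(coord_err, iou_err, cls_err):
    # Equation (6): the three error terms are summed.
    return coord_err + iou_err + cls_err
```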
3. Methodology
3.1. Proposed Modified Yolov3 Network Architecture
The proposed modified network architecture in this paper was based on the Yolov3
network. First, the study chose the anchor box size that was more suitable for the self-built
ship data set in network training. The anchor boxes originally proposed in Faster R-CNN
were used to detect multiple objects in one grid cell. Then, the Yolo matched the ratio of
width to height of objects by anchor boxes. In Yolo, the width and height of anchor boxes
were obtained based on the Pascal VOC [47] and COCO [48] data sets. Since those data sets
contained various types of objects, the defined anchor box size was not suitable for the ship
data set in this research. Based on the ship-type characteristics in the built ship data set,
this research obtained the appropriate anchor boxes by the K-means [49] algorithm. Since
the prediction layer of the Yolov3 network contains three anchor boxes for each scale, it is
necessary to partition the sizes of bounding boxes into nine categories. In order to acquire
optimal sizes of anchor boxes, the width and height of the bounding box are selected as
the clustering features in K-means. In the clustering process, the bounding box size of
each target in the dataset is divided into nine clusters according to the feature similarity,
which is measured by the IoU value between the current anchor box and the bounding box.
Then, the anchor box size is updated by the mean value of each cluster. These processes are
performed iteratively until the centroid of each cluster does not change. Since the selected
anchor boxes are much closer to the ship shapes in the ship data set, these anchor boxes
can speed up the network training. The sizes of the anchor boxes obtained by K-means were
(14, 21), (26, 36), (47, 38), (62, 59), (91, 77), (73, 113), (130, 109), (105, 170), and (186, 172), which were
applied in the following experiments.
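The anchor-clustering procedure described above can be sketched as follows; the random initialization, the convergence test, and the final sorting by box area are implementation assumptions of this sketch.

```python
# K-means over ground-truth box (width, height) pairs using 1 - IoU as the distance,
# as described above. Boxes are compared as if they shared one corner.
import numpy as np

def wh_iou(box, anchors):
    # box: (2,), anchors: (k, 2); IoU of width/height pairs anchored at the origin.
    inter = np.minimum(box[0], anchors[:, 0]) * np.minimum(box[1], anchors[:, 1])
    union = box[0] * box[1] + anchors[:, 0] * anchors[:, 1] - inter
    return inter / union

def kmeans_anchors(boxes, k=9, seed=0, max_iter=1000):
    # boxes: (N, 2) array of ground-truth widths and heights.
    rng = np.random.default_rng(seed)
    anchors = boxes[rng.choice(len(boxes), k, replace=False)].astype(float)
    last = np.zeros(len(boxes), dtype=int)
    for _ in range(max_iter):
        # Assign each box to the anchor with the highest IoU (smallest 1 - IoU).
        assign = np.array([np.argmax(wh_iou(b, anchors)) for b in boxes])
        if np.all(assign == last):
            break                       # cluster assignments stopped changing
        for j in range(k):
            if np.any(assign == j):
                anchors[j] = boxes[assign == j].mean(axis=0)
        last = assign
    return anchors[np.argsort(anchors[:, 0] * anchors[:, 1])]
```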
Next, the study evaluated the influence of the input image size on the detection per-
formance of the Yolo-based networks. For this purpose, the study examined the efficiency
of networks with different input image sizes, from 288 × 288 to 512 × 512. Generally, the
larger the input image size, that is, the larger the feature maps in the deep learning network,
the more features and details of the image can be retained. Although the detection accuracy
is better when the input image size increases, the computational complexity also increases.
In order to achieve better detection performance and computational efficiency at the same
time, the research will select the appropriate input image size in ship detection.
The multiscale detection module helps the Yolo network search for and detect
objects of different scales in the same image. However, the more complex the entire
deep learning network, the longer the computation time required for object detection. In
addition, with the refinement of the grid, more retained image details will increase the
detection accuracy, but at the same time, more training and prediction times will reduce
the computational efficiency. Considering the trade-off between detection accuracy and
computation time, appropriate detection scales not only simplify network architecture but
also improve detection performance. Therefore, it is important to choose an appropriate
network scale for specific object detection, such as ships. The study will consider three com-
binations of detection scales, one of which has all three scales, another retains medium and
small target scales (removing the large target scale), and the other only has the small target
scale. Experiments will examine the ship detection efficiency of these three combinations.
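Since the three Yolov3 detection scales correspond to feature-map strides of 32, 16, and 8, the grid sizes follow directly from the input resolution. The small sketch below computes them for a 384 × 384 input; mapping the stride-32 head to the "large target scale" follows the usual Yolov3 convention and is an assumption here.

```python
# Grid sizes of the Yolov3 detection scales for a given input resolution.
# Stride 32 is the coarse grid for large targets; strides 16 and 8 handle
# medium and small targets (the usual Yolov3 convention, assumed here).
def grid_sizes(input_size=384, strides=(32, 16, 8)):
    return {s: input_size // s for s in strides}

print(grid_sizes(384))           # {32: 12, 16: 24, 8: 48}
# Keeping only the medium- and small-target scales drops the stride-32 head:
print(grid_sizes(384, (16, 8)))  # {16: 24, 8: 48}
```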
Finally, the influence of the convolution filters on the network performance was con-
sidered. More convolutional filters mean more weights in the deep learning network, which
can improve the detection accuracy of the network, but also increase the computational
burden of the system. Since the built ship dataset includes only six types of ships, choosing
an appropriate number of filters will improve the efficiency of storage and system imple-
mentation of the proposed Yolov3 network architecture. Therefore, this study examined the
ship detection performance by reducing the filters of the convolutional layers in Darknet-53,
the backbone of Yolov3. For example, when a 20% filter reduction was performed, the
number of filters of 32 and 64 in the convolutional layers of the first residual block, shown
in Figure 1, would be reduced to 26 and 52, respectively. The experiments in the next section
showed that an appropriate number of filters can reduce the computational complexity of
the system, improve the classification speed, and maintain the detection accuracy at the
same time.
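As a simple illustration of the filter-reduction step, the sketch below scales a layer's filter count by the chosen reduction ratio. Rounding to the nearest even integer reproduces the 32 to 26 and 64 to 52 example mentioned above, but the exact rounding rule used in the paper is an assumption.

```python
# Scale the number of filters in each convolutional layer by a reduction ratio.
# Rounding to the nearest even integer reproduces the example in the text
# (32 -> 26 and 64 -> 52 for a 20% reduction); the rounding rule is an assumption.
def reduce_filters(filters, reduction=0.2):
    return int(round(filters * (1.0 - reduction) / 2.0)) * 2

print([reduce_filters(f, 0.2) for f in (32, 64)])   # [26, 52]
print([reduce_filters(f, 0.3) for f in (32, 64)])   # [22, 44]
```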
Figure 1. Proposed modified Yolov3 network with SPP module.
3.2. Spatial Pyramid Pooling
Spatial pyramid pooling (SPP) [50,51] is one of the most popular approaches for vision
recognition. The SPP module divides each feature map into several different grid sizes
(such as 4 × 4, 2 × 2, 1 × 1) and then performs the maximum pooling operation on
each grid. After the maximum pooling, three feature maps with dimensions of 16 × C,
4 × C, and 1 × C will be generated for a C-dimensional input feature map. Then, the three
feature maps are able to generate a fixed-length output feature map regardless of the input
size and will connect to the following fully connected layers. Thus, regardless of the input
dimension, the SPP module provides fixed-dimensional output, which was impossible in
the previous networks using sliding windows. Due to the flexibility of the input dimensions,
SPP can incorporate the functionality obtained in variable dimensions.
Moreover, SPP extracts the main spatial information of the feature map and performs
stitching, which makes it a feature enhancement module. The receptive field of a single neuron
gradually increases as the convolutional layers of the Yolov3 network are deepened during
the feature extraction process. At the same time, the feature extraction capability has also
been improved, and the extracted features have become more abstract. If the shape of the
object's feature map is blurred, the spatial information of the small object will be inaccurate
at this time. Experimental results show that when using Yolov3 to detect multiple ships
in one image, missed detections will happen and the ship detection
performance will be greatly reduced. Due to the enhanced feature extraction capability of
SPP, the study proposed a modified Yolov3 network that adopts the SPP module to improve
the performance of Yolov3 in multi-ship target detection. As shown in Figure 1, the
SPP module is added between the Darknet-53 backbone and the FPN. The feature maps are
pooled at different scales by sliding windows of sizes 1, 5, 9, and 13
in local spatial bins, respectively. The stride of max-pooling is set to 1 and padding is
utilized to keep the size of the output feature maps unchanged. Then, these four feature
maps are concatenated and fed into the subsequent detection layer. Experiments verified that
the proposed modified Yolov3 has improved the ship detection performance, especially for
blurred images with densely distributed ships.
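The SPP block described above can be sketched as follows in PyTorch (the paper's implementation is based on Darknet, so the framework and module structure here are assumptions): the input feature map is max-pooled with 5 × 5, 9 × 9, and 13 × 13 windows at stride 1 with padding, the size-1 branch is the identity, and the four maps are concatenated along the channel axis.

```python
# Sketch of the SPP block: three max-pooling branches plus the identity branch,
# concatenated along the channel dimension so the spatial size is preserved.
import torch
import torch.nn as nn

class SPPBlock(nn.Module):
    def __init__(self, pool_sizes=(5, 9, 13)):
        super().__init__()
        self.pools = nn.ModuleList(
            nn.MaxPool2d(kernel_size=k, stride=1, padding=k // 2) for k in pool_sizes
        )

    def forward(self, x):
        # x: (batch, C, H, W) -> (batch, 4 * C, H, W)
        return torch.cat([x] + [pool(x) for pool in self.pools], dim=1)

# Example: a 512-channel 12 x 12 feature map becomes 2048 channels of the same size.
feat = torch.randn(1, 512, 12, 12)
print(SPPBlock()(feat).shape)   # torch.Size([1, 2048, 12, 12])
```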
Class           Container   Cruise   War Ship   Yacht   Sailboat   Fishing Boat
Total numbers   1009        528      1008       1043    1000       969
4.2. Evaluation Methods
In the study, the metrics including IoU, precision, recall, F1-score, mean Average
Precision (mAP), frames per second (FPS), and billion floating point operations (BFLOPs)
were utilized to evaluate the detection performance of the proposed modified network. The
effectiveness of the predicted bounding box is determined according to whether the IoU is
greater than the specified threshold [55]. In the experiment, the IoU threshold was set to 0.5.
Precision, recall rate, and F1-score are common performance indicators for evaluating object
detectors. Precision (P) refers to the ratio of true ships to all ships predicted by the network.
Recall (R) refers to the proportion of true ships predicted by the networks among all true
ships. F1-score is a comprehensive indicator that combines precision and recall to evaluate
the performance of different networks. The calculation formulas of the abovementioned
indicators are as follows:

$$\mathrm{Precision}=\frac{TP}{TP+FP} \quad (7)$$

$$\mathrm{Recall}=\frac{TP}{TP+FN} \quad (8)$$

$$\mathrm{F1\text{-}score}=2\times\frac{\mathrm{Precision}\times\mathrm{Recall}}{\mathrm{Precision}+\mathrm{Recall}} \quad (9)$$

where TP (True Positive) represents samples that are actually positive and predicted to be
positive; FP (False Positive) represents samples that are actually negative but predicted to
be positive; FN (False Negative) refers to samples that are actually positive but predicted to
be negative; TN (True Negative) refers to samples that are actually negative and predicted
to be negative.
Average precision (AP) is usually used as a performance index for object detection.
It represents the accuracy of the model in a specific category, which can be calculated
by the area under the Precision-Recall (P-R) curve, as shown in Equation (10),

$$AP=\int_{0}^{1} P(R)\,dR \quad (10)$$

Moreover, to evaluate the precision of all categories, the mean AP (mAP) is often used
as a performance measure for the network.
Frames per second (FPS) represents the number of frames processed by the detection
method in one second. It is also an important metric for evaluating the real-time performance
of the object detector. Besides the above performance metrics, BFLOPs represent the
number of operations required by the detection algorithm and can be used as an indicator
to evaluate the complexity of the network.
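For reference, the evaluation metrics in Equations (7)-(10) can be computed as in the sketch below; the all-point interpolation used for AP is an assumption, as the interpolation scheme is not stated here.

```python
# Precision, recall, and F1-score from TP/FP/FN counts (Equations (7)-(9)), and AP
# as the area under the precision-recall curve (Equation (10)), using all-point
# interpolation (an assumption of this sketch).
import numpy as np

def precision_recall_f1(tp, fp, fn):
    precision = tp / (tp + fp) if tp + fp > 0 else 0.0
    recall = tp / (tp + fn) if tp + fn > 0 else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall > 0 else 0.0
    return precision, recall, f1

def average_precision(recalls, precisions):
    # recalls, precisions: arrays for predictions sorted by descending confidence.
    r = np.concatenate(([0.0], recalls, [1.0]))
    p = np.concatenate(([0.0], precisions, [0.0]))
    p = np.maximum.accumulate(p[::-1])[::-1]     # monotone precision envelope
    idx = np.where(r[1:] != r[:-1])[0]
    return float(np.sum((r[idx + 1] - r[idx]) * p[idx + 1]))
```

mAP is then the mean of the per-class AP values over the six ship types.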
4.3. Modified Yolov3 Performance
In this experiment, the study compared the performance of the modified Yolov3 with
different parameters, including input image sizes, detection scales, and the number of
convolution filters.
First, ship detection experiments were conducted to evaluate the impact of the input
image size on Yolov3 performance. In the experiment, the detection scales and convolution
filters were maintained as those in the original Yolov3. Figure 3 displayed the mAP values
of Yolov3 with input image sizes varying from 288 to 512. It can be observed that the mAP
of Yolov3 increased from 89.7% to 91.6%. However, the mAP only increased slightly
when the input image size was larger than 384. In addition to mAP, the performance
metrics BFLOPs and FPS were evaluated for simulation schemes with input image sizes
of 352 × 352, 384 × 384, 416 × 416, and 448 × 448. These results were presented in the
first block of Table 2. It can be observed that the input image size has a great influence on
the computational complexity of the network architecture. Although the mAP is higher
when the input image size increases, the required BFLOPs increase accordingly. In
general, the larger the image size, the higher the mAP of the ship detection. Comparing the
results in Table 2, the mAP of the 384 × 384 image remained above 91%, which was only
0.2% and 0.1% lower than the mAP of the 416 × 416 and 448 × 448 images, respectively.
In contrast, the BFLOPs of the 384 × 384 image were 55.7, which was about 85% and 73% of
the BFLOPs required for the 416 × 416 and 448 × 448 images, respectively. Considering the
computational efficiency and mAP, the study selected the input image size to be 384 × 384
for the following experiments.
Figure 3. The mAP of Yolov3 with different input image sizes.
Table 2 (continued). Scales (input image size 384 × 384):
Two detection scales: 51.0 BFLOPs, 98.4 FPS, 90.8% mAP, 0.93 precision, 0.84 recall, 0.88 F1-score
Small target scale:   46.3 BFLOPs, 101.2 FPS, 88.0% mAP, 0.93 precision, 0.80 recall, 0.86 F1-score
Next, the experiments were performed to examine the influence of the detection scales
of Yolov3 on ship detection. The input image size was 384 × 384, and the convolution
filters remained the same as those in the original Yolov3. Three combinations of detection
scales were considered in the experiment: (1) with all three scales, (2) with two detection
scales for medium and small targets (removing the large target scale), and (3) with only
the small target scale. The detection performance of the three combinations was compared
in the second block of Table 2. For the two detection scales combination scheme (2), the
mAP was 90.8%, which was 0.5% lower than the mAP of Yolov3 with all detection scales
(as shown in the first block of Table 2); and the BFLOPs was 51, which was about 91% of
the operations required in Yolov3 with all detection scales. Although the mAP of the two
detection scales decreased slightly, it still remained above 90%. Therefore, the simulation
schemes of 384 × 384 image size and two detection scales (for medium and small targets)
were considered in the following experiments.
Finally, the impact of reducing convolutional filters on network performance was
examined. In the experiment, the network with 20%, 30%, and 40% filter reduction was
considered. Moreover, the input image size was 384 × 384, and the network preserved
two detection scales for small and medium targets. The detection performance was shown
in the third block of Table 2. It can be observed that the Yolov3 with 30% filter reduction
had a better performance, with 90.7% mAP and 28.9 BFLOPs. Compared with the network
without reducing the convolutional filters, the network with 30% filter reduction had better
calculation efficiency with similar ship detection performance. The BFLOPs of the network
with 30% filter reduction have been reduced by about 43.3% relative to the operations required by the
network with all filters retained, while the mAP remained above 90%.
According to the above experimental results, the proposed Yolov3 modified the net-
work parameters, in which the input image size was 384 × 384, the detection module
retained two scales of small and medium targets, and the convolution filters were reduced
by 30%. The modified Yolov3 greatly reduced the computation cost while maintaining ship
detection accuracy, with 90.7% mAP and 28.9 BFLOPs. Moreover, the FPS of the modified
Yolov3 was up to 106.2, which is about 9.6% higher than the original Yolov3. Figure 4
showed the training process of this modified Yolov3 model. It can be observed that after
completing 20,000 iterations, the modified Yolov3 has reached an accuracy of more than
90%, with a loss of 0.13.
Figure 4. Training process of the modified Yolov3. The simulation scheme was 384 × 384 input size,
two detection scales, and 30% filter reduction. The red line and blue line represent the mAP and
training average loss, respectively.
In the following, the detection accuracy of various types of ships in the testing data
was studied for the Yolov3 network with different parameters. The study examined the
effect of input image size, detection scales, and convolution filters on ship detection accuracy
by the same three simulation schemes as Table 2. The corresponding results were shown in
the first, second, and third blocks of Table 3, respectively.

Table 3. Detection accuracy of Yolov3 on test data under different simulation scenarios.

Parameters                                   Scheme                 Warship   Container Ship   Cruise Ship   Yacht   Sailboat   Fishing Boat   mAP
Input image size (three detection scales)    448 × 448              88.4      91.2             90.1          87.1    94.4       87.8           89.8
                                             416 × 416              84.6      91.1             88.0          85.1    92.0       84.0           87.5
                                             384 × 384              84.5      91.2             87.6          85.2    92.4       83.8           87.4
                                             352 × 352              82.4      92.3             85.8          82.7    94.6       81.0           86.4
Scales (input image size 384 × 384)          Two detection scales   90.4      93.1             88.6          90.5    93.0       83.6           89.9
                                             Small target scale     87.6      92.2             85.0          87.2    89.4       80.4           87.0
Filters (input image size 384 × 384          −20%                   89.6      93.2             95.1          93.0    92.7       83.7           91.2
and two detection scales)                    −30%                   91.8      92.4             94.5          94.1    96.3       87.6           92.8
                                             −40%                   90.4      90.3             92.4          90.1    93.8       81.6           89.4

Considering the effect of image size, it can be observed that the network with the
input image size of 448 × 448 had better performance for every type of ship. The detection
accuracy of the 416 × 416 and 384 × 384 image sizes was very close, only about 2% lower than
the detection accuracy of the 448 × 448 image size. In fact, the larger the input image size, the
better the detection accuracy, but the computational burden of the network also increases.
In order to reduce the computational complexity, the research tried to select a moderate
image size and used a network with appropriate detection scales and convolution filters.
Based on the results of the second block of Table 3, when selecting the input image size of
384 × 384 and using the network with two detection scales and all convolution filters, the
detection accuracy of ships has improved and the mAP reached 89.9%, which was 0.1%
higher than the mAP corresponding to the input image size of 448 × 448. Finally, the results
in the third block of Table 3 also verified that an appropriate number of convolutional
filters would further improve the accuracy of ship detection. For the network with 30%
filter reduction, mAP was up to 92.8%, which was 2.9% higher than the 89.9% mAP of the
abovementioned network with all convolution filters.
The experimental results validated that the modified Yolov3, with input image size
384 × 384, two detection scales, and a 30% filter reduction in the convolutional layer,
can achieve higher ship detection accuracy, superior performance, and better calculation
efficiency than the original Yolov3 network.
Table 4. Performance comparison of the proposed method with other networks on training data.

Network               BFLOPs   FPS     mAP     Precision   Recall   F1-score
Modified Yolov3       28.9     106.2   90.7%   0.92        0.84     0.88
Modified Yolov3-spp   29.2     104.7   93.0%   0.93        0.86     0.89
performance with high detection accuracy, low computational complexity, and fast processing speed.
Figure 5. Performance evaluation of the Yolo-based networks, including (a) BFLOPs, FPS and mAP;
(b) Precision, Recall, and F1-score.
Then, the detection results of the proposed modified Yolov3 and other CNN networks
by using testing data were shown in Table 5. Yolov2-tiny and EfficientDet had poor
detection results, with mAP of 64.8% and 61.9%, respectively. The detection accuracy of
SSD was similar to that of Yolov2, and the mAP was about 76%. The reason for the poor
detection efficiency was that the network cannot extract effective features from multiscale
images, while the Yolov3 applied the FPN technique to address this problem. The mAP
of Yolov3-spp has improved by 1.4% compared to the original Yolov3. The proposed
modified Yolov3 and modified Yolov3-spp can improve the detection performance, with
mAP of 92.8% and 93.2%, which were 5.4% and 4.4% higher than the original Yolov3
networks, respectively. Among Yolo-based models, Yolov4 achieved the highest detection
performance, reaching 94.3% mAP. The mAP of the proposed modified Yolov3-spp was
1.1% lower than that of Yolov4, which was due to the slightly lower detection accuracy of
the proposed approach for small vessels such as fishing boats. The precision and recall of
Yolov4 were 0.02 lower and 0.02 higher than those of the proposed method, respectively. Both Yolov4
and the proposed method had an F1-score of 0.89. However, the BFLOPs of the proposed
modified Yolov3-spp were only 57.5% of the required operations of Yolov4.
Table 5. Detection accuracy (AP, %) of the proposed modified methods and other networks on testing data.

Network               Warship   Container Ship   Cruise Ship   Yacht   Sailboat   Fishing Boat   mAP
EfficientDet          64.9      62.1             63.8          68.1    65.2       47.4           61.9
Resnet151             81.2      86.5             83.5          86.8    87.8       80.2           84.3
SSD                   75.2      73.5             83.6          79.8    81.5       65.5           76.5
Yolov2                74.8      72.6             80.4          78.2    78.1       67.4           75.3
Yolov3                84.5      91.2             87.6          85.2    92.4       83.8           87.4
Yolov3-spp            86.7      92.1             90.7          84.1    95.6       83.5           88.8
Yolov4                92.9      93.8             95.7          92.9    97.2       93.4           94.3
Yolov2-tiny           65.8      67.8             64.7          65.3    71.8       53.4           64.8
Yolov3-tiny           71.1      78.7             72.5          70.2    74.8       67.8           72.5
Yolov4-tiny           79.6      87.0             82.3          77.3    87.5       77.2           81.8
Modified Yolov3       91.8      92.4             94.5          94.1    96.3       87.6           92.8
Modified Yolov3-spp   92.9      93.2             95.4          93.5    95.8       88.9           93.2
In summary, compared with other networks, the proposed modified Yolov3-spp can
provide high detection accuracy and high calculation efficiency for ship detection. The results
in Tables 4 and 5 verified the superior performance of the proposed modified networks.
Figure 6. (1) Container ship; (2) cruise; (3) yacht; (4) war ship; (5) sailboat; (6) fishing boat; (7) fishing boat (infrared images); (8) war ship (blurred images); (9) multi-type ship targets.
From the results, it can be observed that the modified Yolov3 and the original Yolov3
have missed some small and obscure ships. However, the modified Yolov3-spp can avoid
missing some densely arranged ships, and even detect partially obstructed ship targets.
The adopted SPP modules improve the feature extraction and preserve spatial information
by pooling in local spatial bins, thereby improving the ability to express ship features and
alleviating the problem of multiscale ship detection. The modified Yolov3-spp has better
detection performance than the modified Yolov3. In addition, the Yolov3 networks can
successfully detect ships in infrared images, as shown in the seventh row of Figure 6. Due
to the dense distribution of fishing boats in harbors, some fishing boats in this infrared
image were missed by Yolov3 and the modified Yolov3. However, the modified
Yolov3-spp achieved better detection and only missed one fishing boat, compared with the
other two Yolov3 networks. Finally, even for blurred images or multiple types of ship targets in
one image, as shown in the eighth and ninth rows of Figure 6, the modified Yolov3-spp can
detect almost all ships correctly and achieve the highest confidence score. In general, the
proposed modified Yolov3-spp network can improve the performance of multi-scale ship
detection, and the detection box is more accurate than that of the original Yolov3 network.
5. Conclusions
This study proposed a modified Yolov3-spp model for ship detection with visible
and infrared images. The effectiveness of the proposed method in real-time detection was
verified by the experiments on the built data set consisting of six types of ship images.
Experimental results showed that the proposed modified Yolov3-spp outperforms most
of the current CNN networks in terms of detection accuracy and computation efficiency.
The proposed method achieved better detection performance than the original Yolov3
in ship detection, increasing mAP by 5.8%, FPS by 8%, and reducing BFLOPs by about
47.6%. Experiments also showed that the proposed method has high detection accuracy in
multiscale detection situations, especially for the detection of densely distributed ships in
ports. In conclusion, the proposed method has high computational efficiency and detection
accuracy and meets the requirements of real-time detection. Furthermore, this study has
investigated the ship detection algorithms in detail and developed a common ship dataset
consisting of visible and infrared images. In future work, attention mechanisms and a more
complete data set will be key research directions.
Author Contributions: Data curation, Y.-T.C.; Methodology, L.C.; Project administration, Y.-L.C.;
Software, Y.-T.C.; Supervision, J.-H.W.; Validation, L.C.; Writing—original draft, Y.-T.C.; Writing—
review & editing, L.C. All authors have read and agreed to the published version of the manuscript.
Funding: This research was funded by the Ministry of Science and Technology, Taiwan, under Grant
Nos: MOST-109-2221-E019-054, MOST-110-2119-M-027-001 and MOST-110-2221-E-027-101.
Institutional Review Board Statement: Not applicable.
Informed Consent Statement: Not applicable.
Data Availability Statement: Not applicable.
Conflicts of Interest: The authors declare no conflict of interest.
References
1. Chen, X.; Chen, H.; Wu, H.; Huang, Y.; Yang, Y.; Zhang, W.; Xiong, P. Robust visual ship tracking with an ensemble framework
via multi-view learning and wavelet filter. Sensors 2020, 20, 932. [CrossRef]
2. Hu, W.C.; Yang, C.Y.; Huang, D.Y. Robust real-time ship detection and tracking for visual surveillance of cage aquaculture. J. Vis.
Commun. Image Represent. 2011, 22, 543–556. [CrossRef]
3. Wang, X.; Chen, C. Adaptive ship detection in SAR images using variance WIE-based method. Signal Image Video Process. 2016,
10, 1219–1224. [CrossRef]
4. Hwang, J.; Kim, D.; Jung, H.S. An efficient ship detection method for KOMPSAT-5 synthetic aperture radar imagery based on
adaptive filtering approach. Korean J. Remote Sens. 2017, 33, 89–95. [CrossRef]
5. Chang, Y.L.; Anagaw, A.; Chang, L.; Wang, Y.C.; Hsiao, C.Y.; Lee, W.H. Ship Detection Based on YOLOv2 for SAR Imagery. Remote
Sens. 2019, 11, 786. [CrossRef]
6. Shi, Z.; Yu, X.; Jiang, Z.; Li, B. Ship detection in high-resolution optical imagery based on anomaly detector and local shape
feature. IEEE Trans. Geosci. Remote Sens. 2014, 52, 4511–4523.
7. Liu, G.; Zhang, Y.; Zheng, X.; Sun, X.; Fu, K.; Wang, H. A new method on inshore ship detection in high-resolution satellite images
using shape and context information. IEEE Geosci. Remote Sens. Lett. 2014, 11, 617–621. [CrossRef]
8. Nie, T.; He, B.; Bi, G.; Zhang, Y. A Method of Ship Detection under Complex Background. ISPRS Int. J. Geo-Inf. 2017, 6, 159.
[CrossRef]
9. Dong, C.; Liu, J.; Xu, F. Ship detection in optical remote sensing images based on saliency and a rotation-invariant descriptor.
Remote Sens. 2018, 10, 400. [CrossRef]
10. Takeda, H.; Farsiu, S.; Milanfar, P. Kernel regression for image processing and reconstruction. IEEE Trans. Image Process. 2007, 16,
349–366. [CrossRef]
11. Pitas, I.; Venetsanopoulos, A.N. Nonlinear Digital Filters: Principles and Applications; Kluwer: Boston, MA, USA, 1990.
12. Ouahabi, A. Signal and Image Multiresolution Analysis; ISTE-Wiley: London, UK; Hoboken, NJ, USA, 2013.
13. Ouahabi, A. A review of wavelet denoising in medical imaging. In Proceedings of the 8th International Workshop on Systems,
Signal Processing and Their Applications (IEEE/WoSSPA), Algiers, Algeria, 12–15 May 2013; pp. 19–26.
14. Ahmed, S.S.; Messali, Z.; Ouahabi, A.; Trepout, S.; Messaoudi, C.; Marco, S. Nonparametric denoising methods based on
contourlet transform with sharp frequency localization: Application to low exposure time electron microscopy images. Entropy
2015, 17, 3461–3478. [CrossRef]
15. Yang, F.; Xu, Q.; Li, B. Ship detection from optical satellite images based on saliency segmentation and structure-LBP feature.
IEEE Geosci. Remote Sens. Lett. 2017, 14, 602–606. [CrossRef]
16. Xia, Y.; Wan, S.; Yue, L. A novel algorithm for ship detection based on dynamic fusion model of multi-feature and support
vector machine. In Proceedings of the IEEE Sixth International Conference on Image and Graphics (ICIG), Hefei, China,
12–15 August 2011; pp. 521–526.
17. Xu, J.; Sun, X.; Zhang, D.; Fu, K. Automatic detection of inshore ships in high-resolution remote sensing images using robust
invariant generalized Hough transform. IEEE Geosci. Remote Sens. Lett. 2014, 11, 2070–2074.
18. Lowe, D.G. Distinctive image features from scale-invariant keypoints. Int. J. Comput. Vis. 2004, 60, 91–110. [CrossRef]
19. Dalal, N.; Triggs, B. Histograms of oriented gradients for human detection. IEEE Conf. Comput. Vis. Pattern Recognit. 2005, 1,
886–893.
20. Cortes, C.; Vapnik, V. Support-vector networks. Mach. Learn. 1995, 20, 273–297. [CrossRef]
21. Schapire, R.E. Explaining AdaBoost. In Empirical Inference; Springer: Berlin/Heidelberg, Germany, 2013; pp. 37–52.
22. Kim, K.; Hong, S.; Choi, B.; Kim, E. Probabilistic ship detection and classification using deep learning. Appl. Sci. 2018, 8, 936.
[CrossRef]
23. Huang, H.; Sun, D.; Wang, R.; Zhu, C.; Liu, B. Ship target detection based on improved Yolo network. Math. Probl. Eng. 2020,
2020, 6402149. [CrossRef]
24. Li, H.; Deng, L.; Yang, C.; Liu, J.; Gu, Z. Enhanced Yolov3 tiny network for real-time ship detection from visual image. IEEE
Access. 2021, 9, 16692–16706. [CrossRef]
25. Li, Z.; Zhao, L.; Han, X.; Pan, M. Lightweight ship detection methods based on Yolov3 and DenseNet. Math. Probl. Eng. 2020,
2020, 4813183. [CrossRef]
26. Yao, Y.; Jiang, Z.; Zhang, H.; Zhao, D.; Cai, B. Ship detection in optical remote sensing images based on deep convolutional neural
networks. J. Appl. Remote Sens. 2017, 11, 042611. [CrossRef]
27. Lin, H.; Shi, Z.; Zou, Z. Fully convolutional network with task partitioning for inshore ship detection in optical remote sensing
images. IEEE Geosci. Remote Sens. Lett. 2017, 14, 1665–1669. [CrossRef]
28. Yang, X.; Sun, H.; Fu, K.; Yang, J.; Sun, X.; Yan, M.; Guo, Z. Automatic ship detection in remote sensing images from google earth
of complex scenes based on multiscale rotation dense feature pyramid networks. Remote Sens. 2018, 10, 132. [CrossRef]
29. Li, Q.; Mou, L.; Liu, Q.; Wang, Y.; Zhu, X.X. HSF-Net: Multiscale deep feature embedding for ship detection in optical remote
sensing imagery. IEEE Trans. Geosci. Remote Sens. 2018, 56, 7147–7161. [CrossRef]
30. Girshick, R.; Donahue, J.; Darrell, T.; Malik, J. Rich feature hierarchies for accurate object detection and semantic segmentation.
In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Columbus, OH, USA, 24–27 June 2014;
pp. 580–587.
31. Girshick, R. Fast R-CNN. In Proceedings of the 2015 IEEE International Conference on Computer Vision (ICCV), Santiago, Chile,
13–16 December 2015; pp. 1440–1448.
32. Ren, S.; He, K.; Girshick, R.; Sun, J. Faster r-cnn: Towards real-time object detection with region proposal networks. In Proceedings
of the Advances in Neural Information Processing Systems 28 (NIPS 2015), Montreal, QC, Canada, 7–12 December 2015; pp. 91–99.
33. Fan, W.; Zhou, F.; Bai, X.; Tao, M.; Tian, T. Ship detection using deep convolutional neural networks for PolSAR images. Remote
Sens. 2019, 11, 2862. [CrossRef]
34. Yang, X.; Sun, H.; Sun, X.; Yan, M.; Guo, Z.; Fu, K. Position detection and direction prediction for arbitrary-oriented ships via
multitask rotation region convolutional neural network. IEEE Access. 2018, 6, 50839–50849. [CrossRef]
35. Zhang, S.; Wu, R.; Xu, K.; Wang, J.; Sun, W. R-CNN-Based ship detection from high resolution remote sensing imagery. Remote
Sens. 2019, 11, 631. [CrossRef]
36. Dong, Z.; Lin, B. Learning a robust CNN-based rotation insensitive model for ship detection in VHR remote sensing images. Int.
J. Remote Sens. 2020, 41, 3614–3626. [CrossRef]
37. Liu, W.; Anguelov, D.; Erhan, D.; Szegedy, C.; Reed, S.; Fu, C.Y.; Berg, A.C. SSD: Single shot multibox detector. In Proceedings of
the European Conference on Computer Vision, Amsterdam, The Netherlands, 11–14 October 2016; Springer: Amsterdam, The
Netherlands, 2016; pp. 21–37.
38. Redmon, J.; Divvala, S.; Girshick, R.; Farhadi, A. You only look once: Unified, real-time object detection. In Proceedings of the
IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA, 27–30 June 2016; pp. 779–788.
39. Redmon, J.; Farhadi, A. YOLO9000: Better, faster, stronger. In Proceedings of the IEEE Conference on Computer Vision and
Pattern Recognition, Honolulu, HI, USA, 21–26 July 2017; pp. 7263–7271.
40. Redmon, J.; Farhadi, A. Yolov3: An incremental improvement. arXiv 2018, arXiv:1804.02767.
41. Bochkovskiy, A.; Wang, C.Y.; Mark Liao, H.Y. Yolov4: Optimal speed and accuracy of object detection. arXiv 2020,
arXiv:2004.10934.
42. Wang, Y.; Wang, C.; Zhang, H. Combining a single shot multibox detector with transfer learning for ship detection using Sentinel-1
SAR images. Remote Sens. Lett. 2018, 9, 780–788. [CrossRef]
43. Zhang, T.; Zhang, X. High-speed ship detection in SAR images based on a grid convolutional neural network. Remote Sens. 2019,
11, 1206. [CrossRef]
44. Liu, B.; Wang, S.; Zhao, J.; Li, M. Ship tracking and recognition based on Darknet network and YOLOv3 algorithm. J. Comput.
Appl. 2019, 39, 1663–1668.
45. Zhang, Y.; Shu, J.; Hu, L.; Zhou, Q.; Du, Z. A Ship Target Tracking Algorithm Based on Deep Learning and Multiple Features; SPIE:
Bellingham, WA, USA, 2020; Volume 11433.
46. Chang, L.; Chen, Y.T.; Hung, M.H.; Wang, J.H.; Chang, Y.L. Yolov3 based ship detection in visible and infrared images. In
Proceedings of the 2021 IEEE International Geoscience and Remote Sensing Symposium IGARSS, Brussels, Belgium, 11–16
July 2021.
47. Everingham, M.; Van Gool, L.; Williams, C.K.; Winn, J.; Zisserman, A. The pascal visual object classes (VOC) challenge. Int. J.
Comput. Vis. 2010, 88, 303–338. [CrossRef]
48. Lin, T.Y.; Maire, M.; Belongie, S.; Bourdev, L.; Girshick, R.; Hays, J.; Perona, P.; Ramanan, D.; Dollár, P.; Zitnick, C.L. Microsoft
COCO: Common Objects in Context. In Proceedings of the European Conference on Computer Vision (ECCV) 2014, Zurich,
Switzerland, 6–12 September 2014; Springer: Cham, Switzerland, 2014; pp. 740–755.
49. Kanungo, T.; Mount, D.M.; Netanyahu, N.S.; Piatko, C.D.; Silverman, R.; Wu, A.Y. An efficient k-means clustering algorithm:
Analysis and implementation. IEEE Trans. Pattern Anal. Mach. Intell. 2002, 24, 881–892. [CrossRef]
50. He, K.; Zhang, X.; Ren, S.; Sun, J. Spatial Pyramid Pooling in Deep Convolutional Networks for Visual Recognition. IEEE Trans.
Pattern Anal. Mach. Intell. 2015, 37, 1904–1916. [CrossRef]
51. Huang, Z.; Wang, J. DC-SPP-YOLO: Dense connection and spatial pyramid pooling based YOLO for object detection. Inf. Sci.
2020, 522, 241–258. [CrossRef]
52. AlexeyAB. AlexeyAB/Darknet: Yolov3. 2020. Available online: https://github.com/AlexeyAB/darknet (accessed on 10 February 2022).
53. Ruder, S. An overview of gradient descent optimization algorithms. arXiv 2016, arXiv:1609.04747.
54. Tzutalin. Tzutalin/Labelimg. 2018. Available online: https://github.com/tzutalin/labelImg (accessed on 10 February 2022).
55. Li, K.; Huang, Z.; Cheng, Y.C.; Lee, C.H. A maximal figure-of-merit learning approach to maximizing mean average precision
with deep neural network based classifiers. In Proceedings of the 2014 IEEE International Conference on Acoustics, Speech and
Signal Processing (ICASSP), Florence, Italy, 4–9 May 2014; pp. 4503–4507.
56. Tan, M.; Pang, R.; Le, Q.V. Efficientdet: Scalable and efficient object detection. In Proceedings of the IEEE/CVF Conference on
Computer Vision and Pattern Recognition, Seattle, WA, USA, 16–18 June 2020; pp. 10781–10790.
57. He, K.; Zhang, X.; Ren, S.; Sun, J. Deep residual learning for image recognition. arXiv 2015, arXiv:1512.03385.