
Information

Review
Advancements in Deep Learning Techniques for Time Series
Forecasting in Maritime Applications: A Comprehensive Review
Meng Wang 1 , Xinyan Guo 1 , Yanling She 2, *, Yang Zhou 3 , Maohan Liang 4 and Zhong Shuo Chen 1, *

1 School of Intelligent Finance and Business, Xi’an Jiaotong-Liverpool University, Suzhou 215400, China;
[email protected] (M.W.); [email protected] (X.G.)
2 Faculty of International Tourism and Management, City University of Macau, Avenida Padre Tomás Pereira,
Taipa, Macau 999078, China
3 Zhejiang “Eight Eight Strategy” Innovation and Development Research Institute, The Party School of Zhejiang
Provincial Committee of the Communist Party of China, Hangzhou 311121, China; [email protected]
4 Department of Civil and Environmental Engineering, National University of Singapore,
Singapore 117576, Singapore; [email protected]
* Correspondence: [email protected] (Y.S.); [email protected] (Z.S.C.)

Abstract: The maritime industry is integral to global trade and heavily depends on precise forecasting to maintain efficiency, safety, and economic sustainability. Adopting deep learning for predictive analysis has markedly improved operational accuracy, cost efficiency, and decision-making. This technology facilitates advanced time series analysis, vital for optimizing maritime operations. This paper reviews deep learning applications in time series analysis within the maritime industry, focusing on three areas: ship operation-related, port operation-related, and shipping market-related topics. It provides a detailed overview of the existing literature on applications such as ship trajectory prediction, ship fuel consumption prediction, port throughput prediction, and shipping market prediction. The paper comprehensively examines the primary deep learning architectures used for time series forecasting in the maritime industry, categorizing them into four principal types. It systematically analyzes the advantages of deep learning architectures across different application scenarios and explores methodologies for selecting models based on specific requirements. Additionally, it analyzes data sources from the existing literature and suggests future research directions.

Keywords: deep learning; time series forecasting; maritime; shipping market; port operation

Citation: Wang, M.; Guo, X.; She, Y.; Zhou, Y.; Liang, M.; Chen, Z.S. Advancements in Deep Learning Techniques for Time Series Forecasting in Maritime Applications: A Comprehensive Review. Information 2024, 15, 507. https://doi.org/10.3390/info15080507
Academic Editor: Francesco Camastra

Received: 30 July 2024; Revised: 16 August 2024; Accepted: 17 August 2024; Published: 21 August 2024

Copyright: © 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

1. Introduction

The maritime industry is navigating a complex environment, grappling with increasingly stringent trade policies, heightened geopolitical tensions, and evolving globalization patterns [1]. Adding to these challenges, the climate is intricate and the volume of sea traffic is immense, further straining the industry [2]. Nevertheless, maritime trade remains the cornerstone of global commerce, driving economic growth and underpinning the global economy [3,4]. Amid these complexities, the rapid development of satellite communications, large-scale wireless networks, and data science technologies has revolutionized the shipping industry. These advancements have made the comprehensive tracking of global ship movements and trends feasible [5]. The richness and diversity of collected maritime data have significantly enhanced the accuracy and effectiveness of maritime applications, making them indispensable in modern maritime operations [6–8]. Deep learning technologies for time series forecasting in maritime applications have garnered significant interest, particularly in facing these challenges. These technologies offer promising solutions, helping the maritime industry manage safety, optimize operations, and address future challenges more effectively.

However, there are several notable gaps in the existing literature. Existing studies primarily focus on applying and optimizing individual deep learning models, lacking


systematic and comprehensive comparative analysis. For instance, different studies employ
various models, datasets, and experimental setups, making it challenging to conduct
effective horizontal comparisons. This hinders a complete understanding of the strengths
and weaknesses of different deep learning methods in maritime time series analysis and
their appropriate application scenarios. Moreover, the performance variations of different
models in diverse application environments have not been thoroughly explored. While
some studies indicate that certain models excel in specific tasks, these results are often
based on particular datasets and experimental conditions, lacking generalizability. More
comprehensive research is needed to reveal which models have advantages across various
maritime applications and how to select and optimize models based on specific needs.
To overcome these research gaps, this review examines the applications of deep
learning in time series analysis within the maritime industry, focusing on three key areas:
ship operation-related, port operation-related, and shipping market-related topics. For
the purpose of obtaining relevant literature, extensive searches were conducted in major
international databases such as the Web of Science and Scopus to gather bibliometric data
suitable for analysis. Our strategy used specific keywords to identify studies employing
deep learning architectures for time series data in the maritime domain. After screening
and removing duplicates, we collected 89 papers on these topics. Most of the collected
literature, 79%, pertains to ship operations, highlighting the interest in forecasting ship
operation processes using deep learning. This comprehensive data collection underscores
the increasing reliance on advanced computational techniques to enhance maritime research
and operations.
This review explores the diverse applications of deep learning in the maritime do-
main, emphasizing its role in enhancing ship operation-related tasks, port operations, and
shipping market forecasting. Specifically, it delves into applications such as ship trajectory
prediction, anomaly detection, intelligent navigation, and fuel consumption prediction.
These applications leverage the unique advantages of deep learning models such as the
Artificial Neural Network (ANN), Multilayer Perceptron (MLP), Deep Neural Network
(DNN), Long Short-Term Memory (LSTM) network, Temporal Convolutional Network
(TCN), Recurrent Neural Network (RNN), Gated Recurrent Unit (GRU), Convolutional
Neural Network (CNN), Random Vector Functional Link (RVFL) network, and Transformer
models to address specific maritime challenges. Through comparative analysis, we aim
to uncover the performance and suitability of different deep learning models in various
maritime applications, thereby guiding future research and practical applications. Future
research directions emphasize improving data processing and feature extraction, optimiz-
ing deep learning models, and exploring specific maritime applications, including ship
trajectory prediction, port forecasts, ocean environment modeling, fault diagnosis, and
cross-domain applications.
The remainder of this work is organized as follows: Section 2 introduces the literature
review methods. In Section 3, deep learning methods are presented. The applications
of deep learning in time series forecasting within the maritime domain are discussed in
Section 4. An overall analysis is provided in Section 5. Finally, Section 6 concludes the
paper and outlines the limitations of the research.

2. Literature Collection Procedure


To acquire the pertinent literature, major international databases including the Web of
Science and Scopus were utilized to search for bibliometric data suitable for analysis. The
rationale for choosing these specific databases is that they are among the most extensive
academic and scientific research repositories. These databases are widely recognized for
their broad coverage of diverse disciplines. They are considered critical resources for
accessing high-quality, peer-reviewed articles crucial for conducting a thorough literature
review. Figure 1 is a flowchart that displays the overall process from the methods of data
collection through the criteria for filtering to the final selection of references. The search
strategy used for the two search engines is as follows:
Search scope: Titles, Keywords, and Abstracts
Keywords 1: ‘deep’ AND ‘learning’, AND
Keywords 2: ‘time series’, AND
Keywords 3: ‘maritime’, OR
Keywords 4: ‘vessel’, OR
Keywords 5: ‘shipping’, OR
Keywords 6: ‘marine’, OR
Keywords 7: ‘ship’, OR
Keywords 8: ‘port’, OR
Keywords 9: ‘terminal’

Figure 1. The flow chart of data collection. Source: authors.
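Assembled literally, the keyword strategy corresponds to a single boolean query string. The following is a minimal sketch in Python; the TITLE-ABS-KEY field wrapper follows the Scopus advanced-search convention and is shown only as an example (Web of Science uses different field tags):

```python
# Assemble the boolean search string described above. The two mandatory
# concepts are ANDed together, then combined with a disjunction of the
# maritime-domain terms (Keywords 3-9).
mandatory = ['"deep" AND "learning"', '"time series"']
domain_terms = ["maritime", "vessel", "shipping", "marine", "ship", "port", "terminal"]

domain_clause = " OR ".join(f'"{t}"' for t in domain_terms)
query = " AND ".join(mandatory + [f"({domain_clause})"])

# Scopus-style restriction to titles, abstracts, and keywords; the
# TITLE-ABS-KEY field tag is illustrative only.
scopus_query = f"TITLE-ABS-KEY({query})"
print(scopus_query)
```

The same string, minus the field wrapper, maps directly onto the advanced-search syntax of most bibliographic databases.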

The primary function of the first two sets of keywords is to narrow the search scope to studies that use deep learning architectures for predicting time series data. Subsequent keywords are used to narrow the search scope further, all related to the maritime domain. After conducting an initial search using the above strategy and manually removing duplicates between the two databases, 482 papers were collected. Given that the keywords used might include cases of polysemy, a second screening is necessary. The strategy for this second screening is as follows:

• Retain only articles related to maritime operations. For example, studies on ship-surrounding weather and risk prediction based on ship data will be kept, while research solely focused on marine weather or wave prediction that is unrelated to any aspect of maritime operations will be excluded.
• Exclude neural network studies that do not employ deep learning techniques, such as ANN or MLP with only one hidden layer.
• The language of the publications must be English.
• The original data used in the papers must include time series sequences.

After undergoing keyword filtering and manual screening, we retained 89 papers that utilize deep learning architectures and are based on time series data for predictions within the maritime domain.
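The four screening criteria above can be expressed as a simple conjunctive filter. The sketch below uses a hypothetical record schema (the boolean flags are invented for illustration; in practice each criterion was checked manually):

```python
# Illustrative second-screening pass over retrieved records.
# The metadata flags are a hypothetical schema, not from the paper.
papers = [
    {"id": 1, "language": "English", "uses_deep_learning": True,
     "has_time_series": True, "maritime_operations": True},
    {"id": 2, "language": "English", "uses_deep_learning": False,  # e.g. single-hidden-layer MLP
     "has_time_series": True, "maritime_operations": True},
    {"id": 3, "language": "Chinese", "uses_deep_learning": True,
     "has_time_series": True, "maritime_operations": True},
    {"id": 4, "language": "English", "uses_deep_learning": True,
     "has_time_series": True, "maritime_operations": False},  # pure wave forecasting
]

def passes_screening(p):
    # All four criteria must hold simultaneously.
    return (p["maritime_operations"]
            and p["uses_deep_learning"]
            and p["language"] == "English"
            and p["has_time_series"])

retained = [p["id"] for p in papers if passes_screening(p)]
print(retained)  # → [1]
```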

3. Deep Learning Algorithms

3.1. Artificial Neural Network (ANN)

The ANN forms a type of machine learning model that is structured to reflect the human brain’s neural configuration, which is noted for its effective performance [9]. Structurally, a traditional ANN consists of three key layers: the input layer, one or more hidden layers, and the output layer [10]. In this setup, raw data are first gathered in the input layer and then processed through the hidden layers where patterns and features are extracted. The processed information is finally sent to the output layer, which generates the predictive outcomes. The normalization of input and output data, typically within a range of 0 to 1 or −1 to 1, ensures stability and efficiency during training by preventing numerical issues and enhancing convergence rates. The foundational architecture of the ANN shown in Figure 2 consists of three layers: an input layer comprising four neurons (I_1 to I_4), a hidden layer containing four neurons, and an output layer featuring a single neuron (E). Each neuron is interconnected through weights optimized during training, allowing the network to perform tasks such as prediction and classification effectively.

Figure 2. The foundational architecture of the ANN. Source: authors redrawn based on [11].

3.1.1. Multilayer Perceptron (MLP)/Deep Neural Networks (DNN)

The term ANN serves as a broad descriptor for various models, including the MLP, a fundamental archetype. The MLP’s architecture, as previously illustrated, relies on error backpropagation and gradient descent algorithms to optimize network weights [12]. These methods work together to reduce prediction errors in the model’s output [13]. Equation (1) explains the formulation of the MLP’s output.

y_p = φ_o ( ∑_{j=1}^{Z} ω_{jp}^{o} φ_H ( ∑_{i=0}^{Q} ω_{ij}^{H} x_i ) )    (1)

where ω_{ij}^{H} signifies the weights associated with the hidden layer and ω_{jp}^{o} indicates the weights of the output layer. φ_H and φ_o symbolize the activation functions for the hidden and output layers, respectively. Commonly utilized activation functions include ReLU and Sigmoid, with the selection primarily influenced by the particular application context and the type of output required.

Conversely, the Deep Feedforward Network or Deep Neural Network (DNN), the most fundamental model in deep learning, typically features more hidden layers [14]. The increased architectural depth enhances the network’s ability to learn intricate features and patterns from the input data, thereby improving its prediction capabilities.

Within the corpus of literature compiled for this paper, two distinct studies employed the frameworks of ANN and MLP, respectively, to conduct time series forecasting. These neural network architectures were specifically utilized for predicting ship Response Amplitude Operators (RAOs) and ship classification purposes, respectively [15,16].
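Equation (1) can be checked numerically with a minimal forward pass. In this sketch the weights are arbitrary example values, φ_H is a sigmoid, and φ_o is the identity; these choices are assumptions for illustration, not taken from any cited study:

```python
import math

def sigmoid(v):
    # Logistic activation, one common choice for phi_H
    return 1.0 / (1.0 + math.exp(-v))

def mlp_forward(x, w_hidden, w_output):
    """Equation (1): y_p = phi_o( sum_j w^o_jp * phi_H( sum_i w^H_ij * x_i ) ),
    with phi_H = sigmoid and phi_o = identity (typical for regression)."""
    hidden = [sigmoid(sum(w_ij * x_i for w_ij, x_i in zip(col, x)))
              for col in w_hidden]                         # one value per hidden neuron j
    return sum(w_jp * h for w_jp, h in zip(w_output, hidden))  # single output neuron

x = [0.5, -1.0]                       # Q = 2 input features (example values)
w_hidden = [[0.4, 0.1], [-0.3, 0.8]]  # Z = 2 hidden neurons
w_output = [1.0, -2.0]
y = mlp_forward(x, w_hidden, w_output)
print(y)
```

In training, backpropagation would adjust `w_hidden` and `w_output` to reduce the error between `y` and the target, exactly as described in the text.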

3.1.2. WaveNet

Recent advancements in ANN research have led to the development of high-performance models, notably the WaveNet, a generative model based on convolutional neural networks designed for audio waveform generation [17]. This model incorporates several advanced deep learning technologies, including dilated causal convolutions, residual modules, and gated activation units. The hallmark of WaveNet is its dilated convolution structure, which enables the model to effectively learn long-term dependencies within audio data, demonstrating robust learning capabilities. As a probabilistic autoregressive model, WaveNet directly learns the mappings between sequential sample values, revealing its potential across a variety of application scenarios. Figure 3 illustrates the foundational architecture of WaveNet. Furthermore, for a sequence x = {x_1, x_2, . . . , x_{t−1}}, its joint probability p(x) can be represented by Equation (2). WaveNet also utilizes gated activation units, whose mathematical expression is depicted in Equation (3).

p(x) = ∏_{t=1}^{T} p(x_t | x_1, x_2, . . . , x_{t−1})    (2)

z = tanh(W_{f,k} ∗ x) ⊙ σ(W_{g,k} ∗ x)    (3)

where t represents the time step index within the sequence; T denotes the total length of the sequence; and k indicates the layer index. The terms f and g refer to the filter and gate, respectively, while W is a learnable convolution filter. A fundamental component of WaveNet is its use of causal convolutions. By employing causal convolutions, the model ensures that the temporal order of data is preserved during modeling. WaveNet enhances this approach by incorporating dilated causal convolutions. Dilated convolutions skip input values at specified intervals, allowing the convolutional kernel to cover a larger area than its actual size, which improves model performance. Figures 4 and 5 illustrate causal and dilated causal convolutions, respectively. The key difference in the convolution hidden layers is that the dilated layers in Figure 5 use different dilation rates to skip inputs. This allows the model to cover a broader range of data without increasing the number of parameters. In contrast, the hidden layers in Figure 4 process consecutive input data points, focusing on immediate information. This dilation in Figure 5 helps the model capture longer-term dependencies in the data. WaveNet has been demonstrated to be effective in synthesizing natural human speech and predicting time series [18,19].

Figure 3. The foundational architecture of the WaveNet.
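The mechanics of causal and dilated causal convolutions can be made concrete with a small sketch (toy kernel and data, chosen only for illustration). Left-padding by (kernel size − 1) × dilation guarantees that output step t never sees a future input, and increasing the dilation widens the receptive field without adding parameters:

```python
import math

def causal_dilated_conv1d(x, w, dilation=1):
    """1-D causal convolution: output[t] depends only on x[t], x[t-d], x[t-2d], ...
    w[-1] multiplies the current sample; earlier taps reach back in time."""
    k = len(w)
    pad = (k - 1) * dilation                  # left-pad so no future values leak in
    xp = [0.0] * pad + list(x)
    return [sum(w[j] * xp[pad + t - (k - 1 - j) * dilation] for j in range(k))
            for t in range(len(x))]

x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
w = [1.0, 1.0]                                # size-2 kernel: sum of two taps

y1 = causal_dilated_conv1d(x, w, dilation=1)  # y1[t] = x[t-1] + x[t]  (Figure 4 style)
y2 = causal_dilated_conv1d(x, w, dilation=2)  # y2[t] = x[t-2] + x[t]  (Figure 5 style)
print(y1)  # [1.0, 3.0, 5.0, 7.0, 9.0, 11.0, 13.0, 15.0]
print(y2)  # [1.0, 2.0, 4.0, 6.0, 8.0, 10.0, 12.0, 14.0]

# Gated activation unit of Equation (3): z = tanh(f) * sigmoid(g), where f and g
# come from two separate convolutions (arbitrary example filters here).
f = causal_dilated_conv1d(x, [0.1, 0.2], dilation=1)
g = causal_dilated_conv1d(x, [0.3, -0.1], dilation=1)
z = [math.tanh(fi) * (1.0 / (1.0 + math.exp(-gi))) for fi, gi in zip(f, g)]
```

Stacking such layers with dilations 1, 2, 4, 8, … is what gives WaveNet its exponentially growing receptive field.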

Figure 4. Visualization of a stack of causal convolutional layers.

Figure 5. Visualization of a stack of dilated causal convolutional layers.

3.1.3. Randomized Neural Network

Random neural networks, commonly used in shallow neural architectures, initialize the weights and biases of hidden layers randomly and keep them fixed during training, enhancing training efficiency [20]. This approach, integral to the construction of Extreme Learning Machines (ELMs) and the RVFL network, is particularly effective for rapid learning and streamlining model training [21,22].

ELM represents a type of random neural network, particularly emphasizing the use of a single hidden layer, where the weights and biases of the hidden layer remain fixed after initialization. Within the ELM framework, the only parameters that need optimization are the output weights, which are the weights connecting the hidden nodes to the output nodes. These are typically computed efficiently via matrix inversion operations. Figure 6 illustrates a simplified ELM architecture, and β represents the weights associated with the connections in the network.

Figure 6. The simplified architecture of the ELM.
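Because only the output weights β are trained, the entire ELM "training" step collapses to one pseudo-inverse solve. The following NumPy sketch uses toy regression data invented for illustration (not from any cited study):

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy regression target: y = 2*sin(x) on 200 points (illustrative data)
X = np.linspace(-3, 3, 200).reshape(-1, 1)
y = 2.0 * np.sin(X).ravel()

# ELM: random, FIXED hidden layer; only the output weights beta are learned.
n_hidden = 50
W = rng.normal(size=(1, n_hidden))   # random input->hidden weights (never updated)
b = rng.normal(size=n_hidden)        # random hidden biases (never updated)
H = np.tanh(X @ W + b)               # hidden-layer activation matrix

# Output weights via the Moore-Penrose pseudo-inverse: the single
# "matrix inversion operation" mentioned in the text.
beta = np.linalg.pinv(H) @ y

y_hat = H @ beta
rmse = np.sqrt(np.mean((y - y_hat) ** 2))
print(round(rmse, 4))                # small training error, no iterative training
```

No backpropagation or gradient descent appears anywhere, which is exactly the source of ELM's training speed.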
RVFL is an extension of ELM, retaining all core characteristics of ELM while incorporating additional direct connections from the input layer to the output layer. These direct connections include not only the weights linking hidden and output nodes but also direct linkage weights between input and output nodes. Such direct connections in RVFL networks provide additional regularization capabilities, enhancing the model’s learning capacity and generalizability [23]. In the RVFL structure, the output weights, including both direct and indirect connections, can also be computed directly through matrix inversion operations [24]. These enhancements allow RVFL networks to more effectively capture linear and nonlinear relationships in input data, demonstrating superior performance in a wider range of data types and complex tasks. Figure 7 illustrates a simplified RVFL architecture, where the yellow line represents the direct connections from the input layer to the output layer, a distinctive feature of the RVFL network.

Figure 7. The simplified architecture of the RVFL.

While traditional ELM and RVFL are not typically considered deep learning architectures, they have been applied to more complex data structure processing tasks by introducing deep variants, such as deep ELM (DELM) and ensemble deep random vector functional link (edRVFL), showing broad applicability across various application domains. This is aimed at combining the benefits of deep learning, such as the ability to capture more complex data structures, with the efficient training processes of ELM/RVFL [25]. These deep versions are generally achieved by increasing the number of hidden layers, allowing the model to learn deeper data representations. Currently, DELM and edRVFL are widely utilized in image processing or prediction tasks [26–28].
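The shallow RVFL described above differs from ELM only in its design matrix: the raw inputs are stacked next to the random hidden features, realizing the direct input-to-output links, and one pseudo-inverse solve again recovers all output weights. A minimal NumPy sketch on invented toy data:

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy target: a linear trend plus a nonlinearity (illustrative data only)
X = np.linspace(0, 4, 150).reshape(-1, 1)
y = 0.5 * X.ravel() + np.sin(3 * X).ravel()

n_hidden = 30
W = rng.normal(size=(1, n_hidden))   # random fixed input->hidden weights
b = rng.normal(size=n_hidden)        # random fixed hidden biases
H = np.tanh(X @ W + b)

# RVFL: concatenate the random hidden features with the raw inputs.
# The columns of X supply the direct input->output connections.
D = np.hstack([H, X])
beta = np.linalg.pinv(D) @ y         # solves for BOTH sets of output weights at once

y_hat = D @ beta
rmse = np.sqrt(np.mean((y - y_hat) ** 2))
```

The direct columns let the linear trend be fitted exactly by the input weights, leaving the random features to absorb only the nonlinear residual, which is one intuition for the regularizing effect mentioned in the text.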

3.2. Convolutional Neural Network (CNN)
Lecun et al. [29] pioneered the CNN by integrating the backpropagation algorithm with feedforward neural networks, laying the foundation for modern convolutional architectures. Originally designed for image processing, the CNN efficiently addresses complex nonlinear challenges through features like equivalent representation, sparse interaction, and parameter sharing [14]. Although initially used primarily in machine vision and speech recognition, CNN has also begun to demonstrate its potential in time series prediction [30,31]. These networks capture local features in temporal data and use their deep structures to extract time-dependent relationships, providing new perspectives and methods for time series analysis. Figure 8 illustrates the fundamental structure of the CNN.

Figure 8. The foundational architecture of the CNN. Source: authors redrawn based on [11].

CNN has demonstrated superior performance in processing time series data. In time series prediction applications, CNN employs filters to process the input data, capturing local patterns and features within the time series via convolutional operations, such as trends and periodic changes. These filters, utilizing weight sharing and regional connections, reduce the complexity and computational load of the model while enabling the effective identification of key structures in temporal data, such as spikes or drops [32]. The mathematical expression for a convolutional layer is denoted as Equation (4).

x_j^l = φ( ∑_{i∈M_j} x_i^{l−1} ∗ ω_{ij}^l + b_j^l )    (4)

In the equation of a convolutional layer, φ represents the activation function; ω_{ij}^l denotes the weight values of the convolutional kernel; b_j^l refers to the bias associated with the kernel; and M_j indicates the set of feature mappings.
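To make Equation (4) concrete, the sketch below implements a single convolutional layer over 1-D feature maps with NumPy. It is a minimal, loop-based illustration rather than an optimized implementation; the function name, toy dimensions, and random weights are our own choices, with tanh standing in for φ.

```python
import numpy as np

def conv1d_layer(x_prev, kernels, biases, phi=np.tanh):
    """One convolutional layer following Equation (4):
    output map j is phi(sum over input maps i of x_i^{l-1} * w_ij^l + b_j^l),
    where * is a sliding-window ("valid") 1-D convolution over the time axis.

    x_prev  : (n_in, T)        feature maps from layer l-1
    kernels : (n_out, n_in, k) one width-k kernel per (output, input) map pair
    biases  : (n_out,)         one bias per output map
    """
    n_out, n_in, k = kernels.shape
    t_out = x_prev.shape[1] - k + 1
    out = np.empty((n_out, t_out))
    for j in range(n_out):
        acc = np.zeros(t_out)
        for i in range(n_in):              # sum over the set of feature maps M_j
            for t in range(t_out):         # slide the kernel along the series
                acc[t] += x_prev[i, t:t + k] @ kernels[j, i]
        out[j] = phi(acc + biases[j])      # add bias b_j^l and apply phi
    return out

# Toy usage: two input series of length 8, three filters of width 3.
rng = np.random.default_rng(0)
x = rng.normal(size=(2, 8))
w = rng.normal(size=(3, 2, 3))
b = np.zeros(3)
y = conv1d_layer(x, w, b)
print(y.shape)  # (3, 6): each output map has 8 - 3 + 1 = 6 time steps
```

The shrinking time dimension (8 → 6) is the "valid" convolution behavior; real libraries typically offer padding options to preserve length.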
Following convolution, a pooling layer such as max or average pooling reduces the data’s temporal dimension, easing computational demands and helping prevent overfitting. The network’s later stages consist of fully connected layers that integrate and interpret these features, although these often contain many parameters, posing computational challenges. Efficient design adjustments and regularization techniques are critical for enhancing CNN performance in time series prediction tasks, such as financial trend forecasting or weather prediction [33–35]. Through these mechanisms, the CNN utilizes historical data to effectively forecast future trends, showcasing its robust capabilities in time series analysis.

Although CNN has demonstrated robust capabilities in time series analysis tasks, it exhibits certain limitations in addressing long-term dependencies. To mitigate this challenge, the TCN was introduced by Bai et al. [36], emerging as a notable adaptation of CNN specifically engineered for time series data processing. TCN processes time series data exclusively through convolutional layers instead of recurrent layers, thereby effectively capturing long-range dependencies within the time series. The architecture incorporates causal convolutions, dilated convolutions, and residual blocks. These techniques collectively address the challenge of extracting long-term information from series data. Figure 9 illustrates the fundamental architecture of the TCN.

Figure 9. The foundational architecture of the TCN. Source: authors redrawn based on [37].

TCN processes sequence data using a series of one-dimensional convolutional operations, each comprising a convolutional layer and a nonlinear activation function. The kernel size, stride, and dilation rate can be adjusted in each convolutional operation as needed. The technique of dilated convolution enables the kernel to skip over some elements in the sequence, effectively expanding the receptive field without adding more parameters. By stacking multiple layers of such convolutions, the TCN progressively abstracts higher-level features from the sequence data, enhancing the model’s expressive capacity. The equation for operation F, representing dilated convolutions on elements p of a 1-D sequence, is presented in Equation (5).

F(p) = ∑_{i=0}^{l−1} f(i) · x_{p−d·i}    (5)

where d represents the dilation factor; f denotes the filter; l refers to the filter size; and d · i accounts for the direction of the past.
To avoid local optima during training and to enhance performance, TCN employs residual connections between convolutional layers [38]. The fundamental concept of residual connections involves adding the output of a current layer to the output from a previous layer (or several prior layers), effectively creating a “shortcut” connection. This method enhances the model’s efficiency in learning long-range dependencies within the sequence. Residual connections also help to mitigate vanishing and exploding gradients and accelerate model convergence. The residual connections can be expressed in Equation (6).

O = φ(F(x) + x)    (6)
where φ represents the activation function. After all convolutional operations, a TCN typically employs a global average pooling layer to aggregate all features and generate a global feature vector. This vector is inputted into subsequent fully connected layers designed for tasks such as regression and classification.
Finally, TCN employs a loss function to measure the discrepancy between predictions and actual outcomes, using the backpropagation algorithm to update parameters. During training, various optimization algorithms and regularization techniques enhance the model’s generalization capabilities [39].
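The two core TCN building blocks above, the dilated causal convolution of Equation (5) and the residual shortcut of Equation (6), can be sketched in a few lines of NumPy. This is a single-channel toy with hand-chosen filter taps and tanh standing in for φ, not the full TCN of Bai et al. [36]; all names are our own.

```python
import numpy as np

def dilated_causal_conv(x, f, d):
    """Equation (5): F(p) = sum_{i=0}^{l-1} f(i) * x[p - d*i].
    x : (T,) input sequence; f : (l,) filter taps; d : dilation factor.
    Indices reaching before the start (p - d*i < 0) contribute zero,
    which acts as causal zero-padding: no future value is ever used."""
    T, l = len(x), len(f)
    out = np.zeros(T)
    for p in range(T):
        for i in range(l):
            q = p - d * i          # look back d*i steps: the "direction of the past"
            if q >= 0:
                out[p] += f[i] * x[q]
    return out

def residual_block(x, f, d, phi=np.tanh):
    """Equation (6): O = phi(F(x) + x) -- the shortcut adds the input back."""
    return phi(dilated_causal_conv(x, f, d) + x)

x = np.arange(6, dtype=float)          # [0, 1, 2, 3, 4, 5]
f = np.array([1.0, 1.0])               # two taps, dilation 2
print(dilated_causal_conv(x, f, 2))    # x[p] + x[p-2] -> [0. 1. 2. 4. 6. 8.]
```

Increasing d layer by layer (1, 2, 4, …) is what lets a stack of such blocks cover an exponentially growing history with a fixed number of parameters.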
In the literature we collected, it is common to see CNN and TCN used in conjunction with other deep learning architectures. For instance, the CNN-LSTM architecture was utilized for trajectory prediction by Bin Syed and Ahmed [40]; the CNN–GRU–AM architecture was used for predicting ship motion as demonstrated by Li et al. [41]; and the IWOA-TCN-Attention architecture was employed to forecast ship motion attitude, as presented by Zhang et al. [42]. These composite models combine the advantages of various architectures to improve prediction accuracy and performance in the intricate operational environments of the maritime domain.
3.3. Recurrent Neural Network (RNN)
RNN is a specialized class of neural networks designed specifically for managing data sequences of varying lengths [43]. Beyond the standard input and output layers, the core of the RNN consists of one or more hidden layers composed of units with recurrent connections. These connections allow the RNN to capture temporal dependencies, with each hidden layer acting as a “memory state” that retains past information to influence future outputs. The hidden state at each time step is shaped by the current input and the preceding hidden state, facilitating the network’s ability to process and learn from patterns in the sequence. Figure 10 illustrates the fundamental architecture of the RNN.

Figure 10. The foundational architecture of the RNN. Source: authors redrawn based on [44].

x represents the input layer; s denotes the state of the hidden layer; W, U, and V signify the weights; and o stands for the output layer. “Unfold” typically refers to expanding the RNN across the time series into a sequence of feedforward networks. The RNN, with its unique memory capacity and flexibility, demonstrates a decisive advantage in the field of series forecasting [45]. Whether it is predicting traffic flow, forecasting financial and stock market trends, or anticipating weather and climate changes, RNN plays a crucial role in producing accurate predictions [46,47]. The basic equations for the RNN can be represented by Equations (7) and (8).

s_t = φ(U x_t + W s_{t−1} + b_s)    (7)

o_t = V s_t + b_o    (8)

where b_s and b_o represent the biases and φ denotes an activation function. Although the RNN was designed to address long-term dependencies, its performance on long-duration data in experimental settings has not been entirely optimal [48,49].
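Equations (7) and (8) amount to a single loop over the sequence, carrying the hidden state forward. The NumPy sketch below unrolls that loop; the dimensions and random weights are illustrative placeholders, and tanh is assumed for φ.

```python
import numpy as np

def rnn_forward(xs, U, W, V, b_s, b_o, phi=np.tanh):
    """Unrolled RNN forward pass following Equations (7)-(8):
    s_t = phi(U x_t + W s_{t-1} + b_s),  o_t = V s_t + b_o."""
    s = np.zeros(W.shape[0])                 # initial hidden state s_0 = 0
    outputs = []
    for x_t in xs:                           # one recurrence step per element
        s = phi(U @ x_t + W @ s + b_s)       # Equation (7): new hidden state
        outputs.append(V @ s + b_o)          # Equation (8): output at step t
    return np.array(outputs), s

# Toy usage: scalar input, hidden size 4, scalar output, sequence length 5.
rng = np.random.default_rng(1)
U = rng.normal(scale=0.5, size=(4, 1))
W = rng.normal(scale=0.5, size=(4, 4))
V = rng.normal(scale=0.5, size=(1, 4))
xs = rng.normal(size=(5, 1))
outs, s_last = rnn_forward(xs, U, W, V, np.zeros(4), np.zeros(1))
print(outs.shape, s_last.shape)  # (5, 1) (4,)
```

Note that the same U, W, and V are reused at every step; this weight sharing across time is what the “Unfold” view in Figure 10 makes explicit.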
3.3.1. Long Short-Term Memory (LSTM)
The LSTM network, proposed by Hochreiter and Schmidhuber [50,51], was developed to overcome the limitations of short-term memory in the traditional RNN [52]. It features a unique memory cell and an advanced gating mechanism within the RNN framework to enhance functionality. This mechanism includes three key components: an input gate, a forget gate, and an output gate. Together, they regulate the retention, maintenance duration, and retrieval timing of information in the memory cell [53]. Figure 11 illustrates the fundamental structure of the LSTM.

Figure 11. The foundational architecture of the LSTM. Source: authors redrawn based on [11].

h represents the hidden state, while c denotes the long-term memory cell. The connection between h and c is finely controlled by three “gates”: the input gate, the forget gate, and the output gate. In each consecutive time step, the LSTM utilizes the current input and previous state to determine the extent of memory elimination (through the forget gate), the quantity of new information to assimilate into the memory cell (through the input gate), and the amount of information transmission from the memory cell to the current hidden state h (through the output gate). These gates primarily manage the flow of information, utilizing sigmoid and tanh activation functions. When error signals are relayed from the output layer, the LSTM’s memory cell c captures vital error gradients, facilitating prolonged information retention. Compared to RNN, LSTM alleviates the problems of vanishing or exploding gradients to some extent through its gating mechanism, thereby more effectively handling long-term dependencies. The internal details of the LSTM are shown in Figure 12, and Equations (9) through (14) illustrate all the internal components of the LSTM.

f_t = σ(W_f [h_{t−1}, x_t] + b_f)    (9)

i_t = σ(W_i [h_{t−1}, x_t] + b_i)    (10)

g_t = tanh(W_g [h_{t−1}, x_t] + b_g)    (11)

o_t = σ(W_o [h_{t−1}, x_t] + b_o)    (12)

c_t = f_t ⊙ c_{t−1} + i_t ⊙ g_t    (13)

h_t = tanh(c_t) ⊙ o_t    (14)

where f_t represents the forget gate; σ denotes the sigmoid activation function; W denotes the weight matrices that are linked to the inputs for the activation functions within the network; i_t and g_t form the components of the input gate, with i_t employing the sigmoid activation and g_t the hyperbolic tangent (tanh) activation; and o_t represents the output gate.
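A single time step of Equations (9)–(14) translates directly into code. The NumPy sketch below is a minimal cell update with illustrative toy dimensions and random weights; each W acts on the concatenation [h_{t−1}, x_t], as in the equations.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(h_prev, c_prev, x_t, Wf, Wi, Wg, Wo, bf, bi, bg, bo):
    """One LSTM step, Equations (9)-(14)."""
    z = np.concatenate([h_prev, x_t])       # the shared input [h_{t-1}, x_t]
    f = sigmoid(Wf @ z + bf)                # (9)  forget gate
    i = sigmoid(Wi @ z + bi)                # (10) input gate
    g = np.tanh(Wg @ z + bg)                # (11) candidate memory content
    o = sigmoid(Wo @ z + bo)                # (12) output gate
    c = f * c_prev + i * g                  # (13) elementwise (Hadamard) update
    h = np.tanh(c) * o                      # (14) new hidden state
    return h, c

# Toy usage: hidden size 3, input size 2, a length-4 sequence.
rng = np.random.default_rng(2)
Ws = [rng.normal(scale=0.5, size=(3, 5)) for _ in range(4)]
bs = [np.zeros(3) for _ in range(4)]
h, c = np.zeros(3), np.zeros(3)
for x_t in rng.normal(size=(4, 2)):
    h, c = lstm_step(h, c, x_t, *Ws, *bs)
print(h.shape, c.shape)  # (3,) (3,)
```

Because the cell state c is updated additively in (13), gradients can flow along it with far less attenuation than through the repeated matrix products of a plain RNN, which is the mechanism behind the mitigated vanishing-gradient problem noted above.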
Figure 12. The internal details of the LSTM. Source: authors redrawn based on [54].

3.3.2. Gated Recurrent Unit (GRU)
Compared to the LSTM, the GRU neural network was introduced by Cho et al. [55] and serves as a foundational regressor within the family of RNNs. It simplifies the architecture by combining the input and forget gates into a single update gate and merges the hidden and cell states into one unified state, making it more efficient [56]. The update gate in a GRU manages modifications to the neuron’s content, while the reset gate determines how much of the previous state to discard. The internal details of the GRU are illustrated in Figure 13, and its fundamental equations are represented by Equations (15) to (18).

Figure 13. The internal details of the GRU. Source: authors redrawn based on [57].

z_t = σ(W_z [h_{t−1}, x_t] + b_z)    (15)

r_t = σ(W_r [h_{t−1}, x_t] + b_r)    (16)

h̃_t = tanh(W_h [r_t ⊙ h_{t−1}, x_t] + b_h)    (17)

h_t = (1 − z_t) ⊙ h_{t−1} + z_t ⊙ h̃_t    (18)

where h̃_t and h_t represent the candidate vector and output vector, respectively; σ denotes an activation function; W refers to the weight matrices associated with the inputs to the network’s activation functions; z_t is identified as the update gate; and r_t represents the reset gate.
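Equations (15)–(18) likewise reduce to a short per-step update. The NumPy sketch below uses illustrative toy dimensions and random weights; note how Equation (18) blends the old state and the candidate with the update gate rather than maintaining a separate cell state.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def gru_step(h_prev, x_t, Wz, Wr, Wh, bz, br, bh):
    """One GRU step, Equations (15)-(18)."""
    zx = np.concatenate([h_prev, x_t])      # [h_{t-1}, x_t]
    z = sigmoid(Wz @ zx + bz)               # (15) update gate
    r = sigmoid(Wr @ zx + br)               # (16) reset gate
    h_cand = np.tanh(Wh @ np.concatenate([r * h_prev, x_t]) + bh)  # (17)
    return (1 - z) * h_prev + z * h_cand    # (18) convex blend of old and new

# Toy usage: hidden size 3, input size 2, a length-4 sequence.
rng = np.random.default_rng(3)
Wz, Wr, Wh = (rng.normal(scale=0.5, size=(3, 5)) for _ in range(3))
h = np.zeros(3)
for x_t in rng.normal(size=(4, 2)):
    h = gru_step(h, x_t, Wz, Wr, Wh, np.zeros(3), np.zeros(3), np.zeros(3))
print(h.shape)  # (3,)
```

Compared with the LSTM sketch, there are three weight matrices instead of four and no separate c, which is the source of the GRU’s efficiency advantage.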
The papers we have gathered show that the RNN architecture is the most frequently
applied, with LSTM being the most extensively utilized variant. This widespread adoption
underscores the significant recognition that RNN architectures receive in the field of series
prediction, emphasizing their effectiveness in managing the complexities of series data
across various applications. Among these, they are widely used in tasks related to ship
operations, such as trajectory prediction using RNN-LSTM by Pan et al. [58]; the prediction
of Ship Encounter Situation Awareness using LSTM by Ma et al. [59]; and trajectory
prediction using GRU by Suo et al. [60]. This indicates the efficacy of RNN-based models
in enhancing time series predictions within the maritime domain.

3.4. Attention Mechanism (AM)/Transformer


The Attention Mechanism (AM), initially introduced by Bahdanau et al. [61], found
its initial application in the field of natural language processing (NLP). The mechanism
enables humans to quickly filter high-value information from vast data using limited
attentional resources. Although AM has been less commonly applied in the domain of time
series prediction, it is frequently employed for feature extraction and often combined with
models such as RNN for forecasting time series data. However, the Transformer model,
an advancement based on AM, has seen widespread application in time series prediction.
The Transformer architecture, initially presented in the work of Vaswani et al. [62], relies
entirely on attention mechanisms, specifically self-attention layers, to process sequence data,
abandoning the RNN structures commonly used in traditional sequence models. It employs
the attention mechanism to maintain a constant distance between any two positions in the
sequence, eliminating the need for sequential execution and enabling enhanced parallelism.
Figure 14 shows the standard architecture of the Transformer model.
In time series forecasting with the Transformer model, the process starts by converting
input data into embedding vectors, which are then enriched with positional encodings to
add temporal context. The data pass through several layers of self-attention mechanisms
and fully connected layers, with each layer fortified by residual connections and layer
normalization to ensure stable training. The Transformer excels at analyzing feature
interrelationships across various time intervals due to its parallel processing capability,
crucial for modeling long-term dependencies. The multi-head attention mechanism splits
the input into subspaces, allowing for focused analysis of distinct patterns, which enhances
feature extraction. During the decoding phase, the model uses masked self-attention to
base predictions solely on previously known data, finishing with predictions generated by a
fully connected layer followed by a Softmax layer. This architecture makes the Transformer
particularly effective for complex time series forecasting involving long-term dependencies.
Equations (19)–(21) demonstrate the self-attention mechanism applied to the Query (Q), Key (K), and Value (V) components.

A(Q, K, V) = softmax(QK^T / √d_k) V    (19)

Multi(Q, K, V) = Concat(H_1, . . . , H_i) W^O    (20)

where:

H_i = A(W_i^Q Q, W_i^K K, W_i^V V)    (21)

In Equation (19), Q = W^Q x, K = W^K x, V = W^V x, and the W terms are the corresponding weight matrices.
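The scaled dot-product attention of Equation (19), and its multi-head extension in Equations (20) and (21), can be sketched compactly in NumPy. This toy uses our own dimensions and random projection matrices, applies the projections as x @ W (a common convention equivalent to the W x form above up to transposition), and omits the final W^O output projection for brevity.

```python
import numpy as np

def softmax(z, axis=-1):
    e = np.exp(z - z.max(axis=axis, keepdims=True))   # stabilized softmax
    return e / e.sum(axis=axis, keepdims=True)

def attention(Q, K, V):
    """Equation (19): A(Q, K, V) = softmax(Q K^T / sqrt(d_k)) V."""
    d_k = K.shape[-1]
    weights = softmax(Q @ K.T / np.sqrt(d_k))         # (T_q, T_k); rows sum to 1
    return weights @ V, weights

# Toy usage: a sequence of 4 positions, model width 6, per-head width 3.
rng = np.random.default_rng(4)
x = rng.normal(size=(4, 6))
heads = []
for _ in range(2):                                    # two heads, Equations (20)-(21)
    WQ, WK, WV = (rng.normal(scale=0.5, size=(6, 3)) for _ in range(3))
    out, w = attention(x @ WQ, x @ WK, x @ WV)        # H_i = A(...)
    heads.append(out)
multi = np.concatenate(heads, axis=-1)                # Concat(H_1, H_2); W^O omitted
print(multi.shape)  # (4, 6)
```

Every position attends to every other in one matrix product, which is why the distance between any two time steps is constant and the whole computation parallelizes, unlike a recurrent pass.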


Multi-head Attention is an element of attention mechanisms that enables the model to process information from multiple representational subspaces concurrently. This functionality is detailed in Equation (20).

Figure 14. The architecture of the Transformer. Source: authors redrawn based on [63].
In the existing literature, the AM is often used with other models, such as in the GRU-AM for trajectory predictions by Zhang et al. [64]. The Transformer model is particularly prevalent in prediction applications, with variations such as the iTransformer being employed for trajectory predictions by Zhang et al. [65], and hybrid models that combine Transformer and LSTM also being used for similar trajectory forecasting tasks by Jiang et al. [66].

3.5. Overview of Algorithms Usage
Excluding review and comparative studies from the collected papers, Figure 15 illustrates the proportionate usage of primary models in the documented research. In this analytical review encompassing 89 papers, it is conspicuous that the LSTM model is prevalently utilized, appearing in over 40% of the examined studies. This predominance underscores the model’s robustness in processing sequential data, affirming its continued relevance and centrality in the literature. While deep learning architectures are often used in combination to leverage their respective strengths, many of the collected papers exclusively utilized LSTM for predicting time series data, likely due to its exceptional capability in handling long-duration sequences [59,67–72]. Concurrently, the CNN model is also featured, employed in approximately 23% of the papers. Unlike LSTM, RNN, and GRU, the CNN model performs excellently in extracting features from image data, making it particularly effective in this domain [73–76]. The AM and the Attention Mechanism-based Transformer are seldom used as standalone predictive models; typically, they are integrated with a conventional neural network model. Their advantage lies in employing a unique mechanism such as a querying process, which facilitates the allocation of weights across spatial dimensions and temporal steps, thereby emphasizing crucial information [37,42,66,77–81]. Additionally, other models typically do not appear independently in tasks involving time series prediction; they are usually employed in combinations, such as using certain models for feature extraction and others for prediction. A CNN is frequently used for both feature extraction and prediction tasks [54,82–84]. A GRU plays a significant role in prediction tasks [60,64,82]. Notably, scholars often integrate various deep learning models, for instance, by weighting to aggregate the results from each model [85]. Collectively, these findings delineate prevailing technological inclinations within deep learning research and illuminate prospective trajectories for scholarly inquiry.


Figure 15. Percentages of the deep learning algorithms used in the collected literature.

4. Time Series Forecasting in Maritime Applications
4.1. Ship Operation-Related Applications
This section explores various applications of deep learning in ship operations, including enhancing navigation safety, ship anomaly detection, intelligent navigation practices, meteorological predictions, and fuel consumption forecasting. It highlights the use of models such as LSTM, CNN, and Transformer to improve prediction accuracy and operational efficiency. Additionally, applications in ship type classification, traffic prediction, and collision risk assessment are discussed, demonstrating the broad impact of deep learning on ship operations.
4.1.1. Ship Trajectory Prediction
A. Navigation Safety Enhancement
Accurate trajectory prediction is essential for maritime safety and navigation optimization. It involves estimating vessels’ future positions and states, such as longitude, latitude, heading, and speed. Accurate predictions are vital for enhancing maritime traffic safety,
optimizing navigation management, and aiding emergency response [86]. Li et al. [86]
introduced an innovative method based on AIS data, data encoding transformation, the
Attribute Correlation Attention (ACoAtt) module, and the LSTM network. This approach
significantly enhances prediction accuracy and stability by capturing dynamic interactions
among ship attributes. The experimental results demonstrate that this method outper-
forms traditional models. Similarly, Yu et al. [87] introduced an improved LSTM model
incorporating the ACoAtt module, significantly improving trajectory prediction accuracy.
This model utilizes a multi-head attention mechanism to capture intricate relationships
among ship speed, direction, and position, demonstrating excellent performance in han-
dling time series data. The experimental results indicate that the improved LSTM model
offers significant advantages over standard LSTM models regarding multi-step prediction
accuracy and stability. It has broad application prospects in maritime search and rescue
and navigation safety.
Enhancing ship trajectory prediction in complex environments is essential for effective
maritime operations. Xia et al. [88] proposed a specially designed pre-training model to
enhance ship trajectory prediction performance in complex environments. This model uses
a single-decoder architecture, pre-training on complete trajectory data, and fine-tuning
for specific prediction scenarios. It achieves high-precision predictions and demonstrates
broad adaptability. This method directly inputs trajectory data segments and immediately
initiates prediction, which is suitable for various downstream tasks and significantly
enhances model practicality. Cheng et al. [89] proposed a cross-scale SSE model based
on class-imbalanced vessel movement data to address data imbalance issues. This model
includes multi-scale feature learning modules, cross-scale feature learning modules, and
prototype classifier modules, aiming to directly learn imbalanced vessel movement data.
Compared to traditional softmax classifiers, the distance-based classifier significantly
improves classification performance. This model demonstrates superior scalability and
generality on multiple public datasets and excels in vessel movement data. Traditional
prediction methods are often limited by the irregularity of AIS data and the neglect of
intrinsic vessel information. To address these issues, Xia et al. [90] proposed a divide-and-
conquer model architecture incorporating vessel-specific data, TABTformer (Turtle and
Bunny-TCN-former), to reduce computational complexity and improve accuracy.
Improving the accuracy of ship trajectory prediction is highly significant. Pan et al. [58]
aimed to predict ship arrival times using AIS data to enhance container terminal opera-
tions. They employed a transitive closure method based on equivalence relations for the
fuzzy clustering of routes, achieving optimal route matching and constructing navigation
trajectory features. The proposed RNN-LSTM model combines deep learning with time
series analysis, demonstrating excellent predictive performance and providing technologi-
cal support for maritime intelligent transportation. Bin Syed and Ahmed [40] proposed
a 1D CNN-LSTM architecture for trajectory association. This architecture combines the
advantages of the CNN and the LSTM network, simulating AIS datasets’ spatial and tem-
poral features. By capturing spatial and temporal correlations in the data, this integrated
model surpasses other deep learning architectures in accuracy for trajectory prediction.
Experimental results demonstrate that the CNN-LSTM framework excels in processing
ship trajectory data, notably enhancing predictive performance.
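The RNN-LSTM and CNN-LSTM pipelines above share one supervised framing: slide a fixed-length window over a vessel's AIS fixes and learn to predict the next fix. A minimal NumPy sketch of that windowing step (the feature order and window length are illustrative assumptions, not taken from the cited papers):

```python
import numpy as np

def make_windows(track: np.ndarray, window: int, horizon: int = 1):
    """Slice one vessel track of shape (T, F) into supervised pairs.

    X: (N, window, F) blocks of consecutive AIS fixes (model input).
    y: (N, F) the fix `horizon` steps after each block (prediction target).
    """
    X, y = [], []
    for t in range(len(track) - window - horizon + 1):
        X.append(track[t : t + window])
        y.append(track[t + window + horizon - 1])
    return np.stack(X), np.stack(y)

# Toy track: 10 fixes of (latitude, longitude, SOG, COG).
track = np.column_stack([
    np.linspace(30.0, 30.9, 10),    # latitude drifting north
    np.linspace(122.0, 122.9, 10),  # longitude drifting east
    np.full(10, 12.5),              # speed over ground, knots
    np.full(10, 45.0),              # course over ground, degrees
])
X, y = make_windows(track, window=4)
print(X.shape, y.shape)  # (6, 4, 4) (6, 4)
```

A CNN layer would consume each (window, F) block to extract local spatiotemporal patterns, while an LSTM consumes the fixes in order; the same (X, y) pairs serve either architecture.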
B. Ship Anomaly Detection
Improving prediction accuracy and reducing training time are essential for effective
ship anomaly detection. Violos et al. [67] proposed an ANN architecture combined with
LSTM layers, utilizing genetic algorithms and transfer learning methods to predict the
future locations of moving objects. Genetic algorithms conduct intelligent searches within
the hypothesis space to approximate optimal ANN architectures, while transfer learning
utilizes pre-trained ANN models to expedite the training process. Evaluations based on
real data from various ship trajectories indicate that this model outperforms other advanced
methods in prediction accuracy and significantly reduces training time.
Predicting ship trajectories is crucial for vessel traffic service (VTS) supervision and
accident warning. Ran et al. [91] utilized AIS data to transform navigational dynamics
into time series, extracting navigation trajectory features and employing an attention
mechanism-based LSTM network for training and testing. The results indicate that this
model can effectively and accurately predict ship trajectories, providing valuable references
for VTS supervision and offering high practical application value in accident warnings.
This method presents a novel approach to ship navigation prediction, avoiding the need
to establish complex ship motion models. Yang et al. [92] proposed a ship trajectory
prediction method that combines data denoising and deep learning techniques. Their
process involves trajectory separation, data denoising, and standardization. By denoising
the original AIS data, the complexity and computational time of the input prediction
model are reduced, thereby improving prediction accuracy. Experimental results show
that this method outperforms ETS, ARIMA, SIR, RNN, and LSTM models in prediction
accuracy and computational efficiency, significantly reducing errors in ship trajectory
prediction. This contributes to improving maritime traffic efficiency and safety, preventing
maritime accidents.
Efficient maritime traffic monitoring and collision prevention are critical for ensuring
safe navigation. Sadeghi and Matwin [93] investigated anomaly detection in maritime
time series data using an unsupervised method based on autoencoder (AE) reconstruction
errors. They could identify varying degrees of anomalies by estimating the probability
density function of reconstruction errors. The results demonstrated that this method ef-
fectively pinpointed irregular patterns in ship movement trajectories, providing reliable
support for maritime traffic monitoring and collision prevention. Ma et al. [79] proposed
an AIS data-driven method to predict ship encounter risks by modeling inter-ship behavior
patterns. This method extracts multidimensional features from AIS tracking data to capture
spatial dependencies between encountering ships. It uses a sliding window technique to
create behavioral feature sequences, viewing future collision risk levels as sequence labels.
Utilizing the powerful temporal dependency and complex pattern modeling capabilities
of the LSTM network, they established a relationship between inter-ship behavior and future
collision risk. This method efficiently and accurately predicts future collision risks, particu-
larly excelling when potential risks are high, making it promising for early prediction of
ship encounters and enhancing navigation safety.
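The autoencoder detector of Sadeghi and Matwin [93] ultimately reduces to a decision rule on reconstruction error: errors that are improbable under the error distribution of normal traffic are flagged. A sketch with synthetic errors standing in for autoencoder outputs (the percentile cut-off is an illustrative simplification of their density estimate):

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-in reconstruction errors: an AE trained on normal traffic yields
# small errors on normal tracks and larger errors on unusual manoeuvres.
normal_errors = rng.normal(loc=0.05, scale=0.01, size=1000)

# Flag new tracks whose error exceeds the 99th percentile of normal errors.
threshold = np.quantile(normal_errors, 0.99)
new_errors = np.array([0.05, 0.06, 0.20])   # third track is anomalous
flags = new_errors > threshold
print(flags)  # [False False  True]
```

Estimating the full probability density of the errors, as in the cited study, lets the operator grade anomalies by severity rather than issuing a binary flag.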
Detecting ship behavior patterns and monitoring anomalies is essential for maritime
safety. Perumal et al. [94] proposed a novel hybrid model combining the CNN and
the LSTM network. By modifying the residual CNN network and integrating it with
LSTM/GRU, they introduced a new activation function to overcome the dead neuron
propagation issue of rectified linear unit (ReLU) activation functions, improving model
accuracy by over 1%. Furthermore, Hoque and Sharma [68] presented a solution for real-
time monitoring of ship anomalous behavior. They used spectral clustering and outlier
detection methods on historical AIS data to discern ship types and typical routes. LSTM
networks were then used for trajectory prediction and ship engine behavior to determine if
the current route was abnormal. This method applies the LSTM network for trajectory pre-
diction and multivariate time series anomaly detection, displaying real-time anomaly state
classification and considering route changes due to emergencies. It effectively monitors
and identifies ship anomalous behavior, enhancing maritime safety.
Accurate ship fault detection and prevention are vital for maintaining operational
efficiency and safety. Ji et al. [80] proposed a hybrid neural network (HNN) model (CNN-
BiLSTM-Attention) for predicting a ship’s diesel engine exhaust gas temperature (EGT)
based on deep learning. This model demonstrates significant advantages in prediction
accuracy, capable of extracting comprehensive spatial and temporal features of a ship’s
diesel engine EGT, aiding in improving engine performance, preventing faults, reducing
maintenance costs, and controlling emissions to protect the environment. Liu et al. [81]
constructed a ship diesel engine exhaust temperature prediction model combining the
feature extraction capabilities of the attention mechanism with the time series memory
capabilities of LSTM. To boost prediction accuracy, they applied an enhanced particle
swarm optimization algorithm to optimize the model’s structural parameters. This model
accurately predicts developing faults, offering
a novel approach for ship maintenance, helping to improve diesel engine performance,
prevent faults, reduce maintenance costs, and control emissions. Furthermore, to ensure
the implementation of self-healing strategies following faults in ship electric propulsion
systems, Xie et al. [95] proposed a Res-BiLSTM deep learning algorithm combining residual
networks (ResNet) and bidirectional long short-term memory (BiLSTM) for fault diagnosis.
This approach uses residual connections in CNN to extract feature information from fault
data and BiLSTM networks to identify periodic fault features, providing robust fault
diagnosis for ship electric propulsion systems.
Anticipating and predicting maritime accidents is fundamental to enhancing nav-
igational safety. Choi [70] developed an LSTM network model for maritime accident
prediction, utilizing time series data to forecast the occurrence of maritime accidents based
on sailor watch times. By employing AIS data and the LSTM network to detect ship be-
havior patterns, the model predicts whether routes are normal or abnormal, helping to
anticipate voyage trajectories before reaching the destination. The combination of classifi-
cation methods and time series prediction aids in the early identification of ship behavior,
facilitating timely contact with authorities and improving maritime safety. Han et al. [96]
proposed an LSTM-based model to predict maritime accident frequency. They developed
four different LSTM models using maritime statistical data and compared them with the
Autoregressive Integrated Moving Average (ARIMA) model. The results indicated that
LSTM models outperform ARIMA models in prediction accuracy, demonstrating the ad-
vantage of LSTM in predicting maritime accident frequency based on watch times. This
approach helps in the early detection of propeller faults, enhancing the safety and reliability
of ship operations.
C. Intelligent Navigation Practice
Deep learning methods based on historical ship AIS trajectory data are pivotal for
intelligent navigation practice. Jiang et al. [66] proposed a deep learning method that com-
bines LSTM and Transformer frameworks (TRFM-LS) to predict trajectory time series. The
LSTM module captures temporal features, while the self-attention mechanism overcomes
LSTM’s limitations in capturing long-distance sequence information. This method achieves
high-precision predictions by filtering and smoothing data anomalies through a time win-
dow. It significantly reduces errors, providing early warning references for autonomous
navigation and collision avoidance in intelligent shipping. In addition, Zhang et al. [64]
proposed a high-frequency radar ship trajectory prediction method that integrates Gated
Recurrent Unit and Attention Mechanisms (GRU-AMs) with Autoregressive (AR) models.
High-frequency radar data are input into the CNN using various window lengths to extract
and integrate multi-scale features. GRU and temporal attention mechanisms are then used
to learn time series features and assign weights. With the AR model for linear and nonlinear
predictions, forward and backward computations are performed, and precise predictions
are ultimately obtained through the entropy method for weighting.
4.1.2. Meteorological Factor Prediction
Deep learning methods are increasingly used to estimate sea-state conditions. Se-
limovic et al. [78] proposed an attention mechanism-based deep learning model (AT-NN)
to estimate critical sea-state characteristics. By evaluating each sea-state parameter’s per-
formance, they confirmed the AT-NN model’s suitability for estimating crucial sea-state
parameters. Cheng et al. [97] introduced a novel deep neural network model, SeaStateNet,
designed to estimate sea state using dynamic positioning vessel movement data. SeaSt-
ateNet comprises LSTM, CNN, and Fast Fourier Transform (FFT) modules, which respec-
tively capture long-term dependencies and extract time-invariant features and frequency
features. Benchmarking and experimental results validated the effectiveness of SeaStateNet
for sea-state estimation. Sensitivity analysis assessed the impact of data preprocessing, and
real-time testing further demonstrated the model’s practicality.
Combining various neural networks and feature extraction techniques enhances sea-
state estimation accuracy. Wang et al. [75] proposed a deep neural network model named
DynamicSSE, which autonomously learns the dynamic correlations between sensors at
different locations, generating more valuable predictive features. DynamicSSE includes
feature extraction and dynamic graph structure construction modules, effectively capturing
graph structures and dependencies over long and short periods by combining a CNN
and the LSTM network. Their experimental results indicated that DynamicSSE outper-
formed baseline methods on two vessel motion datasets and demonstrated its practicality
and effectiveness through real-world testing. Additionally, Cheng et al. [74] introduced
a deep learning sea-state estimation model (SpectralNet) based on spectrograms. Com-
pared to methods directly applied to raw time series data, SpectralNet achieved higher
classification accuracy. This approach significantly enhances the interpretability and ac-
curacy of deep-learning-based sea-state estimation models by transforming time–domain
data into time–frequency spectrograms using vessel movement data from a commercial
simulation platform.
4.1.3. Ship Fuel Consumption Prediction
Recent advancements in deep learning have significantly improved the accuracy of
predicting fuel consumption and operational parameters for ship engines. Ilias et al. [77]
proposed a multitask learning (MTL) framework based on Transformer for predicting fuel
oil consumption (FOC) of both main and auxiliary engines. The study utilized a single-task
learning (STL) model composed of bidirectional long short-term memory networks (BiL-
STM) and multi-head self-attention. The MTL setup simultaneously predicted the FOC of
main and auxiliary engines, introducing a regularization loss function to improve accuracy,
which effectively reduced fuel consumption and operational costs [98]. Lei et al. [98] aimed
to improve the accuracy of predicting ship engine speed and fuel consumption using deep
learning algorithms. They chose the LSTM algorithm with time parameters to build a
neural network model, focusing on inland ships’ dynamic time series characteristics. A
comparison between the LSTM model and conventional machine learning approaches
demonstrated that deep learning significantly outperformed traditional methods in pre-
dicting engine speed and fuel consumption for inland vessels. In summary, these studies
highlight the superior performance of deep learning techniques, such as Transformer-based
MTL and LSTM, in accurately predicting fuel consumption and operational parameters for
both main and auxiliary ship engines, demonstrating their potential to reduce fuel usage
and associated costs.
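The MTL idea in [77], one network jointly penalised on both engines' fuel-oil consumption, can be sketched as a combined loss. The weighted sum below is an illustrative stand-in, not the paper's exact regularization loss:

```python
import numpy as np

def mse(y, yhat):
    return float(np.mean((np.asarray(y) - np.asarray(yhat)) ** 2))

def multitask_loss(y_main, p_main, y_aux, p_aux, lam=0.5):
    # Joint objective: main-engine FOC error plus a weighted
    # auxiliary-engine FOC error; `lam` trades off the two tasks.
    return mse(y_main, p_main) + lam * mse(y_aux, p_aux)

# Toy FOC targets and predictions (tonnes per hour) for the two engines.
loss = multitask_loss([10.0, 12.0], [11.0, 12.0], [3.0, 4.0], [3.0, 5.0])
print(loss)  # 0.75
```

Training on a joint objective forces the shared layers to learn features useful for both engines, which is where the reported accuracy gain over single-task learning comes from.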
4.1.4. Others
These are other studies of deep learning work conducted in ship operation-related
applications. Ljunggren [99] demonstrated that deep neural networks could learn specific
ship features and outperform the 1-Nearest Neighbor (1NN) baseline method in ship-type
classification, mainly when the classification confidence was highest for 50% of the data.
Additionally, the study proposed an improved short-term traffic prediction model using
an LSTM-based RNN, chosen for its ability to remember long-term historical input data and
automatically determine the optimal time lag. The experimental results showed that this
model outperformed existing models in accuracy. Zhang et al. [84] combined CNN and
LSTM to construct a data-driven neural network model for forecasting the roll motion of
unmanned surface vehicles (USVs). CNN was employed to extract spatial correlations
and local time series features from USV sensor data. At the same time, the LSTM layer
captured the long-term motion processes of the USV and predicted the roll motion for
the next moment. A fully connected layer decoded the LSTM output and calculated the
final prediction results. Furthermore, Ma et al. [79] proposed a data-driven method for
predicting early collision risk by analyzing encountering vessels’ spatiotemporal motion
behavior. They developed a novel deep learning architecture combining bidirectional
LSTM (BiLSTM) and attention mechanisms to capture the spatiotemporal dependencies
of behavior and their impact on future risks, linking vessel motion behavior to future risk
levels and categorizing the behavior into corresponding risk grades.
4.2. Port Operation-Related Applications
The accurate prediction of container throughput is crucial for optimizing port oper-
ations and making strategic decisions. Kulshrestha et al. [100] introduced a novel deep
learning-based multivariate framework for precise container throughput prediction in
challenging environments. This framework integrates key economic indicators such as
GDP and port tonnage, performing rigorous importance analysis on four initial variables,
including imports and exports. The model highlights its importance for port authorities in
operational and tactical decisions like equipment scheduling, terminal management, route
optimization, and strategic decisions such as port design, construction, and expansion.
Shankar et al. [101] employed the LSTM network to forecast container throughput and
compared their performance with traditional time series methods. They evaluated multiple
common forecasting methods, including ARIMA, Neural Networks (NNs), and Exponen-
tial Smoothing (ES), using four error metrics (relative mean error (RME), relative absolute
error (RAE), root mean square error (RMSE), and mean absolute percentage error (MAPE)).
The relative errors are calculated by dividing the errors by those from a benchmark method,
such as the naive approach. This process improves the interpretability of the forecasting
methods. LSTM demonstrated superior performance over other benchmark methods in
all forecasting characteristics, as confirmed through the Diebold–Mariano (DM) test. Ad-
ditionally, Yang and Chang [83] proposed a hybrid precision neural network architecture
for container throughput prediction, combining the strengths of the CNN and the LSTM
network. This architecture utilizes a CNN to learn feature strength and LSTM to recognize
key internal representations of time series. The results showed that the hybrid precision
neural network architecture surpassed traditional machine learning methods like adaptive
boosting, random forest regression, and support vector regression. Lee et al. [102] aimed
to enhance container throughput prediction models by incorporating external variables
and time series decomposition methods. The study proposed a novel deep learning model
combining time series decomposition, external variables, and multivariate LSTM. The
results demonstrated that this model outperformed traditional LSTM models and could
simultaneously track trends.
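The error metrics used by Shankar et al. [101] are straightforward to reproduce. The sketch below computes RMSE, MAPE, and a benchmark-relative error on invented throughput figures (the numbers are illustrative, not from the study):

```python
import numpy as np

def rmse(y, yhat):
    return float(np.sqrt(np.mean((y - yhat) ** 2)))

def mape(y, yhat):
    return float(np.mean(np.abs((y - yhat) / y)) * 100)

# Invented monthly container throughput (thousand TEU).
y     = np.array([100.0, 110.0, 120.0, 130.0])  # actuals
model = np.array([102.0, 108.0, 123.0, 128.0])  # candidate forecast
naive = np.array([ 95.0, 100.0, 110.0, 120.0])  # benchmark: previous month

print(round(rmse(y, model), 3), round(mape(y, model), 2))  # 2.291 1.96
# Relative error: candidate error divided by benchmark error;
# values below 1 mean the model beats the naive forecast.
print(round(rmse(y, model) / rmse(y, naive), 3))  # 0.254
```

Dividing by the benchmark error is what makes the relative metrics (RME, RAE) comparable across series of different scales.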
Predicting port resilience and productivity is critical for managing disruptions and
maintaining high performance. Cuong et al. [103] explored data analysis methods for
analyzing port resilience and proposed a new paradigm for productivity prediction using a
hybrid deep learning approach. By employing nonlinear time series analysis and statistical
methods, the study assessed the resilience characteristics of ports, helping stakeholders
gain insight into the resilience and productivity of maritime logistics in disruptive market
environments. Data analysis techniques provide management insights to decision-makers,
ensuring that ports can maintain high performance during interruptions.
4.3. Shipping Market-Related Applications
Advanced time series prediction methods are crucial for accurate market forecasting.
Song and Chen [104] introduced an advanced Echo State Network (ESN) variant, the
Enhanced Dual Projection ESN (edDPESN), specifically designed for complex time series
prediction. Unlike traditional ESN, edDPESN trains the linear readout layer uniquely
within compressed dimensions derived from the low-dimensional space of the original
reservoir. This approach combines deep representation and ensemble learning strategies to
improve model generalization. Empirical evaluations showed that edDPESN outperformed
traditional ESN models in time series prediction.
In the realm of liner trade and logistics management, Li et al. [105] developed a
Deep Reinforcement Learning (RL)-based framework for the dynamic liner trade pricing
problem (DeepDLP). This framework aims to determine pricing strategies that adapt to
the dynamic nature of liner trade, characterized by large volumes, low costs, and high
capacity. DeepDLP features a dynamic price prediction module and an RL-based pricing
module, built by analyzing and studying data from real shipping scenarios. Experimental
simulations demonstrated that DeepDLP effectively improves liner trade revenue, veri-
fying its efficiency and effectiveness. Additionally, Alqatawna et al. [106] utilized time
series analysis techniques to predict resource demands for logistics companies, optimizing
order volume forecasts and determining staffing requirements. The study applied methods
like Autoregressive (AR), Autoregressive Integrated Moving Average (ARIMA), and Sea-
sonal Autoregressive Integrated Moving-Average with Exogenous Factors (SARIMAX) to
capture trends and seasonal components, providing interpretable results. This approach
helps companies accurately estimate resources needed for packaging based on predicted
order volumes.
Lim et al. [107] proposed an LSTM-based time series model to predict future values
of various liquid cargo transports in ship transport and port management. Traditional
models like ARIMA and Vector Autoregression (VAR) often fail to consider the linear
dependencies between different types of liquid cargo transport values. The proposed LSTM
model incorporates techniques for handling missing values, considers both short-term
and long-term dependencies, and uses auxiliary variables such as the inflation rate, the
USD exchange rate, the GDP value, and international oil prices to improve prediction
accuracy. Cheng et al. [108] introduced an integrated edRVFL algorithm for predicting ship
order dynamics. The algorithm enhances prediction performance and generalization by
combining deep feature extraction and ensemble learning with a minimal embedding strat-
egy. Accurate ship order predictions are critical for strategic planning in dynamic market
environments, and the edRVFL model demonstrated improved forecasting capabilities and
flexibility across different regions. Mo et al. [54] proposed a novel Convolutional Recurrent
Neural Network for time charter rate prediction under multivariate conditions, specifically
targeting monthly time charter rates for tankers and bulk carriers. The model showed
significant advantages in reflecting dynamic changes in the shipping market.
For ship market index prediction, Kamal et al. [85] proposed an integrated deep
learning method for short-term and long-term Baltic Dry Index (BDI) forecasting, assisting
stakeholders and shipowners in making informed business decisions and mitigating market
risks. The study employed Recurrent Neural Network models (RNN, LSTM, and GRU)
for BDI prediction and advanced sequential deep learning models for one-step and multi-
step forecasting. These models were combined into a Deep Ensemble Recurrent Network
(DERN) to enhance prediction accuracy. Their experimental results indicated that DERN
outperformed traditional methods such as ARIMA, MLP, RNN, LSTM, and GRU in both
short-term and long-term BDI predictions.
4.4. Overview of Time Series Forecasting in Maritime Applications
This section provides a concise overview of recent advancements in deep learning
for time series forecasting in maritime applications, focusing on ship trajectory prediction,
ship anomaly detection, and other practical uses. By leveraging diverse models such as
ANN, LSTM, TCN, and Transformer, researchers have significantly enhanced prediction
accuracy and reliability in complex maritime environments. The section covers applications
including ship trajectory prediction, weather forecasting, fuel consumption estimation,
port throughput forecasting, and market order prediction. The integration of advanced
techniques like attention mechanisms, ensemble learning, and reinforcement learning
has optimized maritime operations, addressing the data imbalance and computational
complexity challenges. Additionally, the importance of accurate data handling, model
optimization, and long-term and cross-domain applications is emphasized. The reviewed
studies highlight the transformative impact of deep learning, providing a foundation for
future research and practical implementations in the maritime industry.
5. Overall Analysis
5.1. Literature Description
This section primarily explores the distribution of the 89 academic papers we have
collected. Through the analysis of extensively compiled scholarly literature, the content of
this section presents the development trends and the distribution of academic focus within
this research field.
5.1.1. Literature Distribution
Figure 16 presents a bar chart illustrating the number of annual publications from the 89
papers we collected, which focus on using deep learning architectures and time series data
for predictions in the maritime domain from 2018 to 2024. The years are plotted on the
x-axis, while the y-axis quantifies the publications, with numbers ranging from 0 to 30,
highlighting trends in research activity over these years. Researchers first adopted deep
learning architectures for time series prediction tasks in the maritime domain commencing
in 2018. Additionally, the chart shows a gradual increase in publications from 3 in 2018 to
a peak of 28 in 2023.
Figure 16. Number of annually published papers.
Figure 17 displays a horizontal bar chart representing the number of papers published
in journals or presented at conferences. It quantitatively illustrates the dissemination of
scholarly articles across various esteemed academic journals and conference proceedings.
Notably, “Ocean Engineering” is depicted as the predominant forum, hosting six papers,
followed by a diverse array of other platforms such as “Applied Science”. This chart
highlights the distribution and frequency of research outputs in various scholarly venues.
[Bar chart venues: Ocean Engineering; Journal of Marine Science and Engineering; Applied
Soft Computing; Sensors; Applied Science; Journal of Physics: Conference Series; IEEE
Access; x-axis: 0–7 papers]
Figure 17. The number of papers published in journals or presented at conferences. Notes: The full
name of “Journal of Marine Science and…” is “Journal of Marine Science and Engineering”.
5.1.2. Literature Classification
Given the broad scope of the maritime field and the diverse types of time series data
available, it is necessary to categorize the various applications found in the literature more
systematically. In conjunction with the data we have collected, we have referenced several
review articles andIn conjunction
classified the 89with the data
primary paperswe have
into collected,
three we haveship
main categories: referenced
oper- several
review articles
ation-related, portand classified the 89and
operation-related, primary
shippingpapers into three areas
market-related main[108–112].
categories: ship operation-
Figure
18related,
displays a pie
port chart illustrating the
operation-related, andclassification of the collected literature.
shipping market-related Among Figure
areas [108–112]. the 18 dis-
data
playsweagathered,
pie chart71illustrating
papers related
thetoclassification
ship operations constitute
of the 79%literature.
collected of the total,Among
a pro- the data
portion visible in
we gathered, 71the chart.related
papers This indicates a strong scholarly
to ship operations interest
constitute 79%inofutilizing
the total,deep
a proportion
learning architectures and time series data to predict or classify tasks related to ship
visible in the chart. This indicates a strong scholarly interest in utilizing deep learning op-
erations, aiming to optimize ship operation processes.
architectures and time series data to predict or classify tasks related to ship operations,
aiming to optimize ship operation processes.

Figure 18. Classification of the collected literature into ship operation-related, port operation-related, and shipping market-related papers.
5.2. Data Utilized in Maritime Research
5.2.1. Automatic Identification System Data (AIS Data)

AIS data are from public databases such as international AIS databases and publicly available vessel tracking websites. AIS data include MMSI, speed over ground (SOG), course over ground (COG), timestamp, and vessel length [92]. Additionally, AIS data encompass data points corresponding to each vessel’s movement, including speed over ground, course over ground, latitude, longitude, rate of turn, heading angle, and more [68]. The AIS is a crucial component of modern ship navigation systems, widely installed on vessels to enhance target identification and position marking capabilities [60]. As a source of real-time dynamic and historical state information that can be effectively stored, AIS data are pivotal in predicting the spatiotemporal relationships of vessels [66]. Application areas include vessel trajectory prediction [37,40,60,68,86–88,99], the prediction of vessel spatiotemporal relationships [79], analysis of vessel movement characteristics [92,93], and the simulation of collision avoidance [58,66,91].
Apart from the information directly obtained from AIS data, additional insights such as port-to-port average speed, cargo weight [113], technical ship specifications, and port-to-port bunker consumption [114] can be derived by combining AIS data with other databases using parameters like voyage time, draught, ship sizes, and the International Maritime Organization (IMO) number.
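The IMO-number join described above can be sketched in a few lines. The record layout, field names, and values below are illustrative assumptions, not the schema of any actual AIS database:

```python
# Minimal sketch (hypothetical field names and values): enriching AIS records
# with static ship specifications by joining on the IMO number.

ais_records = [
    {"imo": 9301234, "sog_knots": 14.2, "draught_m": 10.5, "lat": 1.26, "lon": 103.8},
    {"imo": 9305678, "sog_knots": 11.8, "draught_m": 8.9,  "lat": 35.1, "lon": 129.0},
]

ship_registry = {  # stand-in for a commercial ship-specification database
    9301234: {"dwt": 82000, "ship_type": "bulk carrier"},
    9305678: {"dwt": 52000, "ship_type": "container ship"},
}

def enrich(records, registry):
    """Attach registry specifications to each AIS record; skip unknown IMO numbers."""
    enriched = []
    for rec in records:
        specs = registry.get(rec["imo"])
        if specs is not None:
            enriched.append({**rec, **specs})
    return enriched

for row in enrich(ais_records, ship_registry):
    print(row["imo"], row["ship_type"], row["dwt"])
```

In practice, such joins over voyage time, draught, and ship size are what allow derived quantities like port-to-port bunker consumption to be estimated from raw AIS feeds.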
5.2.2. High-Frequency Radar Data and Sensor Data

High-frequency radar data from the Yellow Sea, China [64], inertial measurement units (IMU) installed on ships [97], Floating Production Storage and Offloading (FPSO) units [82], and measured data from container ships [41] have been used in these studies. Application areas include the study of vessel motion characteristics [41] and sea-state classification [97].

5.2.3. Container Throughput Data

These data come from ports such as Busan [102,103], PSA Port [101], and Ulsan Port, South Korea [107], as well as from Clarksons Shipping Intelligence [108]. Application areas include port logistics analysis [103], the prediction of container throughput [101–103], shipping market analysis [107], and the prediction of ship orders [107].
5.2.4. Other Datasets

Other datasets include the Baltic Dry Index (BDI) [85,115], ship motion data from the Offshore Simulator Centre AS (OSC) [74,89], and real-time operational data [69,80,116,117]. Application areas include the analysis of economic conditions in the shipping market [85,115], vessel motion monitoring [89], sea-state estimation [74], the prediction of vessel diesel engine exhaust conditions [80], and ship roll prediction [116]. Figure 19 clearly shows that the most significant portion of the data falls under the “Other Datasets” category, accounting for 54.0% of the total data collected. This highlights the extensive and diverse range of data sources used in maritime research. The significant portions of AIS data (21.3%) and container throughput data (15.7%) indicate their critical roles in maritime studies, while high-frequency radar and sensor data, though smaller in proportion (9.0%), provide vital environmental and dynamic monitoring information.

Figure 19. Data utilized in maritime research.

5.3. Evaluation Parameters

After analyzing the assessment criteria employed in the 89 academic papers, we discovered a wide range of measurement methods extensively used for assessing the effectiveness of machine learning and deep learning models. Prominent metrics for regression tasks include Mean Squared Error (MSE), Root Mean Squared Error (RMSE), Mean Absolute Error (MAE), and Mean Absolute Percentage Error (MAPE), which primarily focus on quantifying prediction errors. Equations (22) through (25) present the mathematical constructs for MSE, RMSE, MAE, and MAPE.

$$\mathrm{MSE} = \frac{\sum_{i=1}^{n}\left(y_{\mathrm{real}} - \hat{y}_{\mathrm{pred}}\right)^{2}}{n} \qquad (22)$$

$$\mathrm{RMSE} = \sqrt{\frac{\sum_{i=1}^{n}\left(y_{\mathrm{real}} - \hat{y}_{\mathrm{pred}}\right)^{2}}{n}} \qquad (23)$$

$$\mathrm{MAE} = \frac{\sum_{i=1}^{n}\left|y_{\mathrm{real}} - \hat{y}_{\mathrm{pred}}\right|}{n} \qquad (24)$$

$$\mathrm{MAPE} = \frac{\sum_{i=1}^{n}\left|\frac{y_{\mathrm{real}} - \hat{y}_{\mathrm{pred}}}{y_{\mathrm{real}}}\right|}{n} \times 100\% \qquad (25)$$
where yreal represents the actual value and ŷpred represents the predicted value. Furthermore, there are multiple variants of the metrics mentioned above. For example, Xie et al. [117] employ the Mean Relative Absolute Error (MRAE) to compare the performance of a proposed model with that of the original model. Regarding classification tasks, commonly used metrics encompass accuracy,
precision, recall, and the F1 score. These metrics are instrumental in evaluating a model’s
ability to identify and classify different categories accurately. Equations (26) to (29) present
the computational formulas for accuracy, precision, recall, and the F1 score.

$$\mathrm{Accuracy} = \frac{TP + TN}{TP + FP + FN + TN} \qquad (26)$$

$$\mathrm{Precision} = \frac{TP}{TP + FP} \qquad (27)$$

$$\mathrm{Recall} = \frac{TP}{TP + FN} \qquad (28)$$

$$\mathrm{F1\ score} = 2 \times \frac{\mathrm{Precision} \times \mathrm{Recall}}{\mathrm{Precision} + \mathrm{Recall}} \qquad (29)$$
TP, FP, TN, and FN represent the four essential components of the confusion matrix: TP (True Positives), FP (False Positives), TN (True Negatives), and FN (False Negatives).
In this context, “True” and “False” denote the accuracy of the model’s classification; “True”
implies correct classification, while “False” indicates a classification error. “Positives” and
“Negatives” pertain to the predictions made by the classifier, where “Positives” suggests
a prediction of the presence of a condition and “Negatives” signifies the prediction of its
absence. These components are vital metrics for evaluating the efficacy of classification
models, enabling the computation of critical statistical indicators such as accuracy, recall,
precision, and the F1 score. These metrics are indispensable for assessing the performance
of predictive models in various classification scenarios. Some literature concerning novel
architectures or model constructions also considers model execution speed or computa-
tional time as evaluation metrics [115,118]. For different tasks, other evaluation metrics
such as maintenance score, cumulative reward, and distance are also used within the given
collection of papers [119,120]. This diverse array of metrics reflects the problem’s specific
needs, the data’s characteristics, and considerations in model design, all of which are crucial
for accurately assessing and further optimizing model performance.
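Equations (22)–(29) can be computed directly from prediction arrays. The sketch below is a minimal NumPy implementation; the numeric values are made up for illustration and are not data from the reviewed studies:

```python
import numpy as np

# Illustrative implementations of Equations (22)-(29). The arrays below are
# made-up values, not results from any paper in this review.

def mse(y_real, y_pred):
    return np.mean((y_real - y_pred) ** 2)                      # Equation (22)

def rmse(y_real, y_pred):
    return np.sqrt(mse(y_real, y_pred))                         # Equation (23)

def mae(y_real, y_pred):
    return np.mean(np.abs(y_real - y_pred))                     # Equation (24)

def mape(y_real, y_pred):
    return np.mean(np.abs((y_real - y_pred) / y_real)) * 100    # Equation (25)

def classification_metrics(y_true, y_hat):
    tp = np.sum((y_true == 1) & (y_hat == 1))
    tn = np.sum((y_true == 0) & (y_hat == 0))
    fp = np.sum((y_true == 0) & (y_hat == 1))
    fn = np.sum((y_true == 1) & (y_hat == 0))
    accuracy = (tp + tn) / (tp + fp + fn + tn)                  # Equation (26)
    precision = tp / (tp + fp)                                  # Equation (27)
    recall = tp / (tp + fn)                                     # Equation (28)
    f1 = 2 * precision * recall / (precision + recall)          # Equation (29)
    return accuracy, precision, recall, f1

y_real = np.array([10.0, 12.0, 8.0, 11.0])
y_pred = np.array([9.5, 12.5, 7.0, 11.5])
print(mse(y_real, y_pred), rmse(y_real, y_pred), mae(y_real, y_pred), mape(y_real, y_pred))

y_true = np.array([1, 0, 1, 1, 0, 0])
y_hat = np.array([1, 0, 0, 1, 0, 1])
print(classification_metrics(y_true, y_hat))
```

Note that MAPE is undefined when any actual value is zero, which is one reason variants such as MRAE appear in the literature.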

5.4. Real-World Application Examples


This section primarily summarizes the literature involving empirical experiments
using relevant real-world cases and data, yielding conclusive results. Data such as AIS,
obtainable from databases, will not be discussed further. Among the literature we reviewed,
eight papers implemented deep learning architectures in real-world applications. Notably,
the study by Guo et al. [121], which conducted model simulations within environments
including ship models, is also considered a real-world case and is included accordingly.
The advantages of using deep learning in real-world scenarios, as demonstrated by these
studies, include enhanced accuracy in complex decision-making processes, significant
improvements in operational efficiency through automation, and the ability to adapt to
dynamic and unpredictable environments. The related architectures and their advantages
are displayed in Table 1.

Table 1. Real-world application examples.

Ref.  | Architecture          | Dataset                        | Advantage
[64]  | MSCNN-GRU-AM          | HF radar                       | Applicable to high-frequency radar ship track prediction in environments with significant clutter and interference
[80]  | CNN-BiLSTM-Attention  | 6L34DF dual-fuel diesel engine | High prediction accuracy and early-warning timeliness; provides interpretable fault prediction results
[122] | LSTM                  | Two LNG carriers               | Enables early anomaly detection in new ships and new equipment
[98]  | LSTM                  | Sensors                        | Better, high-precision results
[42]  | Self-Attention-BiLSTM | A real military ship           | Captures complex ship attitude changes; greater accuracy and stability in long-term forecasting tasks
[41]  | CNN–GRU–AM            | A C11 containership            | Better forecasting accuracy
[121] | GRU                   | A scaled model test            | Good prediction accuracy
[123] | CNN                   | A bulk carrier                 | Good prediction accuracy

5.5. Future Research Directions


5.5.1. Data Processing and Feature Extraction
Multivariate time series analysis involves expanding univariate time series data to
multivariate data, studying the interrelationships among variables, such as considering
longitude, latitude, wind speed, and wave height collectively affecting vessel motion [86,101].
To enhance model generalization, introducing data augmentation techniques like random
projection and dynamic ensemble methods is essential. This includes researching how
to automatically design edESN models using reinforcement learning and evolutionary
optimization techniques for configuring multivariate time series models [104]. Additionally,
integrating more diverse data sources, such as meteorological, oceanographic, and satellite
remote sensing data, enhances model generalization and adaptability [87]. Expanding
pre-training datasets to cover more sea areas and broader historical trajectory data [88] and
integrating AIS data with sea-state data can significantly improve the accuracy of vessel
motion predictions [89].
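The multivariate framing described above amounts to a sliding-window transformation of the raw series. The sketch below shows the idea; the number of variables, window length, and random stand-in data are illustrative assumptions:

```python
import numpy as np

# Illustrative sketch: turning a multivariate series (e.g. longitude, latitude,
# wind speed, wave height per time step) into supervised learning windows.
# Shapes, the lookback, and the horizon are illustrative choices.

def make_windows(series, lookback, horizon=1):
    """series: (T, F) array -> X: (N, lookback, F) inputs, y: (N, F) targets."""
    X, y = [], []
    for t in range(len(series) - lookback - horizon + 1):
        X.append(series[t : t + lookback])          # past `lookback` steps
        y.append(series[t + lookback + horizon - 1])  # value `horizon` steps ahead
    return np.array(X), np.array(y)

T, F = 100, 4                      # 100 time steps, 4 variables
rng = np.random.default_rng(0)
series = rng.normal(size=(T, F))   # stand-in for real maritime observations

X, y = make_windows(series, lookback=10)
print(X.shape, y.shape)            # (90, 10, 4) (90, 4)
```

Windows of this shape are the standard input format for the recurrent and convolutional architectures discussed earlier, which is why data augmentation and source integration operate naturally at this stage.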

5.5.2. Model Optimization and Application of New Technologies


To improve deep learning models in maritime applications, it is essential to explore
advanced models such as CNNs, LSTM, Spiking Neural Networks (SNNs), and Transformer
models. These models enhance feature extraction capabilities and prediction accuracy [118].
Ensemble learning and hybrid models, which combine the strengths of various machine
learning and deep learning models, can further improve prediction accuracy and stability.
For example, combining physics-inspired methods with deep learning models can develop
more accurate hybrid prediction models [40]. Integrating reinforcement learning (RL) and
evolutionary optimization techniques into model design can yield more efficient models.
An example of this is developing an RL-based shipping revenue prediction model, which
involves constructing a simulated environment for experiments and validating the model’s
practical application [105].
While deep learning models are renowned for their robust predictive capabilities,
the rapid advancements in computer hardware have led to the emergence of a variety of
machine learning models tailored for predictive tasks. For specific applications that involve
spatially characterized variables, new machine learning models have shown significant
potential. For instance, the GTWR (Geographically and Temporally Weighted Regression)
model proposed by Huang et al. [124] exemplifies this trend. In the GTWR model, the
regression parameters of the independent variables are adjusted according to spatial and
temporal variations. This feature allows the model to more effectively capture the dynamic
spatio-temporal relationships between explanatory and dependent variables, offering a
more precise analytical tool for research in fields that require nuanced spatial and temporal
analysis [125].
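The core idea behind GTWR-style models can be sketched as a weighted least-squares fit in which observations closer in space and time to a target point receive larger weights. The Gaussian kernel, bandwidths, and synthetic data below are illustrative assumptions, not the exact specification of the cited work:

```python
import numpy as np

# Sketch of spatio-temporally weighted regression: a Gaussian kernel over a
# combined spatio-temporal distance weights each observation when fitting
# local coefficients at a target (location, time). Kernel form and bandwidths
# are illustrative assumptions.

def st_weights(coords, times, target_xy, target_t, h_s=5.0, h_t=3.0, lam=0.5):
    d_s = np.linalg.norm(coords - target_xy, axis=1)   # spatial distances
    d_t = np.abs(times - target_t)                     # temporal distances
    d2 = lam * (d_s / h_s) ** 2 + (1 - lam) * (d_t / h_t) ** 2
    return np.exp(-d2)                                 # Gaussian kernel weights

def local_coefficients(X, y, w):
    """Weighted least squares: beta = (X'WX)^-1 X'Wy."""
    W = np.diag(w)
    return np.linalg.solve(X.T @ W @ X, X.T @ W @ y)

rng = np.random.default_rng(1)
n = 50
coords = rng.uniform(0, 10, size=(n, 2))               # synthetic locations
times = rng.uniform(0, 5, size=n)                      # synthetic timestamps
X = np.column_stack([np.ones(n), rng.normal(size=n)])  # intercept + one regressor
y = X @ np.array([2.0, 0.5]) + rng.normal(scale=0.1, size=n)

w = st_weights(coords, times, target_xy=np.array([5.0, 5.0]), target_t=2.5)
beta = local_coefficients(X, y, w)
print(beta)   # local estimates near the true coefficients (2.0, 0.5)
```

In a full GTWR model these local coefficients are re-estimated at every target point, which is what lets the regression surface vary over space and time.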

5.5.3. Specific Application Scenarios


Port and shipping forecasts involve predicting port throughput, vessel arrival times,
and barge dwell times while considering external factors such as pandemics and wars [126].
Scenario analysis methods explore the impact of these uncertainties on port throughput and
propose countermeasures. In vessel motion and trajectory prediction, improving models
requires accounting for environmental factors such as wind and waves [92]. The research
aims to accurately predict vessel trajectories under complex scenarios, including special
weather conditions and encounters with other vessels [66]. Developing more accurate
ocean environment models entails combining meteorological and ocean condition data.
For example, using high-frequency radar data and meteorological data can help predict
vessel motion trajectories under different sea states [64].

5.5.4. Practical Applications and Long-Term Predictions


Real-world validation involves testing and validating model performance in real-
world environments, such as sea trials and port operation data analysis. Adjusting and
optimizing model parameters based on practical application are crucial to ensure reliability
and accuracy in actual use [41,127]. Experimental optimization focuses on enhancing ex-
perimental design and data collection methods to improve model performance in practical
applications [86]. Conducting multiple experimental validations is necessary to ensure
model robustness according to operational needs. For long-term predictions, research
aims to improve accuracy by using more granular methods, such as sequence-to-sequence
learning for precise long-term forecasting [85]. For instance, in container volume pre-
dictions, handling minor errors from time series decomposition can enhance prediction
performance [102].

5.5.5. Environmental Impact, Fault Prediction, and Cross-Domain Applications


Developing more accurate marine environment models by combining meteorological
and ocean condition data is crucial for predicting vessel motion trajectories under various
sea conditions, utilizing high-frequency radar data and meteorological data [64]. Improving
fault diagnosis models by integrating unsupervised learning techniques can address sample
imbalance issues, ensuring accurate fault predictions in ship mechanical components and
providing adequate fault warnings [95]. Additionally, researching collision avoidance
for autonomous vessels and developing more reliable collision warning systems involves
combining VTS surveillance for collision and grounding warnings and optimizing BLSTM
networks to enhance prediction accuracy [91]. Furthermore, cross-domain applications
of vessel motion prediction methods, such as stock price prediction and equipment fault
diagnosis, leverage models like LSTM and Transformer to improve prediction accuracy
and practicality in various domains [66,79].

6. Conclusions
This review comprehensively analyzes the various applications of deep learning in
maritime time series prediction. Through an extensive literature search in the Web of Sci-
ence and Scopus, 89 papers were collected, covering ship operations, port operations, and
shipping market forecasts. By examining various deep learning models, including ANN,
MLP, DNN, the LSTM network, TCN, RNN, GRU, CNN, RVFL, and Transformer models,
this review provides valuable insights into their potential and limitations in maritime
applications. The review investigates previous research on diverse applications such as pre-
dicting ship trajectories, weather forecasting, estimating ship fuel consumption, forecasting
port throughput, and predicting shipping market orders. It summarizes the distribution
and classification of the literature and analyzes the data utilized in maritime research,
including AIS data, high-frequency radar data and sensor data, container throughput
data, and other datasets. In summarizing future research directions, the review highlights
several key areas for further investigation, such as improving data handling techniques
to enhance accuracy, optimizing models for better performance, applying deep learning
models to specific scenarios, conducting long-term predictions, and exploring cross-domain
applications. By synthesizing the current state of deep learning applications in maritime
time series prediction, this review underscores both achievements and challenges, offering
a foundation for future studies to build upon.
The main contributions of this study are outlined as follows.
(1) This study fills the gap in the literature on advancements in deep learning techniques
for time series forecasting in maritime applications, focusing on three key areas: ship
operations, port operations, and shipping markets.
(2) Different types of deep learning models are compared to identify the most suitable
models for various applications. The differences between these models are discussed,
providing valuable insights for domain researchers.
(3) The study summarizes future research directions, helping to clarify future research
objectives and guide subsequent studies. These directions include enhancing data
processing and feature extraction, optimizing deep learning models, and exploring
specific application scenarios. Additionally, practical applications and long-term
predictions, environmental impacts, fault prediction, and cross-domain applications
are addressed to provide comprehensive guidance for future research efforts.
While significant progress has been made, addressing these gaps and leveraging
advanced deep learning techniques will enhance operational efficiency, safety, and sustain-
ability in maritime applications. However, this review has limitations, including the rapidly
evolving nature of the field, which may lead to the omission of the latest developments.
Additionally, the review may not fully capture the diversity of maritime applications and
the contextual nuances of different regions. Future reviews should continuously update
findings to reflect ongoing advancements and aim to include a broader range of sources
and perspectives to mitigate these limitations.

Author Contributions: Conceptualization, Z.S.C., Y.S. and M.L.; methodology, Y.S., Z.S.C., M.W.
and X.G.; software, M.W. and X.G.; validation, Z.S.C., Y.S. and Y.Z.; formal analysis, M.W. and
X.G.; investigation, M.W. and X.G.; resources, Z.S.C., Y.Z. and M.L.; data curation, M.W. and X.G.;
writing—original draft preparation, M.W. and X.G.; writing—review and editing, M.W., X.G., Z.S.C.
and M.L.; visualization, M.W. and X.G.; supervision, Z.S.C. and Y.S.; project administration, Y.S.
and Z.S.C.; funding acquisition, Y.S., Z.S.C., Y.Z. and M.L. All authors have read and agreed to the
published version of the manuscript.
Funding: This research was funded by Xi’an Jiaotong-Liverpool University, grant number RDF-22-02-026.
Institutional Review Board Statement: Not applicable.
Informed Consent Statement: Not applicable.
Data Availability Statement: Data are contained within the article.
Conflicts of Interest: The authors declare no conflicts of interest.

References
1. UNCTAD. Review of Maritime Transport 2023; United Nations Conference on Trade and Development: Geneva, Switzerland, 2023;
Available online: https://www.un-ilibrary.org/content/books/9789213584569 (accessed on 1 April 2024).
2. Liang, M.; Liu, R.W.; Zhan, Y.; Li, H.; Zhu, F.; Wang, F.Y. Fine-Grained Vessel Traffic Flow Prediction With a Spatio-Temporal
Multigraph Convolutional Network. IEEE Trans. Intell. Transp. Syst. 2022, 23, 23694–23707. [CrossRef]
3. Liu, R.W.; Liang, M.; Nie, J.; Lim, W.Y.B.; Zhang, Y.; Guizani, M. Deep Learning-Powered Vessel Trajectory Prediction for
Improving Smart Traffic Services in Maritime Internet of Things. IEEE Trans. Netw. Sci. Eng. 2022, 9, 3080–3094. [CrossRef]
4. Dui, H.; Zheng, X.; Wu, S. Resilience analysis of maritime transportation systems based on importance measures. Reliab. Eng.
Syst. Saf. 2021, 209, 107461. [CrossRef]
5. Liang, M.; Li, H.; Liu, R.W.; Lam, J.S.L.; Yang, Z. PiracyAnalyzer: Spatial temporal patterns analysis of global piracy incidents.
Reliab. Eng. Syst. Saf. 2024, 243, 109877. [CrossRef]

6. Chen, Z.S.; Lam, J.S.L.; Xiao, Z. Prediction of harbour vessel emissions based on machine learning approach. Transp. Res. Part D
Transp. Environ. 2024, 131, 104214. [CrossRef]
7. Chen, Z.S.; Lam, J.S.L.; Xiao, Z. Prediction of harbour vessel fuel consumption based on machine learning approach. Ocean Eng.
2023, 278, 114483. [CrossRef]
8. Liang, M.; Weng, L.; Gao, R.; Li, Y.; Du, L. Unsupervised maritime anomaly detection for intelligent situational awareness using
AIS data. Knowl.-Based Syst. 2024, 284, 111313. [CrossRef]
9. Dave, V.S.; Dutta, K. Neural network based models for software effort estimation: A review. Artif. Intell. Rev. 2014, 42, 295–307.
[CrossRef]
10. Uslu, S.; Celik, M.B. Prediction of engine emissions and performance with artificial neural networks in a single cylinder diesel
engine using diethyl ether. Eng. Sci. Technol. Int. J. 2018, 21, 1194–1201. [CrossRef]
11. Chaudhary, L.; Sharma, S.; Sajwan, M. Systematic Literature Review of Various Neural Network Techniques for Sea Surface
Temperature Prediction Using Remote Sensing Data. Arch. Comput. Methods Eng. 2023, 30, 5071–5103. [CrossRef]
12. Dharia, A.; Adeli, H. Neural network model for rapid forecasting of freeway link travel time. Eng. Appl. Artif. Intell. 2003, 16,
607–613. [CrossRef]
13. Hecht-Nielsen, R. Applications of counterpropagation networks. Neural Netw. 1988, 1, 131–139. [CrossRef]
14. Goodfellow, I.; Bengio, Y.; Courville, A. Deep Learning; MIT Press: Cambridge, MA, USA, 2016.
15. Veerappa, M.; Anneken, M.; Burkart, N. Evaluation of Interpretable Association Rule Mining Methods on Time-Series in the
Maritime Domain. Springer International Publishing: Cham, Switzerland, 2021; pp. 204–218.
16. Frizzell, J.; Furth, M. Prediction of Vessel RAOs: Applications of Deep Learning to Assist in Design. In Proceedings of the SNAME
27th Offshore Symposium, Houston, TX, USA, 22 February 2022. [CrossRef]
17. Van Den Oord, A.; Dieleman, S.; Zen, H.; Simonyan, K.; Vinyals, O.; Graves, A.; Kalchbrenner, N.; Senior, A.; Kavukcuoglu, K.
Wavenet: A generative model for raw audio. arXiv 2016, arXiv:1609.03499.
18. He, K.; Zhang, X.; Ren, S.; Sun, J. Deep residual learning for image recognition. In Proceedings of the IEEE Conference on
Computer Vision and Pattern Recognition, Las Vegas, NV, USA, 27–30 June 2016; pp. 770–778.
19. Ning, C.X.; Xie, Y.Z.; Sun, L.J. LSTM, WaveNet, and 2D CNN for nonlinear time history prediction of seismic responses. Eng.
Struct. 2023, 286, 116083. [CrossRef]
20. Schmidt, W.F.; Kraaijveld, M.A.; Duin, R.P. Feed forward neural networks with random weights. In International Conference on
Pattern Recognition; IEEE Computer Society Press: Washington, DC, USA, 1992; pp. 1–4.
21. Huang, G.B.; Zhu, Q.Y.; Siew, C.K. Extreme learning machine: Theory and applications. Neurocomputing 2006, 70, 489–501.
[CrossRef]
22. Pao, Y.H.; Park, G.H.; Sobajic, D.J. Learning and generalization characteristics of the random vector functional-link net. Neurocom-
puting 1994, 6, 163–180. [CrossRef]
23. Zhang, L.; Suganthan, P.N. A comprehensive evaluation of random vector functional link networks. Inf. Sci. 2016, 367, 1094–1105.
[CrossRef]
24. Huang, G.; Huang, G.B.; Song, S.J.; You, K.Y. Trends in extreme learning machines: A review. Neural Netw. 2015, 61, 32–48.
[CrossRef]
25. Shi, Q.S.; Katuwal, R.; Suganthan, P.N.; Tanveer, M. Random vector functional link neural network based ensemble deep learning.
Pattern Recognit. 2021, 117, 107978. [CrossRef]
26. Du, L.; Gao, R.B.; Suganthan, P.N.; Wang, D.Z.W. Graph ensemble deep random vector functional link network for traffic
forecasting. Appl. Soft Comput. 2022, 131, 109809. [CrossRef]
27. Rehman, A.; Xing, H.L.; Hussain, M.; Gulzar, N.; Khan, M.A.; Hussain, A.; Mahmood, S. HCDP-DELM: Heterogeneous chronic
disease prediction with temporal perspective enabled deep extreme learning machine. Knowl.-Based Syst. 2024, 284, 111316.
[CrossRef]
28. Gao, R.B.; Li, R.L.; Hu, M.H.; Suganthan, P.N.; Yuen, K.F. Online dynamic ensemble deep random vector functional link neural
network for forecasting. Neural Netw. 2023, 166, 51–69. [CrossRef]
29. Lecun, Y.; Bottou, L.; Bengio, Y.; Haffner, P. Gradient-based learning applied to document recognition. Proc. IEEE 1998, 86,
2278–2324. [CrossRef]
30. Palaz, D.; Magimai-Doss, M.; Collobert, R. End-to-end acoustic modeling using convolutional neural networks for HMM-based
automatic speech recognition. Speech Commun. 2019, 108, 15–32. [CrossRef]
31. Fang, W.; Love, P.E.D.; Luo, H.; Ding, L. Computer vision for behaviour-based safety in construction: A review and future
directions. Adv. Eng. Inform. 2020, 43, 100980. [CrossRef]
32. Qin, L.; Yu, N.; Zhao, D. Applying the convolutional neural network deep learning technology to behavioural recognition in
intelligent video. Teh. Vjesn. 2018, 25, 528–535.
33. Hoseinzade, E.; Haratizadeh, S. CNNpred: CNN-based stock market prediction using a diverse set of variables. Expert Syst. Appl.
2019, 129, 273–285. [CrossRef]
34. Rasp, S.; Dueben, P.D.; Scher, S.; Weyn, J.A.; Mouatadid, S.; Thuerey, N. WeatherBench: A Benchmark Data Set for Data-Driven
Weather Forecasting. J. Adv. Model. Earth Syst. 2020, 12, e2020MS002203. [CrossRef]
35. Crivellari, A.; Beinat, E.; Caetano, S.; Seydoux, A.; Cardoso, T. Multi-target CNN-LSTM regressor for predicting urban distribution
of short-term food delivery demand. J. Bus. Res. 2022, 144, 844–853. [CrossRef]

Information 2024, 15, 507 31 of 33
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual
author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to
people or property resulting from any ideas, methods, instructions or products referred to in the content.