Machine Learning Approaches For Improving Condition-Based Maintenance of Naval Propulsion Plants
Abstract
Availability, reliability and economic sustainability of naval propulsion plants are key concerns, because maintenance costs represent a large slice of total operational expenses. Depending on the adopted strategy, the impact of maintenance on overall expenses can vary remarkably: for example, letting an asset run until breakdown can lead to unaffordable costs. A desideratum is therefore to progress the maintenance technology of ship propulsion systems from Breakdown Maintenance (BM) or Preventive Maintenance (PM) to more effective Condition-Based Maintenance (CBM) approaches. The central idea in CBM is to monitor the propulsion equipment by exploiting heterogeneous sensors, enabling diagnosis and, most of all, prognosis of the propulsion system's components and of their potential future failures. The success of CBM clearly hinges on the capability of developing effective predictive models: for this purpose, this paper proposes the use of effective Machine Learning (ML) methods. In particular, the authors consider an application of CBM to Gas Turbines (GTs) used for vessel propulsion, where the performance and advantages of exploiting ML methods in modeling the degradation of the propulsion plant over time are tested. Experiments, conducted on data generated from a sophisticated simulator of a GT mounted on a Frigate characterized by a COmbined Diesel eLectric And Gas (CODLAG) propulsion plant, show the effectiveness of the proposed ML approaches and benchmark them in a realistic maritime application.
Keywords
Condition-Based Maintenance, Gas Turbine, COmbined Diesel eLectric And Gas (CODLAG) propulsion plant, Machine
Learning, Asset decay forecast
Disclaimer
The experiments have been carried out by means of parts of a complex numerical simulator of a naval vessel (Frigate),
characterized by a Gas Turbine propulsion plant. The different blocks forming the complete simulator have been developed
and fine tuned throughout the years on several similar real propulsion plants: in view of these observations, the data that
will be used are arguably in agreement with a possible real vessel, although they do not match exactly with any active
unit in service. Moreover, this work and any results and conclusions, therein obtained and drawn, only reflect the personal
views of the authors and does not represent the views, policies or interests of any government or governmental authority
of any type or country.
1. Introduction
According to the British Standard (1), maintenance includes all actions necessary to retain a system or an item in, or restore it to, a state in which it can perform its required functions. In practice, such concepts have traditionally been deployed according to a "fix it when it breaks" approach (2). However, this has become an unaffordable and wasteful methodology, since data gathering from the field is ever cheaper and the costs related to a breakdown may exceed the asset value (3; 4). Indeed, in recent decades, smart technology and cross-industry maintenance needs, ranging e.g. from manufacturing (5; 6) to the transportation domain (7; 8), have engendered a pivotal change from a reactive to a proactive perspective (9), moving beyond the original focus on repairing-replacing actions towards more sophisticated preventive and prescriptive activities. In particular, maintenance actions can be framed into a taxonomy which includes three categories (10): Corrective, Preventive, and Condition-Based.
In Corrective Maintenance (CM), the equipment or asset is run down to its breaking point, and maintenance activities are carried out afterwards with the purpose of restoring the system as soon as possible. In this case, maintenance activities are triggered by an unscheduled event (e.g. a failure), and no a-priori strategies can be deployed. Consequently, the costs related to such an approach are usually high: they comprise direct costs, e.g. due to potentially concatenated failures of other parts of the asset, and indirect costs, e.g. related to potential losses in (environmental, worker, etc.) safety and integrity, and to asset unavailability.
Preventive Maintenance (PM), instead, is carried out before breakdowns in order to avoid them and to minimize the possibility of issues related to failures. Several variations exist, such as adjustments, replacements, renewals and inspections, which take place according to a predetermined planning and schedule: this allows establishing time slots of unavailability of an asset (or part of it), as opposed to the unpredictability which characterizes random failure patterns in CM. With particular reference to systematic PM, parts are replaced independently of their actual status, as a safety-level lifetime is established for them: in such a way, the probability of failures for a system decreases, as popularly illustrated through the so-called bath-tub curve (11).
Condition-Based Maintenance (CBM), instead, refers to triggering maintenance activities as they are necessitated by the condition of the target system (2). This approach enables determining the conditions of in-service assets to predict potential degradations and, consequently, to plan when maintenance activities will be needed and should be performed to minimize disruptions. In other words, CBM switches the maintenance view from pure diagnosis to high-valued prognosis of faults. CBM thus entails the diagnosis of the target system and the timely, accurate, and valuable identification of potential issues, by exploiting effective approaches from the predictive analytics domain (2; 3; 12).
While CM is usually avoided due to its high costs, CBM is gaining popularity because of its competitive advantages over PM, related to the extreme conservativeness of preventive methods in several domains (leading, once again, to often unjustified remarkable costs). As a matter of fact, assets are typically characterized by complex interrelations among the parts of which they consist: such correlations are not straightforward to grasp and identify, thus making a simple approach like the above-cited bath-tub model unfeasible (2; 11). CBM, instead, has proved effective in maximizing availability, efficiency, sustainability, and safety of assets, enabling the upgrading of the maintenance role from a necessary evil to a way of creating added value (13; 14).
The CBM approach to maintenance is very generic, depicting a horizontal view of a crosscutting problem in heterogeneous domains. When a more vertical view is taken, focusing on the maritime domain, repair and maintenance expenses for conventional ships can amount to about 20% of total operability costs, including manning expenses, insurance and administration costs (15). For naval applications, maintenance optimization emerges as a key task, focused on reducing operation costs while obtaining the optimal availability of the ship for the intended service: such optimization is the result of a trade-off between excessive maintenance and machinery down-time, where CBM helps open the door towards the best balance of costs and availability. In other words, CBM enables a just-in-time deployment of ship maintenance, by allowing maintenance activities to be planned and executed only when needed.
The Condition-Based Maintenance approach requires that a reliable and effective diagnostic policy is implemented on one (or more) naval asset(s), in order to enable further refinement towards identifying potential failures in advance. Gas Turbines (GTs), used for naval propulsion, represent a key example of how gathering data from components can be a driver for optimizing maintenance strategies (16; 17). Diagnostic methods for GTs (18; 19) allow obtaining information on the health status of the components without the need to inspect the machine, which would cause an interruption of operations with consequent disruption of services. Diagnosis represents the starting milestone for describing the status of the turbine, in order to evolve such systems towards prognostic models, able to predict potential future issues with reasonable advance. In this sense, these approaches assume a central importance when applied to propulsion systems in naval vessels: the implementation of CBM strategies allows improving the reliability and efficiency of the asset and providing support to maintenance operations in a non-invasive way and by limiting interventions, which are carried out in a when-needed fashion.
The cornerstone of applying CBM consists in the adoption of a proactive approach to maintenance in place of more conventional reactive interventions: to implement such a key change, predictive modeling assumes a central importance, allowing the automatic triggering of alarms without the need of designing complex, often unknown and, in some sense, unknowable models. For this purpose, the use of robust Machine Learning (ML) models (20; 21) is here proposed and tested. These models have shown appealing modeling and predictive capabilities even in very complex and heterogeneous problem domains. In particular, the authors focus on Regularized Least Squares (RLS) (21) and Support Vector Machines (SVM) (20), i.e. two state-of-the-art methods whose effectiveness in coping with real-world problems is also confirmed by their inclusion in commercial software packages (e.g. SVM is included in the ORACLE Data Mining suite (22)). RLS and SVM models are induced only from data collected in the field on the assets that need to be modeled, i.e. the predictive model does not require any a-priori knowledge of the turbines, their architecture/configuration, or the system overall.
In this paper, the performance and potentialities of RLS and SVM models are benchmarked by training them on data to forecast the performance decay of Gas Turbines installed on a naval vessel. The aim of this work is to pave the way towards designing a prognostic CBM methodology for naval propulsion plants equipped with GTs. In particular, the propulsive performance degradation of a naval vessel powered by a GT has been simulated, between two drydocks, in order to collect and make available a dataset of experiments. Such data are then exploited to derive a predictive model to automatically detect the system decay state, targeting effective CBM in the considered domain.
For these purposes, the paper is organized as follows. Section 2 introduces the model and the simulator of the naval vessel, considered for presenting the ML approach to CBM in the subsequent analysis. Then, Section 3 focuses on the Machine Learning (ML) framework, by introducing the main approaches that will be exploited for this work's targets: actionable algorithms will be proposed, which can be implemented for forecasting purposes in CBM applications. With reference to the characteristics of the simulator of the naval vessel and to the needs emerging from the ML framework, a dataset has been created, according to the setup described in Section 4: this dataset will be released for use by the research community on the well-known dataset repository of the University of California in Irvine (UCI). Section 5 subsequently presents some experimental results, derived by applying the proposed ML approaches to the dataset, while Section 6 finally draws the conclusions.
[Figure omitted: one-degree-of-freedom ship propulsion layout — gas generator (fuel, air inlet, compressor C, HP turbine), LP shaft with brake and shaft torque, gearbox, propeller; variables shown: telegraph, ship speed v, total resistance Rt, propulsor thrust Tp, shaft speed n, engine torque Qe, propulsor torque Qp.]
Fig. 3. One degree of freedom ship dynamics
where Tp is the propulsor thrust, p is the number of propulsors, Rt is the total ship resistance, M is the ship mass, Qe is the engine torque, i is the gear ratio, e is the number of engines per shaft, Qp is the propulsor torque and Jp is the polar moment of the rotating masses.
The first equation represents the longitudinal motion of the ship, the second equation represents the rotational motion of the shaft line and, finally, the third equation represents the control system behaviour. Forces and torques in Equation 1 can
be represented in different ways depending on: the type of the propulsion system, the type of analysis and the expected
degree of accuracy. Among the possible available approaches, in order to reduce the calculation time, the inter-component
volumes method has been adopted, preferring it to more sophisticated models for the GT dynamic simulation. For the
simulation of the General Electric LM 2500 gas turbine this procedure has been applied (25) and in order to validate the
model, steady state and transient results obtained by the simulator have been compared with experimental data provided
by the gas turbine manufacturer.
The gas turbine mathematical model is structured in a modular arrangement, as shown in Figure 4 where the Simulink®
top level scheme of the considered gas turbine is reported. In the same figure the input and output variables of the different
plant components such as compressor, high pressure turbine, power turbine, combustor and so on are shown. It is worth
mentioning that each module, representing a specific engine component, has been modelled by means of steady state
performance maps, time dependent momentum, energy and mass equations and non linear algebraic equations.
The main differential equations to be solved in the time domain refer to both dynamic and thermodynamic engine
parameters. In particular the working fluid processes are governed by the continuity equation:
\frac{d(\rho \cdot V)}{dt} = M_i - M_o \qquad (2)
[Figure omitted: Simulink® top-level scheme of the gas turbine model, with blocks INLET, COMPRESSOR, COMBUSTOR, HP TURBINE, HP-LP VOLUME, GAS GENERATOR SHAFT DYNAMICS and OUTLET, exchanging variables such as fuel flow Mf%, pressures p1–p6, temperatures T1–T6 and T48, mass flows M2–M6, gas generator speed GG rpm, power turbine speed GT rpm, and the component torques (C_T, HP_T, GT Torque).]
where Mi and Mo are the inlet and outlet mass flow rates, V is the GT component volume, h and u are enthalpy and internal energy, while P and Q′ represent power and heat flow, respectively.
The dynamics of the rotating shafts is determined by the dynamic momentum equation:
\frac{d\omega}{dt} = \frac{1}{J \cdot \omega} \left( P_m - P_{br} \right) \qquad (4)
being Pm and Pbr the motor and brake powers and ω the angular velocity. The propulsion system is managed by a typical
combinator law whose general layout is presented in Figure 5. The control system has been designed to provide a linear
relationship between lever position and ship speed. In this application, the ship propulsion control system dynamics has been approximated by a proportional-integral (PI) regulation of the propeller shaft speed:

\mathrm{TIC} = K_p \, (n_{sp} - n) + K_i \int_0^t (n_{sp} - n) \, d\tau

where TIC is the control signal (i.e. % of fuel flow) for the propulsion engine, n and nsp are the actual and the set-point propeller shaft speed respectively, while Kp and Ki represent the PI control loop constants. This is only a rough
approximation, but can be considered satisfactory for the most important ship maneuvers. Finally, it is worth noting that model validations for a ferry, a naval corvette and an aircraft carrier can be found in (29; 30; 31).
The hull and propeller degradation is not taken into account here. An improvement of the model is under development to introduce these phenomena in the propulsion plant simulator.
For this condition-based maintenance application, let us consider the propulsion system identified by the following parameters:
1. Ship speed v: this parameter allows taking into account different GT load conditions. The authors choose to consider different probability density functions (PDFs) for the speed distribution, as there are three variants of this kind of naval vessel: the Anti-Submarine Warfare (ASW) variant, the General-Purpose (GP) variant and the Anti-Aircraft Warfare (AAW) variant. In particular, three different probability density functions have been taken into account: a uniform PDF for the GP variant and a bimodal PDF for the AAW and ASW variants; a further analysis has been performed by means of a unimodal distribution for the cruising speed.
2. GT Compressor degradation coefficient kMc: this parameter describes the reduction, over service hours, of the numerical values of the airflow rate Mc and of the isentropic efficiency ηc.
3. Gas Turbine degradation coefficient kMt: this parameter describes the gas flow rate reduction factor over service hours.
In Table 1 the main simulation outputs are reported. This subset of the model's outputs corresponds to the same quantities that the automation system installed onboard can acquire and store.
As far as the numerical model is concerned, it is worth noting that the ship dynamics part of the model has been validated and fine-tuned by comparing the simulation results with sea trial campaign data, measured during the full-scale test campaign of the vessel, as reported in (36).
1. during the training phase, a set of data is used to induce a model that best fits them, according to some criteria;
2. then, the trained model is deployed in the real application during the feed-forward phase.
In this paper, the authors focus on the problem of identifying the decay trend in Gas Turbine performance, when GTs are used for naval propulsion. In this sense, a regression problem must be solved, where the decay trend over time must be estimated based on data collected from the GTs and, more broadly speaking, from the vessel. In particular, in the conventional regression framework in ML (20; 37; 38), a set of training data X = \{(x_i, y_i)\}, i = 1, \ldots, n, is considered, where x_i = [x_i^{(1)}, \ldots, x_i^{(d)}]^T \in \mathbb{R}^d and y_i \in \mathbb{R}. As the authors are targeting a regression problem, the purpose is to find the best approximating function h(x), where h : \mathbb{R}^d \to \mathbb{R}, which should be close (in some sense) to the unknown target function f(x).
During the training phase, the quality of the learned regressor h(x) is usually measured according to a loss function
`(h(x), y) (39), which calculates the discrepancy between the true output y and the estimated one. The empirical error then
computes the average discrepancy per sample, reported by a model on the training patterns:
\hat{L}_n(h) = \frac{1}{n} \sum_{i=1}^{n} \ell(h(x_i), y_i). \qquad (10)
A simple criterion for selecting the final model during the training phase could then consist in simply choosing the approximating function that minimizes the empirical error L̂n(h): this approach is known as Empirical Risk Minimization (ERM) (40). However, ERM is usually avoided in ML as it leads to severe overfitting of the model on the training dataset: as a matter of fact, in this case the training process chooses a model complicated enough to perfectly describe all the training samples (including the noise which afflicts them). In other words, ERM implies memorization of data rather than learning from them.
A more effective approach consists in minimizing a cost function implementing a trade-off between accuracy on the training data and a measure of the complexity of the selected approximating function (41; 42)¹:

h^* : \min_h \; \hat{L}_n(h) + \lambda \, C(h). \qquad (11)
In other words, the best approximating function h* is chosen as the one that is complicated enough to learn from data without overfitting them. In particular, C(·) is a complexity measure: depending on the exploited ML approach, different measures are realized². Instead, λ is a hyperparameter that must be set a priori and is not obtained as an output of the optimization procedure: it regulates the trade-off between the overfitting tendency, related to the minimization of the empirical error, and the underfitting tendency, related to the minimization of C(·). The optimal value of λ is problem-dependent, and tuning this hyperparameter is a non-trivial task: the authors will devote a subsection to this purpose in the following.
Note that, when monitoring the decay of complex environments and systems as in CBM applications, the authors could have to deal with Multiple Output (MO) problems instead of the Single Output (SO) ones considered so far: in the MO framework, the objective value of the decay is an array y_i ∈ R^m, with m > 1. Although several approaches, such as those based on Multi-Task Learning (44) or Transfer Learning (45), can be applied to training models in the MO framework, the most effective solution often consists in decomposing the MO dataset into m SO problems (46; 47). In addition to simplifying the presentation of the ML method, this approach enables grasping the phenomena characterizing the decay of components for CBM purposes, whenever the considered outputs are not very strongly correlated with each other: as this hypothesis sounds reasonable in the considered application, the authors opt for following this path.
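The MO-to-SO decomposition described above can be sketched in a few lines of Python; the function name and the toy values are illustrative, not taken from the paper's dataset.

```python
def decompose_mo(X, Y):
    """Split a Multiple Output dataset (y_i in R^m) into m Single Output problems."""
    m = len(Y[0])
    # One SO problem per output component: same inputs, j-th column of the targets.
    return [(X, [y[j] for y in Y]) for j in range(m)]

# Toy data: two samples, m = 2 outputs (e.g. the two decay coefficients).
X = [[1.0, 2.0], [3.0, 4.0]]
Y = [[0.98, 0.99], [0.95, 0.97]]
so_problems = decompose_mo(X, Y)
print(len(so_problems))  # 2: one SO regression problem per output
```

Each resulting SO problem can then be handled independently by the regression framework described above.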
Both methods learn approximating functions of the form h(x) = w^T \phi(x), where a non-linear mapping \phi : \mathbb{R}^d \to \mathbb{R}^D, with D \gg d, is applied so that non-linearity is pursued while still coping with linear models.
For RLS, Problem (11) is configured as follows. The complexity of the approximating function is measured as

C(h) = \|w\|_2^2, \qquad (13)

i.e. the squared Euclidean norm of the set of weights describing the regressor, which is a quite standard complexity measure in ML (e.g. (20)). Regarding the loss function, the Mean Squared Error (MSE) loss is adopted:
\hat{L}_n(h) = \frac{1}{n} \sum_{i=1}^{n} \ell(h(x_i), y_i) = \frac{1}{n} \sum_{i=1}^{n} \left[ h(x_i) - y_i \right]^2. \qquad (14)

The resulting RLS training problem is then:

w^* : \min_w \; \frac{1}{n} \sum_{i=1}^{n} \left[ w^T \phi(x_i) - y_i \right]^2 + \lambda \|w\|_2^2. \qquad (15)
By exploiting the Representer Theorem (50), the solution h∗ of the RLS Problem (15) can be expressed as a linear
combination of the samples projected in the space defined by φ:
h^*(x) = \sum_{i=1}^{n} \alpha_i \, \phi(x_i)^T \phi(x). \qquad (16)
It is worth underlining that, according to the kernel trick (51; 37), it is possible to reformulate h∗ (x) without an explicit
knowledge of φ by using a proper kernel function K(xi , x) = φ(xi )T φ(x):
h^*(x) = \sum_{i=1}^{n} \alpha_i \, K(x_i, x). \qquad (17)
Several kernel functions can be retrieved in literature (51; 37), but the Gaussian kernel is often used as it enables learning
every possible function (52):
K(x_i, x_j) = e^{-\gamma \|x_i - x_j\|_2^2}, \qquad (18)
where γ is a hyperparameter, that must be aprioristically tuned (analogously to λ). In particular, for small values of γ,
simpler functions are realized (note that for γ → 0 the authors derive a linear regressor); instead, if large values of γ are
employed, more complicated functions can be realized.
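The effect of γ in Eq. (18) can be illustrated with a few lines of Python (a sketch; the sample points are arbitrary):

```python
import math

def gaussian_kernel(xi, xj, gamma):
    """Gaussian (RBF) kernel of Eq. (18): K(xi, xj) = exp(-gamma * ||xi - xj||^2)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(xi, xj))
    return math.exp(-gamma * sq_dist)

x1, x2 = [0.0, 0.0], [1.0, 1.0]        # two illustrative points
# Small gamma: all points look alike (kernel ~ 1 everywhere -> simpler functions).
print(gaussian_kernel(x1, x2, 1e-6))   # close to 1
# Large gamma: distinct points become nearly orthogonal -> more complex functions.
print(gaussian_kernel(x1, x2, 100.0))  # close to 0
```

In other words, γ acts as a resolution parameter: it controls how quickly similarity decays with distance in the input space.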
As a final issue, the RLS Problem (15) can be reformulated by exploiting kernels:
\alpha^* : \min_\alpha \; \frac{1}{n} \sum_{i=1}^{n} \left[ \sum_{j=1}^{n} \alpha_j K(x_j, x_i) - y_i \right]^2 + \lambda \sum_{i=1}^{n} \sum_{j=1}^{n} \alpha_i \alpha_j K(x_j, x_i). \qquad (19)
Let us define y = [y_1, \ldots, y_n]^T, \alpha = [\alpha_1, \ldots, \alpha_n]^T and the matrix K such that K_{i,j} = K_{j,i} = K(x_j, x_i); then, denoting by I \in \mathbb{R}^{n \times n} the identity matrix, a matrix-based formulation of Problem (19) can be obtained:

\alpha^* : \min_\alpha \; \frac{1}{n} \|K\alpha - y\|_2^2 + \lambda \, \alpha^T K \alpha. \qquad (20)
By setting the derivative with respect to α equal to zero, α* can be found by solving:
(K + nλI) α∗ = y, (21)
that is a linear system for which effective solvers have been developed throughout the years, allowing to cope with even
very large sets of training data (53).
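The whole RLS training and feed-forward pipeline of Eqns. (17)–(21) can be sketched as follows; a dedicated solver would be used in practice for large n, while here a plain Gaussian elimination suffices for illustration (the toy data and hyperparameter values are arbitrary):

```python
import math

def rbf(xi, xj, gamma):
    """Gaussian kernel of Eq. (18)."""
    return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(xi, xj)))

def solve(A, b):
    """Solve A t = b by Gauss-Jordan elimination with partial pivoting (small dense A)."""
    n = len(b)
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(n):
            if r != c and M[r][c] != 0.0:
                f = M[r][c] / M[c][c]
                M[r] = [mr - f * mc for mr, mc in zip(M[r], M[c])]
    return [M[i][n] / M[i][i] for i in range(n)]

def rls_fit(X, y, lam, gamma):
    """Training phase: solve (K + n*lam*I) alpha = y, as in Eq. (21)."""
    n = len(X)
    K = [[rbf(X[i], X[j], gamma) for j in range(n)] for i in range(n)]
    A = [[K[i][j] + (n * lam if i == j else 0.0) for j in range(n)] for i in range(n)]
    return solve(A, y)

def rls_predict(alpha, X, x, gamma):
    """Feed-forward phase: h*(x) = sum_i alpha_i K(x_i, x), as in Eq. (17)."""
    return sum(a * rbf(xi, x, gamma) for a, xi in zip(alpha, X))

# Toy usage: fit a small 1-D dataset and check the feed-forward phase.
X = [[0.0], [1.0], [2.0], [3.0]]
y = [0.0, 1.0, 4.0, 9.0]
alpha = rls_fit(X, y, lam=1e-6, gamma=1.0)
print(rls_predict(alpha, X, [1.0], gamma=1.0))  # close to 1.0 for small lambda
```

With a small λ the model nearly interpolates the training targets; increasing λ biases the solution towards smoother (simpler) functions, as discussed for Problem (11).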
For SVR (Support Vector Regression), the ε-insensitive loss is adopted instead:

\ell(h(x), y) = |h(x) - y|_\varepsilon, \qquad (22)

where |\cdot|_\varepsilon = \max(0, |\cdot| - \varepsilon). On the other hand, the same complexity measure exploited by RLS is used by SVR. As a consequence, Problem (11) can be reformulated as follows:
w^* : \min_{w, \hat{\xi}, \check{\xi}} \; \frac{1}{2} \|w\|_2^2 + C \sum_{i=1}^{n} \left( \hat{\xi}_i + \check{\xi}_i \right) \qquad (23)

\text{s.t.} \quad w^T \phi(x_i) - y_i \le \varepsilon + \hat{\xi}_i, \quad \forall i \in \{1, \ldots, n\} \qquad (24)

\qquad\;\, y_i - w^T \phi(x_i) \le \varepsilon + \check{\xi}_i, \quad \hat{\xi}_i, \check{\xi}_i \ge 0, \quad \forall i \in \{1, \ldots, n\} \qquad (25)
where C = 1/(λn) is conventionally adopted as hyperparameter in place of λ. The two constraints of Eqns. (24) and (25) are added to force the learning of approximating functions which are close to the target one: note, in fact, that ξ̂_i, ξ̌_i are slack variables.
³ Note that the originally proposed function for SVR is h(x) = w^T φ(x) + b, where b ∈ R is a bias term. However, the introduction of the bias would unnecessarily complicate the analysis of this case: in fact, it is possible to prove that the theoretical framework of SVM is maintained also with b = 0, as far as a positive definite kernel, like the Gaussian one of Eq. (18), is used (54).
Note that Problem (23) is convex (20). Nevertheless, for computational reasons, the dual formulation of the SVR
problem above is usually adopted, which can be derived by introducing the Lagrange Multipliers α̂, α̌, η̂, η̌ (55):
L\!\left(w, \hat{\xi}, \check{\xi}, \hat{\alpha}, \check{\alpha}, \hat{\eta}, \check{\eta}\right) = \frac{1}{2} \|w\|_2^2 + C \sum_{i=1}^{n} \left( \hat{\xi}_i + \check{\xi}_i \right) \qquad (26)

\qquad + \sum_{i=1}^{n} \hat{\alpha}_i \left[ w^T \phi(x_i) - y_i - \varepsilon - \hat{\xi}_i \right]

\qquad + \sum_{i=1}^{n} \check{\alpha}_i \left[ y_i - w^T \phi(x_i) - \varepsilon - \check{\xi}_i \right]

\qquad - \sum_{i=1}^{n} \hat{\eta}_i \hat{\xi}_i - \sum_{i=1}^{n} \check{\eta}_i \check{\xi}_i
By exploiting these relationships in Eq. (26), the dual formulation of Problem (23) is derived, for which several efficient solvers have been developed in recent years (56; 57):
\hat{\alpha}^*, \check{\alpha}^* : \min_{\hat{\alpha}, \check{\alpha}} \; \frac{1}{2} \sum_{i=1}^{n} \sum_{j=1}^{n} (\check{\alpha}_i - \hat{\alpha}_i)(\check{\alpha}_j - \hat{\alpha}_j) \, \phi(x_i)^T \phi(x_j) + \sum_{i=1}^{n} \hat{\alpha}_i \left[ y_i + \varepsilon \right] - \sum_{i=1}^{n} \check{\alpha}_i \left[ y_i - \varepsilon \right] \qquad (36)

\text{s.t.} \quad 0 \le \check{\alpha}_i \le C, \quad 0 \le \hat{\alpha}_i \le C, \quad \forall i \in \{1, \ldots, n\}
The trained approximating function h*(x) can be expressed through α̂*, α̌*:

h^*(x) = \sum_{i=1}^{n} \left( \check{\alpha}_i^* - \hat{\alpha}_i^* \right) \phi(x_i)^T \phi(x). \qquad (37)
It is worth noting that, if |w^T φ(x_i) − y_i| ≤ ε, it derives from Eqns. (33) and (34) that α̂_i = α̌_i = 0: as a consequence, the solution of SVR is usually sparse, i.e. it can be described by exploiting only a limited subset of parameters (and, obviously, of training data). In particular, the fewer data lie outside the ε-insensitive tube, the sparser the SVR solution will be (20; 37; 38). In other words, let S_sv = {i : α̂_i ≠ 0 ∨ α̌_i ≠ 0} be the set of indexes of the support vectors, with cardinality |S_sv| ≤ n (and, usually, |S_sv| ≪ n); then, in order to compute h*(x), only the data indexed by S_sv need to be considered:
h^*(x) = \sum_{i \in S_{sv}} \left( \check{\alpha}_i^* - \hat{\alpha}_i^* \right) \phi(x_i)^T \phi(x). \qquad (38)
This reduces the computational burden with respect to RLS, for which the solution is dense (i.e. almost all α_i ≠ 0 in Eq. (21)).
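The sparsity of Eq. (38) can be illustrated directly: with hypothetical multiplier values (not obtained from an actual SVR training run), the prediction only needs the samples whose multipliers are non-zero, and coincides with the dense sum of Eq. (37).

```python
import math

def rbf(xi, xj, gamma):
    """Gaussian kernel of Eq. (18)."""
    return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(xi, xj)))

def svr_predict(alpha_hat, alpha_check, X, x, gamma):
    """Eq. (38): sum only over the support vectors S_sv (non-zero multipliers)."""
    return sum((ac - ah) * rbf(xi, x, gamma)
               for ah, ac, xi in zip(alpha_hat, alpha_check, X)
               if ah != 0.0 or ac != 0.0)

# Hypothetical multipliers of a trained SVR: only samples 0 and 3 are support vectors.
X = [[0.0], [1.0], [2.0], [3.0]]
alpha_hat   = [0.2, 0.0, 0.0, 0.0]
alpha_check = [0.0, 0.0, 0.0, 0.5]
sv = [i for i, (ah, ac) in enumerate(zip(alpha_hat, alpha_check))
      if ah != 0.0 or ac != 0.0]
print(sv)  # [0, 3]: the dense sum of Eq. (37) collapses to these two terms
```

Skipping the zero-multiplier terms changes nothing in the result but, when |S_sv| ≪ n, drastically cuts the cost of the feed-forward phase.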
As a final issue, analogously to what was done for RLS, the kernel trick can be exploited in order to reformulate the SVR training problem in matrix form:

\hat{\alpha}^*, \check{\alpha}^* : \min_{\hat{\alpha}, \check{\alpha}} \; \frac{1}{2} \begin{bmatrix} \check{\alpha} \\ \hat{\alpha} \end{bmatrix}^T \begin{bmatrix} K & -K \\ -K & K \end{bmatrix} \begin{bmatrix} \check{\alpha} \\ \hat{\alpha} \end{bmatrix} + \begin{bmatrix} \check{\alpha} \\ \hat{\alpha} \end{bmatrix}^T \begin{bmatrix} -y + \varepsilon \\ y + \varepsilon \end{bmatrix} \qquad (39)

\text{s.t.} \quad 0 \le \check{\alpha}, \hat{\alpha} \le C
Let S = {1, ..., n} be the set of indexes of the samples in X; then, k sets of indexes S^(t), t = 1, ..., k, are created. These subsets are such that:

\bigcup_{t=1}^{k} S^{(t)} = S \quad \wedge \quad S^{(t)} \cap S^{(t')} = \emptyset \;\; \forall t \neq t'. \qquad (40)
The best error for the KCV procedure is initialized to L∗KCV = +∞. Then, for every possible set of values in the grid for the
hyperparameters tuple, (λ, γ) for RLS or (γ, C, ε) for SVR, a KCV stage is run. At the t-th KCV step, the patterns in the
validation set are indexed by S (t) , while the training set consists of the remaining samples, whose indexes are S \ S (t) .
A model is then trained on the training set and its performance is verified on the validation patterns by computing the
empirical error
\hat{L}^{(t)}_{n/k}(h^{(t)}) = \frac{1}{n/k} \sum_{i \in S^{(t)}} \ell\!\left( h^{(t)}(x_i), y_i \right). \qquad (41)
After the k KCV steps are over, the estimated generalization error for the hyperparameters tuple can be computed by averaging the k empirical errors on the validation sets:

L_{KCV} = \frac{1}{k} \sum_{t=1}^{k} \hat{L}^{(t)}_{n/k}(h^{(t)}). \qquad (42)
If the obtained error is lower than the best L∗KCV reported so far, the hyperparameters tuple is saved as potential best
configuration for the learning procedure.
As a final issue, the authors want to point out that, in order to select the best hyperparameters tuple, k models are created for every candidate tuple (one per KCV step). Once the best tuple is found, the final model is trained on the whole set X by running the learning procedure with the hyperparameters set to the identified best values.
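Algorithms 1 and 2 below share the same KCV skeleton, which can be sketched as follows; here a toy one-dimensional ridge regressor stands in for the RLS/SVR learning procedure, and the grid values are arbitrary:

```python
def kfold_indices(n, k):
    """Split the sample indexes S = {0, ..., n-1} into k disjoint folds S^(t)."""
    return [list(range(t, n, k)) for t in range(k)]

def ridge_1d(X, y, lam):
    """Toy stand-in for the learning procedure: 1-D ridge, w = sum(xy)/(sum(x^2)+lam)."""
    w = sum(x * t for x, t in zip(X, y)) / (sum(x * x for x in X) + lam)
    return lambda x: w * x

def kcv_select(X, y, k, grid):
    """KCV model selection: return the hyperparameter with the lowest validation MSE."""
    folds = kfold_indices(len(X), k)
    best, best_err = None, float("inf")
    for lam in grid:                       # loop over the hyperparameter grid
        err = 0.0
        for t in range(k):                 # t-th KCV step
            train = [i for f in range(k) if f != t for i in folds[f]]
            h = ridge_1d([X[i] for i in train], [y[i] for i in train], lam)
            val = folds[t]                 # validation patterns indexed by S^(t)
            err += sum((h(X[i]) - y[i]) ** 2 for i in val) / len(val)
        err /= k                           # average over the k folds, as in Eq. (42)
        if err < best_err:
            best, best_err = lam, err
    return best

X = [float(i) for i in range(1, 13)]
y = [2.0 * x for x in X]                   # noiseless linear data
best_lam = kcv_select(X, y, k=3, grid=[1e-6, 1.0, 100.0])
print(best_lam)  # the least-regularized value wins on noiseless linear data
```

Once the best value is found, the final model is retrained on the whole dataset with that value, as in line 17 of the algorithms.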
4. Dataset Creation
In this paper, the authors' aim is to develop a predictive model for CBM purposes with reference to the Gas Turbines mounted on a Frigate naval vessel. Since the COmbined Diesel eLectric And Gas (CODLAG) propulsion plant mounted on the Frigate is of recent installation, no real-world data gathered from the ship's automation system are currently available which could help studying and validating a Machine Learning solution to the Condition-Based Maintenance needs of this asset. Nevertheless, much effort has been spent in realizing the physical model and, consequently, an accurate simulator of the system, as introduced in Section 2 as well as in previous works (23; 24). As a consequence, the model of Section 2 can be exploited in order to create a virtual (but ground-truth-complying (23; 24)) experimental dataset, to be used for ML modeling purposes.
As described in Section 2.3, the propulsion system is depicted through three parameters:
1. Speed: this parameter is controlled via the control lever. The latter can only assume a finite number of positions lp_k, k ∈ {0, ..., 9}, which in turn correspond to a finite set of possible configurations for fuel flow and blade position. Each set point is designed to reach a desired speed v_k, k ∈ {0, ..., 9}:
needed by the learning procedure. Nevertheless, some practical suggestions can be retrieved in literature (e.g. (66)), which help finding good trade-offs
between performance and computational time.
Algorithm 1 The k-Fold Cross Validation (KCV) model selection procedure for RLS.
Require: X, k, a grid G_{λ,γ} for every hyperparameter involved
1: L*_KCV = +∞
2: (λ*, γ*) = (−∞, −∞)
3: S^(t) = split(S), t = 1, ..., k
4: for all λ^(p) ∈ G_λ, γ^(q) ∈ G_γ do
5:   Let (λ^(p), γ^(q)) be the current hyperparameters tuple
6:   L_KCV = 0
7:   for t = 1 → k do
8:     Train h^(t) with RLS Problem (21) on the set of data indexed by S \ S^(t), by using the hyperparameters tuple (λ^(p), γ^(q))
9:     Compute L̂^(t)_{n/k}(h^(t)) according to Eq. (41)
10:  end for
11:  Compute L_KCV according to Eq. (42)
12:  if L_KCV < L*_KCV then
13:    L*_KCV = L_KCV
14:    (λ*, γ*) = (λ^(p), γ^(q))
15:  end if
16: end for
17: Train h* with RLS Problem (21) by using the whole set X and the tuple (λ*, γ*)
18: return h*
Algorithm 2 The k-Fold Cross Validation (KCV) model selection procedure for SVR.
Require: X, k, a grid G_{γ,C,ε} for every hyperparameter involved
1: L*_KCV = +∞
2: (C*, γ*, ε*) = (−∞, −∞, −∞)
3: S^(t) = split(S), t = 1, ..., k
4: for all C^(p) ∈ G_C, γ^(q) ∈ G_γ, ε^(s) ∈ G_ε do
5:   Let (C^(p), γ^(q), ε^(s)) be the current hyperparameters tuple
6:   L_KCV = 0
7:   for t = 1 → k do
8:     Train h^(t) with SVR Problem (39) on the set of data indexed by S \ S^(t), by using the hyperparameters tuple (C^(p), γ^(q), ε^(s))
9:     Compute L̂^(t)_{n/k}(h^(t)) according to Eq. (41)
10:  end for
11:  Compute L_KCV according to Eq. (42)
12:  if L_KCV < L*_KCV then
13:    L*_KCV = L_KCV
14:    (C*, γ*, ε*) = (C^(p), γ^(q), ε^(s))
15:  end if
16: end for
17: Train h* with SVR Problem (39) by using the whole set X and the tuple (C*, γ*, ε*)
18: return h*
Note that, if the transients are not taken into account, lp_i and v_i are deterministically related by a linear law. In the presented analysis, the transients between different speeds are not considered since, in the operational life of a ship, these phenomena cover a negligible time with respect to steady states.
2. Compressor decay: the decay rate has been sampled through a uniform grid with resolution 10^-3, so as to reach a good granularity when representing the GT Compressor decay state:
3. Turbine decay: analogously to the case of the compressor, the decay rate is sampled through a uniform grid with resolution 10^-3. The GT decay state can then be represented as:
Once these quantities are fixed, several heterogeneous measures, related to the propulsion plant and listed in Table 1, can be derived by exploiting the simulator, precisely describing the state of the system. The purpose of the analysis is to train a model on such measures, in order to estimate kMc and kMt from them. It is worth noting that the measures derived by the simulator can also be obtained from the automation system equipping the ship, and could thus be used in a real-world application scenario as well.
Given the above premises, the evolution of the system can be exhaustively explored by simulating all its possible states.
The space of possible states is described via the triples:
since:
It is then possible to run the simulation for the 9 · 51 · 26 = 11934 conditions considered in the Matlab® Simulink® environment, one for each triple [lp, kMc , kMt ]i . Every model is run until the steady state is reached: then, the 16 measures listed in Table 1 can be extracted from the simulation and finally organized into a dataset. The latter defines a conventional Multiple Output (MO) problem, where every sample is xi ∈ R16 and every output is y i ∈ R2 , as it includes the values of kMc and kMt .
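For illustration, the 9 · 51 · 26 grid of simulated conditions can be enumerated as follows. The numerical ranges for lp, kMc and kMt below are hypothetical choices, picked only to reproduce the grid cardinalities; the actual bounds are fixed by the simulator setup.

```python
import numpy as np
from itertools import product

# Hypothetical ranges, chosen only to reproduce the 9 x 51 x 26 grid size;
# the real bounds come from the simulator configuration.
lp_grid  = np.arange(1, 10)                             # 9 lever positions
kMc_grid = np.round(np.arange(0.950, 1.0005, 1e-3), 3)  # 51 compressor decay states
kMt_grid = np.round(np.arange(0.975, 1.0005, 1e-3), 3)  # 26 turbine decay states

# every triple [lp, kMc, kMt] defines one steady-state simulation run
triples = list(product(lp_grid, kMc_grid, kMt_grid))
print(len(triples))  # 11934 = 9 * 51 * 26
```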
The authors intend to publish this dataset for free use by the scientific community through the well-known UCI repository (67). While waiting for final approval for publication, news and preliminary releases of the dataset will soon be available at the dedicated webpage http://cbm.smartlab.ws.
In order to train an effective model and to test its performance in a real-world-like scenario, the dataset X of available
experiments is divided into two subsets:
• Xtrain , consisting of ntrain samples, which is used for the training phase (thus including model selection and creation)
• Xtest = X \ Xtrain , consisting of ntest = n − ntrain data. The feedforward phase of the model h∗ , trained on Xtrain , is run
on these samples analogously to an application setting of the regressor. As the target outputs are available for Xtest , it
is always possible to compute the error that h∗ would obtain in the application scenario, which is defined as:
\hat{L}^{test}_{n_{test}}(h^*) = \frac{1}{n_{test}} \sum_{x_i \in X_{test}} \ell(h^*(x_i), y_i) .    (56)
In particular, the authors opt for exploiting the relative error (68; 69) as loss function, which can be expressed as:
\ell(h^*(x_i), y_i) = \left| \frac{h^*(x_i) - y_i}{y_i} \right|    (57)
The splitting of the dataset into training and test sets is a key point to deal with. In particular, two aspects must be considered:
(i) the choice of ntrain , i.e. the amount of data for training purposes; (ii) the way data are picked from X to be included into
Xtrain . Both points can be analyzed from the modeling and application point of view. Regarding issue (i) above:
• From the Machine Learning point of view, choosing the proper number of data for training and test purposes has a non-negligible impact on the quality of the final model: if ntrain is too large, the model will generally be more accurate, but the test assessment probably will not be statistically relevant; on the contrary, if too many samples are reserved for testing purposes, the regressor could become too inaccurate, being trained on a limited knowledge base. Usually, a feasible approach consists in performing some trials, using different increasing values of ntrain , in order to find the best trade-off.
• From the naval application point of view, it is worth underlining that the role of the simulator would be played by an expert operator, who checks the decay state of the component and reports any measures, coming from the ship's automation, into a database. The purpose of the ML predictive approach for CBM, targeted by this paper, is to support the expert in such monitoring operations. Determining the value of ntrain is equivalent to quantifying the number of manual inspections the expert has to complete before the model can be trained, or, in other words, before he is supported in his activities by the ML predictive model for CBM. Too large values of ntrain would require a remarkable effort by the expert and/or long waiting times before operating the model, reducing the potential impact of such technology in the maritime domain.
• From the ML point of view, data can be picked either in a random way or through a more rigorous procedure, which exploits some information on the observed phenomenon. Random selection is obviously easier to implement, but usually leads to poorer quality models, as the space of possible values for measures and decay rates is not exhaustively explored. Approaches that exploit the topological representation of the data usually enable deriving more effective regressors at the expense of a small supplementary effort: this is the case, for example, of the Nearly Homogeneous Sampling method (70), which is exploited in the experiments.
• The way data are sampled in this virtual framework maps into the protocol for data gathering for maintenance purposes
in the real-world application. In other words, a random sampling corresponds to an expert operator that checks for
the health status of the vessel components without following any predetermined and/or rigorous time-schedule and/or
procedure; on the contrary, using more sophisticated sampling techniques is equivalent to an operator, who follows
a planned schedule on predetermined operational conditions for monitoring the vessel. Though this second scenario
seems to be more plausible, both settings will be analyzed in the following.
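The two splitting options above can be sketched as follows. The strided scheme is only a crude, hypothetical stand-in for a structured strategy such as Nearly Homogeneous Sampling (70), which is not reproduced here.

```python
import numpy as np

rng = np.random.default_rng(0)

def split_random(n, n_train):
    # unscheduled inspections: pick the training indices at random
    idx = rng.permutation(n)
    return idx[:n_train], idx[n_train:]

def split_strided(n, n_train):
    # planned schedule: evenly spaced samples over the ordered conditions
    # (a crude stand-in for Nearly Homogeneous Sampling, not the real method)
    tr = np.unique(np.linspace(0, n - 1, n_train).astype(int))
    te = np.setdiff1d(np.arange(n), tr)
    return tr, te
```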
As described in Section 2.3, the Frigate can be characterized by different mission profiles (AAW, ASW, GP), which correspond to different operational profiles: they are distinguished by the different probability density functions used for depicting the distribution of the speed. Such information cannot be straightforwardly retrieved from the dataset, created as described in the previous section. Nevertheless, since the whole space of potential operating conditions is mapped, it is always possible to resample the dataset according to different speed distributions, designed with reference to the different mission profiles. Unfortunately, such profiles are confidential information, not available in the public literature: however, it is always possible to design some reasonable distributions, based on the authors' expertise. In particular, the following scenarios will be considered:
1. a uniform distribution (UD) (Figure 6), characterizing a general purpose mission of the Frigate;
2. a bimodal distribution (BD) (Figure 7), where the two modes are centered, respectively, on the cruising speed and on
the maximum speed (while other speed values are neglected);
3. a monomodal distribution (MD) (Figure 8), where the mode is centered over the cruise speed and the time, spent by the
vessel at different speed values, is considered negligible.
For the sake of completeness, it is worth noting that the ASW mission profile can be characterized by a trimodal distribution: in this case, another mode is introduced in addition to the ones of the BD profile, centered on the optimal speed for antisubmarine operations. However, in this analysis, this last case is not taken into account, since the vessel exploits the electrical propulsion mode, which is not contemplated by the Matlab® Simulink® model, for reaching the first peak (71; 72) in place of the Gas Turbine mode. Consequently, the ASW mission profile can still be reasonably represented by BD, as introduced above.
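The resampling step described above can be sketched as follows; the function name and the target probabilities are hypothetical, standing in for the UD/BD/MD profiles of Figures 6–8, and the sketch assumes (as in the dataset of Section 4) the same number of samples per speed value.

```python
import numpy as np

rng = np.random.default_rng(0)

def resample_by_speed(X, speeds, target_pdf, n):
    """Draw n rows of X so that the speed column follows a target
    distribution (e.g. uniform for UD, peaked on the cruise and maximum
    speeds for BD). `target_pdf` maps each distinct speed value to its
    desired probability mass; speeds absent from it get zero weight."""
    w = np.array([target_pdf.get(s, 0.0) for s in speeds], dtype=float)
    w /= w.sum()
    idx = rng.choice(len(X), size=n, replace=True, p=w)
    return X[idx]
```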
Table 2 shows the results for the experiments where only the GT compressor decay is taken into account. In particular, the error L̂test_ntest(h∗) on the independent test set is reported. The error is computed for different experimental setups, listed below:
• different experiments are performed by varying the amount of patterns used for training purposes (refer to issue (i),
introduced above). In particular, ntrain ∈ {10, 25, 50, 100, 200} were tested.
Fig. 6. Uniform distribution (UD), used as probability density function of the speed (percentage with respect to the design speed, equal to 15 Knots) of the Frigate.
Fig. 7. Bimodal distribution (BD), used as probability density function of the speed (percentage with respect to the design speed, equal to 15 Knots) of the Frigate.
Fig. 8. Monomodal distribution (MD), used as probability density function of the speed (percentage with respect to the design speed, equal to 15 Knots) of the Frigate.
• the experiments have been replicated for different values of ntrain , by opting for different sampling strategies (refer to issue (ii) above). In particular, a random sampling and the Nearly Homogeneous Sampling (70) strategies are exploited. Note, moreover, that the splitting procedure for creating Xtrain and Xtest from X is replicated 30 times for each experiment, so as to derive statistically relevant results
• the experiments were replicated by varying the probability distribution of the vessel speed according to the three distributions previously introduced (UD, BD, MD)
• the RLS and SVR learning procedures, presented in Algorithms 1 and 2 respectively, are used to derive the model h∗ . In particular, the authors set:
– RLS: the grids for hyperparameter tuning purposes are fixed to Gλ ∈ [10−6 , 104 ] and Gγ ∈ [10−6 , 104 ], consisting of 30 values equally spaced in logarithmic scale
– SVR: the grids are fixed to GC ∈ [10−6 , 104 ] and Gγ ∈ [10−6 , 104 ], consisting of 30 values equally spaced in logarithmic scale. Note that SVR requires that the hyperparameter ε is tuned as well. Nevertheless, this hyperparameter is application-dependent and can usually be aprioristically set to a fixed value without compromising performance, while reducing the computational burden. The value of ε is set to ε = 10−3 , i.e. equal to the granularity considered when modeling the system behavior.
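The hyperparameter grids above can be generated, for instance, as 30 log-spaced values (a direct transcription of the stated ranges, assuming NumPy):

```python
import numpy as np

# 30 values equally spaced in logarithmic scale over [1e-6, 1e4],
# used for G_lambda, G_gamma and G_C alike; epsilon stays fixed at 1e-3
grid = np.logspace(-6, 4, 30)
eps = 1e-3
```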
Table 2 presents the results for this set of experiments: in particular, the average error and its variance, as recorded over the different performed analyses, are shown. From them, it is worthwhile deriving that:
• Broadly speaking, SVR is characterized by better performance than RLS: as a matter of fact, the SVR average error is smaller, and the variance values for SVR show that this approach is more stable than RLS as the training set is changed (e.g. because of different speed distributions or different sampling strategies)
• Models derived from datasets sampled according to the Nearly Homogeneous Sampling (70) approach are usually more accurate. However, the difference with respect to the performance of regressors trained on randomly sampled training sets is negligible: such a result provides precious information for designing the procedures for CBM data gathering on the Frigate
• On equal terms (e.g. value of ntrain ), if the uniform distribution (UD) of the vessel speed is contemplated, which takes into account a broad range of potential speed values, the forecasting is less accurate than in those experiments where more concentrated distributions are exploited (e.g. the monomodal and bimodal distributions, MD and BD respectively). This is expected: in fact, the use of MD and BD simplifies the problem by neglecting several speeds that are considered, on the contrary, by UD. This highlights the importance of correctly and informatively depicting the speed distribution of the vessel, as a basic condition for deriving more effective models with a smaller amount of gathered data
• Finally, ntrain ≈ 50 samples are usually enough, in the proposed approaches, to derive an accurate model for CBM
purposes, in this preliminary study framework.
The experimental setup is unchanged from the one depicted in the previous section; however, as a very partial exception, the case ntrain = 200 is neglected due to the more limited amount of available data (234 vs 459). In fact, when ntrain = 200, only 60 samples are available for testing purposes, which could be too few to target a reliable analysis.
Table 3 shows the results, obtained from the performed tests: the quantities, presented in the table, are analogous to the
ones of Table 2, and so are the conclusions that can be drawn from them.
5.3. Modeling and predicting both the GT compressor and the GT turbine decays
As a final issue, the case where both decay factors are taken into account is analyzed, thus depicting a Multiple Output (MO) problem with m = 2 (refer to Section 3 for more details). By exploiting the approach proposed by (46; 47), i.e. by decomposing the MO problem into m Single Output (SO) sub-problems, models to estimate kMc and kMt in the general CBM scenario are created. The objective is to verify whether the models are able to accurately track variations in GT parameters, due to decay, even when two components are degrading their performance (and not necessarily in a synchronous fashion).
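The MO-to-SO decomposition of (46; 47) can be sketched minimally as follows, with scikit-learn's SVR as an example regressor; the function names are illustrative.

```python
import numpy as np
from sklearn.svm import SVR

def fit_mo_as_so(X, Y, **svr_kwargs):
    # decompose the m-output problem into m independent single-output fits
    return [SVR(**svr_kwargs).fit(X, Y[:, j]) for j in range(Y.shape[1])]

def predict_mo(models, X):
    # stack the m single-output predictions back into an (n, m) array
    return np.column_stack([m.predict(X) for m in models])
```

Here Y would hold the two columns kMc and kMt (m = 2).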
The experimental setup is kept unmodified with reference to the one applied in the previous experiments. Clearly, at this stage, the authors considered the whole dataset generated according to the considerations reported in Section 4: having 11934 samples available, larger cardinalities for the training set are tested, namely ntrain ∈ {10, 25, 50, 100, 250, 500}. The results are shown in Tables 4 and 5, which refer to the GT compressor and the GT turbine decay, respectively. Note that:
• In spite of the simultaneous variation of two decay factors, the trend of the error achieved by the trained models is qualitatively the same as the one observed when contemplating one decay at a time. This validates the choice of decomposing the MO problem into separate SO ones
• Nevertheless, a slightly larger number of training samples is generally needed to create effective models. As a matter of fact, around 100 data are typically required to attain low error rates, rising up to 100–200 when a UD is used to describe the vessel speed
Table 2. Test error rates (in percentage) when only the Gas Turbine compressor decay is considered. Each cell reports L̂test_ntest(h∗).

Random Sampling
| ntrain | UD, RLS      | UD, SVR      | BD, RLS     | BD, SVR     | MD, RLS     | MD, SVR     |
| 10     | 30.23 ± 5.57 | 27.57 ± 5.18 | 5.36 ± 1.95 | 4.81 ± 1.70 | 0.96 ± 0.02 | 0.87 ± 0.02 |
| 25     | 12.39 ± 2.50 | 11.16 ± 2.28 | 1.08 ± 0.22 | 0.98 ± 0.20 | 0.44 ± 0.00 | 0.39 ± 0.00 |
| 50     | 5.13 ± 0.57  | 4.67 ± 0.52  | 0.34 ± 0.01 | 0.31 ± 0.01 | 0.28 ± 0.00 | 0.24 ± 0.00 |
| 100    | 1.59 ± 0.15  | 1.42 ± 0.13  | 0.17 ± 0.00 | 0.15 ± 0.00 | 0.13 ± 0.00 | 0.12 ± 0.00 |
| 200    | 0.50 ± 0.01  | 0.45 ± 0.01  | 0.08 ± 0.00 | 0.08 ± 0.00 | 0.04 ± 0.00 | 0.04 ± 0.00 |

Nearly Homogeneous Sampling
| ntrain | UD, RLS      | UD, SVR      | BD, RLS     | BD, SVR     | MD, RLS     | MD, SVR     |
| 10     | 25.96 ± 4.93 | 23.59 ± 4.56 | 4.61 ± 1.60 | 4.28 ± 1.43 | 0.83 ± 0.02 | 0.74 ± 0.02 |
| 25     | 10.55 ± 2.16 | 9.35 ± 1.99  | 0.91 ± 0.19 | 0.82 ± 0.17 | 0.37 ± 0.00 | 0.33 ± 0.00 |
| 50     | 4.47 ± 0.49  | 4.10 ± 0.44  | 0.30 ± 0.01 | 0.27 ± 0.01 | 0.23 ± 0.00 | 0.21 ± 0.00 |
| 100    | 1.35 ± 0.13  | 1.20 ± 0.11  | 0.14 ± 0.00 | 0.13 ± 0.00 | 0.12 ± 0.00 | 0.10 ± 0.00 |
| 200    | 0.42 ± 0.01  | 0.37 ± 0.01  | 0.07 ± 0.00 | 0.07 ± 0.00 | 0.04 ± 0.00 | 0.03 ± 0.00 |
• Finally, similar conclusions to the ones, drawn in the previous subsections, can be made.
6. Conclusions
The impact of maintenance costs in the naval domain is remarkable: as a consequence, this motivates the need for designing more effective approaches, for example based on Condition-Based Maintenance (CBM). Relying on the constant monitoring of the data gathered from the field, CBM strategies require that proper Machine Learning (ML) models are implemented, able to highlight potential safety issues and/or intervention requests with a reasonable advance. In this paper, in particular, the authors focussed on an application of CBM to Gas Turbines (GTs) used for naval vessel propulsion, proposing an innovative modeling strategy based on effective ML approaches, targeted towards forecasting the degradation of the propulsion plant over time.
In order to test the proposed techniques, a sophisticated and realistic simulator of a naval propulsion system, mounted on a Frigate characterized by a COmbined Diesel eLectric And Gas (CODLAG) propulsion plant, is exploited to generate a large dataset, which will soon be released to the research community for benchmarking purposes. The proposed approach has been tested on this dataset, showing its effectiveness for CBM purposes in a real-world maritime application, even in the challenging scenario where more than one decay factor is varied over time.
The approach and results shown in this paper represent a first step paving the way towards effective and actionable CBM systems, powered by ML capabilities. Future works will mostly concern the extension of the approach to contemplate the decay of a larger number of components, which drive the maintenance needs of the propulsion plant; moreover, validation with data gathered on-field will also allow verifying the compliance of the estimations with real-world evidence. Such activities will open the door to further evolutions of the proposed CBM approach, so as to predict, with a reasonable advance, potential breakdowns without requiring a preliminary and propaedeutic labeling activity by the operators (i.e. exploiting an unsupervised method (73)).
Table 3. Test error rates (in percentage) when only the Gas Turbine decay is considered. Each cell reports L̂test_ntest(h∗).

Random Sampling
| ntrain | UD, RLS      | UD, SVR      | BD, RLS     | BD, SVR     | MD, RLS     | MD, SVR     |
| 10     | 29.77 ± 5.26 | 27.07 ± 4.84 | 4.14 ± 1.44 | 3.77 ± 1.29 | 0.50 ± 0.00 | 0.45 ± 0.00 |
| 25     | 15.41 ± 4.45 | 13.94 ± 4.08 | 0.40 ± 0.01 | 0.36 ± 0.01 | 0.38 ± 0.00 | 0.34 ± 0.00 |
| 50     | 4.50 ± 1.35  | 4.07 ± 1.24  | 0.28 ± 0.03 | 0.25 ± 0.03 | 0.21 ± 0.00 | 0.18 ± 0.00 |
| 100    | 1.60 ± 0.17  | 1.44 ± 0.16  | 0.11 ± 0.00 | 0.09 ± 0.00 | 0.04 ± 0.00 | 0.03 ± 0.00 |

Nearly Homogeneous Sampling
| ntrain | UD, RLS      | UD, SVR      | BD, RLS     | BD, SVR     | MD, RLS     | MD, SVR     |
| 10     | 25.52 ± 4.60 | 23.25 ± 4.23 | 3.63 ± 1.20 | 3.38 ± 1.12 | 0.43 ± 0.00 | 0.38 ± 0.00 |
| 25     | 13.20 ± 3.88 | 11.67 ± 3.50 | 0.34 ± 0.00 | 0.31 ± 0.00 | 0.32 ± 0.00 | 0.29 ± 0.00 |
| 50     | 3.90 ± 1.18  | 3.57 ± 1.07  | 0.23 ± 0.03 | 0.22 ± 0.02 | 0.17 ± 0.00 | 0.15 ± 0.00 |
| 100    | 1.36 ± 0.15  | 1.22 ± 0.13  | 0.09 ± 0.00 | 0.08 ± 0.00 | 0.03 ± 0.00 | 0.03 ± 0.00 |
Table 4. Test error rate (in percentage) when forecasting the Compressor decay in the Multiple Output setup. Each cell reports L̂test_ntest(h∗).

Random Sampling
| ntrain | UD, RLS      | UD, SVR      | BD, RLS     | BD, SVR     | MD, RLS     | MD, SVR     |
| 10     | 28.95 ± 4.31 | 26.32 ± 4.02 | 6.98 ± 2.34 | 6.20 ± 2.06 | 1.98 ± 0.05 | 1.79 ± 0.05 |
| 25     | 15.80 ± 2.45 | 14.26 ± 2.24 | 1.50 ± 0.12 | 1.38 ± 0.11 | 0.69 ± 0.01 | 0.61 ± 0.00 |
| 50     | 6.21 ± 0.41  | 5.62 ± 0.37  | 0.55 ± 0.01 | 0.49 ± 0.01 | 0.44 ± 0.00 | 0.38 ± 0.00 |
| 100    | 2.72 ± 0.19  | 2.43 ± 0.17  | 0.32 ± 0.00 | 0.29 ± 0.00 | 0.27 ± 0.00 | 0.25 ± 0.00 |
| 250    | 0.78 ± 0.03  | 0.69 ± 0.02  | 0.20 ± 0.00 | 0.18 ± 0.00 | 0.16 ± 0.00 | 0.14 ± 0.00 |
| 500    | 0.40 ± 0.00  | 0.35 ± 0.00  | 0.09 ± 0.00 | 0.08 ± 0.00 | 0.09 ± 0.00 | 0.08 ± 0.00 |

Nearly Homogeneous Sampling
| ntrain | UD, RLS      | UD, SVR      | BD, RLS     | BD, SVR     | MD, RLS     | MD, SVR     |
| 10     | 24.78 ± 3.83 | 22.54 ± 3.48 | 5.97 ± 1.95 | 5.56 ± 1.79 | 1.68 ± 0.04 | 1.51 ± 0.04 |
| 25     | 13.49 ± 2.14 | 12.01 ± 1.93 | 1.30 ± 0.10 | 1.16 ± 0.09 | 0.58 ± 0.00 | 0.53 ± 0.00 |
| 50     | 5.40 ± 0.35  | 4.91 ± 0.32  | 0.46 ± 0.01 | 0.42 ± 0.01 | 0.37 ± 0.00 | 0.33 ± 0.00 |
| 100    | 2.30 ± 0.16  | 2.06 ± 0.15  | 0.28 ± 0.00 | 0.25 ± 0.00 | 0.24 ± 0.00 | 0.21 ± 0.00 |
| 250    | 0.65 ± 0.02  | 0.58 ± 0.02  | 0.17 ± 0.00 | 0.15 ± 0.00 | 0.14 ± 0.00 | 0.12 ± 0.00 |
| 500    | 0.33 ± 0.00  | 0.29 ± 0.00  | 0.08 ± 0.00 | 0.07 ± 0.00 | 0.08 ± 0.00 | 0.07 ± 0.00 |
Table 5. Test error rate (in percentage) when forecasting the Turbine decay in the Multiple Output setup. Each cell reports L̂test_ntest(h∗).

Random Sampling
| ntrain | UD, RLS      | UD, SVR      | BD, RLS     | BD, SVR     | MD, RLS     | MD, SVR     |
| 10     | 32.00 ± 4.97 | 29.31 ± 4.61 | 9.41 ± 2.34 | 8.52 ± 2.03 | 1.77 ± 0.04 | 1.60 ± 0.04 |
| 25     | 25.27 ± 3.56 | 22.83 ± 3.23 | 1.96 ± 0.36 | 1.78 ± 0.32 | 0.65 ± 0.00 | 0.58 ± 0.00 |
| 50     | 12.71 ± 2.23 | 11.50 ± 2.01 | 0.67 ± 0.02 | 0.60 ± 0.02 | 0.46 ± 0.00 | 0.40 ± 0.00 |
| 100    | 4.95 ± 0.93  | 4.43 ± 0.85  | 0.39 ± 0.00 | 0.35 ± 0.00 | 0.33 ± 0.00 | 0.30 ± 0.00 |
| 250    | 1.34 ± 0.10  | 1.19 ± 0.09  | 0.24 ± 0.00 | 0.22 ± 0.00 | 0.18 ± 0.00 | 0.16 ± 0.00 |
| 500    | 0.63 ± 0.02  | 0.55 ± 0.02  | 0.17 ± 0.00 | 0.15 ± 0.00 | 0.10 ± 0.00 | 0.09 ± 0.00 |

Nearly Homogeneous Sampling
| ntrain | UD, RLS      | UD, SVR      | BD, RLS     | BD, SVR     | MD, RLS     | MD, SVR     |
| 10     | 27.62 ± 4.40 | 25.26 ± 3.97 | 8.20 ± 1.93 | 7.54 ± 1.75 | 1.50 ± 0.04 | 1.36 ± 0.03 |
| 25     | 21.59 ± 3.09 | 19.19 ± 2.79 | 1.67 ± 0.32 | 1.49 ± 0.28 | 0.55 ± 0.00 | 0.49 ± 0.00 |
| 50     | 11.02 ± 1.92 | 10.05 ± 1.72 | 0.57 ± 0.02 | 0.52 ± 0.01 | 0.39 ± 0.00 | 0.34 ± 0.00 |
| 100    | 4.18 ± 0.80  | 3.75 ± 0.70  | 0.34 ± 0.00 | 0.30 ± 0.00 | 0.29 ± 0.00 | 0.26 ± 0.00 |
| 250    | 1.12 ± 0.08  | 1.00 ± 0.08  | 0.20 ± 0.00 | 0.19 ± 0.00 | 0.15 ± 0.00 | 0.14 ± 0.00 |
| 500    | 0.52 ± 0.02  | 0.46 ± 0.01  | 0.14 ± 0.00 | 0.13 ± 0.00 | 0.09 ± 0.00 | 0.08 ± 0.00 |
References
Theory. 1998;44(5):1974–1980.
[40] Lugosi G, Zeger K. Nonparametric estimation via empirical risk minimization. Information Theory, IEEE Transactions on.
1995;41(3):677–687.
[41] Tikhonov A, Arsenin VY. Methods for solving ill-posed problems. Nauka, Moscow; 1979.
[42] Engl HW, Hanke M, Neubauer A. Regularization of inverse problems. Springer; 1996.
[43] Wolpert DH, Macready WG. No free lunch theorems for optimization. IEEE Transactions on Evolutionary Computation. 1997;1(1):67–82.
[44] Evgeniou T, Micchelli CA, Pontil M. Learning Multiple Tasks with Kernel Methods. Journal of Machine Learning Research. 2005;6:615–
637.
[45] Pan SJ, Yang Q. A survey on transfer learning. IEEE Transactions on Knowledge and Data Engineering. 2010;22(10):1345–1359.
[46] Zhang W, Liu X, Ding Y, Shi D. Multi-output LS-SVR machine in extended feature space. In: Computational Intelligence for Measurement
Systems and Applications (CIMSA), 2012 IEEE International Conference on; 2012. .
[47] Baldassarre L, Rosasco L, Barla A, Verri A. Multi-output learning via spectral filtering. Machine learning. 2012;87(3):259–301.
[48] Pollard D. Empirical processes: theory and applications. In: NSF-CBMS regional conference series in probability and statistics; 1990. .
[49] Caponnetto A, De Vito E. Optimal rates for the regularized least-squares algorithm. Foundations of Computational Mathematics.
2007;7(3):331–368.
[50] Schölkopf B, Herbrich R, Smola AJ. A generalized representer theorem. In: Computational learning theory; 2001. .
[51] Scholkopf B. The kernel trick for distances. In: Neural information processing systems; 2001. .
[52] Keerthi SS, Lin CJ. Asymptotic behaviors of support vector machines with Gaussian kernel. Neural computation. 2003;15(7):1667–1689.
[53] Young DM. Iterative solution of large linear systems. Dover Publications; 2003.
[54] Poggio T, Mukherjee S, Rifkin R, Rakhlin A, Verri A. b. In: Uncertainty in Geometric Computations; 2002. .
[55] Boyd SP, Vandenberghe L. Convex optimization. Cambridge university press; 2004.
[56] Shawe-Taylor J, Sun S. A review of optimization methodologies in support vector machines. Neurocomputing. 2011;74(17):3609–3618.
[57] Keerthi SS, Shevade SK, Bhattacharyya C, Murthy KRK. Improvements to Platt’s SMO algorithm for SVM classifier design. Neural
Computation. 2001;13(3):637–649.
[58] De Vito E, Caponnetto A, Rosasco L. Model selection for regularized least-squares algorithm in learning theory. Foundations of
Computational Mathematics. 2005;5(1):59–85.
[59] Bartlett PL, Boucheron S, Lugosi G. Model selection and error estimation. Machine Learning. 2002;48(1-3):85–113.
[60] Anguita D, Ghio A, Oneto L, Ridella S. In-Sample and Out-of-Sample Model Selection and Error Estimation for Support Vector Machines.
IEEE Transactions on Neural Networks and Learning Systems. 2012;23(9):1390–1406.
[61] Guyon I, Saffari A, Dror G, Cawley G. Model selection: Beyond the bayesian/frequentist divide. The Journal of Machine Learning
Research. 2010;11:61–87.
[62] Kohavi R, et al. A study of cross-validation and bootstrap for accuracy estimation and model selection. In: International Joint Conference
on Artificial Intelligence; 1995. .
[63] Anguita D, A G, Ridella S, Sterpi D. K-Fold Cross Validation for Error Rate Estimate in Support Vector Machines. In: International
Conference on Data Mining; 2009. .
[64] Dietterich TG. Approximate statistical tests for comparing supervised classification learning algorithms. Neural computation.
1998;10(7):1895–1923.
[65] Anguita D, Ghelardoni L, Ghio A, Oneto L, Ridella S. The K in K-fold Cross Validation. In: European Symposium on Artificial Neural
Networks; 2012. .
[66] Hsu CW, Chang CC, Lin CJ. A practical guide to support vector classification, available for download at http://www.csie.ntu.edu.tw/~cjlin/papers/guide/guide.pdf; 2009.
[67] Bache K, Lichman M. UCI Machine Learning Repository; 2013. Available from: http://archive.ics.uci.edu/ml.
[68] Narula SC, Wellington JF. Prediction, Linear Regression and the Minimum Sum of Relative Errors. Technometrics. 1977;19(2):185–190.
[69] Chen K, Guo S, Lin Y, Ying Z. Least absolute relative error estimation. Journal of the American Statistical Association.
2010;105(491):1104–1112.
[70] Aupetit M. Nearly homogeneous multi-partitioning with a deterministic generator. Neurocomputing. 2009;72(7):1379–1389.
[71] Coraddu A, Figari M. Ship electric propulsion: Analyses through modeling and simulation marine 2011. In: Computational Methods in
Marine Engineering; 2011. .
[72] Li Z, Yan X, Peng Z. Ship electric propulsion with a sensorless permanent magnet synchronous motor: A simulation study. Proceedings
of the Institution of Mechanical Engineers, Part M: Journal of Engineering for the Maritime Environment. 2012;226(4):378–386.
[73] Weber M, Welling M, Perona P. Unsupervised learning of models for recognition. Springer; 2000.