Master of Science in Information Systems

Business Analytics

Offshore Wind Turbine Gearbox Condition Monitoring with Data Cubes and Deep Learning

Saeid Sheikhi
No. 120997

Supervisor: Debasish Ghose


Co-Supervisor: Arvind Keprate

A report submitted in partial fulfillment of the requirement for the degree of Master of
Science in Information Systems Business Analytics

Kristiania University College


Prinsens Gate 7-9
0152 Oslo
Norway
Abstract
This thesis investigates the use of deep learning for anomaly detection in gearbox
condition monitoring data from offshore wind turbines (OWTs), focusing on the novel
application of convolutional autoencoders integrated with data cubes for enhanced data
structuring and analysis. The research develops a deep learning framework that applies
signal-to-image processing techniques to convert traditional time-series vibration data
into a multidimensional structured data cube format, facilitating the application of
convolutional neural networks for improved feature extraction and anomaly detection.
Through an extensive literature review and methodological innovation, this thesis explores
the capacity of these advanced models to handle the complex, high-dimensional datasets
typically found in OWT operations. The empirical analysis demonstrates significant
advancements in the detection capabilities of the system, pointing towards a reduction in
maintenance costs and increased efficiency in the operational management of wind turbines.
The introduction of data cubes allows for a more nuanced understanding of the spatial and
temporal dynamics of turbine sensor data, enhancing the predictive accuracy of the anomaly
detection system. This research not only substantiates the effectiveness of deep learning
models in real-world condition monitoring but also reveals the practical challenges and
constraints faced during implementation. The conclusion offers directions for future
research, advocating for the exploration of more sophisticated deep learning algorithms
and their integration with enhanced data handling architectures to further improve the
reliability and cost-effectiveness of maintenance strategies in the renewable energy sector.
Acknowledgment
This thesis is a testament to the collective support and inspiration I received throughout
my journey. The unwavering support of my family was crucial, providing me with moti-
vation and encouragement during the most challenging times. Their sacrifices and belief
in my abilities were fundamental to my perseverance and success.

I am immensely indebted to Professor Lester Allan Lasrado, our esteemed program director,
whose invaluable guidance and deep expertise in the field shaped my academic and research
pursuits. His mentorship was not only educational but also inspirational. Professors Arvind
Keprate and Debasish Ghose, who co-supervised my thesis, deserve special thanks for their
extensive involvement and meticulous guidance. Their commitment to excellence and ethical
standards significantly influenced my approach to research, instilling a rigorous scientific
attitude and respect for the integrity of academic inquiry.

I would also like to acknowledge two individuals from Equinor, who, preferring to re-
main anonymous, offered crucial insights during an essential interview that significantly
influenced the objectives of this thesis. Their expert perspectives and thoughtful feed-
back were instrumental in refining my research questions and methods. Additionally, my
gratitude extends to Guru Prasad Bhandari for his technical assistance in setting up the
university’s High-Performance Computing (HPC) facilities. His support was vital for
the computational aspects of my research, enabling me to navigate the complexities of
model training efficiently.

Collectively, these individuals have greatly enriched my educational experience at
Kristiania University, providing a solid foundation of support and knowledge that has been
indispensable in my academic growth.
Title: Data Cubes and Deep Learning: Gearbox Condition Monitoring for Offshore Wind Turbine

Contents
1 Introduction
1.1 Wind Energy: A Key to Sustainable Power
1.2 Condition Monitoring of Offshore Wind Turbines
1.2.1 Gearbox Condition Monitoring
1.3 Research Objectives
1.4 Methodological Approach
1.5 Structure of the Thesis

2 Literature Review
2.1 Offshore Wind Turbine
2.1.1 OWT Gearbox
2.2 OWT Condition Monitoring
2.2.1 Statistical Methods for Condition Monitoring
2.2.2 Artificial Intelligence for Condition Monitoring
2.2.3 Condition Monitoring with Deep Learning
2.2.4 Auto-Encoders for Anomaly Detection
2.2.5 Convolutional Autoencoder for Anomaly Detection
2.3 Literature Review Summary

3 Methodology & Implementation
3.1 First Step: Data Mining
3.2 Second Step: Data Analysis
3.3 Third Step: Data Preprocessing
3.3.1 Signal-to-Image Processing Algorithms
3.3.2 Data Cubes - Multi Layered Data
3.4 Fourth Step: Model Architecture
3.5 Fifth Step: Model Training
3.6 Sixth Step: Model Evaluation
3.7 Methodology Summary

4 Results
4.1 Data Transformation Results
4.2 Models Training Evaluation
4.3 Performance Evaluation of the CAE Models
4.4 Analysis of Anomaly Detection Capabilities
4.5 Model Resistance to Data Inconsistency
4.6 Impact of Artifact Architecture on O&M
4.7 Comparative Analysis with Conventional Methods

20.04.2024 Student number: 120997 3


5 Discussion
5.1 Implications
5.2 Contributions
5.3 Challenges and Limitations
5.4 Future Research Directions
5.5 Impact of Design Science Research

6 Conclusion

References

1 Introduction
The importance of shifting to renewable energy cannot be overstated, considering its crucial
role in mitigating catastrophic climate change. Three-quarters of worldwide CO2 emissions
come from the energy sector alone [1]. This sobering statistic underlines the sector’s
significant influence and critical role in tackling climate issues. To minimise the worst
effects of climate change, reaching net zero greenhouse gas emissions by 2050 is critical,
mandating significant reductions as early as 2030. The transition to renewable energy is
more than just a favourable step; it is critical to protecting our environment and future
generations.

To tackle climate change and meet emissions targets, governments throughout the world
are increasingly turning to renewable and low-carbon energy sources including wind,
geothermal, and hydrogen. Wind energy has grown significantly as a sustainable energy
source, thanks to technology improvements and favourable government legislation [2,
3]. The International Energy Agency’s 2023 analysis forecasts that global wind energy
generation will double by 2028 compared to 2022, highlighting its substantial potential
in contributing to climate change mitigation [4].

1.1 Wind Energy: A Key to Sustainable Power


In the past forty years, many countries have increasingly utilized wind energy for power
generation. As shown in Figure 1, wind power is projected to be the world’s third-largest
source of renewable energy by the end of 2023, with a total installed capacity of around
1,017 gigawatts (GW). This places it behind solar energy (1,418 GW) and hydropower
(1,267 GW). In 2023, wind energy installations reached a ten-year high of 115 GW [5].
Estimates indicate that by 2050 wind energy will provide 30% of global electricity
(12% from offshore and 18% from onshore sources), making it a mainstream energy
resource [6].

The rapid rise of wind energy in the global power landscape emphasises its importance
in the renewable energy transition. The significant increase in installations, particularly
in 2023, indicates strong momentum that is projected to continue, establishing wind
energy as a key factor in meeting future sustainability targets. Wind energy is expected
to become a cornerstone of the global energy mix by mid-century, with a balanced
contribution from both offshore and onshore sources, significantly contributing to carbon
emission reductions and supporting the quest for a climate-neutral future. Meeting these
sustainability obligations, however, depends on continued growth.

Since 2010, the cost of wind energy has decreased by 60%, making it financially com-
petitive with fossil fuels in several geographical regions [7, 8]. Offshore wind energy
is expected to achieve global cost competitiveness by 2030 as a result of technological
innovations such as larger turbines and improved economies of scale, as well as market
restructuring and the incorporation of environmental benefits to improve its competitive
edge and expandability [9]. Wind energy’s progress, supported by both market forces
and scientific innovation, solidifies its position as a driving force for a sustainable future.

Figure 1: IRENA Renewable Energy Capacity by Technology from 2010 to 2023 [5]

In the last decade, offshore wind turbine (OWT) capacities have dramatically increased
from the kW range to up to 15 MW, largely due to the expansion of rotor diameters
from about 30 m in kW-class turbines to an impressive 240 m in 15 MW turbines [10]. Although
larger turbines are more productive in terms of electricity generation, they are also more
susceptible to malfunctions, particularly in challenging environments. Offshore wind
energy’s expansion is impeded by high operation and maintenance (O&M) costs, which
can account for up to 30% of electricity generation expenses [11].

The challenges of reaching remote places, the need for specialised equipment, and adverse
weather can cause O&M costs to quadruple when compared to onshore wind generation
operations. Numerous studies have been carried out to explore solutions for lowering
O&M expenses in offshore wind installations [12]. The implementation of Supervisory
Control and Data Acquisition (SCADA) systems and condition monitoring technologies
has driven a shift towards more proactive maintenance strategies. Maintenance strategies
are often divided into two categories: preventative maintenance, which aims to address
possible faults and reduce downtime, and corrective maintenance, which deals with repairs
after a failure has occurred.

1.2 Condition Monitoring of Offshore Wind Turbines


OWT condition monitoring (CM) methodologies are generally classified as offline or
online. Offline procedures, which are suitable for the design and certification of new
turbines, demand periodic inspections that require stopping the machinery, limiting the
ability for continuous real-time status evaluation [13]. On the other hand, online meth-
ods provide uninterrupted, real-time surveillance and are becoming increasingly essential,
especially for offshore wind farms [14]. Online CM techniques are separated into two
categories: signal-based and model-based. Signal-based monitoring uses numerous
parameters such as vibration, temperature, and acoustic emissions, and includes
SCADA-based and condition-specific approaches [15].

Continuous surveillance of Wind Turbines (WTs) leverages the built-in SCADA sys-
tem. Despite the presence of about 8000 components in a standard WT [16], it is
standard practice in the industry to use specialized CM systems for essential parts such
as gearboxes and generators. CM strategies encompass techniques that rely on acoustic
measurements, electrical effects, power quality, temperature, oil debris, vibration anal-
ysis, and physics-based data analytics [17]. These techniques are categorized based on
their physical impact on the WT as either intrusive (such as vibration and wear debris
monitoring, shock pulse methods) or non-intrusive (including ultrasonic testing, visual
inspection, acoustic emission, thermography, and performance monitoring through power
signal analysis) [17]. CM of OWTs entails continuous observation of CM systems to as-
sess operational status and detect anomalies through crucial indicators [18].

Over the last ten years, researchers have applied many methods to detect anomalies in
SCADA/CM data. Nevertheless, traditional statistical and machine learning methods struggle
with the large volume and complexity of SCADA/CM data, which limits their ability to fully
utilise this data [19]. Recent advances in deep learning make it possible to feed a wide
range of variables into deep learning frameworks, making it easier to use CM data
effectively and to discover important abstract features for CM [20]. Time series data
consists of trend, seasonality, and irregular components, and effectively utilising the
trend and seasonality can greatly increase forecast accuracy and reliability. These
components are often overlooked, however, when deep learning layers are applied
naively [21].
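
The split of a time series into trend, seasonal, and irregular components can be
illustrated with a minimal sketch. It is not part of the thesis pipeline: it assumes an
additive model and a known period, and the function name `decompose_additive` and the
moving-average trend estimate are illustrative choices only.

```python
import numpy as np

def decompose_additive(signal: np.ndarray, period: int):
    """Split a 1-D series into trend, seasonal and irregular parts (additive model)."""
    n = len(signal)
    # Trend: moving average over one full period, roughly centred;
    # the edges are padded by reflection so the output keeps length n.
    kernel = np.ones(period) / period
    padded = np.pad(signal, (period // 2, period - 1 - period // 2), mode="reflect")
    trend = np.convolve(padded, kernel, mode="valid")
    # Seasonality: mean of the detrended signal at each phase of the period.
    detrended = signal - trend
    phase_means = np.array([detrended[p::period].mean() for p in range(period)])
    phase_means -= phase_means.mean()  # one seasonal cycle sums to zero
    seasonal = np.tile(phase_means, n // period + 1)[:n]
    # Irregular: whatever trend and seasonality do not explain.
    irregular = signal - trend - seasonal
    return trend, seasonal, irregular
```

By construction the three components add back up to the input, so a model can be given the
trend and seasonal parts explicitly instead of having to rediscover them from the raw signal.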

The CM data used in WT monitoring is typically highly imbalanced, with normal data greatly
outnumbering abnormal data [15]. This imbalance may weaken the efficacy of CM, since
data-driven models can become biased towards the majority class of normal data. Concerns
about data quality are also
frequently encountered, and these include problems with missing values, NULL entries,
zeros, data that exceeds plausibility thresholds, statistical outliers, lengthy sequences of
consecutively identical results, and improper data formats [15]. Remedial actions such
as linear or exponential interpolation, capping extreme values, or discarding problem-
atic data channels are typically applied to mitigate these issues, yet such fixes may still
affect the precision required in data analysis. High-quality SCADA data is crucial for
training, testing, and applying deep learning models. However, abnormal data instances,
ranging from irregular turbine operations to data collection mistakes, human errors,
abnormal weather conditions, and missing entries, can trigger false alarms and necessitate
expensive checks by specialized personnel on OWTs [22]. Prior studies have highlighted six
critical data quality dimensions vital for any deep learning application: consistent
representation, completeness, feature accuracy, target accuracy, uniqueness, and target
class balance [23].
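
The remedial actions described above (interpolating gaps, capping extreme values, and
spotting long runs of identical readings) can be sketched for a single sensor channel as
follows. This is an illustrative example rather than the preprocessing used in this thesis;
the function name `clean_channel`, the plausibility limits `lower` and `upper`, and the
run-length limit `max_flat_run` are all hypothetical.

```python
import numpy as np

def clean_channel(values, lower, upper, max_flat_run=10):
    """Return (cleaned, flags) for one sensor channel.

    lower/upper are hypothetical plausibility limits; max_flat_run is a
    hypothetical limit on the number of consecutive identical readings.
    """
    x = np.asarray(values, dtype=float).copy()
    missing = np.isnan(x)
    out_of_range = ~missing & ((x < lower) | (x > upper))

    # Flag long runs of identical consecutive values (possible frozen sensor).
    flat = np.zeros(x.size, dtype=bool)
    run_start = 0
    for i in range(1, x.size + 1):
        if i == x.size or x[i] != x[run_start]:
            if i - run_start > max_flat_run:
                flat[run_start:i] = True
            run_start = i

    # Cap implausible values, then fill gaps by linear interpolation.
    x = np.clip(x, lower, upper)
    idx = np.arange(x.size)
    if missing.any() and not missing.all():
        x[missing] = np.interp(idx[missing], idx[~missing], x[~missing])
    return x, {"missing": missing, "out_of_range": out_of_range, "flat": flat}
```

Returning the flags alongside the cleaned values keeps a record of which samples were
altered, which matters because such fixes may still affect the precision of later analysis.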

Data-driven decision-making, which leverages business intelligence to foster a more
sustainable organization [24], must itself be sustainable. Since 2010, cloud data centers
have consumed more than 2.4% of the world’s electricity, incurring an economic cost
of around USD 30 billion [25], a figure that predates the widespread adoption of
data-heavy machine learning models. To boost sustainability, proposals include transforming
data warehouses into zero-emission facilities and reducing dependence on such
energy-intensive centers [25]. To the best of this study’s knowledge, previous research on
the CM of OWTs has not yet tackled this particular challenge.

1.2.1 Gearbox Condition Monitoring


Research spanning from 2007 [26] through to recent studies [27] highlights the gearbox
as a critical component in the CM of WTs. These studies not only indicate that gearbox
malfunctions result in the most significant downtime for WTs, but also emphasize that
the gearbox is the most expensive component to maintain over the expected 20-year
operational lifespan of a WT [28]. This underscores the critical importance of concen-
trating on gearbox CM to not only reduce downtime of OWTs but also to diminish the
O&M costs associated with it.

Figure 2: The GRC Three-Point Suspension Drivetrain [28]

The gearbox in a WT transfers mechanical power from the rotating hub to the generator
while stepping up the rotational speed. In large-scale WTs, the output shaft speed can
exceed the input shaft speed by a factor of more than one hundred. The three fundamental components
of all gearboxes are shafts, gears, and bearings [29]. In a standard WT gearbox, the
energy conversion process involves three distinct stages, each associated with a specific
shaft: the input shaft, intermediate shaft, and output shaft. The rotational energy from
the rotating hub enters the gearbox via the input shaft at a low speed. This speed
is then amplified through a series of gears to the intermediate shaft and subsequently
transmitted to the output shaft, which delivers the energy to the generator at a high
speed [30]. There are two primary configurations for these stages: one involving one
planetary and two parallel stages with the planetary gears positioned in the low-speed
stage, and another configuration featuring two planetary and one parallel stage, with
the parallel stage situated in the high-speed stage [31].
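
As a small worked example of how three stages combine into a speed-up of more than one
hundred, the overall ratio is the product of the individual stage ratios. The stage ratios
and rotor speed below are hypothetical round numbers chosen for illustration; they are not
taken from the cited gearbox designs.

```python
import math

def overall_ratio(stage_ratios):
    """Overall speed-up ratio of a multi-stage gearbox (product of stage ratios)."""
    return math.prod(stage_ratios)

def output_speed_rpm(input_rpm, stage_ratios):
    """Speed of the high-speed (output) shaft for a given rotor speed."""
    return input_rpm * overall_ratio(stage_ratios)

# Hypothetical ratios: one planetary stage followed by two parallel stages.
stages = [6.0, 4.0, 5.0]
rotor_rpm = 10.0  # hypothetical low-speed input from the hub

print(overall_ratio(stages))                # overall speed-up ratio
print(output_speed_rpm(rotor_rpm, stages))  # generator-side shaft speed in rpm
```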

Figure 3: Exploded View of WT Gearbox [28]

1.3 Research Objectives


The objective of this study is to develop a solution using deep learning (DL) models that
effectively reduces false anomaly detection in the CM data of OWT gearboxes. This so-
lution will focus on accurately analyzing temporal sequences in the data. This goal was
informed by an interview conducted with two experts from Equinor, who preferred to
remain anonymous. The interview highlighted that Equinor employs both manufacturer-
supplied and proprietary CM systems to detect anomalies in OWTs. These systems
monitor critical parameters such as temperature, vibrations, and displacements, and
trigger maintenance actions when anomalies are detected. Despite regular upgrades and
general reliability, issues such as false alarms, communication network problems, and
data quality challenges continue to compromise the effectiveness of anomaly detection.
To address these issues, Equinor has implemented costly measures, including enhancing
data quality through communication redundancy, expensive storage of data, and prompt
maintenance responses.

The research aims to address several problems identified during the interview: improving
the reliability of anomaly detection by minimising false positives caused by minor
environmental changes, and handling data class imbalance, where anomalies are scarce.
Additionally, the study will focus on overcoming challenges related to inconsistent data
representation
and data gaps resulting from communication channel issues. The expected outcome is
a model that not only improves anomaly detection but also contributes to the sustain-
able operation of OWTs by reducing unnecessary expenditures and enhancing system
responsiveness. The research questions defined in this research are as follows:

1. Can the performance of cutting-edge deep learning models in anomaly detection
for OWTs be improved by the inclusion of signal-to-image processing algorithms?

2. How can an artifact employing signal-to-image processing methods reduce the data
thirstiness of deep learning methodologies?

3. Can the system that was developed manage all six critical dimensions of data
quality essential for any deep learning application, namely consistent representa-
tion, completeness, feature accuracy, target accuracy, uniqueness, and target class
balance?

1.4 Methodological Approach


In this study, the design science framework articulated by Hevner et al. in “Design
Science in Information Systems Research” [32] is applied as the fundamental methodology
for tackling intricate challenges within OWT CM. This framework integrates theoretical
foundations with the creation of innovative IT artifacts, specifically designed to
address the evolving demands of this industry. The approach recommended by Hevner
et al. [32] emphasizes a rigorous yet flexible problem-solving process, prioritizing a deep
understanding of the problem context before the development of artifacts.

Figure 4: Hevner et al. [32] Design Science Framework in Information Systems

Peffers et al. [33] outlined a six-stage methodology for conducting design science research
within the information systems field:

1. Problem Identification and Motivation, as defined by Peffers et al. [33], entails
a thorough delineation of the research problem and an explanation of the
value of its resolution. This not only stimulates the interest and engagement of
both the researcher and the audience but also aids in understanding the researcher’s
viewpoint on the complexities of the problem and the importance of its solution.
The current study aims to improve OWT maintenance monitoring by employing
DL models for anomaly detection, addressing specific challenges in CM data man-
agement such as class imbalance and data quality, which are critical for effective
CM.

2. Objectives of a Solution, as articulated by Peffers et al. [33], should emerge
logically from the defined problem and specify explicit quantitative or qualitative
targets. These objectives should focus on the ways in which the proposed solution
surpasses existing methods or resolves issues that have not been addressed previ-
ously, grounded in a comprehensive understanding of the existing challenges and
the effectiveness of current solutions. The aim of this study is to develop an
energy- and data-efficient DL model that proficiently identifies anomalies in CM data for
OWT gearboxes, effectively addressing issues of data imbalance and quality, and
enhancing sustainability.

3. The Design and Development phase, as outlined by Peffers et al. [33], entails
the creation of the artifact solution, which includes constructs, models, methods,
or instantiations. This phase focuses on specifying the desired functionalities and
architecture of the artifact and realizing it, underpinned by the relevant theoretical
knowledge pertinent to the solution. In this study, the principal activity during
the Design and Development phase will involve constructing a DL model that
incorporates signal-to-image processing and data cubes to address the complexities
of CM data and enhance anomaly detection in OWTs.

4. The Demonstration phase, as described by Peffers et al. [33], involves illustrating
the efficacy of the artifact in resolving the identified problem using methods
such as experimentation, simulation, case studies, or proofs. This phase empha-
sizes a thorough understanding of how the artifact is applied in problem-solving
contexts. In this study, the effectiveness of the DL model will be demonstrated
through its application in real-world maintenance scenarios for OWTs, with a spe-
cific focus on anomaly detection in Gearbox CM data.

5. The Evaluation stage, also delineated by Peffers et al. [33], consists of ap-
praising the performance of the artifact in solving the problem by comparing the
objectives set out for the solution with the actual outcomes observed in the demon-
stration. This assessment employs appropriate metrics and analytical methods to
evaluate whether further iterations are needed for enhancement or whether the
project should advance to the communication phase based on the context of the
research. The evaluation of this project will involve a comparison of the model’s
performance against established objectives, specifically examining its accuracy in
detecting anomalies, its data efficiency, and its contributions to sustainability in
OWTs.

6. The Communication phase, as defined by Peffers et al. [33], encompasses
the dissemination of the research problem, the utility and innovation of the artifact,
the rigor of its design, and its effectiveness to both academic researchers and
industry practitioners. This process often parallels the structure of empirical re-
search papers and requires a keen understanding of the disciplinary culture. In this
study, findings will be disseminated through two primary channels: academic and
practical. The main academic outlet for these findings will be this master’s the-
sis. Furthermore, to actively engage the practitioner community, especially those
working with Gearbox CM data in OWT farms, two structured questionnaires will
be administered to Company A. The first questionnaire is designed to gather in-
sights into their specific needs, thereby informing the direction of the study, while
the second aims to solicit feedback on the research outcomes to further refine and
improve the developed artifact.

To implement the design science research approach described above, the
problem identification and objectives of this research were informed by prior studies and
an interview with Equinor, as detailed in Chapters I and II. The design and development
phase is presented in Chapter III, with the demonstration and evaluation phases cov-
ered in Chapters IV and V. The communication phase encompasses a submitted article
for review, previous research [34], and this thesis, targeting the academic community.
Additionally, a follow-up interview with Equinor is planned after the completion of this
thesis.

1.5 Structure of the Thesis


The remainder of the manuscript is structured as follows. Chapter II provides a thorough
analysis of the literature, examining earlier research in the area. Chapter III describes
the methodology used in this thesis together with the design and implementation of the
artefact. Chapter IV presents the results, and Chapter V discusses them, including their
implications, contributions, and limitations. Chapter VI summarises the study’s main
findings and contributions and serves as a conclusion.

2 Literature Review
This research utilized Webster & Watson’s [35] methodology to perform a thorough con-
ceptual analysis of relevant literature, identifying key foundational elements pertinent
to the study’s topic. Table 1 displays a concept matrix that encapsulates the findings
from the reviewed studies. The literature review process began by pinpointing the prin-
cipal domains of the study and conducting a comprehensive search within each domain
using relevant keywords in two primary databases, Oria and Google Scholar. During
this review, certain references from prior studies were deemed relevant and subsequently
cited in this research. This study divides the reviewed resources into seven main
categories, which overlap in some cases: background, theory, data quality, convolutional
neural network, autoencoder, signal-to-image processing algorithm, and data footprint.
Given the large number of resources used, only those with the highest impact on this
research are placed in the concept matrix. The following explains each category in detail.

1. Background (B in table): This category includes the foundational literature
that supports the motivations and theoretical frameworks of this study, providing
insights into its conceptual foundations and underlying rationale.
• Keywords: “Offshore Wind Turbine”, “Artificial Intelligence for Maintenance”,
“Deep Learning”, “Signal-to-Image”, “Data Footprint”, “Data Quality”
2. Theory (T in table): This category encompasses articles instrumental in shap-
ing the research design methodology, specifically within the framework of design
science in this case.
• Keywords: “Design Science”, “Research Methodology”, “Theoretical Frame-
works”, “Offshore Wind Turbines”, “Maintenance Strategies”
3. Data Quality (DQ in table): This category encompasses literature that explores
any of the six data quality dimensions outlined in the previous section.
• Keywords: “Data Quality”, “Machine Learning”, “Deep Learning”, “Anomaly
Detection”, “SCADA Data”, “Wind Turbines”
4. Convolutional Neural Network (CNN in table): This category includes
literature focusing on the application of convolutional neural networks for anomaly
detection and condition monitoring in OWTs.
• Keywords: “Deep Learning”, “Convolutional Neural Network”, “Anomaly
Detection”, “Offshore Wind Turbines”, “Condition Monitoring”
5. Autoencoder (AE in table): This category covers research on the use of autoencoders
for anomaly detection and condition monitoring in OWTs.
• Keywords: “Autoencoder”, “Anomaly Detection”, “Offshore Wind Turbine”,
“Deep Learning”, “Convolutional Neural Network”

6. Signal-to-Image Processing Algorithm (SPA in table): This category discusses
algorithms that convert signal data into images for enhanced processing and
analysis using deep learning techniques.
• Keywords: “Signal-to-Image”, “Gramian Angular Field”, “Markov Transition
Field”, “Grey Scale encoding”, “Spectrogram”, “Scalogram”, “Deep Learning”,
“Convolutional Neural Network”

7. Data Footprint (DF in table): This category explores the less examined aspect
of advancing the sustainability of renewable energy through emphasis on the
CO2 footprint associated with data. It comprises studies that have either directly
tackled or recognized the importance of this crucial issue in the realm of renewable
energy.
• Keywords: “Data Footprint”, “Sustainability”, “Renewable Energy”, “Data
Storage”, “Data Management”
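
As an illustration of the signal-to-image idea behind category 6, the sketch below encodes a short signal as a Gramian Angular Summation Field. It assumes the commonly used formulation (min-max rescaling to [-1, 1], polar angle phi = arccos(x), image entry cos(phi_i + phi_j)); the function name `gramian_angular_field` and the sample values are hypothetical and are not taken from this thesis.

```python
import math

def gramian_angular_field(signal):
    """Return the Gramian Angular Summation Field of a 1-D signal as a list of rows."""
    lo, hi = min(signal), max(signal)
    if hi == lo:
        raise ValueError("signal must not be constant")
    # Min-max rescale to [-1, 1] so that arccos is defined for every sample.
    scaled = [2.0 * (x - lo) / (hi - lo) - 1.0 for x in signal]
    # Clamp against rounding error before taking the polar angle.
    phi = [math.acos(max(-1.0, min(1.0, x))) for x in scaled]
    # Entry (i, j) encodes the summed angle of samples i and j.
    return [[math.cos(a + b) for b in phi] for a in phi]

# Hypothetical vibration samples (arbitrary units).
image = gramian_angular_field([0.0, 0.5, 1.0, 0.5])
```

A signal of length n becomes a symmetric n x n matrix, which is the kind of grid-like input a convolutional network expects.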


Table 1: Concept Matrix Augmented with Units of Analysis


Concepts
Ref Author Artificial Intelligence
T DQ SPA DF B
DL CNN AE
15 (Badihi et al.) X X
120 (Barra et al.) X X X
73 (Black et al.) X X X
124 (Boashash) X
23 (Budach et al.) X X
1 (Diesendorf) X
122 (Fahim et al.) X X X
90 (García & Peinado) X X
114 (Geron) X X
32 (Hevner et al.) X
4 (IEA) X
34 (Keprate et al.) X X X X X
7 (Lazard) X
106 (Liu et al.) X X
118 (Ma et al.) X
11 (May et al.) X
113 (Miele et al.) X X X X
27 (Olabi et al.) X
20 (Pang et al.) X
33 (Peffers et al.) X
116 (Sheng) X
25 (Shuja et al.) X
121 (Song et al.) X X X X
17 (Stetco et al.) X X
125 (Thill et al.) X X X X
119 (Wang and Oates) X X X
35 (Watson and Webster) X
123 (Wen et al.) X X X
19 (Xiao et al.) X X
89 (Xiao et al.) X X
100 (Xu et al.) X X
The acronyms used in this table are explained on page 11 of this thesis.

2.1 Offshore Wind Turbine


OWTs are critical for generating wind power as they utilize minimal land and produce
a high output of electricity [36]. The main difference between onshore WTs and OWTs
lies in their structural support systems. Fixed-support and floating OWTs are the two
main types. Compared with onshore models, OWTs provide a number of benefits. One
major advantage of OWTs is their larger size, which results in more economical power
generation [37]. Additional advantages include reduced environmental impacts and
access to more abundant wind resources.

Structurally, OWTs are similar to their onshore counterparts. Most commercial OWTs
are horizontal-axis, three-bladed, upwind machines. The components above sea level
include the blades, which generate torque and are connected to a hub that is part of
the nacelle. The nacelle houses the main shaft, a gearbox, and the generator. Power
cables extend from the nacelle down through the tower to the support structure, which
includes an outer deck that allows operators to access both the tower and the nacelle.
Notably, horizontal-axis WTs are favored over vertical-axis ones for offshore use due to
their superior wind-capturing efficiency [38].

As previously discussed, beyond the non-financial advantages of wind energy, the
profitability of OWTs is crucial. A significant portion of current expenses is linked to
the O&M of OWTs, which generally have a higher failure rate, mainly because they
operate in harsher environments [39]. Currently, offshore wind farms employ two main
O&M strategies [40]: preventive maintenance and corrective maintenance. Preventive
maintenance is a proactive approach that involves servicing components before they fail
to ensure continuous operation. This strategy includes two methodologies:
condition-based maintenance (CBM) and time-based maintenance (TBM). TBM, also
known as periodic maintenance, schedules regular maintenance (e.g., every six months,
annually, or every five years) based on the expected failure patterns of components.
CBM, also known as predictive maintenance, schedules maintenance to stop further
component deterioration when certain CM indicators suggest irregularities. Previous
studies [41] suggest that CBM is less costly than other methods.


Figure 5: A typical Offshore Wind Turbine [38]

CM, integral to the CBM approach, involves the continuous assessment of machine states
using sensor data to identify deviations from normal operation, thereby enabling
maintenance actions to be timed precisely according to actual equipment needs. The OWT
SCADA system extensively gathers information from crucial WT sub-assemblies [42].
Consequently, the WT SCADA system could deliver additional value if its use were
extended to CM.

OWT SCADA data, typically averaged over 10 minutes, are not suited for conventional
machine CM techniques like spectral analysis due to their format [43]. The SCADA sys-
tem, not originally designed for CM, fails to capture all necessary information for com-
prehensive WT CM. Furthermore, the variability of SCADA data values, such as bearing
vibration and temperature under different operational conditions, makes it challenging
to identify emerging faults without sophisticated data analysis tools. However, SCADA
data hold considerable potential for WT CM, especially since SCADA systems are al-
ready installed in most megawatt-scale OWTs. This pre-existing infrastructure means
that no additional hardware investment is required for developing a SCADA-based CM
system, making it a cost-effective solution. Some initial efforts in this direction have
recently been undertaken.

2.1.1 OWT Gearbox


The gearbox in an OWT is pivotal: it raises the rotor’s low rotational speed to a
higher speed suitable for the electrical generator, with the speed ratio depending on
variables such as the rated wind speed and the generator’s number of poles [44].
Moreover, the gearbox is instrumental in torque modulation, load distribution, power
output management, and vibration attenuation, all of which contribute significantly to
the turbine’s overall reliability and operational performance.
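
To make the dependence of the speed ratio on the number of poles concrete, the worked example below uses the standard synchronous-speed relation for an AC generator. The grid frequency f = 50 Hz, the pole count p = 4, and the rotor speed of 15 rpm are illustrative assumptions, not values taken from this thesis.

```latex
n_{\mathrm{gen}} = \frac{120\, f}{p} = \frac{120 \times 50}{4} = 1500~\mathrm{rpm},
\qquad
i = \frac{n_{\mathrm{gen}}}{n_{\mathrm{rotor}}} = \frac{1500}{15} = 100
```

Under these assumptions the gearbox must step the speed up by a factor of roughly 100; a generator with more poles has a lower synchronous speed and therefore needs a smaller ratio.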

Figure 6: Assembly Drawing of Shafts and Gears from the Three-Stage Gearbox [45]

A WT gearbox usually consists of one or more helical stages and multiple planetary
stages [37]. In the past, the most common architecture for these turbines was a doubly
fed induction generator, which allowed for variable speed operation within a restricted
range of up to 50% in order to adjust to changing wind conditions. This system uses a
converter rated below the turbine power to feed some of the generated electricity
back to the generator rotor. Permanent magnet generators are increasingly being used
in conjunction with full-size converters, which manage currents at the turbine’s rated
power level. This development makes it possible to provide more flexible variable speed
operation and grid services related to reactive power needs. Planetary gearboxes are
used in the majority of WTs because they can transfer significant power in a small
package [46]. However, these gearboxes are vulnerable to deterioration due to the harsh
conditions found in wind farms, such as high loads, strong winds, and dust corrosion,
which could result in large investment and productivity losses. This calls for the
creation of a reliable technique for diagnosing faults in WT planetary gears [45].


2.2 OWT Condition Monitoring


The wind energy industry has evolved from small clusters of easily maintained turbines
to offshore wind farms, where maintenance is more challenging and significantly in-
creases the levelized cost of energy compared to onshore locations. CBM is a preventive
maintenance strategy that involves assessing physical conditions, analyzing them, and
initiating targeted maintenance actions to prevent failures by understanding the physics
of failure [47]. Implementing CBM in WTs adds complexity and cost due to the need for
sensors and analysis tools, requiring a strategy based on a deep understanding of failure
mechanisms and prioritization of WT systems to fully leverage its benefits. WT CM
systems comprise various techniques [48, 49, 50], each specific to a particular aspect of
turbine health assessment, like Vibration Analysis, Oil Analysis, Thermography, Physi-
cal Condition Monitoring, Strain Measurements, Acoustic Monitoring, Electrical Effects
Monitoring, Process Parameter Monitoring, and Performance Monitoring.

Intelligent CM of WT health is enabling a transition from scheduled and reactive
maintenance to a more predictive and proactive regime, driven by the demanding offshore
environment and the increasing number of turbines in wind farms. Manufacturers have
developed CM systems to track important parameters such as drive train vibration, oil
quality, and temperatures of main sub-assemblies, which they consider the most
important for CBM [51]. Using methods such as vibration analysis, acoustics, oil
analysis, strain measurement, and thermography, CM systems in WTs use sensors and signal
processing equipment to continuously evaluate the condition of critical components such
as blades, gearbox, generator, main bearings, and tower [52]. These systems detect
defects during operation, so that maintenance can be planned promptly to prevent damage
or failure. They can function offline with data collected at regular intervals, or online
for rapid feedback. In addition to reducing downtime, maintenance expenses, and operating
costs, this proactive approach improves the turbines’ reliability, availability,
maintainability, and safety (RAMS) [53].

There are two main types of WT CM systems commercially available: one type utilizes
the existing SCADA systems installed on large WTs, while the other is a purpose-designed
CM system created specifically for WTs [54]. Purpose-designed CM systems focus more
narrowly on the health of specific components such as gearboxes, generators, bearings,
and rotors through techniques such as vibration analysis, particle counting, and strain
measurements. They sample at a higher frequency than SCADA and incur higher costs,
both financial and in terms of data transfer and storage. SCADA systems typically log
10-minute averages of 1 Hz sampled values, which include maximum, minimum, and standard
deviation data in addition to the number of starts and stops and alarm logs [55].
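
The 10-minute logging described above can be sketched as follows. This is a minimal illustration that assumes a plain list of 1 Hz samples and a 600-sample window; the function name `aggregate_scada` and the temperature values are hypothetical and do not describe any particular SCADA product.

```python
import statistics

def aggregate_scada(samples, window=600):
    """Summarise 1 Hz samples into per-window mean, max, min and standard deviation."""
    records = []
    # Only complete windows are logged; a trailing partial window is dropped.
    for start in range(0, len(samples) - window + 1, window):
        chunk = samples[start:start + window]
        records.append({
            "mean": statistics.fmean(chunk),
            "max": max(chunk),
            "min": min(chunk),
            "std": statistics.pstdev(chunk),
        })
    return records

# Hypothetical 20 minutes of 1 Hz gearbox oil temperature (degrees C):
# ten minutes steady at 60, then ten minutes alternating between 59 and 63.
signal = [60.0] * 600 + [59.0 if t % 2 == 0 else 63.0 for t in range(600)]
records = aggregate_scada(signal)
```

In the second window the fast oscillation survives only as a mean and a standard deviation, which illustrates why such records do not support spectral analysis.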

Additional parameters such as vibrations, oil pressure levels, and filter statuses may
also be tracked by a WT SCADA system, although these are frequently recorded separately
in a dedicated CM system. This indicates the lack of a standard set of monitoring
equipment across different turbine populations, despite a trend towards installing more
sensors in modern turbines. Data-driven methodologies in both SCADA systems and
purpose-designed CM systems exhibit similar trends. Purpose-designed CM systems can be
partially or entirely integrated into SCADA systems, reflecting the comparable nature of
the data they collect. Consequently, this study concentrates on the methodologies
utilized in these systems for CBM purposes.

Many industries use a variety of data-driven approaches, such as time series analysis,
artificial intelligence (AI), statistical techniques, rough sets, and grey system theory. In
particular, the O&M of OWTs heavily relies on AI and statistical methodologies [56]. A
number of data-driven applications have been highlighted by previous research, including
the use of satellite data for OWT CM [57] and the integration of hybrid approaches that
combine robot-based CM with AI [58]. This study does not cover all of these approaches,
but the following sections focus on the most well-known data-driven strategies
that are currently in use.

2.2.1 Statistical Methods for Condition Monitoring


Given their sensitivity to initial conditions and error accumulation potential, statistical
models are frequently utilised for short-term projections. They estimate failure rates
and component degradation using CM data and prior failure information. The fundamental
techniques in this framework predict the timing of future failures by modelling failure
distributions using functions including the exponential, lognormal, Gaussian, and
Weibull [59]. Applying the AutoRegressive Integrated Moving Average (ARIMA) model to
time series data is a popular statistical technique for failure prediction. ARIMA models,
which combine autoregressive (AR) and moving-average (MA) components with differencing,
are frequently used in time series forecasting [60].
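
As a small illustration of modelling a failure distribution, the sketch below evaluates a two-parameter Weibull survival function and the probability of failure in the next interval given survival so far. The shape and scale values are hypothetical and are not fitted to any data from this thesis or from the cited studies.

```python
import math

def weibull_reliability(t, shape, scale):
    """Probability that a component survives beyond time t (two-parameter Weibull)."""
    return math.exp(-((t / scale) ** shape))

def conditional_failure_probability(t, dt, shape, scale):
    """Probability of failing within (t, t + dt], given survival up to t."""
    return 1.0 - weibull_reliability(t + dt, shape, scale) / weibull_reliability(t, shape, scale)

# Hypothetical parameters: shape > 1 models wear-out; scale is the characteristic life in hours.
shape, scale = 2.0, 40000.0
r_char = weibull_reliability(scale, shape, scale)  # survival at the characteristic life
p_early = conditional_failure_probability(10000.0, 1000.0, shape, scale)
p_late = conditional_failure_probability(30000.0, 1000.0, shape, scale)
```

With a shape parameter above 1, the conditional failure probability for the same 1000-hour interval is larger late in life than early in life, which is the wear-out behaviour such models are used to capture.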

According to [61], a statistical model is a collection of probability distributions on a
sample space. It is a non-deterministic mathematical framework that relies on a set of
statistical assumptions about the sample. Time series analysis, regression analysis,
stochastic processes, Bayesian inference, and probability distributions all fall within
this framework. Rezamand et al. [62] used the Weibull, Lognormal, and Exponential life
distribution models to analyse data from a wind farm. The operational hours, generated
power, and ambient temperature were among the variables analysed. According to their
research, the models that best predict the lifetime distribution of generators are the
3P-Weibull and IPL-Lognormal distributions. Kang et al. [63] used two sets of gearbox
run-to-failure data to assess the Lévy alpha-stable distribution. Their findings
demonstrate that the stable distribution effectively tracks the degradation pathway of
gearboxes. Other work [64] investigates the identification of bearing defects from the
sound and vibration of the tower, without knowledge of the transfer function between the
bearings and the tower, using statistical analysis techniques to distinguish between the
properties of faulty and healthy vibration signals.


Many studies have assessed the benefits and drawbacks of statistical procedures for
OWT CM and compared them with other approaches [65, 66, 67]. Statistical models are
valued in WT CM because they can predict equipment failures and evaluate component
states using historical information, which helps with preventive maintenance. They
do, however, have certain disadvantages, including the need for large, precise historical
datasets, susceptibility to operational changes that may distort predictions, and
difficulties with high dimensionality that may lead to overfitting and computational
complexity. Moreover, in constantly changing situations, these models may lose
predictive accuracy due to systematic errors that develop over time and the requirement
for exact initial parameters.

2.2.2 Artificial Intelligence for Condition Monitoring


AI is broadly defined as systems or software that exhibit intelligent behavior by
analyzing their environment and autonomously taking actions to achieve specific goals,
with capabilities ranging from simulating human cognitive functions to learning and
adapting during operation [68, 69, 70, 71, 72]. AI approaches, especially the branch of
Machine Learning (ML) (Fig. 7), have gained popularity due to improved data management,
storage, and processing capabilities, as well as increased data availability. AI
techniques are generally faster and less expensive to implement than other approaches,
which makes them particularly helpful for intricate, constantly monitored systems. These
methods include reinforcement learning as well as supervised, semi-supervised, and
unsupervised learning [73]. In O&M for WTs, supervised learning uses labelled data to
train models on turbine states, whereas semi-supervised learning combines a small amount
of labelled data with a larger amount of unlabelled data. Without labelled data,
unsupervised learning looks for hidden patterns using techniques like cluster analysis.
Reinforcement learning does not require labelled data; instead, it maximises rewards by
optimising behaviour through environmental interaction and observation.


Figure 7: Artificial Intelligence Categories [74]

Basic ML methods for failure prediction encompass techniques such as linear regression
[75], support vector machines (SVMs) [76], decision trees/random forests [77], K-Nearest
Neighbors (KNNs) [78], and artificial neural networks (ANNs) [79], which include adaptive
neuro-fuzzy inference systems (ANFIS) [80]. A large body of research applies these ML
methods to WT CM and covers all parts of WTs; only a few examples are mentioned here.
In order to monitor turbine blades using vibration data and high-frequency response
function measurements (FRFs), a set of ML approaches was introduced [81].
Damage-sensitive features were extracted by dimensionality reduction of the FRF data
using probabilistic principal component analysis, and these algorithms were tuned for
quick and efficient online monitoring. In many studies [82, 83, 84, 85, 86],
sophisticated ML algorithms capture the complex, nonlinear relationships between inputs
and outputs, and the performance of WTs is then monitored by residual analysis. ML
can also be useful in many other areas of wind energy [87], such as power prediction
or wind load forecasting.
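
Residual analysis, as mentioned above, can be sketched in a few lines. The example assumes a simple least-squares line as a stand-in for the more sophisticated ML models used in the cited studies, and a moving-average residual compared against three standard deviations of the healthy residuals; the function names, the threshold rule, and all data values are hypothetical.

```python
import statistics

def fit_line(x, y):
    """Ordinary least-squares fit y ~ a + b * x; a stand-in for a more complex ML model."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    b = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / sum((xi - mx) ** 2 for xi in x)
    return my - b * mx, b

def residual_alarms(x, y, model, sigma, window=3, k=3.0):
    """Flag samples whose moving-average residual exceeds k * sigma in magnitude."""
    a, b = model
    residuals = [yi - (a + b * xi) for xi, yi in zip(x, y)]
    alarms = []
    for i in range(len(residuals)):
        recent = residuals[max(0, i - window + 1): i + 1]
        alarms.append(abs(statistics.fmean(recent)) > k * sigma)
    return alarms

# Hypothetical healthy data: bearing temperature (deg C) versus power (MW).
power_train = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
temp_train = [42.1, 43.9, 46.2, 47.8, 50.1, 51.9]
model = fit_line(power_train, temp_train)
a, b = model
sigma = statistics.pstdev([t - (a + b * p) for p, t in zip(power_train, temp_train)])

# New data: the last three samples run hotter than the model expects.
power_new = [2.0, 3.0, 4.0, 4.0, 4.0, 4.0]
temp_new = [44.0, 46.0, 48.0, 51.0, 52.0, 52.5]
alarms = residual_alarms(power_new, temp_new, model, sigma)
```

Smoothing the residual before thresholding trades a short detection delay for fewer alarms caused by single noisy samples.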


The amount of time series data makes it possible to use a data-driven approach to
CM; nevertheless, the volume and complexity of CM data make it difficult for
conventional statistical and ML techniques to use this information effectively
[88, 89]. While classic neural network architectures are the source of inspiration for
DL, DL performs noticeably better than them [74]. It builds multi-layered learning
models by combining graph technologies and transformations. Recent developments in DL
methodologies have produced remarkable outcomes in a variety of applications, including
natural language processing (NLP), speech and audio processing, and visual data
processing. The energy sector has not lagged behind: DL has made it possible to use CM
data more effectively by incorporating a variety of parameters into models that extract
important abstract features for CM [89]. However, direct processing with multiple DL
layers may miss important trend and seasonality components. The next section of this
chapter is dedicated to the use of DL in CM.

2.2.3 Condition Monitoring with Deep Learning


DL frameworks use ANNs made up of many nodes (neurons) arranged in successive layers
(Fig. 8) [90]. The connections between neurons allow signals to be transmitted and
processed through the layers until an output is generated. The number of layers, the
number of neurons within each layer, and the particular connections among these neurons
determine the complexity of DL models. Convolutional neural networks (CNNs), recurrent
neural networks (RNNs), and feedforward neural networks (FNNs) are three popular DL
architectures that are appropriate for various tasks and data types [90]. DL models are
widely used in many different fields, including state classification [91], data-driven
fatigue estimation [92], and false alarm detection [93]. Yet their main application is
in anomaly detection across several components; gearboxes, blades, and bearings are the
most often inspected components [94, 95]. A noteworthy example of a successful
application is [96], which establishes the relationship between meteorological factors
(such as temperature, humidity, and wind speed) and the failure behaviour of the five
main WT components (the gearbox, generator, frequency converter, pitch system, and yaw
system). Using historical data, supervised and unsupervised data mining and ML
approaches are used to create a system that can anticipate impending component failures.

Gearboxes are intensively studied because of their vital importance and high failure
rates [97]. DL models are increasingly being used for anomaly detection and diagnosis,
particularly using vibration signals and SCADA data [98, 99]. To enhance performance,
it is common to combine techniques like Decision Trees, Particle Swarm Optimization,
Genetic Algorithms, and others with ML and Big Data methods for anomaly detection
and prediction in WTs [90]. Specifically, DL models are extensively employed, either
alone or in conjunction with other AI approaches, to analyze signals from various sensors
for detecting faults in blades, bearings, and other components, using data like SCADA,
vibration, and temperature.


Figure 8: Simple Neural Network Architecture [74]

The vast amount of research on CM utilizing DL models is beyond the scope of this study;
hence, only a few examples evaluated as the most successful are presented in this
section. Xu et al. [100] developed a composite quantile regression long short-term
memory network with group lasso (CQR-LSTM-GL), trained on SCADA data, to address the
imbalance between abnormal and healthy data. The method employs the CQR-LSTM-GL model
to define normal WT behavior and uses decision trees (DT) to detect anomalies by
comparing smoothed residuals against a dynamically adjusted threshold. Its flexibility
and usefulness are enhanced by its ability to spot anomalies without the need for
labelled data. A drawback is that it depends heavily on accurately describing normal
behaviour, so it may miss local abnormalities and fail to generalize.

Zhu et al. [101] introduced a novel methodology that integrates long short-term memory
networks, fuzzy synthesis, and feature-based transfer learning to predict and evaluate
the operational state of WTs, utilizing historical CM data to calibrate the model and
predict imminent component failures. The advantage of this method is its effective use
of historical CM data, which enhances predictive accuracy; however, its reliance on
extensive training data and the complexity of integrating multiple algorithms may limit
its applicability in scenarios with limited data availability or computational resources.

The methodology developed by Lyu et al. [102], termed Decomposed Sequence Interactive
Network (DSI-Net), is designed for anomaly detection in WT SCADA data to predict and
evaluate the operational state of components through a combination of sequence
decomposition and interactive learning. The DSI-Net method has the advantage of
effectively utilising SCADA data for precise CM of WTs through the extraction and
interactive learning of spatio-temporal features, although its potential complexity and
computational demand may limit its applicability in resource-constrained environments.

McKinnon et al. [103] introduced three distinct anomaly detection models (Elliptical
Envelope, Isolation Forest, and One-Class Support Vector Machine) to monitor the state
of WT gearboxes using SCADA data. By comparing the performance of each model over two
distinct months, each turbine’s health status could be effectively ascertained. With
only two months of data, this method can evaluate the health of a turbine gearbox, which
is advantageous since it eliminates the need for extensive historical data processing
and storage and thus lowers computing expenses. However, the short data period might
not capture all variability and anomalies in turbine operation, so there is a risk of
decreased diagnostic accuracy.

While previous research has demonstrated the effectiveness of DL models in various
domains, there remains a lack of a comprehensive, reliable strategy that addresses
real-world applications, particularly where the explainability of model decisions is
crucial. Such a strategy should also address data quality effectively and ensure that
the models can generalize across varied circumstances involving large amounts of data,
thus providing the high accuracy and reliability that are crucial for O&M in OWTs, a
sector known for its high costs. However, the inherent “black box” nature of DL models
poses challenges for industry adoption, as the inability to fully explain decisions can
undermine trust, especially when models that are above 90% accurate may still generate
costly false alarms. Among the methodologies explored by previous studies, this study
found autoencoders and CNNs to be the most successful with respect to the points
mentioned above; the next sections therefore examine each of these methodologies in
detail.

2.2.4 Auto-Encoders for Anomaly Detection


The encoder and the decoder are the two main parts of an autoencoder DL model.
The encoder aims to minimise redundancy and preserve important information by
compressing a normally high-dimensional dataset into a lower-dimensional code. The
decoder then attempts to reconstruct the input data from this code as closely as
possible. Through this encoding and decoding process, autoencoders can learn effective
data representations, which is useful for applications like dimensionality reduction and
data denoising [104]. The architecture of an autoencoder is shown in Fig. 9.
Researchers have used this technique [105] to model the historical distribution of
data from WTs with the goal of determining whether or not new data points fit into
this known distribution. Furthermore, earlier research [106] has shown that this design
can learn the properties of time-series data in an unsupervised manner, which makes
it appropriate for anomaly detection.
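
The encode, decode, and compare logic can be illustrated with a deliberately simple stand-in. The sketch below is not a deep autoencoder: it uses a linear encoder and decoder along the first principal axis of 2-D data (equivalent to one-component PCA), fitted on healthy samples only, and flags a point when its reconstruction error exceeds the healthy mean plus three standard deviations. The function names, the threshold rule, and the data are all hypothetical.

```python
import math
import statistics

def fit_linear_autoencoder(points):
    """Fit a 2-D -> 1-D -> 2-D linear encoder/decoder along the first principal axis."""
    mx = statistics.fmean([p[0] for p in points])
    my = statistics.fmean([p[1] for p in points])
    sxx = sum((p[0] - mx) ** 2 for p in points)
    syy = sum((p[1] - my) ** 2 for p in points)
    sxy = sum((p[0] - mx) * (p[1] - my) for p in points)
    theta = 0.5 * math.atan2(2.0 * sxy, sxx - syy)  # orientation of the principal axis
    return (mx, my), (math.cos(theta), math.sin(theta))

def reconstruction_error(point, model):
    """Encode to a single code value, decode, and return the Euclidean reconstruction error."""
    (mx, my), (ux, uy) = model
    dx, dy = point[0] - mx, point[1] - my
    code = dx * ux + dy * uy                 # encoder: 2-D -> 1-D
    rx, ry = mx + code * ux, my + code * uy  # decoder: 1-D -> 2-D
    return math.hypot(point[0] - rx, point[1] - ry)

# Hypothetical healthy operating points (two correlated sensor channels).
healthy = [(1.0, 1.1), (2.0, 1.9), (3.0, 3.1), (4.0, 3.9), (5.0, 5.1), (6.0, 5.9)]
model = fit_linear_autoencoder(healthy)
healthy_errors = [reconstruction_error(p, model) for p in healthy]
threshold = statistics.fmean(healthy_errors) + 3.0 * statistics.pstdev(healthy_errors)

def is_anomaly(point):
    """A point is anomalous if it cannot be reconstructed as well as the healthy data."""
    return reconstruction_error(point, model) > threshold
```

A point that follows the learned relationship between the two channels reconstructs almost perfectly, whereas a point that breaks it, such as (3.0, 6.0), yields a large error and is flagged.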


Figure 9: Autoencoder Architecture

As in the previous section, the following are a few examples of how autoencoders can
be applied successfully for anomaly detection, together with the advantages and
disadvantages that this study found noteworthy. Using SCADA data, the
Triplet-Convolutional Deep Autoencoder (Triplet-Conv DAE) approach of Liu et al. [106]
improves anomaly detection in WTs by fusing convolutional autoencoders with deep
metric learning. In order to increase detection accuracy, this approach seeks to capture
both typical operational patterns and the characteristics that distinguish anomalies.
Despite its innovation, the Triplet-Conv DAE technique has some real-world drawbacks,
including data imbalance, high computing needs, difficult model tuning, reliance on
high-quality data, and scaling problems in a variety of operational situations.

Li et al. [107] utilise a technique called the Deep Small-World Neural Network (DSWNN),
which is intended for early anomaly detection in WTs. It starts with an unsupervised
learning approach and then uses minimal supervised learning to fine-tune network
parameters in an effort to predict and detect faults efficiently in a difficult,
data-sparse environment. This approach has the benefit of strong defect detection in
settings with little labelled data, since it makes use of the strengths of small-world
neural networks and deep auto-encoders to improve prediction accuracy. However, the
drawback is that the small-world transformation and fine-tuning process may result in
greater complexity and processing needs, which could impede scalability and rapid
implementation.

Zhang et al. [108] use a Long Short-Term Memory-based Variational Autoencoder
Wasserstein Generative Adversarial Network (LSTM-based VAE-WGAN) for anomaly
identification in WTs. Through improved local information extraction and feature
amplification, this approach, which is used under semi-supervised training conditions,
is intended to overcome the obstacles presented by small and noisy data samples. It has
the advantage of being able to reliably learn intricate, high-dimensional data
distributions and handle small, noisy datasets in an adaptive manner, which greatly
improves detection accuracy. The intricacy and computational demands of combining LSTM,
VAE, and GAN architectures, however, could be a drawback because they could cause
problems with training effectiveness and resource requirements such as computation and
data.

Using SCADA data, the methodology of Liu et al. [109], dubbed Sparse Dictionary Learning
based Adversarial Variational Auto-Encoders (AVAE SDL), integrates sparse dictionary
learning with adversarial variational auto-encoders to identify anomalies in WTs. By
utilising the feature extraction strength of sparse dictionary learning and the
generative capabilities of GANs, this method seeks to improve fault detection
reliability by precisely identifying impending defects. By reducing the influence of
random noise and extracting crucial features from high-dimensional data, the AVAE SDL
approach has the advantage of improving the ability to identify defects in WTs with
accuracy and reliability. The drawback, on the other hand, is that training and
optimising the hybrid model may be computationally demanding and sophisticated, which
may cause problems for scalability and practical application.

These approaches use an autoencoder to model the healthy operating data from SCADA and
CM systems and to identify deviations from that norm, so abnormalities or gearbox
breakdowns may be excluded from training. These studies have brought to light a number
of applications; however, as with earlier models, they typically fail to incorporate
faulty samples and usually ignore anomalous data in the training phase. As a result, the
DL model is skewed towards normal data and fails to generalise. The hybrid versions
employed are expensive models with high demands on computational resources and data.

2.2.5 Convolutional Autoencoder for Anomaly Detection


In DL techniques, CNNs are becoming more and more popular, especially in anomaly
detection [110]. CNNs, as seen in Fig. 10, are a type of DL algorithm that processes data
with a grid-like topology, such as photographs, using a hierarchical, multi-layer
architecture. They do this by applying convolutions, non-linear activations, and pooling
to extract and learn spatial hierarchies of features. CNNs were originally developed for
image processing, and they are particularly useful for tasks like image recognition and
classification because their architecture uses shared weights and local connections to
exploit the 2D structure of an input image. For example, Altice et al. [111] train CNNs
on a new dataset of photographs of small WT blades to identify flaws. This strategy aims
to lower inspection risks and expenses while improving fault identification accuracy. The
model performs well on test data, demonstrating the high accuracy of this technology in
defect detection. The drawback is that overfitting may occur, particularly with smaller
or less varied datasets, which may restrict the model’s applicability to real-world
situations.
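
The three operations named above (convolution, non-linear activation, and pooling) can be
illustrated with a minimal NumPy sketch. It is not taken from any of the cited studies:
the function names, the toy 6 by 6 image, and the hand-picked edge filter are assumptions
made purely for illustration, whereas a real CNN learns its kernels from data.

```python
import numpy as np

def conv2d_valid(image, kernel):
    """Slide a 2D kernel over a 2D image (no padding, stride 1)."""
    kh, kw = kernel.shape
    h, w = image.shape
    out = np.empty((h - kh + 1, w - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            # The same kernel weights are reused at every position (shared weights).
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

def relu(x):
    """Non-linear activation: keep positive responses, zero out the rest."""
    return np.maximum(x, 0.0)

def max_pool2x2(x):
    """Keep the largest response in each non-overlapping 2 x 2 block."""
    h, w = x.shape[0] // 2 * 2, x.shape[1] // 2 * 2
    x = x[:h, :w]  # drop a trailing row or column that does not fill a block
    return x.reshape(h // 2, 2, w // 2, 2).max(axis=(1, 3))

# Toy image (hypothetical): dark left half, bright right half.
image = np.zeros((6, 6))
image[:, 3:] = 1.0
kernel = np.array([[-1.0, 1.0]])  # hand-picked horizontal difference filter

features = max_pool2x2(relu(conv2d_valid(image, kernel)))
print(features.shape)  # (3, 2)
```

The pooled feature map is non-zero only in the column that covers the dark-to-bright
edge, which is the kind of local spatial feature the paragraph above refers to.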

20.04.2024 Student number: 120997 24


Title: Data Cubes and Deep Learning: Gearbox Condition Monitoring for Offshore Wind Turbine

Figure 10: Simple CNN Architecture [74]

Researchers have also created many hybrid strategies that combine one-dimensional
sequence models such as LSTMs and RNNs with one- or two-dimensional CNNs to create better
models. For example, to improve CM and anomaly identification in WTs, Zhu et al. [112]
devised a system that integrates Kernel Principal Component Analysis (KPCA) with a CNN
and an LSTM. To enhance the precision of forecasting operating states and possible
malfunctions in intricate mechanical systems, this method combines the extraction of
spatial and temporal features. The advantage of this methodology is its capacity to
extract complete features, which improves the accuracy of WT anomaly detection and
prediction. The drawback, however, is the computational load and complexity needed to
integrate and optimise these sophisticated algorithms, which may prevent quick deployment
and efficient real-time analysis.

Convolutional layers can be used in place of the nodes of a conventional autoencoder to
create a new architecture called the convolutional autoencoder (CAE), as shown in
Figure 11. Researchers have used this method to find anomalies in a variety of
industries, including wind energy. The Graph Convolutional Autoencoder for Multivariate
Time Series (MTGCAE), a technique created by Miele et al. [113], combines autoencoders
and graph convolutional networks to improve anomaly identification in WTs using SCADA
data. This method represents sensor networks as dynamic functional graphs, allowing for
an improved identification of abnormalities by taking advantage of the spatial and
temporal correlations contained in multivariate sensor data. The MTGCAE approach has the
advantage of being able to capture intricate relationships and dependencies between many
sensors, which may result in more accurate anomaly identification. On the downside, the
complexity of graph-based learning and the requirement to define a suitable graph
structure could present difficulties, particularly with regard to scalability and
real-time application.

Figure 11: Convolutional Autoencoder Architecture

2.3 Literature Review Summary


The vast amount of data gathered by WTs makes it easier to design a variety of data-
driven, long-term solutions for CM automation. Because of the high dimensionality,
hidden seasonality, environmental adaptability, and reliance on domain expertise of WT
data, conventional statistical and ML approaches may find it difficult to generalise the
results to future data trends or more widely used applications. These models are highly
dependent on the ongoing contributions of statisticians, data scientists, and domain spe-
cialists, which might be unaffordable for real-world applications. The literature study
states that, taking these factors into account, DL methods stand out as the most promis-
ing method for detecting anomalies in WT data, with the potential to increase accuracy
and dependability.

Even with advances in using DL for anomaly detection, existing studies frequently fall
short of meeting the necessary standards for lowering false alarms or improving model
accuracy. Building trust is crucial for DL applications to become widely accepted in
the industry, especially because these solutions are black boxes. Gaining the requisite
confidence fundamentally requires improving accuracy and efficiency. Moreover, the sus-
tainable energy sector must create accurate and dependable solutions in order to pursue
carbon-neutral goals. DL can help lower O&M costs by reducing reliance on extensive
and expensive systems for CM, which could boost the competitiveness of sustainable en-
ergy. This disparity highlights the vital need for approaches that improve the accuracy
and dependability of decision-making, directly corresponding with the goals of bolstering
sustainable energy projects and boosting operational effectiveness.

The review also emphasises the need for more research to resolve data discrepancies
and the reliance on large datasets. The current body of literature falls short in offering
adequate solutions for lowering data volume requirements while maintaining or enhancing
model performance. This shortcoming is especially problematic for the sustainable
energy industry, since effective data processing and usage are necessary to lower energy
consumption and reduce the operating expenses associated with data management.
Furthermore, the models presented frequently either train primarily on normal data and
identify only a small number of aberrant samples, or rely on having datasets that are
balanced between normal and abnormal situations. This may produce models that overfit
to the available anomalous samples, making them incapable of generalising
successfully to novel anomalies that are not well represented in the training data.

The literature analysis has highlighted a crucial gap in the field of CM for OWTs,
especially with regard to data efficiency, model dependability, and false alarm
avoidance. For anomaly detection in this industry to become more accurate and reliable,
these issues must be resolved. Thus, future work should focus on novel approaches that
improve model accuracy while handling the seasonally variable and high-dimensional data
that are intrinsic to OWT operations. Such research would help to ensure that the
solutions created are robust to data inconsistencies, in addition to helping to reduce
the amount of data required. By concentrating on these areas, future research may lead to
significant improvements in operational effectiveness and support the renewable energy
sector’s sustainability objectives, opening the door for more dependable and financially
sound OWT maintenance techniques.

3 Methodology & Implementation


The current study follows a data pipeline, shown in Figure 12, for project management.
The steps outlined here are based on previous research guidelines [114] and are consistent
with the design science implementation framework that Peffers et al. [33] suggested.

Figure 12: Proposed Data Pipeline

Every step listed in the data pipeline is examined in depth in this section. In order to
achieve proactive CM of OWTs, this study proposes a DL system for anomaly detection
within gearbox CM data. The data pipeline presented here will ultimately serve as an
artifact for this thesis. The main focus of this study is on the data preprocessing step,
where signal-to-image processing algorithms are used to transform time-series signal data
into image representations and then into data cubes. The methodology in this study was
inspired by a previous study on SCADA data in which three signal-to-image processing
algorithms were used to transform the temperatures of two generator bearings into images
and detect anomalies in an unsupervised fashion using autoencoders. That study showed
that the approach was effective even in the presence of small anomalies [34]. This
foundation highlights how image-based analysis might improve anomaly identification by
giving the model a better understanding of the trend, seasonality, and cycles within the
data.

The inclusion of data cubes in this study was motivated by two factors: first, a personal
inspiration, and second, research on the structure of Lidar sensor data. The personal
inspiration originated in a comparison with layered educational images (see Figure 13)
that improve human comprehension of intricate anatomical structures. This visual
methodology, which supports an intuitive understanding of the spatial relationships and
configurations within the human body, prompted an investigation into whether comparable
methods could improve DL models’ comprehension of the properties of transformed image
data. This study investigates the integration of CNNs with autoencoder architectures,
further motivated by the effective use of CNNs in handling similarly structured data from
Lidar sensors and by the reliable performance of CNN architectures in detecting anomalies
even in one-dimensional time-series data compared to other methodologies, as discussed
in the literature review. By combining CNNs’ feature extraction powers with
autoencoders’ dimensionality reduction and reconstruction capabilities, CAEs offer a
methodology that has the potential to significantly improve anomaly identification in
OWTs.

Figure 13: A Plastic Overlay Illustrating the Anatomy of the Human Body [115]

3.1 First Step: Data Mining


To increase reliability and reduce maintenance costs, the National Renewable Energy
Laboratory (NREL) extensively monitored the vibration condition of WT gearboxes [116].
Data were gathered using controlled dynamometer experiments to compare a “healthy”
and a “damaged” gearbox of the same design. While the damaged gearbox underwent
initial dynamometer testing and then field testing at a wind farm close to NREL, the
healthy gearbox was tested only in the dynamometer environment. Because of loss-of-oil
events that occurred in the damaged gearbox during these tests, NREL was able to
obtain important information about the effects of such incidents on the gearbox
internals. In order to support the investigation and creation of more effective
vibration-based CM systems, the data collection aimed to close the gap in benchmarking
datasets.

Each gearbox was instrumented with around 125 sensors, which recorded a broad range of
operational characteristics. Accelerometers mounted on the gearbox measured vibration at
a high sampling rate of 40 kHz per channel. Together with the high-speed shaft RPM
signal, these sensors gave a thorough picture of the gearbox’s operating dynamics. The
data were recorded with a high-speed data acquisition device from National Instruments,
the PXI-4472B. The purpose of this configuration was to provide thorough monitoring and
precise data collection, both of which are essential for examining the behaviour of the
gearbox under varied load and failure scenarios.

The test results yielded datasets with a broad range of information, such as precise
vibration measurements at particular gearbox locations, gearbox working speeds, and the
related mechanical states. Every test run included continuous data collection, yielding
operational data down to the minute, which is essential for in-depth analysis and
modelling. These datasets are especially useful for creating and verifying new diagnostic
tools and techniques meant to anticipate and mitigate WT gearbox breakdowns because
of their vast and detailed nature. These data are critical to the development of CM
methods, which have the potential to significantly increase WT operational reliability
and efficiency. Nine signals (eight accelerometers and the HS-SH speed signal) are chosen
for the anomaly detection, and each of them offers vital measurements that are necessary
for identifying irregularities in WT gearboxes [116]. These sensors allow for the
detection of anomalous patterns that may point to mechanical problems by capturing
vibrational signals from a number of crucial gearbox components, illustrated in Fig. 14.

Figure 14: Vibration Sensor Locations [116]

The significance of each sensor for anomaly detection in gearboxes is explained as follows:

1. AN3 and AN4 - Ring Gear Radial Sensors:

(a) Position: Positioned at 6 o’clock (AN3) and 12 o’clock (AN4) on the ring
gear.
(b) Purpose: These sensors capture the radial vibrations of the ring gear, which is
a central component that interacts with both the sun and planet gears. Monitoring
the vibrations at different positions allows for detecting misalignment,
wear, or damage such as gear teeth defects.
2. AN5 - LS-SH Radial Sensor:
(a) Position: Mounted to measure the radial vibrations of the Low-Speed Shaft
(LS-SH).
(b) Purpose: This sensor helps monitor the vibrations directly associated with
the low-speed shaft’s operation. It is crucial for detecting faults like shaft
misalignment, unbalance, or bearing failures at an early stage.
3. AN6 - IMS-SH Radial Sensor:
(a) Position: Positioned to measure the Intermediate-Speed Shaft (IMS-SH) ra-
dial vibrations.
(b) Purpose: The intermediate-speed shaft transmits torque from the low-speed
to the high-speed stages, and monitoring its vibrations is vital for detecting
anomalies like shaft misalignment and bearing wear.
4. AN7 - HS-SH Radial Sensor:
(a) Position: Measures radial vibrations on the High-Speed Shaft (HS-SH).
(b) Purpose: This sensor is key to monitoring the operational integrity of the
high-speed shaft, where faults could lead to significant failures due to the
high operational speeds.
5. AN8 and AN9 - HS-SH Bearing Radial Sensors:
(a) Position: AN8 and AN9 are mounted near the upwind and downwind bearings
of the High-Speed Shaft.
(b) Purpose: These sensors provide crucial data on the state of the bearings, cap-
turing vibrations that might indicate bearing degradation or failure, crucial
for preventing catastrophic gearbox failures.
6. AN10 - Carrier Downwind Radial Sensor:
(a) Position: Measures radial vibrations of the carrier at the downwind position.
(b) Purpose: This sensor’s location makes it ideal for detecting anomalies in the
carrier, which holds the planetary gear system together, thereby identifying
issues like bearing wear or structural weaknesses in the carrier assembly.
7. Speed Sensor (HS-SH):
(a) Measurement: Records the rotational speed of the High-Speed Shaft.
(b) Purpose: The speed sensor is critical for context, as changes in vibration
patterns can be related to changes in operational speeds. It helps correlate
vibrational data with specific operational conditions.

Table 2: Sensor Descriptions, Models, and Units

Sensor Label/Signal Name   Description                      Sensor Model   Units in Data File
AN3                        Ring gear radial 6 o’clock       IMI 626B02     m/s²
AN4                        Ring gear radial 12 o’clock      IMI 626B02     m/s²
AN5                        LS-SH radial                     IMI 622B01     m/s²
AN6                        IMS-SH radial                    IMI 622B01     m/s²
AN7                        HS-SH radial                     IMI 622B01     m/s²
AN8                        HS-SH upwind bearing radial      IMI 622B01     m/s²
AN9                        HS-SH downwind bearing radial    IMI 622B01     m/s²
AN10                       Carrier downwind radial          IMI 622B01     m/s²
Speed*                     HS-SH                                           rpm

There are various benefits to concentrating on these particular sensors for anomaly de-
tection in gearboxes. First, they guarantee thorough coverage by keeping an eye on
all important mechanical parts, including the ring gear, shafts, and bearings, giving an
overall picture of the gearbox’s condition. The sensors increase the system’s sensitivity
to changes in alignment, balance, and mechanical integrity for early anomaly detection
by focusing on both high-speed and low-speed components, which allows them to catch
a wide range of possible problems. The addition of speed measurements provides impor-
tant operational context and facilitates the distinction between true abnormalities and
regular operational changes.

3.2 Second Step: Data Analysis


The dataset contains twenty files, ten designated as healthy and ten as damaged. Each
file holds data from nine sensors, with 2.4 million rows per variable, and contains
time-series data corresponding to a 60-second time frame. The data are of high quality
and contain no missing values.
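
As a quick consistency check, the row count follows from the 40 kHz sampling rate given
in Section 3.1 and the 60-second file length. The sketch below also counts how many
4096-sample windows (the 64 by 64 window introduced in Section 3.3.1) fit into one
file; treating the windows as non-overlapping is an assumption made here for
illustration, and the constant names are hypothetical.

```python
SAMPLING_RATE_HZ = 40_000   # per-channel sampling rate (Section 3.1)
FILE_DURATION_S = 60        # length of each file in seconds
WINDOW_LENGTH = 64 * 64     # samples per image (Section 3.3.1)

samples_per_file = SAMPLING_RATE_HZ * FILE_DURATION_S
# Assumption: windows do not overlap, so any remainder is left unused.
full_windows, leftover = divmod(samples_per_file, WINDOW_LENGTH)

print(samples_per_file)        # 2400000
print(full_windows, leftover)  # 585 3840
```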

Statistical analysis of the first healthy sample, presented in Table 3, shows that the
mean values of sensors like AN3 and AN4 are marginally negative and almost zero,
suggesting a central tendency around near-zero acceleration. The standard deviations,
especially for sensors such as AN6 and AN8, reflect a wide spread of data points that may
represent different vibration responses in typical working environments. Particularly for
sensors like AN7 and AN9, the minimum and maximum readings exhibit a significant range,
suggesting sporadic extreme values that may be the result of temporary circumstances. The
distribution can be understood through the quartiles; the 25th and 75th percentiles
indicate that most data points are concentrated around the mean, with possible outliers
adding to the wide range.

Table 3: Descriptive Statistics for Healthy Gearbox Sensors

Sensor   Count   Mean      Std Dev   Min       25%       75%       Max
AN3      2.4M    -0.0035   1.1337    -7.841    -0.7752   0.7679    7.845
AN4      2.4M    -0.0023   1.9153    -9.194    -1.3243   1.3194    8.998
AN5      2.4M    -0.0912   1.7150    -9.034    -1.2505   1.0679    9.026
AN6      2.4M    -0.2377   2.8050    -14.219   -2.1309   1.6552    13.780
AN7      2.4M    0.0189    2.6228    -13.830   -1.7393   1.7761    14.199
AN8      2.4M    -0.0854   4.7370    -19.730   -3.4349   3.2137    19.528
AN9      2.4M    -0.3791   2.1300    -7.741    -1.8443   1.0857    7.635
AN10     2.4M    0.0049    1.9012    -9.872    -1.2805   1.2856    10.074
Speed    2.4M    0.0922    0.6580    -0.432    0.0006    0.1835    1.430
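
The columns of Table 3 can be reproduced for any single signal with a few NumPy calls.
The sketch below is illustrative only: the `describe` helper is hypothetical, the input
is a synthetic Gaussian stand-in rather than the NREL data, and the use of the sample
standard deviation (`ddof=1`) is an assumption about how the table was computed.

```python
import numpy as np

def describe(signal):
    """Summary statistics matching the columns of Table 3 for one signal."""
    signal = np.asarray(signal, dtype=float)
    q25, q75 = np.percentile(signal, [25, 75])
    return {
        "count": signal.size,
        "mean": float(signal.mean()),
        "std": float(signal.std(ddof=1)),  # assumption: sample standard deviation
        "min": float(signal.min()),
        "25%": float(q25),
        "75%": float(q75),
        "max": float(signal.max()),
    }

# Synthetic stand-in for one accelerometer channel (not real measurements).
rng = np.random.default_rng(0)
synthetic = rng.normal(loc=0.0, scale=1.0, size=10_000)

stats = describe(synthetic)
for name, value in stats.items():
    print(f"{name:>5}: {value:.4f}")
```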

In contrast, the damaged gearbox data presented in Table 4 show higher standard
deviations across sensors, including AN5 and AN6, indicating more variability in sensor
readings overall and suggesting more erratic behaviour, potentially owing to mechanical
problems. For most accelerometers (AN3, AN4, AN5, AN7, and AN10) the means are marginally
more negative than the healthy gearbox’s but still very near to zero, which may indicate
that damage has caused a change in the normal vibration pattern. This dataset exhibits
higher extremes in minimum and maximum readings, particularly in the high-speed shaft
bearing sensors (AN8 and AN9), which may be directly impacted by gearbox deficiencies.
The quartiles show a wider range in the middle 50% of the data, indicating a higher
degree of performance unpredictability in the gearbox.

Table 4: Descriptive Statistics for Damaged Gearbox Sensors

Sensor   Count   Mean       Std Dev   Min        25%        75%        Max
AN3      2.4M    -0.0296    2.8748    -11.355    -2.0409    1.9811     11.488
AN4      2.4M    -0.0188    2.9146    -12.871    -1.9519    1.9145     12.789
AN5      2.4M    -0.1288    3.8564    -19.942    -2.6938    2.4365     20.004
AN6      2.4M    -0.0395    4.0245    -18.215    -2.7791    2.6997     18.066
AN7      2.4M    -0.0557    3.7733    -17.368    -2.6333    2.5220     16.748
AN8      2.4M    -0.0398    7.5640    -33.286    -5.1225    4.8852     32.788
AN9      2.4M    -0.0811    7.1544    -33.073    -5.0156    4.8532     32.552
AN10     2.4M    -0.0168    3.9631    -19.333    -2.7017    2.6683     18.976
Speed    2.4M    1799.823   3.0525    1792.918   1796.816   1802.829   1806.934

Figure 15: Distributions of the First Four Sensors for a Healthy (Green) and a Damaged (Orange) Gearbox

Figure 16: Distributions of the Last Four Sensors for a Healthy (Green) and a Damaged (Orange) Gearbox

The correlation matrix in Fig. 17 and the scatter plots in Fig. 19 for the healthy
gearbox highlight several significant relationships among the sensors that reflect the
mechanical interdependencies expected in a functioning gearbox. As an example, sensors
AN4 and AN6 have a positive correlation of roughly 0.33, pointing to coordinated
behaviour among the components they monitor, which is consistent with sound operation. On
the other hand, AN4 and AN7 have a negative correlation of -0.30, which may be related
to their functions in the gearbox to balance opposing mechanical forces. AN8 and AN9 have
the strongest positive correlation, with a value of 0.53. This is probably because they
monitor similar components, namely the high-speed shaft bearings. AN4 and AN10, on the
other hand, show a substantial negative correlation of -0.35, suggesting that the
components they monitor respond in opposite ways. The gearbox’s condition and the
effectiveness of the sensor placements in capturing crucial dynamics are confirmed by
these correlations, which collectively offer a distinct and expected pattern of
operational behaviour.

Figure 17: Correlation Matrix for the Healthy Gearbox

The correlation matrix for the damaged gearbox (Fig. 18), together with the scatter plots
in Fig. 20, reveals significant disruptions in sensor correlations, which strongly
suggest underlying mechanical issues. For instance, the change in correlation between AN3
and AN4 from positive (0.13) in the healthy condition to negative (-0.16) in the damaged
condition suggests that the components these sensors are monitoring may have become
misaligned or decoupled. Furthermore, formerly strong correlations have noticeably
diminished, such as that between AN5 and AN6, suggesting that the gearbox’s failing state
may be affecting the sensors’ capacity to identify coordinated operations. Moreover, the
increase in correlation between AN7 and AN9 to 0.35 in the damaged state may reflect a
uniform reaction to damage. This is probably caused by increased vibrational energy or
harmonics from faults such as misalignments or bearing wear. These alterations, including
the emergence of novel abnormal correlations, suggest severe mechanical disruptions
typical of gearbox degradation.
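
The comparison of the two correlation matrices can be sketched with NumPy as follows.
This is an illustrative example on synthetic signals, not the NREL data: the three
hypothetical channels are constructed so that the coupling between the first two flips
sign in the "damaged" case, mimicking the kind of sign change described above for AN3
and AN4.

```python
import numpy as np

def correlation_matrix(signals):
    """Pearson correlation between the columns of a (samples, sensors) array."""
    return np.corrcoef(signals, rowvar=False)

rng = np.random.default_rng(1)
n = 5_000
shared = rng.normal(size=n)  # common vibration source (synthetic)

# Healthy stand-in: channels 0 and 1 move together, channel 2 is independent.
healthy = np.column_stack([
    shared + 0.5 * rng.normal(size=n),
    shared + 0.5 * rng.normal(size=n),
    rng.normal(size=n),
])
# Damaged stand-in: the coupling of channel 1 to the shared source is inverted.
damaged = np.column_stack([
    shared + 0.5 * rng.normal(size=n),
    -shared + 0.5 * rng.normal(size=n),
    rng.normal(size=n),
])

corr_healthy = correlation_matrix(healthy)
corr_damaged = correlation_matrix(damaged)
shift = corr_damaged - corr_healthy  # large entries mark disrupted sensor pairs

print(np.round(shift, 2))
```

Inspecting the difference matrix rather than the two matrices separately makes the
disrupted sensor pairs stand out directly.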

Figure 18: Correlation Matrix for the Damaged Gearbox

By employing statistical and correlation measures, this examination has distinguished
between healthy and damaged WT gearbox states, offering useful information on the
operational health of these systems. However, there are a number of key issues that
typical statistical methods may not be able to fully handle, including the complexity of
gearbox dynamics, the subtlety of early-stage damage, and the massive amount of data
generated from various sensors over continuous operation. These techniques are strong at
spotting distinct anomalies and well-established patterns, but they frequently fall short
in picking up on subtle irregularities that appear before obvious indications of damage.
Furthermore, anomaly detection may be overly sensitive or insufficiently sensitive due to
the predetermined thresholds and established patterns used in traditional statistical
analysis.

Figure 19: The Scatter Plot of All Sensors from Healthy Dataset

Figure 20: The Scatter Plot of All Sensors from Damaged Dataset

3.3 Third Step: Data Preprocessing


Since there were no obvious inconsistencies in the dataset, preprocessing was kept to a
minimum. A min-max scaler was used since the signals have different distributions, each
with its own minimum and maximum values [117]. As is common in the DL domain, this scaler
rescales numerical features to lie between zero and one. Normalisation is necessary to
guarantee that every variable in the dataset is given the same weight by the models,
which makes learning and interpretation more consistent.
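
Min-max scaling maps each value x to (x - min) / (max - min). A minimal sketch is given
below; the `min_max_scale` helper is hypothetical, and the example values are the AN3
minimum, quartiles, and maximum from Table 3, used only to show the effect of the
transformation.

```python
import numpy as np

def min_max_scale(x, lower=None, upper=None):
    """Rescale x linearly so that lower maps to 0 and upper maps to 1."""
    x = np.asarray(x, dtype=float)
    lower = x.min() if lower is None else lower
    upper = x.max() if upper is None else upper
    if upper == lower:
        raise ValueError("cannot scale a constant signal")
    return (x - lower) / (upper - lower)

# AN3 minimum, 25th percentile, 75th percentile, and maximum from Table 3.
readings = np.array([-7.841, -0.7752, 0.7679, 7.845])
scaled = min_max_scale(readings)

print(scaled[0], scaled[-1])  # 0.0 1.0
```

Passing explicit `lower` and `upper` bounds (for example, bounds fitted on the training
data) keeps the scaling consistent across files; whether the thesis fits the scaler per
file or globally is not stated here.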

Model accuracy and generalisation can be improved by including signal-to-image processing
methods during the data preparation phase, as prior research [34] showed for SCADA data.
Six signal-to-image processing techniques are used in this study, and their performance
is compared. The structure of Lidar 3D data [118] inspired the layering of images,
generated from different sensors by the signal-to-image processing algorithms, to produce
data cubes, a novel type of data. The sections that follow detail each method used in
this data pipeline step to prepare the data in the necessary format.
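
The layering idea can be sketched as follows. This is a minimal illustration under
stated assumptions, not the thesis implementation: the `to_image` encoder is a
placeholder that simply rescales and reshapes a window, the windows are assumed not to
overlap, and the channels-first axis order of the cube is a choice made here for
convenience. The 64 by 64 image size is the window size adopted in Section 3.3.1.

```python
import numpy as np

IMAGE_SIDE = 64
WINDOW_LENGTH = IMAGE_SIDE * IMAGE_SIDE  # 4096 samples per image

def to_image(window):
    """Placeholder encoder: min-max scale one window and reshape it to 64 x 64."""
    lo, hi = window.min(), window.max()
    scaled = (window - lo) / (hi - lo) if hi > lo else np.zeros_like(window)
    return scaled.reshape(IMAGE_SIDE, IMAGE_SIDE)

def build_data_cubes(signals, encoder=to_image):
    """Convert a (samples, sensors) array into (windows, sensors, 64, 64) cubes."""
    n_samples, n_sensors = signals.shape
    n_windows = n_samples // WINDOW_LENGTH  # incomplete trailing window is dropped
    cubes = np.empty((n_windows, n_sensors, IMAGE_SIDE, IMAGE_SIDE))
    for w in range(n_windows):
        segment = signals[w * WINDOW_LENGTH:(w + 1) * WINDOW_LENGTH]
        for s in range(n_sensors):
            # One image per sensor; stacking them gives the cube for this window.
            cubes[w, s] = encoder(segment[:, s])
    return cubes

# Synthetic stand-in for one file with nine signals (not real measurements).
rng = np.random.default_rng(0)
signals = rng.normal(size=(3 * WINDOW_LENGTH + 100, 9))

cubes = build_data_cubes(signals)
print(cubes.shape)  # (3, 9, 64, 64)
```

Any of the signal-to-image algorithms can be passed in as `encoder`, provided it returns
a 64 by 64 array for each window.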

3.3.1 Signal-to-Image Processing Algorithms


The dataset must first be split into distinct temporal segments because these are the
foundational blocks for signal-to-image processing algorithms. This segmentation is
essential for data traceability and practical application since most signal-to-image
transformations cannot be inverted to recover the original signal [34]. The window size
is a crucial hyperparameter that affects study results and model accuracy. It must be
chosen carefully so that it is neither too big, which could lower the model’s accuracy,
nor too small, which could result in images with too little detail to extract useful
insights. This study uses a consistent 64 by 64 window size, so one image is generated
for every subset of 4096 data points using a signal-to-image processing technique. The
images from all sensors are later stacked on top of each other to create a data cube for
each corresponding time period. There are two kinds of data cubes: “Healthy” and
“Damaged”. The signal-to-image processing algorithms utilised in this study are as
follows:
1. Gramian Angular Field (GAF): There are two steps in the process of turning
time series data into a GAF matrix [119]. First, the data is transformed into polar
coordinates using a particular operation (Equation 1), which allows the temporal
dimension to be scaled while preserving the final matrix’s attributes. Because
of its bijective nature, this transformation ensures that absolute temporal links
are preserved [120]. For this technique, the data must be normalised using the
normalisation process outlined in Equation (2), where UB and LB stand for the
upper and lower bounds, respectively, so that the values fall within the range of -1
to 1.

\[
\begin{cases}
\phi = \arccos(X_{\mathrm{norm},i}), & \phi \in [0, \pi], \\
r = Ln, & r \in \mathbb{R}^{+},
\end{cases}
\tag{1}
\]

\[
X_{\mathrm{norm},i} = \frac{x_i - LB}{UB - LB} \in [-1, 1] \tag{2}
\]

Following this, the final Gramian matrix, referred to as the GAF, is computed as
shown in Equations (3) and (4).

\[
\mathrm{GAF} = X_{\mathrm{norm}}^{T} \times X_{\mathrm{norm}}
- \sqrt{I - X_{\mathrm{norm}}^{[2]}}^{\,T} \times \sqrt{I - X_{\mathrm{norm}}^{[2]}} \tag{3}
\]

\[
\mathrm{GAF} =
\begin{pmatrix}
\cos(\phi_1 + \phi_1) & \cos(\phi_1 + \phi_2) & \cdots & \cos(\phi_1 + \phi_N) \\
\cos(\phi_2 + \phi_1) & \cos(\phi_2 + \phi_2) & \cdots & \cos(\phi_2 + \phi_N) \\
\vdots & \vdots & \ddots & \vdots \\
\cos(\phi_N + \phi_1) & \cos(\phi_N + \phi_2) & \cdots & \cos(\phi_N + \phi_N)
\end{pmatrix}
\tag{4}
\]

(a) The Gramian Angular Summation Field (GASF) technique uses the cosine of the
summed angles to create a Gramian matrix, resulting in matrix elements represented
by $\cos(\phi_i + \phi_j)$. This approach captures temporal correlations among
different time instances.
(b) The Gramian Angular Difference Field (GADF) uses the sine of the angle
differences to construct its Gramian matrix, whose elements are
$\sin(\phi_i - \phi_j)$, highlighting temporal differences across various time
points.

2. Markov Transition Field (MTF): This method examines variations between
points in a time series by generating a matrix of transition probabilities [121]. The
process begins by discretising the time series into bins $q_i$. Transition counts,
labelled $w_{ij}$, are then calculated for each consecutive pair of points in the
dataset, as described in Equation (5). These counts are normalised to establish
transition probabilities, which are also represented as $w_{ij}$ and further
elaborated in Equation (6). Although the resulting Markov Transition Matrix W
encapsulates these probabilities, it does not reflect temporal relationships [122].
The final MTF matrix, an $N \times N$ matrix, represents the transition
probabilities between point pairs in the time series, as detailed in Equation (7).

\[
\forall (i, j) \in [1, Q + 1], \quad w_{ij} = \text{number of transitions } q_i \to q_j \tag{5}
\]

\[
\forall (i, j) \in [1, Q + 1], \quad w_{ij} = \frac{w_{ij}}{\sum_{j} w_{ij}} \tag{6}
\]

\[
\mathrm{MTF} =
\begin{pmatrix}
w_{ij} \mid x_1 \in q_i, x_1 \in q_j & \cdots & w_{ij} \mid x_1 \in q_i, x_N \in q_j \\
\vdots & \ddots & \vdots \\
w_{ij} \mid x_N \in q_i, x_1 \in q_j & \cdots & w_{ij} \mid x_N \in q_i, x_N \in q_j
\end{pmatrix}
\tag{7}
\]

Significant information loss may occur from the non-reversible MTF transformation
procedure. The crucial discretization stage of binning has a big impact on whether
or not information is kept. The MTF translation process necessitates careful bin
selection because selecting fewer bins may result in a significant loss of information,
while selecting more bins may lead to data sparsity.
3. Grey Scale encoding (GS): This method, previously applied to fault diagnosis
within manufacturing systems using CNNs [123], involves a two-step transformation
of one-dimensional time series data into encoded values representing different
grey intensities. First, the data is divided into K sub-series, each containing
K1 data points. These sub-series are then rescaled to colour encoding values, such
as 8-bit integers. To maintain consistency with the other encoding methods, K is
made equal to K1, and the stride s, which denotes the beginning of each
sub-series, is determined accordingly. The GS encoding is given in Equation (8):

\[
GS_{ij} = \operatorname{round}\left( P \cdot \frac{x_{(i-1) \cdot s + j} - LB}{UB - LB} \right),
\quad \text{for } i, j \in [1, K] \tag{8}
\]

Here, the round operator rounds values to the closest integer, P is a scaling
factor, and UB and LB are the upper and lower bounds, respectively, used in the
scaling of x.
4. Spectrogram: The spectrogram gives a localised spectrum of a signal at specific
times [124]. For a signal S(T) and an even, real window W(T), with Fourier
transforms S(f) and W(f) respectively, a localised spectrum of S(T) around time t
is constructed by multiplying the signal by the window centred at T = t, as given
in Equation (9):

\[
S_W(t, T) = S(T)\, W(T - t) \tag{9}
\]

Next, the short-time Fourier transform (STFT) $F_S^W(t, f)$ is calculated as:

\[
F_S^W(t, f) = \mathcal{F}_{T \to f}\{ S(T)\, W(T - t) \} \tag{10}
\]

The spectrogram, denoted $S_S^W(t, f)$, is defined as the square of the absolute
value of the STFT:

\[
S_S^W(t, f) = \left| \int_{-\infty}^{\infty} S(T)\, W(T - t)\, e^{-2\pi j f T}\, dT \right|^2 \tag{11}
\]

5. Scalogram: The Scalogram, produced by the Continuous Wavelet Transform (CWT)
[124], offers a time-frequency analysis of a signal, displaying how the signal’s
energy is distributed across various frequencies over time, which makes it well
suited to non-stationary signals. It is built from scaled and shifted versions of a
wavelet. Mathematically, the CWT of a signal x(t) at a specific scale ’a’ and
translation ’b’ can be expressed as:

\[ W(a, b) = \frac{1}{\sqrt{a}} \int x(t)\, \psi\left( \frac{t - b}{a} \right) dt \tag{12} \]

Upon computing the Continuous Wavelet Transform (CWT) of a signal, a scalogram is
generated. This scalogram serves as a time-frequency representation, illustrating
how the energy of the signal is distributed across various frequencies over time:

\[ S(a, b) = |W(a, b)|^2 \tag{13} \]

In the scalogram S(a, b), ’a’ represents the scale and ’b’ signifies the temporal
position. The amplitude at each (a, b) coordinate reflects the energy magnitude
of the signal at that particular scale and temporal location.
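
As an illustration of how two of the encodings above map a 4096-point segment onto an
image, the following NumPy sketch implements the GS encoding of equation (8) and a
discrete form of the spectrogram in equation (11). It is a minimal sketch under stated
assumptions, not the implementation used in this thesis: the function names, the choice
P = 255, the use of the segment minimum and maximum as LB and UB, the stride rule, and
the Hann window with its length and hop are all chosen for the example only.

```python
import numpy as np


def grey_scale_encode(x, k=64, p=255):
    """Equation (8): map a 1-D segment to a k x k grey-scale image."""
    x = np.asarray(x, dtype=float)
    if x.size < 2 * k - 1:
        raise ValueError("segment must contain at least 2k - 1 points")
    # Assumed stride rule: spread the k sub-series evenly over the segment.
    s = (x.size - k) // (k - 1)
    lb, ub = x.min(), x.max()  # assumed bounds: segment minimum and maximum
    if ub == lb:
        raise ValueError("a constant segment cannot be rescaled")
    # 0-based version of the index (i - 1) * s + j used in equation (8).
    idx = np.arange(k)[:, None] * s + np.arange(k)[None, :]
    return np.round(p * (x[idx] - lb) / (ub - lb)).astype(np.uint8)


def spectrogram(x, window_len=128, hop=64):
    """Discrete form of equation (11): squared magnitude of the STFT."""
    x = np.asarray(x, dtype=float)
    if x.size < window_len:
        raise ValueError("segment is shorter than the window")
    w = np.hanning(window_len)  # assumed even, real window
    starts = range(0, x.size - window_len + 1, hop)
    frames = np.stack([x[i:i + window_len] * w for i in starts])
    return np.abs(np.fft.rfft(frames, axis=1)) ** 2


# Illustrative input: a pure sinusoid with a period of 16 samples.
segment = np.sin(2 * np.pi * 8 * np.arange(4096) / 128)
gs_image = grey_scale_encode(segment)  # one row per sub-series
spec = spectrogram(segment)            # rows are time frames, columns are frequency bins
print(gs_image.shape, spec.shape)
```

With a 4096-point segment and K = 64, the assumed stride rule gives s = 64, so the
sub-series do not overlap and the GS image is simply the segment reshaped row by row.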

CNNs can be applied once one-dimensional time-series data has been transformed into
two-dimensional formats, such as images, by means of signal-to-image algorithms. CNNs
are particularly good at spotting patterns and features in images because they make
use of shared weights, local connectivity, and spatial hierarchies, all of which are
difficult to exploit in one-dimensional signal analysis. Owing to this adaptation,
CNNs can analyse intricate data structures in two dimensions, which improves the
model’s capacity to interpret temporal dynamics as spatial correlations. Switching
from a 1D to a 2D representation therefore matches the data to the architectural
strengths of CNNs, which enhances their effectiveness in tasks involving pattern
recognition and feature extraction from sequential data.


Figure 21: Example Image Representation of a Batch of 4096 Data Points from the First
Sensor.

3.3.2 Data Cubes - Multi Layered Data


Following the segmentation of the dataset into subsets of 4096 data points and the sub-
sequent application of signal-to-image transformation algorithms, the resulting images
are rendered with dimensions of (Color Channel, 64, 64) for each sensor, producing a
total of nine images for a single time period. Color channels in these images delin-
eate discrete components of color information, typically comprising red, green, and blue
(RGB), which merge at varying intensities to manifest a comprehensive color spectrum
in the final visual representation. The images derived from the signal-to-image algo-
rithms exhibit variations in their color channel composition, with GAF, MTF, and GS
images incorporating a single color channel, whereas Spectrogram and Scalogram images
are generated with three color channels. This research proposes a method of stacking
these images along their color channels to create higher-dimensional image arrays with
dimensions of (Color Channels × 9, 64, 64), where the factor of nine corresponds to the
number of sensors. This technique aims to give the model a broader understanding of the
OWT condition by presenting it from the viewpoints of all sensors at the same time.
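
The stacking step can be sketched as a concatenation along the channel axis. The
snippet below is an illustrative sketch, not the thesis code: `build_data_cube` is a
hypothetical helper, and random arrays stand in for the real encoded images.

```python
import numpy as np


def build_data_cube(sensor_images):
    """Stack per-sensor images of shape (C, H, W) along the channel axis."""
    images = [np.asarray(img) for img in sensor_images]
    if len({img.shape for img in images}) != 1:
        raise ValueError("all sensor images must share the same (C, H, W) shape")
    return np.concatenate(images, axis=0)


# Placeholder images standing in for the encodings of the nine sensors.
rng = np.random.default_rng(0)
single_channel = [rng.random((1, 64, 64)) for _ in range(9)]  # e.g. GAF, MTF, GS
three_channel = [rng.random((3, 64, 64)) for _ in range(9)]   # e.g. Spectrogram, Scalogram

cube_1c = build_data_cube(single_channel)
cube_3c = build_data_cube(three_channel)
print(cube_1c.shape, cube_3c.shape)
```

For single-channel encodings the cube has 9 channels, while three-channel encodings
give 27, matching the (Color Channels × 9, 64, 64) layout described above.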

3.4 Fourth Step: Model Architecture


The methodology of this study leverages a 2D CAE framework, where the model is
trained to capture the characteristics of the input data cubes with the goal of accurately
reconstructing them. Researchers have utilized this technique to analyze the historical
distribution of WT data, aiming to determine if new data points conform to the existing
distribution. Previous studies [125] have demonstrated this architecture’s effectiveness
in learning the features of time-series data in an unsupervised manner, particularly for
anomaly detection.

The model’s architectural layout is identical for all image processing methods, and the
same hyperparameters are used throughout training and testing. Some hyperparameter
optimisation was carried out, but its range was limited by computational constraints.
This study therefore recommends more rigorous hyperparameter tuning in real-world
scenarios in order to achieve the best possible results. Table 5 provides specifics on
the architecture, including layers and their characteristics. The model architecture is
kept simple in order to emphasize the impact of correct data preprocessing on model
performance.

Table 5: The architecture of the 2D CAE

Section   Layer          Parameters   Kernel Size   Stride
Encoder   Input          -            -
          Conv2D 1       13,830       16            1
          MaxPool2D 1    0            2             -
          Conv2D 2       1,155        8             1
          MaxPool2D 2    0            2             -
          Conv2D 3       49           4             1
Decoder   Dec Conv2D 1   51           4             1
          MaxPool2D 1    0            2             -
          Dec Conv2D 2   1,158        8             1
          MaxPool2D 1    0            2             -
          Dec Conv2D 3   15,615       17            1
Total Params             31,858

The input layer is the point of entry for data into the network. The Conv2D layers are
convolutional layers that extract features from images by applying filters that identify
spatial hierarchies between pixels. The MaxPool2D layers lower the spatial dimensions of
the input volume for the subsequent convolutional layer, which makes feature
identification more resilient to changes in scale and orientation and also lowers the
computational load and memory use. The Dec Conv2D layers in the decoder portion of the
network perform the opposite action of the Conv2D layers: by upsampling and learning to
recover the original inputs, they recreate the input image from its encoded form. Each
layer contributes to the network’s ability to compress (encode) and then reconstruct
(decode) the input data, maintaining a balance
between data reduction and information retention.
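
The parameter counts in Table 5 can be cross-checked with the usual count for a 2D
convolution with a square kernel, filters × (input channels × kernel² + 1). The sketch
below is only a consistency check: the number of filters per layer is not stated in
Table 5, so the values used here (6, 3, 1, 3, 6, 9) are inferred as the ones that
reproduce the listed counts, assuming a 9-channel input cube built from single-channel
images.

```python
def conv2d_params(in_channels, filters, kernel):
    """Weights plus one bias per filter for a square-kernel 2-D convolution."""
    return filters * (in_channels * kernel * kernel + 1)


# (layer, input channels, filters, kernel size); the filter counts are inferred,
# not taken from Table 5.
layers = [
    ("Conv2D 1", 9, 6, 16),
    ("Conv2D 2", 6, 3, 8),
    ("Conv2D 3", 3, 1, 4),
    ("Dec Conv2D 1", 1, 3, 4),
    ("Dec Conv2D 2", 3, 6, 8),
    ("Dec Conv2D 3", 6, 9, 17),
]

counts = {name: conv2d_params(c_in, f, k) for name, c_in, f, k in layers}
total = sum(counts.values())

for name, value in counts.items():
    print(f"{name:<13} {value:>7,}")
print(f"{'Total':<13} {total:>7,}")
```

Under these assumptions the encoder narrows the cube from 9 channels to a single
channel and the decoder widens it back to 9, and the total agrees with the 31,858
parameters listed in Table 5.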


3.5 Fifth Step: Model Training


The model was trained on the university’s High Performance Computer (HPC), known as
Kristiania-HPC. The Kristiania-HPC system is equipped with robust specifications
designed to handle demanding computational tasks. Each compute node in the system
features four AMD Epyc 64-core CPUs, providing substantial processing power. The
system is supported by 256 GB of DDR4 RAM per node, giving a total of 1 TB
of memory across all nodes, and includes 16 TB of SSD storage per node, amounting
to a total of 175 TB. Networking is provided through 10 Gigabit Ethernet per compute
node, ensuring fast data transfer rates. The system
runs on Ubuntu Server LTS 22 with Kernel 5.4+ O, providing a stable and secure oper-
ating environment.

A portion of the dataset was set aside for further validation against both normal and
anomalous data, so that the CAE itself is trained on operationally normal data cubes.
This split is intended to guarantee that the CAE efficiently internalises the traits
and underlying patterns of normal data cubes, which is a crucial component of effective
anomaly identification. The training regimen uses the Mean Squared Error (MSE) loss
function and the Adam optimizer, which are often chosen for these kinds of tasks because
they are good at minimising reconstruction errors and robust when handling large
datasets.

The model is trained and validated for up to 100 epochs using 4,500 data cubes,
equivalent to 18.4 million data points from each sensor, with an early stopping
mechanism incorporated to reduce the danger of overfitting. The Kristiania-HPC
framework is used to coordinate these computing processes, and SLURM is used for task
scheduling. This infrastructure allows numerous compute nodes to be used, which greatly
accelerates the training process and mimics an industrial setting where computing
performance is critical to reducing the time it takes to obtain meaningful insights.
Additionally, careful tracking of training and validation losses provides important
information on the convergence patterns and performance of the model, which lays the
groundwork for focused optimisations and refinements of the training technique.
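
The early stopping mechanism can be sketched independently of any DL framework. The
class below is a minimal, hypothetical illustration: the patience and minimum
improvement used in this thesis are not stated, so `patience=5` and `min_delta=0.0`
are assumptions, and the validation losses are made-up numbers rather than results.

```python
class EarlyStopping:
    """Stop training when the validation loss has not improved for `patience` epochs."""

    def __init__(self, patience=5, min_delta=0.0):
        self.patience = patience
        self.min_delta = min_delta
        self.best_loss = float("inf")
        self.best_epoch = None
        self.stale_epochs = 0

    def step(self, epoch, val_loss):
        """Record one epoch and return True when training should stop."""
        if val_loss < self.best_loss - self.min_delta:
            self.best_loss = val_loss
            self.best_epoch = epoch
            self.stale_epochs = 0
        else:
            self.stale_epochs += 1
        return self.stale_epochs >= self.patience


def run_training(val_losses, max_epochs=100, patience=5):
    """Replay a sequence of validation losses; return (last epoch, best epoch, best loss)."""
    stopper = EarlyStopping(patience=patience)
    last_epoch = 0
    for last_epoch, val_loss in enumerate(val_losses[:max_epochs], start=1):
        if stopper.step(last_epoch, val_loss):
            break
    return last_epoch, stopper.best_epoch, stopper.best_loss


# Made-up validation losses that plateau after the third epoch.
illustrative_losses = [0.9, 0.5, 0.3, 0.3, 0.3, 0.3, 0.3, 0.3, 0.3, 0.3]
print(run_training(illustrative_losses))
```

In an actual training loop, `step` would be called once per epoch with the measured
validation MSE, and the weights from `best_epoch` would be the ones kept.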

3.6 Sixth Step: Model Evaluation


The metrics utilized in our comparisons include accuracy, precision, recall, and F1-score.
If the model is pictured as a berry picker, the F1 score measures the picker’s
performance by balancing the precision (the proportion of picked berries that are ripe)
with the recall (the proportion of ripe berries that were successfully picked), ensuring
that both the avoidance of unripe berries and the thoroughness of the harvest are equally
considered [111]. Precision in equation (14) quantifies the reliability of the model’s
positive predictions by dividing the number of true positives (correct positive
predictions) by the sum of true positives and false positives (incorrect positive
predictions). Recall in equation (15) measures the model’s ability to identify all
relevant instances by dividing the number of true positives by the sum of true positives
and false negatives (missed positives). The F1 score in equation (16) combines precision
and recall into a single metric by taking their harmonic mean, providing a balanced
measure of a model’s performance, especially when the class distribution is uneven.

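\[ \text{Precision} = \frac{TP}{TP + FP} \tag{14} \]

\[ \text{Recall} = \frac{TP}{TP + FN} \tag{15} \]

\[ F1 = 2 \times \frac{\text{Precision} \times \text{Recall}}{\text{Precision} + \text{Recall}} \tag{16} \]
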
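
As a short worked step, substituting equations (14) and (15) into equation (16) gives
an equivalent form of the F1 score written directly in terms of the confusion-matrix
counts, which can be convenient when only TP, FP, and FN are reported:

```latex
\[
F1 = 2 \times
\frac{\dfrac{TP}{TP + FP} \times \dfrac{TP}{TP + FN}}
     {\dfrac{TP}{TP + FP} + \dfrac{TP}{TP + FN}}
   = \frac{2\,TP}{2\,TP + FP + FN}
\]
```
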
The best-performing model was evaluated across nine distinct scenarios. In each
scenario, the data of one of the nine sensors was substituted, for a designated time
period, with values derived from historical statistics. This substitution assesses the
model’s capability to manage missing data due to communication disruptions or sensor
malfunctions. The testing protocol was informed by discussions with experts from
Equinor, who highlighted these issues as significant concerns within the domain. The
methodology involves generating synthetic datasets derived from historical datasets
representing healthy and damaged states. For each attribute within the datasets, four
key statistical metrics (mean, standard deviation, minimum, and maximum) are calculated.
Using these metrics, a structured sequence is constructed for each attribute, consisting
of the mean, one standard deviation above and below the mean, the minimum, and the
maximum values. This sequence is designed to represent variations that might occur in
the data due to inconsistency or error while still reflecting the underlying statistical
characteristics. The sequence for each attribute is replicated as many times as needed
to match the length of the original dataset, with adjustments made to account for any
remainder in the dataset length. This method ensures that the synthetic data maintains
the same scale and distribution characteristics as the original data, thus providing a
consistent basis for model evaluation. The model is evaluated in these scenarios using
the same metrics as those applied to normal and consistent data, with primary emphasis
on the F1 score. This approach enables the assessment of the model’s robustness and its
ability to maintain performance under varying conditions, thus providing insights into
the resilience of the model against data irregularities.
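
The construction of the replacement signal can be sketched as follows. This is an
illustrative sketch of the procedure described above rather than the thesis code: the
function names are hypothetical, the population standard deviation (NumPy’s default) is
an assumption, and the toy `history` array stands in for real historical sensor data.

```python
import numpy as np


def synthetic_sensor_signal(history, length):
    """Repeat [mean, mean + std, mean - std, min, max] until `length` samples exist."""
    history = np.asarray(history, dtype=float)
    mean, std = history.mean(), history.std()  # population std: an assumption
    pattern = np.array([mean, mean + std, mean - std, history.min(), history.max()])
    repeats, remainder = divmod(length, pattern.size)
    # Whole repetitions of the pattern, then a partial copy to cover the remainder.
    return np.concatenate([np.tile(pattern, repeats), pattern[:remainder]])


def replace_sensor(segment, sensor_index, history):
    """Return a copy of a (sensors, samples) segment with one sensor replaced."""
    corrupted = np.array(segment, dtype=float, copy=True)
    corrupted[sensor_index] = synthetic_sensor_signal(history, corrupted.shape[1])
    return corrupted


# Toy data: nine sensors, 4096 samples each, and a tiny "historical" record.
history = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
segment = np.zeros((9, 4096))
corrupted = replace_sensor(segment, sensor_index=2, history=history)
print(corrupted[2, :5])
```

Running the nine scenarios then amounts to calling `replace_sensor` once per sensor
index before the segment is encoded into images and stacked into a data cube.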


3.7 Methodology Summary


The methodology detailed in this thesis and illustrated in Fig 22 includes an innovative
approach to generating data cubes from images derived from the various sensors attached
to the OWT gearbox. By transforming the time-series sensor data into a structured 2D
format and subsequently compiling these into multidimensional data cubes, the analy-
sis captures a comprehensive spatial and temporal context of the turbines’ operational
state. This not only enhances the data integrity by maintaining the interrelations among
different sensor readings but also leverages the power of CAEs in recognizing complex
patterns across multiple dimensions. This method greatly benefits anomaly detection
systems by providing a holistic view of the OWT’s condition, allowing for more accurate
assessments than what could be achieved through analyzing individual sensor outputs
in isolation.

Integrating the data cube approach can enhance the model’s ability to discern between
true anomalies and local discrepancies that do not signify operational risks. By training
the CAE to recognize the ’normal’ operational state across a concatenated set of sensor
data, the model develops an intuitive understanding of the turbine’s comprehensive
operational baseline. This prevents the model from mistaking minor deviations, which
might be normal variations or non-critical anomalies, for serious faults. Such capability
is crucial for reducing the incidence of false alarms and focusing maintenance efforts on
genuinely significant issues that could lead to system failures if left unaddressed. This
could help build trust in, and reliance on, DL solutions in the O&M domain of OWTs.
Furthermore, once the CAE model is adequately trained on these rich, multi-sensor
data cubes, there is a reduced need for ongoing, extensive data inputs to maintain its
efficacy. The robustness developed through this training allows the model to perform
accurate predictions and anomaly detections with less frequent recalibrations using new
data. This reduction in dependency on continual data influx not only streamlines the
operational process but also diminishes computational costs and enhances the system’s
efficiency. Thus, this final artifact, leveraging DL and advanced data structuring tech-
niques, aims to significantly decrease the operational overhead and logistical challenges
associated with routine data collection and analysis in the maintenance of OWTs.

Figure 22: Proposed Artifact for Anomaly Detection in OWT Gearbox.


4 Results
This section presents the results of evaluating six DL models, each trained on 4,500
healthy data cubes. Each data cube is a stack of images produced from the various
sensors by one of the signal-to-image processing techniques: GASF, GADF, MTF, GS,
Spectrogram, or Scalogram. To ensure both resilience and accuracy in performance, these
CAE models were developed, trained, and validated using CM data from the OWT gearbox.
The findings support the hypothesis that the created artifact can effectively detect
abnormalities in operations by analysing multi-dimensional sensor data in the form of a
data cube, which in turn allows maintenance plans to be improved using predictive
analytics. The test sample comprises 1,374 data cubes, half healthy and half abnormal,
equivalent to 5.6 million data points from each of the 9 sensors. This section begins
with an analysis of the CAE models’ general performance before moving on to a thorough
assessment of their anomaly detection capabilities, the impact of the data cube
architecture on their functionality, and their relative effectiveness in comparison to
other techniques. The following subsections include quantitative and qualitative
evaluations of the models’ performance, analysing the sensitivity-specificity trade-off
that is essential for real-world use.

4.1 Data Transformation Results


The data preprocessing methodologies described in the methodology section of this the-
sis were applied to all sensors to generate both healthy and damaged data cubes. The
step-by-step process of this data preprocessing for the GASF algorithm is illustrated
in Fig 23 for 4096 data points from all sensors. Initially, time-series data from each
sensor (AN3, AN4, AN5, AN6, AN7, AN8, AN9, AN10, and Speed) were transformed
into GASF images, capturing the temporal correlations within the data. This trans-
formation involved converting the raw time-series signals into two-dimensional images
that represent the angular summation field, effectively encoding the temporal dynamics
into a spatial format. These GASF images for each sensor were then stacked together,
creating a multidimensional data cube with a shape of (9, 64, 64), where ’9’ corresponds
to the number of sensors, and ’64x64’ represents the dimensions of each GASF image.
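
The transformation of one 4096-point window into a (9, 64, 64) GASF data cube can be
sketched as below. This is a minimal sketch under stated assumptions, not the code used
in this thesis: the function names are hypothetical, the reduction from 4096 points to
64 values by block averaging and the min-max rescaling to [-1, 1] are assumed
preprocessing choices, and random data stands in for the real sensor signals.

```python
import numpy as np


def gasf_image(x, size=64):
    """Gramian Angular Summation Field of a 1-D window, returned as a size x size image."""
    x = np.asarray(x, dtype=float)
    if x.size % size != 0:
        raise ValueError("window length must be a multiple of the image size")
    # Assumed reduction step: average consecutive blocks down to `size` values.
    reduced = x.reshape(size, -1).mean(axis=1)
    lo, hi = reduced.min(), reduced.max()
    if hi == lo:
        raise ValueError("a constant window cannot be rescaled")
    # Rescale to [-1, 1] so that the values can be read as cosines of angles.
    scaled = np.clip(2.0 * (reduced - lo) / (hi - lo) - 1.0, -1.0, 1.0)
    phi = np.arccos(scaled)
    # GASF entry (i, j) is cos(phi_i + phi_j).
    return np.cos(phi[:, None] + phi[None, :])


def gasf_cube(signals, size=64):
    """Stack one GASF image per sensor into a (sensors, size, size) data cube."""
    return np.stack([gasf_image(signal, size) for signal in signals])


# Placeholder window: nine sensors (AN3 to AN10 and Speed), 4096 samples each.
rng = np.random.default_rng(0)
signals = rng.normal(size=(9, 4096))
cube = gasf_cube(signals)
print(cube.shape)
```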

Following this transformation for each of the six signal-to-image processing algorithms,
the dataset was divided into three subsets: training, validation,
and test datasets. The majority of the data was allocated to the training phase, which
included 4,000 healthy data cubes, equivalent to 16.38 million data points per sensor.
The validation dataset, also utilized during the training phase, consisted of 500 data
cubes, equivalent to 2 million data points per sensor. The test phase comprised both
healthy and abnormal data, totaling 1,374 data cubes, with 687 data cubes from each
class. This structured approach ensures that the temporal characteristics and corre-
lations within the sensor data are effectively captured and utilized in the subsequent
anomaly detection models, enhancing the accuracy and reliability of the CM system.
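
The relation between the number of data cubes and the number of data points per sensor
follows from the 4096 points that each cube covers. The short check below reproduces
the figures quoted above; the variable names are chosen for the example only.

```python
POINTS_PER_CUBE = 4096  # data points per sensor covered by one data cube

# Number of data cubes in each subset, as reported in this section.
splits = {"training": 4000, "validation": 500, "test": 1374}

points_per_sensor = {name: cubes * POINTS_PER_CUBE for name, cubes in splits.items()}
training_and_validation = points_per_sensor["training"] + points_per_sensor["validation"]

for name, points in points_per_sensor.items():
    print(f"{name:<10} {splits[name]:>5} cubes  {points:>10,} points per sensor")
print(f"training + validation: {training_and_validation:,} points per sensor")
```

The 4,500 training and validation cubes together correspond to about 18.4 million
points per sensor, and the 1,374 test cubes to about 5.6 million.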


Figure 23: Data Transformation GASF Algorithm


4.2 Models Training Evaluation


The MSE is employed as the loss function for both training and validation loss calcu-
lations. Monitoring the training loss provides insight into how the model is learning
from the training dataset across epochs, while monitoring the validation loss enables the
assessment of the model’s generalization capabilities to unseen data. The GASF model
exhibited a significant reduction in both training and validation loss within the first few
epochs, starting at a training loss of 0.2528 and validation loss of 0.2050, and ending
at 0.1406 and 0.1407 respectively after 100 epochs. This model showed rapid conver-
gence, with losses stabilizing quickly. The close alignment of training and validation
loss indicates good generalization, suggesting that the model is effectively capturing the
underlying patterns in the data without overfitting, as illustrated in Fig 24.

Figure 24: Training & Validation Curve of GASF

The training and validation loss for the GADF model showed a gradual decrease over
100 epochs. The training loss started at 0.2542 and decreased to 0.2139, while the
validation loss began at 0.2275 and settled at 0.2143. The model demonstrated a steady
decline in both training and validation loss, indicating that it effectively learned from the
data. However, the convergence was slow, and there was a marginal difference between
training and validation loss towards the later epochs, suggesting the model was learning
consistently without significant overfitting or underfitting, as illustrated in Fig 25.


Figure 25: Training & Validation Curve of GADF

The MTF model demonstrated a rapid decrease in training and validation loss within
the first few epochs. The training loss started at 0.0022 and validation loss at 0.0008,
but both quickly converged to around 0.0002 by the 6th epoch, where early stopping
was triggered due to no further improvement. The early stopping indicates the model
reached its optimal performance quickly, avoiding overfitting. This rapid convergence
and the low loss values imply that the MTF model was highly efficient in learning the
data patterns. Although the optimised model was selected at that point, training was
continued through all 100 epochs for analysis purposes, as illustrated in Fig 26.


Figure 26: Training & Validation Curve of MTF

The GS model’s training process was characterized by a very low initial loss, with the
training loss starting at 0.0023 and validation loss at 0.0008. Both losses quickly con-
verged to around 0.0002, and remained stable throughout the 100 epochs. The extremely
low loss values and their stability suggest that the model learned the data representation
very efficiently. This consistency between training and validation loss indicates
strong generalization capabilities, meaning the model performed equally well on unseen
validation data, as illustrated in Fig 27.


Figure 27: Training & Validation Curve of GS

The Scalogram model showed a significant reduction in both training and validation loss
early in the training process. The training loss decreased from 0.2161 to 0.0399, and the
validation loss from 0.1316 to 0.0368 within 37 epochs, at which point early stopping
was triggered. The steep decline in losses suggests that the model effectively captured
the data’s structure quickly. The validation loss was closely aligned with the training
loss, indicating good generalization without overfitting, as illustrated in Fig 28.

Figure 28: Training & Validation Curve of Scalogram


The Spectrogram model exhibited a dramatic reduction in training and validation loss
within the first few epochs, starting from a training loss of 0.1545 and validation loss
of 0.0273, and decreasing to 0.0004 and 0.0004 respectively after 17 epochs. Early
stopping was triggered due to the rapid convergence, indicating that the model achieved
optimal performance swiftly. The low final loss values and the close alignment between
training and validation loss demonstrate the model’s strong learning and generalization
capabilities (see Fig 29).

Figure 29: Training & Validation Curve of Spectrogram

The analysis indicates that all models demonstrated effective learning and good gener-
alization. The GASF model converged faster than the GADF model, suggesting higher
efficiency for this dataset. The GS and MTF models exhibited very low loss values
and rapid convergence, while the Scalogram and Spectrogram models showed significant
and rapid loss reductions with early stopping, indicating quick and effective learning of
data patterns. Given that the models are CAEs trained exclusively on healthy data to
learn and mimic its characteristics, the observed behavior of the validation loss being
consistently lower than the training loss is normal. In the context of anomaly detection,
CAEs are typically trained to accurately reconstruct normal data, and a well-generalized
model should ideally exhibit lower reconstruction error on the validation set. Therefore,
the lower validation loss indicates effective generalization to normal patterns without
overfitting, aligning with the objectives of training a CAE for anomaly detection.

4.3 Performance Evaluation of the CAE Models


Table 6 presents the models sorted from the best to the worst F1 score. The precision
values show how reliably each model flags anomalies, which is important in lowering the
false positive rate that might cause needless maintenance procedures. Of the models,
GASF and MTF show almost perfect precision, indicating that they correctly detect real
anomalies while raising almost no false alarms.

Table 6: Performance of different signal to image models.

Method   Precision   Recall   F1 Score
GASF     0.998       1        0.999
MTF      0.995       1        0.997
GADF     0.980       1        0.989
GS       0.898       1        0.946
SP       0.5         1        0.666
SC       0.5         1        0.666

The recall metric, which measures how effectively the models catch all pertinent
anomalies, is 1 for every model. This implies that the models identify all anomalies
present in the test set, which is a crucial quality for guaranteeing dependability in
the monitoring of critical infrastructure. The excellent performance of the GASF and MTF
models is further validated by their F1 scores, which balance precision and recall and
are close to 1. These findings support the CAE models’ resilience in managing intricate,
multidimensional sensor data and highlight the potential benefits of sophisticated data
cubes generated from signal-to-image processing methods for improving the anomaly
detection systems’
capacity for predictive maintenance in OWTs.
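
The entries of Table 6 for the four best models can be reproduced from the
confusion-matrix counts reported in Sections 4.3 and 4.4, taking the damaged class as
positive. The sketch below applies equations (14) to (16) to those counts; the helper
names are hypothetical, and truncation to three decimals is an assumption that happens
to match the tabulated values.

```python
import math


def precision_recall_f1(tp, fp, fn):
    """Equations (14) to (16), with zero returned when a denominator is empty."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


def truncate(value, decimals=3):
    """Cut off (rather than round) after the given number of decimals."""
    factor = 10 ** decimals
    return math.floor(value * factor) / factor


# (TP, FP, FN) per model: all 687 damaged cubes detected, FP = misclassified healthy cubes.
confusion_counts = {
    "GASF": (687, 1, 0),
    "MTF": (687, 3, 0),
    "GADF": (687, 14, 0),
    "GS": (687, 78, 0),
}

scores = {
    name: tuple(truncate(v) for v in precision_recall_f1(*counts))
    for name, counts in confusion_counts.items()
}

for name, (precision, recall, f1) in scores.items():
    print(f"{name:<5} precision={precision:.3f} recall={recall:.3f} f1={f1:.3f}")
```

Rounding instead of truncating would shift several entries by 0.001 (for example, the
GASF precision of 687/688 rounds to 0.999), so small differences of that size between
reimplementations are expected.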

The distribution of the reconstruction MSE in the testing phase is analysed only for the
four best-performing models. The GASF model demonstrates a pronounced separation
between healthy and damaged classes (see Fig 30), with the healthy data MSE peaking
around 0.15 and the damaged data MSE peaking around 0.33, indicating a significant
gap between the two. The confusion matrix for GASF shows perfect classification for
the damaged class and almost perfect classification for the healthy class, with only one
misclassification. This lone misclassified healthy data cube appears as an outlier,
highlighting the model’s high accuracy and effectiveness in distinguishing between healthy
and damaged instances.
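
The way a reconstruction MSE is turned into a healthy or damaged decision can be
sketched as a simple threshold rule. The thesis text does not state how the decision
threshold was chosen, so the percentile rule below is an assumption, the function names
are hypothetical, and the arrays are toy values rather than model outputs.

```python
import numpy as np


def reconstruction_mse(originals, reconstructions):
    """Per-cube MSE over all channels and pixels; inputs have shape (N, C, H, W)."""
    diff = np.asarray(originals, dtype=float) - np.asarray(reconstructions, dtype=float)
    return (diff ** 2).mean(axis=(1, 2, 3))


def fit_threshold(healthy_mse, percentile=99.0):
    """Assumed rule: place the threshold at a high percentile of healthy errors."""
    return float(np.percentile(healthy_mse, percentile))


def flag_anomalies(mse, threshold):
    """A data cube is flagged as anomalous when its MSE exceeds the threshold."""
    return np.asarray(mse) > threshold


# Toy example: four cubes whose "reconstructions" are off by a constant amount.
originals = np.zeros((4, 9, 8, 8))
offsets = np.array([0.1, 0.1, 0.5, 0.5]).reshape(4, 1, 1, 1)
reconstructions = originals + offsets

mse = reconstruction_mse(originals, reconstructions)
threshold = fit_threshold([0.010, 0.012, 0.011], percentile=100.0)
flags = flag_anomalies(mse, threshold)
print(mse, threshold, flags)
```

A clear gap between the healthy and damaged MSE distributions, as reported for GASF,
is what makes such a threshold easy to place.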


Figure 30: MSE Distribution for GASF Model in Testing Phase

The GADF model shows a clear separation between healthy and damaged classes (see Fig
31), with the MSE distribution for healthy data peaking around 0.26 and for damaged
data peaking around 0.33, indicating effective differentiation. The confusion matrix for
GADF reveals high accuracy, correctly classifying all 687 damaged instances and 673
out of 687 healthy instances, with 14 healthy instances misclassified as damaged. This
suggests the model is highly accurate but has a slight tendency to misclassify some
healthy instances.


Figure 31: MSE Distribution for GADF Model in Testing Phase

The MTF model exhibits very low MSE values for both healthy and damaged classes
(see Fig 32), peaking around 0.001 and 0.004 respectively, with a clear and well-defined
separation. The confusion matrix for MTF shows almost perfect classification, correctly
identifying 684 out of 687 healthy instances and all 687 damaged instances, with only 3
misclassifications in the healthy class. This indicates the MTF model is highly effective
in distinguishing between healthy and damaged data, with minimal errors.


Figure 32: MSE Distribution for MTF Model in Testing Phase

The GS model’s MSE distribution shows peaks around 0.28 for healthy data and 0.32
for damaged data (see Fig 33), with noticeable but overlapping separation compared to
GADF and GASF. The confusion matrix for GS indicates that it correctly classified 609
out of 687 healthy instances and all 687 damaged instances, with 78 misclassifications
in the healthy class. This suggests that, while the GS model performs well, it is less
accurate than GADF and GASF, particularly in classifying healthy instances.


Figure 33: MSE Distribution for GS Model in Testing Phase

The systematic evaluation of the CAE models, particularly in the context of advanced
data preprocessing through the generation of data cubes from various sensors, has dis-
tinctly demonstrated the profound impact of signal-to-image processing algorithms on
the predictive accuracy of DL systems for anomaly detection in OWTs. The utilization
of data cubes has proven crucial, allowing the models like GASF and MTF to achieve
outstanding precision and recall metrics. These findings not only affirm the original
hypothesis that integrating sophisticated image processing techniques can significantly
enhance the detection of operational anomalies but also highlight the vital role of inno-
vative data structuring in optimizing the performance of anomaly detection systems. By
effectively capturing the spatial and temporal dynamics within the multidimensional sen-
sor data, these data cubes facilitate a deeper learning of normal and anomalous patterns,
thereby improving maintenance strategies in the wind energy sector. This advanced ap-
proach ensures the models’ reliability and efficiency by minimizing false alarms, paving
the way for their practical application in reducing downtime and maintenance costs while
improving the safety and longevity of WTs.

4.4 Analysis of Anomaly Detection Capabilities


This section delves into the anomaly detection capabilities of the models, assessing their
effectiveness through the lens of confusion matrices. This analysis begins by comparing
the performance of the GASF and GADF models (See Fig 34). The confusion matrix for
the GASF algorithm demonstrates outstanding anomaly detection capabilities, with 686
true negatives and 687 true positives, indicating almost perfect classification with only
one false positive. This high level of accuracy suggests that the GASF model is extremely
effective at correctly identifying both normal and anomalous conditions with almost
no error, making it highly reliable for applications where the cost of a false alarm is
critical. Conversely, the GADF model shows a slightly less optimal performance, with
673 true negatives and 687 true positives, but with 14 false positives. This suggests
that while GADF is proficient at identifying true anomalies (high true positive rate),
it is somewhat more prone to classifying normal conditions as anomalous. Although
this might increase operational checks due to false alarms, it could be advantageous in
scenarios where failing to detect an anomaly could result in severe consequences, thus
justifying the higher false positive rate.

Figure 34: Confusion Matrix for GASF and GADF Models

The GS model’s confusion matrix (See Fig 35) reveals a relatively high number of false
positives: 78 of the 687 healthy instances are flagged as anomalous. While the model
correctly identifies all 687 true positive conditions, the 78 false positives indicate a
lower precision compared to the GASF and GADF models. This could potentially lead
to increased maintenance operations due to false alarms, suggesting a need for further
calibration or a possible trade-off in scenarios where higher sensitivity is required to
avoid missing critical failures. On the other hand, the MTF model exhibits a markedly
superior performance with only 3 false positives and 684 true negatives, maintaining a
perfect score in detecting true positives (687). This result underscores the MTF model’s
high precision and recall, positioning it as an excellent choice for precise anomaly de-
tection in environments where both false positives and false negatives carry significant
operational and safety implications. The robustness of the MTF model in minimizing
false alarms while ensuring no anomaly goes undetected makes it a valuable tool for
reliable and efficient predictive maintenance strategies.

Figure 35: Confusion Matrix for GS and MTF Models
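
The precision, recall, and F1 values implied by the confusion matrices in Fig 34 and
Fig 35 can be recomputed from the counts reported in the text. The following is a minimal
Python sketch of that arithmetic. The function name and the dictionary layout are
illustrative, and the false negative counts are set to zero on the assumption that every
anomaly was detected, as the text states for these four models.

```python
def precision_recall_f1(tp, fp, fn):
    """Return (precision, recall, F1) for the anomalous (positive) class."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Counts as reported in the text; fn=0 is an assumption based on the
# statement that all 687 anomalous samples were detected.
reported = {
    "GASF": dict(tp=687, fp=1, fn=0),
    "GADF": dict(tp=687, fp=14, fn=0),
    "GS": dict(tp=687, fp=78, fn=0),
    "MTF": dict(tp=687, fp=3, fn=0),
}

for name, counts in reported.items():
    p, r, f1 = precision_recall_f1(**counts)
    print(f"{name}: precision={p:.4f} recall={r:.4f} F1={f1:.4f}")
```

Because recall is identical across these models, the ranking is driven entirely by the
number of false positives.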

The CAE models trained on Spectrogram (SP) and Scalogram (SC) images appear to perform
perfectly, but this impression is misleading. At first glance, the confusion matrices for
both SP and SC suggest an ideal scenario with 686 true positives and zero false positives
or negatives (see Fig 36), which would indicate flawless anomaly detection. However, this
interpretation overlooks a critical shortcoming: both models failed to learn and
reproduce the characteristics of the images generated by their respective algorithms, so
every instance is classified as an anomaly. This failure reveals a fundamental flaw in
the models’ learning capability or in the suitability of the preprocessing method for
these types of data. The models’ inability to differentiate between normal and abnormal
operational states suggests that they are not capturing the features needed for effective
anomaly detection, but are instead learning a biased representation in which all inputs
are interpreted as outliers.

Figure 36: Confusion Matrix for SP and SC Models

4.5 Model Resistance to Data Inconsistency


The model was evaluated by substituting the data of all nine sensors to simulate a sensor
failure that leads to a complete loss of connection. The test used 2.4 million data
points each from the healthy and damaged conditions, in which the sensor readings were
replaced with a series of five statistical values derived from historical data: mean,
mean minus standard deviation, mean plus standard deviation, minimum, and maximum. This
substitution scheme was chosen over a constant replacement value because a constant input
would allow the CAE to reconstruct the data cube effortlessly for both the healthy and
the damaged dataset. The reconstruction error would then be equally low in both cases,
and anomalies could not be discerned from it. Using varied statistical values keeps the
reconstruction error of the CAE informative and thus provides a more rigorous test of the
model’s performance under simulated sensor failure conditions.
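
The substitution scheme described above can be sketched as follows. This is a minimal
illustration, not the thesis implementation: the function names are hypothetical, the
five values are assumed to be cycled in the order listed, and the population standard
deviation is assumed because the text does not say which estimator was used.

```python
import statistics


def substitution_values(history):
    """Five replacement values derived from one sensor's historical readings."""
    mean = statistics.fmean(history)
    std = statistics.pstdev(history)  # assumption: population standard deviation
    return [mean, mean - std, mean + std, min(history), max(history)]


def simulate_sensor_loss(n_points, history):
    """Build n_points substitute readings by cycling through the five values."""
    values = substitution_values(history)
    # Assumption: the five values repeat in a fixed order, so the signal is
    # not constant and the CAE cannot reconstruct it trivially.
    return [values[i % len(values)] for i in range(n_points)]


# Toy historical readings for a single sensor (illustrative numbers only).
history = [0.1, 0.4, -0.2, 0.3, 0.0, -0.1]
replaced = simulate_sensor_loss(12, history)
print(replaced[:5])
```

Applying the same function to each of the nine sensors would yield the fully substituted
input from which the data cubes are then generated.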

The confusion matrix (Table 7) and the associated metrics (Table 8) indicate robust
performance of the model under the simulated sensor failure scenario. The model
identified all 585 damaged cases without any false negatives, giving a recall of 1.0 and
thus perfect sensitivity. However, 156 healthy cases were incorrectly classified as
damaged, which is reflected in a precision of approximately 0.789. The F1 score, which
balances precision and recall, is 0.882. Overall performance therefore remains high, but
the false positive rate leaves room for improvement in the model’s precision.

Table 7: Model Resistance to Data Inconsistency - Confusion Matrix


                  Predicted Healthy   Predicted Damaged
Actual Healthy    429                 156
Actual Damaged    0                   585

Table 8: Model Resistance to Data Inconsistency - Evaluation metrics


Metric Value
Precision 0.7895
Recall 1.0000
F1 Score 0.8824
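
As a check, the values in Table 8 follow from the counts in Table 7 under the standard
definitions, with the damaged class taken as positive so that TP = 585, FP = 156, and
FN = 0:

```latex
\begin{align*}
\text{Precision} &= \frac{TP}{TP + FP} = \frac{585}{585 + 156} = \frac{585}{741} \approx 0.7895 \\
\text{Recall} &= \frac{TP}{TP + FN} = \frac{585}{585 + 0} = 1.0000 \\
F_1 &= \frac{2\,TP}{2\,TP + FP + FN} = \frac{1170}{1170 + 156 + 0} = \frac{1170}{1326} \approx 0.8824
\end{align*}
```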

4.6 Impact of Artifact Architecture on O&M


The integration of multi-dimensional data cubes in the CAE models has been shown to
improve predictive maintenance for OWTs by enabling precise anomaly detection and the
proactive scheduling of maintenance operations. This methodology leverages images
generated from various sensors, not only to understand the health status of individual
components but also to discern how these components interact and influence each other’s
operational states. The models, particularly GASF and MTF, demonstrate high precision and
recall, so maintenance decisions are based on accurate and reliable data. This ability to
differentiate between normal and abnormal conditions stems from training the models on
data cubes composed of all sensors at the same time, which gives them a nuanced
understanding of turbine dynamics. As a result, these models require less frequent
updates and fine-tuning, significantly reducing the need for continuous data transfer and
computational resources. This efficiency not only lowers operational costs but also
minimizes false positives and negatives, thereby enhancing the overall reliability and
effectiveness of turbine maintenance operations.

When such reliable models are deployed on site at the OWT, the need for live data
transfer from the OWTs to the CM operators is reduced to a single true/false signal,
where false indicates a healthy, normal status of the OWT and true indicates an immediate
need for monitoring. This reliability, built on high precision and accuracy, enables a
more sustainable and cost-efficient methodology for OWT O&M, making sustainable energy
both more sustainable and more cost competitive.
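
The on-site decision described above can be illustrated with a short Python sketch. It
assumes that the anomaly signal is obtained by comparing the reconstruction error of the
CAE with a fixed threshold. The mean squared error, the threshold value, and all names
are illustrative assumptions, and the CAE itself is represented only by its output.

```python
def reconstruction_error(cube, reconstruction):
    """Mean squared error between a flattened data cube and its reconstruction."""
    if not cube or len(cube) != len(reconstruction):
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((a - b) ** 2 for a, b in zip(cube, reconstruction)) / len(cube)


def needs_monitoring(cube, reconstruction, threshold):
    """Return True if the turbine should be inspected, False if it looks healthy."""
    return reconstruction_error(cube, reconstruction) > threshold


# Toy flattened data cube and two hypothetical CAE outputs.
cube = [0.2, 0.5, 0.1, 0.9]
good_reconstruction = [0.21, 0.49, 0.12, 0.88]
poor_reconstruction = [0.6, 0.1, 0.7, 0.3]
THRESHOLD = 0.01  # hypothetical value; in practice calibrated on healthy data

print(needs_monitoring(cube, good_reconstruction, THRESHOLD))
print(needs_monitoring(cube, poor_reconstruction, THRESHOLD))
```

Only the boolean result would need to leave the turbine, which is what allows the data
transfer to be reduced to a single signal.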

4.7 Comparative Analysis with Conventional Methods


In traditional anomaly detection for OWTs, methods such as statistical analysis and
threshold-based monitoring have dominated, relying heavily on predefined criteria and
historical data patterns. These conventional techniques, while effective in stable and
predictable environments, often struggle with the dynamic and complex nature of WT data,
which is subject to rapid changes due to environmental and mechanical factors. Research
such as [90] illustrates that traditional models often require frequent re-calibration by
domain experts as new data becomes available, leading to a reactive rather than proactive
maintenance strategy. The artifact introduced in this study represents a significant
shift towards more adaptive and robust anomaly detection systems, in which the artifact
has a broad understanding of the OWT data and of how each component relies on the others.
Unlike conventional methods, the models discussed in this study, particularly those
trained on GASF and MTF processed data cubes, demonstrate an ability to discern subtle
patterns and dependencies within multidimensional sensor data. This is corroborated by
the high precision and recall rates observed, which are seldom achieved by basic
statistical methods or ML models.

The comparison extends to the operational efficiency of these methods. Previous DL
methods, as noted by [101], often incur higher long-term costs due to manual adjustments
and maintenance triggered by false positives or undetected anomalies. In contrast, the
CAE models developed in this study leverage their initial extensive training to minimize
such errors, thus reducing the need for human intervention and allowing for more
continuous, autonomous monitoring. This advantage is especially critical in offshore
settings where maintenance operations are costly and logistically challenging.

5 Discussion
This discussion revisits the major objective of this study: creating and assessing a DL
model for anomaly detection in OWT maintenance. The outcomes demonstrate notable progress
in terms of model efficiency and accuracy, which is consistent with the study goals.
These results point to realistic applications in an actual operating environment, and
they both support and extend existing ideas on DL-based CM. The following subsections
discuss the theoretical, practical, and methodological implications of these findings,
the limitations of the present investigation, and avenues for subsequent research.

5.1 Implications
This research advances the field by demonstrating the use of six signal-to-image process-
ing algorithms to create “Data Cubes”, transforming traditional time-series sensor data
into three-dimensional matrices that capture temporal and spatial relationships. This
innovative preprocessing step enhances the ability of CAEs to detect anomalies in OWTs
by improving their capability to discern intricate correlations in multivariate sensor data.
The high F1 scores achieved with MTF and GASF algorithms highlight the effectiveness
of this approach in minimizing false alarms and increasing detection accuracy. Prac-
tically, this leads to improved operational efficiency and safety, reduced maintenance
costs, and greater reliability in renewable energy infrastructure. The study’s findings
suggest that this methodology could be applied to other domains requiring multivariate
time-series analysis, broadening its impact and utility.
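
To make the data cube idea concrete, the following minimal Python sketch builds a small
cube with the Gramian Angular Summation Field (GASF), one of the algorithms named above.
It uses the standard GASF definition, in which a rescaled series is mapped to angles and
each pixel is the cosine of the sum of two angles. The rescaling range of [-1, 1], the
window length, the number of sensors, the stacking order, and the function names are
assumptions made for illustration and are not taken from the thesis code.

```python
import math


def rescale(series):
    """Min-max rescale a series to [-1, 1] so that arccos is defined."""
    lo, hi = min(series), max(series)
    if hi == lo:
        return [0.0 for _ in series]
    return [2 * (x - lo) / (hi - lo) - 1 for x in series]


def gasf(series):
    """Gramian Angular Summation Field: G[i][j] = cos(phi_i + phi_j)."""
    # Clamping guards against tiny floating-point overshoot outside [-1, 1].
    phi = [math.acos(max(-1.0, min(1.0, x))) for x in rescale(series)]
    return [[math.cos(a + b) for b in phi] for a in phi]


def build_data_cube(sensor_windows):
    """Stack one GASF image per sensor into a (sensors, N, N) nested list."""
    if len({len(w) for w in sensor_windows}) != 1:
        raise ValueError("all sensor windows must have the same length")
    return [gasf(window) for window in sensor_windows]


# Three toy sensor windows of four samples each (illustrative values only).
windows = [
    [0.0, 0.5, 1.0, 0.5],
    [1.0, 0.8, 0.2, 0.0],
    [0.3, 0.3, 0.9, 0.1],
]
cube = build_data_cube(windows)
print(len(cube), len(cube[0]), len(cube[0][0]))
```

Each slice of the cube encodes the temporal structure of one sensor, and stacking the
slices lets a convolutional model see all sensors for the same time window at once.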

The methodology adopted for training DL models primarily uses healthy data, reducing
the need for a balanced dataset with numerous anomalies. This approach demonstrates
the models’ ability to generalize from normal operating conditions to detect deviations,
achieving high reliability and accuracy in anomaly detection. The unsupervised training
regimen utilizes data from all available sensors, highlighting the models’ capability to
recognize healthy operational data characteristics and understand the interplay between
different sensors and components within OWTs. This comprehensive sensor integration
minimizes the need for retraining or further development after initial deployment, even
when encountering new anomalies. Practically, this enhances operational efficiency by
reducing retraining frequency, lowering maintenance costs, and minimizing downtime,
thereby improving OWT reliability and ensuring continuous operation. The research also
advances the understanding of multi-sensor integration and its applications in complex
systems, paving the way for future research in sensor fusion.

The methodology in this study enhances sustainability in anomaly detection for the
renewable energy sector by improving data efficiency and reducing environmental impact.
By eliminating the need for anomalies in training datasets, the approach reduces
extensive data collection and storage, thus lowering CO2 emissions from data transfer and
storage in OWTs. The use of unsupervised learning techniques decreases the necessity for
labeled data, reducing energy consumption and reliance on domain-specific expertise. The
lightweight and robust nature of CAE models allows for direct deployment on OWTs,
minimizing the need for frequent retraining and fine-tuning, and reducing ongoing energy
demands for data monitoring and model evaluations. This research advances sustainable AI
practices by promoting resource-efficient and environmentally friendly ML models. In
practice, this methodology lowers CO2 emissions, decreases energy consumption, and
enhances the efficiency and reliability of OWT O&M, thereby supporting sustainability
goals in the renewable energy sector.

The model’s performance was evaluated against six data quality dimensions and showed
resilience to issues in consistent representation, completeness, and feature accuracy.
Using synthetic data to mimic these discrepancies, the model maintained an acceptable
F1-score of 0.88 despite a high rate of false alerts caused by the simulated sensor
failures. The training process effectively managed target class balance and data
uniqueness, and the model detected anomalies even with missing data. This research
underscores the importance of data preprocessing steps in ML and suggests that data cubes
can make anomaly detection systems less sensitive to data quality issues. Practically,
the model’s ability to handle data quality issues reliably makes it valuable for complex
industrial settings, reducing downtime and unnecessary maintenance, and improving
operational efficiency and safety. This approach supports operational strategies with
trustworthy data insights, highlighting the potential for adaptable algorithms that
maintain robust performance under varying data quality conditions.

5.2 Contributions
This research successfully designed and implemented a DL-based framework capable of
effectively detecting anomalies in the gearbox of OWTs. This model leverages state-
of-the-art signal-to-image processing algorithms to generate data cubes from various
sensors, enhancing the input data quality for DL applications. By addressing the pri-
mary challenge of handling high-dimensional, noisy, and imbalanced data, the framework
represents an advancement in the field.

The proposed model demonstrates an improvement in detection accuracy and a reduction in
false positive rates compared to traditional methods. By incorporating CAEs, the system
has shown superior capability in understanding complex patterns and temporal sequences
within the operational data of OWTs, which are crucial for reliable anomaly detection.
This advancement not only enhances anomaly detection but also contributes to operational
efficiency by enabling early detection of potential failures, allowing for timely
maintenance interventions. Consequently, this minimizes downtime and extends the lifespan
of critical OWT components, reducing operational and maintenance costs for wind farms.

Moreover, the research outlines a methodological approach that can be scaled and adapted
to other types of machinery and conditions within the renewable energy sector. This
scalability enhances the utility of the research beyond the specific context of OWTs,
potentially benefiting a broader range of applications in industrial CM. Additionally, by
improving maintenance efficiency and reliability, the research contributes to the
sustainability goals of renewable energy systems. Efficient anomaly detection systems
help ensure optimal operation of WTs, maximizing energy output and reducing waste and
environmental impact. This is further supported by a methodology that handles six
critical elements of data quality (uniqueness, target class balance, feature correctness,
consistent representation, completeness, and accurate target), significantly advancing
data quality management in DL applications.

5.3 Challenges and Limitations


This thesis faced several challenges and limitations that influenced the methodology and
outcomes of the research. Understanding these limitations is essential for interpreting the
results and for guiding future work in this area. One significant limitation encountered
in this thesis was the absence of specific labels indicating the root causes of damage in
the dataset. This gap restricted the development of a more refined, supervised learning
model capable of diagnosing the precise nature of each anomaly. Although the current
model is effective at detecting anomalies, it cannot pinpoint the specific cause of a
fault.

The intensive computational resources required for processing and training DL models
also posed a considerable challenge. Limited by the capabilities of personal computing
resources, the research utilized the university’s High-Performance Computing (HPC)
facilities. This necessity turned into a valuable learning experience, familiarizing me
with the complexities of training DL models on HPC systems, which, despite the
challenges, proved to be an enriching aspect of the research process. Furthermore, the
project faced
difficulties in gaining industry cooperation, which limited the opportunities to test the
model under real-world operational conditions. Such challenges underscore the need for
closer collaboration between academia and industry to ensure that research outcomes are
both practical and applicable in real-world environments beyond the datasets available.

5.4 Future Research Directions


Further research can greatly increase the impact and efficacy of DL techniques for
anomaly detection in OWTs by building on the basis provided by this thesis. Future
research could improve the current models by adding a supervised learning layer that
makes use of a labelled dataset. This layer would allow the models not only to identify
anomalies but also to classify them according to their features and possible causes,
which would increase the operational efficiency of turbines and enable more focused
maintenance techniques. Furthermore, integrating AI alignment approaches that make AI
systems more explainable can overcome important model limitations. This is necessary to
ensure that decisions made by AI are comprehensible and justified, and to win
stakeholders’ trust and acceptance. Such an approach could align DL model outputs with
human values and operational needs, making the systems not only more transparent but also
more aligned with the practical realities of turbine maintenance.

Moreover, this thesis’s concept of data cube production opens up new multidisciplinary
directions for applying DL to a range of domains where time-series data is common.
Through signal-to-image processing methods, researchers can convert these data into
images, which can then be processed by CNNs to improve pattern identification and anomaly
detection. This approach has potential applications outside of WTs, such as in
healthcare, finance, and environmental monitoring, where comparable data formats are
present. Applying these approaches to SCADA data in various contexts and using more
powerful computing resources will be crucial for scalability and deployment. DL models
may be made more accurate and resilient by training them on enlarged data cubes that
contain more variables. Such work would both assess the models’ generalizability and
support their efficient implementation at large scale, moving towards a complete solution
for WT CM and maintenance worldwide. By tackling these research directions, the field can
advance towards more dependable, scalable, and interpretable AI-driven monitoring
systems.

5.5 Impact of Design Science Research


Design Science Research (DSR) methodology has been instrumental in shaping the methods
and outcomes of this thesis by providing a rigorous, systematic framework for addressing
the complex issues in OWT CM. The adoption of DSR allowed for a structured exploration of
innovative solutions, specifically in the application of DL models to
anomaly detection in WT gearboxes. Through the iterative design and testing phases
outlined by Hevner’s [32] DSR model, the research benefited from continuous refine-
ment and alignment with practical needs as identified in the interview with Equinor.
The engagement with industry professionals provided critical insights that grounded the
research in real-world applications, ensuring that the developed models addressed the
specific challenges of data quality and anomaly detection reliability encountered in the
field.

DSR’s emphasis on creating and evaluating IT artifacts meant that the solutions developed
were not only theoretically sound but also practically viable. The interview with Equinor
highlighted the operational challenges in their existing systems, such as high rates of
false positives in anomaly detection and the high cost of ensuring data quality. Applying
DSR, the thesis developed a CAE framework that leverages multi-dimensional data cubes
from signal-to-image transformations to enhance the detection capabilities of the system.
This not only reduced false positives significantly but also contributed to more reliable
maintenance scheduling and operational efficiency, with data quality being less critical
than before, thereby aligning with the strategic goals of reducing downtime and
maintenance costs in OWT operations. This practical application of DSR underscores its
value in bridging the gap between theoretical research and tangible industry
improvements, exemplifying how academic research can directly contribute to enhancing
industry practices in renewable energy maintenance.

6 Conclusion
This thesis, motivated by the high cost of O&M for OWTs and aligned with global
sustainability goals, has demonstrated the integration of CAEs with data cubes for
enhanced anomaly detection in OWT gearbox CM. The application of data cubes has proven
valuable, boosting model performance and setting a benchmark for future research in
multi-dimensional data representations across various predictive maintenance domains. The
GASF model, the best-performing model, exhibited superior performance compared to
previous studies, achieving a precision of 0.998 and a perfect recall of 1. This resulted
in an F1 score of 0.999, indicating nearly flawless sensitivity and precision.

Moreover, this thesis bridges the theoretical aspects of DL with its practical applica-
tions in maintaining critical infrastructure. It underscores the importance of precise
data preprocessing to achieve the high accuracy necessary for AI applications in critical
environments. This research not only contributes to the domain of sustainable energy
by enhancing operational efficiencies but also promotes sustainability within the sector
by offering reliable solutions that minimize data transfer and storage requirements. By
demonstrating how data cubes can effectively address all six data quality dimensions,
this thesis lays a substantial foundation for advancing AI applications in renewable en-
ergy and beyond. This study encountered limitations such as the absence of specific
labels for root causes of damage in the dataset, which hindered the development of a
more detailed supervised learning model for diagnosing anomalies precisely. Future re-
search could extend this work by exploring the integration of diverse data types and
enhancing the explainability of DL models in industrial settings.

References
[1] Mark Diesendorf. Scenarios for mitigating CO2 emissions from energy supply in the absence of CO2 removal. https://doi.org/10.1080/14693062.2022.2061407, 2022.
[2] Arvind Keprate, Nikhil Bagalkot, Muhammad Salman Siddiqui, and Subhamoy Sen. Reliability analysis of 15MW horizontal axis wind turbine rotor blades using fluid-structure interaction simulation and adaptive kriging model. https://doi.org/10.1016/j.oceaneng.2023.116138, 2023.
[3] Yanting Li and Zhenyu Wu. A condition monitoring approach of multi-turbine based on VAR model at farm level. https://doi.org/10.1016/j.renene.2020.11.106, 2022.
[4] International Energy Agency (IEA). Renewables 2023. https://www.iea.org/reports/renewables-2023, 2024. License: CC BY 4.0.
[5] International Renewable Energy Agency (IRENA). Statistics time series. https://www.irena.org/Data/View-data-by-topic/Capacity-and-Generation/Statistics-Time-Series, 2023.
[6] European Commission. An EU strategy to harness the potential of offshore renewable energy for a climate neutral future. https://eur-lex.europa.eu/legal-content/EN/TXT/?uri=COM%3A2020%3A741%3AFIN, 2020.
[7] Lazard. Levelized cost of energy and levelized cost of storage 2023. https://www.lazard.com/research-insights/2023-levelized-cost-of-energyplus/, 2023.
[8] Paul Graham, Jenny Hayward, James Foster, and Lisa Havas. GenCost 2020-21 final report. https://www.csiro.au/-/media/EF/Files/GenCost2020-21_FinalReport.pdf, June 2021.
[9] Oliver Summerfield-Ryan and Susan Park. The power of wind: The global wind energy industry’s successes and failures. https://doi.org/10.1016/j.ecolecon.2023.107841, 2023.
[10] N. Bagalkot, A. Keprate, and R. Orderløkken. Combining computational fluid dynamics and gradient boosting regressor for predicting force distribution on horizontal axis wind turbine. https://doi.org/10.3390/vibration4010017, 2021.
[11] Allan May, David McMillan, and Sebastian Thöns. Economic analysis of condition monitoring systems for offshore wind turbine sub-systems. https://doi.org/10.1049/iet-rpg.2015.0019, 2015.
[12] C. Yang, J. Jia, K. He, L. Xue, C. Jiang, S. Liu, B. Zhao, M. Wu, and H. Cui. Comprehensive analysis and evaluation of the operation and maintenance of offshore wind power systems: A survey. https://doi.org/10.3390/en16145562, 2023.
[13] R. W. Hyers, J. G. McGowan, K. L. Sullivan, J. F. Manwell, and B. C. Syrett. Condition monitoring and prognosis of utility scale wind turbines. https://doi.org/10.1179/174892406X163397, 2006.
[14] E. Wiggelinkhuizen, T. Verbruggen, H. Braam, L. Rademakers, J. Xiang, and S. Watson. Assessment of condition monitoring techniques for offshore wind farms. https://doi.org/10.1115/1.2931512, 2008.
[15] H. Badihi, Y. Zhang, B. Jiang, P. Pillay, and S. Rakheja. A comprehensive review on signal-based and model-based condition monitoring of wind turbines: Fault diagnosis and lifetime prognosis. https://doi.org/10.1109/jproc.2022.3171691, 2022.
[16] N. Bagalkot, J. Jose, and A. Keprate. Chapter 2 - Key components of the horizontal axis wind turbine. https://doi.org/10.1016/B978-0-323-91852-7.00006-4, January 2024.
[17] A. Stetco, F. Dinmohammadi, X. Zhao, V. Robu, D. Flynn, M. Barnes, J. Keane, and G. Nenadic. Machine learning methods for wind turbine condition monitoring: A review. https://doi.org/10.1016/j.renene.2018.10.047, April 2019.
[18] Z. Kong, B. Tang, L. Deng, W. Liu, and Y. Han. Condition monitoring of wind turbines based on spatio-temporal fusion of SCADA data by convolutional neural networks and gated recurrent units. https://doi.org/10.1016/j.renene.2019.07.033, 2020.

[19] X. Xiao, J. Liu, D. Liu, Y. Tang, J. Dai, and F. Zhang. Stacked sparse autoencoders-based multi-layer perceptron for main bearing temperature prediction of large-scale wind turbines. https://doi.org/10.1002/cpe.6315, 2021.
[20] Y. Pang, Q. He, G. Jiang, and P. Xie. Spatio-temporal fusion neural network for multi-class fault diagnosis of wind turbines based on SCADA data. https://doi.org/10.1016/j.renene.2020.06.154, 2020.
[21] Y. Li and Z. Wu. A condition monitoring approach of multi-turbine based on VAR model at farm level. https://doi.org/10.1016/j.renene.2020.11.106, 2020.
[22] W. Wen, Y. Liu, R. Sun, and Y. Liu. Research on anomaly detection of wind farm SCADA wind speed data. https://doi.org/10.3390/en15165869, 2022.
[23] L. Budach, M. Feuerpfeil, N. Ihde, A. Nathansen, N. Noack, H. Patzlaff, F. Naumann, and H. Harmouch. The effects of data quality on machine learning performance. https://doi.org/10.48550/arXiv.2207.14529, 2022.
[24] D. Paradza and O. Daramola. Business intelligence and business value in organisations: A systematic literature review. https://doi.org/10.3390/su132011382, 2021.
[25] J. Shuja, S. A. Madani, K. Bilal, K. Hayat, S. U. Khan, and S. Sarwar. Energy-efficient data centers. https://doi.org/10.1007/s00607-012-0211-2, 2012.
[26] J. Ribrant and L. M. Bertling. Survey of failures in wind power systems with focus on Swedish wind power plants during 1997–2005. https://doi.org/10.1109/TEC.2006.889614, March 2007.
[27] A.G. Olabi, T. Wilberforce, K. Elsaid, E.T. Sayed, T. Salameh, M.A. Abdelkareem, and A. Baroutaji. A review on failure modes of wind turbine components. https://doi.org/10.3390/en14175241, 2021.
[28] H. Link, W. LaCava, J. van Dam, B. McNiff, S. Sheng, R. Wallen, M. McDade, S. Lambert, S. Butterfield, and F. Oyague. Gearbox reliability collaborative project report: Findings from phase 1 and phase 2 testing. https://www.nrel.gov/docs/fy11osti/51885.pdf, 2011.
[29] S. Sheng. Gearbox reliability database: yesterday, today, and tomorrow. https://www.nrel.gov/docs/fy15osti/63106.pdf, 2014.
[30] O.I. Owolabi, N. Madushele, P.A. Adedeji, et al. FEM and ANN approaches to wind turbine gearbox monitoring and diagnosis: a mini review. https://doi.org/10.1007/s40860-022-00183-4, 2023.
[31] M. Ghane, A.R. Nejad, M. Blanke, et al. Statistical fault diagnosis of wind turbine drivetrain applied to a 5MW floating wind turbine. https://doi.org/10.1088/1742-6596/753/5/052017, 2016.
[32] A. R. Hevner, S. T. March, J. Park, and S. Ram. Design science in information systems research. https://doi.org/10.2307/25148625, 2004.
[33] K. Peffers, T. Tuunanen, C. E. Gengler, M. Rossi, W. Hui, Ville Virtanen, and J. Bragge. Design science research process: A model for producing and presenting information systems research. https://doi.org/10.48550/arXiv.2006.02763, 2006.
[34] A. Keprate, S. Sheikhi, M. S. Siddiqui, and M. Tanwar. Comparing deep learning based image processing techniques for unsupervised anomaly detection in offshore wind turbines. https://doi.org/10.1109/IEEM58616.2023.10406361, 2023.
[35] R. T. Watson and J. Webster. Analysing the past to prepare for the future: Writing a literature review a roadmap for release 2.0. https://doi.org/10.1080/12460125.2020.1798591, 2020.
[36] J. Kang, Z. Wang, and C. Guedes Soares. Condition-based maintenance for offshore wind turbines based on support vector machine. https://doi.org/10.3390/en13143518, 2020.
[37] J. Helsen. Review of research on condition monitoring for improved O&M of offshore wind turbine drivetrains. https://doi.org/10.1007/s40857-021-00237-2, 2021.

20.04.2024 Student number: 120997 72


Title: Data Cubes and Deep Learning: Gearbox Condition Monitoring for Offshore Wind Turbine

[38] T. Asim, S.Z. Islam, A. Hemmati, and M.S.U. Khalid. A review of recent advancements in offshore
wind turbine technology. https://doi.org/10.3390/en15020579, 2022.
[39] Xiangjing Su, Yanhao Shan, Chaojie Li, Yang Mi, Yang Fu, and Zhaoyang Dong. Spatial-temporal
attention and gru based interpretable condition monitoring of offshore wind turbine gearboxes.
https://doi.org/10.1049/rpg2.12315, 2022.
[40] Q. Fan, X. Wang, J. Yuan, X. Liu, H. Hu, and P. Lin. A review of the development of key
technologies for offshore wind power in china. https://doi.org/10.3390/jmse10070929, 2022.
[41] H. Hinrichs. Condition monitoring based maintenance strategies for operating offshore
wind farms.
https://www.planet-energie.de/de/media/Delft_Workshop_2006_ProjektGmbH_OL_final.pdf, 2006.
[42] IEC International Standard IEC61400-Part 25-1, Communications for Monitoring and Control of
Wind Power Plants: Overall Description of Principles and Models.
https://webstore.iec.ch/publication/29062, 2006.
[43] Wenxian Yang, Richard Court, and Jiesheng Jiang. Wind turbine condition monitoring by the
approach of scada data analysis. http://dx.doi.org/10.1016/j.renene.2012.11.030, 2012.
[44] Olimpo Anaya-Lara, David Campos-Gaona, Edgar Moreno-Goytia, and Grain Adam. Offshore
wind energy generation: Control, protection, and integration to electrical systems, 2014.
[45] U. Gutierrez Santiago, A. Fernández Sisón, H. Polinder, and J. W. van Wingerden. Input torque
measurements for wind turbine gearboxes using fiber optical strain sensors, 2021.
[46] Zhipeng Feng and Ming Liang. Complex signal analysis for wind turbine planetary gearbox fault
diagnosis via iterative atomic decomposition thresholding.
https://doi.org/10.1016/j.jsv.2014.05.029, 2014.
[47] En 13306: Maintenance terminology. European Standard, 2017.
[48] T. W. Verbruggen. Wind turbine operation & maintenance based on condition monitoring wt-Ω.
2003.
[49] JP Verhoef and TW Verbruggen. Conditiebewaking aan windturbines; een verkennende studie.
ECN-C-058, Juni 2001, 2001.
[50] Z. Hameed, Y.S. Hong, Y.M. Cho, S.H. Ahn, and C.K. Song. Condition monitoring and fault
detection of wind turbines and related algorithms: A review. Renewable and Sustainable Energy
Reviews, Volume 13, Issue 1, Pages 1-39, 2009.
[51] Matti Scheu, Lorena Tremps Bolsa, Ursula Smolka, Athanasios Kolios, and Feargal Brennan. A
systematic failure mode effects and criticality analysis for offshore wind turbine systems towards
integrated condition based maintenance strategies.
https://doi.org/10.1016/j.oceaneng.2019.02.048, 2019.
[52] Fausto Pedro García Márquez, Andrew Mark Tobias, Jesús María Pinar Pérez, and Mayorkinos
Papaelias. Condition monitoring of wind turbines: Techniques and methods.
https://doi.org/10.1016/j.renene.2012.03.003, 2012.
[53] Wenxian Yang, P.J. Tavner, Christopher Crabtree, and M.R. Wilkinson. Cost-effective condition
monitoring for wind turbines. https://doi.org/10.1109/TIE.2009.2032202, February 2010.
[54] Kong Zhang, Vikram Pakrashi, Jimmy Murphy, and Guangbo Hao. Inspection of floating offshore
wind turbines using multi-rotor unmanned aerial vehicles: Literature review and trends.
https://doi.org/10.3390/s24030911, 2024.
[55] W. Yang, R. Court, and J. Jiang. Wind turbine condition monitoring by the approach of scada
data analysis. https://doi.org/10.1016/j.renene.2012.11.030, 2013.
[56] H. Fox, A.C. Pillai, D. Friedrich, M. Collu, T. Dawood, and L. Johanning. A review of
predictive and prescriptive offshore wind farm operation and maintenance.
https://doi.org/10.3390/en15020504, 2022.


[57] E. Medina-Lopez, D. McMillan, J. Lazic, E. Hart, S. Zen, A. Angeloudis, E. Bannon, J. Browell,
S. Dorling, R.M. Dorrell, R. Forster, C. Old, G.S. Payne, G. Porter, A.S. Rabaneda, B. Sellar,
E. Tapoglou, N. Trifonova, I.H. Woodhouse, and A. Zampollo. Satellite data for the offshore
renewable energy sector: Synergies and innovation opportunities. Remote Sensing of Environment,
2021.
[58] Y. Liu, M. Hajj, and Y. Bao. Review of robot-based damage assessment for offshore wind turbines.
Renewable and Sustainable Energy Reviews, 2022.
[59] Xintao Xia, Zhen Chang, Lijun Zhang, and Xiaowei Yang. Estimation on reliability models of
bearing failure data. https://doi.org/10.1155/2018/6189527, 2018.
[60] R. J. Hyndman and G. Athanasopoulos. Forecasting: Principles and practice.
https://otexts.org/fpp2/, 2018.
[61] Peter McCullagh. What is a statistical model? https://doi.org/10.1214/aos/1035844977,
October 2002.
[62] M. Rezamand, R. Carriveau, D. S.-K. Ting, M. Davison, and J. J. Davis. Aggregate reliability
analysis of wind turbine generators. https://doi.org/10.1049/iet-rpg.2018.5909, 2019.
[63] Jianshe Kang, Xinghui Zhang, and Tongdan Jin. Tracking gearbox degradation based on stable
distribution parameters: A case study. 2015 IEEE Conference on Prognostics and Health
Management (PHM), June 2015.
[64] E. Mollasalehi, D. Wood, and Q. Sun. Indicative fault diagnosis of wind turbine generator bearings
using tower sound and vibration. https://doi.org/10.3390/en10111853, 2017.
[65] Wanwan Zhang, Jørn Vatn, and Adil Rasheed. A review of failure prognostics for predictive
maintenance of offshore wind turbines. Journal of Physics: Conference Series, 2022.
[66] R.K. Pandit, D. Astolfi, and I. Durazo Cardenas. A review of predictive techniques used to
support decision making for maintenance operations of wind turbines.
https://doi.org/10.3390/en16041654, 2023.
[67] F. Zhang, M. Chen, Y. Zhu, K. Zhang, and Q. Li. A review of fault diagnosis, status prediction,
and evaluation technology for wind turbines. https://doi.org/10.3390/en16031125, 2023.
[68] European Commission. Artificial intelligence for europe. Communication from the Commission
to the European Parliament, The European Council, The Council, the European Economic and
Social Committee and the Committee of the Regions, 2018. COM (2018) 237.
[69] International Committee for Information Technology Standards [INCITS]. Information
technology—american national standard dictionary of information technology. ANSI INCITS 172–2002
(R2007) (Revision and Redesignation Of ANSI X3.172–1996), 2007.
[70] International Organization for Standardization. Information technology—artificial
intelligence—artificial intelligence concepts and terminology, Under development. ISO/IEC 3WD 22989.
[71] Office of the President of the Russian Federation. Decree of the president of the russian federation
on the development of artificial intelligence in the russian federation, 2019.
[72] The Conference Toward AI Network Society. Draft ai r & d guidelines for international discussions,
2017.
[73] I. M. Black, M. Richmond, and A. Kolios. Condition monitoring systems: a systematic literature
review on machine-learning methods improving offshore-wind turbine operational management.
https://doi.org/10.1080/14786451.2021.1890736, 2021.
[74] L. Alzubaidi, J. Zhang, A.J. Humaidi, et al. Review of deep learning: concepts, cnn architectures,
challenges, applications, future directions. Journal of Big Data, 2021.
[75] Andre Khuri. Introduction to linear regression analysis, fifth edition by douglas c. montgomery,
elizabeth a. peck, g. geoffrey vining. https://doi.org/10.1111/insr.12020_10, 2013.


[76] Bernhard Schölkopf and Alexander J. Smola. Learning with kernels: Support vector machines,
regularization, optimization, and beyond. https://doi.org/10.7551/mitpress/4175.001.0001,
2018.
[77] E. Hunt, J. Marin, and P. Stone. Experiments in induction. https://doi.org/10.2307/1421207,
1966.
[78] B. W. Silverman and M. C. Jones. E. fix and j.l. hodges (1951): An important contribution
to nonparametric discriminant analysis and density estimation: Commentary on fix and hodges
(1951). https://doi.org/10.2307/1403796, 1989.
[79] Christopher M. Bishop. Neural networks for pattern recognition.
https://api.semanticscholar.org/CorpusID:60563397, 1995.
[80] J. S. R. Jang. Anfis: adaptive-network-based fuzzy inference system, May-June 1993.
[81] N. Dervilis, M. Choi, S.G. Taylor, R.J. Barthorpe, G. Park, C.R. Farrar, and K. Worden. On
damage diagnosis for a wind turbine blade using pattern recognition.
https://doi.org/10.1016/j.jsv.2013.11.015, 2014.
[82] Meik Schlechtingen, Ilmar Santos, and Sofiane Achiche. Using data-mining approaches for wind
turbine power curve monitoring: A comparative study. IEEE Transactions on Sustainable Energy,
1-9, 2013.
[83] Michael Wilkinson, Brian Darnell, Thomas van Delft, and Keir Harman. Comparison of methods
for wind turbine condition monitoring with scada data. Special Issue: European Wind Energy
Association 2013, Free Access, 2014.
[84] Edzel Lapira, Dustin Brisset, Hossein Davari Ardakani, David Siegel, and Jay Lee. Wind turbine
performance assessment using multi-regime modeling approach. Renewable Energy, Elsevier, 2012.
[85] Z. Zhang and A. Kusiak. Monitoring wind turbine vibration based on scada data. ASME. J. Sol.
Energy Eng., May 2012.
[86] O. Uluyol, G. Parthasarathy, W. Foslien, and K. Kim. Power curve analytic for wind turbine
performance monitoring and prognostics. Annual Conference of the PHM Society, 2011.
[87] I. Antoniadou, N. Dervilis, E. Papatheou, A. E. Maguire, and K. Worden. Aspects of structural
health and condition monitoring of offshore wind turbines.
http://doi.org/10.1098/rsta.2014.0075, 2015.
[88] P. W. Khan and Y.-C. Byun. A review of machine learning techniques for wind turbine’s fault
detection, diagnosis, and prognosis. International Journal of Green Energy, 2023.
[89] X. Xiao, J. Liu, D. Liu, Y. Tang, J. Dai, and F. Zhang. Stacked sparse autoencoders-based
multi-layer perceptron for main bearing temperature prediction of large-scale wind turbines.
Concurrency and Computation: Practice and Experience, 2021.
[90] F.P. García Márquez and A. Peinado Gonzalo. A comprehensive review of artificial intelligence
and wind energy. Archives of Computational Methods in Engineering, 2022.
[91] Tomasz Barszcz, Andrzej Bielecki, and Mateusz Wójcik. Art-type artificial neural networks
applications for classification of operational states in wind turbines. Book chapter in published
volume, 2010.
[92] Ursula Smolka and Po Wen Cheng. On the design of measurement campaigns for fatigue life
monitoring of offshore wind turbines. Proceedings of the International Offshore and Polar
Engineering Conference, 2013.
[93] Alberto Pliego Marugán, Ana María Peco Chacón, and Fausto Pedro García Márquez. Reliability
analysis of detecting false alarms that employ neural networks: A real case study on wind turbines.
Reliability Engineering & System Safety, 2019.
[94] Fausto Pedro García Márquez and Ana María Peco Chacón. A review of non-destructive testing
on wind turbines blades. Renewable Energy, 2020.


[95] Abdul Butt, Bilal Akbar, Jawad Aslam, Naveed Akram, Manzoore Elahi Soudagar, Fausto Pedro
García Márquez, Yamin Younis, and Emad Uddin. Development of a linear acoustic array for
aero-acoustic quantification of camber-bladed vertical axis wind turbine. Sensors, 2020.
[96] Maik Reder, Nurseda Y. Yürüşen, and Julio J. Melero. Data-driven learning framework for
associating weather conditions and wind turbine failures. Reliability Engineering and System Safety,
2018.
[97] Mahendra Raghav and Ram Sharma. A review on fault diagnosis and condition monitoring of
gearboxes by using ae technique. Archives of Computational Methods in Engineering, 2020.
[98] Yue Cui, Pramod Bangalore, and Lina Bertling Tjernberg. An anomaly detection approach based
on machine learning and scada data for condition monitoring of wind turbines. Conference paper
presented at PMAPS 2018, 2018.
[99] S. Koukoura. Failure and remaining useful life prediction of wind turbine gearboxes. Annual
Conference of the PHM Society, 2018.
[100] Q. Xu, D. Wu, C. Jiang, et al. A composite quantile regression long short-term memory
network with group lasso for wind turbine anomaly detection. Journal of Ambient Intelligence and
Humanized Computing, 2023.
[101] Yongchao Zhu, Caichao Zhu, Jianjun Tan, Yong Tan, and Lei Rao. Anomaly detection and
condition monitoring of wind turbine gearbox based on lstm-fs and transfer learning. Renewable
Energy, 2022.
[102] Qiucheng Lyu, Yuwei He, Shijing Wu, Deng Li, and Xiaosun Wang. Anomaly detection of wind
turbine driveline based on sequence decomposition interactive network. Sensors, 2023.
[103] C. McKinnon, J. Carroll, A. McDonald, S. Koukoura, D. Infield, and C. Soraghan. Comparison
of new anomaly detection technique for wind turbine condition monitoring using gearbox scada
data. Energies, 2020.
[104] H. Zhao, H. Liu, W. Hu, and X. Yan. Anomaly detection and fault analysis of wind turbine
components based on deep learning network. Renewable Energy, 2018.
[105] N. Renström, P. Bangalore, and E. Highcock. System-wide anomaly detection in wind turbines
using deep autoencoders. Renewable Energy, 2020.
[106] Jiarui Liu, Guotian Yang, Xinli Li, Qianming Wang, Yuchen He, and Xiyun Yang. Wind
turbine anomaly detection based on scada: A deep autoencoder enhanced by fault instances. ISA
Transactions, 2023.
[107] M. Li, S. Wang, S. Fang, and J. Zhao. Anomaly detection of wind turbines based on deep
small-world neural network. Applied Sciences, 2020.
[108] C. Zhang and T. Yang. Anomaly detection for wind turbines using long short-term
memory-based variational autoencoder wasserstein generation adversarial network under
semi-supervised training. Energies, 2023.
[109] Xiaobo Liu, Wei Teng, Shiming Wu, Xin Wu, Yibing Liu, and Zhiyong Ma. Sparse dictionary
learning based adversarial variational auto-encoders for fault identification of wind turbines.
Measurement, 2021.
[110] J. E. D. Albuquerque Filho, L. C. P. Brandão, B. J. T. Fernandes, and A. M. A. Maciel. A review
of neural networks for anomaly detection. IEEE Access, 2022.
[111] B. Altice, E. Nazario, M. Davis, M. Shekaramiz, T.K. Moon, and M.A.S. Masoum. Anomaly
detection on small wind turbine blades using deep learning algorithms. Energies, 2024.
[112] Anfeng Zhu, Qiancheng Zhao, Tianlong Yang, Ling Zhou, and Bing Zeng. Condition monitoring of
wind turbine based on deep learning networks and kernel principal component analysis. Computers
and Electrical Engineering, 2023.


[113] Eric Stefan Miele, Fabrizio Bonacina, and Alessandro Corsini. Deep anomaly detection in
horizontal axis wind turbines using graph convolutional autoencoders for multivariate time series.
Energy and AI, 2022.
[114] Aurelien Geron. Hands-on machine learning with scikit-learn, keras, and tensorflow: Concepts,
tools, and techniques to build intelligent systems, 2019.
[115] Sylvaine Peyrols. How the body works (My First Discoveries), May 2016. Illustrated edition,
Spiral-bound.
[116] Sh. Sheng. Wind turbine gearbox vibration condition monitoring benchmarking datasets. NREL
National Wind Technology Center, Boulder, CO, 2012.
[117] John Wang (Editor). Encyclopedia of data science and machine learning. IGI Global, 2022.
[118] Junyi Ma et al. Seqot: A spatial–temporal transformer network for place recognition using
sequential lidar data. IEEE Transactions on Industrial Electronics, 2022.
[119] Z. Wang and T. Oates. Encoding time series as images for visual inspection and classification
using tiled convolutional neural networks. Available online, 2015.
[120] S. Barra, S. M. Carta, A. Corriga, A. S. Podda, and D. R. Recupero. Deep learning and time
series-to-image encoding for financial forecasting. IEEE/CAA Journal of Automatica Sinica, 2020.
[121] J. Song, Y. C. Lee, and J. Lee. Deep generative model with time series-image encoding for
manufacturing fault detection in die casting process. Journal of Intelligent Manufacturing, 2022.
[122] M. Fahim, K. Fraz, and A. Sillitti. Tsi: Time series to imaging based model for detecting anomalous
energy consumption in smart buildings. Information Sciences, 2020.
[123] L. Wen, X. Li, L. Gao, and Y. Zhang. A new convolutional neural network-based data-driven fault
diagnosis method. IEEE Transactions on Industrial Electronics, 2018.
[124] B. Boashash. Time-frequency signal analysis and processing: A comprehensive reference. Academic
Press, an imprint of Elsevier, 2016.
[125] M. Thill, W. Konen, H. Wang, and T. Bäck. Temporal convolutional autoencoder for unsupervised
anomaly detection in time series. Applied Soft Computing, 2021.
