
Hindawi

Scientific Programming
Volume 2020, Article ID 8616039, 10 pages
https://doi.org/10.1155/2020/8616039

Research Article
Failure Prediction of Aircraft Equipment Using Machine
Learning with a Hybrid Data Preparation Method

Kadir Celikmih,1 Onur Inan,2 and Harun Uguz3

1 Department of Information and Communication Technologies, Havelsan, Ankara 06510, Turkey
2 Department of Computer Engineering, Necmettin Erbakan University, Konya 42090, Turkey
3 Department of Computer Engineering, Konya Technical University, Konya 42250, Turkey

Correspondence should be addressed to Harun Uguz; [email protected]

Received 12 January 2020; Revised 17 February 2020; Accepted 4 August 2020; Published 28 August 2020

Academic Editor: Rahman Ali

Copyright © 2020 Kadir Celikmih et al. This is an open access article distributed under the Creative Commons Attribution
License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is
properly cited.

There is a large amount of information and maintenance data in the aviation industry that could be used to obtain meaningful
results in forecasting future actions. This study aims to introduce machine learning models based on feature selection and data
elimination to predict failures of aircraft systems. Maintenance and failure data for aircraft equipment across a period of two years
were collected, and nine input variables and one output variable were identified. A hybrid data preparation model is proposed
to improve the success of failure count prediction in two stages. In the first stage, ReliefF, a feature selection method for attribute
evaluation, is used to find the most effective and ineffective parameters. In the second stage, a K-means algorithm is modified to
eliminate noisy or inconsistent data. Performance of the hybrid data preparation model on the maintenance dataset of the
equipment is evaluated with a Multilayer Perceptron (MLP) as the Artificial Neural Network (ANN), Support Vector Regression (SVR),
and Linear Regression (LR) as machine learning algorithms. Moreover, performance criteria such as the Correlation Coefficient
(CC), Mean Absolute Error (MAE), and Root Mean Square Error (RMSE) are used to evaluate the models. The results indicate that
the hybrid data preparation model is successful in predicting the failure count of the equipment.

1. Introduction

Reliability and availability of aircraft components have always been an important consideration in aviation. Accurate prediction of possible failures will increase the reliability of aircraft components and systems. The scheduling of maintenance operations helps determine the overall maintenance and overhaul costs of aircraft components. Maintenance costs constitute a significant portion of the total operating expenditure of aircraft systems.

There are three main types of maintenance for equipment: corrective maintenance, preventive maintenance, and predictive maintenance [1]. Corrective maintenance helps manage repair actions and unscheduled fault events, such as equipment and machine failures. When aircraft equipment fails while it is in use, it is repaired or replaced. Preventive maintenance can reduce the need for unplanned repair operations. It is implemented by periodic maintenance to avoid equipment failures or machinery breakdowns. Tasks for this type of maintenance are planned to prevent unexpected downtime and breakdown events that would lead to repair operations. Predictive maintenance, as the name suggests, uses parameters that are measured while the equipment is in operation to estimate when failures might happen. It aims to intervene in the system before faults occur [1, 2] and helps reduce the number of unexpected failures by providing the maintenance personnel with more reliable scheduling options for preventive maintenance. Assessing system reliability is important for choosing the right maintenance strategy.

Machine learning is a rising technology that is expected to develop further. Machine learning methods are applied in prediction/preventive systems, communications, security, energy management, and so on [3]. The data preparation level is the core module of machine learning and

the decision-making system. It manages the data to make it useful for decisions. Decision making depends on future forecasting, failure events, and availability of equipment [4]. Data mining is a way of classifying and grouping data into comprehensible information. It extracts the applicable models from a mass of information and adopts different approaches to uncover hidden information. Data mining can be defined as knowledge derivation from raw data [5].

Feature selection is a fundamental issue in data mining and machine learning; it focuses on the features that are most relevant to the intended prediction [6]. Features collected from the observation of a phenomenon are not all equally significant. Operational data tend to be incomplete, insufficient, only partially meaningful, or not meaningful at all. Some of the features may be noisy, redundant, or irrelevant. Feature selection aims to choose a feature set that is relevant to a specific task. This is a complex and multidimensional problem [7]. Hsu [8] proposed a novel feature selection algorithm based on the correlation coefficient clustering method. It focused on reducing noisy, repeated, or redundant features. Computational speed and classification accuracy can be improved by removing the irrelevant features. Data processing methods help improve the quality of the data and increase the accuracy of data mining, thereby making it more efficient [8]. Data quality is important for the process of information discovery, checking data anomalies, and predicting and analyzing for decision making [9]. Predicting equipment failures is essential to reduce repair and equipment costs and to assess equipment availability [1].

Large volumes of data can be useful for businesses and can guide systems to follow the right paths. To boost the performance of machine learning algorithms, it is critical that meaningful information be gathered from the dataset. To eliminate noisy and irrelevant data during data preparation, we used the K-means clustering algorithm, which is one of the popular unsupervised machine learning algorithms. It defines k centroids and allocates every case to the nearest cluster while keeping the distances to the centroids small [10]. The “means” in K-means refers to the averaging of the data in a cluster to find its centroid. This algorithm assigns each case to only a single set. The purpose is to accomplish a high level of similarity within the clusters and low similarity across them [11]. It is used for more effective and better clustering with decreased complexity.

There are many studies on maintenance data and forecasting failure rates. Data preparation is a critical step in the feature selection process, and it has a major effect on the success of a machine learning algorithm. Gürbüz et al. [9] applied various techniques of preprocessing along with feature selection on 15 datasets of a Turkish airline company to understand and clean the dataset and to find the relationships between input and output features. They came up with 15 rules for creating failure alerts, and these were found useful by the experts of the aviation company. A classification algorithm was used to extract patterns within equipment data. Kutyłowska [12] proposed the application of artificial neural networks to failure rate modelling. Data from a water utility in Poland were used to estimate the output values of failure frequency. The results showed that artificial neural networks could be an option to assess the frequency of problems in water supply systems. Ramos et al. [13] carried out a study to predict malfunctions and to carry out predictive maintenance on a piece of manufacturing equipment. In this study, ARIMA forecasting methods were compared with neural network models. The results indicated that both models were good at forecasting defibrator disc replacement, but ARIMA was much better at forecasting the distance between the discs. Trani et al. [14] introduced a basic method to estimate aircraft fuel consumption through the use of an artificial neural network. A fuel consumption model supported by a neural network was created by using the data given in the performance manual of the aircraft. The results from the neural network model were compared with analytical models. The results revealed that the proposed three-layer ANN with nonlinear transfer functions could correctly estimate fuel consumption in different stages of a flight. Ming et al. [15] investigated the use of the ANN method in vibration analysis by using the integrated data from vibration devices to predict equipment failures. The ANN model was applied to diagnose the faults in a mill. The results lent support to the efficiency of this methodology. Kozik and Sep [16] applied ANN forecasting to identify the demand for spare parts to be replaced during aircraft engine overhaul. The results indicated that the forecasting method, which includes the engine’s hard-time calculation, could be an asset in the implementation of lean manufacturing in MRO (maintenance, repair, and overhaul) facilities. Altay et al. [17] used 532 failures of 60 aircraft to model an artificial neural network to forecast failures. The proposed model produced a high correlation between the predicted and target failure times of aircraft. Benkedjouh et al. [18] proposed a method to estimate the remaining useful life (RUL) of machinery with bearings. For this purpose, the researchers used the isometric feature mapping reduction technique (ISOMAP) and support vector regression (SVR). Moura et al. [19] presented an analysis to comparatively assess the effectiveness of SVM in predicting failure time. The performance of SVM regression was compared with other learning methods.

This paper discusses the feature selection of variables in the maintenance data obtained from an aviation company in Turkey. The proposed system will help companies to collect, extract, and create data to improve maintenance actions through more accurate predictions. This study proposes a hybrid data preparation method for maintenance data and predicts failure counts of equipment by comparing the results of three different algorithms. The feature selection method (the ReliefF algorithm in the present study) is used for selecting attributes, and the modified K-means algorithm is used to eliminate the redundant data. Three methods for predicting equipment failure counts are introduced and compared: MLP as an ANN algorithm, SVR, and LR. The next section presents an overview of the materials and methodology, followed by experimental results and conclusions.

2. Materials and Methodology

The context in which the present case study was carried out was an aviation company in Ankara, Turkey. The maintenance data were collected from the records of the

maintenance department. They included removal of equipment, repair activities, experience of the operators, flight hours of the equipment, and other information relevant to the case study.

2.1. Dataset Acquisition. In the ERP platform, a program was developed to collect data and to format the dataset for analysis through machine learning algorithms. The variables were grouped as the input variables, while the failure count was considered to be the output variable. Selected parameters are evaluated by the ReliefF feature selection method to find the most influential parameters that have a share in the failures. The flow logic of the developed program is presented in Figure 1. First, serial-numbered equipment used in the landing gear system was selected. Its maintenance and operational data were identified. Attributes of the maintenance and failure data were identified in cooperation with maintenance personnel and technicians. Each instance of a no fault found (NFF) status was examined to find confirmed failure data. The total flight hours for each piece of equipment across different aircraft were calculated for a given time period. Nine input variables that affect the failure of the equipment were determined. The failure count was calculated as an output variable. These nine input variables and one output variable were used for modelling with the machine learning algorithms used in the present study (MLP, SVR, and LR).

2.2. Feature Selection and ReliefF Algorithm. Feature selection is a technique to obtain the relevant features by removing irrelevant and noisy data from the original dataset. It is the process of selecting a subset of features that could adequately represent the whole dataset. The main objective of feature selection is to obtain the minimum number of features that achieves maximum accuracy. Feature selection methods are used in data mining and machine learning, as well as in artificial intelligence. They reduce model complexity and let algorithms operate faster. Relief is a feature selection algorithm that randomly selects instances for feature weight calculation. The Relief algorithm was proposed by Kira and Rendell in 1992 [20, 21]. It estimates feature weights iteratively, according to their ability to make a distinction between neighboring instances. The original Relief is limited to two-class classification problems and does not handle incomplete data. Kononenko [22] proposed an extension to Relief called ReliefF, which deals with noisy and missing data and addresses multiclass problems. The ReliefF algorithm finds one near miss for each different class and averages their values to revise the feature weights.

2.3. Data Elimination and Modified K-Means Algorithm. K-means, a widely used algorithm in a wide range of applications, was first developed in 1967 by MacQueen [23]. It allows each data point to be a member of a single set. It has two limitations: the value of K must be fixed in advance, and the result depends on the initial centroids. It uses the Euclidean distance as the distance criterion. Fahad and Alam [10] proposed a method using a modified K-means algorithm, which proved less time consuming yet more efficient in clustering. The quality of the resulting clusters depends on the selection of the initial centroid. The K-means algorithm makes it possible to create a new data cluster by eliminating the smallest class value represented in the cluster. Yilmaz et al. [11] proposed a system using a modified K-means algorithm to eliminate noisy and irrelevant data. In this study, we used the modified K-means algorithm as in [11] and developed the pseudocode given in Algorithm 1.

2.4. Prediction Methods. In recent years, research on machine learning algorithms and data mining has been carried out to study failure prediction applications. In this study, the MLP, SVR, and LR algorithms were examined to model maintenance data and predict the failure count.

2.4.1. Multilayer Perceptron as an Artificial Neural Network. An Artificial Neural Network (ANN) is a mathematical model built from an interconnected group of artificial neurons, inspired by biological neural networks. An ANN imitates the way the brain processes information [24, 25]. Neural networks are a nonlinear statistical data modelling and machine learning method. They can be used to model complex nonlinear relationships between inputs and outputs in the data. They also describe patterns or relationships in the data, and they help forecast output values with the help of training, learning, and testing processes. A cell in a neural network is called a neuron, and a fixed number of neurons build a layer. Neurons connect to neurons in other layers by a weight factor. ANN algorithms compute weights for input, hidden layer, and output layer neurons by a feed forward approach [26, 27]. Weights in an ANN are calculated by using a training algorithm, the most popular of which is backpropagation. Backpropagation is a learning algorithm that seeks to minimize the difference between the actual and target outputs. The weights are updated, so that the total error is distributed to the various neurons in the neural network. The error remains at a low level through feeding forward and backpropagating [17]. The predictive capability of neural networks comes from their multilayered structure. Neural networks have an input layer, one or more hidden layers, and an output layer. The behaviour of an MLP is determined by the activation functions of its neurons [28]. In this study, multilayer perceptron (MLP) feed forward neural networks were used with a backpropagation learning algorithm.

2.4.2. Support Vector Regression. The Support Vector Machines (SVM) algorithm was introduced by Cortes and Vapnik in 1995 [29]. It is a linear model used to address classification and regression problems. In its basic form, the SVM algorithm produces a hyperplane that separates two distinct classes. Training involves identifying the parameters of this hyperplane [11]. Support Vector Regression (SVR) is a regression

[Flowchart: select materials’ serial-numbered equipment → select equipment’s maintenance and operational data → diagnose failure data and check NFF status → find confirmed failure count and flight hours → find input variables from maintenance data → format input and output variables for modelling.]

Figure 1: Flow logic of developed program.

Procedure prepare_data_set
    Get clustered_dataset, cluster_center (center of each cluster), elimination_number
    For i = 1 to cluster_count
        Sort the records of clustered_dataset(i) by distance to cluster_center(i), descending
        For j = 1 to elimination_number
            Delete the first (farthest remaining) record in clustered_dataset(i)
        End for
    End for
End procedure

ALGORITHM 1: Pseudocode of modified K-means algorithm.
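
The elimination step of Algorithm 1 can be sketched in Python as follows. This is a minimal illustration, not the authors' implementation: the function name `eliminate_farthest`, the toy points, and the assumption that cluster labels and centers are already available are all hypothetical.

```python
import math

def eliminate_farthest(points, labels, centers, elimination_number):
    """Return the indices of the records kept after dropping, in each cluster,
    the `elimination_number` records farthest from the cluster center."""
    kept = []
    for c, center in enumerate(centers):
        members = [i for i, lab in enumerate(labels) if lab == c]
        # Sort by Euclidean distance to the center, farthest first.
        members.sort(key=lambda i: math.dist(points[i], center), reverse=True)
        # Skip the farthest records and keep the rest.
        kept.extend(members[elimination_number:])
    return sorted(kept)

# Hypothetical toy data: two clusters, each with one obvious outlier.
points = [(0, 0), (0, 1), (5, 5), (10, 10), (10, 11), (20, 20)]
labels = [0, 0, 0, 1, 1, 1]
centers = [(0, 0.5), (10, 10.5)]
kept = eliminate_farthest(points, labels, centers, elimination_number=1)
print(kept)
```

Sorting once and slicing off the head of the list avoids the index shifting that repeated deletion inside a loop would cause.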

algorithm that uses a method similar to SVM to carry out regression analysis. SVR is a supervised machine learning algorithm and an effective method which can be used for prediction and data mining and is successfully adopted for regression problems.

2.4.3. Linear Regression. Linear regression is a machine learning algorithm that is based on supervised learning and performs a regression task. It is used to model the linear relationship between a dependent variable and one or more independent variables. It helps determine the relationship between variables and supports prediction. Schuld et al. [30] proposed a prediction algorithm on a quantum computer, based on a linear regression model with least-squares optimization. Its scheme focused on the machine learning task of predicting the output corresponding to a new input. The prediction result can be used for further quantum information processing routines.

2.5. Evaluation Performance Measures. In this study, the mean absolute error (MAE), root mean square error (RMSE), and correlation coefficient (CC) criteria were used to evaluate the success of all the models. There are many error measurement techniques; these three are among the most commonly used. The error parameters, adopted from [31], are presented in the following equations, respectively.

$$\mathrm{MAE} = \frac{1}{N}\sum_{i=1}^{N}\left|X_i - Y_i\right|, \qquad (1)$$

$$\mathrm{RMSE} = \sqrt{\frac{1}{N}\sum_{i=1}^{N}\left(X_i - Y_i\right)^2}, \qquad (2)$$

$$\mathrm{CC} = \frac{\sum_{i=1}^{N}\left(X_i - \bar{X}\right)\left(Y_i - \bar{Y}\right)}{\sqrt{\sum_{i=1}^{N}\left(X_i - \bar{X}\right)^2}\,\sqrt{\sum_{i=1}^{N}\left(Y_i - \bar{Y}\right)^2}}, \qquad (3)$$

where $N$ is the number of data points; $X_i$ is the observed value; $Y_i$ is the predicted value; $\bar{X}$ is the mean of the observed values, and $\bar{Y}$ is the mean of the predicted values. CC measures how closely the predicted values follow the observed values.

3. Proposed Methods

In this study, as noted in Section 2.1, 585 records of maintenance data collected over two years from a Turkish aviation company were used. The dataset consists of nine input variables and an output variable (failure count). The input variables/factors are operational and environmental parameters which could influence failure occurrence and the length of operation before failures occur. Input variables include such parameters as flight hours, the number of removals of equipment,

Table 1: The nine input variables and an output variable obtained from the maintenance data.

Parameter            Description
Flight hours (FH)    The total duration of flight for an equipment on different aircraft in a selected time period
RM                   The number of removals of the equipment in the last 24 months
PR                   The number of planned removals of the equipment in the last 24 months
UR                   The number of unplanned removals of the equipment in the last 24 months
OR                   The number of other removals of the equipment in the last 24 months
FR                   The number of faults with removals of the equipment in the last 24 months
FPR                  The number of faults with planned removals of the equipment in the last 24 months
FUR                  The number of faults with unplanned removals of the equipment in the last 24 months
SR                   The number of safe removals of the equipment in the last 24 months
NF (output)          The number of equipment failures in the last 24 months

Table 2: Sample maintenance data.

FH      RM   PR   UR   OR   FR   FPR   FUR   SR   NF
272.8   8    0    8    0    7    0     7     1    7
332.5   6    1    6    1    3    0     3     3    3
329.1   8    0    8    0    6    0     5     2    6
285.2   8    0    7    0    7    1     6     1    7
433.7   12   0    11   0    9    0     10    2    9

Table 3: Description of the selected attributes used in modelling.

Description          The full name of the equipment
Flight hours (FH)    The duration flight of the equipment on the aircraft or different aircraft in the selected period
RM                   The number of removals of the equipment in the last 24 months
UR                   The number of unplanned removals of the equipment in the last 24 months
SR                   The number of safe removals of the equipment in the last 24 months
NF (output)          The number of equipment failures in the last 24 months
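
The attribute ranking behind Table 3 relies on ReliefF. As a rough illustration of the underlying idea only, the sketch below implements the basic two-class Relief weight update (nearest hit and nearest miss), not the full ReliefF and not a regression variant, which a numeric target such as the failure count would require. The function name `relief_weights` and the toy data are hypothetical, and every instance is visited once instead of being sampled at random.

```python
def relief_weights(X, y):
    """Basic two-class Relief: reward features that differ on the nearest miss
    and penalise features that differ on the nearest hit."""
    n, d = len(X), len(X[0])
    # Feature ranges, used to normalise differences to [0, 1].
    ranges = [(max(r[f] for r in X) - min(r[f] for r in X)) or 1.0 for f in range(d)]

    def diff(f, a, b):
        return abs(a[f] - b[f]) / ranges[f]

    def distance(a, b):
        return sum(diff(f, a, b) for f in range(d))

    weights = [0.0] * d
    for i in range(n):
        hits = [j for j in range(n) if j != i and y[j] == y[i]]
        misses = [j for j in range(n) if y[j] != y[i]]
        hit = min(hits, key=lambda j: distance(X[i], X[j]))
        miss = min(misses, key=lambda j: distance(X[i], X[j]))
        for f in range(d):
            weights[f] += (diff(f, X[i], X[miss]) - diff(f, X[i], X[hit])) / n
    return weights

# Hypothetical toy data: feature 0 separates the classes, feature 1 is noise.
X = [[0.0, 0.5], [0.1, 0.9], [0.2, 0.1], [0.9, 0.4], [1.0, 0.8], [0.8, 0.2]]
y = [0, 0, 0, 1, 1, 1]
weights = relief_weights(X, y)
print(weights)
```

Features are then ranked by weight, and the lowest-ranked ones are dropped; in this toy case the informative feature receives a positive weight and the noisy one a negative weight.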

and the number of faults with planned/unplanned removals. These data were analysed and represented in a format suitable for modelling, and variables were characterised with the corresponding domain classification, shown in Table 1. The output variable is the number of equipment failures. A sample of the dataset is provided in Table 2.

Feature selection is carried out using these ten attributes. The number of equipment failures is the target of the analysis. For this purpose, the ReliefF feature selection algorithm was used to find relations and weighting coefficient dependencies. According to the ranked values, the four most effective attributes were selected (Table 3).

Noisy and inconsistent data in the prepared datasets often affect prediction negatively and reduce the performance of the system. Therefore, the modified K-means algorithm was used to eliminate the noisy and inconsistent data to increase the performance of the prediction. It was developed using the pseudocode given in Algorithm 1. In this model, set centers are initially allocated, and instances are distributed to the sets. A predetermined number (N = 5) of records furthest from the center in each set were eliminated. The distance criterion was the Euclidean distance. The eliminated instances are shown in Figure 2.

Seventy-five records (approximately 13%) of the dataset were eliminated by the pseudocode of the proposed data preparation model. Five hundred and ten records were obtained from the 585 records of the dataset. Our proposed hybrid data preparation model is comprised of two stages, as shown in Figure 3. In the first stage, nine input attributes were reduced to four attributes by the ReliefF feature selection algorithm. In the second stage, the dataset was reduced to 510 records by the modified K-means algorithm. The resulting dataset with 510 records was provided as input to the MLP, SVR, and LR prediction algorithms.

4. Experimental Results

A program was developed to gather data for analysis through machine learning algorithms. Selected equipment’s maintenance and operational data were identified. Nine input variables and an output variable were determined. Using the raw data of 585 rows with nine inputs and one output (585 × 10), the MLP, LR, and SVR models were trained and tested. The parameters of the predictors used in the study are provided in Tables 4–6.

To illustrate the performance of the suggested two-phase hybrid system, the prediction results for the raw dataset that

[Block diagram: input dataset → elimination → dataset.]

Figure 2: Data preparation model in modified K-means algorithm.
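
Before records can be eliminated, the dataset has to be clustered. The sketch below shows a plain K-means loop (Lloyd's iteration) that produces the labels and centers such an elimination step would consume. It is an illustration under stated assumptions, not the authors' code: the function name `kmeans`, the toy points, and the deterministic choice of the first k points as initial centroids are all hypothetical.

```python
import math

def kmeans(points, k, max_iter=100):
    """Plain K-means: assign each point to the nearest centroid, then move
    each centroid to the mean of its members, until nothing changes."""
    centers = [tuple(p) for p in points[:k]]  # assumption: first k points as initial centroids
    labels = [0] * len(points)
    for _ in range(max_iter):
        new_labels = [min(range(k), key=lambda c: math.dist(p, centers[c])) for p in points]
        new_centers = []
        for c in range(k):
            members = [p for p, lab in zip(points, new_labels) if lab == c]
            if members:
                new_centers.append(tuple(sum(v) / len(members) for v in zip(*members)))
            else:
                new_centers.append(centers[c])  # keep an empty cluster's centroid unchanged
        if new_labels == labels and new_centers == centers:
            break
        labels, centers = new_labels, new_centers
    return labels, centers

# Hypothetical toy data with two well-separated groups.
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
labels, centers = kmeans(points, k=2)
print(labels, centers)
```

Because the outcome depends on the initial centroids, a practical run would typically repeat the clustering with several initialisations and keep the best result.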

[Block diagram: dataset (585 × 10) → feature selection with the ReliefF algorithm → reduced dataset (585 × 5) → data row elimination with the modified K-means algorithm → reduced dataset (510 × 5).]

Figure 3: Block diagram of the proposed hybrid data preparation model.

is composed of 585 records and 9 attributes are presented in Table 7. Table 7 shows that, based on the CC performance criterion, the best results were provided by the SVR algorithm, while the LR algorithm provided the best results based on the MAE and RMSE performance criteria.

Then, the ReliefF algorithm was applied to the raw data to identify the features that prove the most effective in prediction. The ReliefF feature selection algorithm was applied to the nine input parameters, and according to the ranked values, the last five of them were eliminated. The MLP, LR, and SVR models were built with the selected 585 rows, four inputs, and an output. The dataset (585 × 5) was trained and tested. As seen in Table 8, feature selection clearly improved the MLP results, while the LR results were unchanged and the SVR results changed only marginally. The results provided by the MLP (CC = 0.9127, MAE = 0.7301, RMSE = 0.9853) are better than those of the other algorithms. The performance results for the error parameters in each prediction algorithm are provided in Table 8.

In the final phase, the modified K-means algorithm was applied to the dataset (585 × 5) to eliminate noisy and inconsistent data. The best k value was found to be k = 15. The five records that were farthest from the center of each cluster were eliminated. As a result, 75 rows were eliminated. So, a hybrid model approach was applied to the maintenance data, and the quality of the data was improved. The LR, MLP, and SVR models were built with the selected 510 rows, four inputs, and an output. The selected data (510 × 5) were trained and tested. The results indicated that the models performed better than those built without feature selection and data reduction. The performance of the LR, MLP, and SVR algorithms is presented in Table 9.

As shown in Table 9, based on the CC, MAE, and RMSE performance criteria, the best results were provided by the LR algorithm in the suggested two-phase hybrid system. For the test data, CC = 0.9316, MAE = 0.6807, and RMSE = 0.835 were obtained.

Figures 4–6 provide the linear correlation between the predicted and target results for the test data of LR, SVR, and MLP, respectively. The regression lines between the predicted and target values are y1 = 0.9976x + 0.0155 for LR, y2 = 0.9184x + 0.48 for SVR, and y3 = 0.9999x − 0.0744 for MLP.

Table 4: Tuned SVR parameters.

Parameter            Value
C                    1.0
ε                    0.001
Validation method    10-fold cross validation
Kernel function      Linear

Table 5: Tuned MLP parameters.

Parameter                                 Value
Number of neurons for the hidden layer    2
Hidden layers                             2
Learning rate                             0.2
Momentum                                  0.2
Epoch                                     500
Goal                                      0.005

Table 6: Tuned LR parameters.

Parameter                     Value
Batch size                    100
Attribute selection method    No attribute selection
Ridge                         1.0E−8

Table 7: Performance rating of models for (585 × 10) dataset (9 inputs and 1 output).

Method   CC       MAE      RMSE
LR       0.8967   0.7341   1.0646
MLP      0.8925   0.8125   1.0992
SVR      0.9008   0.741    1.0889

Table 8: Performance rating of the models for (585 × 5) dataset (4 inputs and 1 output).

Method   CC       MAE      RMSE
LR       0.8967   0.7341   1.0646
MLP      0.9127   0.7301   0.9853
SVR      0.9013   0.7415   1.0909

Table 9: Performance rating of models for (510 × 5) dataset (4 inputs and 1 output).

Method   CC       MAE      RMSE
LR       0.9316   0.6807   0.835
MLP      0.9284   0.6816   0.8555
SVR      0.9316   0.7108   0.8558
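
The three criteria reported in Tables 7–9 follow equations (1)–(3). The sketch below computes them in plain Python. The observed values are the NF column of the Table 2 sample; the predicted values are hypothetical numbers chosen only for illustration, so the printed figures are not results from the study.

```python
import math

def mae(observed, predicted):
    """Mean absolute error, equation (1)."""
    return sum(abs(x - y) for x, y in zip(observed, predicted)) / len(observed)

def rmse(observed, predicted):
    """Root mean square error, equation (2)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(observed, predicted)) / len(observed))

def correlation(observed, predicted):
    """Correlation coefficient, equation (3)."""
    mean_x = sum(observed) / len(observed)
    mean_y = sum(predicted) / len(predicted)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(observed, predicted))
    var_x = sum((x - mean_x) ** 2 for x in observed)
    var_y = sum((y - mean_y) ** 2 for y in predicted)
    return cov / math.sqrt(var_x * var_y)

observed = [7, 3, 6, 7, 9]             # NF column of the Table 2 sample
predicted = [6.5, 3.5, 6.0, 7.5, 8.0]  # hypothetical model outputs
mae_value = mae(observed, predicted)
rmse_value = rmse(observed, predicted)
cc_value = correlation(observed, predicted)
print(mae_value, rmse_value, cc_value)
```

Lower MAE and RMSE and a CC closer to 1 indicate better agreement, which is the sense in which the rows of the tables are compared.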

[Scatter plot with fitted line y = 0.9976x + 0.0155; x-axis: Predicted; y-axis: Target.]

Figure 4: Correlation between predicted and target values of the dataset for LR.

[Scatter plot with fitted line y = 0.9184x + 0.48; x-axis: Predicted; y-axis: Target.]

Figure 5: Correlation between predicted and target values of the dataset for SVR.

[Scatter plot with fitted line y = 0.9999x + 0.0744; x-axis: Predicted; y-axis: Target.]

Figure 6: Correlation between predicted and target values of the dataset for MLP.
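
A fitted line of the form y = ax + b, as shown in Figures 4–6 with predicted values on the x-axis and target values on the y-axis, can be obtained with an ordinary least-squares fit. The sketch below is a minimal illustration; the function name `fit_line` and the predicted values are hypothetical, and the target values are the NF column of the Table 2 sample.

```python
def fit_line(x, y):
    """Ordinary least-squares fit of y ≈ slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

predicted = [6.5, 3.5, 6.0, 7.5, 8.0]  # hypothetical model outputs (x-axis)
target = [7, 3, 6, 7, 9]               # NF column of the Table 2 sample (y-axis)
slope, intercept = fit_line(predicted, target)
print(f"y = {slope:.4f}x + {intercept:.4f}")
```

A slope close to 1 and an intercept close to 0 mean that the predictions track the targets without systematic scaling or offset.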

[Line plot “1-fold test data”; x-axis: Times; y-axis: Failures; series: Predicted and Target.]

Figure 7: Test data for the prediction and target values of 1-fold CV for the LR.

[Line plot “1-fold test data”; x-axis: Times; y-axis: Failures; series: Predicted and Target.]

Figure 8: Test data for the prediction and target values of 1-fold CV for the SVR.

[Line plot “1-fold test data”; x-axis: Times; y-axis: Failures; series: Predicted and Target.]

Figure 9: Test data for the prediction and target values of 1-fold CV for the MLP.

The target and predicted values in the test dataset of References


the suggested two-phase hybrid system were provided for
the LR, SVR, and MLP in Figures 7–9, respectively. The [1] Q. Fan and H. Fan, “Reliability analysis and failure prediction of
test results provided in Figures 7–9 are the graphical construction equipment with time series models,” Journal of
Advanced Management Science, vol. 3, no. 3, pp. 203–210, 2015.
representation of the test results of the 1-fold cross
[2] P. Bastos, I. Lopes, and L. Pires, “A maintenance prediction
validation. system using data mining,” in Proceedings of the Proceedings of
Table 9 presents the predicted and target values ob- the World Congress on Engineering, vol. 3, pp. 2–7, London,
tained through the two-phase hybrid system in UK, July 2012.
Figures 7–9. As seen in Figures 7–9, the proposed hybrid [3] I. U. Din, M. Guizani, J. J. P. C. Rodrigues, S. Hassan, and
data preparation model increased the performance of V. V. Korotaev, “Machine learning in the Internet of Things:
prediction models LR, SVR, and MLP. The suggested designed techniques for smart cities,” Future Generation
hybrid system helped attain higher accuracy in prediction Computer Systems, vol. 100, pp. 826–843, 2019.
as it enabled us to select the most effective features and [4] B. Jan, H. Farman, M. Khan, M. Talha, and I. U. Din, “De-
eliminate noisy or redundant data that could lower the signing a smart transportation system: an internet of things
accuracy of predictions. and big data approach,” IEEE Wireless Communications,
vol. 26, no. 4, pp. 73–79, 2019.
[5] I. U. Din, M. Guizani, S. Hassan et al., “The internet of things:
a review of enabled technologies and future challenges,” IEEE
5. Conclusions

In aviation, maintenance data are critical to the analysis of reliability and maintenance costs because predictive maintenance can be scheduled in line with estimates. The main aims of predictive maintenance are to predict equipment failures and to plan spare-part strategies for system components so that the reliability and maintainability of a complex repairable system can be analyzed. In this study, a hybrid data preparation model was applied to the landing gear system maintenance dataset, using the ReliefF feature selection algorithm to select attributes and a modified K-means algorithm to eliminate noisy and inconsistent data. The proposed hybrid data preparation method was put into practice through LR, SVR, and MLP models. The results indicated that the LR model performed better than the MLP and SVR models in predicting failure counts and that the proposed hybrid data preparation model significantly improves the accuracy of failure count prediction. This study could serve as a guide for using hybrid data preparation methods in machine learning algorithms and data mining.

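A two-phase data preparation step of the kind summarized above can be sketched roughly as follows. This is not the authors' implementation: the feature-ranking step uses absolute Pearson correlation as a simple stand-in for ReliefF, and the elimination step uses plain K-means (Lloyd's algorithm) with an assumed distance rule rather than the modified K-means of the study. All function names, the `factor` threshold, and the toy data are hypothetical.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient; returns 0.0 if either input is constant."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy) if sx and sy else 0.0


def select_features(rows, targets, n_keep):
    """Phase 1 (stand-in for ReliefF): keep the n_keep columns most correlated with the target."""
    n_features = len(rows[0])
    scores = [abs(pearson([r[j] for r in rows], targets)) for j in range(n_features)]
    ranked = sorted(range(n_features), key=lambda j: scores[j], reverse=True)
    return sorted(ranked[:n_keep])


def assign(points, centroids):
    """Label each point with the index of its nearest centroid."""
    return [min(range(len(centroids)), key=lambda c: math.dist(p, centroids[c]))
            for p in points]


def k_means(points, k, n_iter=20):
    """Plain Lloyd's algorithm, initialized deterministically from the first k points."""
    centroids = [list(p) for p in points[:k]]
    for _ in range(n_iter):
        labels = assign(points, centroids)
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return centroids, assign(points, centroids)


def eliminate_noisy(points, targets, k, factor=2.0):
    """Phase 2 (assumed rule): drop samples farther than factor * mean distance from their centroid."""
    centroids, labels = k_means(points, k)
    dists = [math.dist(p, centroids[lab]) for p, lab in zip(points, labels)]
    keep = []
    for c in range(k):
        idx = [i for i, lab in enumerate(labels) if lab == c]
        if idx:
            mean_d = sum(dists[i] for i in idx) / len(idx)
            keep.extend(i for i in idx if dists[i] <= factor * mean_d)
    keep.sort()
    return [points[i] for i in keep], [targets[i] for i in keep]


if __name__ == "__main__":
    # Made-up toy records: three candidate features and a failure count per record.
    rows = [[1, 5, 2], [2, 3, 4], [3, 5, 6], [4, 3, 8], [40, 5, 90]]
    counts = [1, 2, 3, 4, 50]
    cols = select_features(rows, counts, n_keep=2)
    reduced = [[r[j] for j in cols] for r in rows]
    clean_rows, clean_counts = eliminate_noisy(reduced, counts, k=1)
    print("selected columns:", cols)
    print("records kept:", len(clean_rows), "of", len(rows))
```

The cleaned rows and targets would then be passed to the LR, SVR, or MLP model. One limitation of this simple rule is that an outlier that ends up as the only member of a cluster has zero distance to its centroid and is kept, so the choice of `k` and of the threshold matters.
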
Data Availability

The maintenance data used to support the findings of this study have not been made available because sharing the data might compromise data privacy. Moreover, the authors are not allowed to share these data due to security concerns.

Conflicts of Interest

The authors declare that they have no conflicts of interest.

Acknowledgments

This study was supported by the Scientific Research Project of Havelsan and Presidency of Defence Industries project, grant no. HVL-SÖZ-18/033.

[18] T. Benkedjouh, K. Medjaher, N. Zerhouni, and S. Rechak,


“Remaining useful life estimation based on nonlinear feature
reduction and support vector regression,” Engineering Ap-
plications of Artificial Intelligence, vol. 26, no. 7, pp. 1751–
1760, 2013.
[19] M. D. C. Moura, E. Zio, I. D. Lins, and E. Droguett, “Failure
and reliability prediction by support vector machines re-
gression of time series data,” Reliability Engineering & System
Safety, vol. 96, no. 11, pp. 1527–1534, 2011.
[20] K. Kira and L. A. Rendell, “Feature selection problem: tra-
ditional methods and a new algorithm,” in Proceedings Tenth
National Conference on Artificial Intelligence, pp. 129–134,
1992.
[21] K. Kira and L. A. Rendell, “A practical approach to feature
selection,” Machine Learning Proceedings 1992, vol. 1992,
pp. 249–256, 1992.
[22] I. Kononenko, “Estimating attributes: analysis and extensions
of RELIEF,” in Lecture Notes in Computer Science (including
subseries Lecture Notes in Artificial Intelligence and Lecture
Notes in Bioinformatics), vol. 784, pp. 171–182, Springer,
Berlin, Germany, 1994.
[23] J. MacQueen, “Some methods for classification and analysis of
multivariate observations,” in Proceedings of the Fifth Berkeley
Symposium on Mathematical Statistics and Probability, Ber-
keley, CA, USA, 1967.
[24] N. Nadai, A. H. A. Melani, G. F. M. Souza, and S. I. Nabeta,
“Equipment failure prediction based on neural network
analysis incorporating maintainers inspection findings,” in
Proceedings of the Annual Reliability and Maintainability
Symposium, Orlando, FL, USA, January 2017.
[25] T. S. Lan, P. C. Chen, M. Y. Wang, K. S. Hsu, and T. Y. Chen,
“A study of using back-propagation network to predict air-
craft component life span,” in 2016 International Conference
on Applied System Innovation,” in Proceedings of the IEEE
ICASI 2016, Okinawa, Japan, May 2016.
[26] S. Oladokun, “Predicting mean time between failures of a
maintained equipment using artificial neural network,”
American Journal of Scientific and Industrial Research, vol. 1,
no. 3, pp. 500–503, 2010.
[27] P. S. Rajpal, K. S. Shishodia, and G. S. Sekhon, “An artificial
neural network for modeling reliability, availability and
maintainability of a repairable system,” Reliability Engineering
and System Safety, vol. 91, no. 7, pp. 809–819, 2006.
[28] M. Mahmudul, A. Mia, S. K. Biswas, M. C. Urmi, and
A. Siddique, “An algorithm for training multilayer perceptron
MLP for image reconstruction using neural network without
overfitting,” International Journal of Scientific & Technology
Research, vol. 4, no. 2, pp. 271–275, 2015.
[29] C. Cortes and V. Vapnik, “Support-vector networks,” Ma-
chine Learning, vol. 20, no. 3, pp. 273–297, 1995.
[30] M. Schuld, I. Sinayskiy, and F. Petruccione, “Prediction by
linear regression on a quantum computer,” Physical Review A,
vol. 94, no. 2, 2016.
[31] M. Buyukyildiz and S. Y. Kumcu, “An estimation of the
suspended sediment load using adaptive network based fuzzy
inference system, support vector machine and artificial neural
network models,” Water Resources Management, vol. 31,
no. 4, pp. 1343–1359, 2017.
