Architecture and Algorithm Design for Civil Aviation Data Real-Time Analysis System
ABSTRACT Currently, the analysis of civil aviation flight data relies primarily on post-flight data: onboard data is transmitted to the servers of the airline and the aircraft manufacturer over ground-based cellular stations or wireless hotspots after the aircraft has landed and the engines have been shut down. Real-time data analysis during flight remains a challenge. With the advent of integrated communication networks in the aerospace sector, the real-time transmission of onboard data is poised to become a future trend, but real-time civil aviation data analysis systems are still lacking. Based on the Kappa distributed computing architecture, this study establishes a real-time aviation data analysis system that takes a Quick Access Recorder (QAR) data stream as input and provides fast data decoding and multiple real-time analysis algorithms, including flight trajectory outlier repair, flight phase identification, steady cruising state determination, and engine lubricating oil quantity monitoring. The results of simulation experiments indicate that the maximum average latency of each individual algorithm is consistently below 250 milliseconds, meeting real-time performance requirements, and that all real-time algorithms perform similarly to or better than existing post-flight algorithms. Specifically, the flight trajectory outlier repair algorithm repairs all outliers and takes 0.2 ms to process an outlier. The flight phase identification achieves 97.8% accuracy relative to the results of commercial software. The steady cruising state determination is approximately five times faster than the naive method. The proposed engine lubricating oil quantity monitoring method, based on the Clustering Optimized Transformer (CO-Transformer) neural network, yields an average mean square error of approximately 0.057 L² between predicted and actual values, which is 7.36%, 66.01%, and 66.05% better than the common Transformer, LSTM, and GRU neural networks, respectively, satisfying the airborne leakage monitoring requirements for engine lubrication systems. The system also exhibits strong load capacity and excellent scalability. Under the employed hardware conditions, the system can concurrently process practical real-time QAR data streams for up to 505 civil aircraft in simulated experiments.
INDEX TERMS Distributed data processing, machine learning, aviation data analysis.
the operational status and high-level information of aircraft’s critical components, and enhance the safety and reliability of civil aviation. The real-time transmission of recorded data from onboard sensors is poised to become a new trend. However, research on real-time analysis systems for aviation onboard sensor data is still in its nascent stage. Current methods are confined to standalone software that provides only fundamental decoding capabilities, namely converting raw binary data records in aviation-specific formats into sensor-recorded engineering values [4], [5]. These existing systems do not consider streaming architectures or big data processing, and they lack further in-depth analysis and cutting-edge machine learning methods. Thus, they may fall short in real-time civil aviation data processing and in-depth analysis. This study proposes the design of a real-time data analysis software system for civil aviation based on the Kappa [6] streaming processing architecture, which is one of the main contributions of this paper and addresses the gap in real-time civil aviation data analysis systems. The system is oriented towards real-time Quick Access Recorder (QAR) data streams and enables fast data decoding. Multiple real-time analysis algorithms have been developed in the system; they are also main contributions and serve as case studies on deploying in-depth data analysis on a real-time system. They include flight trajectory outlier repair, flight phase identification, steady cruising state determination, and engine lubricating oil quantity monitoring.

In aviation data analysis, flight trajectory outlier repair, flight phase identification, and steady cruising state determination are foundational for in-depth analysis such as monitoring the performance of critical aircraft components. Existing real-time onboard data processing software systems [4], [5] do not incorporate functionalities such as flight trajectory outlier repair, flight phase identification, and steady cruising state determination. Methods employed in post-flight batch processing cannot be directly applied to real-time analysis. For instance, for outlier detection in flight trajectory repair, commonly employed methods include the Local Outlier Factor (LOF) algorithm proposed by Breunig et al. [7] and the Isolation Forest algorithm utilized by Cheng et al. [8]. Sameer et al. [9] applied LOF to anomaly detection in flight data. Regarding flight phase identification, Hou et al. proposed a flight phase division approach based on a decision tree model [10]. Zhang et al. [11] employed the Time Interval-based Clustering [12] (TICC) algorithm and Density-Based Spatial Clustering of Applications with Noise [13] (DBSCAN). Sun et al. [14] and Wang et al. [15] both utilized the DBSCAN algorithm, while Kuzmenko et al. [16] applied a Maximum Posteriori Probability model to flight phase identification based on post-flight data. These models all show high accuracy in their respective works, but all require a batch of data for pre-processing or model learning, making it challenging to deploy them in real-time data analysis systems for civil aviation. Regarding the determination of steady state during the cruising phase, which identifies whether the aircraft is steady in its cruising phase from information on its attitude and environment, there is currently limited research on algorithms for assessing the cruising steady state of civil aviation aircraft. Yang [17], in a study on the regional aircraft Xinzhou 600, achieved the determination of engine cruising steady state. Cao et al. [18] proposed a method that evaluates a list of judgment conditions to identify steady-state values. In Wang’s study [19], the judgment conditions were still calculated with the common naive method based on a sliding window. This method is not optimized for real-time use and shows a higher time cost in the simulation experiment. Presently, the determination of steady cruising state in post-flight analysis employs a naive computational method for evaluating the conditions, incurring high computational costs. Therefore, there is a need to design efficient discriminative algorithms for real-time analysis.

To address these issues, this paper proposes three efficient algorithms suitable for real-time aviation data analysis, laying the foundation for subsequent real-time monitoring of the performance of critical aircraft components.
1. Flight trajectory outlier repair. The proposed algorithm considers why outliers emerge, based on the characteristics of the trajectory data recording format. Given the rate at which a practical aircraft trajectory changes, it uses the trajectory changing rate to dynamically detect anomalous trajectory points in real time and subsequently restores the trajectory values to their original true values. The algorithm can promptly identify and accurately rectify trajectory anomalies, thereby enhancing the system’s precision in tracking aircraft flight trajectories.
2. Flight phase identification. A perceptual real-time algorithm is proposed that takes multiple real-time parameters as input, dynamically assesses the aircraft’s environmental conditions based on several identifying criteria, and calculates the real-time flight phase. Experimental results demonstrate that, compared with the results of a post-flight identification method, the real-time method achieves a consistency rate of 97.8%, close to the results of the recent post-flight studies cited above. Based on real-time flight phase values, reliable support can be offered for further analyses [20].
3. Steady cruising determination, which determines whether the aircraft is in a steady cruising state. The algorithm uses a monotonic queue to compute the interval extrema of real-time data streams and employs a queue to calculate the interval mean and sample variance. This algorithm improves the performance of calculating intermediate values and reduces the overall computation latency of the module. Compared with the naive method deployed in many recent studies, the proposed method is approximately five times faster, achieving rapid determination of steady state during the cruising phase.

Building upon these algorithms, this paper also presents a machine learning case study, focusing on the aircraft engine
(Algorithm 1). The algorithm sets a maximum discriminative threshold m for the latitude and longitude changing rate, which is represented as the difference between the current and previous values. When the absolute value of the difference exceeds the threshold m, indicating a mismatch between the high and low parts, a trajectory anomaly occurs. According to the ARINC 717 data format, the low-part data of both latitude and longitude experience a numerical carry when they grow to 0.7032832°. Therefore, the threshold m needs to be no higher than 0.7032832°. At the same time, to prevent misjudgment of correct data, m must be greater than the maximum correct difference. Considering that the maximum flight speed of civil aviation aircraft does not exceed 1000 kilometers per hour, this paper sets m = 0.5°. In Algorithm 1, if the absolute difference between the current trajectory point and the previous one is over the threshold, the low-part value c_i is restored to c_{i−1}, at which no numerical carry or borrow occurred. Subsequently, the high-part and low-part data are combined to obtain the complete data value d_i, which is then written to the database.

C. FLIGHT PHASE IDENTIFICATION
This paper proposes a perception-based real-time flight phase identification method. It uses the flight phase at the previous moment together with the current parameter data, employing multiple identifying criteria to perceive the aircraft’s flight phase and environmental conditions. This process calculates the real-time flight phase at the current moment. The specific identifying workflow is illustrated in Figure 5. The thresholds in the conditions are determined by the specific dataset.

The algorithm dynamically divides the aircraft’s flight process into 14 flight phases using six parameters listed in Table 2. The main process is outlined as follows:
1. In the initial engine-stopped phase, the algorithm uses the high-pressure rotor speed of the engine as a criterion to determine the aircraft’s engine-started and taxi-out phases.
2. After the taxi-out phase, Wang et al. [15] utilized the data from the lift devices (slats and flaps) to partition the flight phases. In this study, we further incorporate radio altitude and relative barometric altitude (H = ALT − ALT_g) to distinguish the takeoff, initial climb, and climb phases. This helps eliminate the impact of variations in lift device models and pilot-induced delays in operating the lift devices, enhancing the algorithm’s overall applicability.
3. Upon entering the stratosphere and setting the route, the algorithm employs a 5-second average rate of change of barometric altitude to determine the aircraft’s vertical motion attitude, thus distinguishing between climb, cruise, and descent phases. The partitioning condition is expressed as (1), where the threshold value m can be adjusted based on the aircraft model [15].
4. As the aircraft approaches the destination, the algorithm employs vertical motion attitude, slat and flap status, and radio altitude data to partition the initial approach, approach, and final approach phases. When the radio altitude reaches zero, indicating landing, the algorithm utilizes ground speed (GS) to determine entry into the flare phase.

$$
\begin{cases}
\mathrm{dALT} < -m, & \text{DESCENT} \\
-m \le \mathrm{dALT} \le m, & \text{CRUISE} \\
m < \mathrm{dALT}, & \text{CLIMB}
\end{cases} \tag{1}
$$

D. STEADY CRUISING STATE DETERMINATION
This subsection introduces the algorithm for determining the steady state during the cruise phase. Steady state during the cruise phase refers to the condition where commercial aircraft maintain stable flight by controlling factors such as speed, altitude, and attitude, while accounting for external factors such as weather conditions and turbulence. The determination of steady cruising state plays a crucial role in aviation data
TABLE 2. Parameters used in flight phase identification.
TABLE 3. Parameters used in flight phase identification.

Algorithm 2 Computing interval maximum/minimum based on sliding window monotonic queue
Data: x is a data record, q is a double-ended queue for data buffering, where h is the head element and t is the tail element of the queue.
Require: Maximum/minimum value of the required interval
/* Assume T(·) and V(·) return the timestamp and value of a data record, respectively */
h ← q.headElement();
while h ≠ null and T(h) < T(x) − M do   /* window length M */
    q.popHead();
    h ← q.headElement();
end
t ← q.tailElement();
while t ≠ null and V(t) ≤ V(x) do   /* for the maximum value, V(t) ≤ V(x); for the minimum value, V(t) ≥ V(x) */
    q.popTail();
    t ← q.tailElement();
end
q.pushAsTail(x);
h ← q.headElement();
return h
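As a minimal sketch of the monotonic-queue procedure in Algorithm 2, the sliding-window maximum could be written in Python with `collections.deque`. The class name `SlidingWindowMax`, the function `sliding_max`, and the `(timestamp, value)` tuple layout are assumptions made for illustration; they are not taken from the system's implementation.

```python
from collections import deque


class SlidingWindowMax:
    """Sliding-window maximum over a stream of (timestamp, value) records.

    Hypothetical helper illustrating Algorithm 2; all names are assumptions.
    """

    def __init__(self, window_length):
        self.window_length = window_length  # M in Algorithm 2
        self.queue = deque()  # buffered values stay in strictly decreasing order

    def push(self, timestamp, value):
        # Drop records that fell out of the window: T(h) < T(x) - M.
        while self.queue and self.queue[0][0] < timestamp - self.window_length:
            self.queue.popleft()
        # Drop tail records that can never be the maximum again: V(t) <= V(x).
        while self.queue and self.queue[-1][1] <= value:
            self.queue.pop()
        self.queue.append((timestamp, value))
        # The head now holds the maximum of the current window.
        return self.queue[0][1]


def sliding_max(values, window_length):
    """Return the running window maximum for values sampled at t = 0, 1, 2, ..."""
    tracker = SlidingWindowMax(window_length)
    return [tracker.push(t, v) for t, v in enumerate(values)]
```

For the interval minimum, the tail comparison would be flipped to `>=`, so that the queue stays in increasing order. Each record is appended once and removed at most once, which is where the amortized O(1) cost per record comes from.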
analysis. For instance, analyzing fuel consumption during the steady cruising state provides insights into the aircraft’s performance degradation, guiding maintenance efforts [45], [46]. Additionally, analyzing flight attitudes during the steady state of the cruise phase helps detect attitude anomalies, guiding adjustments to aircraft control surfaces [47]. Deploying a real-time algorithm for determining steady state during the cruise phase can provide support for further data analysis.

The criteria for determining the steady state during the cruise phase include maintaining stability in various parameters such as aircraft altitude, flight speed, aircraft attitude, and atmospheric static temperature over a specified time interval (Table 3). The mean(∗), std(∗), min(∗), and max(∗) in Table 3 represent the mean, standard deviation, minimum, and maximum values of the parameters over the time interval, respectively.

This paper proposes a fast real-time determination method for steady cruising state, improving the calculation process for interval extrema, mean, and sample variance. For the maximum and minimum interval values, a monotonic queue algorithm [48] is employed to reduce the number of redundant calculations, as detailed in Algorithm 2. Taking the calculation of the maximum value within a window interval for a specific parameter as an example, a queue is established for that parameter, and its elements are always kept in descending order. When new data for the parameter is received at a new timestamp, the window interval is updated first and outdated data is removed from the queue. Then, the new data is compared with the elements in the queue starting from the tail until an element greater than the new data is encountered. At that point the comparison stops, the elements smaller than or equal to the new value are removed from the queue, and the new data is added to the tail of the queue. Assuming a parameter has a total data count of N within the cruising phase and the window interval length is M, the enqueue and dequeue operations are each performed N times overall, since each timestamp’s record enters and exits the queue only once. The query operations are performed N − M + 1 times. The time complexity for all cruising data is O(N), and the amortized complexity is O(1). The minimum value is handled analogously.

For the calculation of the window interval mean and sample variance, this paper introduces a fast algorithm that utilizes queues to update the real-time prefix sum [49] and prefix square sum [49] for mean and variance computation, reducing the computational complexity. For a parameter of time-series data, a queue q is established to buffer parameter data, and two state variables s and ss record the total sum and total square sum of the elements in the queue, respectively. As the real-time data stream arrives, elements are dynamically popped out of and pushed into the queue q to keep a constant queue length of M, where M is the interval length of the sliding window, and the state variables s and ss are updated at the same time. By combining the state variables s and ss with the queue length L(q), the interval mean and interval sample variance within a window can be calculated. Refer to Algorithm 3 for the detailed algorithmic process. For a parameter with a total of N data records in the cruise phase, there are N − M + 1 data windows. Each data record is pushed into and popped out of the queue only once. For each window, the state variables s and ss, as well as the interval mean and sample variance, are recalculated only when queue elements are updated. Therefore, the total time complexity for calculating all windows of a single parameter is O(N), amortized to O(1) per timestamp.

V. CO-TRANSFORMER
This section introduces the CO-Transformer model, which is used for real-time monitoring of engine lubricating oil
TABLE 4. Symbols that CO-Transformer utilizes and their meanings.

Algorithm 3 Computing interval mean and sample variance based on sliding window queue
Data: x is a data record, q is a double-ended queue for data buffering, where h is the head element and t is the tail element of the queue; s and ss are the running sum and running square sum of the values in q.
Require: Interval mean µ, interval sample variance σ.
/* Assume L(·) returns the length of the queue; T(·) and V(·) return the timestamp and value of a data record */
h ← q.headElement();
while h ≠ null and T(h) < T(x) − M do   /* window length M */
    s ← s − V(h);
    ss ← ss − [V(h)]²;
    q.popHead();
    h ← q.headElement();
end
q.pushAsTail(x);
s ← s + V(x);
ss ← ss + [V(x)]²;
µ ← s / L(q);
σ ← [ss − s² / L(q)] / [L(q) − 1];
return µ, σ
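The running-sum update in Algorithm 3 can be sketched in Python as follows. This is an illustrative sketch only: the class name `SlidingWindowStats`, its attribute names, and the choice to return 0.0 as the variance of a single record are assumptions, not details given in the paper.

```python
from collections import deque


class SlidingWindowStats:
    """Sliding-window mean and sample variance over (timestamp, value) records.

    Hypothetical helper illustrating Algorithm 3; all names are assumptions.
    """

    def __init__(self, window_length):
        self.window_length = window_length  # M in Algorithm 3
        self.queue = deque()
        self.total = 0.0          # s: running sum of buffered values
        self.total_squares = 0.0  # ss: running sum of squared values

    def push(self, timestamp, value):
        # Evict expired records and subtract them from both running sums.
        while self.queue and self.queue[0][0] < timestamp - self.window_length:
            _, old = self.queue.popleft()
            self.total -= old
            self.total_squares -= old * old
        # Add the new record to the queue and to both running sums.
        self.queue.append((timestamp, value))
        self.total += value
        self.total_squares += value * value
        n = len(self.queue)
        mean = self.total / n
        if n < 2:
            # Sample variance is undefined for one record; 0.0 is an assumed convention.
            return mean, 0.0
        variance = (self.total_squares - self.total * self.total / n) / (n - 1)
        return mean, max(variance, 0.0)  # clamp tiny negative rounding errors
```

One design caveat: the sum-of-squares form is cheap to update but can lose floating-point precision when values are large relative to their spread, so a production version might subtract a fixed offset from each value before accumulating.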
quantity. The lubrication system of an aviation engine plays a crucial role, and a malfunction can lead to engine shutdown, significantly impacting flight safety [50]. Currently, fault monitoring of the lubrication system relies largely on experience and qualitative analysis, lacking precision and quantification [51]. There is still a lack of real-time calculation methods for the baseline value of lubricating oil quantity during flight. This paper proposes a CO-Transformer model, which uses a clustering algorithm to optimize the Transformer. The model takes 15 parameters (Table 4), including engine high-pressure rotor speed, aircraft altitude, flight speed, and attitude, as inputs, with the baseline value of lubricating oil quantity as the output. When a significant deviation is observed between the monitored actual oil quantity during flight and the calculated baseline value, which suggests lubricant leakage, an alarm is triggered. This prompts both the flight crew and ground maintenance personnel to take necessary measures to enhance flight safety.

The CO-Transformer model is based on the Bagging concept in ensemble learning within the context of a random forest [52]. It integrates multiple Transformer Encoders as sub-models, utilizes the Bootstrap method to generate training datasets for these sub-models, and employs the DBSCAN clustering algorithm to evaluate and select sub-models that exhibit superior performance. The Transformer Encoder sub-models use an attention mechanism to depict the significance of past temporal data for predicting values in sequence data, and its calculation formula (2) is expressed as follows.

$$
\mathrm{Attention}(Q, K, V) = \mathrm{softmax}\!\left(\frac{Q K^{T}}{\sqrt{d_k}}\right) V \tag{2}
$$

where Q, K, and V are matrices computed from the input and d_k is the dimension of K. Their calculation formula (3) is expressed as follows.

$$
Q = W_Q X, \quad K = W_K X, \quad V = W_V X \tag{3}
$$

where W_Q, W_K, and W_V are weight matrices learned during the training process, and X is the input data matrix. The parameters the data set utilizes are shown in Table 4. The attention unit can horizontally expand into a multi-head attention mechanism, and its output is the weighted sum of the outputs from multiple attention units (4).

$$
\mathrm{MultiHeadAttention}(Q, K, V) = \mathrm{concat}_{i \in \mathrm{heads}}\big(\mathrm{Attention}(Q_i, K_i, V_i)\big)\, W_O \tag{4}
$$

where heads is the set of attention heads and W_O is a weight matrix. The output from one Encoder is further processed through an activation layer, an average pooling layer, and a fully connected layer to obtain the final output, which represents the predicted baseline value of lubricating oil quantity for an individual sub-model. The loss function value l between the predicted values from the Encoder and the true values is used to update the sub-model’s parameters. Refer to Figure 6 for an illustrative diagram of the sub-model computation, where y represents the output baseline value of lubricating oil quantity.

To introduce diversity to the sub-models, Bootstrap sampling [26] is employed to partition the training data into distinct training subsets. These subsets are independently used for training each sub-model, enhancing the overall model’s generalization capability and mitigating overfitting.

The CO-Transformer model employs the DBSCAN clustering algorithm to filter out sub-models with superior performance. Model selection uses the sub-model loss function value (l in Figure 6) as the clustering metric. After z-score normalization [53], the loss values are input into the DBSCAN clustering model, which clusters the sub-models into multiple model clusters based on the density of the metric. The Mean Square Error (MSE) is used as the loss function for filtering because, compared to the Mean Absolute Error (MAE), the differences in normalized MSE are more pronounced among the sub-models. The formulas of MSE and MAE are shown
TABLE 6. The results of the two types of latency for each module.
FIGURE 10. Comparison of results of real-time flight phase identification method and commercial post-flight data analysis software.
FIGURE 11. Metric values of different models in training dataset and validating dataset during the whole training process.
FIGURE 12. Comparison of results of real-time flight phase identification method and commercial post-flight data analysis software.
TABLE 7. As the average throughput increases, the average cluster CPU occupancy and the average type 1 latency of each module.
Kappa architecture and various algorithm modules. We analyzed the system’s load capacity by recording two types of delays. The first type of latency starts with the entry of data packets into the system and ends with each module completing the calculation and outputting the result. We calculated the average latency of each algorithm module, including queue delay, algorithm processing delay, and inter-module data interaction delay. The second type of latency starts with the module receiving data and ends with the module completing the calculation and outputting the result, measuring the module’s inherent processing delay. The overall system latency is the maximum latency among all modules. The latency test results are shown in Table 6, where both types of latency for each module are below 250 milliseconds. This is less than the QAR data packet sending interval (1 second), ensuring that there is no data packet accumulation and meeting the real-time aviation data processing requirements. Compared with the Type 1 latency of the decoding module in Qin’s work (71.45 ms) [4], this system achieves a significantly lower latency (12 ms) in the experimental hardware environment, which leaves more time for other complex data analysis algorithms.

TABLE 8. The regression formulas for the first type of average latency in each module with the system throughput as the independent variable.

For the system’s load testing, the average throughput of QAR data for a single flight in the simulated data stream is 164 kbps. In the tests, we increased the number of parallel flights sending real-time data streams to simulate a high-load data stream environment. We then recorded and analyzed the average CPU usage across all servers in the cluster and the first type of average delay for each algorithm module. The experimental results are presented in Table 7, where the average CPU utilization of the cluster shows a linear increase with the rising throughput. Based on the experimental results, a regression formula for the average CPU utilization as a function of the average throughput is derived as L(x) = 0.701x − 11.85, where x is the system’s average throughput and L(x) is the average CPU utilization. For each algorithm module, the regression formula between its first type of latency and the system’s average throughput can be obtained, as shown in Table 8, where F_i(x) represents the first type of delay for the i-th algorithm module.

According to the system performance formulas, under the constraints of CPU utilization and system processing delay, a theoretical model maximizing the system throughput can be established, as shown in (7), where the optimization objective is the system throughput. The constraints F_i(x) < 1000 require that the first type of delay for the i-th algorithm module is less than the real-time data packet sending interval (1000 milliseconds), and the constraint L(x) < 100% requires that the average CPU utilization stays below 100%.

$$
\max\; x \quad \text{s.t.} \quad
\begin{cases}
F_i(x) < 1000, & i = 1, 2, \ldots, 5 \\
L(x) < 100\%
\end{cases} \tag{7}
$$

By employing the linear programming method to solve the optimization problem, the maximum system throughput can be obtained. Under the hardware conditions used in the experiment, the maximum throughput is 79 Mbps, corresponding to 505 civil aircraft concurrently sending real-time data streams. The analysis results indicate a strong system load capacity, capable of meeting high-concurrency real-time data analysis and processing scenarios for multiple flights in actual flight routes.

VII. CONCLUSION
This paper designs and implements a real-time aviation data processing system based on the Kappa streaming processing architecture. Multiple real-time analysis algorithms, including flight trajectory outlier repair, flight phase identification, steady cruising state determination, and engine lubricant oil quantity monitoring, have been developed. This provides a novel data analysis solution for future real-time transmission scenarios of onboard sensors in civil aviation. Simulation test results demonstrate that the maximum average delay for individual algorithms is consistently below 250 milliseconds. The algorithms can run in parallel, meeting real-time analysis requirements. Specifically, the improved steady cruising state determination algorithm is approximately five times faster than the naive non-optimized method. The CO-Transformer neural network-based lubricant oil quantity monitoring method achieved an average mean square error of approximately 0.057 L² between predicted and actual values, showing significant performance improvement compared to existing methods and meeting the requirements for monitoring airborne leaks in the lubricant system. The system shows strong load capacity and good scalability, filling the gap in real-time civil aviation data analysis systems and enabling further research on real-time analysis of aircraft fuel efficiency and critical systems such as control surfaces, hydraulics, and avionics.

However, the system still has shortcomings that need attention in future work. Kumari et al. [54], [55] surveyed the existing literature on multimedia big data (MMBD), computing security, and other challenges of big data processing in IoT. Encrypted data transfer and private computing, which are the most important cybersecurity aspects of this solution, deserve more work both in this system and in the field of aviation big data. Moreover, fog computing (or edge computing) in industry also deserves attention, especially in the new 5G environment [56], [57]. This technology can be introduced to reduce the load on the central cluster and the throughput of the ground network; for example, data compression and pre-processing can be deployed at ground stations or edge servers.
[44] J. Kim and J. Lyou, “Enhanced QAR flight data encoding and decoding algorithm for civil aircraft,” in Proc. SICE-ICASE Int. Joint Conf., Oct. 2006, pp. 5169–5173.
[45] J. L. Speyer, “Nonoptimality of the steady-state cruise for aircraft,” AIAA J., vol. 14, no. 11, pp. 1604–1610, Nov. 1976.
[46] K. N. Amrutha, Y. K. Bharath, and J. Jayanthi, “Aircraft engine fuel flow parameter prediction and health monitoring system,” in Proc. 4th Int. Conf. Recent Trends Electron., Inf., Commun. Technol. (RTEICT), May 2019, pp. 39–44, doi: 10.1109/RTEICT46194.2019.9016703.
[47] G. Zhang, Z. Chen, and M. Xu, “High-speed aircraft position and attitude control using reinforcement learning,” in Proc. 2nd Int. Conf. Robot., Artif. Intell. Intell. Control (RAIIC), Aug. 2023, pp. 156–161, doi: 10.1109/raiic59453.2023.10280846.
[48] C. E. Leiserson, R. L. Rivest, T. Cormen, and C. Stein, Introduction to Algorithms, vol. 3. Cambridge, MA, USA: MIT Press, 1994.
[49] G. E. Blelloch, “Prefix sums and their applications,” School Comput. Sci., Carnegie Mellon Univ., Pittsburgh, PA, USA, Tech. Rep., Nov. 1990. [Online]. Available: http://shelf2.library.cmu.edu/Tech/23445461.pdf
[50] G. Pierre, “Monitoring of the lubrication system of an aircraft engine through a prognostic and health monitoring approach,” M.S. thesis, School Ind. Eng. Manag., KTH, 2015.
[51] E. D. Lledo, “Diagnostic et pronostic de défaillances dans des composants d’un moteur d’avion,” Ph.D. thesis, Toulouse III, 2008. Accessed: Jan. 18, 2024. [Online]. Available: https://www.theses.fr/2008TOU30109
[52] G. Biau and E. Scornet, “A random forest guided tour,” TEST, vol. 25, no. 2, pp. 197–227, Jun. 2016, doi: 10.1007/s11749-016-0481-7.
[53] V. N. G. Raju, K. P. Lakshmi, V. M. Jain, A. Kalidindi, and V. Padma, “Study the influence of normalization/transformation process on the accuracy of supervised classification,” in Proc. 3rd Int. Conf. Smart Syst. Inventive Technol. (ICSSIT), Aug. 2020, pp. 729–735, doi: 10.1109/ICSSIT48917.2020.9214160.
[54] A. Kumari, S. Tanwar, S. Tyagi, N. Kumar, M. Maasberg, and K.-K.-R. Choo, “Multimedia big data computing and Internet of Things applications: A taxonomy and process model,” J. Netw. Comput. Appl., vol. 124, pp. 169–195, Dec. 2018.
[55] A. Kumari, S. Tanwar, S. Tyagi, and N. Kumar, “Verification and validation techniques for streaming big data analytics in Internet of Things environment,” IET Netw., vol. 8, no. 3, pp. 155–163, May 2019.
[56] A. Kumari, S. Tanwar, S. Tyagi, and N. Kumar, “Fog computing for Healthcare 4.0 environment: Opportunities and challenges,” Comput. Electr. Eng., vol. 72, pp. 1–13, Nov. 2018.
[57] A. Kumari, S. Tanwar, S. Tyagi, N. Kumar, M. S. Obaidat, and J. J. P. C. Rodrigues, “Fog computing for smart grid systems in the 5G environment: Challenges and solutions,” IEEE Wireless Commun., vol. 26, no. 3, pp. 47–53, Jun. 2019.

QI XI received the bachelor’s and Ph.D. degrees in information and communication engineering from Shanghai Jiao Tong University, in 2017.
He used to be the Key Engineer of the Data Analysis Team, Commercial Aircraft Corporation of China, Ltd., Shanghai, China. Currently, he is the Deputy Director of the Industrial Big Data Laboratory, School of Data Science and Engineering, South China Normal University. His research interests include industrial internet, industrial data analytics, and 5G communications.

JING WANG received the Ph.D. degree in engineering from Shanghai Jiao Tong University, in 2016.
She is currently an Assistant Researcher with South China Normal University. Her research interests include system reliability and engineering optimization.

YIFENG ZHANG was born in Guangdong, China, in 2002. He is currently pursuing the B.S. degree in data science and engineering with South China Normal University.
His research interests include big data applications, distributed data processing, machine learning, and aviation data analysis.

SHUHUAI GU was born in Guangdong, China, in 2003. He is currently pursuing the B.S. degree in data science and engineering with South China Normal University.
His research interests include system reliability and engineering optimization, machine learning, and aviation data analysis.