Using Machine Learning Tools to Predict Compressor Stall

Samuel M. Hipple
Simulation Modeling and Decision Science Program, Ames Laboratory, 1620 Howe Hall, Ames, IA 50011
e-mail: [email protected]

Harry Bonilla-Alvarado
Simulation Modeling and Decision Science Program, Ames Laboratory, 1620 Howe Hall, Ames, IA 50011
e-mail: [email protected]

Paolo Pezzini
Simulation Modeling and Decision Science Program, Ames Laboratory, 1620 Howe Hall, Ames, IA 50011
e-mail: [email protected]

Lawrence Shadle
National Energy Technology Laboratory, 3610 Collins Ferry Road, Morgantown, WV 26507
e-mail: [email protected]

Kenneth M. Bryden
Simulation Modeling and Decision Science Program, Ames Laboratory, 1620 Howe Hall, Ames, IA 50011
e-mail: [email protected]

Clean energy has become an increasingly important consideration in today’s power systems. As the push for clean energy continues, many coal-fired power plants are being decommissioned in favor of renewable power sources such as wind and solar. However, the intermittent nature of renewables means that dynamic load following in traditional power systems is crucial to grid stability. With high flexibility and fast response at a wide range of operating conditions, gas turbine systems are poised to become the main load following component in the power grid. Yet, rapid changes in load can lead to fluid flow instabilities in gas turbine power systems. These instabilities often lead to compressor surge and stall, which are some of the most critical problems facing the safe and efficient operation of compressors in turbomachinery today. Although the topic of compressor surge and stall has been extensively researched, no methods for early prediction have been proven effective. This study explores the utilization of machine learning tools to predict compressor stall. The long short-term memory (LSTM) model, a form of recurrent neural network (RNN), was trained using real compressor stall datasets from a 100 kW recuperated gas turbine power system designed for hybrid configuration. Two variations of the LSTM model, classification and regression, were tested to determine optimal performance. The regression scheme was determined to be the most accurate approach, and a tool for predicting compressor stall was developed using this configuration. Results show that the tool is capable of predicting stalls 5–20 ms before they occur. With a high-speed controller capable of 5 ms time-steps, mitigating action could be taken to prevent compressor stall before it occurs. [DOI: 10.1115/1.4046458]

Keywords: energy conversion/systems, natural gas technology, compressor surge and stall, machine learning, long short-term memory, turbomachinery
The work was authored in part by a U.S. Government employee in the scope of his/her employment. ASME disclaims all interest in the U.S. Government’s contribution.
Contributed by the Advanced Energy Systems Division of ASME for publication in the JOURNAL OF ENERGY RESOURCES TECHNOLOGY. Manuscript received November 19, 2019; final manuscript received February 9, 2020; published online February 25, 2020. Assoc. Editor: Ronald Breault.
Journal of Energy Resources Technology, JULY 2020, Vol. 142 / 072305-1. Copyright © 2020 by ASME

1 Introduction

The growing popularity of renewable energy combined with the rising demand for electrical power requires that today’s conventional power plants have high efficiency, low emissions, and high flexibility [1,2]. With low cost, high efficiency, and high flexibility, natural gas turbines are well suited to become the main load following component of the power grid. However, high load following flexibility represents one of the most critical problems to be addressed even by natural gas turbines. Forcing traditional fossil generation assets to operate in a load following mode strongly affects the dynamic operation of these systems. For instance, in gas turbine systems, compressor instabilities such as surge and stall may become a significant problem.

Compressor stall occurs when the inlet flow is separated from the compressor blade due to an increase in the angle of incidence between the inlet air and the blade [3]. This flow separation results in turbulence along the blade, causing a localized pressure drop. When left unchecked, the turbulent flow propagates throughout the compressor until a complete choking of the airflow occurs [3]. This phenomenon, called surge, results in strong pressure pulsations that can cause extensive damage to the turbomachinery [4]. Although compressor stalls are yet to be fully understood [5], these instabilities are generally a function of the mass flowrate and the pressure ratio between the inlet and the outlet of the compressor. When the mass flowrate decreases relative to the compressor pressure ratio, stall may occur. Plotting stall inception points as shown in Fig. 1 produces what is referred to as a “surge line.” Generally, operating close to the surge line produces higher compressor efficiency but increases the risk of stall. As the push for more efficient power generation continues, compressors in natural gas turbine systems will be required to operate closer to the surge line. This, coupled with highly dynamic load following operations, means that the potential for compressor stall will continue to rise. Because these issues can lead to major turbomachinery failures, several techniques that can detect, control, or predict compressor surge and stall have been studied.

In the first area of research, compressor surge and stall detection, Day performed the most notable work, detecting stall inception by measuring rotor tip velocity at six annular positions in the compressor [6]. However, this method required a compressor that had to be retrofitted with embedded sensors, which is not common in most applications. Although advanced methods for detecting compressor stalls are well researched, no current detection systems can identify precursors to a stall that might indicate its formation in advance [7]. In addition to detection, active control and mitigation of compressor stall represents a second main area of research with many different strategies and methods in use [8]. Three primary methods have been tested and implemented to actively control compressor surge and stall; these include bleed-air (BA) valves, controlled inlet air jets, and variable inlet guide vanes [9–12]. In their review, Gu et al. [9] concluded that inlet air jets and guide vanes are more effective at reducing stall, but bleed valves are effective for both stall and surge.

While stall detection, control, and mitigation have been widely studied, work in compressor stall prediction has been limited mostly to numerical models. The most common is the Moore and Greitzer model [13], which aims to capture the nonlinear characteristics of compressor stall. However, the Moore–Greitzer model
assumes incompressible flow within the compressor, which is not valid for high-speed compressors [9]. Physics-based models generally rely on simplifications and assumptions of ideal behavior, which makes them difficult to apply to dynamic events in highly coupled systems such as turbomachinery. For this reason, a different approach is needed to capture the system dynamics that can cause compressor stall so action can be taken to prevent it.

Fig. 1 Operating points on a compressor surge map

Machine learning algorithms have the ability to process extensive, multidimensional datasets and establish connections between behaviors that cannot be identified in other ways. Once a machine learning algorithm has identified these connections, it can predict outcomes based on provided inputs. Early work has been done in sequence prediction and time series forecasting with methods utilizing machine learning such as support vector machines and neural networks, and more specialized methods such as recurrent neural networks (RNNs) quickly emerged [14,15]. Long short-term memory (LSTM) is a type of recurrent neural network that has been studied for time series prediction and anomaly detection. Malhotra et al. [16] utilized LSTM networks to model normal behavior in several datasets. Using the prediction error distribution from these models, they demonstrated that anomalies could be detected. With its previous success in time series prediction and anomaly detection, machine learning is a viable approach for predicting compressor stall.

In this paper, a prediction tool based on a machine learning algorithm was used to predict compressor stall. The prediction tool was developed using an LSTM neural network where multivariate time series data, acquired at a 5-ms rate from a 100 kW recuperated gas turbine power system designed for hybrid configuration, were used to train the model. The neural network was trained with two approaches, classification and regression, which were compared to understand the performance for predicting stall events. The regression approach was then implemented with a forecasting threshold logic to predict stall inception points. Finally, the application of this tool was evaluated by proposing how it could be combined with the current stall recovery scheme of the micro gas turbine hybrid system.

2 Experimental Facility

Experimental data from the Hybrid Performance (Hyper) facility at the National Energy Technology Laboratory served as a training dataset to develop the compressor stall prediction tool. Hyper consists of a modified recuperated gas turbine and a cyber-physical solid oxide fuel cell (SOFC). The cyber-physical SOFC is a combination of physical and virtual components designed to emulate the dynamic operation of a real SOFC in a SOFC/GT hybrid power system [17]. This makes it possible to conduct studies on system integration and controls without the risk of destroying a fragile and expensive SOFC [17,18]. A diagram depicting the layout of the Hyper facility is shown in Fig. 2. In the direct-fired SOFC/GT, the fuel cell stack and air/fuel distribution components are inserted between the compressor and the turbine. The compressor inlet takes in ambient air and increases its pressure while reducing its volume. Compressed air passes through a recuperated system that further increases the temperature of the air before reaching the fuel cell system. The turbine expands the fluid coming out of the fuel cell system, extracting mechanical power by decreasing pressure and increasing volume.

Fig. 2 Diagram of the Hyper facility

Turbomachinery failures such as stall may lead to the destruction of the fuel cell, making the prevention and control of this phenomenon crucial for the effective operation of the SOFC/GT. In hybrid systems such as Hyper, the compressor volume can be up to 200 times larger than that observed in a traditional gas turbine cycle [19]. As such, to supply the same mass flowrate as a compressor in a traditional gas turbine configuration, the compressor in the SOFC/GT configuration will operate closer to the surge line due to the larger volume, which creates a significant pressure drop to overcome. The increase in volume coupled with the significant pressure drop creates operational and control challenges in the SOFC/GT system. The complexity of stall as well as the sensitivity of
the fuel cell material augments these operational challenges that make it difficult to establish a real-time model for predicting and preventing stall [10].

2.1 Hardware Components. A 120 kW Garrett Series 85 auxiliary power unit was modified for this testing facility. The gas turbine compressor system is composed of a single shaft, directly coupled turbine, and a two-stage radial compressor. The compressor was designed to deliver approximately 2 kg/s of discharge flow and a pressure ratio of 4:1. The gas turbine is coupled to a gear-driven asynchronous generator, and the load to the generator is applied by a 120 kW resistor bank that dissipates the power output of the source. The load bank is used to replicate real-life demand of a power generation system connected to the grid. The turbine operates at a nominal speed of 40,500 rpm, and a Woodward proportional-integral-derivative (PID) controller, capable of 5-ms time-steps, acts on the swift fuel valve during startup operation or to maintain constant turbine speed during electric load perturbation. The swift fuel valve is electrically actuated; a high-speed stepper motor rotates a 2.54 cm sonic needle and nozzle, which controls the fuel flow going into the combustor at a 5-ms rate if needed.

The BA valve was installed to bypass air from the compressor discharge directly into the atmosphere. The BA valve reduces the total air flow through the turbine and heat exchangers, which is offset by the increase in air flow at the compressor inlet. This valve is also used to ensure enough stall margin to the compressor during startup. A 6-in. valve with a 3-in. body is used as the BA actuator in the Hyper facility. The range of operation is between 100% and 88% of the closing position when the electric load is engaged to the turbine. During startup and nominal condition operations, the valve is set at 94% to guarantee enough stall margin to the compressor. The cold-air (CA) bypass valve is used to bypass cold air from the compressor directly into the turbine inlet through the mixing volume, and the hot-air (HA) bypass valve is used to bypass air from the outlet of the heat exchangers directly into the turbine inlet through the mixing volume.

2.2 Sensors. Various sensors are mounted in the hardware system to measure mass flow, pressure, temperature, and rotational speed. These sensors are generally sampled at a rate of 200 measurements per second. Three optical sensors are used to measure turbine rotational speed. Each sensor optically measures the speed of the light reflected from the rotating target located at the end of the generator shaft. These three signals are transmitted to the control system every 5 ms and averaged to give an average turbine speed. The compressor inlet flow is measured using an annular flow element that provides a mechanical average of the difference between stagnation pressure and static pressure in the inlet pipe to determine the flowrate. Compressor inlet temperature and pressure, measured with a thermocouple and a pressure transducer, determine the ambient condition in the facility test cell during operation and are used to calculate the corrected air flow and compressor pressure ratio.

3 Experimental Operating Conditions

At the Hyper facility, thermal steady-state under nominal conditions is needed before any experiment is performed. The thermal steady-state condition is achieved when the turbine rotational speed reaches the nominal setpoint of 40,500 rpm, and the skin temperature of the mixing volume varies less than 1.0 K for a 30 s period. When the nominal turbine speed is reached, the electrical load bank is generally used to engage a 40 kW resistive load to dissipate the energy produced by the turbine generator. During startup and around nominal conditions, a single-input single-output PID controller maintains nominal turbine speed by handling the fuel flow into the gas turbine combustor. An initial scoping study was performed to determine the operating conditions in which incipient stalls were observed without completely surging the compressor. This test started at nominal conditions, and the BA valve and the CA and HA bypass valves were used to reduce the stall margin by restricting the amount of air flow in the compressor. First, the BA valve was closed from a 6% to a 0% open position. Next, the CA bypass valve was closed from a 40% to a 28% open position while the HA bypass valve was maintained at a 25% open position.

Table 1 Operating conditions for each test case

| Test case | Data points | Incipient stall events | Operating conditions | Use in machine learning model |
|---|---|---|---|---|
| 1 | 7692 | 2 | Near surge line | Training |
| 2 | 9216 | 0 | Nominal | Training |
| 3.1 | 46,080 | 8 | Nominal to near surge line | Training |
| 3.2 | 10,880 | 7 | Near surge line | Validation |

Fig. 3 Division of test case 3

After the operating conditions were determined, four tests, which are summarized in Table 1, were conducted to apply a more systematic approach to reducing the surge margin. These four datasets were obtained under the same operating conditions, but the tests were conducted on different dates. Therefore, some inlet conditions on the compressor may differ, but the same test procedure was used for each experiment. In case 1, the BA valve was set at the fully closed position, the CA bypass valve at 28% open, and the HA bypass valve at 25% open. This dataset included 7692 data points of operation near the surge line with two incipient stall events observed. In case 2, the BA valve was set at 6% open, the CA bypass valve at 40% open, and the HA bypass valve at 25% open. This dataset included 9216 data points of nominal operating conditions, and no stall events were observed. In case 3, the BA valve was closed from 6% open to 0% open in one step change, the HA bypass valve was maintained at 25% open, and the CA bypass valve was changed from 40% to 30% open in 5% step changes, then to 28% open in one step change. The system was allowed to stabilize for 50 s between each step change. This dataset included 56,960 data points with the facility operating from nominal conditions to near the surge line. As shown in Fig. 3, the third test was divided into two datasets; approximately 80% of the data were used for training the model (case 3.1) and the remaining data were used for validation (case 3.2). Case 3.1 included eight incipient stall events, and case 3.2 included seven incipient stall events with some stall propagating and becoming
surge but not enough to terminate the experiments. These experiments were short due to the potential for mechanical damage to the facility and turbomachinery components. In total, 62,988 data points were used to train the machine learning model and 10,880 data points were used to validate the model.

4 Machine Learning Model

When considering the machine learning tool to utilize for this study, several characteristics of the data were evaluated including magnitude, type, number of variables, noise, and complexity. Based on the characteristics of the data and given that the objective of this study involves sequence prediction, an RNN was an obvious candidate. An RNN is a form of neural network that utilizes feedback to impose a “memory” in each network node. A simplified diagram of an RNN is shown in Fig. 4. Generic RNNs are useful for pattern recognition but are limited to short-term applications due to unstable backflow error [20]. Introduced by Hochreiter and Schmidhuber [20], the gradient-based LSTM learning algorithm is a form of RNN that was developed to overcome backpropagation error instability and can be used for extended sequence prediction.

Fig. 4 Example of a recurrent neural network

Fig. 5 Memory cell in an LSTM network

The LSTM learning algorithm utilizes memory cells with logical gates to enforce stable error flow. As shown in Fig. 5, a memory cell contains several activation functions. Each activation function uses logical gates to determine the output of the function. The gates utilize logistic sigmoid ($\sigma$) and hyperbolic tangent ($\tanh$) functions to scale inputs and previous values. The weights for each gate ($W_g$, $V_g$) are set at each time-step by the learning algorithm according to the error gradient. These weights collectively form the machine learning model. The forget gate activation function ($f_t$), defined by Eq. (1), looks at some input $x_t$ and regulates what information from the previous hidden state ($h_{t-1}$) is “forgotten” by the cell

$$ f_t = \sigma(W_f h_{t-1} + V_f x_t) \tag{1} $$

The input gate activation function ($i_t$), defined by Eq. (2), regulates what information from the input array is important to the cell. The same input parameters ($h_{t-1}$, $x_t$) of Eq. (1) are fed into Eq. (2) and, similar to Eq. (1), the weights $W_i$ and $V_i$ form the output $i_t$

$$ i_t = \sigma(W_i h_{t-1} + V_i x_t) \tag{2} $$

The candidate cell state ($c_t$), defined by Eq. (3), creates the new values that are possible for the cell’s output state. Similar to Eqs. (1) and (2), $h_{t-1}$ and $x_t$ represent the output from the previous hidden state and the data point, respectively. In this case, $W_c$ and $V_c$ represent the weights for the candidate cell state $c_t$

$$ c_t = \tanh(W_c h_{t-1} + V_c x_t) \tag{3} $$

The output gate activation function ($o_t$), defined by Eq. (4), regulates what information is passed along as an output for the cell. Similar to the previous equations, $W_o$ and $V_o$ are the weights determined at each time-step by the error gradient machine learning algorithm, and $h_{t-1}$ and $x_t$ are the hidden state and the data point

$$ o_t = \sigma(W_o h_{t-1} + V_o x_t) \tag{4} $$

The current cell state ($C_t$), defined by Eq. (5), combines what should be remembered from the previous cell state with new values that have been deemed important. It is a linear combination of the cell state from the previous time-step ($C_{t-1}$), scaled by the output of the forget gate activation function ($f_t$), and the candidate cell state ($c_t$), scaled by the output of the input gate activation function ($i_t$)

$$ C_t = f_t C_{t-1} + i_t c_t \tag{5} $$

The hidden state ($h_t$), defined by Eq. (6), normalizes the cell’s current state and determines what information is actually produced as
an output for the cell. The output gate activation function ($o_t$) multiplied by the hyperbolic tangent ($\tanh$) of the current cell state determines the hidden state

$$ h_t = o_t \tanh(C_t) \tag{6} $$

Unlike other neural networks, LSTM is capable of recognizing patterns over gaps exceeding 1000 steps and is well suited for noisy data, making it the ideal model for this study.

5 Machine Learning Model Setup

Two LSTM model implementations, classification and regression, were studied to determine the best approach. The classification approach requires training data to be manually classified prior to training. The output of a classification model is then a predicted classification. In this study, the training data for the classification approach was composed of 33 sensors as well as manually classified stalls (1 for stall and 0 for nonstall). The regression approach, alternatively, predicts the output of a variable based on given inputs. For this test, the sensor data was used as inputs to the LSTM model, which was used to predict turbine speed. Because a stall can be characterized by a large spike in turbine speed, a threshold logic was designed to identify when rpm surpassed a designated stall limit. Steady operation in this test was 40,500 rpm, so the stall thresholds were set at 40,400 rpm and 40,600 rpm (i.e., ±100 rpm). This allowed turbine speed spikes to be used as an indicator that a stall had occurred.

The model was implemented in Python using Keras [21], a simplified interface for neural networks that runs on top of TensorFlow. The Keras interface allows model parameters, referred to as hyperparameters, to be set at a higher level than a manual implementation of LSTM. This enables a more efficient model development process and improved testing capability for a range of model configurations. The first layer of the neural network, the LSTM input layer, was set for the number of inputs fed to the model. The input layer was then followed by another LSTM layer and two fully connected dense layers, which compute the output of the model based on the weights and hidden state. The output layer was set based on the desired output and number of time-steps to predict into the future. The dropout regularization method was implemented in the LSTM layers, which forces the model to exclude a given percentage of inputs and recurrent connections during a memory cell’s training. This helps to prevent a highly noise-influenced model (i.e., overfitting) and improves model performance.

Three datasets were used to train the model, ranging from 7692 to 46,080 time-steps in length. Because the datasets were not sequential, the LSTM model was fitted for a given number of epochs on each dataset before moving on to the next. The number of epochs is an LSTM hyperparameter that defines the number of times the learning algorithm iterates over each dataset during the training process. During training, each dataset was separated into smaller batches by Keras. The size of these batches is defined by the user, and the LSTM model weights are updated after a batch of data has been processed. The number of input time-steps the learning algorithm trained on was a critical hyperparameter that allowed the model to process enough data to make a good prediction for the defined number of output time-steps.

As previously described, test case 3.2 was used to test the accuracy of the trained model. This dataset was used as inputs for the LSTM model, which predicted the stall classification in the classification approach and the turbine speed in the regression approach. Once the LSTM model output was predicted, stall or normal operating conditions were determined based on the predicted values. This was then compared with the stall results from the original data to determine the model’s accuracy for forecasting stall using Eqs. (7)–(10)

$$ \text{Accuracy} = \frac{1}{P}(TP + TN) \tag{7} $$

$$ \text{Error rate} = \frac{1}{P}(FP + FN) \tag{8} $$

$$ \text{Sensitivity} = \frac{TP}{TP + FN} \tag{9} $$

$$ \text{Specificity} = \frac{TN}{FP + TN} \tag{10} $$

Here $TP$, $TN$, $FP$, and $FN$ are the numbers of true positive, true negative, false positive, and false negative predictions, and $P$ is taken to be the total number of predictions, which is consistent with the accuracy and error rate in Table 3 summing to one.

5.1 Hyperparameter Tuning and Performance Validation in the Classification Model. An LSTM model structure was trained based on training data sets with labeled data that identify the stall event. The data was labeled using a boolean classification (1 for a stall and 0 for a nonstall) by manually identifying points in the training data when turbine speed passed the designated thresholds (i.e., ±100 rpm around the turbine speed nominal operation). As previously mentioned, to optimize the model, several configurations of hyperparameters were tested, which included the number of epochs, the batch size for each epoch, the number of input time-steps, the number of prediction output time-steps ahead, the size of the first hidden layer, and the size of the second hidden layer. From these tests, the batch size for each epoch was set to 30 data points, the best number of input time-steps for the classification model was determined to be 40 time-steps, the optimal number of output time-steps was 10 time-steps, the first hidden layer size was 64, and the second hidden layer size was 128.

After training, the model was validated with the test data set and used to predict a final case with multiple stall events. Accuracy and mean-squared error (MSE) were used to determine the performance of the model for predicting stalls. In this study, the number of training epochs, or iterations over the dataset, was varied while keeping the other hyperparameters constant. The results of the classification tests are shown in Table 2. An average accuracy of 86% was achieved with the classification model. The 15-epoch model with no dropout produced the lowest MSE at 0.00959. During the optimization process, both the test and training had unstable losses, meaning model error did not decrease smoothly as training and testing progressed. Although this did not contribute much to the accuracy of the stall classification, it did affect the MSE. Unstable loss can be an indicator of overfitting, so a dropout layer was considered to stabilize losses. As shown in Table 2, however, the 15-epoch model with 20% dropout produced results similar to the other tests with no dropout.
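The ±100 rpm threshold logic of Sec. 5 and the metrics of Eqs. (7)–(10) lend themselves to a short illustration. The following is a minimal sketch rather than the authors' code: the function names are hypothetical, the strict inequality at the threshold is an assumption, and $P$ is assumed to be the total number of samples compared.

```python
NOMINAL_RPM = 40_500  # steady operating speed reported in the text
STALL_BAND_RPM = 100  # +/-100 rpm band, i.e., limits of 40,400 and 40,600 rpm


def is_stall(speed_rpm, nominal=NOMINAL_RPM, band=STALL_BAND_RPM):
    """Flag a sample as stall when its speed leaves the nominal +/- band window.

    Assumption: the limit itself counts as normal operation (strict inequality).
    """
    return abs(speed_rpm - nominal) > band


def confusion_counts(actual, predicted):
    """Count true/false positives and negatives for two boolean sequences."""
    tp = tn = fp = fn = 0
    for a, p in zip(actual, predicted):
        if a and p:
            tp += 1
        elif not a and not p:
            tn += 1
        elif p:
            fp += 1
        else:
            fn += 1
    return tp, tn, fp, fn


def stall_metrics(actual_rpm, predicted_rpm):
    """Evaluate Eqs. (7)-(10) after thresholding measured and predicted speeds."""
    actual = [is_stall(s) for s in actual_rpm]
    predicted = [is_stall(s) for s in predicted_rpm]
    tp, tn, fp, fn = confusion_counts(actual, predicted)
    total = tp + tn + fp + fn  # assumed meaning of P in Eqs. (7) and (8)
    nan = float("nan")
    return {
        "accuracy": (tp + tn) / total,                        # Eq. (7)
        "error_rate": (fp + fn) / total,                      # Eq. (8)
        "sensitivity": tp / (tp + fn) if tp + fn else nan,    # Eq. (9)
        "specificity": tn / (fp + tn) if fp + tn else nan,    # Eq. (10)
    }
```

Because stall samples are rare compared with normal samples, accuracy computed this way can stay high even when most stall points are missed, which is why sensitivity is worth reporting separately.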
Table 2 Results of classification tests

| Training epochs | Batch size | Number of input time-steps | Number of output time-steps ahead | First hidden layer size | Second hidden layer size | Dropout | MSE | Accuracy, % |
|---|---|---|---|---|---|---|---|---|
| 5 | 30 | 40 | 10 | 64 | 128 | 0% | 0.024577 | 86.35 |
| 10 | 30 | 40 | 10 | 64 | 128 | 0% | 0.013339 | 86.51 |
| 15 | 30 | 40 | 10 | 64 | 128 | 0% | 0.00959 | 86.24 |
| 15 | 30 | 40 | 10 | 64 | 128 | 20% | 0.010293 | 86.15 |
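The input and output time-step hyperparameters in Table 2 imply that each training sample pairs a window of past measurements with a window of future values. The text does not say how the windows were built, so the sketch below is only one plausible construction: the name `make_windows`, the stride of one sample, and the use of the full block of future samples as the target are all assumptions.

```python
def make_windows(series, n_in=40, n_out=10):
    """Pair each run of n_in consecutive samples with the n_out samples after it.

    `series` may hold scalars or per-time-step rows of sensor readings.
    Windows advance one sample at a time (assumed stride).
    """
    inputs, targets = [], []
    last_start = len(series) - n_in - n_out
    for start in range(last_start + 1):
        inputs.append(series[start:start + n_in])
        targets.append(series[start + n_in:start + n_in + n_out])
    return inputs, targets
```

With a series of 60 samples and the defaults above, this yields 11 input windows of 40 samples, each paired with the 10 samples that follow it. A series shorter than `n_in + n_out` yields no windows.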
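For reference, the memory-cell update of Eqs. (1)–(6) in Sec. 4, which underlies both the classification and regression models, can be written out in a few lines. This is a minimal sketch in plain Python, not the Keras implementation used in the study: the helper names and the `weights` dictionary layout are hypothetical, the products in Eqs. (5) and (6) are assumed to be element-wise, and bias terms are omitted because the equations as printed contain none.

```python
import math


def sigmoid(z):
    """Logistic sigmoid used by the forget, input, and output gates."""
    return 1.0 / (1.0 + math.exp(-z))


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector (list of floats)."""
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]


def gate(W, V, h_prev, x, activation):
    """Evaluate activation(W h_prev + V x) element-wise."""
    pre = [a + b for a, b in zip(matvec(W, h_prev), matvec(V, x))]
    return [activation(p) for p in pre]


def lstm_step(x, h_prev, C_prev, weights):
    """Advance one time-step of the memory cell; returns (h_t, C_t)."""
    f = gate(weights["Wf"], weights["Vf"], h_prev, x, sigmoid)    # Eq. (1)
    i = gate(weights["Wi"], weights["Vi"], h_prev, x, sigmoid)    # Eq. (2)
    c = gate(weights["Wc"], weights["Vc"], h_prev, x, math.tanh)  # Eq. (3)
    o = gate(weights["Wo"], weights["Vo"], h_prev, x, sigmoid)    # Eq. (4)
    # Eq. (5): keep part of the old cell state and add part of the candidate
    C = [ft * Cp + it * ct for ft, Cp, it, ct in zip(f, C_prev, i, c)]
    # Eq. (6): the output gate scales the squashed cell state
    h = [ot * math.tanh(Ct) for ot, Ct in zip(o, C)]
    return h, C
```

With all weights set to zero, every gate evaluates to 0.5 and the candidate state to 0, so the cell state is simply halved at each step. This is a convenient sanity check on the implementation.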
5.2 Hyperparameter Tuning and Performance Validation appropriate action can be taken to prevent the stall. The key func-
in the Regression Model. The regression model was optimized tion of a stall prediction tool, therefore, is to predict the point
by changing various hyperparameters, similar to the classification where a stall begins. Thus, the regression approach was selected
approach. In addition to accuracy, three other metrics were consid- as the best LSTM implementation and was combined with a thresh-
ered: error rate, sensitivity, and specificity. These metrics help to old logic that was designed to identify stall inception points.
determine what areas the model is strong or weak in predicting
stall. Based on the performance comparison, the optimal LSTM 6.1 Stall Inception Point Identification. A stall is character-
hyperparameter configuration had an output sequence of 15 time- ized by a large, rapid change in rpm compared with normal opera-
steps ahead. The results from this model are shown in Fig. 6. The tions. In the regression LSTM model testing, this was captured
regression model prediction accuracy was high; however, the sensi- using a turbine speed threshold to detect stall in the prediction
tivity was low in all hyperparameter configurations, as shown in data. The stall begins at the moment the turbine speed begins to
Table 3. It can be deduced from the high accuracy and low sensitiv- spike, but due to noise in the turbine speed data, the turbine
ity that the model performed well in predicting points where stall speed threshold must be set well above normal operation conditions
did not occur, but did not perform well in predicting points where to avoid a false prediction. This means that a delay can occur
stall did occur. While accuracy was higher than that of the classification model, the poor
results for sensitivity showed that further work was required to accurately predict stalls.

6 LSTM Regression Model Implementation
Comparing Tables 2 and 3, which represent the performance of the classification LSTM model
against the regression model, it was determined that a higher prediction accuracy was achieved
with the regression model. As previously discussed, however, the regression model did not
perform well in predicting all data points where the turbine speed surpassed the stall
threshold. For the scope of detecting and controlling compressor instabilities, the full
dynamics of a stall event must be understood. However, model training shows that the highly
variable nature of compressor instabilities makes predicting the dynamics of a stall difficult,
even for a machine learning algorithm. The wide variety of existing stall detection, mitigation,
and control techniques further complicate stall dynamics and magnify this challenge. The goal
of this study, however, is to predict compressor stall rather than detect it. Although the
turbine speed prediction during a stall was not accurate, the predictions at the beginning of
each stall event appear to be more accurate. If the onset of a stall can be predicted, then

between the beginning of a stall and when the stall threshold is triggered. Because the rpm
changes rapidly, a stall can also be classified using the differential of turbine speed, where
an increase in the differential can be used to predict a stall event. Furthermore, the stall
inception point can be identified as the moment when the turbine speed differential exceeds a
designated threshold. The differential-based threshold approach can detect stall earlier than
using a threshold approach only on the turbine speed. This provides an important advantage
because a corrective system action could be taken sooner and the chance of avoiding the stall
would improve.
The immediate result of a stall may be either an increase or decrease in turbine speed (i.e.,
a positive or negative differential), so the absolute value of the differential was used to
account for both cases with a single threshold. To ensure that the tool can be used for
real-time stall prediction, the instantaneous differential was calculated over the turbine
speed array using only current and previous data points, as shown in Eq. (11)

$$\frac{\partial \omega}{\partial t}[i] = \frac{\omega[i] - \omega[i-n]}{t[i] - t[i-n]} \tag{11}$$

To reduce noise and better identify outliers (i.e., stall points), the turbine speed data were
processed using an unweighted moving
Fig. 6 Regression model results
Fig. 7 Effect of moving average on noise and outliers
Table 3 Results of regression tests with various hyperparameter configurations
Hyperparameters                                                              Results
Training  Batch  Input       Output      Input       Second      Dense       Error     Accuracy  Sensitivity  Specificity
epochs    size   time-steps  time-steps  layer size  layer size  layer size  rate      (%)
2         30     40          15          32          64          64          0.050195  94.9805   0.011654     0.998284
5         30     40          15          32          64          64          0.051080  94.8920   0.021617     0.996816
10        30     40          15          32          64          64          0.050429  94.9571   0.013409     0.997947
2         30     40          15          32          64          16          0.050729  94.9271   0.022744     0.997126
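The result columns of Table 3 can be related to the confusion counts listed in the Nomenclature (TP, TN, FP, FN, and P). The sketch below assumes the standard definitions of error rate, accuracy, sensitivity, and specificity; the counts are hypothetical and were chosen only to illustrate how a model can reach roughly 95% accuracy with very low sensitivity when stall time-steps are rare. The names `ConfusionCounts` and `summarize` are illustrative and do not come from the study.

```python
from dataclasses import dataclass


@dataclass
class ConfusionCounts:
    tp: int  # stall time-steps predicted as stall
    tn: int  # nonstall time-steps predicted as nonstall
    fp: int  # nonstall time-steps predicted as stall
    fn: int  # stall time-steps predicted as nonstall


def summarize(c: ConfusionCounts) -> dict:
    """Standard classification metrics (assumed to match the paper's definitions)."""
    p = c.tp + c.tn + c.fp + c.fn  # total number of model predictions, P
    error_rate = (c.fp + c.fn) / p
    return {
        "error_rate": error_rate,
        "accuracy_pct": 100.0 * (1.0 - error_rate),
        "sensitivity": c.tp / (c.tp + c.fn),  # share of stall time-steps caught
        "specificity": c.tn / (c.tn + c.fp),  # share of nonstall time-steps kept
    }


# Hypothetical counts (not from the paper): 500 stall time-steps out of 10,000.
counts = ConfusionCounts(tp=10, tn=9480, fp=20, fn=490)
metrics = summarize(counts)
print(metrics)
```

With these assumed counts the accuracy is dominated by the many correctly labeled nonstall time-steps, which mirrors the pattern in Table 3 of high accuracy and specificity alongside poor sensitivity.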
Fig. 8 Stall detection threshold logic
Fig. 9 Sample of turbine speed data during stalls 3–5
Fig. 10 Stall 3 inception point identification
average filter (Eq. (12)) before taking the differential. Figure 7
shows the turbine speed differential with and without the
moving average filter. A window size of 10 data points (0.05 s)
for the moving average filter was selected as the optimal setting
for the data in this study. The overall threshold logic is shown in
Fig. 8.
$$\mathrm{MA}_{\omega}[i] = \frac{1}{N}\sum_{k=i-N}^{i} \omega[k] \tag{12}$$
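A minimal sketch of the processing chain described by Eqs. (11) and (12) is given here, assuming NumPy and a 5 ms sample spacing (consistent with 15 time-steps corresponding to 75 ms). The function names, the value n = 1, and the threshold of 5000 rpm/s are hypothetical, as is the synthetic speed trace. Note one deliberate simplification: Eq. (12) as printed sums from k = i − N to i, whereas this sketch averages exactly the N most recent samples.

```python
import numpy as np


def moving_average(omega: np.ndarray, window: int) -> np.ndarray:
    """Unweighted trailing average over the `window` most recent samples.

    Only current and past samples are used, so it can run in real time.
    Fewer samples are averaged at the start of the array.
    """
    out = np.empty(len(omega), dtype=float)
    for i in range(len(omega)):
        start = max(0, i - window + 1)
        out[i] = omega[start:i + 1].mean()
    return out


def backward_differential(omega: np.ndarray, t: np.ndarray, n: int) -> np.ndarray:
    """Eq. (11): (omega[i] - omega[i-n]) / (t[i] - t[i-n]).

    The first n entries have no earlier sample and are set to 0 (assumption).
    """
    d = np.zeros(len(omega), dtype=float)
    d[n:] = (omega[n:] - omega[:-n]) / (t[n:] - t[:-n])
    return d


def exceeds_threshold(omega, t, n, window, threshold) -> np.ndarray:
    """Flag samples where |d(omega)/dt| of the smoothed speed exceeds the threshold."""
    smoothed = moving_average(omega, window)
    return np.abs(backward_differential(smoothed, t, n)) > threshold


# Synthetic trace (hypothetical): steady 40,000 rpm with a 500 rpm jump at sample 60.
t = np.arange(100) * 0.005            # 5 ms time-steps, in seconds
omega = np.full(100, 40000.0)
omega[60:] += 500.0

flags = exceeds_threshold(omega, t, n=1, window=10, threshold=5000.0)
print(np.flatnonzero(flags))          # indices where the differential is above threshold
```

Because the absolute value is taken, a sudden drop in speed of the same size is flagged at the same samples as a sudden rise, which is the single-threshold behavior described in the text.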
Figure 9 shows a snippet of turbine speed data in which a stall occurs. It can be seen from
these data that a stall is accompanied by several large swings in turbine speed. Using the
threshold logic previously described, these swings would cause the turbine speed differential
to cross the threshold several times in rapid succession during a single stall event. The
objective, however, is to identify only the first point where the threshold is crossed in a
stall. To avoid this problem, a memory logic was implemented that does not allow a threshold
crossing to be identified as a stall inception point within a 200 ms window of the previous
threshold crossing. This time period ensured that only one inception point was identified for
each stall, but different individual stalls could still be recognized. Figure 10 shows a
subsection of test data that has been processed using the threshold to identify a stall point.
It can be seen that although the turbine speed differential crossed the threshold several
times, the memory logic allowed only the first threshold crossing point in the stall to be
identified as a stall inception point.

Fig. 11 Stall 6 inception point predictions for various training epochs

7 Results
The stall inception points presented in Fig. 9 (i.e., test case 3.2) were compared for
prediction accuracy against the LSTM prediction test data. To align the data for proper
comparison, any false predictions were identified manually. Also, the stall points were
compared numerically to determine time disparities between real
Table 4 Results of prediction tool with various training epochs
                 Prediction error (ms)
Training epochs  Stall 1  Stall 2  Stall 3  Stall 4  Stall 5  Stall 6  Stall 7  Missed stalls  False stalls
2                70       80       70       65       65       90       60       0              0
8                60       80       65       65       60       75       55       0              0
16               60       70       65       65       60       70       55       0              0
32               65       70       70       –        65       70       60       1              2
Fig. 12 Comparison of predicted and actual stall inception points for 16 epoch test
Fig. 13 Comparison of predicted and actual stall inception points for 32 epoch test
and predicted stalls. Most of the optimized hyperparameters presented in Table 3 were used in
the test data set shown in Fig. 9. The number of training epochs was the exception: as shown
in Table 4, it was varied and the results were compared to determine the optimal setting. In
all cases, the turbine speed was predicted 15 time-steps (75 ms) ahead. The results of these
tests are shown in Table 4. In the test data set 3.2, seven stall events occur. To evaluate
the models, the predicted stall points were subtracted from the real stall points to determine
the prediction error in milliseconds.
As shown in Table 4, a general decreasing trend in prediction error was observed from the
2-epoch test case to the 16-epoch test case. This trend is exemplified in Fig. 11, which shows
a detailed view of stall 6 and the corresponding predicted stall points for each training
epoch configuration. The best results were achieved in the 16-epoch test case. Figure 12 shows
the results of this test, where dashed vertical lines represent predicted stall points and
solid vertical lines represent actual stall points. It can be seen from the overlap of
predicted and actual stall points that all stalls were predicted and no false stall
predictions occurred. Prediction errors varied from 55 to 70 ms, meaning the stalls were
forecasted to occur later than those in the test data. Because the prediction occurs 75 ms in
advance, 5–20 ms remain before the stall occurs for system intervention.
When the model was trained with more than 16 epochs, predictions became less accurate. In the
32-epoch test case, the model gave two false predictions of stall that did not exist in the
test data, and also failed to predict stall 4. This is shown in Fig. 13, where two false
predictions occur near times 238 s and 261 s, and stall 4 is not predicted at all. The latter
is exemplified in Fig. 14, where a detailed view of stalls 3 and 4 shows that the 16-epoch
model predicted both stalls, while the 32-epoch model did not predict stall 4.
Fig. 14 Comparison of 16 and 32 epoch predictions for stalls 3 and 4 in the test data
8 Conclusion
This paper discusses various techniques for applying machine learning to predict compressor
stall in a 100 kW recuperated gas turbine power system designed for hybrid configuration. It
was determined that an LSTM neural network was the best model to utilize. Testing concluded
that because stalls occur very infrequently relative to nonstalls, a more robust and accurate
model could be created using regression to predict turbine speed rather than classification to
predict the stalls themselves. The highest accuracies were achieved using the LSTM model to
predict turbine speed 15 time-steps (75 ms) ahead with 16 training epochs. This study
concludes that by analyzing data with the stall prediction tool, it is possible to predict
compressor stalls 5–20 ms before they occur. With a high-speed data acquisition system, this
leaves time for preventative action, such as valve changes, to be taken. Specifically,
automated control actions could be implemented in combination with the compressor stall
prediction model to avoid surge during the operation of a gas turbine system. Sensor data from
a gas turbine system can be fed to the stall prediction tool, which is capable of predicting a
stall event 5–20 ms before it begins. After the 5 ms calculation time-step, an automated
controller can open an actuator that will reduce the pressure ratio across the compressor and
increase inlet flow. Subsequently, the surge margin will increase and the stall will be
prevented. This study shows promise for a new generation of tools that utilize machine
learning to predict compressor stall.

9 Future Work
Future work for this project will include a comprehensive study of LSTM hyperparameter
configurations to attempt to further extend model prediction times. Once this work is
completed, the prediction tool will be implemented on the Hyper system to determine the
effectiveness of early intervention on stall reduction and prevention using the Hyper system
time scale. Other interesting work would include training the LSTM model on additional
datasets, including supplementary sensors or valves in the training data and predicting other
parameters beyond turbine speed.

Nomenclature
i = array index
n = number of time-steps to include in differential calculation
N = moving average window size
P = total number of model predictions
ct = candidate cell state
ft = forget gate activation function
ht = hidden state
it = input gate activation function
ot = output gate activation function
Ct = current cell state
FN = number of stall time-steps predicted as nonstall
FP = number of nonstall time-steps predicted as stall
TN = number of nonstall time-steps predicted as nonstall
TP = number of stall time-steps predicted as stall
Vg = weight of previous state
Wg = weight of hidden state
MAω = moving average array for turbine speed
σ = sigmoid squashing function
ω = turbine speed array
∂ω/∂t = turbine speed differential array

References
[1] Fell, H., and Kaffine, D. T., 2018, “The Fall of Coal: Joint Impacts of Fuel Prices and
Renewables on Generation and Emissions,” Am. Econ. J.: Econ. Policy, 10(2), pp. 90–116.
[2] Aghaei, J., and Alizadeh, M.-I., 2013, “Demand Response in Smart Electricity Grids
Equipped With Renewable Energy Sources: A Review,” Renewable Sustainable Energy Rev., 18,
pp. 64–72.
[3] Tavakoli, S., Griffin, I., and Fleming, P., 2004, “An Overview of Compressor
Instabilities: Basic Concepts and Control,” IFAC Proceedings Volumes, 37(6), pp. 523–528.
[4] Almasi, A., 2012, “Latest Techniques and Practical Notes on Anti-Surge Systems for
Centrifugal Compressors,” Aust. J. Mech. Eng., 10(1), pp. 81–90.
[5] De Jager, B., 1995, “Rotating Stall and Surge Control: A Survey,” Proceedings of 1995 34th
IEEE Conference on Decision and Control, New Orleans, LA, Dec. 13–15, IEEE, pp. 1857–1862.
[6] Day, I. J., 1991, “Active Suppression of Rotating Stall and Surge in Axial Compressors,”
ASME 1991 International Gas Turbine and Aeroengine Congress and Exposition, Orlando, FL,
June 3–6, p. V001T01A035.
[7] Tan, C. S., Day, I., Morris, S., and Wadia, A., 2010, “Spike-Type Compressor Stall
Inception, Detection, and Control,” Annu. Rev. Fluid Mech., 42(1), pp. 275–300.
[8] Day, I. J., 2016, “Stall, Surge, and 75 Years of Research,” ASME J. Turbomach., 138(1),
p. 011001.
[9] Gu, G., Sparks, A., and Banda, S. S., 1999, “An Overview of Rotating Stall and Surge
Control for Axial Flow Compressors,” IEEE Trans. Control Syst. Technol., 7(6), pp. 639–647.
[10] Pezzini, P., Tucker, D., and Traverso, A., 2013, “Avoiding Compressor Surge During
Emergency Shutdown Hybrid Turbine Systems,” ASME J. Eng. Gas Turbines Power, 135(10),
p. 102602.
[11] Weigl, H. J., Paduano, J. D., Frechette, L. G., Epstein, A. H., Greitzer, E. M.,
Bright, M. M., and Strazisar, A. J., 1997, “Active Stabilization of Rotating Stall and Surge
in a Transonic Single Stage Axial Compressor,” ASME 1997 International Gas Turbine and
Aeroengine Congress and Exhibition, Orlando, FL, June 2–5, American Society of Mechanical
Engineers, p. V004T15A034.
[12] Paduano, J., Epstein, A. H., Valavani, L., Longley, J. P., Greitzer, E. M., and
Guenette, G. R., 1991, “Active Control of Rotating Stall in a Low Speed Axial Compressor,”
ASME 1991 International Gas Turbine and Aeroengine Congress and Exposition, Orlando, FL,
June 3–6, American Society of Mechanical Engineers, p. V001T01A036.
[13] Moore, F. K., and Greitzer, E. M., 1986, “A Theory of Post-Stall Transients in Axial
Compression Systems: Part I—Development of Equations,” ASME J. Eng. Gas Turbines Power,
108(1), pp. 68–76.
[14] Müller, K.-R., Smola, A. J., Rätsch, G., Schölkopf, B., Kohlmorgen, J., and Vapnik, V.,
1997, “Predicting Time Series With Support Vector Machines,” International Conference on
Artificial Neural Networks, Lausanne, Switzerland, Oct. 8–10, Springer, Berlin, Heidelberg,
pp. 999–1004.
[15] Kolarik, T., and Rudorfer, G., 1994, “Time Series Forecasting Using Neural Networks,”
ACM SIGAPL APL Quote Quad, 25(1), pp. 86–94.
[16] Malhotra, P., Vig, L., Shroff, G., and Agarwal, P., 2015, “Long Short Term Memory
Networks for Anomaly Detection in Time Series,” Proceedings, Bruges, Belgium, Apr. 22–24,
p. 89.
[17] Tucker, D., Pezzini, P., and Bryden, K. M., 2018, “Cyber-Physical Systems: A New Paradigm
for Energy Technology Development,” ASME 2018 Power Conference Collocated With the ASME 2018
12th International Conference on Energy Sustainability and the ASME 2018 Nuclear Forum, Lake
Buena Vista, FL, June 24–28, American Society of Mechanical Engineers, p. V001T04A001.
[18] Pezzini, P., Bryden, K. M., and Tucker, D., 2018, “Multicoordination Control Strategy
Performance in Hybrid Power Systems,” ASME J. Electrochem. Energy Convers. Storage, 15(3),
p. 031007.
[19] Tucker, D., Shadle, L., and Harun, N. F., 2017, “Automated Compressor Surge Recovery With
Cold Air Bypass in Gas Turbine Based Hybrid Systems,” International Symposium on Transport
Phenomena and Dynamics of Rotating Machinery, Maui, HI, Dec. 16–21.
[20] Hochreiter, S., and Schmidhuber, J., 1997, “Long Short-Term Memory,” Neural Comput.,
9(8), pp. 1735–1780.
[21] Chollet, F., 2015, Keras, https://keras.io