LSTM-MPC: A Deep Learning Based Predictive Control Method for Multimode Process Control
Abstract: Modern industrial processes often operate under different modes, which brings challenges to model predictive control (MPC). Most recent MPC-related methods establish prediction models independently for different modes, so their control effect relies heavily on switching strategies. Inspired by the powerful representation capabilities of deep learning, this article proposes a deep learning based MPC method. Specifically, an LSTM network is applied to predict the behavior of the controlled system, which can automatically match different operation modes without a switching strategy. Then, combined with the MPC framework, an adaptive gradient descent method is introduced to handle the optimization problem and its constraints. In addition, stability and feasibility analyses have been conducted theoretically to support practical application of the proposed method. Experiments on a numerical simulation process and an industrial process platform show the strength and reliability of the proposed method, which reduces the overshoot by about 10% compared to common learning-based MPC methods and improves the control accuracy effectively.

Index Terms: Deep learning, long short-term memory network, model predictive control (MPC), multimode process.

I. INTRODUCTION

Model predictive control (MPC) is an efficient control method that is widely applied in industrial processes. Generally, MPC includes three fundamental elements: the predictive model of the dynamics of the controlled system, the reference trajectory, and the optimal controller obtained by rolling optimization [1], [2], [3], [4]. Although MPC has been proven efficient in industrial plants, it still has some limitations. Since industrial processes are becoming large-scale, complicated, and highly coupled, various operation modes appear due to different system parameters or structures [5], [6]. Take a zinc roasting process as an example: according to field workers' experience, different production indexes like production load and market demands may lead to various operation modes and bring a multimode control problem [7], [8]. For each mode, the industrial process has different operating parameters, and if the control strategy fails to match the corresponding mode, it will reduce the stability of the system and may even cause serious production accidents and losses. Therefore, how to improve MPC to solve the multimode control problem is becoming a key research focus [9].

Currently, some improved MPC methods that aim to deal with multimode processes are called multimodel MPC or switching MPC [10], [11]. Their goal is to establish a model for each operation mode and to design a model switching strategy. Li et al. combined Takagi–Sugeno fuzzy models with a multimodel framework to achieve stable control for different operation modes [12]. Tang et al. proposed a multimodel neural network based MPC for a nonlinear pH neutralization process [13]. As for the model switching strategy, Wan et al. designed a soft-switching scheme for an autonomous underwater vehicle to get better control performance [14]. Although these methods have achieved some success, they still have two drawbacks. First, since these methods establish a separate model for each operation mode, they may cost more resources as the number of operation modes increases. Second, their control effect depends heavily on the model switching strategy. In real industrial processes, it is difficult to distinguish every mode accurately due to the complex environment, which further degrades the control performance and robustness. Since the prediction model in MPC largely determines the final control effect [15], one way to overcome the abovementioned limitations is to design more accurate predictive models for multimode processes while reducing the dependence of the control effect on switching strategies.

Deep learning, as a typical automatic feature extraction method, has attracted more and more attention recently. Generally, when modeling with Big Data containing complex features, deep learning performs better than shallow learning methods [16]. Based on the powerful representation capability of deep learning, some pioneering works combine deep learning with the MPC framework and come up with DeepMPC [17], [18]. Lucia et al. used deep neural networks (DNN) to extract the changing characteristics of resonant power converters and successfully deployed the control method to industrial sites with

Manuscript received 16 August 2022; revised 26 October 2022; accepted 4 December 2022. Date of publication 20 December 2022; date of current version 8 May 2023. This work was supported in part by the National Natural Science Foundation of China under Grant 62073340 and Grant 61860206014, in part by the Major Key Project of Peng Cheng Laboratory under Grant PCL2021A09, in part by the National Key R&D Program of China under Grant 2022YFB3304900 and Grant 2019YFB1705300, and in part by the Science and Technology Innovation Program of Hunan Province under Grant 2022JJ10083, Grant 2021RC3018, and Grant 2021RC4054. (Corresponding author: Chunhua Yang.)
The authors are with the School of Automation, Central South University, Changsha 410083, China (e-mail: [email protected]; [email protected]; [email protected]; [email protected]; gwh@csu.edu.cn).
Color versions of one or more figures in this article are available at https://doi.org/10.1109/TIE.2022.3229323.
Digital Object Identifier 10.1109/TIE.2022.3229323
0278-0046 © 2022 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission.
See https://www.ieee.org/publications/rights/index.html for more information.
Authorized licensed use limited to: University of Science & Technology of China. Downloaded on July 08,2023 at 13:24:20 UTC from IEEE Xplore. Restrictions apply.
HUANG et al.: LSTM-MPC: A DEEP LEARNING BASED PREDICTIVE CONTROL METHOD 11545
11546 IEEE TRANSACTIONS ON INDUSTRIAL ELECTRONICS, VOL. 70, NO. 11, NOVEMBER 2023
Through these two gates, the state of the memory cell can be updated as follows:

$$C_t = f_t \odot C_{t-1} + i_t \odot \hat{C}_t. \tag{5}$$

Finally, the output gate is used to filter the information to obtain the network output $y_t$:

$$o_t = \delta\left(W_o \cdot [h_{t-1}, x_t] + b_o\right) \tag{6}$$

$$h_t = o_t \odot \tanh(C_t) \tag{7}$$

$$y_t = \varphi\left(W_y h_t + b_y\right) \tag{8}$$

where $C_t$ is the cell state, $\hat{C}_t$ denotes the updated new state, $h_t$ represents the hidden layer state, $W_i$, $W_f$, $W_c$, $W_o$, $W_y$ denote weight matrices, and $b_i$, $b_f$, $b_c$, $b_o$, $b_y$ stand for bias vectors. These parameters are determined by backpropagation during LSTM network training. Besides, $\delta(\cdot)$ and $\tanh(\cdot)$ represent the sigmoid function and the hyperbolic tangent function, respectively, $\odot$ is the elementwise product of vectors, and $\varphi$ is the network output activation function.

Motivated by the strong predictive ability of LSTM, the prediction model in the proposed control method was established through the following steps. First, design the loss function. Since the LSTM network applies multistep states to predict the system output, its loss function can be formulated as follows:

$$L = \left\| R(t) - \hat{Y}(t) \right\|_2^2 \tag{9}$$

where $R(t) = [r(t+1), r(t+2), \ldots, r(t+T_p)]$ is the reference output, $\hat{Y}(t) = [\hat{y}(t+1), \hat{y}(t+2), \ldots, \hat{y}(t+T_p)]$ is the predictive output, and $T_p$ represents the prediction horizon. Then, determine the form of the network input. According to the controlled system formulated in (1), the input sequence comes from two parts, previous system states and previous manipulated variables, that is, $x_t = [U^p(t), Y^p(t)] = [u(t), u(t-1), \ldots, u(t-l_u), y(t), y(t-1), \ldots, y(t-l_y)]$. Finally, update the network weights with training samples through backpropagation through time [25].

Compared to traditional predictive methods, the LSTM-based predictive model has two main advantages. First, the LSTM network has a unique memory cell that can extract sequential features from time series, which accords with the feedback control feature of MPC and thus lays a foundation for accurate multimode prediction. Second, unlike traditional predictive methods such as ARIMA, the order of the system has less impact on the prediction effect of the LSTM network, which enhances the generalization ability of the proposed predictive model. Therefore, the LSTM network is chosen as the prediction model for multimode processes in this article [15], [18].

C. Online Optimal Control

The goal of MPC is to find a suitable control signal $u(t)$ through online optimization so that the output of the system tracks the reference trajectory $r(t)$ as closely as possible. In this article, given prediction and control horizons $T_p$ and $T_c$, the optimization problem can be formulated as follows:

$$
\begin{aligned}
\min_{U(t)} J(t) = \min_{U(t)}\ & \left(R(t) - \hat{Y}(t)\right)^T a \left(R(t) - \hat{Y}(t)\right) + \Delta U(t)^T b\, \Delta U(t) \\
\text{subject to}\quad & \hat{y}(t) = g\left(U^p(t), Y^p(t)\right) \\
& |\Delta u(t)| \le \Delta u_{\max} \\
& u_{\min} \le u(t) \le u_{\max} \\
& \hat{y}_{\min} \le \hat{y}(t) \le \hat{y}_{\max}
\end{aligned} \tag{10}
$$

where $a$ and $b$ are weight parameters, the historical input and output are $U^p(t) = [u(t), u(t-1), \ldots, u(t-l_u)]$ and $Y^p(t) = [y(t), y(t-1), \ldots, y(t-l_y)]$, $R(t) = [r(t+1), r(t+2), \ldots, r(t+T_p)]$ is the reference output, $\hat{Y}(t) = [\hat{y}(t+1), \hat{y}(t+2), \ldots, \hat{y}(t+T_p)]$ is the predictive output, $U(t) = [u(t), u(t+1), \ldots, u(t+T_c-1)]$ is the optimal control input, $\Delta U(t) = [\Delta u(t), \Delta u(t+1), \ldots, \Delta u(t+T_c-1)]$ contains the incremental control moves, and $g(\cdot)$ represents the predictive model based on the LSTM network, whose expression follows from (6)–(8):

$$g\left(U^p(t), Y^p(t)\right) = g(x_t) = W_y \left[ \frac{1}{1 + e^{-(W_o [h_{t-1}, x_t] + b_o)}} \odot \frac{e^{C_t} - e^{-C_t}}{e^{C_t} + e^{-C_t}} \right] + b_y. \tag{11}$$

Since the proposed predictive method is based on deep learning, its mathematical expression cannot be represented directly, which makes optimization problem (10) challenging to solve. In order to reduce the number of iterations and find the optimal result quickly, the GD method is chosen to solve the optimization problem [15]:

$$U_{k+1}(t) = U_k(t) + \Delta U_k(t) \tag{12}$$

$$\Delta U_k(t) = \eta_1 \left( -\frac{\partial J(t)}{\partial U_k(t)} \right) \tag{13}$$

where $\eta_1 > 0$ is the learning rate and $k$ is the iteration index. According to (10), the derivative of the objective function $J(t)$ can be rewritten as

$$\frac{\partial J(t)}{\partial U_k(t)} = -a \left( \frac{\partial \hat{Y}(t)}{\partial U_k(t)} \right)^T \left( R(t) - \hat{Y}(t) \right) + b\, \Delta U_k(t). \tag{14}$$

Thus, (13) can be formulated as follows:

$$
\begin{aligned}
\Delta U_k(t) &= \eta_1 \left( a \left( \frac{\partial \hat{Y}(t)}{\partial U_k(t)} \right)^T \left( R(t) - \hat{Y}(t) \right) - b\, \Delta U_k(t) \right) \\
&= \frac{1}{1 + \eta_1 b}\, \eta_1 a \left( \frac{\partial \hat{Y}(t)}{\partial U_k(t)} \right)^T \left( R(t) - \hat{Y}(t) \right).
\end{aligned} \tag{15}
$$

It can be found that the key to solving the optimization problem is to calculate the derivative of the prediction model's output, that is, the Jacobian matrix $\partial \hat{Y}(t) / \partial U_k(t)$. Since the activation functions used in the LSTM network are the sigmoid function and the hyperbolic tangent function, which are continuous and differentiable, it is feasible to calculate the Jacobian matrix of the proposed prediction model.
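The update in (12) and (15) can be illustrated with a minimal Python sketch. It is not the authors' implementation: `predict` is a hypothetical smooth stand-in for the LSTM predictor $g(\cdot)$, the Jacobian is estimated by central finite differences rather than by backpropagation through an LSTM, the control horizon is a single move, and the constraints in (10) are ignored. The values of `a`, `b`, and `eta` are illustrative assumptions.

```python
import math


def predict(u, y_prev):
    """Hypothetical smooth stand-in for g(.): returns [y_hat(t+1), y_hat(t+2)]."""
    y1 = 0.5 * y_prev + math.tanh(1.5 * u)
    y2 = 0.5 * y1 + math.tanh(1.5 * u)
    return [y1, y2]


def jacobian(u, y_prev, eps=1e-6):
    """Central finite-difference estimate of dY_hat/du, one entry per predicted step."""
    hi = predict(u + eps, y_prev)
    lo = predict(u - eps, y_prev)
    return [(h - l) / (2.0 * eps) for h, l in zip(hi, lo)]


def gd_step(u, y_prev, ref, a=1.0, b=1.0, eta=0.1):
    """One iteration of (12) with the closed-form increment of (15)."""
    y_hat = predict(u, y_prev)
    jac = jacobian(u, y_prev)
    # J^T (R - Y_hat) for a scalar control move
    jt_err = sum(j * (r - y) for j, r, y in zip(jac, ref, y_hat))
    du = eta * a * jt_err / (1.0 + eta * b)  # equation (15)
    return u + du, du                         # equation (12)


def tracking_cost(u, y_prev, ref, a=1.0):
    """Tracking part of the objective in (10): a * ||R - Y_hat||^2."""
    return a * sum((r - y) ** 2 for r, y in zip(ref, predict(u, y_prev)))


if __name__ == "__main__":
    u, ref = 0.1, [0.6, 0.6]
    for k in range(50):
        u, du = gd_step(u, 0.0, ref)
    print("final u:", u, "last increment:", du)
```

In this toy setting the increment shrinks toward zero as $J^T (R - \hat{Y})$ vanishes, which is the stationarity condition implied by (15).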
where $\alpha$ is the decay rate of each iteration and $k$ is the number of iterations.

By solving the optimization problem, the optimal control input sequence $U(t)$ can be obtained. Then, the first element of $U(t)$ is applied to the system as the control signal so that the output of the system can track the reference trajectory accurately in different modes. The proposed method is summarized in Algorithm 1, where $J_{set}$ represents the optimization termination threshold, which is designed according to the characteristics of the controlled system and set as $J_{set} = 10^{-4}$ in this article.

IV. CONVERGENCE AND STABILITY ANALYSIS

When it comes to practical industrial processes, stability is an important criterion for judging whether a control method performs well [27]. MPC is widely used for two reasons. First, its predictive model converges during the training stage, so that the dynamic features of the controlled system can be fully extracted. Second, it can ensure that control law

$$g_1(y(t), u(t)) = W^* \theta(y(t), u(t)) = \dot{y}(t) + y(t) \tag{21}$$

$$g_2(y(t), u(t)) = W \theta(y(t), u(t)) = \dot{\hat{y}}(t) + \hat{y}(t) \tag{22}$$

where $g_1(\cdot)$ and $g_2(\cdot)$ are nonlinear functions, $u(t)$ is the manipulated variable, $W^*$ stands for the ideal network weight matrix, $W$ represents the real network weight matrix during the training process, and $\theta(\cdot)$ denotes the activation function. For the theoretical proof, several assumptions are introduced as follows:
1) The training samples are bounded sequences.
2) The weight matrix $W$ is bounded.
3) There exists an optimal constant weight matrix $W^*$.

Lemma 1: If assumptions 1)–3) are valid, then we have

$$\| W^* - W \| < \alpha \tag{23}$$

$$\| g_1(y(t), u(t)) - g_2(y(t), u(t)) \| = \| e(t) \| \le e_f \tag{24}$$

where $e_f > 0$, $\alpha > 0$, and $\|\cdot\|$ denotes the Euclidean distance.
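One way to see why a bound like (24) is plausible, given (21)–(23), is that $g_1 - g_2 = (W^* - W)\theta$, so the output gap is controlled by the weight gap times the size of the (bounded) activation vector. The short Python check below illustrates this inequality numerically. It is only a sketch under stated assumptions, not the paper's proof: the matrices and inputs are made-up values, and the matrix norm is taken to be the Frobenius norm.

```python
import math


def matvec(W, v):
    """Matrix-vector product for a list-of-rows matrix."""
    return [sum(w * x for w, x in zip(row, v)) for row in W]


def vec_norm(v):
    """Euclidean norm of a vector."""
    return math.sqrt(sum(x * x for x in v))


def frob_norm(W):
    """Frobenius norm of a matrix (assumed matrix norm for this sketch)."""
    return math.sqrt(sum(w * w for row in W for w in row))


def output_gap(W_star, W, theta):
    """Return (||g1 - g2||, ||W* - W||_F * ||theta||) with g1 = W* theta, g2 = W theta."""
    diff = [[a - b for a, b in zip(ra, rb)] for ra, rb in zip(W_star, W)]
    e = matvec(diff, theta)
    return vec_norm(e), frob_norm(diff) * vec_norm(theta)


if __name__ == "__main__":
    W_star = [[0.8, -0.3], [0.1, 0.5]]    # hypothetical ideal weights
    W = [[0.7, -0.2], [0.15, 0.45]]       # hypothetical trained weights
    theta = [math.tanh(0.4), math.tanh(-1.2)]  # bounded activations
    gap, bound = output_gap(W_star, W, theta)
    print("gap:", gap, "bound:", bound)
```

Because $\tanh$ and the sigmoid are bounded, $\|\theta\|$ is bounded, so a bounded weight gap as in (23) yields a bounded output gap in this simplified setting.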
where the index pair $(i^*, t^*)$ is the minimizer, whose expression is formulated as

$$(i^*, t^*) = \arg\min_{(i,t) \in F^j(x)} C^i_{t \to \infty}(x), \quad \text{for } x \in SS^j. \tag{36}$$

$$U_j(t) = [u_j(t|t), \ldots, u_j(t+T_c-1|t)]. \tag{37}$$

Similarly, the optimal control law sequence at time $t+1$ of the $j$th iteration is

$$U_j(t+1) = [u_j(t+1|t+1), \ldots, u_j(t+T_c|t+1)]. \tag{38}$$

The goal is to find a suitable way to represent $U_j(t+1)$ with the elements in $U_j(t)$ to ensure recursive feasibility according to the theory of set invariance [30]. We assume $T_c = N$, so the length of the optimal control law sequence is equal to $N$. Under the MPC framework, only the first element in $U_j(t)$ is applied to the controlled system, and the others are regarded as feasible optimal control laws for later time steps. Therefore, the first $N-1$ elements in $U_j(t+1)$ can be transferred as follows to ensure feasibility:

$$u_j(t+i|t+1) = u_j(t+i|t), \quad i \in [1, N-1]. \tag{39}$$

Since $\hat{y}_j(t+N|t) \in SS^j$, according to (33), we can obtain

$$Q^j\left(\hat{y}_j(t+N|t)\right) = C^{i^*}_{t^* \to \infty}\left(\hat{y}_j(t+N|t)\right) = \sum_{k=t^*}^{\infty} J\left(u_{i^*}(k), \hat{y}_{i^*}(k)\right). \tag{40}$$

From the definitions of $Q^j(\cdot)$ and $F^j(x)$, we have $\hat{y}_j(t+N|t) = \hat{y}_{i^*}(t^*)$, and the last element in $U_j(t+1)$ can be rewritten as $u_j(t+T_c|t+1) = u_{i^*}(t^*)$ [29]. Therefore, $U_j(t+1)$ can be represented with elements in $U_j(t)$, which means that if the proposed method is feasible at time $t$, then it is also feasible at time $t+1$.

Through the above analysis, the solution obtained by the proposed method is recursively feasible.

V. CASE STUDY

In this part, a numerical simulation process and an industrial process simulation platform are designed to verify whether the proposed method has practical application value. To demonstrate the strengths of the proposed method, DNN-MPC [19] and Lyapunov-based MPC (LMPC) [31] are selected as comparison methods. The performance is measured using root mean-square error (RMSE),

where $r_i$ represents the reference trajectory, $\hat{y}_i$ stands for the model prediction output, $y_{\max}$ is the maximum output, $y(\infty)$ denotes the steady state of the output, and $\Delta u(i)$ is the change of the manipulated variable.

A. Numerical Simulation

In this section, the Hammerstein system [32] is selected to study the performance of each control method. To construct a multimode system, the original expression has been modified as

$$
\begin{cases}
x(k) = a u(k) - b u(k)^2 + 0.5 u(k)^3 \\
y(k+1) = 0.6 y(k) - 0.1 y(k-1) + 1.2 x(k) - 0.1 x(k-1) + v(k)
\end{cases} \tag{42}
$$

where $v(k)$ stands for Gaussian white noise with mean 0 and standard deviation 0.01. In this article, we mainly focus on different operation modes caused by different system parameters. Therefore, three modes of the system are established by changing the system parameters, as shown in Table I.

TABLE I. MODE DISTRIBUTION OF HAMMERSTEIN SYSTEM

During the experiment, the range of the system input is $u \in (0, 1)$. Grid search is used to fine-tune the LSTM network parameters for better prediction accuracy, and the parameters used in the experiment are as follows: the number of hidden nodes in the network is 16, the time step is 2, the batch size is 128, and the number of iterations is 200. As for the proposed multimode MPC, its hyperparameters are $a = 1$, $b = 1$, prediction horizon $T_p = 2$, and control horizon $T_c = 1$.

1) Prediction Effect: Since the prediction model is one of the essential parts of MPC, we first conduct a comparison experiment on it. In this part, the LSTM network and DNN are compared, since LMPC mainly focuses on how to combine Lyapunov constraints with the MPC framework and its prediction model is the same as that of DNN-MPC. The experimental result can be found in Fig. 5. It indicates that the LSTM network has a better prediction effect, since it can extract the features of each operation mode and automatically identify the difference between modes to achieve accurate dynamic tracking of the controlled system.

2) Optimization Effect: Another important procedure of MPC is rolling optimization, which tries to find the best control strategy at each iteration. Thus, for each comparison
method, the value of the objective function has been collected to study the optimization effect, as shown in Fig. 6. Compared to the other two methods, the proposed method has two advantages. First, when the optimization procedure is done, the value of the objective function of the proposed method is much smaller than that of the other methods, leading to a better control effect. Second, when the operation mode switches, the proposed method has less fluctuation and can eliminate the impact of different modes immediately, ensuring that the optimization result is always in the best state.

Fig. 6. Value of objective function during optimization procedure of each compared method. (a) LSTM-MPC. (b) DNN-MPC. (c) LMPC.

3) Control Effect: We choose DNN-MPC and LMPC as comparison methods to show the superiority of the proposed method for multimode processes, and the setting value for the three modes is 0.6. The control results of all methods are shown in Fig. 7 and Table II. It can be found that, affected by noise and mode changes, the LMPC control method causes system fluctuation and cannot stabilize the system output near the set value. Although DNN-MPC can realize stable control, it has a larger steady-state error due to its insufficient capacity to predict the controlled system dynamics and find the optimal control law for multimode processes, as shown in Figs. 5 and 6, so it cannot meet the demand of precise control. The proposed method applies the LSTM-based predictive model to accurately predict the system output under different modes, which lays a foundation for precise control. Meanwhile, an improved GD method is used to solve the control optimization problem, which meets the real-time demand.

TABLE II. CONTROL RESULTS OF NUMERICAL SIMULATION PROCESS

4) Computational Efficiency: Another aspect for evaluating the proposed method is computational efficiency, which can be analyzed from two aspects: computational complexity and time cost. For the proposed method, if the control input and system output dimensions are $n_u$ and $n_y$, the computational complexity of the proposed GD method is $O\left(T_p^2 + T_c^2 (n_u + n_y)^3\right)$ [27]. Therefore, when applying the proposed method to a real industrial system, some parameters like the prediction horizon $T_p$ can be adjusted to reduce the overall computational complexity while ensuring the performance of the method.

In order to compare the time cost of each method, another comparison experiment on time cost has been conducted. From Fig. 8, it can be found the time cost of the
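The modified Hammerstein system in (42) can be simulated in a few lines, which may help when reproducing the numerical case study. The sketch below is a hedged illustration: the mode parameters `(a, b)` passed per step are hypothetical placeholders because the values of Table I are not reproduced here, the initial conditions are assumed to be zero, and the function name `simulate_hammerstein` is introduced only for this example.

```python
import random


def simulate_hammerstein(u_seq, ab_seq, noise_std=0.01, seed=0):
    """Simulate (42) for inputs u_seq with per-step mode parameters ab_seq = [(a, b), ...].

    x(k)   = a*u(k) - b*u(k)^2 + 0.5*u(k)^3
    y(k+1) = 0.6*y(k) - 0.1*y(k-1) + 1.2*x(k) - 0.1*x(k-1) + v(k)
    Initial conditions y(0) = y(-1) = x(-1) = 0 are an assumption of this sketch.
    """
    rng = random.Random(seed)
    y_hist = [0.0, 0.0]  # [y(k-1), y(k)]
    x_prev = 0.0
    outputs = []
    for u, (a, b) in zip(u_seq, ab_seq):
        x = a * u - b * u ** 2 + 0.5 * u ** 3
        v = rng.gauss(0.0, noise_std)  # Gaussian white noise v(k)
        y_next = 0.6 * y_hist[1] - 0.1 * y_hist[0] + 1.2 * x - 0.1 * x_prev + v
        outputs.append(y_next)
        y_hist = [y_hist[1], y_next]
        x_prev = x
    return outputs


if __name__ == "__main__":
    # Hypothetical three-mode schedule; these (a, b) pairs are NOT the Table I values.
    modes = [(1.0, 1.0)] * 100 + [(1.2, 0.8)] * 100 + [(0.9, 1.1)] * 100
    u = [0.5] * 300
    y = simulate_hammerstein(u, modes)
    print("last output in each mode:", y[99], y[199], y[299])
```

Changing `(a, b)` mid-run shifts the static nonlinearity while the linear dynamics stay fixed, which is the kind of parameter-induced mode change the section describes.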
TABLE III
MODE DISTRIBUTION OF CSTR PROCESS
Fanbiao Li received the B.Sc. degree in applied mathematics from Mudanjiang Normal University, Mudanjiang, China, in 2008, the M.Sc. degree in operational research and cybernetics from Heilongjiang University, Harbin, China, in 2012, and the Ph.D. degree in control theory and control engineering from the Harbin Institute of Technology, Harbin, in 2015.
In 2016, he joined as an Associate Professor with Central South University, China. His research interests include stochastic systems, sliding mode control, and fault diagnosis and identification.

Weihua Gui received the B.Eng. degree in electrical engineering and the M.S. degree in automatic control engineering from Central South University, Changsha, China, in 1976 and 1981, respectively.
Since 2013, he has been an Academician of the Chinese Academy of Engineering. He is currently with the School of Automation, Central South University. His current research interests include modeling and optimal control of complex industrial processes, fault diagnoses, and distributed robust control.