An Echo State Gaussian Process-Based Nonlinear Model Predictive Control For Pneumatic Muscle Actuators
Authorized licensed use limited to: MANIPAL INSTITUTE OF TECHNOLOGY. Downloaded on March 29,2023 at 13:09:11 UTC from IEEE Xplore. Restrictions apply.
1072 IEEE TRANSACTIONS ON AUTOMATION SCIENCE AND ENGINEERING, VOL. 16, NO. 3, JULY 2019
deficiencies of the existing approaches, we propose an efficient strategy that is easy to implement and has sound theoretical support, in order to obtain satisfactory performance in high-precision tracking tasks of PMAs.

Model predictive control (MPC) is a powerful technique for optimizing the performance of control systems and is widely recognized in both academia and industry [15], [16]. Conventionally, MPC has mostly been used in industrial processes because it is easy to implement [17]. In the past, its high computational cost made it difficult to use in motion control, which requires a high sampling frequency. With the rapid development of hardware, MPC is starting to be applied in the field of motion control [18]–[20], so that motion controllers can be designed efficiently and implemented easily. Taking the current states as initial states, an optimal control sequence is generated by solving a constrained finite-horizon optimal control problem at each sample time, and the first element of this sequence is applied to the plant [16]. This demands a reasonably accurate model that captures the dynamics of the plant. Nonetheless, many applications are based on linear models, which leads to poor control performance for highly nonlinear processes. One way to deal with this problem is to linearize the nonlinear plant with a Taylor series expansion and estimate the unknown higher order term to compensate for the nonlinearity [21], [22]. An alternative is to model the nonlinear plant directly for nonlinear MPC (NMPC), in which case the optimal values of the control signal are determined numerically. An important aspect of a practical, real-time capable MPC implementation is therefore the use of an efficient algorithm [23], [24]. Generally, it is difficult to obtain accurate model parameters when identifying nonlinear physical systems. Fortunately, with the great progress of artificial intelligence, neural networks have been widely used to approximate unknown systems because of their excellent capability for modeling uncertain and nonlinear systems. This provides an alternative route to the design of model-free NMPC, and neural network-based NMPC (NN-NMPC) schemes have gained acceptance. Different applications of NN-NMPC have been investigated in [24] and [25]. At the same time, accurately describing an unknown, complex system remains a challenging problem.

At present, various methods can approximate the dynamics of a nonlinear system. Common function approximators include multilayer perceptrons (MLPs) [26], fuzzy logic systems [27], and recurrent neural networks (RNNs) [28]. However, MLPs are limited in the sense that they can only provide a static map between inputs and outputs (i.e., they have no way of internally representing the dynamics of a nonlinear system). In fuzzy control, the tuning of the fuzzy rules and membership functions depends strongly on experience and is time-consuming. For RNNs, the universal approximation property makes them particularly suitable for modeling dynamical systems with arbitrary precision [29]. Hence, they are more applicable to nonlinear system modeling and control. Despite the potential and capability of RNNs, their main problems are the difficulty of training them and the complexity and slow convergence of the training algorithms. The echo state network (ESN) is a newer kind of RNN characterized by its ability to map a temporal input history uniquely to an echo state [30]. In this way, the high computational complexity of the conventional RNN is significantly reduced owing to the sparse connections among the hidden neurons. In addition, learning is reduced to only the weights connecting the hidden layer and the readout neuron, and this simple structure makes the ESN more applicable to physical systems.

However, when the testing data deviate slightly from the training data, the ESN may be ill-posed and accompanied by large output weights, which weakens its generalization capability [31]. One solution to this problem is an extension of the ESN called the echo state Gaussian process (ESGP), which fuses the ESN with Bayesian inference for GPs. The ESGP model generalizes the conventional ESN, trained by means of ridge or linear regression, under a Bayesian perspective, and includes those variants as special subcases [32]. As a result, the ESGP combines the merits of both the ESN and the GP. The ESGP provides not only the prediction but also a measure of the model uncertainty. In addition, in contrast to the high computational complexity of conventional neural networks, the ESGP can be trained by type-II maximum likelihood, which turns out to be computationally efficient. Although the ESGP was introduced years ago, there are few studies of its application in control systems, except for [33].

In this paper, an ESGP-NMPC strategy is developed for stabilizing the transformed model of the tracking error dynamics of a PMA plant. The main contributions of this paper are: 1) the proposed control strategy, called ESGP-NMPC, for high-precision tracking tasks of PMAs; 2) the stability analysis of the closed-loop system, which depends on the reference trajectory; and 3) the experimental studies validating the proposed control approach.

The rest of this paper is organized as follows. In Section II, the three-element model of the PMA is presented, and the architectures and formulations of the ESN and the ESGP are stated. In Section III, the overall NMPC procedure is described, with a detailed derivation of the ESGP differential coefficients. Then, in Section IV, the stability of the proposed ESGP-NMPC is analyzed. Simulations and practical experiments are presented in Sections V and VI to demonstrate the effectiveness and robustness of the strategy. The conclusions are given in Section VII.

II. MODEL FORMULATION AND PRELIMINARIES

A. Model Formulation

A generalized three-element model of the PMA is often assumed in its control applications [10]. Fig. 1 shows the PMA placed in a vertical position with a load attached to the bottom. As the air pressure in the inner bladder changes, the contractile length of the muscle changes correspondingly. When the rubber bladder expands, its diameter increases in the radial direction and, simultaneously, its length decreases in the axial direction. In this way, a force is generated in the axial direction. The dynamic behaviors of a PMA hanging vertically
HUANG et al.: ESGP-NMPC FOR PMAs 1073
$$y(t) = f_u\big([u_c(t-1),\; y(t-1),\; y(t-2)]^T\big) \tag{15}$$

$$\mathbf{x}(t+1) = f\big[\mathbf{W}^{in}\mathbf{u}(t+1) + \mathbf{W}\mathbf{x}(t) + \mathbf{W}^{back} y_m(t)\big]. \tag{17}$$
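The reservoir update in (17) can be sketched in a few lines of NumPy. This is only an illustration under stated assumptions: the helper names `make_reservoir` and `esn_step`, the uniform weight ranges, the 20% connection density, the spectral radius of 0.9, the tanh activation, and the sample inputs are all hypothetical choices that do not come from the paper. The input vector follows the regressor layout of (15).

```python
import numpy as np

def make_reservoir(n_inputs=3, n_neurons=20, density=0.2, radius=0.9, seed=0):
    """Random sparse reservoir; every numeric choice here is illustrative."""
    rng = np.random.default_rng(seed)
    W_in = rng.uniform(-0.5, 0.5, size=(n_neurons, n_inputs))
    W = rng.uniform(-0.5, 0.5, size=(n_neurons, n_neurons))
    W = W * (rng.random((n_neurons, n_neurons)) < density)   # keep about 20% of the links
    W = W * (radius / np.max(np.abs(np.linalg.eigvals(W))))  # rescale the spectral radius
    W_back = rng.uniform(-0.5, 0.5, size=n_neurons)          # single output neuron
    return W_in, W, W_back

def esn_step(x, u_next, y_m, W_in, W, W_back, f=np.tanh):
    """x(t+1) = f(W_in u(t+1) + W x(t) + W_back y_m(t)), following (17)."""
    return f(W_in @ u_next + W @ x + W_back * y_m)

W_in, W, W_back = make_reservoir()
x = np.zeros(20)
# Made-up samples of [u_c(t-1), y(t-1), y(t-2)]; y_m is a placeholder model output.
for u_c, y1, y2 in [(0.1, 0.0, 0.0), (0.2, 0.01, 0.0), (0.3, 0.03, 0.01)]:
    u_next = np.array([u_c, y1, y2])
    x = esn_step(x, u_next, y_m=y1, W_in=W_in, W=W, W_back=W_back)
print(x.shape)  # (20,)
```

Only the state update is shown; the trained readout that maps the state to the model output is left out.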
To compensate for the error between the prediction and the plant’s actual output, the compensation term $y(t) - y_m(t)$ is also computed. Combining the compensation and the prediction, the actual predicted output can be described as

$$\hat{y}(t+i) = y_m(t+i) + \big(y(t) - y_m(t)\big) \tag{18}$$

where $y_m$, $\hat{y}$, and $y$ represent the predicted output of the model, the actual predicted output, and the output of the system, respectively. As soon as $\hat{y}$ is calculated, the error between this signal and the reference $r(t)$ is fed into the NMPC controller to obtain the control signal.

B. Nonlinear Model Predictive Control Scheme

The basic idea of NMPC is that the current control inputs are chosen to minimize a cost function over several steps in the future for stable control. The formulation based on constrained finite-horizon optimization can be described as

$$J(t) = \sum_{i=1}^{N_p} [r(t+i) - \hat{y}(t+i)]\, W_i^{y}\, [r(t+i) - \hat{y}(t+i)] + \sum_{j=1}^{N_u} \Delta u_c(t+j-1)\, W_j^{u}\, \Delta u_c(t+j-1) \tag{19}$$

subject to

$$|\Delta u_c| < \Delta u_{c\max}$$
$$u_{c\min} \le u_c(t) \le u_{c\max}$$
$$\hat{y}_{\min} \le \hat{y}(t) \le \hat{y}_{\max}$$

where $r$ and $u_c$ represent the reference and the control input, $\Delta u_c$ is the incremental control signal, $N_p$ is the prediction horizon, $N_u$ is the control horizon, $N_u$ must be smaller than $N_p$ ($N_u < N_p$), and $W_i^{y}$ and $W_j^{u}$ are the weight parameters. According to the MPC scheme, the iterative optimization is repeated at each time step over the finite prediction horizon $N_p$, and its solution yields an optimal control sequence.

C. Optimization Method

To minimize the cost function (19) over several steps in the future, the PMA system is utilized. The input air pressure and the PMA’s contractile length serve as the input and the output of the system. The following vectors are defined to describe the MPC scheme:

$$\mathbf{r}(t) = [r(t+1), \ldots, r(t+N_p)]^T$$
$$\hat{\mathbf{y}}(t) = [\hat{y}(t+1), \ldots, \hat{y}(t+N_p)]^T$$
$$\mathbf{e}(t) = [e(t+1), \ldots, e(t+N_p)]^T$$
$$\mathbf{u}_c(t) = [u_c(t+1), \ldots, u_c(t+N_u)]^T$$
$$\Delta\mathbf{u}_c(t) = [\Delta u_c(t), \ldots, \Delta u_c(t+N_u-1)]^T.$$

Then, the cost function of NMPC can be rewritten as

$$J(t) = \mathbf{e}(t)^T \mathbf{W}^{y} \mathbf{e}(t) + \Delta\mathbf{u}_c(t)^T \mathbf{W}^{u} \Delta\mathbf{u}_c(t) \tag{20}$$

where $\mathbf{e}(t) = \mathbf{r}(t) - \hat{\mathbf{y}}(t)$. In practice, for simplicity, the weight parameters can be chosen as $\mathbf{W}^{y} = \rho_y \mathbf{I}$ and $\mathbf{W}^{u} = \rho_u \mathbf{I}$ ($\rho_y, \rho_u > 0$).

To minimize $J(t)$ at each time step $t$, we consider the gradient descent algorithm, so that the control input sequence $\mathbf{u}_c(t)$ is updated as follows:

$$\mathbf{u}_c(t+1) = \mathbf{u}_c(t) - \boldsymbol{\eta}\, \frac{\partial J(t)}{\partial \mathbf{u}_c(t)} \tag{21}$$

where $\boldsymbol{\eta} = \rho_\eta \mathbf{I}$ ($\rho_\eta > 0$) is the learning rate for the control signals. $\partial J(t)/\partial \mathbf{u}_c(t)$ is a vector that can be derived from the predictive model (the ESN or the ESGP) based on the control input sequence.

It follows that

$$\frac{\partial J(t)}{\partial \mathbf{u}_c(t)} = \frac{\partial \mathbf{e}(t)^T}{\partial \mathbf{u}_c(t)} \mathbf{W}^{y} \mathbf{e}(t) + \mathbf{W}^{u} \Delta\mathbf{u}_c(t). \tag{22}$$

From (21) and (22), the increment of the control input vector $\Delta\mathbf{u}_c(t)$ is obtained as follows:

$$\Delta\mathbf{u}_c(t) = -(\mathbf{I} + \boldsymbol{\eta}\mathbf{W}^{u})^{-1} \boldsymbol{\eta}\, \frac{\partial \mathbf{e}(t)^T}{\partial \mathbf{u}_c(t)} \mathbf{W}^{y} \mathbf{e}(t) = (\mathbf{I} + \boldsymbol{\eta}\mathbf{W}^{u})^{-1} \boldsymbol{\eta}\, \frac{\partial \hat{\mathbf{y}}(t)^T}{\partial \mathbf{u}_c(t)} \mathbf{W}^{y} \mathbf{e}(t) \tag{23}$$

where the Jacobian matrix $\partial \hat{\mathbf{y}}(t)^T/\partial \mathbf{u}_c(t)$ can be derived from the predictive model based on the control input sequence.

Remark 2: To achieve real-time operation, we first adopt reservoir computing with sparse connections among the hidden neurons, which significantly reduces the computational complexity. Note that there are $K$ inputs in the input layer and $N$ internal neurons in the internal layer. Hence, because of the matrix inversion in (12), the computational complexity of the ESGP is $O((N+K)^3)$. On the other hand, the computational cost of the gradient descent algorithm depends on the internal size of the reservoir, as well as on the prediction horizon $N_p$ and the control horizon $N_u$. The control signal is obtained by calculating the increment $\Delta u$ of the control signal. The complexity of the proposed gradient descent method is $O((N_p^2 + N_u^2)N^2)$ according to (23) and (39). Although this complexity is high, the scale of the calculation is small ($K = 3$, $N = 20$, $N_p = 2$, and $N_u = 1$), so the time consumption of the proposed method still meets the requirement of this application. Therefore, based on the above analysis, the real-time requirement of the proposed control strategy can be satisfied.

Remark 3: NMPC requires the accurate computation of gradients and lacks convexity [39], so it is difficult to obtain the optimal solution. In addition, typical numerical solvers cannot distinguish a local optimum from a global optimum. In this application, the control strategy requires fast update intervals within a highly resource-limited computing environment. We cannot expect the optimization solver to be allowed to run until strict tolerances on the optimality conditions or stopping criteria are met in every case. In this situation, we prefer a feasible solution to an optimal one. A recent survey of optimization algorithms for NMPC is given in [40]. Because of
nonconvexity, most algorithms find local minima or stationary points rather than global minima. On the other hand, stability may still be achieved using suboptimal (approximate) MPC. This indicates that model predictive controllers that require feasibility rather than optimality have a much better prospect of being implemented when the system is nonlinear [16], [41].

Remark 4: Adaptive control (MIT rule) is designed with a reference model. A cost function that depends on the errors between the controlled object and the reference model is then generated. Based on the negative gradient direction of the cost function, the reference model parameters can be adaptively adjusted. The main idea of adaptive control is to drive the reference model to approximate the controlled object [42], [43]. This requires a reasonable model of the object. In this paper, however, the ESGP model is not a reference model. It can be regarded as a global approximator. By training this model online, the ESGP can approach the dynamic model of the PMA with higher accuracy. The significant difference between adaptive control and the proposed method is that adaptive control is still a model-based control strategy, while the proposed method is a data-driven strategy. Furthermore, the proposed strategy incorporates the idea of adaptive control: the gradient descent algorithm drives the predicted trajectory toward the reference trajectory, which results in accurate tracking.

D. Echo State Gaussian Process-Based NMPC

To obtain the Jacobian matrix $\partial \hat{\mathbf{y}}(t)^T/\partial \mathbf{u}_c(t)$, we consider the situation $i \ge j$ for the time points $t+i$ and $t+j$, since the term $\partial \hat{y}(t+i)/\partial u_c(t+j)$ is equal to zero when $i < j$. Activated by the input information, the reservoir states are updated by a trace of previously acquired reservoir states and the system output. Thus, the relationship of the reservoir states at different time instants can be obtained as

$$\frac{\partial \mathbf{x}(t+i)^T}{\partial \mathbf{x}(t+j)} = \begin{cases} \displaystyle\prod_{l=j}^{i-1} \frac{\partial \mathbf{x}(t+l+1)^T}{\partial \mathbf{x}(t+l)}, & i > j \\ \mathbf{I}, & i = j \\ \mathbf{0}, & i < j. \end{cases} \tag{24}$$

The relationship between the reservoir states and the control signals can be obtained from (17) as follows:

$$\frac{\partial \mathbf{x}(t+i)^T}{\partial u_c(t+j)} = \frac{\partial \mathbf{x}(t+j)^T}{\partial u_c(t+j)}\, \frac{\partial \mathbf{x}(t+i)^T}{\partial \mathbf{x}(t+j)} \tag{25}$$

where $u_c(t+j)$ is the first element of the input vector; that is, the input vector comprises the control signal $u_c$ and the actual predictions $\hat{y}$ of the PMA model. Thus, according to (16) and (18), it follows that

$$\frac{\partial \hat{y}(t+i)}{\partial u_c(t+j)} = \frac{\partial \boldsymbol{\phi}(t+i)^T}{\partial u_c(t+j)}\, \frac{\partial \hat{y}(t+i)}{\partial \boldsymbol{\phi}(t+i)} \tag{26}$$

$$\frac{\partial \hat{y}(t+i)}{\partial \boldsymbol{\phi}(t+i)} = \frac{\partial y_m(t+i)}{\partial \boldsymbol{\phi}(t+i)} = (\hat{\mathbf{W}}^{out})^T. \tag{27}$$

Meanwhile, we have

$$\frac{\partial \boldsymbol{\phi}(t+i)^T}{\partial u_c(t+j)} = \frac{\partial \mathbf{x}(t+i)^T}{\partial u_c(t+j)}\, \frac{\partial \boldsymbol{\phi}(t+i)^T}{\partial \mathbf{x}(t+i)}. \tag{28}$$

From (6), it follows that

$$\frac{\partial \boldsymbol{\phi}(t+i)^T}{\partial \mathbf{x}(t+i)} = \begin{pmatrix} 1 & \cdots & 0 & \cdots & 0 \\ \vdots & \ddots & \vdots & \ddots & \vdots \\ 0 & \cdots & 1 & \cdots & 0 \end{pmatrix} = \begin{bmatrix} \mathbf{I}_{N\times N} & \mathbf{0}_{N\times K} \end{bmatrix} = \bar{\mathbf{I}} \in \mathbb{R}^{N\times(N+K)} \tag{29}$$

where $\mathbf{I}_{N\times N}$ is the $N$-dimensional identity matrix and $\mathbf{0}_{N\times K}$ is the $N \times K$ zero matrix.

Combining (24)–(29), we have

$$\frac{\partial \hat{y}(t+i)}{\partial u_c(t+j)} = \frac{\partial \mathbf{x}(t+j)^T}{\partial u_c(t+j)}\, \frac{\partial \mathbf{x}(t+i)^T}{\partial \mathbf{x}(t+j)}\, \bar{\mathbf{I}}\, (\hat{\mathbf{W}}^{out})^T. \tag{30}$$

In order to analyze the relationship between the reservoir states and the model inputs, (17) can be written as

$$\mathbf{x}(t+j) = f\begin{pmatrix} g_1(t+j-1) \\ \vdots \\ g_N(t+j-1) \end{pmatrix} \tag{31}$$

where

$$g_r(t+j-1) = \sum_{s=1}^{K} w_{rs}^{in} u_s(t+j) + \sum_{s=1}^{N} w_{rs} x_s(t+j-1) + \sum_{s=1}^{L} w_{rs}^{back} y_m(t+j-1). \tag{32}$$

Then, we have

$$\frac{\partial \mathbf{x}(t+j)^T}{\partial u_c(t+j)} = \mathbf{W}_u^{in}\, \boldsymbol{\Gamma}(t+j-1) \tag{33}$$

where $\mathbf{W}_u^{in} = [w_{11}^{in}, \ldots, w_{N1}^{in}]$ and $\boldsymbol{\Gamma}(t+j-1) = \mathrm{diag}\{f'(g_1(t+j-1)), \ldots, f'(g_N(t+j-1))\}$.

Since the dynamic reservoir states contain the information of historical outputs, we can rewrite the reservoir state equation as

$$x_m(t+l+1) = f\left( \sum_{n=1}^{N} \Big( w_{mn} + \sum_{s=1}^{L} w_{ms}^{back} \hat{w}_{sn}^{out} \Big) x_n(t+l) + \sum_{s=1}^{K} w_{ms}^{in} u_s(t+l+1) \right) \tag{34}$$

where $x_m$ and $x_n$ are the elements of the reservoir states, and $w_{mn}$, $w_{ms}^{back}$, and $\hat{w}_{sn}^{out}$ are the elements of the internal weights, feedback weights, and output weights, respectively.

Then, it follows that

$$\frac{\partial x_m(t+l+1)}{\partial x_n(t+l)} = \Big( w_{mn} + \sum_{s=1}^{L} w_{ms}^{back} \hat{w}_{sn}^{out} \Big) f'(g_m(t+l)). \tag{35}$$
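The entry-wise derivative in (35) can be checked numerically. The following sketch assumes a single output neuron ($L = 1$), a tanh activation, a readout acting only on the reservoir part of the augmented state, and small random weights; none of these values come from the paper, and the function names are hypothetical. It compares the analytic entries of (35) with central finite differences of the closed-loop state update (34).

```python
import numpy as np

rng = np.random.default_rng(1)
N, K = 5, 3                                   # toy sizes, chosen only for the check
W = rng.normal(scale=0.3, size=(N, N))        # internal weights w_mn
W_in = rng.normal(scale=0.3, size=(N, K))     # input weights
w_back = rng.normal(scale=0.3, size=N)        # feedback weights (single output)
w_out = rng.normal(scale=0.3, size=N)         # readout weights on the reservoir states

def step(x, u):
    """Closed-loop state update (34) with f = tanh and y_m = w_out . x."""
    g = (W + np.outer(w_back, w_out)) @ x + W_in @ u
    return np.tanh(g)

def analytic_jacobian(x, u):
    """J[m, n] = (w_mn + w_back_m * w_out_n) * f'(g_m), the entries in (35)."""
    A = W + np.outer(w_back, w_out)
    g = A @ x + W_in @ u
    f_prime = 1.0 - np.tanh(g) ** 2
    return f_prime[:, None] * A

def numeric_jacobian(x, u, eps=1e-6):
    """Central finite differences of step() with respect to each x_n."""
    J = np.zeros((N, N))
    for n in range(N):
        d = np.zeros(N)
        d[n] = eps
        J[:, n] = (step(x + d, u) - step(x - d, u)) / (2 * eps)
    return J

x = rng.uniform(-0.5, 0.5, N)
u = rng.uniform(-1.0, 1.0, K)
max_abs_diff = np.max(np.abs(analytic_jacobian(x, u) - numeric_jacobian(x, u)))
print(max_abs_diff < 1e-6)  # True
```

Here `J[m, n]` is indexed by the updated state first; the transpose of this array gives the layout in which rows correspond to the differentiation variable $x_n(t+l)$.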
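In addition, we have the Jacobian matrix $\partial \mathbf{x}(t+l+1)^T/\partial \mathbf{x}(t+l)$

$$\frac{\partial \mathbf{x}(t+l+1)^T}{\partial \mathbf{x}(t+l)} = \begin{bmatrix} \dfrac{\partial x_1(t+l+1)}{\partial x_1(t+l)} & \cdots & \dfrac{\partial x_N(t+l+1)}{\partial x_1(t+l)} \\ \vdots & \ddots & \vdots \\ \dfrac{\partial x_1(t+l+1)}{\partial x_N(t+l)} & \cdots & \dfrac{\partial x_N(t+l+1)}{\partial x_N(t+l)} \end{bmatrix}. \tag{36}$$

Thus

$$\frac{\partial \mathbf{x}(t+l+1)^T}{\partial \mathbf{x}(t+l)} = (\mathbf{W}^T + \bar{\mathbf{W}})\, \boldsymbol{\Gamma}(t+l) \tag{37}$$

where

$$\bar{\mathbf{W}} = \begin{bmatrix} \displaystyle\sum_{s=1}^{L} w_{1s}^{back} \hat{w}_{s1}^{out} & \cdots & \displaystyle\sum_{s=1}^{L} w_{Ns}^{back} \hat{w}_{s1}^{out} \\ \vdots & \ddots & \vdots \\ \displaystyle\sum_{s=1}^{L} w_{1s}^{back} \hat{w}_{sN}^{out} & \cdots & \displaystyle\sum_{s=1}^{L} w_{Ns}^{back} \hat{w}_{sN}^{out} \end{bmatrix}$$

and

$$\boldsymbol{\Gamma}(t+l) = \mathrm{diag}\{f'(g_1(t+l)), \ldots, f'(g_N(t+l))\}.$$

From (24), (26), (33), and (37), and letting

$$\mathbf{Q} = \prod_{l=j}^{i-1} \big[(\mathbf{W}^T + \bar{\mathbf{W}})\, \boldsymbol{\Gamma}(t+l)\big] \tag{38}$$

we finally get

$$\frac{\partial \hat{y}(t+i)}{\partial u_c(t+j)} = \begin{cases} \mathbf{W}_u^{in}\, \boldsymbol{\Gamma}(t+j-1)\, \mathbf{Q}\, \bar{\mathbf{I}}\, (\hat{\mathbf{W}}^{out})^T, & i > j \\ \mathbf{W}_u^{in}\, \boldsymbol{\Gamma}(t+j-1)\, \bar{\mathbf{I}}\, (\hat{\mathbf{W}}^{out})^T, & i = j \\ 0, & i < j. \end{cases} \tag{39}$$

IV. STABILITY ANALYSIS

Because it is essential for successful applications, the stability of the closed-loop system is a vital issue and requires careful investigation. In this section, a Lyapunov function is proposed and the stability of the closed-loop system is analyzed for two situations, under the condition that the ESGP can approximate the dynamics of the PMA, as described in Remark 1.

Theorem 1: Assume that the parameter $\boldsymbol{\eta}^{-1} + \mathbf{W}^{u}$ is positive definite and set the feedback matrix $\mathbf{W}^{back}$ to be a zero matrix.
1) If the reference trajectory does not change over time, the asymptotic stability of the proposed ESGP-NMPC can be guaranteed.
2) If the reference trajectory changes over time, the system will be passive.

Proof: Let us define the function

$$V(t) = \frac{1}{2}\, \mathbf{e}(t)^T \mathbf{W}^{y} \mathbf{e}(t). \tag{40}$$

When the weight parameter $\mathbf{W}^{y}$ is positive definite, this function is nonnegative [$V(t) \ge 0$].

The change of the function can be written in the following form:

$$\frac{\partial V(t)}{\partial t} = \frac{\partial \mathbf{e}(t)^T}{\partial t}\, \mathbf{W}^{y} \mathbf{e}(t). \tag{41}$$

Now, we want to compute $\partial \mathbf{e}(t)^T/\partial t$, which relates directly to the model prediction. According to (16), we only need to consider the augmented reservoir states $\boldsymbol{\phi}(t+i) = [\mathbf{x}(t+i)^T, \mathbf{u}(t+i)^T]^T$, since the output matrix $\hat{\mathbf{W}}^{out}$ is fixed once the training process is finished. Meanwhile, the feedback matrix $\mathbf{W}^{back}$ is set to be a zero matrix. By iteratively updating the reservoir states, the relationship between the future states and the historical states can be expressed as

$$\begin{aligned} \mathbf{x}(t+N_p) &= f[\mathbf{W}^{in}\mathbf{u}(t+N_p) + \mathbf{W}\mathbf{x}(t+N_p-1)] \\ &= f[\mathbf{W}^{in}\mathbf{u}(t+N_p) + \mathbf{W} f(\mathbf{W}^{in}\mathbf{u}(t+N_p-1) + \mathbf{W}\mathbf{x}(t+N_p-2))] \\ &\;\;\vdots \\ &= f\{\mathbf{W}^{in}\mathbf{u}(t+N_p) + \mathbf{W} f[\mathbf{W}^{in}\mathbf{u}(t+N_p-1) + \mathbf{W} f(\mathbf{W}^{in}\mathbf{u}(t+N_p-2) \\ &\qquad + \mathbf{W} f(\cdots f(\mathbf{W}^{in}\mathbf{u}(t+1) + \mathbf{W}\mathbf{x}(t))))]\}. \end{aligned} \tag{42}$$

Because the historical states $\mathbf{x}(t)$ are known, the derivative of $\mathbf{x}(t)$ with respect to time is equal to 0. Thus, $\partial \mathbf{e}(t)^T/\partial t$ depends only on the network inputs $\mathbf{u}(t+1), \ldots, \mathbf{u}(t+N_p)$, in which the control signal $u_c$ is the first element.

According to the previous analysis and the chain rule, $\partial \mathbf{e}(t)^T/\partial t$ can be obtained as

$$\frac{\partial \mathbf{e}(t)^T}{\partial t} = -\frac{\partial \mathbf{u}_c(t)^T}{\partial t}\, \frac{\partial \hat{\mathbf{y}}(t)^T}{\partial \mathbf{u}_c(t)} + \frac{\partial \mathbf{r}(t)^T}{\partial t} \tag{43}$$

where $\partial \mathbf{r}(t)^T/\partial t$ is the derivative of the reference. From (23), the following relationship, which links the error derivative $\partial \mathbf{e}(t)^T/\partial t$ to $\Delta\mathbf{u}_c(t)$, is obtained:

$$\frac{\partial \hat{\mathbf{y}}(t)^T}{\partial \mathbf{u}_c(t)}\, \mathbf{W}^{y} \mathbf{e}(t) = (\boldsymbol{\eta}^{-1} + \mathbf{W}^{u})\, \Delta\mathbf{u}_c(t). \tag{44}$$

Combining (41), (43), and (44), the derivative $\partial V(t)/\partial t$ can be expressed as

$$\frac{\partial V(t)}{\partial t} = -\frac{\partial \mathbf{u}_c(t)^T}{\partial t}\, (\boldsymbol{\eta}^{-1} + \mathbf{W}^{u})\, \Delta\mathbf{u}_c(t) + \frac{\partial \mathbf{r}(t)^T}{\partial t}\, \mathbf{W}^{y} \mathbf{e}(t). \tag{45}$$

Approximately, we have

$$\Delta V(t) = -\Delta\mathbf{u}_c(t)^T (\boldsymbol{\eta}^{-1} + \mathbf{W}^{u})\, \Delta\mathbf{u}_c(t) + \Delta\mathbf{r}(t)^T \mathbf{W}^{y} \mathbf{e}(t). \tag{46}$$

First, consider the case in which $\Delta\mathbf{r}(t)$ is equal to zero. The asymptotic stability of the proposed ESGP-NMPC can be guaranteed when the parameter $\boldsymbol{\eta}^{-1} + \mathbf{W}^{u}$ is positive definite, since $V(t)$ can then be seen as a Lyapunov function and $\Delta V(t)$ is less than or equal to zero ($\Delta V(t) \le 0$). This corresponds to the situation in which the reference does not change over time.

Considering the second case, in which $\Delta\mathbf{r}(t)$ is not equal to zero, the closed-loop system can be regarded as a system,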
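The constant-reference case of Theorem 1 can be illustrated numerically with the scalar form of the update (23). The sketch below is not the ESGP-NMPC itself: it assumes a hypothetical static predictor `model(u) = tanh(u)` in place of the trained network, a one-step horizon, no input or output constraints, and illustrative weights `rho_y`, `rho_u`, and `rho_eta` that are not the paper's values. Under these assumptions, the function $V$ of (40) is expected to be non-increasing along the iterations.

```python
import numpy as np

rho_y, rho_u, rho_eta = 1.0, 0.1, 0.5   # illustrative weights (assumed values)

def model(u):
    """Hypothetical static stand-in for the trained predictor."""
    return np.tanh(u)

def model_jacobian(u):
    """Derivative of the stand-in predictor with respect to the control input."""
    return 1.0 - np.tanh(u) ** 2

def control_increment(u, r):
    """Scalar form of (23): (1 + eta*W_u)^-1 * eta * (dy/du) * W_y * e."""
    e = r - model(u)
    return rho_eta * model_jacobian(u) * rho_y * e / (1.0 + rho_eta * rho_u)

def lyapunov(u, r):
    """V = 0.5 * e * W_y * e, the scalar form of (40)."""
    e = r - model(u)
    return 0.5 * rho_y * e ** 2

r = 0.6                                  # constant reference
u = 0.0
V_history = [lyapunov(u, r)]
for _ in range(50):
    u = u + control_increment(u, r)      # apply the increment
    V_history.append(lyapunov(u, r))

non_increasing = all(b <= a for a, b in zip(V_history, V_history[1:]))
print(non_increasing)  # True
```

With a time-varying reference, the extra term $\Delta\mathbf{r}(t)^T \mathbf{W}^{y} \mathbf{e}(t)$ in (46) can be positive, so monotone decrease is no longer expected.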
This simulation is conducted to demonstrate the effectiveness of the proposed control strategy for the reference trajectory

$$y_d = A_x \cos(2\pi f t - \pi/2) + B_x \tag{47}$$

where we choose $A_x = 0.0125$, $f = 0.25$ Hz, and $B_x = 0.0225$.

The system model is based on Fig. 1. The system parameters in (2) are selected based on the identification of the three-element model [10], and the values are shown in Table I. The damping coefficient $B(P)$ depends on whether the PMA is being inflated or deflated, which corresponds to two sets of $B_0$ and $B_1$. In fact, the spring coefficient $K(P)$ turns out to be a piecewise linear function with a breakpoint at $P = 163892$ Pa, and hence, there are also two different sets of $K_0$ and $K_1$ for $P < 163892$ and $P > 163892$, respectively.

We adopt the parameters $N_p = 2$ and $N_u = 1$. The other parameters $\mathbf{W}^{u} = \rho_u \mathbf{I}$, $\mathbf{W}^{y} = \rho_y \mathbf{I}$, and $\boldsymbol{\eta} = \rho_\eta \mathbf{I}$ ($\rho_u, \rho_y, \rho_\eta > 0$) were selected based on

where $n$ is the total number of samples.

B. System Modeling

As described in Remark 1, the ESGP has universal computation and approximation properties. It can model the PMA with sufficient accuracy. The ESGP model is trained with the input–output data of the PMA system. The effectiveness of the ESGP model is evaluated by online modeling and multistep prediction. Assume that the input $u(t) = \alpha_1 \sin(2\pi w t) + \alpha_2$ ($\alpha_1 = 10000$, $\alpha_2 = 145000$, $w = 0.25$) is known. The prediction of the ESGP model is compared with those of an ESN, an MLP, and an RNN. In the simulations, all the function approximators (ESGP, ESN, MLP, and RNN) are set to have similar three-layer topologies, consisting of an input layer, a hidden layer, and an output layer. The significant difference between the ESGP and the ESN on the one hand and the other function approximators on the other is that their hidden layer applies reservoir computing. In order to ensure a fair comparison, the number of hidden nodes of all the approximators is set to 20, and the weight matrices
are randomly initialized according to the principles of each approximator. Therefore, we run the program 100 times to reduce the effects of randomness. The forecast performance is measured by the root-mean-square error (RMSE), which can be described as

$$\mathrm{RMSE}(t) = \sqrt{\frac{1}{2n} \sum_{t=1}^{n} \big(y(t) - \hat{y}(t)\big)^2}. \tag{50}$$

The predictions of all the models are shown in Figs. 4 and 5 for comparison. The forecasting performance is assessed using the same number of hidden nodes, as shown in Table II. With the same number of hidden layer nodes, the results indicate that the ESGP predicts the behavior of the PMA with the highest accuracy among all the approaches.

Fig. 4. Prediction performance of the PMA using models.

TABLE II
COMPARISON OF THE PREDICTION OF DIFFERENT MODELS

C. Simulation Results

The trajectory tracking results of the PMA with the ESGP-NMPC, ESN-NMPC, RNN-NMPC, MLP-NMPC, and PID control strategies are shown in Fig. 6, and the corresponding tracking errors are presented in Fig. 7. The ESGP-NMPC presents the best performance, although the tracking errors of the ESN-NMPC and the ESGP-NMPC are almost the same. The reason is that the ESGP derives from the ESN and works on a similar principle. The RNN-NMPC and MLP-NMPC behave a little worse than the proposed strategies, since the prediction accuracy of the RNN and the MLP is directly related to the number of hidden neurons. However, as the number of neurons increases, the structures of the MLP and the RNN become more complex. Compared with the dynamic reservoir computing of the ESGP, the training methods of the MLP and the RNN are more complicated, and their convergence speed may not guarantee real-time calculation. On the other hand, the PID controller is limited not only by the lack of theoretical support for proving stability but also by the difficulty of finding suitable controller parameters for high-precision control when P, I, and D are tuned blindly. Consequently, the PID controller performs the worst in this simulation, despite our best efforts at parameter tuning. Finally, the corresponding maximum error and integral of absolute error are shown in Table III.

Fig. 6. Tracking performances of the control strategies with the sinusoidal reference.

VI. EXPERIMENTAL STUDIES

A. Experimental Setup

The main module in the PMA hardware system is the xPC target, a product from MathWorks, which makes it easy to
create real-time application models by SIMULINK/BLOCKS. With the dedicated hardware, the proposed strategy can be implemented. Two computers are utilized: one is the host, and the other is the target. The host computer has MATLAB/SIMULINK and a C compiler installed, and it uses models to generate executable code, while the target computer executes the generated code in real time. Hence, the xPC target provides the necessary software that makes use of real-time resources on the target computer hardware. Table IV shows

TABLE III
RESULTS OF THE CONTROL STRATEGIES WITH THE SINUSOIDAL REFERENCE

B. Experimental Results

To verify the validity of the proposed approach, a sinusoidal trajectory is again used as the desired trajectory. Comparisons are given to show the tracking performance of the ESGP-NMPC, ESN-NMPC, RNN-NMPC, MLP-NMPC, and PID. The proposed control strategy is applied to the PMA with different loads attached in order to test its robustness to load changes. In addition, experiments with different frequencies and amplitudes of the reference trajectory are carried out to further verify the effectiveness of this method.

Before presenting the experimental results, we report the calculation time of the control algorithm. We performed the whole experiment 10 times and calculated the average computational time within a control period. The time consumption of each control period includes two parts. One is the learning procedure of the ESGP, which is used to approximate the dynamics of the PMA. The other is the gradient descent algorithm, which is used to obtain the control signal. The average time consumption of the ESGP is $8.1758 \times 10^{-5}$ s, which is due to the relatively small size of the internal states. In addition, the average time consumption of the gradient descent algorithm is $1.483 \times 10^{-5}$ s. Hence, this fully meets our real-time requirements.
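The two reported averages can be combined into a per-period budget check. The sketch below uses the two measured times quoted above; the control period `hypothetical_period` is an assumed value introduced only for illustration, since the actual sampling period is not stated in this excerpt.

```python
t_esgp = 8.1758e-5             # s, average ESGP learning time per control period (reported)
t_gradient = 1.483e-5          # s, average gradient descent time per control period (reported)
t_total = t_esgp + t_gradient  # total computation time per control period

hypothetical_period = 1e-3     # s, ASSUMED control period, not taken from the paper
utilization = t_total / hypothetical_period

print(f"{t_total:.4e} s per period")
print(f"{100 * utilization:.1f}% of the assumed period")
```

Under this assumed 1 ms period, the computation would occupy roughly a tenth of each control interval; a different period scales the ratio proportionally.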
TABLE V
RESULTS OF THE CONTROL STRATEGIES
Fig. 10. Tracking performances of the control strategies with the sinusoidal reference.
Fig. 11. Tracking errors of the control strategies with the sinusoidal reference.
Fig. 14. Tracking performance of the control strategy ESGP-NMPC with a sinusoidal reference signal of frequency 0.1 Hz and amplitude 0.0075, 0.01, and 0.0125 m.

Fig. 15. Tracking performance of the control strategy ESGP-NMPC with a sinusoidal reference signal of amplitude 0.0125 m and frequency 0.1, 0.25, and 0.4 Hz.

Further experiments are conducted with sinusoidal reference signals of frequency 0.1 Hz and amplitudes of 0.0075, 0.01, and 0.0125 m, respectively. The results indicate that the control performance is almost unchanged for reference trajectories of the same frequency and different amplitudes, as shown in Fig. 14.

To further validate the suitability of the proposed control strategy, the tracking performance for sinusoidal reference signals of amplitude 0.0125 m and frequencies of 0.1, 0.25, and 0.4 Hz is presented in Fig. 15. As the frequency of the reference trajectory increases, the control performance gradually deteriorates. Our applications of the PMA are mainly aimed at rehabilitation robots that help patients complete the rehabilitation training process, where the reference trajectory usually changes slowly. Therefore, this control strategy can fulfill the rehabilitation requirement.

C. Discussion

Note that the ESGP used in this paper has $K$ input neurons, $N$ internal neurons, and one output neuron. The storage of the ESGP is mainly used for the weight matrices and the vector of internal states. The computational and storage costs of the NMPC-ESGP algorithm are $O((N+K)^3)$ and $O((N+K)^2)$, respectively. In order to satisfy the real-time requirement, the internal size is rather small in our application. Currently, high-speed multicore digital signal processors (DSPs), such as the TMS320C6678 SoC, which is clocked at 1.25 GHz and has 512 KB of memory, have reached a fairly high speed of operation. In fact, there have been studies using DSPs to implement ESNs to accomplish specific tasks [45]. In addition, the advanced RISC machine (ARM) Cortex-A53, which contains eight cores at 1.4 GHz, is considered to be the most promising ARM processor in recent years. It can run an embedded operating system, and it has been applied in various fields. There is no doubt that it can also meet our requirement. FPGAs allow a much faster implementation cycle for a “small” number of chips and offer an intermediate between the programmability of classic processors and the parallel nature and high speed of Application Specific Integrated Circuits. There exist several studies about the implementation of liquid state machines for real-time speech recognition [46]. Note that the liquid state machine is an RNN of spiking neurons with reservoir computing, whose principle is the same as that of ESNs. At the current stage, our main purpose is to verify the effectiveness and robustness of the algorithm. In future applications, we may implement this algorithm on a DSP or an FPGA.

VII. CONCLUSION

This paper proposed a systematic design methodology to develop an ESGP-NMPC for a PMA system. By analyzing the NMPC scheme, a gradient descent algorithm was utilized to minimize the cost function, and a control signal sequence was obtained. As stability is essential for physical applications, the characteristics of the closed-loop system were analyzed to obtain the stability condition. Simulations and experiments were conducted to validate the ESGP-NMPC, and its robustness was also verified through physical experiments in which the PMA carried different loads. Compared with other control strategies, the ESGP-NMPC achieves a better model fit for the PMA and a better control performance in high-precision tracking tasks.

REFERENCES

[1] J. Yoon, B. Novandy, C.-H. Yoon, and K.-J. Park, “A 6-DOF gait rehabilitation robot with upper and lower limb connections that allows walking velocity updates on various terrains,” IEEE/ASME Trans. Mechatronics, vol. 15, no. 2, pp. 201–215, Apr. 2010.
[2] J. Huang, X. Tu, and J. He, “Design and evaluation of the RUPERT wearable upper extremity exoskeleton robot for clinical and in-home therapies,” IEEE Trans. Syst., Man, Cybern., Syst., vol. 46, no. 7, pp. 926–935, Jul. 2016.
[3] A. B. Zoss, H. Kazerooni, and A. Chu, “Biomechanical design of the Berkeley lower extremity exoskeleton (BLEEX),” IEEE/ASME Trans. Mechatronics, vol. 11, no. 2, pp. 128–138, Apr. 2006.
[4] Z. Li, C. Su, L. Wang, Z. Chen, and T. Chai, “Nonlinear disturbance observer-based control design for a robotic exoskeleton incorporating fuzzy approximation,” IEEE Trans. Ind. Electron., vol. 62, no. 9, pp. 5763–5775, Sep. 2015.
[5] D. G. Caldwell, G. A. Medrano-Cerda, and M. Goodwin, “Control of pneumatic muscle actuators,” IEEE Control Syst. Mag., vol. 15, no. 1, pp. 40–48, Feb. 1995.
[6] C.-P. Chou and B. Hannaford, “Static and dynamic characteristics of McKibben pneumatic artificial muscles,” in Proc. IEEE Int. Conf. Robot. Automat., San Diego, CA, USA, vol. 1, May 1994, pp. 281–286.
[7] P. K. Jamwal, S. Hussain, and S. Q. Xie, “Three-stage design analysis and multicriteria optimization of a parallel ankle rehabilitation robot using genetic algorithm,” IEEE Trans. Autom. Sci. Eng., vol. 12, no. 4, pp. 1433–1446, Oct. 2015.
[8] P. Beyl, M. Van Damme, R. Van Ham, B. Vanderborght, and D. Lefeber, “Pleated pneumatic artificial muscle-based actuator system as a torque source for compliant lower limb exoskeletons,” IEEE/ASME Trans. Mechatronics, vol. 19, no. 3, pp. 1046–1056, Jun. 2014.
[9] H. Aschemann and D. Schindele, “Sliding-mode control of a high-speed linear axis driven by pneumatic muscle actuators,” IEEE Trans. Ind. Electron., vol. 55, no. 11, pp. 3855–3864, Nov. 2008.
[10] J. Wu, J. Huang, Y. Wang, and K. Xing, “Nonlinear disturbance observer-based dynamic surface control for trajectory tracking of pneumatic muscle system,” IEEE Trans. Control Syst. Technol., vol. 22, no. 2, pp. 440–455, Mar. 2014.
[11] K. Kawashima, T. Sasaki, A. Ohkubo, T. Miyata, and T. Kagawa, “Application of robot arm using fiber knitted type pneumatic artificial rubber muscles,” in Proc. IEEE Int. Conf. Robot. Automat. (ICRA), vol. 5, Apr./May 2004, pp. 4937–4942.
[12] G. Andrikopoulos, G. Nikolakopoulos, and S. Manesis, “Advanced nonlinear PID-based antagonistic control for pneumatic muscle actuators,” IEEE Trans. Ind. Electron., vol. 61, no. 12, pp. 6926–6937, Dec. 2014.
[13] K. Balasubramanian and K. S. Rattan, “Fuzzy logic control of a pneumatic muscle system using a linearing control scheme,” in Proc. 22nd Int. Conf. North Amer. Fuzzy Inf. Process. Soc., Jul. 2003, pp. 432–436.
[14] J. Wu, J. Huang, Y. Wang, K. Xing, and Q. Xu, “Fuzzy PID control of a wearable rehabilitation robotic hand driven by pneumatic muscles,” in Proc. Int. Symp. Micro-NanoMechatronics Hum. Sci., Nagoya, Japan, Nov. 2009, pp. 408–413.
[15] S. J. Qin and T. A. Badgwell, “A survey of industrial model predictive control technology,” Control Eng. Pract., vol. 11, no. 7, pp. 733–764, 2003.
[16] D. Q. Mayne, J. B. Rawlings, C. V. Rao, and P. O. M. Scokaert, “Constrained model predictive control: Stability and optimality,” Automatica, vol. 36, no. 6, pp. 789–814, 2000.
[17] F. Verrilli et al., “Model predictive control-based optimal operations of district heating system with thermal energy storage and flexible loads,” IEEE Trans. Autom. Sci. Eng., vol. 14, no. 2, pp. 547–557, Apr. 2017.
[18] R. Cao and K. S. Low, “A repetitive model predictive control approach for precision tracking of a linear motion system,” IEEE Trans. Ind. Electron., vol. 56, no. 6, pp. 1955–1962, Jun. 2009.
[19] R. Ginhoux, J. Gangloff, M. D. Mathelin, L. Soler, M. M. A. Sanchez, and J. Marescaux, “Active filtering of physiological motion in robotized surgery using predictive control,” IEEE Trans. Robot., vol. 21, no. 1, pp. 67–79, Feb. 2005.
[20] H.-T. Zhang et al., “Electrospinning sedimentary microstructure feedback control by tuning substrate linear machine velocity,” IEEE Trans. Ind. Electron., vol. 64, no. 11, pp. 8686–8694, Nov. 2017.
[21] Y. Pan and J. Wang, “Model predictive control of unknown nonlinear dynamical systems based on recurrent neural networks,” IEEE Trans. Ind. Electron., vol. 59, no. 8, pp. 3089–3101, Aug. 2012.
[22] Z. Yan and J. Wang, “Model predictive control of nonlinear systems with unmodeled dynamics based on feedforward and recurrent neural networks,” IEEE Trans. Ind. Informat., vol. 8, no. 4, pp. 746–756, Nov. 2012.
[23] H. Han and J. Qiao, “Nonlinear model-predictive control for industrial processes: An application to wastewater treatment process,” IEEE Trans. Ind. Electron., vol. 61, no. 4, pp. 1970–1982, Apr. 2014.
[24] L. Cheng, W. Liu, Z.-G. Hou, J. Yu, and M. Tan, “Neural-network-based nonlinear model predictive control for piezoelectric actuators,” IEEE Trans. Ind. Electron., vol. 62, no. 12, pp. 7717–7727, Dec. 2015.
[25] M. A. Hosen, M. A. Hussain, and F. S. Mjalli, “Control of polystyrene batch reactors using neural network based model predictive control (NNMPC): An experimental investigation,” Control Eng. Pract., vol. 19, no. 5, pp. 454–467, May 2011.
[26] W. Liu, L. Cheng, Z.-G. Hou, J. Yu, and M. Tan, “An inversion-free predictive controller for piezoelectric actuators based on a dynamic linearized neural network model,” IEEE/ASME Trans. Mechatronics, vol. 21, no. 1, pp. 214–226, Feb. 2016.
[27] J. Huang, M. Ri, D. Wu, and S. Ri, “Interval type-2 fuzzy logic modeling and control of a mobile two-wheeled inverted pendulum,” IEEE Trans. Fuzzy Syst., vol. 26, no. 4, pp. 2030–2038, Aug. 2017.
[28] T. G. Barbounis, J. B. Theocharis, M. C. Alexiadis, and P. S. Dokopoulos, “Long-term wind speed and power forecasting using local recurrent neural network models,” IEEE Trans. Energy Convers., vol. 21, no. 1, pp. 273–284, Mar. 2006.
[29] D. T. Mirikitani and N. Nikolaev, “Recursive Bayesian recurrent neural networks for time-series modeling,” IEEE Trans. Neural Netw., vol. 21, no. 2, pp. 262–274, Feb. 2010.
[30] H. Jaeger, “The ‘echo state’ approach to analysing and training recurrent neural networks,” German Nat. Res. Center Inf. Technol., Bonn, Germany, GMD Rep. 148, 2001.
[31] H. Jaeger and H. Haas, “Harnessing nonlinearity: Predicting chaotic systems and saving energy in wireless communication,” Science, vol. 304, no. 5667, pp. 78–80, Apr. 2004.
[32] S. P. Chatzis and Y. Demiris, “Echo state Gaussian process,” IEEE Trans. Neural Netw., vol. 22, no. 9, pp. 1435–1445, Sep. 2011.
[33] J. Park, B. Lee, S. Kang, P. Y. Kim, and H. J. Kim, “Online learning control of hydraulic excavators based on echo-state networks,” IEEE Trans. Autom. Sci. Eng., vol. 14, no. 1, pp. 249–259, Jan. 2017.
[34] J. Herbert, “Echo state network,” Scholarpedia, vol. 2, no. 9, p. 2330, 2007.
[35] W. Maass, P. Joshi, and E. D. Sontag, “Computational aspects of feedback in neural circuits,” PLoS Comput. Biol., vol. 3, no. 1, p. e165, 2007.
[36] M. Buehner and P. Young, “A tighter bound for the echo state property,” IEEE Trans. Neural Netw., vol. 17, no. 3, pp. 820–824, May 2006.
[37] I. B. Yildiz, H. Jaeger, and S. J. Kiebel, “Re-visiting the echo state property,” Neural Netw., vol. 35, pp. 1–9, Nov. 2012.
[38] C. E. Rasmussen and C. K. I. Williams, Gaussian Processes for Machine Learning. Cambridge, MA, USA: MIT Press, 2006.
[39] T. A. Johansen, “Toward dependable embedded model predictive control,” IEEE Syst. J., vol. 11, no. 2, pp. 1208–1219, Jun. 2017.
[40] L. T. Biegler, “A survey on sensitivity-based nonlinear model predictive control,” IFAC Proc. Volumes, vol. 46, no. 32, pp. 499–510, 2013.
[41] D. Q. Mayne, “Model predictive control: Recent developments and future promise,” Automatica, vol. 50, no. 12, pp. 2967–2986, 2014.
[42] P. Parks, “Liapunov redesign of model reference adaptive control systems,” IEEE Trans. Autom. Control, vol. AC-11, no. 3, pp. 362–367, Jul. 1966.
[43] C.-C. Hang and P. Parks, “Comparative studies of model reference adaptive control systems,” IEEE Trans. Autom. Control, vol. AC-18, no. 5, pp. 419–428, Oct. 1973.
[44] K. Dalamagkidis, K. P. Valavanis, and L. A. Piegl, “Nonlinear model predictive control with neural network optimization for autonomous autorotation of small unmanned helicopters,” IEEE Trans. Control Syst. Technol., vol. 19, no. 4, pp. 818–831, Jul. 2011.
[45] B. Schrauwen, M. Wardermann, D. Verstraeten, J. J. Steil, and D. Stroobandt, “Improving reservoirs using intrinsic plasticity,” Neurocomputing, vol. 71, nos. 7–9, pp. 1159–1171, Mar. 2008.
[46] B. Schrauwen, M. D’Haene, D. Verstraeten, and J. Van Campenhout, “Compact hardware liquid state machines on FPGA for real-time speech recognition,” Neural Netw., vol. 21, nos. 2–3, pp. 511–523, 2008.
Jian Huang (M’07–SM’17) received the B.S., M.E., and Ph.D. degrees from the Huazhong University of Science and Technology (HUST), Wuhan, China, in 1997, 2000, and 2005, respectively.
From 2006 to 2008, he was a Post-Doctoral Researcher with the Department of Micro-Nano System Engineering and the Department of Mechano-Informatics and Systems, Nagoya University, Nagoya, Japan. He is currently a Full Professor with the School of Automation, HUST. His main research interests include rehabilitation robot, robotic assembly, networked control systems, and bioinformatics.
Yu Cao (S’17) received the B.S. degree in control science and control engineering from the Wuhan University of Technology, Wuhan, China, in 2011, and the M.S. degree in software engineering from the Huazhong University of Science and Technology, Wuhan, in 2014, where he is currently pursuing the Ph.D. degree.
His current research interests include modeling and control of rehabilitation robotics exoskeleton systems.
Caihua Xiong (M’12) received the Ph.D. degree in mechanical engineering from the Huazhong University of Science and Technology (HUST), Wuhan, China, in 1998.
From 1999 to 2003, he was a Post-Doctoral Fellow with the City University of Hong Kong, Hong Kong, and The Chinese University of Hong Kong, Hong Kong, and a Research Scientist with the Worcester Polytechnic Institute, Worcester, MA, USA. He is currently a Chang Jiang Professor and the Director of the Institute of Rehabilitation and Medical Robotics, HUST. His current research interests include biomechatronic prostheses, rehabilitation robotics, and robot motion planning and control.
Dr. Xiong received the National Science Fund for Distinguished Young Scholars of China.
Hai-Tao Zhang (M’07–SM’13) received the B.E. and Ph.D. degrees from the University of Science and Technology of China, Hefei, China, in 2000 and 2005, respectively.
In 2007, he was a Post-Doctoral Researcher with the University of Cambridge, Cambridge, U.K. From 2005 to 2010, he was an Associate Professor with the Huazhong University of Science and Technology, Wuhan, China, where he has been a Professor since 2010. His research interests include multi-agent systems control, model predictive control, and multi-robot collaboration manufacturing control.
Dr. Zhang serves as an Associate Editor of the IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS II and the Asian Journal of Control and an Editorial Board Member for the IEEE Control Systems Society (CSS). He also serves as the Chair of IEEE CSS Wuhan Chapter.