Adaptive Dynamic Programming Algorithm For Uncertain Nonlinear Switched Systems
Corresponding Author:
Nguyen Hong Quang
Thai Nguyen University of Technology,
So 666 D. 3/2, P, Thành pho Thái Nguyên, Thái Nguyên, Vietnam
Email: [email protected]
1. INTRODUCTION
Many industrial systems can be described as switched systems, for example the DC-DC converter [1]-[3], the H-bridge inverter [4], the multilevel inverter [5], and the photovoltaic inverter [6]. Although many approaches for switched systems have been proposed, e.g., switching-delay tolerant control [7] and classical nonlinear control [8]-[12], optimization-based approaches, whose advantage is the ability to handle input/state constraints, have received little attention. Fuzzy and neural network approaches, including artificial neural networks (ANN), as well as the particle swarm optimization (PSO) technique, have been investigated for several different systems such as photovoltaic inverters and transmission lines [13]-[17].
Adaptive dynamic programming (ADP) has been considered in many situations, such as nonlinear continuous-time systems [18], actuator saturation [19], linear systems [20]-[22], and output constraints [23]. For nonlinear systems, the algorithm is implemented with neural networks (NNs), whereas for linear systems the Kronecker product is employed. Furthermore, data-driven techniques should be mentioned as a means of computing the actor/critic precisely. It should be noted that robotic systems have also been controlled by ADP algorithms [24]-[25].
This work proposes an adaptive dynamic programming solution for perturbed nonlinear switched systems based on neural networks. Considering the Hamiltonian function enables us to obtain the learning law of these neural networks. The uniformly ultimately bounded (UUB) stability of the closed-loop system is analyzed, and simulation results illustrate the effectiveness of the proposed controller.
2. PROBLEM STATEMENTS
Consider the following uncertain nonlinear continuous-time switched system:
$$\frac{d}{dt}\xi(t) = f_i(\xi(t)) + g_i(\xi(t))\left(u + \Delta(\xi,t)\right) \tag{1}$$
where $\xi(t) \in \Omega_x \subset \mathbb{R}^n$ denotes the state variables and $u(t) \in \Omega_u \subset \mathbb{R}^m$ denotes the control variables. The function $\beta : [0,+\infty) \to \Omega = \{1,2,\ldots,l\}$ is the switching signal, a piecewise continuous function of time that selects the active subsystem $i = \beta(t)$, and $l$ is the number of subsystems. $f_i(\xi)$ are uncertain smooth vector functions with $f_i(0) = 0$, and $g_i(\xi)$ are smooth vector functions satisfying $G_{\min} \leqslant \|g_i(\xi)\| \leqslant G_{\max}$. The switching index $\beta(t)$ is unknown.
Assumption 1: $\Delta(\xi,t)$ is bounded by a certain function $\varrho(\xi)$, i.e., $\|\Delta(\xi,t)\| \leqslant \varrho(\xi)$.
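As a minimal sketch of model (1) and Assumption 1, the following Python code integrates a scalar two-mode switched system with the forward Euler method. The subsystem functions `F` and `G`, the switching signal `beta`, the disturbance `delta`, the bound `varrho`, the feedback and the step size are all illustrative assumptions and are not taken from the paper.

```python
import math

# Hypothetical scalar subsystems (l = 2) with f_i(0) = 0; not the paper's example.
F = {1: lambda xi: -xi + 0.5 * math.sin(xi),
     2: lambda xi: -2.0 * xi}
G = {1: lambda xi: 1.0,
     2: lambda xi: 1.0 + 0.5 * math.cos(xi)}

def beta(t):
    """Assumed switching signal: the active mode changes every 0.5 s."""
    return 1 if int(t / 0.5) % 2 == 0 else 2

def varrho(xi):
    """Assumed bound on the matched disturbance (Assumption 1)."""
    return 0.3 * abs(xi)

def delta(xi, t):
    """Assumed disturbance; |delta| <= varrho(xi) holds by construction."""
    return varrho(xi) * math.sin(5.0 * t)

def simulate(xi0, u, t_end=5.0, dt=1e-3):
    """Forward-Euler integration of d(xi)/dt = f_i + g_i * (u + delta)."""
    xi, t, traj = xi0, 0.0, [xi0]
    for _ in range(int(round(t_end / dt))):
        i = beta(t)  # index of the active subsystem
        xi += dt * (F[i](xi) + G[i](xi) * (u(xi) + delta(xi, t)))
        t += dt
        traj.append(xi)
    return traj

# Simple stabilizing feedback chosen only for illustration.
traj = simulate(1.0, u=lambda xi: -xi)
```

The controller passed to `simulate` does not use the mode index, which mirrors the setting in which the switching index is unknown to the controller.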
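Consider the cost function associated with the uncertain switched system (1):
$$J(\xi,u) = \int_t^{\infty} r(\xi(\tau),u(\tau))\,d\tau \tag{2}$$
where the utility function is taken as $r(\xi,u) = \xi^T Q \xi + u^T R u$, the quadratic form that appears in (9) and (12).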
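To make the cost function (2) concrete, the sketch below evaluates it for a scalar linear system under a fixed linear feedback and compares a truncated numerical integral with the closed-form value. The quadratic utility `q*xi^2 + r_w*u^2`, the system parameters and the gain `k` are assumptions chosen only for illustration.

```python
def cost_closed_form(xi0, a, b, k, q, r_w):
    """J for xi' = a*xi + b*u with u = -k*xi and utility q*xi^2 + r_w*u^2.

    Valid only when the closed loop is stable, i.e. b*k > a.
    """
    return (q + r_w * k * k) * xi0 * xi0 / (2.0 * (b * k - a))

def cost_numeric(xi0, a, b, k, q, r_w, t_end=20.0, dt=1e-3):
    """Rectangle-rule approximation of the integral in (2), truncated at t_end."""
    xi, total = xi0, 0.0
    for _ in range(int(round(t_end / dt))):
        u = -k * xi
        total += dt * (q * xi * xi + r_w * u * u)  # accumulate the utility
        xi += dt * (a * xi + b * u)                # forward-Euler state update
    return total

j_exact = cost_closed_form(1.0, a=1.0, b=1.0, k=3.0, q=1.0, r_w=1.0)
j_approx = cost_numeric(1.0, a=1.0, b=1.0, k=3.0, q=1.0, r_w=1.0)
```

With these assumed values the closed-form cost is 2.5, and the numerical value agrees up to the discretization error of the Euler scheme.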
3. CONTROL DESIGN
The nominal system obtained by eliminating the disturbance from the switched system (1) is described by:
$$\frac{d}{dt}\xi = f_i(\xi) + g_i(\xi)u \tag{3}$$
The performance index of system (3) is modified as (4):
$$Q_1(\xi,u) = \int_t^{\infty}\left[r(\xi,u) + \gamma\varrho^2(\xi)\right]d\tau \tag{4}$$
We show that $Q_1(\xi,u)$ with $\gamma \geqslant \|R\|$ is an appropriate performance index for the dynamical system (1). Defining $V^*(t) = \min_{u\in\Omega_u} Q_1(\xi,u)$, we have (5):
$$V^*(t) = \min_{u\in\Omega_u}\int_t^{\infty}\left[r(\xi,u) + \gamma\varrho^2(\xi)\right]d\tau \tag{5}$$
Based on the nominal system (3) and the cost function (4), the Hamiltonian function is obtained as (6):
$$H(\xi,u,V^*) = r(\xi,u) + \gamma\varrho^2(\xi) + \left(\frac{\partial V^*}{\partial\xi}\right)^T\left(f_i(\xi) + g_i(\xi)u\right) \tag{6}$$
By the optimality principle, the optimal control input is obtained as (7):
$$u^*(\xi) = -\frac{1}{2}R^{-1}g_i^T(\xi)\frac{\partial V^*}{\partial\xi} \tag{7}$$
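The relation between the Hamiltonian (6) and the control law (7) can be checked on a case where the value function is known in closed form. The sketch below assumes a single scalar linear subsystem $\dot\xi = a\xi + bu$, a quadratic utility, and $\varrho(\xi) = \varpi|\xi|$, so that $\gamma\varrho^2$ is folded into an effective state weight `q_eff`; all names and numbers are illustrative assumptions.

```python
import math

def riccati_p(a, b, q_eff, r_w):
    """Positive root of q_eff + 2*a*p - (b*b/r_w)*p*p = 0, so that V*(xi) = p*xi^2."""
    return r_w * (a + math.sqrt(a * a + q_eff * b * b / r_w)) / (b * b)

def u_star(b, r_w, grad_v):
    """Control law (7) in the scalar case: u* = -(1/2) * R^{-1} * g * dV*/dxi."""
    return -0.5 * (1.0 / r_w) * b * grad_v

def hamiltonian(xi, u, grad_v, a, b, q_eff, r_w):
    """Hamiltonian (6) with r = q*xi^2 + R*u^2 and gamma*varrho^2 folded into q_eff."""
    return q_eff * xi * xi + r_w * u * u + grad_v * (a * xi + b * u)

a, b, q_eff, r_w = 1.0, 1.0, 3.0, 1.0     # assumed parameters
p = riccati_p(a, b, q_eff, r_w)           # value-function coefficient
xi = 2.0
grad_v = 2.0 * p * xi                     # dV*/dxi for V* = p*xi^2
u_opt = u_star(b, r_w, grad_v)
h_opt = hamiltonian(xi, u_opt, grad_v, a, b, q_eff, r_w)
h_other = hamiltonian(xi, u_opt + 0.1, grad_v, a, b, q_eff, r_w)
```

In this example the Hamiltonian vanishes at `u_opt` and is strictly positive for the perturbed input, which is consistent with (7) being the minimizer of (6).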
We then apply the control law (7) to the uncertain nonlinear continuous-time switched system (1) and obtain the following result.
Theorem 1: The system (1) under the controller $u^*(\xi) = -\frac{1}{2}R^{-1}g_i^T(\xi)\frac{\partial V^*}{\partial\xi}$ is stable with the associated Lyapunov function candidate:
Int J Pow Elec & Dri Syst, Vol. 12, No. 1, March 2021 : 551 – 557
Int J Pow Elec & Dri Syst ISSN: 2088-8694 ❒ 553
$$V(t) = \int_t^{\infty}\left[r(\xi,u) + \gamma\varrho^2(\xi)\right]d\tau \tag{8}$$
where $\gamma \geqslant \|R\|$.
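Proof: Taking the derivative of $V$ along the trajectories of (1) under the control input $u(\xi) = -\frac{1}{2}R^{-1}g_i^T(\xi)\nabla V^*$, we obtain (9):
$$\frac{d}{dt}V = -\xi^TQ\xi - \gamma\varrho^2(\xi) + \Delta^T(\xi,t)R\Delta(\xi,t) - \left(u+\Delta(\xi,t)\right)^TR\left(u+\Delta(\xi,t)\right) \tag{9}$$
Since $\gamma \geqslant \|R\|$ and $\|\Delta(\xi,t)\| \leqslant \varrho(\xi)$ by Assumption 1, we have $\Delta^TR\Delta \leqslant \gamma\varrho^2(\xi)$, and the last term in (9) is nonpositive. It follows that (10):
$$\dot V(t) \leqslant -\xi^TQ\xi \tag{10}$$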
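The role of the condition $\gamma \geqslant \|R\|$ can be checked numerically: together with Assumption 1 it guarantees that the penalty $\gamma\varrho^2(\xi)$ dominates any quadratic disturbance term $\Delta^TR\Delta$. The sketch below samples random disturbances for an assumed $2\times 2$ weighting matrix `R`; the matrix, the sampling ranges and the helper names are hypothetical.

```python
import math
import random

def spectral_norm_sym2(r):
    """Largest eigenvalue of a symmetric positive definite 2x2 matrix."""
    a, b, d = r[0][0], r[0][1], r[1][1]
    return 0.5 * (a + d) + math.sqrt(0.25 * (a - d) ** 2 + b * b)

def quad_form(r, v):
    """v^T R v for a 2x2 matrix R."""
    return sum(v[i] * r[i][j] * v[j] for i in range(2) for j in range(2))

R = [[2.0, 0.5], [0.5, 1.0]]       # assumed weighting matrix
gamma = spectral_norm_sym2(R)      # smallest gamma allowed by gamma >= ||R||

def worst_margin(samples=10000, seed=0):
    """Minimum of gamma*varrho^2 - Delta^T R Delta over random disturbances."""
    rng = random.Random(seed)
    margin = float("inf")
    for _ in range(samples):
        varrho = rng.uniform(0.0, 2.0)           # assumed value of the bound
        angle = rng.uniform(0.0, 2.0 * math.pi)
        size = rng.uniform(0.0, varrho)          # enforces ||Delta|| <= varrho
        delta = [size * math.cos(angle), size * math.sin(angle)]
        margin = min(margin, gamma * varrho ** 2 - quad_form(R, delta))
    return margin

margin = worst_margin()
```

The margin stays nonnegative (up to rounding) for every sample, which is the inequality used to pass from (9) to (10).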
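Therefore, the system (1) is robustly stable. However, the HJB equation cannot be solved directly. Hence, the optimal performance index $V^*$ for system (3) is described by a NN as (11):
$$V^*(\xi) = w^T\sigma(\xi) + \varepsilon(\xi) \tag{11}$$
where $w$ is the ideal weight vector, $\sigma(\xi)$ is the vector of $N$ activation functions and $\varepsilon(\xi)$ is the approximation error. The HJB equation reads (12):
$$H(\xi,u^*,V^*) = \xi^TQ\xi + \gamma\varrho^2(\xi) + (\nabla V^*)^Tf_i(\xi) - \frac{1}{4}(\nabla V^*)^Tg_i(\xi)R^{-1}g_i^T(\xi)\nabla V^* = 0 \tag{12}$$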
Formula (11) leads to (13):
$$\nabla V^* = (\nabla\sigma(\xi))^Tw + \nabla\varepsilon(\xi) \tag{13}$$
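Substituting (13) into (12), the residual error of the NN approximation is described by (14):
$$e_{NN} = -\nabla\varepsilon^T(\xi)\left(f_i(\xi)+g_i(\xi)u^*\right) + \frac{1}{4}\nabla\varepsilon^T(\xi)g_i(\xi)R^{-1}g_i^T(\xi)\nabla\varepsilon(\xi) \tag{14}$$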
It follows that $e_{NN}$ converges uniformly to zero as $N\to\infty$. For each $N$, $e_{NN}$ is bounded on a region as $e_{NN} \leqslant e_{\max}$. Under the structure of the ADP-based controller, a critic NN is computed as (15):
$$\hat V = \hat w^T\sigma(\xi) = \sigma^T(\xi)\hat w;\qquad \hat u = -\frac{1}{2}R^{-1}g_i^T(\xi)\nabla\hat V \tag{15}$$
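The critic structure in (15) can be sketched for a two-state, single-input system as follows. The quadratic polynomial basis `sigma`, the input vector `g` and the numerical values are assumptions made for illustration; the paper does not specify the activation functions in this section.

```python
def sigma(xi):
    """Assumed activation vector: quadratic monomials of the two states."""
    x1, x2 = xi
    return [x1 * x1, x1 * x2, x2 * x2]

def grad_sigma(xi):
    """Jacobian of sigma, one row per basis function (3 x 2)."""
    x1, x2 = xi
    return [[2.0 * x1, 0.0], [x2, x1], [0.0, 2.0 * x2]]

def v_hat(w_hat, xi):
    """Critic output V_hat = w_hat^T sigma(xi)."""
    return sum(w * s for w, s in zip(w_hat, sigma(xi)))

def grad_v_hat(w_hat, xi):
    """Gradient (grad sigma)^T w_hat, one entry per state."""
    rows = grad_sigma(xi)
    return [sum(w_hat[k] * rows[k][j] for k in range(3)) for j in range(2)]

def u_hat(w_hat, xi, g, r_inv):
    """Single-input version of (15): u_hat = -(1/2) R^{-1} g^T grad(V_hat)."""
    gv = grad_v_hat(w_hat, xi)
    return -0.5 * r_inv * sum(gj * gvj for gj, gvj in zip(g, gv))

w_example = [1.0, 0.0, 1.0]        # assumed critic weights
xi_example = [1.0, 2.0]
value = v_hat(w_example, xi_example)
control = u_hat(w_example, xi_example, g=[0.0, 1.0], r_inv=1.0)
```

Only the critic weights are stored; the control input is computed from the critic gradient, so no separate actor network is needed in this sketch.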
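Replacing $V^*$ by the critic $\hat V$ in the HJB equation gives the residual (16):
$$e_{HJB} = \xi^TQ\xi + \gamma\varrho^2(\xi) + \hat w^T\nabla\sigma(\xi)f_i(\xi) - \frac{1}{4}\hat w^T\nabla\sigma(\xi)g_i(\xi)R^{-1}g_i^T(\xi)\nabla\sigma^T(\xi)\hat w \tag{16}$$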
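As an illustration of the residual (16), the sketch below evaluates it for an assumed double integrator ($\dot\xi_1 = \xi_2$, $\dot\xi_2 = u$) with $Q = I$, $R = 1$ and no disturbance penalty, for which the exact critic weights on a quadratic basis are known from the corresponding Riccati equation. The system, the basis and the helper names are hypothetical and are not the paper's simulation example.

```python
import math

def grad_sigma(xi):
    """Jacobian of the assumed basis [x1^2, x1*x2, x2^2] (3 x 2)."""
    x1, x2 = xi
    return [[2.0 * x1, 0.0], [x2, x1], [0.0, 2.0 * x2]]

def e_hjb(w_hat, xi, f, g, q_diag, r_inv, gamma=0.0, varrho=0.0):
    """Residual (16) for a single-input system with diagonal Q."""
    rows = grad_sigma(xi)
    grad_v = [sum(w_hat[k] * rows[k][j] for k in range(3)) for j in range(2)]
    state_cost = sum(qj * xj * xj for qj, xj in zip(q_diag, xi))
    drift = sum(gv * fj for gv, fj in zip(grad_v, f(xi)))
    input_gain = sum(gv * gj for gv, gj in zip(grad_v, g(xi)))
    return state_cost + gamma * varrho ** 2 + drift - 0.25 * r_inv * input_gain ** 2

def f(xi):
    """Assumed drift of the double integrator."""
    return [xi[1], 0.0]

def g(xi):
    """Assumed constant input vector."""
    return [0.0, 1.0]

# Exact weights for this example: V* = sqrt(3)*x1^2 + 2*x1*x2 + sqrt(3)*x2^2.
w_opt = [math.sqrt(3.0), 2.0, math.sqrt(3.0)]
residual_opt = e_hjb(w_opt, [0.7, -1.2], f, g, [1.0, 1.0], 1.0)
residual_bad = e_hjb([1.0, 0.0, 1.0], [1.0, 1.0], f, g, [1.0, 1.0], 1.0)
```

The residual vanishes for the exact weights and is nonzero for a wrong guess, which is what the training law uses as its error signal.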
The training law is based on the steepest descent method:
$$\frac{d}{dt}\hat w = -\alpha\frac{\partial E}{\partial\hat w} \tag{17}$$
with $E = \frac{1}{2}e_{HJB}^Te_{HJB}$.
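A minimal sketch of the training law (17) is given below for a scalar subsystem with a single basis function $\sigma(\xi) = \xi^2$. It uses an Euler discretization and evaluates the residual on a fixed set of sample states instead of along a closed-loop trajectory; these choices, the parameter values and the function names are assumptions made for illustration.

```python
def e_hjb(w_hat, xi, a, b, q_eff, r_w):
    """Scalar residual (16) with sigma(xi) = xi^2, f = a*xi and g = b."""
    grad_sigma = 2.0 * xi
    return (q_eff * xi * xi + w_hat * grad_sigma * a * xi
            - 0.25 * (w_hat * grad_sigma * b) ** 2 / r_w)

def de_dw(w_hat, xi, a, b, r_w):
    """Partial derivative of the residual with respect to the critic weight."""
    grad_sigma = 2.0 * xi
    return grad_sigma * a * xi - 0.5 * w_hat * (grad_sigma * b) ** 2 / r_w

def train(w0, samples, a, b, q_eff, r_w, alpha=1.0, dt=0.01, steps=5000):
    """Euler discretization of (17): dw/dt = -alpha * dE/dw with E = e_hjb^2 / 2."""
    w = w0
    for _ in range(steps):
        # dE/dw = e_hjb * d(e_hjb)/dw, averaged over the sample states
        grad = sum(e_hjb(w, x, a, b, q_eff, r_w) * de_dw(w, x, a, b, r_w)
                   for x in samples) / len(samples)
        w -= dt * alpha * grad
    return w

w_final = train(2.0, [0.5, 1.0, -1.0], a=1.0, b=1.0, q_eff=3.0, r_w=1.0)
```

For these assumed values the residual has two roots in the weight, and the positive one (equal to 3) is the coefficient of the value function. The initial guess 2.0 lies in the region from which plain gradient descent reaches that root; a poorly chosen initial weight could converge to the other root instead.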
Remark 1: The weight $\hat w$ is trained to minimize the network error $G = \frac{1}{2}e_{HJB}^Te_{HJB}$. This result is obtained from (18):
$$\frac{\partial G}{\partial t} = -\alpha\left\|\frac{\partial G}{\partial\hat w}\right\|^2 \tag{18}$$
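The step behind (18) can be written out with the chain rule. The sketch below assumes that $G$ coincides with $E$ in (17) and that, over the time scale of the weight update, $G$ varies only through $\hat w$ (the state is treated as fixed); both are simplifying assumptions.

```latex
\frac{\partial G}{\partial t}
  = \left(\frac{\partial G}{\partial \hat{w}}\right)^{T}\frac{d\hat{w}}{dt}
  = -\alpha \left(\frac{\partial G}{\partial \hat{w}}\right)^{T}\frac{\partial G}{\partial \hat{w}}
  = -\alpha \left\lVert \frac{\partial G}{\partial \hat{w}} \right\rVert^{2} \leqslant 0
```

Under these assumptions $G$ is nonincreasing along the training law for any learning rate $\alpha > 0$.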
Adaptive dynamic programming algorithm for uncertain nonlinear switched systems (Dao Phuong Nam)
Theorem 2: Consider the feedback controller in (15) with the critic weight updated by (17). Then the weight estimation error $\tilde w = w - \hat w$ and the closed-loop state vector $\xi(t)$ are uniformly ultimately bounded (UUB).
Proof: Choose the Lyapunov function:
$$V(t) = V_1(t)+V_2(t),\quad\text{where } V_1(t) = \frac{1}{2\alpha}\tilde w^T(t)\tilde w(t),\quad V_2(t) = V^* \tag{19}$$
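Using Assumption 3, $\|f_i(\xi)+g_i(\xi)u^*\| \leqslant \rho_{\max}$, and the definitions $\mu_i = f_i(\xi)+g_i(\xi)u^*$; $G_i = g_i(\xi)R^{-1}g_i^T(\xi)$; $\nabla\sigma = \nabla\sigma(\xi)$; $\nabla\varepsilon = \nabla\varepsilon(\xi)$, and taking the derivative of $V_1(t)$, we obtain:
$$\dot V_1(t) = -\tilde w^T\left(-e_{NN} + \tilde w^T\nabla\sigma\mu_i + \frac{1}{2}\tilde w^T\nabla\sigma G_i\nabla\varepsilon + \frac{1}{4}\tilde w^T\nabla\sigma G_i\nabla\sigma^T\tilde w\right)\nabla\sigma\left(\mu_i + \frac{1}{2}G_i\left(\nabla\sigma^T\tilde w + \nabla\varepsilon\right)\right) \tag{20}$$
This leads to the estimate $\dot V_1(t) \leqslant -\pi_1$. For the term $V_2(t)$, from (20) we have (21).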
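$$\dot V_2 = (\nabla V^*)^T\left(f_i+g_i(\hat u+\Delta)\right) = -\left(\xi^TQ\xi+\gamma\varrho^2(\xi)\right) - \frac{1}{4}(\nabla V^*)^Tg_iR^{-1}g_i^T\nabla V^* + \frac{1}{2}(\nabla V^*)^Tg_iR^{-1}g_i^T\left(\nabla\sigma^T(\xi)\tilde w+\nabla\varepsilon(\xi)\right) + (\nabla V^*)^Tg_i\Delta \tag{21}$$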
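Assume that $\varrho(\xi) = \varpi\|\xi\|$. From (21) we have (22):
$$\dot V_2 \leqslant -\left(\lambda_{\min}(Q)+\gamma\varpi^2\right)\|\xi\|^2+\theta_2 \tag{22}$$
with $\theta_2 = -\frac{1}{4}(\nabla V^*)^Tg_iR^{-1}g_i^T\nabla V^* + \frac{1}{2}(\nabla V^*)^Tg_iR^{-1}g_i^T\left(\nabla\sigma^T(\xi)\tilde w+\nabla\varepsilon(\xi)\right) + (\nabla V^*)^Tg_i\Delta$.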
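Based on the two above assumptions, we have (23):
$$\theta_2 \leqslant \frac{1}{4}\left(w_{\max}\nabla\sigma_{\max}+\nabla\varepsilon_{\max}\right)^2g_{\max}^2\lambda_{\max}(R^{-1}) + \frac{1}{2}\left(\vartheta\nabla\sigma_{\max}+\nabla\varepsilon_{\max}\right)^2g_{\max}^2\lambda_{\max}(R^{-1}) \tag{23}$$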
Remark 2: The coefficients $\vartheta_1$, $\vartheta_2$ can be chosen by adjusting the NN of the optimal performance index. Moreover, for an arbitrary switching index, after the time $\frac{V(0)}{\min(\pi_1;\pi_2)}$ the variables $\|\xi\|$ and $\|\tilde w\|$ tend to their bounded domains. The ADP controller $\hat u$ proposed in (15) tends to a neighborhood of $u^*$.
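Proof: The deviation of the control input is estimated as (25):
$$\|\hat u-u^*\| = \left\|\frac{1}{2}R^{-1}g_i^T(\xi)\left((\nabla\sigma(\xi))^T\tilde w+\nabla\varepsilon(\xi)\right)\right\| \leqslant \frac{1}{2}\lambda_{\max}(R^{-1})\,G_{\max}\left(\nabla\sigma_{\max}\upsilon_1+\nabla\varepsilon_{\max}\right) = \vartheta_3 \tag{25}$$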
Thus the proof is completed.
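The bound (25) can be illustrated on a scalar example in which the approximation error is zero and the critic uses the single basis function $\sigma(\xi) = \xi^2$. The weights, the operating region and the helper names below are hypothetical values chosen only to show how the two sides of (25) compare.

```python
def u_from_weight(weight, xi, b, r_w):
    """Scalar controller -(1/2) R^{-1} g * d(sigma)/d(xi) * weight with sigma = xi^2."""
    return -0.5 * (1.0 / r_w) * b * (2.0 * xi) * weight

def deviation_bound(r_w, g_max, grad_sigma_max, w_tilde_max, grad_eps_max=0.0):
    """Right-hand side of (25) in the scalar case, where lambda_max(R^{-1}) = 1/R."""
    return 0.5 * (1.0 / r_w) * g_max * (grad_sigma_max * w_tilde_max + grad_eps_max)

w_true, w_est = 3.0, 2.8      # assumed ideal and estimated critic weights
xi_max = 1.0                  # assumed operating region |xi| <= 1
bound = deviation_bound(1.0, 1.0, 2.0 * xi_max, abs(w_true - w_est))

# Largest control deviation over a grid of states in the operating region.
worst = max(abs(u_from_weight(w_est, x / 100.0, 1.0, 1.0)
                - u_from_weight(w_true, x / 100.0, 1.0, 1.0))
            for x in range(-100, 101))
```

In this example the deviation reaches the bound at the edge of the region, so the estimate is tight when the approximation error vanishes.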
4. SIMULATION RESULTS
In this section, simulations are considered to validate the performance of the established control scheme. Let N = 2, and let the subsystems of the switched system be given by (26) and (27).
5. CONCLUSION
This paper has investigated the ADP problem for switched nonlinear systems under external disturbance. We first consider the nominal system obtained by eliminating the disturbance, and then use a classical nonlinear control technique. Neural networks have been designed to estimate the actor and critic of the iteration. It is possible to develop the learning algorithm with simultaneous tuning. Finally, the UUB property of the closed-loop system is guaranteed by this work.
ACKNOWLEDGEMENT
This research was supported by the Research Foundation funded by Thai Nguyen University of Technology.
REFERENCES
[1] Vu, Tran Anh and Nam, Dao Phuong and Huong, Pham Thi Viet, “Analysis and control design of
transformerless high gain, high efficient buck-boost DC-DC converters,” in 2016 IEEE International
Conference on Sustainable Energy Technologies (ICSET), Hanoi, 2016, pp. 72-77, doi:
10.1109/ICSET.2016.7811759.
[2] Nam, Dao Phuong and Thang, Bui Minh and Thanh, Nguyen Truong, “Adaptive Tracking Control for
a Boost DC–DC Converter: A Switched Systems Approach,” in 2018 4th International Conference on
Green Technology and Sustainable Development (GTSD), Ho Chi Minh City, 2018, pp. 702-705, doi:
10.1109/GTSD.2018.8595580.
[3] Thanh, Nguyen Truong and Sam, Pham Ngoc and Nam, Dao Phuong, “An Adaptive Backstepping
Control for Switched Systems in presence of Control Input Constraint,” in 2019 International Conference
on System Science and Engineering (ICSSE), Dong Hoi, Vietnam, 2019, pp. 196-200, doi:
10.1109/ICSSE.2019.8823125.
[4] Panigrahi, Swetapadma and Thakur, Amarnath, “Modeling and simulation of three phases cascaded H-
bridge grid-tied PV inverter,” Bulletin of Electrical Engineering and Informatics (BEEI), vol. 8, no. 1,
pp. 1-9, 2019, doi: 10.11591/eei.v8i1.1225.
[5] Devarajan, N and Reena, A, “Reduction of switches and DC sources in Cascaded Multilevel Inverter,”
Bulletin of Electrical Engineering and Informatics (BEEI), vol. 4, no. 3, pp. 186-195, 2015, doi:
10.11591/eei.v4i3.320.
[6] Venkatesan, M and Rajeshwari, R and Deverajan, N and Kaliyamoorthy, M, “Comparative study of three
phase grid connected photovoltaic inverter using pi and fuzzy logic controller with switching losses
calculation,” International Journal of Power Electronics and Drive Systems (IJPEDS), vol. 7, no. 2, pp.
543-550, 2016.
[7] Zhang, Lixian and Xiang, Weiming, “Mode-identifying time estimation and switching-delay tolerant
control for switched systems: An elementary time unit approach,” Automatica, vol. 64, pp. 174-181, 2016,
doi: 10.1016/j.automatica.2015.11.010.
[8] Yuan, Shuai and Zhang, Lixian and De Schutter, Bart and Baldi, Simone, “A novel Lyapunov function for
a non-weighted L2 gain of asynchronously switched linear systems,” Automatica, vol. 87, pp. 310-317,
2018, doi: 10.1016/j.automatica.2017.10.018.
[9] Xiang, Weiming and Lam, James and Li, Panshuo, “On stability and H∞ control of switched
systems with random switching signals,” Automatica, vol. 95, pp. 419-425, 2018, doi:
10.1016/j.automatica.2018.06.001.
[10] Lin, Jinxing and Zhao, Xudong and Xiao, Min and Shen, Jingjin, “Stabilization of discrete-time switched
singular systems with state, output and switching delays,” Journal of the Franklin Institute, vol. 356, pp.
2060-2089, 2019, doi: 10.1016/j.jfranklin.2018.11.034.
[11] Briat, Corentin, “Convex conditions for robust stabilization of uncertain switched systems with guaranteed
minimum and mode-dependent dwell-time,” Systems & Control Letters, vol. 78, pp. 63-72, 2015, doi:
10.1016/j.sysconle.2015.01.012.
[12] Lian, Jie and Li, Can, “Event-triggered control for a class of switched uncertain nonlinear systems,”
Systems & Control Letters, vol. 135, pp. 1-5, 2020, doi: 10.1016/j.sysconle.2019.104592.
[13] Anyaka, Boniface O and Manirakiza, J Felix and Chike, Kenneth C and Okoro, Prince A, “Optimal
unit commitment of a power plant using particle swarm optimization approach,” International
Journal of Electrical and Computer Engineering (IJECE), vol. 10, no. 2, pp. 1135-1141, 2020, doi:
10.11591/ijece.v10i2.pp1135-1141.
[14] Devi, Palakaluri Srividya and Santhi, R Vijaya, “Introducing LQR-fuzzy for a dynamic multi area LFC-
DR model,” International Journal of Electrical & Computer Engineering, vol. 9, no. 2, pp. 861-874, 2019,
doi: 10.11591/ijece.v9i2.pp861-874.
[15] Omar, Othman AM and Badra, Niveen M and Attia, Mahmoud A, “Enhancement of on-grid pv
system under irradiance and temperature variations using new optimized adaptive controller,”
International Journal of Electrical and Computer Engineering (IJECE), vol. 8, no. 5, pp. 2650-2660, 2018, doi:
10.11591/ijece.v8i5.2650-2660.
[16] Sharma, Purva and Saini, Deepak and Saxena, Akash, “Fault detection and classification in transmission
line using wavelet transform and ANN,” Bulletin of Electrical Engineering and Informatics (BEEI), vol.
5, no. 3, pp. 284-295, 2016.
[17] Ilamathi, P and Selladurai, V and Balamurugan, K, “Predictive modelling and optimization of nitrogen
oxides emission in coal power plant using Artificial Neural Network and Simulated Annealing,” IAES
International Journal of Artificial Intelligence (IJ-AI), vol. 1, no. 1, pp. 11-18, 2012.
[18] Vamvoudakis, Kyriakos G and Vrabie, Draguna and Lewis, Frank L, “Online adaptive algorithm for
optimal control with integral reinforcement learning,” International Journal of Robust and Nonlinear
Control, vol. 24, no. 17, pp. 2686-2710, 2013, doi: 10.1002/rnc.3018.
[19] Bai, Weiwei and Zhou, Qi and Li, Tieshan and Li, Hongyi, “Adaptive reinforcement learning neural
network control for uncertain nonlinear system with input saturation,” IEEE transactions on cybernetics,
vol. 50, no. 8, pp. 3433-3443, Aug. 2020, doi: 10.1109/TCYB.2019.2921057.
[20] Chen, Ci and Modares, Hamidreza and Xie, Kan and Lewis, Frank L and Wan, Yan and Xie, Shengli,
“Reinforcement learning-based adaptive optimal exponential tracking control of linear systems with unknown
dynamics,” in IEEE Transactions on Automatic Control, vol. 64, no. 11, pp. 4423-4438, Nov. 2019, doi:
10.1109/TAC.2019.2905215.
[21] Vamvoudakis, Kyriakos G and Ferraz, Henrique, “Model-free event-triggered control algorithm for
continuous-time linear systems with optimal performance,” in Automatica, vol. 87, pp. 412-420, 2018,
doi: 10.1016/j.automatica.2017.03.013.
[22] Gao, Weinan and Jiang, Yu and Jiang, Zhong-Ping and Chai, Tianyou, “Output-feedback adaptive optimal
control of interconnected systems based on robust adaptive dynamic programming,” Automatica, vol. 72,
pp. 37-45, 2016, doi: 10.1016/j.automatica.2016.05.008.
[23] Zhang, Tianping and Xu, Haoxiang, “Adaptive optimal dynamic surface control of strict-feedback
nonlinear systems with output constraints,” International Journal of Robust and Nonlinear Control, vol. 30,
no. 5, pp. 2059–2078, 2020, doi: 10.1002/rnc.4864.
[24] Wang, Ding and Mu, Chaoxu, “Adaptive-critic-based robust trajectory tracking of uncertain dynamics
and its application to a spring–mass–damper system,” IEEE Transactions on Industrial Electronics, vol.
65, no. 1, pp. 654-663, Jan. 2018, doi: 10.1109/TIE.2017.2722424.
[25] Wen, Guoxing and Ge, Shuzhi Sam and Chen, CL Philip and Tu, Fangwen and Wang, Shengnan,
“Adaptive tracking control of surface vessel using optimized backstepping technique,” IEEE transactions on
cybernetics, vol. 49, no. 9, pp. 3420-3431, Sept. 2019, doi: 10.1109/TCYB.2018.2844177.