Letters
DC/DC Power Converter Control-Based Deep Machine Learning Techniques:
Real-Time Implementation
Mojtaba Hajihosseini, Milad Andalibi, Meysam Gheisarnejad, Hamed Farsizadeh,
and Mohammad-Hassan Khooban, Senior Member, IEEE
Abstract—The recent advances in power plants and energy resources have extended the applications of buck-boost converters in the context of dc microgrids (MGs). However, the implementation of such interface systems in MG applications is seriously threatened by instability issues imposed by constant power loads (CPLs). The objective is to develop, without accurate modeling information of the dc MG system, a new adaptive control methodology for voltage stabilization of the dc–dc converters feeding CPLs with low ripples. To achieve this goal, in this letter, the deep reinforcement learning (DRL) technique with the Actor–Critic architecture is incorporated into an ultralocal model (ULM) control scheme to address the destabilization effect of the CPLs under reference voltage variations. In the suggested control approach, the feedback controller gains of the ULM controller are considered as the adjustable controller coefficients, which are adaptively designed by the DRL technique through online learning of its neural networks (NNs). It is proved that the suggested scheme ensures the rigorous stability of the power electronic system under simultaneous CPL and reference voltage changes by adaptively adjusting the ULM controller gains. To appraise the merits and usefulness of the suggested adaptive methodology, some dSPACE MicroLabBox outcomes on a real-time testbed of the dc–dc converter feeding a CPL are presented.
Index Terms—Constant power load (CPL), dc–dc buck-boost converter, deep reinforcement learning (DRL), ultralocal model (ULM).
Manuscript received January 3, 2020; revised February 9, 2020; accepted February 24, 2020. Date of publication March 2, 2020; date of current version June 23, 2020. (Corresponding author: Mohammad-Hassan Khooban.)
Mojtaba Hajihosseini and Milad Andalibi are with the School of Electrical and Computer Engineering, Shiraz University, Shiraz 71946-84471, Iran (e-mail: mojtaba. [email protected]; [email protected]).
Meysam Gheisarnejad is with the Department of Electrical Engineering, Islamic Azad University, Najafabad Branch, Isfahan 85141-43131, Iran (e-mail: [email protected]).
Hamed Farsizadeh is with the Department of Electrical and Electronics Engineering, Shiraz University of Technology, Shiraz 71946-84471, Iran (e-mail: [email protected]).
Mohammad-Hassan Khooban is with the DIGIT, Department of Engineering, Aarhus University, 8200 Aarhus, Denmark (e-mail: [email protected]).
Color versions of one or more of the figures in this article are available online at http://ieeexplore.ieee.org.
Digital Object Identifier 10.1109/TPEL.2020.2977765
I. INTRODUCTION
RECENTLY, the usage of dc microgrids (MGs) has widened in numerous industrial applications owing to their advantages over the ac MG [1]. Despite these advantages, dc MGs are faced with an instability problem of the dc–dc converters feeding constant power loads (CPLs) [2]–[4] that can lead to large oscillations in the voltage and frequency terms.
The advances in hardware technologies with high computing power have facilitated the practical implementation of advanced control methodologies, e.g., the nonfragile controller [1], sliding mode controller (SMC) [5], and backstepping scheme [6], to ameliorate the control performance of dc converters feeding CPLs. By combining a nonlinear disturbance observer and the backstepping technique, a composite nonlinear controller is developed in [6] to mitigate the instability imposed by CPLs on the dc MGs. In [7], a systematic and simple state feedback controller has been extended to stabilize dc MGs with multiple CPLs. To achieve stability and efficient performance, the authors of [7] represented the nonlinear dc MG with several CPLs by a Takagi–Sugeno fuzzy model combined with quadratic D-stability theory.
In the abovementioned works, the stabilization of the converters is achieved in the presence of ideal CPLs; however, due to the inevitable uncertainties (e.g., unmodeled dynamics) in practical applications, the robust model-based strategies fail to effectively suppress the CPL's nonlinearity. Moreover, the need for accurate modeling to design the model-based control strategies limits their applicability to processes with high nonlinearities. These difficulties motivated researchers to develop control techniques based on input–output (I/O) measurements, referred to as data-driven strategies [8], which dispense with the modeling procedure and with knowledge of the unknown dynamics. The model-independent schemes are among the most popular data-driven techniques; they are also known as model-independent adjusting or intelligent controllers, such as the intelligent proportional integral derivative controller [9] and model-independent nonsingular terminal sliding-mode control (MINTSMC) [10]. Based on the ultralocal model (ULM) concept, the model-independent schemes adopt a quick observer (e.g., extended state observer, sliding mode (SM) observer, etc. [10], [11]) to estimate the unknown terms of the process model. To achieve the optimal performance of the intelligent controllers, evolutionary algorithms (e.g., the genetic algorithm) are often adopted to adjust the design coefficients of the intelligent controllers in a heuristic manner. However, such approaches can guarantee the optimal performance of the system only for a specific cycle period; they lack the capability to learn from the observed process data and have restricted generalization capability.
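As a rough illustration of the ULM concept described above, the following Python sketch applies a basic intelligent proportional law to a toy first-order plant. Everything in it is an assumption made for illustration and is not taken from this letter: the plant coefficients a, b, and d, the assumed input gain alpha, the feedback gain kp, and the function name simulate_ulm_ip are hypothetical, and the unknown term F is estimated with a simple one-step slope estimate rather than with an SM observer. The gain kp is held fixed here, whereas the suggested scheme adjusts the feedback gains online through DRL.

```python
def simulate_ulm_ip(alpha=2.5, kp=10.0, y_ref=1.0, dt=1e-3, t_end=2.0):
    """Intelligent-P control based on the ultralocal model y' = F + alpha * u.

    The controller only sees measured outputs and its own past inputs.
    All numerical values are illustrative assumptions.
    """
    # Hypothetical plant, hidden from the controller: y' = -a*y + b*u + d
    a, b, d = 2.0, 3.0, 1.0

    y, y_prev, u_prev = 0.0, 0.0, 0.0
    history = []
    for _ in range(int(round(t_end / dt))):
        # Estimate the lumped unknown term F from the last measured slope
        # and the last applied input (stand-in for a fast observer).
        f_hat = (y - y_prev) / dt - alpha * u_prev

        # ULM control law for a constant reference (reference derivative is 0):
        # cancel the estimated F and add proportional feedback on the error.
        u = (-f_hat + kp * (y_ref - y)) / alpha

        # Store current values, then advance the plant by one Euler step.
        y_prev, u_prev = y, u
        y = y + dt * (-a * y + b * u + d)
        history.append(y)
    return history


if __name__ == "__main__":
    trace = simulate_ulm_ip()
    print(f"final output: {trace[-1]:.6f}")
```

Under these assumptions the output settles at the reference although the controller never uses a, b, or d, and although its assumed alpha differs from the true input gain b.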
0885-8993 © 2020 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission.
See https://www.ieee.org/publications/rights/index.html for more information.
Authorized licensed use limited to: Birla Institute of Technology & Science. Downloaded on January 26,2024 at 09:02:20 UTC from IEEE Xplore. Restrictions apply.
9972 IEEE TRANSACTIONS ON POWER ELECTRONICS, VOL. 35, NO. 10, OCTOBER 2020
TABLE I
PARAMETERS OF THE PPO TUNER
TABLE II
SPECIFICATIONS OF THE DC–DC BUCK-BOOST CONVERTER
Fig. 8. Transient outcomes of the MINTSMC scheme according to Scenario I.
Fig. 9. Transient outcomes of the PPO based-ULM control scheme with SM observer according to Scenario II.
At t = 0.3 s, the CPL's power is increased from its initial power to 250 W; at t = 0.7 s, the CPL's power is reduced from 250 to 75 W. The experimental outcomes of Figs. 9 and 10, including the CPL power (blue line), bus voltage (red line), and CPL current (green line), illustrate how the suggested controller and the MINTSMC scheme stabilize the buck-boost converter feeding the nonideal CPL under the reference voltage variations. By comparing the results of Figs. 9 and 10, one can observe that the experimental outcomes of the suggested ULM control scheme (realized based on the PPO agent) experience less control degradation than those of the MINTSMC scheme; specifically, within the interval [0.3 s, 0.7 s] of operation, where the highest CPL power is applied to the converter, the set points of the voltage are precisely tracked while, simultaneously, the power and current of the CPL experience smaller deviations from their nominal values. Such an improvement in the converter stabilization performance is valuable in power electronics engineering because it protects the connected CPL from large deviations of the system responses during a long transient.
For quantitative analysis of the suggested controller, several error measurement criteria, including the integral absolute error, mean square error, and mean absolute error corresponding to Scenario I and Scenario II, are compared using bar charts, as depicted in Fig. 11.
V. CONCLUSION
The robust stabilization problem of a class of power electronic systems exposed to dynamic loads has been studied in this letter. Particularly, by employing the adaptive capability of DRL, a novel adaptive model-independent ULM controller with an SM observer has been developed to suppress the destructive effects of CPLs when the system is subjected to reference voltage changes. This control strategy can achieve promising outcomes
due to the following two reasons. First, with a view to practical implementation, an SM observer is incorporated into the ULM control scheme, which ensures good compatibility with the unmodeled system dynamics. Second, the learning ability of the PPO agent keeps driving the feedback controller toward its optimal point, with the coefficients of the Actor and Critic NNs trained under the CPL and reference voltage changes.
The experimental outcomes of the prototype confirm an excellent transient behavior in the voltage responses of the dc–dc buck-boost converter with the suggested adaptive methodology compared with the MINTSMC controller.
REFERENCES
[1] N. Vafamand, M. H. Khooban, T. Dragičević, F. Blaabjerg, and J. Boudjadar, “Robust non-fragile fuzzy control of uncertain DC microgrids feeding constant power loads,” IEEE Trans. Power Electron., vol. 34, no. 11, pp. 11300–11308, Nov. 2019.
[2] S. R. Huddy and J. D. Skufca, “Amplitude death solutions for stabilization of DC microgrids with instantaneous constant-power loads,” IEEE Trans. Power Electron., vol. 28, no. 1, pp. 247–253, Jan. 2012.
[3] M. H. Khooban, M. Gheisarnejad, H. Farsizadeh, A. Masoudian, and J. Boudjadar, “A new intelligent hybrid control approach for DC/DC converters in zero-emission ferry ships,” IEEE Trans. Power Electron., vol. 35, no. 6, pp. 5832–5841, Jun. 2020.
[4] H. Farsizadeh, M. Gheisarnejad, M. Mosayebi, M. Rafiei, and M. H. Khooban, “An intelligent and fast controller for DC/DC converter feeding CPL in a DC microgrid,” IEEE Trans. Circuits Syst. II: Express Briefs, 2019.
[5] S. Singh, D. Fulwani, and V. Kumar, “Robust sliding-mode control of DC/DC boost converter feeding a constant power load,” IET Power Electron., vol. 8, no. 7, pp. 1230–1237, Jul. 2015.
[6] Q. Xu, C. Zhang, C. Wen, and P. Wang, “A novel composite nonlinear controller for stabilization of constant power load in DC microgrid,” IEEE Trans. Smart Grid, vol. 10, no. 1, pp. 752–761, Jan. 2019.
[7] M. M. Mardani, N. Vafamand, M. H. Khooban, T. Dragičević, and F. Blaabjerg, “Design of quadratic D-stable fuzzy controller for DC microgrids with multiple CPLs,” IEEE Trans. Ind. Electron., vol. 66, no. 6, pp. 4805–4812, Jun. 2019.
[8] J. Sun, J. Yang, W. X. Zheng, and S. Li, “GPIO-based robust control of nonlinear uncertain systems under time-varying disturbance with application to DC–DC converter,” IEEE Trans. Circuits Syst. II: Express Briefs, vol. 63, no. 11, pp. 1074–1078, Nov. 2016.
[9] H. Abouaïssa and S. Chouraqui, “On the control of robot manipulator: A model-free approach,” J. Comput. Sci., vol. 31, pp. 6–16, 2019.
[10] K.-H. Zhao et al., “Robust model-free nonsingular terminal sliding mode control for PMSM demagnetization fault,” IEEE Access, vol. 7, pp. 15737–15748, 2019.
[11] H. P. Wang, G. I. Y. Mustafa, and Y. Tian, “Model-free fractional-order sliding mode control for an active vehicle suspension system,” Advances Eng. Softw., vol. 115, pp. 452–461, 2018.
[12] L. Huang, X. Feng, C. Zhang, L. Qian, and Y. Wu, “Deep reinforcement learning-based joint task offloading and bandwidth allocation for multi-user mobile edge computing,” Digit. Commun. Netw., vol. 5, pp. 10–17, 2019.
[13] C. Wang, J. Wang, Y. Shen, and X. Zhang, “Autonomous navigation of UAVs in large-scale complex environments: A deep reinforcement learning approach,” IEEE Trans. Veh. Technol., vol. 68, no. 3, pp. 2124–2136, Mar. 2019.
[14] Y. Wang, J. Sun, H. He, and C. Sun, “Deterministic policy gradient with integral compensator for robust quadrotor control,” IEEE Trans. Syst., Man, Cybern., Syst., 2019.
[15] M. Gheisarnejad, J. Boudjadar, and M.-H. Khooban, “A new adaptive type-II fuzzy-based deep reinforcement learning control: Fuel cell air-feed sensors control,” IEEE Sens. J., vol. 19, no. 20, pp. 9081–9089, Oct. 2019.
[16] X. Wang, T. Li, and Y. Cheng, “Proximal parameter distribution optimization,” IEEE Trans. Syst., Man, Cybern., Syst., 2019.
[17] W. He and R. Ortega, “Voltage regulation in buck–boost converters feeding an unknown constant power load: An adaptive passivity-based control,” 2019, arXiv:1909.04438.
[18] Y. Zhang, Z. Deng, and Y. Gao, “Angle of arrival passive location algorithm based on proximal policy optimization,” Electronics, vol. 8, p. 1558, 2019.