Grid Impedance Shaping For Grid-Forming Inverters A Soft Actor Critic Deep Reinforcement Learning Algorithm
Oshnoei, Arman; Sorouri, Hoda; Oshnoei, Soroush; Teodorescu, Remus; Blaabjerg, Frede
Published in:
IPEMC 2024-ECCE Asia - 10th International Power Electronics and Motion Control Conference - ECCE Asia
Publication date:
2024
Grid Impedance Shaping for Grid-Forming
Inverters: A Soft Actor-Critic Deep Reinforcement
Learning Algorithm
Arman Oshnoei∗ , Hoda Sorouri∗ , Soroush Oshnoei† , Remus Teodorescu∗ , Frede Blaabjerg∗
∗ Department of Energy, Aalborg University, Aalborg, Denmark
† Department of Electrical and Computer Engineering, Aarhus University, Aarhus, Denmark
Abstract—This paper proposes an advanced method for adjusting grid impedance in grid-forming inverters, utilizing the Soft Actor-Critic Deep Reinforcement Learning (SAC-DRL) algorithm. The approach comprises a flexible strategy for controlling virtual impedance, supported by an equivalent grid impedance estimator. This facilitates accurate modifications of virtual impedance based on the grid's X/R ratio and the converter's power capacity, aiming to optimize power flow and maintain grid stability. A unique feature of this methodology is the division of virtual reactance into two segments: one adhering to standard control protocols and the other designated for precision enhancement via the SAC-DRL method. This strategy introduces a layer of intelligence to the system, strengthening its resilience against fluctuations in grid impedance. Experimental validations, executed on a laboratory setup, verify the robustness of this approach, highlighting its potential to significantly improve intelligent power grid management practices.

Index Terms—Virtual impedance, power decoupling, grid-forming inverter, soft actor-critic deep reinforcement learning, grid impedance estimation.

I. INTRODUCTION

Integrating distributed generators (DGs) into microgrids marks a paradigm shift in energy systems, emphasizing decentralized control and enhanced resilience. Grid-forming (GFM) inverters play a pivotal role in this integration, ensuring stability and reliable power-sharing, especially in islanded operational modes [1]. These inverters are crucial in transforming the operational dynamics of microgrids, enabling them to operate independently from the central grid and thus providing a more flexible and resilient power supply [2].

The conventional approach to managing this integration employs droop control strategies that emulate the behavior of synchronous generators. This method is essential for controlling microgrids, offering a simple and effective way to maintain system stability and distribute power evenly among various distributed energy resources. By mimicking the characteristics of traditional power systems, droop control provides a familiar framework for grid management [3], [4].

However, these strategies encounter limitations, especially in low-voltage microgrids with small X/R ratios, where an undesirable coupling between active and reactive powers emerges, compromising system stability [5]. This coupling effect can lead to voltage fluctuations and power quality issues, posing challenges to the stable operation of microgrids. These challenges are exacerbated in systems with a high penetration of renewable energy sources, such as solar PV and wind turbines, where the variability in power output can further destabilize the grid.

To overcome these challenges, various control and power decoupling strategies have been proposed. Virtual Impedance Decoupling Strategies (VI-DS) are designed to modify the output impedance of DGs, enabling the independent control of active and reactive powers [6]. By integrating a feed-forward control scheme, VI-DS effectively introduces a virtual impedance, creating either inductive or resistive characteristics in the DG output. This adjustment is achieved through the strategic implementation of a virtual impedance, which involves a series connection of negative resistance and inductance. Such a configuration alters the equivalent impedance between a virtual power source and the point of common coupling (PCC), rendering it inductive [7]. However, its performance is contingent on precise knowledge of the microgrid's structure, making it vulnerable to uncertainties and variations in grid impedance. The Q-V modified droop control method [8] offers an alternative that is effective under inductive conditions but struggles with complex feeder equivalent impedances. Also, closed-loop power decoupling strategies incorporate techniques like impedance droop estimation, small-signal injection, and virtual power source control [9]-[11]. The majority of grid-shaping techniques discussed in the literature rely on accurate knowledge of grid impedance parameters, rendering them less effective in situations where uncertainties arise from variations in grid impedance.

Recognizing these challenges, this paper introduces an innovative control strategy using the Soft Actor-Critic Deep Reinforcement Learning (SAC-DRL) algorithm to properly tune grid impedance in GFM inverters. This method integrates a dynamic virtual impedance control mechanism with an equivalent grid impedance estimator, facilitating precise, adaptive adjustments based on the grid's X/R ratio and the converter's power capacity. This ensures optimal power flow and grid stability, even in the face of impedance fluctuations. A
remarkable aspect of the proposed approach is the division of virtual reactance into two segments, providing standard control protocols with accuracy enhancement via SAC-DRL, thus adding a layer of intelligence and resilience to the system.

II. CONFIGURATION OF PROPOSED GFM INVERTER SETUP

Fig. 1 shows the configuration of the test system: a GFM inverter integrated into the grid at the PCC via a three-phase LC filter. The LC filter consists of capacitance $C_f$ and inductance $L_f$. A three-phase RL branch ($Z_g = r_g + j\omega l_g$) emulates the grid's impedance. The architecture of the GFM inverter includes a voltage source inverter equipped with PI controllers for inner current and voltage regulation in the dq reference frame. The active power control loop utilizes the synchronous power controller approach, simulating virtual inertia and providing damping to the system [12], [13]. The voltage of the GFM inverter is regulated through the inclusion of a PI controller within the reactive power control loop. The suggested impedance shaping method incorporates a block for estimating equivalent grid impedance, a block for defining the Virtual Impedance (VI) profile, and a block for the VI system based on the SAC-DRL method.

Fig. 1. Block diagram of the proposed GFM inverter setup.

A. Equivalent Grid Impedance Estimation

Traditional power flow models can predict the equivalent impedance of a feeder when the voltage and power metrics at the PCC are available. However, DG control mechanisms rely solely on information about output voltage and power for managing power flows, which restricts the ability to accurately determine the equivalent impedance at the PCC. Adding current sensors at the PCC can provide the necessary data for power analysis, thus facilitating the estimation of the PCC's equivalent impedance [1], [14]. This method, though, does not account for the effects of virtual impedance. Therefore, in this study, the reference voltages from the DG control system ($v_{o\alpha\beta}^{*}$) are included to accurately estimate the equivalent impedance that includes the impact of virtual impedance. Active and reactive power values corresponding to the equivalent impedance can be determined through the following calculations:

$$
\begin{aligned}
P_c &= \delta v_{b\alpha}\, i_{o\alpha} + \delta v_{b\beta}\, i_{o\beta} \\
Q_c &= \delta v_{b\alpha}\, i_{o\beta} - \delta v_{b\beta}\, i_{o\alpha}
\end{aligned}
\tag{1}
$$

In this context, $P_c$ and $Q_c$ represent the total power output of the DG system, while $\delta v_{b\alpha\beta} = v_{o\alpha\beta}^{*} - v_{o\alpha\beta}$ denotes the discrepancy in voltage between the designated reference and the actual output from the DG. $i_{o\alpha\beta}$ refers to the current observed at the PCC. To minimize the impact of the switching frequency on the calculations of $P_c$ and $Q_c$, low-pass filters are utilized.

Given the system's symmetry and balance, the magnitude of its equivalent impedance can be determined through the following estimation:

$$
Z_e = \frac{V_{o\alpha}^{*}}{I_{o\alpha}} = \frac{V_{o\beta}^{*}}{I_{o\beta}} \tag{2}
$$

The components of resistance and reactance in $Z_e$, representing its real and imaginary parts, are calculated as follows:

$$
R_e = \frac{Z_e P_c}{S_c} \tag{3}
$$

$$
X_e = \frac{Z_e Q_c}{S_c} \tag{4}
$$

where $S_c = \sqrt{P_c^2 + Q_c^2}$ represents the apparent power associated with the system's equivalent impedance.

This proposed method calculates the system's equivalent impedance, $Z_e$, breaking it down into its resistive ($R_e$) and inductive ($X_e$) components. This breakdown facilitates the assessment of the feeder's X/R ratio.

B. Definition of VI profile

The VI profile block specifies the VI profile and the values of its components to achieve the targeted X/R ratio. This is done by considering the estimated grid impedance ($Z_e = R_e + jX_e$) and the power availability from the converter. The virtual resistance and reactance ($r_{vi}$ and $x_{vi}$) are determined as follows:

$$
r_{vi} = \gamma R_e \tag{5}
$$

$$
x_{vi} = \lambda X_e \tag{6}
$$

where $\gamma$ and $\lambda$ represent the reduction factors, which are established based on the desired X/R ratio and the power availability of the GFM inverter.

C. SAC-DRL based VI control

The virtual reactance $x_{vi}$ is split into two components using a distribution factor $\mu$, leading to $x_{vi2} = \mu x_{vi}$ and $x_{vi1} = (1 - \mu)x_{vi}$. The approach for virtual reactance $x_{vi}$ adopts a variable structure strategy, where $x_{vi1}$ adheres to the conventional method, and $x_{vi2}$ employs a SAC-DRL-based method to compensate for unmodeled disturbances and accommodate potential variations in grid impedance. In Fig. 1, $Z_e = R_e + jX_e$ represents the estimated equivalent impedance, while $r_{vi}$ and $x_{vi}$ stand for the components of the VI. The term $x_{vi2}$, derived via the SAC-DRL algorithm, is combined with $x_{vi1}$ to fulfill the X/R control objectives.

The SAC method is a sophisticated reinforcement learning strategy employing deep neural networks to overcome the constraints of limited dimensionality in states and actions. Compared to the deep deterministic policy gradient technique, which also handles continuous state-action spaces, SAC achieves faster convergence [15]. Distinct from traditional RL algorithms, where the action-value function focuses on maximizing cumulative reinforcement signals, the SAC method additionally maximizes the entropy of the policy at each state. To do this, a soft Bellman function is adopted for the
action-value function [16]. The actor-network is designed to generate the best possible actions given the current state of the system. More information on this method is available in [15].

$x_{vi2}$ is the control design parameter adjustable through the SAC-DRL method. Consequently, the action within SAC-DRL is characterized as $a_c = x_{vi2}$. To assess the performance of the SAC agent, the reward signal is established to reduce the deviation between the reactive power, $Q_e$, and its set-point value, $Q_r$. Fig. 2 shows the proposed SAC-DRL-based grid impedance shaping implementation.

$$
r = \frac{1}{|Q_r - Q_e|} \tag{7}
$$

The system state, $S_c$, receives $Q_e$ as the input for the deep neural networks.

Fig. 2. Schematic of SAC-DRL applied to GFM inverter.

III. EXPERIMENTAL RESULTS

The proposed control strategy was validated using a GFM inverter laboratory setup shown in Fig. 2, configured as per the block diagram shown in Fig. 1. A Cinergia Grid Simulator is used to mimic the power grid. The main parameters of the system are detailed in Table I. The control system and the SAC-DRL-based grid impedance shaping algorithms were implemented on a fast prototyping system, dSPACE1202, operating at a sampling period of $T_s$ = 100 microseconds. The RL Agent block in MATLAB does not support Code Generation. However, enhancements have been made to the MATLAB Function Block to model SAC-DRL within Simulink. This modification permits the use of pre-trained networks, including reinforcement learning policies, for inference within Simulink. Consequently, this advancement enables the implementation of SAC-DRL in dSPACE1202 through code generation. For the initial synchronization of the inverter with an established grid voltage, a PLL is required, and then, after synchronization, a transition into GFM control is executed. It should be noted that the transition process is not mandatory because the synchronous power controller embedded in the active power loop naturally possesses the ability to synchronize. In this study, the transition approach is employed as a procedural measure in experimental tests to facilitate a soft start.

TABLE I
MAIN PARAMETERS OF GFM INVERTER SETUP
Parameter        Value
$l_g$, $r_g$     5.4 mH and 0.2 Ω
$L_f$, $C_f$     2.4 mH and 15 µF
$S_r$            1 kW
$V_{dc}$         200 V
$f_{SW}$         20 kHz
$v_g$            70 V RMS
$\omega_o$       100π rad/s

Fig. 3(a)–(c) shows the experimental comparisons ($P$, $Q$, $i_{oa}$) when $P_r$ steps from 600 W to 1000 W at t = 2 s and $Q_r$ steps from 100 to 300 VAr at t = 7 s in a stiff-grid connection (SCR = 8.66, estimated by grid inductance $L_g$). The proposed SAC-DRL-based grid impedance shaping method enhances oscillation damping and delivers faster and smoother dynamics, while also effectively reducing overshoot instances caused by power step insertions, compared to the virtual impedance implementation with fixed values ($r_{vi}$ = 0.38, $x_{vi}$ = 1.4).

Fig. 3. Step responses with/without the proposed grid shaping system when SCR = 8.6, $P_r$ steps from 600 W to 1000 W, and $Q_r$ steps from 100 VAr to 300 VAr: (a) active power; (b) reactive power; and (c) current.

Fig. 4. Step responses with/without the proposed grid shaping system when SCR = 2.88 and $P_r$ steps from 600 W to 1000 W: (a) active power; (b) reactive power; and (c) current.

Fig. 4(a)–(c) compares the system's performance with SCR = 2.88 and a step change of 400 W in $P_r$ at t = 5 s. The
findings also indicate that the power coupling is substantially lower with the proposed strategy compared to stronger grids such as Fig. 3(a)–(b).

IV. CONCLUSION

This paper proposed an intelligent method for adjusting grid impedance in GFM inverters through the SAC-DRL algorithm. This approach combines dynamic virtual impedance control with an equivalent grid impedance estimator. It enables precise and adaptive adjustments to virtual impedance based on the grid's X/R ratio and the converter's power capacity, ensuring optimal power flow and maintaining grid stability, even amid impedance fluctuations. The introduced SAC-DRL-based strategy significantly improves oscillation damping and ensures more rapid and fluid dynamic responses. Furthermore, it minimizes the occurrence of overshoots resulting from sudden power changes, highlighting its efficacy in enhancing grid operation and stability.

V. ACKNOWLEDGMENT

This work was supported by the Reliable Power Electronic-Based Power Systems (REPEPS) project and the “SMART BATTERY” project (project number 222860), both hosted at the AAU Energy Department, Aalborg University, and part of the Villum Investigator Program funded by the Villum Foundation.

REFERENCES

[1] R. L. d. A. Ribeiro, A. Oshnoei, A. Anvari-Moghaddam and F. Blaabjerg, “Adaptive Grid Impedance Shaping Approach Applied for Grid-Forming Power Converters,” in IEEE Access, vol. 10, pp. 83096–83110, 2022.
[2] A. Oshnoei, S. Peyghami, H. Mokhtari, and F. Blaabjerg, “Grid synchronization for distributed generations,” Encyclopedia of Sustainable Technologies. Elsevier, pp. 1–21, 2023.
[3] Y. Han, H. Li, P. Shen, E. A. A. Coelho and J. M. Guerrero, “Review of active and reactive power sharing strategies in hierarchical controlled microgrids,” IEEE Trans. Power Electron., vol. 32, no. 3, pp. 2427–2451, Mar. 2017.
[4] L. Ding, Q.-L. Han and X.-M. Zhang, “Distributed secondary control for active power sharing and frequency regulation in islanded microgrids using an event-triggered communication mechanism,” IEEE Trans. Ind. Informat., vol. 15, no. 7, pp. 3910–3922, Jul. 2019.
[5] J. M. Guerrero, J. C. Vasquez, J. Matas, L. G. de Vicuna and M. Castilla, “Hierarchical control of droop-controlled AC and DC microgrids—A general approach toward standardization,” IEEE Trans. Ind. Electron., vol. 58, no. 1, pp. 158–172, Jan. 2011.
[6] J. He, Y. W. Li and F. Blaabjerg, “An enhanced islanding microgrid reactive power, imbalance power, and harmonic power sharing scheme,” IEEE Trans. Power Electron., vol. 30, no. 6, pp. 3389–3401, Jun. 2015.
[7] C. Dou, Z. Zhang, D. Yue and M. Song, “Improved droop control based on virtual impedance and virtual power source in low-voltage microgrid,” IET Gener. Transmiss. Distrib., vol. 11, no. 4, pp. 1046–1054, Mar. 2017.
[8] H. Han, X. Hou, J. Yang, J. Wu, M. Su and J. M. Guerrero, “Review of power sharing control strategies for islanding operation of AC microgrids,” IEEE Trans. Smart Grid, vol. 7, no. 1, pp. 200–215, Jan. 2016.
[9] B. Liu, Z. Liu, J. Liu, R. An, H. Zheng and Y. Shi, “An adaptive virtual impedance control scheme based on small-AC-signal injection for unbalanced and harmonic power sharing in islanded microgrids,” IEEE Trans. Power Electron., vol. 34, no. 12, pp. 12333–12355, Dec. 2019.
[10] D. K. Alves, R. L. d. A. Ribeiro, F. B. Costa, T. d. O. A. Rocha and J. M. Guerrero, “Wavelet-based monitor for grid impedance estimation of three-phase networks,” IEEE Trans. Ind. Electron., vol. 68, no. 3, pp. 2564–2574, Mar. 2021.
[11] F. Zhao, X. Wang and T. Zhu, “Power Dynamic Decoupling Control of Grid-Forming Converter in Stiff Grid,” IEEE Transactions on Power Electronics, vol. 37, no. 8, pp. 9073–9088, Aug. 2022, doi: 10.1109/TPEL.2022.3156991.
[12] A. Oshnoei, H. Sorouri, R. Teodorescu and F. Blaabjerg, “An Intelligent Synchronous Power Control for Grid-Forming Inverters Based on Brain Emotional Learning,” IEEE Transactions on Power Electronics, vol. 38, no. 10, pp. 12401–12405, Oct. 2023.
[13] A. Oshnoei, S. Peyghami and F. Blaabjerg, “Intelligent Control Approach Applied for Grid-Forming Power Converters,” 2023 IEEE Applied Power Electronics Conference and Exposition (APEC), Orlando, FL, USA, 2023, pp. 3013–3019, doi: 10.1109/APEC43580.2023.10131254.
[14] A. Oshnoei, R. L. A. Ribeiro, A. Anvari-Moghaddam and F. Blaabjerg, “Learning-based Grid Impedance Shaping Method Applied for High-Accuracy Power Hardware-in-the-Loop,” 2023 11th International Conference on Power Electronics and ECCE Asia (ICPE 2023 - ECCE Asia), Jeju Island, Korea, Republic of, 2023, pp. 2543–2548, doi: 10.23919/ICPE2023-ECCEAsia54778.2023.10213962.
[15] A. Fathollahi et al., “Robust Artificial Intelligence Controller for Stabilization of Full-Bridge Converters Feeding Constant Power Loads,” IEEE Transactions on Circuits and Systems II: Express Briefs, vol. 70, no. 9, pp. 3504–3508, Sept. 2023, doi: 10.1109/TCSII.2023.3270751.
[16] Y. Zheng et al., “Load frequency active disturbance rejection control for multi-source power system based on soft actor-critic,” Energies, vol. 14, no. 16, p. 4804, 2021.
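
As a rough numerical illustration of the relations in Section II, the following Python sketch evaluates Eqs. (1) to (7) for a single operating point. It is a minimal sketch under stated assumptions, not the implementation used in the experiments: the function names, the sample numbers, and the small constant `eps` added to the reward denominator (to avoid division by zero when the reactive power error vanishes) are all hypothetical and do not come from the paper. The low-pass filtering of P_c and Q_c and the SAC agent itself are omitted.

```python
import math


def powers(dv_alpha, dv_beta, i_alpha, i_beta):
    """Eq. (1): active and reactive power from the voltage discrepancy
    (reference minus measured) and the PCC current, in the alpha-beta frame."""
    p_c = dv_alpha * i_alpha + dv_beta * i_beta
    q_c = dv_alpha * i_beta - dv_beta * i_alpha
    return p_c, q_c


def equivalent_impedance(v_ref_amp, i_amp, p_c, q_c):
    """Eqs. (2)-(4): impedance magnitude from the reference-voltage and
    current amplitudes, then split into resistive and reactive parts."""
    z_e = v_ref_amp / i_amp              # Eq. (2)
    s_c = math.hypot(p_c, q_c)           # apparent power sqrt(P_c^2 + Q_c^2)
    r_e = z_e * p_c / s_c                # Eq. (3)
    x_e = z_e * q_c / s_c                # Eq. (4)
    return r_e, x_e


def virtual_impedance(r_e, x_e, gamma, lam, mu):
    """Eqs. (5)-(6) plus the split of x_vi by the distribution factor mu.
    x_vi1 follows the conventional rule; x_vi2 is the share that the
    SAC-DRL agent would adjust (here it is only the nominal value)."""
    r_vi = gamma * r_e                   # Eq. (5)
    x_vi = lam * x_e                     # Eq. (6)
    x_vi2 = mu * x_vi
    x_vi1 = (1.0 - mu) * x_vi
    return r_vi, x_vi1, x_vi2


def reward(q_ref, q_meas, eps=1e-6):
    """Eq. (7) with an assumed guard term eps (not in the paper) so the
    reward stays finite when the reactive power error is zero."""
    return 1.0 / (abs(q_ref - q_meas) + eps)


if __name__ == "__main__":
    # Hypothetical sample values, chosen only to make the arithmetic easy to follow.
    p_c, q_c = 3.0, 4.0
    r_e, x_e = equivalent_impedance(v_ref_amp=10.0, i_amp=2.0, p_c=p_c, q_c=q_c)
    r_vi, x_vi1, x_vi2 = virtual_impedance(r_e, x_e, gamma=0.5, lam=0.25, mu=0.2)
    print(f"R_e={r_e:.3f}, X_e={x_e:.3f}, X/R={x_e / r_e:.3f}")
    print(f"r_vi={r_vi:.3f}, x_vi1={x_vi1:.3f}, x_vi2={x_vi2:.3f}")
    print(f"reward={reward(300.0, 290.0):.4f}")
```

In a closed-loop setting, `x_vi2` would be replaced at every control step by the action of the trained agent, while `r_vi` and `x_vi1` would continue to follow the estimated impedance.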