Model Predictive Control For Virtual Synchronous
Abstract—We propose a nonlinear control scheme for frequency support in low-inertia microgrids with a high level of integration of renewable energy sources. We first develop a multi-loop reinforcement learning based controller with deep deterministic policy gradient optimization. Then, we apply it to the simultaneous frequency support and control of renewable energy generation. In addition, we adjust the reward system to track the thermal power and provide the balance between energy generation and consumption. The modified controller is shown to work well in several practical scenarios, in which it is compared to a single-loop RL controller.

Index Terms—Microgrid, reinforcement learning, renewable energy

I. INTRODUCTION

Concerns about the future state of the environment have drawn attention to electronic-based renewable energy sources. However, the integration of renewable energy in large amounts creates power stability problems such as frequency drops, power oscillations and mismatch in energy generation [1]–[3]. These challenges raise important problems for future power systems. Among them, frequency support has attracted increasing attention [3]–[6].

Several recent works [7]–[12] addressed the problem of robust frequency regulation with high penetration of renewable energy sources. For example, in [9] a robust H∞ controller is combined with a phase-locked loop (PLL) for microgrid support with 100% and 10% inertia and 80% penetration of renewable energy. In [13] the problem was studied using a fuzzy-logic controller in scenarios with various uncertainties, including 20% and 80% integration, and 80%, 40%, 30% system inertia mismatch in the primary/secondary control loops. In [14] a model predictive control approach was proposed and compared with a fuzzy-logic controller with 34% integration of RES power. In [12] a reinforcement learning based control strategy was proposed and compared with H∞ and PID controllers in a microgrid with 20% renewable energy integration and 100%, 80%, 40% system inertia. Frequency control by reinforcement learning was also addressed in [15], [16].

Optimal frequency support in islanded microgrids with a high integration level of renewable energy sources and decreasing inertia is one of the core challenges for future power systems [17], [18]. Therefore, the development of novel and advanced methodologies with the potential to handle renewable power, domestic loads and low-inertia uncertainties becomes extremely important. Traditional control schemes have certain limitations for optimal frequency support. For example, the standard PID controller is a single-input single-output system with limited performance, including weak tolerance to disturbances and efficiency only in specific conditions. The popular H∞ controller takes disturbances as additional inputs, but the synthesized controller provides an input/output relation similar to that of PID [9]. At the same time, data-driven algorithms can be implemented as multi-input systems. For example, in [13] a fuzzy-logic controller is designed with two inputs, $\Delta f$ and $\Delta P_{RES}$. A similar approach is applied in [14] to develop a state-space based model predictive controller with inputs $\Delta f$ and $\Delta P_W$. The work [12] proposed a reinforcement learning based control approach that considers the $\Delta f$ dynamics and acceleration, and the total power dynamics $\Delta P_M$. All of the above works consider the multi-input single-output scenario.

In this paper, we extend our previous work [12] and propose a reinforcement learning controller with a multi-input multi-output architecture. The new design enables more robust multi-loop frequency support in islanded/isolated microgrids with varying system inertia. Specifically, we develop a technique which provides simultaneous control of the renewable energy flow and virtual inertia emulation. Furthermore, we adjust the reward system by changing the error band for controller reward/punishment, selecting the optimal constants at each step, and changing the number of artificial neurons and fully connected layers in the network. The benefit of such a modification is the possibility to decrease the negative influence of renewable energy and to balance energy production between different generation sources. As a result, the extended RL controller provides effective frequency stability with integration of up to 50% renewable power.

The work of V. Skiparev and E. Petlenkov was partly supported by the Estonian Research Council grant PRG658.
978-1-6654-0507-2/21/$31.00 ©2021 IEEE
Authorized licensed use limited to: Sultan Qaboos University. Downloaded on October 06,2024 at 12:31:33 UTC from IEEE Xplore. Restrictions apply.
II. STRUCTURE OF THE STUDIED MICROGRID

The studied microgrid is adopted from several recent publications [8]–[10], [13] and is depicted in Fig. 1.

Fig. 1. Block scheme of the islanded microgrid with hierarchical control.

The addressed setup includes residential loads $\Delta P_{RL}$ and industrial loads $\Delta P_{IL}$, energy sources (thermal power plant $\Delta P_m$, wind farm $\Delta P_W$, and solar power plant $\Delta P_{PV}$), and an energy storage system. The thermal power plant is composed of a governor with generation rate constraints ($P_{g,max}$, $P_{g,min}$) and a turbine with a frequency rate limiter, which restricts the valve opening/closing ($V_U$, $V_L$). The dynamic model of the microgrid uses a hierarchical architecture with primary and secondary control loops. The primary control loop has the droop coefficient $1/R$, and the secondary loop has an area control error system $\Delta P_{ACE}$ with the second frequency controller $K_I$ and the first-order integrator $1/s$. The frequency regulation is performed by virtual inertia $\Delta P_{inertia}$ and renewable energy control $\Delta P_{RES}$. The microgrid balancing system is modeled as a transfer function with the microgrid damping coefficient $D$ and system inertia $H$. The power generated by the variable energy sources is computed as a combination of two random signals with a first-order hold. The deviation of frequency in the studied microgrid can be calculated as

\[
\Delta f = \frac{1}{2Hs + D}\left(\Delta P_m + \Delta P_{RES} + \Delta P_{inertia} - \Delta P_L\right), \tag{1}
\]

where

\begin{align*}
\Delta P_m &= \frac{1}{1 + sT_t}\,\Delta P_g, \\
\Delta P_g &= \frac{1}{1 + sT_g}\left(\Delta P_{ACE} - \frac{1}{R}\,\Delta f\right), \\
\Delta P_{ACE} &= \frac{K_I}{s}\,\Delta f, \\
\Delta P_L &= \Delta P_{RL} + \Delta P_{IL}.
\end{align*}

Modeling parameters of the microgrid are summarized in Table I.

TABLE I
NOMENCLATURE: PARAMETERS OF MICROGRID.

Parameter                                                              Value
$T_t$ (s), time constant of the turbine                                0.4
$T_g$ (s), time constant of the governor                               0.1
$K_I$ (s), integral control variable gain                              0.05
$H$ (p.u.MW s), system inertia                                         0.083
$D$ (p.u.MW s), system damping coefficient                             0.05
$R$ (Hz/p.u.MW), droop characteristic                                  2.4
$T_{WT}$ (s), time constant of wind turbines                           1.85
$T_{PV}$ (s), time constant of the solar system                        1.5
$T_{RES}$ (s), time constant of the renewable energy storage           1
$T_{VI}$ (s), time constant of the virtual inertia emulation           10
$V_U$, maximum limit of valve gate speed                               0.5
$V_L$, minimum limit of valve gate speed                               -0.5
$P_{inertia,max}$, maximum capacity of ESS                             0.25
$P_{inertia,min}$, minimum capacity of ESS                             -0.25
$P_{RES,max}$, maximum capacity of renewable power storage             0.5
$P_{RES,min}$, minimum capacity of renewable power storage             -0.5
$P_{g,max}$ (p.u. MW/min), maximum generation rate                     0.12
$P_{g,min}$ (p.u. MW/min), minimum generation rate                     -0.12

III. PROPOSED SOLUTION

A. Neural Actor-Critic

The neural actor-critic architecture is a special type of RL agent that relies on the parallel work of two artificial neural networks: the actor network $\mu(s \mid \theta^{\mu})$ and the critic network $Q(s, a \mid \theta^{Q})$. In the proposed solution the critic network tracks the errors arising from the interaction of the actor's actions $a_t$ with the environment state $s_t$ under the selected policy. The critic network corrects them in order to find the right estimate of the actor action $a_t$, i.e., the one which predicts the maximum possible reward $r$; see [19] for more technical details. The key advantage of reinforcement learning algorithms is data-driven learning based on interaction with the environment. In other words, when the agent takes an action, it expects to receive the reward $+r$ or the punishment $-r$. The control mechanism can be briefly summarized as follows. The measured variables $\Delta f$, $\Delta P_L$, $\Delta P_m$ form the observation for the RL agent. In this paper $\Delta f$ is considered as the control error and the other variables as disturbances. At the same time, the calculated deviations $\Delta f$ and $\Delta P_m$ go to the block "calculate reward" to reward or punish the neural RL actor. The general structure of the proposed controller is shown in Fig. 2.

B. Deep Deterministic Gradient Descent

Next, we briefly explain the key application details of the proposed controller optimized by deep deterministic policy gradient (DDPG). DDPG is a model-free reinforcement learning algorithm designed for tasks with a low-dimensional continuous action space [20]. The optimization phase is a fusion of deep Q-learning (DQL) and deterministic policy gradient (DPG) and uses the neural actor-critic strategy. The optimization principle is based on the search for a minimal difference between the target action-value
function of the policy $y_i$ and the critic network $Q(s_i, a_i \mid \theta^{Q})$ at each decision $a_t$ ($K_{VI}$) of the actor network $\mu(s \mid \theta^{\mu})$ per step $i$ in the state $s_t$ (i.e., $\Delta f_t$). This is done to minimize the loss function $L$ and receive the maximum possible reward $r_t$ per training episode. Next, we summarize the overall procedure in the form of pseudo-code, as shown in Algorithm 1.

Algorithm 1 DDPG Algorithm.
1: Initialize critic $Q(s, a \mid \theta^{Q})$ and actor $\mu(s \mid \theta^{\mu})$ networks with random weights $\theta^{Q}$ and $\theta^{\mu}$.
2: Initialize target networks $Q'$ and $\mu'$ with $\theta^{Q'} \leftarrow \theta^{Q}$, $\theta^{\mu'} \leftarrow \theta^{\mu}$, respectively.
3: Initialize replay buffer $R$.
4: for episode = 1 to $M$ do
5:   Receive initial process observation as state $s_1$.
6:   for $t$ = 1 to $T$ do
7:     Select action $a_t = \mu(s_t \mid \theta^{\mu})$ according to the current policy and exploration disturbances.
8:     Execute action $a_t$. Observe reward $r_t$, state $s_{t+1}$.
9:     Store transition $(s_t, a_t, r_t, s_{t+1})$ in $R$.
10:    Sample random minibatch of $N$ transitions $(s_i, a_i, r_i, s_{i+1})$ from $R$.
11:    Set $y_i = r_i + \gamma Q'(s_{i+1}, \mu'(s_{i+1} \mid \theta^{\mu'}) \mid \theta^{Q'})$.
12:    Update critic by minimizing loss: $L = \frac{1}{N} \sum_i (y_i - Q(s_i, a_i \mid \theta^{Q}))^2$.
13:    Update actor policy using sampled gradient:
       $\nabla_{\theta^{\mu}} J \approx \frac{1}{N} \sum_i \nabla_a Q(s, a \mid \theta^{Q})|_{s=s_i, a=\mu(s_i)} \, \nabla_{\theta^{\mu}} \mu(s \mid \theta^{\mu})|_{s_i}$.
14:    Update the target networks: $\theta^{Q'} \leftarrow \tau \theta^{Q} + (1-\tau)\theta^{Q'}$ and $\theta^{\mu'} \leftarrow \tau \theta^{\mu} + (1-\tau)\theta^{\mu'}$.
15:  end for
16: end for

of frequency deviation and the power generated by traditional power plant. To provide instructions for $\Delta f$ regulation, the system is organized as follows: the initial signal is transformed to the absolute value $|\Delta f|$; if $|\Delta f| < 0.05$ the RL agent receives the reward, otherwise the system applies the punishment. In order to provide reasonable rewarding for each action, the reward is limited to the range $P_m \in \{0.05, \ldots, 2\}$; however, due to the specific procedure of the controller training, the punishment is unlimited and multiplied by 2. To force the RL agent to adjust renewable energy generation to the traditional power plant, the reward/punishment system provides a negative reward $-10$ if $P_m \notin \{0.05, \ldots, 0.45\}$; otherwise this reward becomes 0. If the agent's actions accumulate a punishment $r_t < -500$, then the command forced replay is activated. The proposed implementation of the reward/punishment system can be summarized as:

\[
r_t =
\begin{cases}
\dfrac{1}{0.5 + |\Delta f|}, & \text{if } |\Delta f| < 0.05, \\
-2|\Delta f|, & \text{if } |\Delta f| > 0.05, \\
0, & \text{if } \Delta P_m \in \{0.05, \ldots, 0.45\}, \\
-10, & \text{if } \Delta P_m \notin \{0.05, \ldots, 0.45\}.
\end{cases}
\tag{2}
\]

D. Frequency Support Controller

Here we present a modified controller for frequency support, which combines the virtual inertia control and renewable energy control loops. The proposed architecture is designed to provide more advanced frequency support in low-inertia microgrids. The virtual inertia loop consists of a 3-input system (i.e., discrete-time integrator, derivative component and initial input), an energy storage system and a power limiter, as depicted in Fig. 3. It has the inertia power saturation limiter ($P_{inertia,max}$, $P_{inertia,min}$), which provides additional robustness for the tested algorithms and creates limitations for more realistic simulations. The renewable energy control loop is organized in a similar manner. It has the input $\Delta P_m$, an energy storage system and a power saturation limiter ($P_{RES,max}$, $P_{RES,min}$).
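
As a minimal sketch of the reward/punishment rule in (2), the following Python functions evaluate the frequency term and the thermal-power term separately and add them. The additive combination of the two terms, the treatment of the boundary $|\Delta f| = 0.05$ (which (2) does not cover) as punishment, the reading of $\{0.05, \ldots, 0.45\}$ as a closed interval, and all function and parameter names are assumptions made for illustration; they are not taken from the authors' implementation.

```python
def frequency_reward(delta_f, band=0.05):
    """Frequency term of (2): reward inside the error band, punishment outside."""
    abs_df = abs(delta_f)
    if abs_df < band:
        return 1.0 / (0.5 + abs_df)  # largest value is 2, reached at delta_f = 0
    # Unbounded punishment scaled by 2; the boundary case is assigned here (assumption).
    return -2.0 * abs_df


def thermal_power_reward(delta_p_m, low=0.05, high=0.45):
    """Thermal-power term of (2): 0 inside [low, high], -10 outside."""
    return 0.0 if low <= delta_p_m <= high else -10.0


def step_reward(delta_f, delta_p_m):
    """Assumed combination: the two terms of (2) are summed."""
    return frequency_reward(delta_f) + thermal_power_reward(delta_p_m)


def forced_replay(accumulated_reward, threshold=-500.0):
    """True once the accumulated punishment falls below the threshold."""
    return accumulated_reward < threshold
```

Under these assumptions, `step_reward(0.0, 0.2)` returns the maximum value 2, while any $\Delta P_m$ outside the admissible interval lowers the result by 10 regardless of the frequency term.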
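
The target, critic-loss and target-network update steps of Algorithm 1 can be illustrated with plain Python lists standing in for minibatches and network weights. This is only a sketch of the arithmetic: the actor gradient step is omitted, the network evaluations are assumed to be done elsewhere, and the default values of `gamma` and `tau` as well as all names are placeholders that do not come from this paper.

```python
def td_targets(rewards, next_q_values, gamma=0.99):
    """y_i = r_i + gamma * Q'(s_{i+1}, mu'(s_{i+1})) for each sampled transition."""
    return [r + gamma * q_next for r, q_next in zip(rewards, next_q_values)]


def critic_loss(targets, q_values):
    """Mean squared error L = (1/N) * sum_i (y_i - Q(s_i, a_i))^2."""
    n = len(targets)
    return sum((y - q) ** 2 for y, q in zip(targets, q_values)) / n


def soft_update(target_weights, online_weights, tau=0.001):
    """theta' <- tau * theta + (1 - tau) * theta', applied weight by weight."""
    return [tau * w + (1.0 - tau) * w_target
            for w_target, w in zip(target_weights, online_weights)]
```

A small `tau` makes the target networks follow the online networks slowly, which keeps the targets $y_i$ from changing abruptly between updates.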
The RoCoF defines the time derivative of the frequency signal, which is used to calculate the inertia response of the system as:

\[
\Delta P_{inertia} = \frac{K_{VI}}{1 + sT_{VI}} \, \frac{d(\Delta f)}{dt}, \tag{4}
\]

where $T_{VI}$ is the time constant of energy reservation and $K_{VI}$ is the virtual inertia constant. The renewable energy control flow can be defined as:

\[
\Delta P_{RES} = \frac{K_{RES}}{1 + sT_{RES}} \left(\Delta P_W + \Delta P_{PV}\right), \tag{5}
\]

where

\begin{align*}
\Delta P_W &= \frac{1}{1 + sT_{WT}} \, \Delta P_{wind}, \\
\Delta P_{PV} &= \frac{1}{1 + sT_{PV}} \, \Delta P_{solar},
\end{align*}

and $T_{RES}$ is the time constant of renewable energy storage and $K_{RES}$ is the renewable energy passing coefficient.

IV. NUMERIC RESULTS

Consider the microgrid shown in Fig. 1. In this paper we increase the upper limit on the total amount of renewable energy to the level of 50%, and experiment with 100%, 80% and 40% inertia. Figure 4 depicts the variation limits of renewable energy $\Delta P_{RES} \in \{0.35, 0.475\}$ and residential loads $\Delta P_L \in \{0.45, 0.75\}$. Microgrid parameters are presented in Table I.

middle plot shows results similar to those from the nominal case. In fact, according to the majority of numerical results, the performance of both implementations is affected. The bottom plot shows an observable influence of low inertia on the RoCoF change after the microgrid is launched. However, the rest of the picture is similar to the previous scenarios. A summary of several statistical measures is presented in Table II.
TABLE II
SUMMARY OF STATISTICAL MEASURES FOR DIFFERENT LEVELS OF INERTIA.
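
To illustrate how the inertia levels considered in this section enter the model, the following sketch integrates (1) together with the governor and turbine equations of Section II by forward Euler for a step change in load, using the parameters of Table I. It is an uncontrolled baseline rather than the RL-controlled system: $\Delta P_{RES}$ and $\Delta P_{inertia}$ are set to zero, the rate and valve limiters are omitted, and the secondary loop is integrated as $d(\Delta P_{ACE})/dt = -K_I \Delta f$ so that it acts as negative feedback, which is an assumption about the sign convention of Fig. 1. The load step of 0.1 p.u., the step size and all names are likewise illustrative assumptions.

```python
def simulate_load_step(inertia_scale, load_step=0.1, t_end=30.0, dt=1e-3):
    """Return the frequency-deviation trajectory for a load step applied at t = 0."""
    # Table I parameters; H is scaled to emulate 100%, 80% or 40% inertia.
    H = 0.083 * inertia_scale
    D, R, T_g, T_t, K_I = 0.05, 2.4, 0.1, 0.4, 0.05
    df = p_m = p_g = p_ace = 0.0
    trajectory = []
    for _ in range(round(t_end / dt)):
        trajectory.append(df)
        # Equation (1) in the time domain, with Delta P_RES = Delta P_inertia = 0.
        d_df = (p_m - load_step - D * df) / (2.0 * H)
        d_p_m = (p_g - p_m) / T_t             # turbine lag
        d_p_g = (p_ace - df / R - p_g) / T_g  # governor lag with droop 1/R
        d_p_ace = -K_I * df                   # integral loop, sign assumed
        # Forward Euler update of all states from the derivatives computed above.
        df += dt * d_df
        p_m += dt * d_p_m
        p_g += dt * d_p_g
        p_ace += dt * d_p_ace
    return trajectory


def initial_rocof(trajectory, dt=1e-3):
    """Finite-difference estimate of the RoCoF right after the load step."""
    return (trajectory[1] - trajectory[0]) / dt
```

Running `simulate_load_step` with `inertia_scale` set to 1.0, 0.8 and 0.4 and comparing `min(trajectory)` shows a deeper frequency nadir as inertia decreases, and the initial RoCoF equals $-\Delta P_L / (2H)$, which is why low-inertia scenarios are the demanding ones for the controller.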
after the microgrid launch and periodically repeat. This happens due to a conflict in power generation between the traditional power plant and renewable energy. The proposed modified reinforcement learning controller is designed to avoid such unwanted situations and yields better performance in all scenarios than the previously proposed single-loop controller.

V. CONCLUSION AND DISCUSSION

The presented paper demonstrates the potential of the extended reinforcement learning based controller in mitigating low-inertia phenomena under high penetration of renewable energy. Reinforcement learning control with a single loop and a single-input reward system cannot provide satisfactory control quality in microgrids with high renewable energy integration. Therefore, in this paper we propose a modified reward/punishment system and control architecture of a reinforcement learning based controller for virtual inertia and renewable energy flow control. The comparative study is based on three different scenarios in conditions with high renewable energy integration. In future research we plan to apply more advanced models of energy storage and to develop software for stability validation of RL based controllers.

REFERENCES
[1] "Renewables global futures report," REN21, Tech. Rep., 2017.
[2] B. Kroposki, B. Johnson, Y. Zhang, V. Gevorgian, P. Denholm, B.-M. Hodge, and B. Hannegan, "Achieving a 100% renewable grid: Operating electric power systems with extremely high levels of variable renewable energy," IEEE Power and Energy Magazine, vol. 15, no. 2, pp. 61–73, 2017.
[3] G. Magdy, E. A. Mohamed, G. Shabib, A. A. Elbaset, and Y. Mitani, "Microgrid dynamic security considering high penetration of renewable energy," Protection and Control of Modern Power Systems, vol. 3, no. 1, 2018.
[4] D. E. Olivares, A. Mehrizi-Sani, A. H. Etemadi, C. A. Canizares, R. Iravani, M. Kazerani, A. H. Hajimiragha, O. Gomis-Bellmunt, M. Saeedifard, R. Palma-Behnke, G. A. Jimenez-Estevez, and N. D. Hatziargyriou, "Trends in microgrid control," IEEE Transactions on Smart Grid, vol. 5, no. 4, pp. 1905–1919, 2014.
[5] M. Dreidy, H. Mokhlis, and S. Mekhilef, "Inertia response and frequency control techniques for renewable energy sources: A review," Renewable and Sustainable Energy Reviews, vol. 69, pp. 144–155, Mar. 2017.
[6] K. Y. Yap, C. R. Sarimuthu, and J. M.-Y. Lim, "Virtual inertia-based inverters for mitigating frequency instability in grid-connected renewable energy system: A review," Applied Sciences, vol. 9, no. 24, p. 5300, 2019.
[7] G. Magdy, A. Bakeer, G. Shabib, A. A. Elbaset, and Y. Mitani, "Decentralized model predictive control strategy of a realistic multi power system automatic generation control," in Nineteenth International Middle East Power Systems Conference, 2017.
[8] G. Magdy, G. Shabib, A. A. Elbaset, and Y. Mitani, "A novel coordination scheme of virtual inertia control and digital protection for microgrid dynamic security considering high renewable energy penetration," IET Renewable Power Generation, vol. 13, no. 3, pp. 462–474, 2019.
[9] T. Kerdphol, F. S. Rahman, M. Watanabe, and Y. Mitani, "Robust virtual inertia control of a low inertia microgrid considering frequency measurement effects," IEEE Access, vol. 7, pp. 57 550–57 560, 2019.
[10] H. Ali, G. Magdy, B. Li, G. Shabib, A. A. Elbaset, D. Xu, and Y. Mitani, "A new frequency control strategy in an islanded microgrid using virtual inertia control-based coefficient diagram method," IEEE Access, vol. 7, pp. 16 979–16 990, 2019.
[11] N. Sockeel, J. Gafford, B. Papari, and M. Mazzola, "Virtual inertia emulator-based model predictive control for grid frequency regulation considering high penetration of inverter-based energy storage system," IEEE Transactions on Sustainable Energy, vol. 11, no. 4, pp. 2932–2939, 2020.
[12] V. Skiparev, J. Belikov, and E. Petlenkov, "Reinforcement learning based approach for virtual inertia control in microgrids with renewable energy sources," in IEEE PES Innovative Smart Grid Technologies Europe (ISGT-Europe), The Hague, NL, 2020.
[13] T. Kerdphol, M. Watanabe, K. Hongesombut, and Y. Mitani, "Self-adaptive virtual inertia control-based Fuzzy logic to improve frequency stability of microgrid with high renewable penetration," IEEE Access, vol. 7, pp. 76 071–76 083, 2019.
[14] T. Kerdphol, F. Rahman, Y. Mitani, K. Hongesombut, and S. Küfeoğlu, "Virtual inertia control-based model predictive control for microgrid frequency stabilization considering high renewable energy integration," Sustainability, vol. 9, no. 5, p. 773, 2017.
[15] W. Guo, F. Liu, J. Si, and S. Mei, "Incorporating approximate dynamic programming-based parameter tuning into PD-type virtual inertia control of DFIGs," in International Joint Conference on Neural Networks, 2013.
[16] D. Shrestha, U. Tamrakar, N. Malla, Z. Ni, and R. Tonkoski, "Reduction of energy consumption of virtual synchronous machine using supplementary adaptive dynamic programming," in IEEE International Conference on Electro Information Technology, 2016.
[17] A. Ulbig, T. S. Borsche, and G. Andersson, "Impact of low rotational inertia on power system stability and operation," IFAC Proceedings Volumes, vol. 47, no. 3, pp. 7290–7297, 2014.
[18] H. Bevrani, T. Ise, and Y. Miura, "Virtual synchronous generators: A survey and new perspectives," International Journal of Electrical Power & Energy Systems, vol. 54, pp. 244–254, 2014.
[19] R. S. Sutton and A. G. Barto, Reinforcement Learning: An Introduction, 2nd ed. The MIT Press, 2018.
[20] T. P. Lillicrap, J. J. Hunt, A. Pritzel, N. Heess, T. Erez, Y. Tassa, D. Silver, and D. Wierstra, "Continuous control with deep reinforcement learning," arXiv, 2019.