arXiv:2108.06764v1 [eess.SP] 15 Aug 2021

Abstract: In order to reduce the negative impact of the uncertainty of load and renewable energies outputs on microgrid operation, an optimal scheduling model is proposed for isolated microgrids by using automated reinforcement learning-based multi-period forecasting of renewable power generations and loads. Firstly, a prioritized experience replay automated reinforcement learning (PER-AutoRL) is designed to simplify the deployment of deep reinforcement learning (DRL)-based forecasting model in a customized manner, the single-step multi-period forecasting method based on PER-AutoRL is proposed for the first time to address the error accumulation issue suffered by existing multi-step forecasting methods, then the prediction values obtained by the proposed forecasting method are revised via the error distribution to improve the prediction accuracy; secondly, a scheduling model considering demand response is constructed to minimize the total microgrid operating costs, where the revised forecasting values are used as the dispatch basis, and a spinning reserve chance constraint is set according to the error distribution; finally, by transforming the original scheduling model into a readily solvable mixed integer linear programming via the sequence operation theory (SOT), the transformed model is solved by using CPLEX solver. The simulation results show that compared with the traditional scheduling model without forecasting, this approach manages to significantly reduce the system operating costs by improving the prediction accuracy.

Index Terms: Microgrid, optimal scheduling, automated reinforcement learning, uncertainty handling, single-step multi-period forecasting, sequence operation theory.

I. INTRODUCTION

generation and load, this has caused a strong obstacle to day-ahead dispatching plans of a microgrid [2]. It’s known that an accurate forecasting result can provide a reliable basis for dispatching plans to arrange the start and stop of microturbines (MT) and set the spinning reserve capacity. A scheduling model combined with advanced forecasting methods can undoubtedly reduce the impact of the uncertainty of renewable generations and load on microgrids, and improve the economy of microgrid operation [3].

A. Literature Review

Forecasting methods of renewable energy generation and load have been extensively investigated. As far as current forecasting methods are concerned, they are mainly divided into two categories. 1) One is the traditional time series analysis method: In the 1970s, George Box and Gwilym Jenkins proposed the “Box-Jenkins method”, Ref. [4] used this method for the first time for short-term load forecasting, and subsequently, more modern mathematical theories were applied to power system forecasting. In [5], a gray theory was used to predict electricity prices and renewable energy power generation, and [6] used probability forecasting methods to predict the load. However, traditional time series forecasting methods heavily rely on the choice of model parameters, the appropriate model parameters can largely determine the accuracy of the predicted results, and they have a poor generic capability. 2) Another one is machine learning (ML) algorithms such as support vector
but such methods require a lot of domain knowledge and human intervention when constructing a forecasting model, which seriously affects the efficiency of model construction and the model’s versatility. Accurate multi-period forecasting results are necessary for microgrid day-ahead scheduling. Unfortunately, the methods used in the aforementioned literature can only guarantee a certain accuracy for multi-period prediction, which poses a significant barrier to developing a high-performance forecasting model for practical applications. In this context, our motivation is to leverage a reinforcement learning framework for renewable energy and load forecasting, which, compared with supervised learning, is more robust and can learn more potential connections from the same amount of information, thereby improving the performance of a forecasting model [12].

For isolated microgrids, how to deal with the uncertainty of renewable energy output and load is a crucial problem, owing to their relatively small capacity and the unavailability of power support from the main grid. An economic operation model of a microgrid was proposed earlier in [13]; on the basis of including multiple forms of DGs, the model considered the constraints of microgrid cogeneration, reserve capacity, etc. Ref. [14] adopted robust programming and expressed the variation range of wind turbine (WT) output and load according to an uncertainty set. In order to improve the operating economy of the microgrid, Ref. [15] used multivariate global sensitivity analysis to identify and retain the key uncertain factors that affect the operation of the system. In [16], the authors fitted the best probability distribution based on historical data of WT output and load, then used the chance constraint programming method for microgrid dispatch. This method balanced the economy and reliability of microgrid operation, but it used WT output and load data in their expectation form as the input for scheduling, which was not able to provide a reliable basis for the dispatch model. In [17], the authors used LSTM to predict PV output and load, and then used particle swarm optimization to optimize the scheduling model. However, in this method, the forecasting model and the scheduling model are two separate parts.

It can be seen from the above literature that existing research has discussed the optimal scheduling of microgrids in depth. However, the following gaps still need to be addressed:

1) The architecture and hyperparameters of a machine learning model have a significant impact on model performance, and building a learning model requires a lot of domain knowledge and human intervention.
2) The prediction accuracies of most existing forecasting models are not ideal for multi-period prediction, so they cannot provide reliable data support for microgrid day-ahead scheduling.
3) The existing forecasting approaches rarely consider uncertainty modeling of forecasting errors and the impact of the errors on forecasting results, which will inevitably deteriorate the forecasting performance.
4) Most current microgrid scheduling models adopt the load and renewable energy generations in the form of fixed values or rough estimations, and do not combine a sophisticated forecasting model to establish an organic dynamic integration.

B. Contribution of This Paper

The main contributions of this paper are fourfold:

1) A prioritized experience replay automated reinforcement learning (PER-AutoRL) is designed to predict renewable energy outputs and load in a customized manner, which can automatically determine the most appropriate model architecture and optimize hyperparameters based on the input data. This design will improve modeling efficiency, strengthen the combination of forecasting and scheduling, and provide reliable data support for scheduling.
2) We propose a multi-period single-step forecasting method based on PER-AutoRL, which can significantly improve the forecasting accuracy by solving the error accumulation issues suffered by traditional multi-step forecasting methods.
3) By modeling uncertainty of forecasting errors, we consider the impact of the errors on the prediction accuracy, and revise the predicted values according to the error distribution described by the t location-scale (TLS) distribution to further improve the forecasting performance.
4) We construct a microgrid scheduling model integrating the designed forecasting method with consideration of demand response, which can significantly reduce the microgrid operating costs.

II. FORECASTING MODEL BASED ON PRIORITIZED EXPERIENCE REPLAY AUTOMATED REINFORCEMENT LEARNING

A. Deep Deterministic Policy Gradient

Deep reinforcement learning (DRL) is a combination of deep learning and reinforcement learning, which provides solutions for the perception and decision-making problems of sophisticated systems [18]. As a powerful DRL algorithm, deep deterministic policy gradient (DDPG) has demonstrated a strong ability in dealing with continuous action space problems.

In DDPG, deep neural networks with parameters $\theta^{\mu}$ and $\theta^{Q}$ are used to represent the main actor network, $a = \pi(s|\theta^{\mu})$, and the main critic network, $Q(s, a|\theta^{Q})$. The main function of the Actor part is to interact with the environment, that is, to directly select the output action $a$ according to state $s$, and to get the next state $s'$ and reward $r$ after interacting with the environment. The objective function is the total reward with a discount factor $\gamma$, which is formulated as

$$
J(\theta^{\mu}) = E_{\theta^{\mu}}\left[r_1 + \gamma r_2 + \gamma^{2} r_3 + \dots + \gamma^{i-1} r_i\right] \tag{1}
$$

To improve the total reward $J$, the objective function is optimized through the stochastic gradient. Silver et al. proved that the gradient of the objective function with respect to $\theta^{\mu}$ is equivalent to the expectation gradient of the Q-value function with respect to $\theta^{Q}$ [19]. When updating the main actor network, the gradient can be approximated as
IEEE TRANSACTIONS ON SUSTAINABLE ENERGY 3
$$
P_t^{IE} \le \rho\, C(P_{EL,t}^{Pred}) \quad \forall t \tag{18}
$$

$$
P_{EL,t}^{Pred} = P_{L,t}^{Pred} - (P_{WT,t}^{Pred} + P_{PV,t}^{Pred}) \tag{19}
$$

where $\rho$ is the ratio of $P_t^{IE}$; $P_t^{CH}$ and $P_t^{DC}$ are the charging and discharging power of the energy storage in period $t$; $P_t^{CNLOAD}$ is the controllable load; $P_{EL,t}^{Pred}$ is the predicted value of the EL; and $C(P_{EL,t}^{Pred})$ is the revised value of $P_{EL,t}^{Pred}$. The calculation process is as follows:

$$
C(P_{EL,t}^{Pred}) = P_{L,t}^{Pred} - E(\sigma_t^{L}) - [P_{WT,t}^{Pred} - E(\sigma_t^{WT})] - [P_{PV,t}^{Pred} - E(\sigma_t^{PV})] \tag{20}
$$

b) MT output constraint: The output of MT must comply with the following inequality:

$$
U_{n,t}^{MT} P_{n,\min}^{MT} \le P_{n,t}^{MT} \le U_{n,t}^{MT} P_{n,\max}^{MT} \quad \forall t,\ n \in M_G \tag{21}
$$

where $P_{n,\max}^{MT}$ and $P_{n,\min}^{MT}$ are the upper and lower limits of the output power of MT unit $n$.

c) Energy storage system constraints: In this paper, the energy storage system (ESS) adopts lead-acid batteries to balance the random fluctuations in the microgrid because it provides many advantages, such as low price and long service life [27].

Charge-discharge equation: The relationship between the ESS and the charge-discharge powers is expressed as

$$
S_{t+1} = S_t + (\eta_{ch} P_t^{CH} - P_t^{DC}/\eta_{dc})\,\Delta t \quad \forall t \tag{22}
$$

where $S_{t+1}$ and $S_t$ are respectively the ESS energy storage at the beginning of period $t+1$ and $t$, $\eta_{ch}$ and $\eta_{dc}$ are the charge/discharge efficiency, and $\Delta t$ denotes the duration of each period, which is taken as 1h in this paper.

The output limits of lead-acid batteries:

$$
\begin{aligned}
0 \le P_t^{DC} \le P_{\max}^{DC} \quad \forall t \\
0 \le P_t^{CH} \le P_{\max}^{CH} \quad \forall t
\end{aligned} \tag{23}
$$

where $P_{\max}^{CH}$ and $P_{\max}^{DC}$ are the maximum values of lead-acid battery charge and discharge in time period $t$, respectively.

The capacity limit of lead-acid batteries is

$$
S_{\min} \le S_t \le S_{\max} \quad \forall t \tag{24}
$$

where $S_{\max}$ and $S_{\min}$ are the maximum and minimum allowable capacities of the batteries, respectively.

The ESS starting and ending constraints are

$$
S_0 = S_{T_{end}} = S_{*} \quad \forall t \tag{25}
$$

where $S_0$ is the ESS initial energy storage, $S_{*}$ is the limit of the initial energy storage in ESS, and $T_{end}$ is the end of the scheduling cycle ($T_{end}$ = 24h in this paper). For the ESS energy balance, we set the remaining capacity $S_{T_{end}}$ of the ESS after each scheduling cycle the same as $S_0$.

d) Spinning reserve constraint: The spinning reserve is an important resource for an isolated microgrid to suppress the renewable DGs outputs fluctuations and ensure the system operates reliably. The required spinning reserves are provided by the MT units and ESS. Therefore, the spinning reserve constraints are expressed as

$$
P_{n,t}^{MT} + R_{n,t}^{MT} \le U_{n,t}^{MT} P_{n,\max}^{MT} \quad \forall t,\ n \in M_G \tag{26}
$$

$$
P_{Ress,t} \le \min\{\eta_{dc}(S_t - S_{\min})/\Delta t,\ P_{\max}^{DC} - P_t^{DC}\} \quad \forall t \tag{27}
$$

where $P_{Ress,t}$ denotes the ESS reserve capacity.

It can be seen from (17) that when the power balance constraint is considered, the EL is processed by the revised prediction value $C(P_{EL,t}^{Pred})$. Due to the uncertainty of load and renewable energy outputs, the total spinning reserve provided by ESS and MTs must be able to make up for the difference between the fluctuating EL and $C(P_{EL,t}^{Pred})$.

When the extreme situation with zero renewable energy outputs occurs, the system needs a large spinning reserve capacity, which will incur high spinning reserve costs. However, the possibility of such cases occurring is very low. To balance the reliability and economy in the economic operation of the microgrid, the spinning reserve constraint can be expressed as a chance constraint [16]:

$$
\mathrm{Prob}\left\{\sum_{n=1}^{M_G} R_{n,t}^{MT} + P_{Ress,t} \ge (P_t^{L} - P_t^{WT} - P_t^{PV}) - C(P_{EL,t}^{Pred})\right\} \ge \alpha \quad \forall t \tag{28}
$$

By using

$$
\begin{aligned}
P_t^{L} &= P_{L,t}^{Pred} - \sigma_t^{L} \\
P_t^{WT} &= P_{WT,t}^{Pred} - \sigma_t^{WT} \\
P_t^{PV} &= P_{PV,t}^{Pred} - \sigma_t^{PV}
\end{aligned} \tag{29}
$$

where $P_t^{L}$ denotes load in period $t$; $P_t^{WT}$ and $P_t^{PV}$ are the WT and PV outputs, respectively; $C(P_{EL,t}^{Pred})$ can be calculated by (20). Accordingly, (28) can be simplified to

$$
\mathrm{Prob}\left\{\sum_{n=1}^{M_G} R_{n,t}^{MT} + P_{Ress,t} \ge E(\sigma_t^{EL}) - (\sigma_t^{L} - \sigma_t^{WT} - \sigma_t^{PV})\right\} \ge \alpha \quad \forall t \tag{30}
$$

where $\alpha$ is the preset confidence level, and $E(\sigma_t^{EL})$ is the expectation of the EL forecasting error which can be obtained by the method in Sec. III.

B. Deterministic Transformation of Chance Constraints

To facilitate the processing of (30), we design a new type of 0-1 variable as follows:

$$
W_{i_{e,t}} =
\begin{cases}
1, & \sum_{n=1}^{M_G} R_{n,t}^{MT} + P_{Ress,t} \ge E(\sigma_t^{EL}) - (\sigma_{\min}^{EL,t} + i_{e,t}\, q) \\
0, & \text{otherwise}
\end{cases}
\quad \forall t,\ i_{e,t} = 0, 1, \dots, N_{e,t} \tag{31}
$$

As Tab. I shows, the probability of EL forecasting error $(\sigma_{\min}^{EL,t} + i_{e,t}\, q)$ is $e(i_{e,t})$, so (30) can be transformed into

$$
\sum_{i_{e,t}=0}^{N_{e,t}} W_{i_{e,t}}\, e(i_{e,t}) \ge \alpha \tag{32}
$$

The expression of $W_{i_{e,t}}$ in (32) is not compatible with the solution form of the mixed integer programming. In order to solve the problem, (31) must be replaced by

$$
\begin{aligned}
&\Big(\sum_{n=1}^{M_G} R_{n,t}^{MT} + P_{Ress,t} + \sigma_{\min}^{EL,t} + i_{e,t}\, q - E(\sigma_t^{EL})\Big)/\Phi \le W_{i_{e,t}} \\
&\le 1 + \Big(\sum_{n=1}^{M_G} R_{n,t}^{MT} + P_{Ress,t} + \sigma_{\min}^{EL,t} + i_{e,t}\, q - E(\sigma_t^{EL})\Big)/\Phi
\quad \forall t,\ i_{e,t} = 0, 1, \dots, N_{e,t}
\end{aligned} \tag{33}
$$
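As a minimal sketch of how (31)-(33) fit together, the Python snippet below evaluates the 0-1 indicator, the discretized chance constraint, and the big-M style bounds for a single period. The function names, the six-point probability table, the reserve value, and the constant used for Φ are illustrative assumptions, not values from the paper. E(σ_t^EL) is passed in directly (here chosen to equal the mean of the assumed table) rather than derived by the method of Sec. III.

```python
from typing import List, Sequence


def indicator_w(reserve: float, exp_err: float, sigma_min: float,
                q: float, i: int) -> int:
    """Eq. (31): 1 if the reserve covers E(sigma_EL) - (sigma_min + i*q), else 0."""
    return 1 if reserve >= exp_err - (sigma_min + i * q) else 0


def chance_constraint_holds(reserve: float, exp_err: float, sigma_min: float,
                            q: float, probs: Sequence[float], alpha: float) -> bool:
    """Eq. (32): probability mass of the covered error levels must reach alpha."""
    covered = sum(indicator_w(reserve, exp_err, sigma_min, q, i) * p
                  for i, p in enumerate(probs))
    return covered >= alpha


def big_m_feasible_w(reserve: float, exp_err: float, sigma_min: float,
                     q: float, i: int, phi: float) -> List[int]:
    """Eq. (33): binary values w that satisfy x/phi <= w <= 1 + x/phi."""
    x = reserve + sigma_min + i * q - exp_err
    return [w for w in (0, 1) if x / phi <= w <= 1 + x / phi]


# Hypothetical discretized error table: level i is sigma_min + i*q (kW),
# with probability PROBS[i]. All numbers below are made up for illustration.
SIGMA_MIN, Q = -20.0, 10.0
PROBS = [0.05, 0.15, 0.30, 0.30, 0.15, 0.05]
EXP_ERR = 5.0    # assumed E(sigma_EL), equal to the mean of the table above
RESERVE = 12.0   # assumed total reserve: sum of MT reserves plus ESS reserve
PHI = 1.0e4      # assumed large constant, must exceed |x| for every level

if __name__ == "__main__":
    for i in range(len(PROBS)):
        w = indicator_w(RESERVE, EXP_ERR, SIGMA_MIN, Q, i)
        allowed = big_m_feasible_w(RESERVE, EXP_ERR, SIGMA_MIN, Q, i, PHI)
        print(i, w, allowed)
    print(chance_constraint_holds(RESERVE, EXP_ERR, SIGMA_MIN, Q, PROBS, 0.75))
```

For every level in this example, the linear bounds of (33) leave exactly one admissible binary value, and it coincides with the indicator of (31). The only exception in general is the boundary case where the bracketed term is exactly zero: there (33) admits both 0 and 1, while (31) assigns 1. The equivalence also relies on Φ being larger than the magnitude of the bracketed term.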
In (33), Φ is a very large positive number. By substituting (30) with (32) and (33), the model in Sec. IV-A will be completely transformed into the form of the mixed integer programming. The overall flowchart of the proposed scheduling model is shown in Fig. 2.

V. CASE STUDY

To examine the effectiveness of the proposed approach, a real-world microgrid in North China is used for numerical simulation analysis. In this paper, Python is used as the programming language and all simulations are performed on a PC platform with 2 Intel Core dual-core CPUs (2.6 GHz) and 6 GB RAM.

A. Introduction of Test System

As shown in Fig. 3, the system is mainly composed of 3 MT units, a WT unit, a PV panel, and a battery pack.

Fig. 3: The microgrid test system.

To verify the rationality of the proposed forecasting method, 5 years of WT and PV outputs and load data from the microgrid, from January 1, 2015 to December 31, 2019, are adopted. We then use the method described in Sec. II-C to divide the datasets into 24 new time series, and regard these as the original datasets for prediction. Finally, the original datasets are divided at a ratio of 0.8:0.2 for model training and testing.

We use this system to verify the feasibility of the scheduling model based on PER-AutoRL forecasting. Tab. II shows the specific parameters of the MTs in this system. The parameters of lead-acid batteries are as follows: $P_{\max}^{CH} = P_{\max}^{DC} = 40$ kW, $S_{\min} = 32$ kW·h, $S_{\max} = 160$ kW·h, $\eta_{ch} = \eta_{dc} = 0.9$.

B. Multi-period Renewable Power Outputs and Load Forecasting

For the prediction problem in this work, the proposed PER-AutoRL will automatically determine the model architecture and hyperparameters based on the input multi-sub time series of WT, PV outputs, and load in a customized manner. Fig. 4 shows the optimization results of the hyperparameters required by the PER-AutoRL to model the load data. Each curve in the figure represents a set of hyperparameters, each ordinate axis represents the values of various hyperparameters, and the last ordinate is the negative MAPEs of using these hyperparameter sets; the darker the color, the more appropriate the hyperparameters. It can be seen from the figure that our designed PER-AutoRL manages to achieve satisfactory optimization results in most cases, and only the MAPEs resulting from three sets of hyperparameters are higher than 15%. Furthermore, only a few optimization iterations are needed for the PER-AutoRL to find suitable hyperparameters. Therefore, these results prove that our PER-AutoRL is able to automatically determine the