This document discusses an optimal scheduling model for isolated microgrids that uses automated reinforcement learning-based multi-period forecasting of renewable power generation and loads. It proposes a prioritized experience replay automated reinforcement learning method to simplify deep reinforcement learning-based forecasting and address error accumulation in multi-step forecasting. Prediction values are revised using the error distribution to improve accuracy. A scheduling model considers demand response and uses the revised forecasts to minimize costs under a spinning reserve chance constraint. The model is transformed into a mixed integer linear program and solved using CPLEX, reducing operating costs compared with traditional scheduling without forecasting.


IEEE TRANSACTIONS ON SUSTAINABLE ENERGY 1

Optimal Scheduling of Isolated Microgrids Using Automated Reinforcement Learning-based Multi-period Forecasting
Yang Li, Senior Member, IEEE, Ruinong Wang, Zhen Yang

arXiv:2108.06764v1 [eess.SP] 15 Aug 2021

This work is partly supported by the Natural Science Foundation of Jilin Province, China under Grant No. 2020122349JC. (Corresponding author: Yang Li.)
Y. Li and R. Wang are with the School of Electrical Engineering, Northeast Electric Power University, Jilin 132012, China (e-mail: [email protected]; [email protected]).
Z. Yang is with the State Grid Beijing Electric Power Company, Xicheng District, Beijing 100032, China (e-mail: [email protected]).

Abstract—To reduce the negative impact of the uncertainty of load and renewable energy outputs on microgrid operation, an optimal scheduling model is proposed for isolated microgrids by using automated reinforcement learning-based multi-period forecasting of renewable power generation and loads. Firstly, a prioritized experience replay automated reinforcement learning (PER-AutoRL) method is designed to simplify the deployment of deep reinforcement learning (DRL)-based forecasting models in a customized manner; a single-step multi-period forecasting method based on PER-AutoRL is proposed for the first time to address the error accumulation issue suffered by existing multi-step forecasting methods; and the prediction values obtained by the proposed forecasting method are then revised via the error distribution to improve the prediction accuracy. Secondly, a scheduling model considering demand response is constructed to minimize the total microgrid operating costs, where the revised forecasting values are used as the dispatch basis and a spinning reserve chance constraint is set according to the error distribution. Finally, the original scheduling model is transformed into a readily solvable mixed integer linear program via the sequence operation theory (SOT), and the transformed model is solved with the CPLEX solver. The simulation results show that, compared with the traditional scheduling model without forecasting, this approach significantly reduces the system operating costs by improving the prediction accuracy.

Index Terms—Microgrid, optimal scheduling, automated reinforcement learning, uncertainty handling, single-step multi-period forecasting, sequence operation theory.

I. INTRODUCTION

WITH the gradual depletion of fossil fuels, the deteriorating ecological environment, and the many shortcomings exposed by traditional centralized power supply, the proportion of renewable power generation in power systems is increasing due to its sustainability and environmental friendliness [1]. As an effective carrier of distributed generations (DGs), the microgrid has been widely used in power systems in recent years. In-depth research and accelerated construction of microgrids can promote the large-scale integration of DGs and renewable energy sources, which contributes to reshaping today's power system toward a sustainable and clean energy system. However, the uncertainty of renewable energy generation and load poses a strong obstacle to the day-ahead dispatching plans of a microgrid [2]. An accurate forecasting result can provide a reliable basis for dispatching plans to arrange the start and stop of microturbines (MTs) and set the spinning reserve capacity. A scheduling model combined with advanced forecasting methods can reduce the impact of the uncertainty of renewable generation and load on microgrids and improve the economy of microgrid operation [3].

A. Literature Review

Forecasting methods of renewable energy generation and load have been extensively investigated. Current forecasting methods are mainly divided into two categories. 1) One is the traditional time series analysis method. In the 1970s, George Box and Gwilym Jenkins proposed the "Box-Jenkins method"; Ref. [4] used this method for the first time for short-term load forecasting, and subsequently more modern mathematical theories were applied to power system forecasting. In [5], gray theory was used to predict electricity prices and renewable energy power generation, and [6] used probability forecasting methods to predict the load. However, traditional time series forecasting methods rely heavily on the choice of model parameters, which largely determines the accuracy of the predicted results, and they generalize poorly. 2) The other is machine learning (ML) algorithms such as support vector machines and artificial neural networks. Ref. [7] used support vector regression to finely predict the load of the distribution network, and an artificial neural network (ANN) was used to forecast photovoltaic (PV) power output in [8]. With the development of computer technology and the advent of the era of big data, deep learning has become a hot spot in forecasting methods. In [9], a long short-term memory (LSTM) network was used to predict load, and a more sophisticated neural network was proposed in [10], which combined a convolutional neural network (CNN) with a gated recurrent unit (GRU) to predict renewable energy outputs, electricity price, and load. However, traditional supervised learning needs high-quality training datasets to achieve good forecasting performance, which limits the application of such methods in real-world systems to a certain extent. Reinforcement learning, which has received widespread attention, has also been applied to building energy consumption forecasting [11]. Compared with traditional time series analysis methods, forecasting methods based on machine learning can provide more accurate prediction results,
but such methods require a lot of domain knowledge and human intervention when constructing a forecasting model, which seriously affects the efficiency of model construction and the model's versatility. Accurate multi-period forecasting results are necessary for microgrid day-ahead scheduling. Unfortunately, the methods used in the aforementioned literature can only guarantee a certain accuracy for multi-period prediction, which poses a significant barrier to developing a high-performance forecasting model in practical applications. In this context, our motivation is to leverage the reinforcement learning framework for renewable energy and load forecasting, which, compared with supervised learning, is more robust and can learn more potential connections from the same amount of information, thereby improving the performance of a forecasting model [12].

For isolated microgrids, how to deal with the uncertainty of renewable energy output and load is a crucial problem due to their relatively small capacity and the unavailability of power support from the main grid. An economic operation model of microgrids was proposed early on in [13]; while including multiple forms of DGs, the model considered the constraints of microgrid cogeneration, reserve capacity, etc. Ref. [14] adopted robust programming and expressed the variation range of wind turbine (WT) output and load through an uncertainty set. To improve the operating economy of the microgrid, Ref. [15] used multivariate global sensitivity analysis to identify and retain the key uncertain factors that affect the operation of the system. In [16], the authors fitted the best probability distribution based on historical data of WT output and load, and then used chance constraint programming for microgrid dispatch. This method balanced the economy and reliability of microgrid operation, but it used WT output and load data in their expectation form as the input for scheduling, which could not provide a reliable basis for the dispatch model. In [17], the authors used LSTM to predict PV output and load, and then used particle swarm optimization to optimize the scheduling model. However, in this method, the forecasting model and the scheduling model are two separate parts.

The above literature shows that existing research has discussed the optimal scheduling of microgrids in depth; however, the following gaps still need to be addressed:
1) The architecture and hyperparameters of a machine learning model have a significant impact on model performance, and building a learning model requires a lot of domain knowledge and human intervention.
2) The prediction accuracies of most existing forecasting models are not ideal for multi-period prediction, so they cannot provide reliable data support for microgrid day-ahead scheduling.
3) The existing forecasting approaches rarely consider uncertainty modeling of forecasting errors and the impact of the errors on forecasting results, which inevitably deteriorates the forecasting performance.
4) Most current microgrid scheduling models adopt the load and renewable energy generation in the form of fixed values or rough estimations, and do not combine a sophisticated forecasting model to establish an organic dynamic integration.

B. Contribution of This Paper

The main contributions of this paper are fourfold:
1) A prioritized experience replay automated reinforcement learning (PER-AutoRL) method is designed to predict renewable energy outputs and load in a customized manner, which can automatically determine the most appropriate model architecture and optimize hyperparameters based on the input data. This design improves modeling efficiency, strengthens the combination of forecasting and scheduling, and provides reliable data support for scheduling.
2) We propose a multi-period single-step forecasting method based on PER-AutoRL, which can significantly improve the forecasting accuracy by solving the error accumulation issue suffered by traditional multi-step forecasting methods.
3) By modeling the uncertainty of forecasting errors, we consider the impact of the errors on the prediction accuracy, and revise the predicted values according to the error distribution described by the t location-scale (TLS) distribution to further improve the forecasting performance.
4) We construct a microgrid scheduling model that integrates the designed forecasting method with consideration of demand response, which can significantly reduce the microgrid operating costs.

II. FORECASTING MODEL BASED ON PRIORITIZED EXPERIENCE REPLAY AUTOMATED REINFORCEMENT LEARNING

A. Deep Deterministic Policy Gradient

Deep reinforcement learning (DRL) is a combination of deep learning and reinforcement learning, which provides solutions for the perception and decision-making problems of sophisticated systems [18]. As a powerful DRL algorithm, deep deterministic policy gradient (DDPG) has demonstrated a strong ability in dealing with continuous action space problems.

In DDPG, deep neural networks with parameters $\theta^{\mu}$ and $\theta^{Q}$ are used to represent the main actor network, $a = \pi(s|\theta^{\mu})$, and the main critic network, $Q(s, a|\theta^{Q})$. The main function of the Actor part is to interact with the environment, that is, to directly select the output action $a$ according to the state $s$, and to obtain the next state $s'$ and the reward $r$ after interacting with the environment. The objective function is the total reward with a discount factor $\gamma$, which is formulated as

$$J(\theta^{\mu}) = \mathbb{E}_{\theta^{\mu}}\left[r_1 + \gamma r_2 + \gamma^{2} r_3 + \dots + \gamma^{i-1} r_i\right] \tag{1}$$

To improve the total reward $J$, the objective function is optimized through the stochastic gradient. Silver et al. proved that the gradient of the objective function with respect to $\theta^{\mu}$ is equivalent to the expected gradient of the Q-value function with respect to $\theta^{\mu}$ [19]. When updating the main actor network, the gradient can be approximated as
$$\nabla_{\theta^{\mu}} J \approx \frac{1}{N}\sum_{i}\left[\nabla_{a} Q(s, a|\theta^{Q})\big|_{s=s_i,\,a=\pi(s_i)}\; \nabla_{\theta^{\mu}} \pi(s|\theta^{\mu})\big|_{s=s_i}\right] \tag{2}$$

The main function of the Critic part is to evaluate the strategy proposed by the Actor. The main critic network is updated with the goal of minimizing the following loss function:

$$L(\theta^{Q}) = \mathbb{E}_{s,a,r,s' \sim D}\left[(\text{TD-error})^{2}\right] \tag{3}$$

$$\text{TD-error} = \left[r + \gamma Q'\left(s', \pi(s'|\theta^{\mu'})\,\big|\,\theta^{Q'}\right)\right] - Q(s, a|\theta^{Q}) \tag{4}$$

where $Q'$ is the target Q-value function; $\theta^{\mu'}$ and $\theta^{Q'}$ denote the parameters of the target actor network and the target critic network, respectively.

The target networks of the Actor and Critic parts are constructed mainly to make the network training more stable. Their parameters are updated with a soft update factor $\varsigma$ at fixed intervals:

$$\begin{aligned}\theta^{Q'} &\leftarrow \varsigma\theta^{Q} + (1-\varsigma)\theta^{Q'}\\ \theta^{\mu'} &\leftarrow \varsigma\theta^{\mu} + (1-\varsigma)\theta^{\mu'}\end{aligned} \tag{5}$$

B. PER-AutoRL

Automated machine learning (AutoML) can be regarded as a system with powerful learning and generalization capabilities on given data and tasks. For traditional machine learning, obtaining a model with good performance often requires a lot of expert experience and manual debugging. AutoML is designed to reduce the demand for data scientists and to enable domain experts to build ML applications automatically without requiring much statistical and ML knowledge. It can therefore improve the efficiency of model design and reduce the difficulty of applying ML.

To simplify the deployment of DRL models and improve the efficiency, we propose PER-AutoRL by extending AutoML to DRL. We use the Metis method, a type of sequence-based Bayesian optimization algorithm, to determine the architecture of the DRL model and optimize the hyperparameters [20]. With the DDPG algorithm as the overall framework, the Actor part makes predictions based on the input data, and the Critic part evaluates the prediction results and obtains the optimal prediction policy. This combines the perception ability of deep learning with the decision-making ability of reinforcement learning to solve the forecasting problem. We set the actor network and the critic network to fixed-depth feed-forward fully-connected neural networks, determine the network architecture according to the size of each layer and the activation function, and then optimize the hyperparameters in the model.

For DRL, an appropriate reward function can improve the performance of the model. In this respect, we set up a reward function pool to store reward functions that may be applicable and treat the reward function as a hyperparameter, which is optimized in synchronization with the other hyperparameters. For the prediction problem studied in this paper, the absolute error $\sigma$, mean absolute error (MAE), mean square error (MSE), mean absolute percentage error (MAPE), root mean square error (RMSE) and the coefficient of determination $R^{2}$ are slightly modified as options of the reward function. Their expressions are

$$\begin{cases}
-|\sigma| = -\left|y_i^{ac} - p_i\right| \\
-MAE = -\left[\sum_{i=1}^{N}\left|y_i^{ac} - p_i\right|\right]\Big/ N \\
-MSE = -\left[\sum_{i=1}^{N}\left(y_i^{ac} - p_i\right)^{2}\right]\Big/ N \\
-MAPE = -\left[\sum_{i=1}^{N}\left|\left(y_i^{ac} - p_i\right)/y_i^{ac}\right|\right]\Big/ N \\
-RMSE = -\sqrt{\left[\sum_{i=1}^{N}\left(y_i^{ac} - p_i\right)^{2}\right]\Big/ N} \\
R^{2} = 1 - \left\{\left[\sum_{i=1}^{N}\left(y_i^{ac} - p_i\right)^{2}\right]\Big/\left[\sum_{i=1}^{N}\left(y_i^{ac} - \bar{y}^{ac}\right)^{2}\right]\right\}
\end{cases} \tag{6}$$

where $N$ is the number of samples, $y^{ac}$ is the actual value, $\bar{y}^{ac}$ is the mean value of $y^{ac}$, and $p$ is the prediction result.

Algorithm 1 AutoRL with Prioritized Experience Replay
Initialize: Neural network architecture $\theta^{\mu}$, $\theta^{Q}$.
Initialize: Replay buffer $R$ with size $S$, reward function.
Initialize: Maximum priority, parameters $\iota$, $\beta$.
1: for trial = 1, ..., M do
2:   Select new neural network architecture $\theta^{\mu}$, $\theta^{Q}$ according to the Metis Tuner.
3:   Select reward function from reward function pool.
4:   Select an initial state $s_t$ from state space.
5:   for t = 1, ..., H do
6:     Select action $a_t$ according to the new policy.
7:     Obtain reward $r_t$ and new state $s_{t+1}$.
8:     Store experience $(s_t, a_t, r_t, s_{t+1})$ in replay buffer $R$ and set $D_t = \max_{i<t} D_i$.
9:     if t > S then
10:      for j = 1, ..., N do
11:        Sample experience $j$ with probability $P(j)$.
12:        Compute importance-sampling weight $W_j$ and TD-error $\delta_j$.
13:        Update the priority of transition $j$ according to the absolute TD-error $|\delta_j|$.
14:      end for
15:      Update main critic network by minimizing the loss function $L = \frac{1}{N}\sum_i W_i \delta_i^{2}$.
16:      Update main actor network according to (2).
17:      Update target networks according to (5).
18:    end if
19:  end for
20:  Collect the testing MAPE and upload it to the Metis Tuner.
21: end for
22: Select the best neural network architecture $\theta^{\mu}$, $\theta^{Q}$ and the best policy $\pi$ according to the minimal MAPE.
23: return $\theta^{\mu}$, $\theta^{Q}$, $\pi$

Compared with ML, DRL has more hyperparameters that need to be optimized, which affects the efficiency of AutoRL. This work adopts Prioritized Experience Replay (PER) to improve the experience replay mechanism in DDPG, which greatly reduces the training time [21]. The core of PER is to increase the replay frequency of valuable experiences. Repeated replay of extreme experiences helps the agent learn more quickly how to choose the right action to obtain a high reward, or how to avoid the terrible results of choosing the wrong action. DDPG updates the critic network according to TD-errors: the larger the absolute value of a TD-error, the more aggressive the correction of the critic network, which means that the experience has a higher value. Therefore, this paper selects the absolute value of TD-errors as the experience evaluation index to rank the experiences and defines the probability $P(j)$ of sampling experience $j$ according to this rank:

$$P(j) = \frac{D_j^{\iota}}{\sum_{k=1}^{N} D_k^{\iota}} \tag{7}$$

where $D_j = \frac{1}{rank(j)}$, $rank(j)$ is the rank of experience $j$, $N$ is the total number of experiences stored in the replay buffer, and $\iota$ is a parameter controlling the prioritization.
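A minimal sketch of the rank-based probability in (7) is given here for illustration. It assumes that rank 1 is assigned to the experience with the largest absolute TD-error and that ties keep their storage order; both are assumptions, and the function name `rank_based_probabilities` is hypothetical rather than part of the authors' implementation.

```python
def rank_based_probabilities(td_errors, iota):
    """Return P(j) from (7) with D_j = 1 / rank(j).

    Assumption: rank 1 goes to the largest |TD-error|; ties keep storage order.
    """
    # Indices sorted by decreasing absolute TD-error (sorted() is stable).
    order = sorted(range(len(td_errors)), key=lambda j: -abs(td_errors[j]))
    weights = [0.0] * len(td_errors)
    for rank, j in enumerate(order, start=1):
        weights[j] = (1.0 / rank) ** iota  # D_j ** iota
    total = sum(weights)
    return [w / total for w in weights]


# Three stored experiences; the second one has the largest |TD-error|.
probs = rank_based_probabilities([0.1, -2.0, 0.5], iota=1.0)
```

With $\iota = 1$ the three experiences receive probabilities 2/11, 6/11 and 3/11, while $\iota = 0$ reduces the scheme to uniform sampling.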
To avoid frequent sampling of experiences with high TD-errors, which may cause the neural network to oscillate or diverge, we set the importance-sampling weight for the critic network updates:

$$W_j = \frac{1}{S^{\beta} \cdot P(j)^{\beta}} \tag{8}$$

where $S$ is the size of the replay buffer, and the parameter $\beta$ controls to what extent the correction is used.

The integrated algorithm of PER-AutoRL is shown in Algorithm 1.

C. Multi-period Forecasting Based on PER-AutoRL

Multi-period forecasting methods can be divided into two categories: multi-step forecasting and single-step forecasting. The former uses the prediction result of each step to make the next prediction, while the latter predicts the next-period value from the real value in the previous period; after each prediction step, the real value in the previous period needs to be updated for the next-period prediction. Single-step forecasting cannot provide complete prediction results for day-ahead scheduling, because the real values of each period cannot be updated in time. It should also be noted that error accumulation is inevitable in multi-step forecasting, which decreases the forecasting accuracy.

This paper proposes a multi-period forecasting method based on PER-AutoRL, which transforms the multi-step forecasting problem into a multi-period single-step forecasting problem. The proposed method addresses the error accumulation issue and improves the prediction accuracy. The specific procedure is as follows:
1) Dataset preprocessing. Extract the data in the same period of a day from the dataset and reconstruct it into multiple sets of new time series (this work uses 1h as the time interval to reconstruct the dataset into 24 new time series).
2) Constructing PER-AutoRL model. The architecture and hyperparameters of the PER-AutoRL model for predicting each time series are automatically determined according to the separate new time series. For the prediction problem in this study, we select the data of the same period in 7 consecutive days to construct a matrix as the state $s$, and then the agent outputs a predicted value as the action $a$ based on its observed state. The loss function of the main actor network aims to minimize the forecasting errors, while the loss function of the main critic network is built to minimize the gap between the predicted value and the true value of the reward.
3) Training PER-AutoRL model. The weight matrices and biases of the main actor network and the main critic network are determined once the predetermined number of training episodes is completed.
4) Day-ahead forecasting. Forecast the values in each period separately, then integrate the forecasting results of each group to form the forecasting data for the next 24 hours.
5) Revising forecasting results. After obtaining the error distribution from the forecasting errors of PER-AutoRL, we calculate the expected values of the errors and revise the forecasting values based on them. The revised forecasting results are used as the input of the scheduling model, and the expected error values of PER-AutoRL are prepared for determining the spinning reserve in the scheduling.

By constructing a multi-period single-step prediction model based on PER-AutoRL, the error accumulation issue is effectively addressed. In addition, benefiting from the training methods and exploration capabilities of DRL, the robustness and practicability of the prediction model are improved [12].

D. Uncertainty Modeling of Forecasting Errors and Revised Forecasting Results

1) Stochastic Model of Forecasting Errors: In this paper, the TLS distribution is used to describe the probability distribution of the PER-AutoRL prediction errors [22]:

$$f(x; \mu, \varepsilon, \vartheta) = \frac{\Gamma\left(\frac{\vartheta+1}{2}\right)}{\varepsilon\sqrt{\vartheta\pi}\,\Gamma\left(\frac{\vartheta}{2}\right)}\left[\frac{\vartheta + \left(\frac{x-\mu}{\varepsilon}\right)^{2}}{\vartheta}\right]^{-\frac{\vartheta+1}{2}} \tag{9}$$

where $\Gamma$ is the gamma function, $\mu$ is the mean value, $\varepsilon$ is the scale parameter, and $\vartheta$ is the shape coefficient.

Fig. 1 is a fitting diagram of the probability distribution of WT output forecasting errors for a total of one year from January 1, 2018 to December 31, 2019. It can be seen from Fig. 1 that, compared with the normal distribution and the logistic distribution, the TLS distribution better describes the forecasting error probability distribution.

Fig. 1: Probability distribution of forecasting errors.

2) Revised Forecasting Results: To obtain a more accurate forecasting result, we acquire the expectation of the error from the distribution of the prediction errors, and then modify the forecasting results. The calculation process is:

$$\begin{cases}
C(P_{L,t}^{Pred}) = P_{L,t}^{Pred} - E(\sigma_t^{L}) \\
C(P_{WT,t}^{Pred}) = P_{WT,t}^{Pred} - E(\sigma_t^{WT}) \\
C(P_{PV,t}^{Pred}) = P_{PV,t}^{Pred} - E(\sigma_t^{PV})
\end{cases} \tag{10}$$
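The revision in (10) can be illustrated with a short sketch. It assumes the sign convention error = forecast − actual, and, as a stand-in for the expectation taken over the fitted TLS distribution, it uses the sample mean of historical errors for one period. The function names and the load values are hypothetical and serve only as an example.

```python
from statistics import mean


def expected_error(past_forecasts, past_actuals):
    """Estimate E(sigma_t) for one period as the sample mean of (forecast - actual)."""
    return mean(f - a for f, a in zip(past_forecasts, past_actuals))


def revise(forecast, exp_error):
    """Apply (10): C(P) = P - E(sigma)."""
    return forecast - exp_error


# Hypothetical load data (kW) for the same hour on four past days.
past_forecasts = [102, 98, 105, 99]
past_actuals = [100, 97, 102, 98]

bias = expected_error(past_forecasts, past_actuals)  # mean over-forecast in kW
revised = revise(110.0, bias)                        # revised day-ahead value
```

Here the historical forecasts are on average 1.75 kW too high, so a new forecast of 110 kW is revised to 108.25 kW. If a TLS distribution with $\vartheta > 1$ is fitted instead, its expectation equals the location parameter $\mu$, which could replace the sample mean.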
In (10), $C(P_{L,t}^{Pred})$, $C(P_{WT,t}^{Pred})$ and $C(P_{PV,t}^{Pred})$ are the revised forecasting results of load, WT and PV outputs, respectively; $P_{L,t}^{Pred}$, $P_{WT,t}^{Pred}$ and $P_{PV,t}^{Pred}$ are the forecasting results of load, WT and PV outputs; $\sigma_t^{L}$, $\sigma_t^{WT}$ and $\sigma_t^{PV}$ are the forecasting errors of load, WT and PV outputs; and $E(\sigma_t^{L})$, $E(\sigma_t^{WT})$ and $E(\sigma_t^{PV})$ are the expectations of the forecasting errors of load, WT and PV outputs, respectively.

3) Evaluation Indices: This paper evaluates the forecasting methods based on the common prediction evaluation indicators MAPE and RMSE [23].

III. SERIALIZATION MODELLING OF RANDOM VARIABLES

A. Sequence Operation Theory

Sequence operation theory (SOT) is a powerful mathematical tool for dealing with multiple uncertain variables. The core idea is to acquire a probabilistic sequence of random variables through discretization, and then generate a new sequence through sequence operations [24].

Suppose the discrete sequence $a(i_a)$ of length $N_a$ satisfies the following conditions:

$$\sum_{i_a=0}^{N_a} a(i_a) = 1,\quad a(i_a) \ge 0,\quad i_a = 0, 1, 2, \dots, N_a \tag{11}$$

Then the sequence is regarded as a probabilistic sequence.

B. Sequence Description of Equivalent Load Forecasting Errors

During period $t$, the forecasting errors $\sigma^{WT}$, $\sigma^{PV}$ and $\sigma^{L}$ of WT outputs, PV outputs, and load are all random variables. In this study, it is assumed that their uncertainties do not affect each other. After discretizing the continuous probability distribution of the WT output forecasting errors with the step length $q$, the probabilistic sequence $a(i_{a,t})$ can be obtained by

$$a(i_{a,t}) = \begin{cases}
\int_{\sigma_{min}^{WT,t}}^{\sigma_{min}^{WT,t}+q/2} f_o(\sigma^{WT})\,d\sigma^{WT}, & i_{a,t} = 0 \\
\int_{\sigma_{min}^{WT,t}+i_{a,t}q-q/2}^{\sigma_{min}^{WT,t}+i_{a,t}q+q/2} f_o(\sigma^{WT})\,d\sigma^{WT}, & i_{a,t} > 0,\ i_{a,t} \ne N_{a,t} \\
\int_{\sigma_{min}^{WT,t}+i_{a,t}q-q/2}^{\sigma_{min}^{WT,t}+i_{a,t}q} f_o(\sigma^{WT})\,d\sigma^{WT}, & i_{a,t} = N_{a,t}
\end{cases}
\qquad N_{a,t} = \left[\frac{\sigma_{max}^{WT,t} - \sigma_{min}^{WT,t}}{q}\right] \tag{12}$$

where $f_o$ is the probability density function, $N_{a,t}$ is the length of the sequence, and $\sigma_{min}^{WT,t}$ and $\sigma_{max}^{WT,t}$ are the minimum and maximum values of the WT output forecasting errors.

In the same way, the probabilistic sequences $b(i_{b,t})$ and $d(i_{d,t})$, with lengths $N_{b,t}$ and $N_{d,t}$, of the PV output forecasting errors and the load forecasting errors can be obtained.

We define the equivalent load (EL) as the difference between the load power and the joint outputs of WT and PV. The probabilistic sequence $c(i_{c,t})$ of the joint output forecasting errors with length $N_{c,t}$ can be obtained by the addition-type-convolution operation:

$$c(i_{c,t}) = a(i_{a,t}) \oplus b(i_{b,t}) = \sum_{i_{a,t}+i_{b,t}=i_{c,t}} a(i_{a,t})\,b(i_{b,t}),\quad i_{c,t} = 0, 1, \dots, N_{a,t}+N_{b,t} \tag{13}$$

Then, subtraction-type-convolution is employed to calculate the probabilistic sequence $e(i_{e,t})$ of the EL forecasting errors [25]:

$$e(i_{e,t}) = d(i_{d,t}) \ominus c(i_{c,t}) = \begin{cases}
\sum_{i_{d,t}-i_{c,t}=i_{e,t}} d(i_{d,t})\,c(i_{c,t}), & 1 \le i_{e,t} \le N_{e,t} \\
\sum_{i_{d,t} \le i_{c,t}} d(i_{d,t})\,c(i_{c,t}), & i_{e,t} = 0
\end{cases} \tag{14}$$

The correspondence between the prediction errors of the EL and their probabilistic sequence is shown in Tab. I, where $\sigma_{min}^{EL,t}$ denotes the minimum value of the EL forecasting errors and $N_{e,t}$ is the length of $e(i_{e,t})$. The expectation of $e(i_{e,t})$ can be calculated by

$$E(e) = \sum_{i_{e,t}=0}^{N_{e,t}} \left[\left(\sigma_{min}^{EL,t} + i_{e,t}\,q\right) \cdot e(i_{e,t})\right] \tag{15}$$

TABLE I: Prediction error of the EL and its probabilistic sequence
Error (kW)  | $\sigma_{min}^{EL,t}$ | $\sigma_{min}^{EL,t}+q$ | ... | $\sigma_{min}^{EL,t}+i_{e,t}q$ | ... | $\sigma_{min}^{EL,t}+N_{e,t}q$
Probability | $e(0)$ | $e(1)$ | ... | $e(i_{e,t})$ | ... | $e(N_{e,t})$

IV. OPTIMAL SCHEDULING MODEL OF MICROGRID BASED ON PER-AUTORL FORECASTING

A. Optimal Scheduling Model

1) Objective Function: The objective function $F_c$ of the optimal scheduling model considering demand response is constructed to minimize the total microgrid operating costs, which comprise the fuel costs of the MT units and the spinning reserve costs. In light of the nature of the demand-side loads, this work divides the electric load into fixed load and interruptible load. Since load interruption inevitably affects user experience, certain subsidies are generally provided to users [26], and the subsidies are also considered in the total operating costs. The objective function is as follows:

$$\min F_c = \sum_{t=1}^{T}\left[\sum_{n=1}^{M_G}\left(\varsigma_n R_{n,t}^{MT} + \tau_n S_{n,t} + \kappa P_t^{IE} + U_{n,t}\left(\psi_n + \xi_n P_{n,t}^{MT}\right)\right)\right] \tag{16}$$

where $T$ denotes the total number of time periods $t$ in a scheduling cycle ($T = 24$ in this paper); $M_G$ denotes the total number of MT units; $n$ is the MT index; $\psi_n$ and $\xi_n$ represent the consumption factors of MT $n$; $U_{n,t}$ and $S_{n,t}$ are 0-1 variables representing the state variable and the startup variable of MT $n$, respectively; $P_t^{IE}$ is the interruptible electric load, and $\kappa$ is the subsidy; $\varsigma_n$ and $\tau_n$ denote the spinning reserve costs and the startup costs of MT $n$; $P_{n,t}^{MT}$ and $R_{n,t}^{MT}$ are respectively the output power and the spinning reserve of MT $n$ in period $t$.

2) Constraint Conditions:

a) Power balance constraint: To maintain the power balance of the isolated microgrid, the controllable loads need to be deployed, so the power balance constraints are

$$\sum_{n=1}^{M_G} P_{n,t}^{MT} + P_t^{DC} - P_t^{CH} = C(P_{EL,t}^{Pred}) + P_t^{CNLOAD} - P_t^{IE} \quad \forall t \tag{17}$$
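The expectation $E(\sigma_t^{EL})$ behind the revised equivalent load $C(P_{EL,t}^{Pred})$ on the right-hand side of (17) comes from the sequence operations of Section III. A minimal sketch of (13)-(15) follows; the function names, the short probabilistic sequences, and the values of the minimum error and step length are all hypothetical and chosen only so that the arithmetic is easy to follow.

```python
def add_convolve(a, b):
    """Addition-type convolution (13): c(i) = sum over ia + ib = i of a(ia) * b(ib)."""
    c = [0.0] * (len(a) + len(b) - 1)
    for ia, pa in enumerate(a):
        for ib, pb in enumerate(b):
            c[ia + ib] += pa * pb
    return c


def sub_convolve(d, c):
    """Subtraction-type convolution (14).

    e(i) collects d(id) * c(ic) with id - ic = i for i >= 1;
    every pair with id <= ic is lumped into e(0).
    """
    e = [0.0] * len(d)
    for i_d, pd in enumerate(d):
        for i_c, pc in enumerate(c):
            e[max(i_d - i_c, 0)] += pd * pc
    return e


def expectation(e, sigma_min, q):
    """Expectation (15): sum of (sigma_min + i * q) * e(i)."""
    return sum((sigma_min + i * q) * p for i, p in enumerate(e))


a = [0.5, 0.5]        # hypothetical WT error sequence
b = [0.5, 0.5]        # hypothetical PV error sequence
d = [0.2, 0.3, 0.5]   # hypothetical load error sequence

c = add_convolve(a, b)   # joint WT + PV sequence
e = sub_convolve(d, c)   # EL error sequence
exp_el = expectation(e, sigma_min=-2.0, q=2.0)  # hypothetical sigma_min, q in kW
```

Both operations preserve the property in (11): the entries of `c` and `e` are non-negative and each sequence sums to one.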
$$P_t^{IE} \le \rho\, C(P_{EL,t}^{Pred}) \quad \forall t \tag{18}$$

$$P_{EL,t}^{Pred} = P_{L,t}^{Pred} - \left(P_{WT,t}^{Pred} + P_{PV,t}^{Pred}\right) \tag{19}$$

where $\rho$ is the ratio of $P_t^{IE}$; $P_t^{CH}$ and $P_t^{DC}$ are the charging and discharging power of the energy storage in period $t$; $P_t^{CNLOAD}$ is the controllable load; $P_{EL,t}^{Pred}$ is the predicted value of the EL; and $C(P_{EL,t}^{Pred})$ is the revised value of $P_{EL,t}^{Pred}$, which is calculated as follows:

$$C(P_{EL,t}^{Pred}) = P_{L,t}^{Pred} - E(\sigma_t^{L}) - \left[P_{WT,t}^{Pred} - E(\sigma_t^{WT})\right] - \left[P_{PV,t}^{Pred} - E(\sigma_t^{PV})\right] \tag{20}$$

b) MT output constraint: The output of each MT must comply with the following inequality:

$$U_{n,t} P_{n,min}^{MT} \le P_{n,t}^{MT} \le U_{n,t} P_{n,max}^{MT} \quad \forall t,\ n \in M_G \tag{21}$$

where $P_{n,max}^{MT}$ and $P_{n,min}^{MT}$ are the upper and lower limits of the output power of MT unit $n$.

c) Energy storage system constraints: In this paper, the energy storage system (ESS) adopts lead-acid batteries to balance the random fluctuations in the microgrid because they provide many advantages, such as low price and long service life [27].

Charge-discharge equation: The relationship between the ESS energy storage and the charge-discharge powers is expressed as

$$S_{t+1} = S_t + \left(\eta_{ch} P_t^{CH} - P_t^{DC}/\eta_{dc}\right)\Delta t \quad \forall t \tag{22}$$

where $S_{t+1}$ and $S_t$ are respectively the ESS energy storage at the beginning of periods $t+1$ and $t$, $\eta_{ch}$ and $\eta_{dc}$ are the charge and discharge efficiencies, and $\Delta t$ denotes the duration of each period, which is taken as 1h in this paper.

The output limits of lead-acid batteries:

$$\begin{cases} 0 \le P_t^{DC} \le P_{max}^{DC} & \forall t \\ 0 \le P_t^{CH} \le P_{max}^{CH} & \forall t \end{cases} \tag{23}$$

where $P_{max}^{CH}$ and $P_{max}^{DC}$ are the maximum values of lead-acid battery charge and discharge in time period $t$, respectively.

The capacity limit of lead-acid batteries is

$$S_{min} \le S_t \le S_{max} \quad \forall t \tag{24}$$

where $S_{max}$ and $S_{min}$ are the maximum and minimum allowable capacities of the batteries, respectively.

The ESS starting and ending constraints are

$$S_0 = S_{T_{end}} = S_{*} \quad \forall t \tag{25}$$

where $S_0$ is the ESS initial energy storage, $S_{*}$ is the limit of the initial energy storage in the ESS, and $T_{end}$ is the end of the scheduling cycle ($T_{end} = 24$h in this paper). For the ESS energy balance, we set the remaining capacity $S_{T_{end}}$ of the ESS after each scheduling cycle equal to $S_0$.

d) Spinning reserve constraint: The spinning reserve is an important resource for an isolated microgrid to suppress the fluctuations of the renewable DG outputs and ensure that the system operates reliably. The required spinning reserves are provided by the MT units and ESS. Therefore, the spinning reserve

where $P_{Ress,t}$ denotes the ESS reserve capacity.

It can be seen from (17) that when the power balance constraint is considered, the EL is represented by the revised prediction value $C(P_{EL,t}^{Pred})$. Due to the uncertainty of load and renewable energy outputs, the total spinning reserve provided by the ESS and MTs must be able to make up for the difference between the fluctuating EL and $C(P_{EL,t}^{Pred})$.

When the extreme situation with zero renewable energy outputs occurs, the system needs a large spinning reserve capacity, which incurs high spinning reserve costs. However, the probability of such cases is very low. To balance the reliability and economy in the economic operation of the microgrid, the spinning reserve constraint can be expressed as a chance constraint [16]:

$$\mathrm{Prob}\left\{\sum_{n=1}^{M_G} R_{n,t}^{MT} + P_{Ress,t} \ge \left(P_t^{L} - P_t^{WT} - P_t^{PV}\right) - C(P_{EL,t}^{Pred})\right\} \ge \alpha \quad \forall t \tag{28}$$

By using

$$\begin{cases}
P_t^{L} = P_{L,t}^{Pred} - \sigma_t^{L} \\
P_t^{WT} = P_{WT,t}^{Pred} - \sigma_t^{WT} \\
P_t^{PV} = P_{PV,t}^{Pred} - \sigma_t^{PV}
\end{cases} \tag{29}$$

where $P_t^{L}$ denotes the load in period $t$, $P_t^{WT}$ and $P_t^{PV}$ are the WT and PV outputs, respectively, and $C(P_{EL,t}^{Pred})$ is calculated by (20), (28) can be simplified to

$$\mathrm{Prob}\left\{\sum_{n=1}^{M_G} R_{n,t}^{MT} + P_{Ress,t} \ge E(\sigma_t^{EL}) - \left(\sigma_t^{L} - \sigma_t^{WT} - \sigma_t^{PV}\right)\right\} \ge \alpha \quad \forall t \tag{30}$$

where $\alpha$ is the preset confidence level, and $E(\sigma_t^{EL})$ is the expectation of the EL forecasting error, which can be obtained by the method in Sec. III.

B. Deterministic Transformation of Chance Constraints

To facilitate the processing of (30), we design a new type of 0-1 variable as follows:

$$W_{i_{e,t}} = \begin{cases}
1, & \sum_{n=1}^{M_G} R_{n,t}^{MT} + P_{Ress,t} \ge E(\sigma_t^{EL}) - \left(\sigma_{min}^{EL,t} + i_{e,t}\,q\right) \\
0, & \text{otherwise}
\end{cases}
\quad \forall t,\ i_{e,t} = 0, 1, \dots, N_{e,t} \tag{31}$$

As Tab. I shows, the probability of the EL forecasting error $(\sigma_{min}^{EL,t} + i_{e,t}\,q)$ is $e(i_{e,t})$, so (30) can be transformed into

$$\sum_{i_{e,t}=0}^{N_{e,t}} W_{i_{e,t}}\, e(i_{e,t}) \ge \alpha \tag{32}$$

The expression of $W_{i_{e,t}}$ in (32) is not compatible with the solution form of mixed integer programming. In order to solve the problem, (31) must be replaced by
M G
EL,t
X
MT
constraints are expressed as ( Rn,t + PRess,t + σmin + ie,t q − E(σtEL ))/Φ ≤ Wie,t
MT MT MT n=1
Pn.t + Rn,t ≤ Un,t Pn,max ∀t, n ∈ MG (26) M G (33)
EL,t
X
MT
DC ≤1+( Rn,t + PRess,t + σmin + ie,t q − E(σtEL ))/Φ
PRess,t ≤ min{ηdc (St − Smin )/∆t, Pmax − PtDC } ∀t n=1
(27) ∀t, ie,t = 0, 1, ..., Ne,t
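As an illustration of the transformation in (31)-(33), the following Python sketch evaluates the 0-1 variables for a given total reserve, sums the probabilities as in (32), and checks that the two-sided bound (33) admits only the value of $W_{i_{e,t}}$ prescribed by (31). The error levels, their probabilities, the reserve value, and the constant `phi` are hypothetical numbers chosen for the example (Tab. I is not reproduced here), and the function names are not from the paper.

```python
def error_levels(sigma_min, q, n_levels):
    """Discretized EL forecasting-error levels sigma_min + i*q for i = 0..n_levels."""
    return [sigma_min + i * q for i in range(n_levels + 1)]


def indicators(reserve, expected_error, levels):
    """W_i as in (31): 1 if reserve >= E(sigma) - level_i, otherwise 0."""
    return [1 if reserve >= expected_error - lvl else 0 for lvl in levels]


def confidence(w, probs):
    """Left-hand side of (32): sum over i of W_i * e(i)."""
    return sum(wi * pi for wi, pi in zip(w, probs))


def big_m_feasible(reserve, expected_error, level, phi):
    """Binary values of W_i that satisfy the two-sided bound (33)."""
    y = (reserve + level - expected_error) / phi
    return [w for w in (0, 1) if y <= w <= 1 + y]


# Hypothetical data (assumptions for illustration only).
levels = error_levels(sigma_min=-10.0, q=5.0, n_levels=4)   # -10, -5, 0, 5, 10 kW
probs = [0.05, 0.20, 0.50, 0.20, 0.05]                      # e(i), sums to 1
expected_error = sum(l * p for l, p in zip(levels, probs))  # E(sigma^EL)
reserve = 6.0                                               # sum of MT reserves + ESS reserve, kW
phi = 1.0e4                                                 # "very large" constant in (33)
alpha = 0.95

w = indicators(reserve, expected_error, levels)
conf = confidence(w, probs)
meets_alpha = conf >= alpha - 1e-9  # tolerance for floating-point summation

# For every level where the bound is not tight, (33) leaves exactly one
# feasible binary value, and it matches the definition in (31).
consistent = all(
    big_m_feasible(reserve, expected_error, lvl, phi) == [wi]
    for lvl, wi in zip(levels, w)
)
```

When the bracketed term in (33) is exactly zero, both 0 and 1 satisfy the bound, so the sketch uses a reserve value that avoids that tie.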
IEEE TRANSACTIONS ON SUSTAINABLE ENERGY 7

In (33), Φ is a very large positive number. By substituting (30) with (32) and (33), the model in Sec. IV-A will be completely transformed into the form of the mixed integer programming. The overall flowchart of the proposed scheduling model is shown in Fig. 2.

Fig. 2: Flowchart of the proposed scheduling model.

V. CASE STUDY

To examine the effectiveness of the proposed approach, a real-world microgrid in North China is used for numerical simulation analysis. In this paper, Python is used as the programming language and all simulations are performed on a PC platform with 2 Intel Core dual-core CPUs (2.6 GHz) and 6 GB RAM.

A. Introduction of Test System

As shown in Fig. 3, the system is mainly composed of 3 MT units, a WT unit, a PV panel, and a battery pack.

Fig. 3: The microgrid test system.

To verify the proposed forecasting method, five years of WT output, PV output, and load data from the microgrid, from January 1, 2015 to December 31, 2019, are adopted. The method described in Sec. II-C is then used to divide the datasets into 24 new time series, which are regarded as the original datasets for prediction. Finally, the original datasets are divided at a ratio of 0.8:0.2 for model training and testing.

We use this system to verify the feasibility of the scheduling model based on PER-AutoRL forecasting. Tab. II shows the specific parameters of the MTs in this system.

TABLE II: Main parameters of MT units

ψ ($)   ξ ($/kW)   Pmin (kW)   Pmax (kW)   τ ($)   ζ ($)   N
1.2     0.35       5           30          1.6     0.04    2
1.0     0.26       10          65          3.5     0.04    1

The parameters of lead-acid batteries are as follows: $P_{\max}^{CH} = P_{\max}^{DC} = 40$ kW, $S_{\min} = 32$ kW·h, $S_{\max} = 160$ kW·h, $\eta_{ch} = \eta_{dc} = 0.9$.

B. Multi-period Renewable Power Outputs and Load Forecasting

Fig. 4: Hyperparameters optimization.

Fig. 5: Forecasting errors with different methods.

For the prediction problem in this work, the proposed PER-AutoRL will automatically determine the model architecture and hyperparameters based on the input multi-sub time series of WT, PV outputs, and load in a customized manner. Fig. 4 shows the optimization results of the hyperparameters required by the PER-AutoRL to model the load data. Each curve in the figure represents a set of hyperparameters, each ordinate axis represents the values of various hyperparameters, and the last ordinate is the negative MAPE obtained with each hyperparameter set; the darker the color, the more appropriate the hyperparameters. It can be seen from the figure that our designed PER-AutoRL manages to achieve satisfactory optimization results in most cases, and only the MAPEs resulting from three sets of hyperparameters are higher than 15%. Furthermore, only a few optimization iterations are needed for the PER-AutoRL to find suitable hyperparameters. Therefore, these results indicate that our PER-AutoRL is able to automatically determine the

most suitable model architecture and hyperparameters in a customized and efficient manner.

Fig. 5 shows the load forecasting errors of the microgrid for a total of 72 hours in 3 days. With the multi-step forecasting method, the forecasting results are relatively accurate only in the first few steps; as time goes by, error accumulation gradually occurs, which leads to poor forecasting performance. In contrast, when using our forecasting method, the prediction error is stably maintained in a much smaller interval, which confirms that our method is able to handle the error accumulation issue in traditional multi-step forecasting.

As illustrated in Fig. 6, the scatter diagram shows the load forecasting results of the proposed method and the multi-step forecasting method with different models. The red diagonal line in the figure is the ideal fitting line. The closer the scatter points are to this diagonal line, the more accurate the prediction results. It can be seen from the figure that, whichever prediction model is used, the accuracy of the proposed prediction method is significantly higher than that of multi-step prediction. Within the proposed forecasting method, the prediction performance of the PER-AutoRL is significantly better than that of traditional multiple linear regression (MLR), autoregressive integrated moving average (ARIMA), LSTM, and recurrent deterministic policy gradient (RDPG).

Fig. 6: Forecasting results with different forecasting models.

Fig. 7 shows the one-day load forecasting results of the microgrid using the proposed forecasting method. It can be seen from Fig. 7 that the load in this area has a certain regularity. The load gradually increases from 5:00 to 10:00, decreases during the lunch break from 11:00 to 14:00, and then increases from 15:00 to 20:00. The load gradually decreases again from 21:00 to 5:00. It can be seen that the forecasting results reflect this trend appropriately.

Fig. 7: Forecasting results of loads.

Fig. 8 shows the forecasting results of WT and PV outputs. The figure reveals that, with the proposed method, the prediction results for both WT and PV outputs fit the real power generation curves well.

Fig. 8: Forecasting results of WT and PV outputs.

It can be seen from Tab. III that, compared with the multi-step prediction method, the forecasting accuracy of the proposed method is significantly improved, and it is further improved by revising the prediction values via the error distribution compared with the single-step prediction method, indicating that the proposed method has achieved the goal of reducing error accumulation and improving prediction accuracy. At the same time, it can be seen that the load forecasting errors are smaller than those of WT and PV outputs, which suggests that the randomness of WT and PV outputs is stronger than that of the load.

TABLE III: Forecasting errors of different methods

        Indicators   Proposed method   Single-step   Multi-step
Load    MAPE         0.01825           0.0215        0.1416
Load    RMSE         3.9736            8.8031        49.9875
WT      MAPE         0.0709            0.1015        0.2227
WT      RMSE         25.8691           30.5605       66.6884
PV      MAPE         0.0813            0.1261        0.2611
PV      RMSE         38.8431           48.7785       59.8621

C. Optimal Scheduling Based on PER-AutoRL Forecasting

As a comparison, the renewable energy outputs and load in traditional scheduling methods without forecasting are obtained by discretizing their probability distributions in an expectation form, while the required spinning reserve capacity is determined via the probability distributions of the WT, PV outputs, and load [25]. Then, with these data as input, the scheduling results without forecasting are obtained by optimizing the scheduling model.

Fig. 9 shows the scheduling results with/without a forecasting method at the 95% confidence level. It can be seen from Fig. 9 (a) and (b) that the ESS is charged by MTs at the beginning of the scheduling cycle. As shown in Fig. 9 (a), with the scheduling model without forecasting, MT1 and MT3 provide the spinning reserve to meet the required confidence level and balance the load in most periods. When the load demand increases, MT2 also participates in power generation. As shown in Fig. 9 (b), the power generated by MT3 and ESS can basically meet the load and spinning reserve demand; it is only necessary to start MT1 at 24:00, and MT2 need not participate in scheduling. This fact suggests that our approach is capable of improving the operational economy of the microgrid by reducing the MTs' starting and stopping costs.

Fig. 9: Scheduling results.

As shown in Fig. 10, the accuracy of the prediction results is closely related to the operating costs: the greater the prediction errors, the higher the operating costs. This fact confirms that different prediction methods have a significant impact on the operating costs of the microgrid.

Fig. 10: Operation costs of scheduling with different prediction methods.

Fig. 11 shows the effect of the proposed forecasting method on the required spinning reserve capacity at the 95% confidence level. Specifically, the spinning reserve capacity required in the proposed scheduling model is less than that of the model without forecasting in 66.67% of the periods. This confirms that accurate forecasting can reduce the spinning reserve capacity significantly.

Fig. 11: Comparison of spinning reserve capacity with traditional scheduling model without forecasting.

TABLE IV: Operating costs at different confidence levels

Confidence    Operating cost ($)
levels (%)    Proposed approach   Scheduling without forecasting
90%           128.0413            207.9492
95%           139.3082            219.9456
99%           206.8478            253.3075

TABLE V: Performance comparison

Confidence    Proposed approach                       Traditional approach
levels (%)    Operating cost ($)  Calculation time (s)  Operating cost ($)  Calculation time (s)
90%           128.0413            3.7                   220.8436            78.2
95%           139.3082            5.9                   238.5843            89.2
99%           206.8478            5.1                   249.2354            118.68

As shown in Tab. IV, the spinning reserve capacity cost must be increased with the increase of the confidence level, which will inevitably lead to an increase in the operating costs. Furthermore, Tab. IV also illustrates that, compared with the scheduling without forecasting, the operating costs of the proposed approach are reduced by 18.3%, 36.7%, and 38.4% at the 99%, 95%, and 90% confidence levels, respectively. Therefore, it can be concluded that the scheduling with forecasting is able to effectively improve the operational economy of the microgrid, and that choosing a reasonable confidence level is key to realizing the trade-off between reliability and economy.
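The percentage reductions quoted from Tab. IV can be reproduced with a few lines of Python. The sketch below only restates the arithmetic on the values printed in Tab. IV; the variable and function names are chosen for this example.

```python
# Operating costs ($) from Tab. IV:
# confidence level -> (proposed approach, scheduling without forecasting)
table_iv = {
    "90%": (128.0413, 207.9492),
    "95%": (139.3082, 219.9456),
    "99%": (206.8478, 253.3075),
}


def cost_reduction_pct(proposed, baseline):
    """Relative cost reduction of the proposed approach, in percent of the baseline."""
    return (baseline - proposed) / baseline * 100.0


reductions = {
    level: round(cost_reduction_pct(proposed, baseline), 1)
    for level, (proposed, baseline) in table_iv.items()
}

for level, pct in reductions.items():
    print(f"{level} confidence: operating cost reduced by {pct}%")
```

The saving shrinks as the confidence level rises, which is consistent with the observation that a higher confidence level requires more spinning reserve regardless of forecast quality.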

D. Performance Comparison

In general, traditional chance-constrained programming uses Monte Carlo simulations (MCS) to handle the problem, so we use the hybrid intelligent algorithm in [28], which combines the particle swarm optimization (PSO) algorithm with MCS, to solve the chance-constrained model in Sec. IV-A as a comparison. In PSO, the population size is 20, and the maximum number of iterations T is set as 150; in MCS, the number of random variables N is 500. Considering that this approach has randomness, the average results of 10 runs are taken.

The results in Tab. V show that the proposed approach outperforms the traditional approach in calculation efficiency and can significantly reduce the operating costs.

VI. CONCLUSION

How to deal with the uncertainty of renewable energy outputs and load is a crucial issue in the day-ahead scheduling of an isolated microgrid. In this paper, we propose, for the first time, a multi-period forecasting method based on PER-AutoRL and integrate it into the microgrid scheduling model considering demand response. The designed PER-AutoRL manages to improve the construction efficiency of the forecasting model in a customized way, and the test results demonstrate that the prediction accuracy of the proposed forecasting method is significantly superior to that of a traditional multi-step forecasting method. Compared with the traditional scheduling methods without forecasting, our proposed approach is able to significantly reduce the operating costs with faster calculation speed.

In future work, we will extend this method to the energy management of integrated energy systems. Besides, exploring advanced technologies to protect the users' privacy is also an interesting topic.

REFERENCES

[1] R. H. M. Zargar and M. H. Y. Moghaddam, “Development of a Markov-chain-based solar generation model for smart microgrid energy management system,” IEEE Transactions on Sustainable Energy, vol. 11, no. 2, pp. 736–745, 2019.
[2] A. Kumar, A. Verma, and R. Talwar, “Optimal techno-economic sizing of a multi-generation microgrid system with reduced dependency on grid for critical health-care, educational and industrial facilities,” Energy, vol. 208, p. 118248, 2020.
[3] A. Agüera-Pérez, J. C. Palomares-Salas, J. J. G. de la Rosa, and O. Florencias-Oliveros, “Weather forecasts for microgrid energy management: Review, discussion and recommendations,” Applied Energy, vol. 228, pp. 265–278, 2018.
[4] P. Vähäkyla, E. Hakonen, and P. Léman, “Short-term forecasting of grid load using Box-Jenkins techniques,” International Journal of Electrical Power & Energy Systems, vol. 2, no. 1, pp. 29–34, 1980.
[5] S.-C. Lee and L.-H. Shih, “Forecasting of electricity costs based on an enhanced gray-based learning model: A case study of renewable energy in Taiwan,” Technological Forecasting and Social Change, vol. 78, no. 7, pp. 1242–1253, 2011.
[6] Y. Wang, Q. Chen, N. Zhang, and Y. Wang, “Conditional residual modeling for probabilistic load forecasting,” IEEE Transactions on Power Systems, vol. 33, no. 6, pp. 7327–7330, 2018.
[7] H. Jiang, Y. Zhang, E. Muljadi, J. J. Zhang, and D. W. Gao, “A short-term and high-resolution distribution system load forecasting approach using support vector regression with hybrid parameters optimization,” IEEE Transactions on Smart Grid, vol. 9, no. 4, pp. 3341–3350, 2016.
[8] F. Rodríguez, A. Fleetwood, A. Galarza, and L. Fontán, “Predicting solar energy generation through artificial neural networks using weather forecasts for microgrid control,” Renewable Energy, vol. 126, pp. 855–864, 2018.
[9] M. Tan, S. Yuan, S. Li, Y. Su, H. Li, and F. He, “Ultra-short-term industrial power demand forecasting using LSTM based hybrid ensemble learning,” IEEE Transactions on Power Systems, vol. 35, no. 4, pp. 2937–2948, 2019.
[10] M. Afrasiabi, M. Mohammadi, M. Rastegar, and A. Kargarian, “Multi-agent microgrid energy management based on deep learning forecaster,” Energy, vol. 186, p. 115873, 2019.
[11] T. Liu, Z. Tan, C. Xu, H. Chen, and Z. Li, “Study on deep reinforcement learning techniques for building energy consumption forecasting,” Energy and Buildings, vol. 208, p. 109675, 2020.
[12] T. Kuremoto, T. Hirata, M. Obayashi, S. Mabu, and K. Kobayashi, “Training deep neural networks with reinforcement learning for time series forecasting,” in Time Series Analysis-Data, Methods, and Applications. Rijeka, Croatia: InTech, 2019.
[13] C. A. Hernandez-Aramburo, T. C. Green, and N. Mugniot, “Fuel consumption minimization of a microgrid,” IEEE Transactions on Industry Applications, vol. 41, no. 3, pp. 673–681, 2005.
[14] M. R. Ebrahimi and N. Amjady, “Adaptive robust optimization framework for day-ahead microgrid scheduling,” International Journal of Electrical Power & Energy Systems, vol. 107, pp. 213–223, 2019.
[15] H. Wang, Z. Yan, M. Shahidehpour, X. Xu, and Q. Zhou, “Quantitative evaluations of uncertainties in multivariate operations of microgrids,” IEEE Transactions on Smart Grid, vol. 11, no. 4, pp. 2892–2903, 2020.
[16] O. Ciftci, M. Mehrtash, and A. Kargarian, “Data-driven nonparametric chance-constrained optimization for microgrid energy management,” IEEE Transactions on Industrial Informatics, vol. 16, no. 4, pp. 2447–2457, 2019.
[17] L. Wen, K. Zhou, S. Yang, and X. Lu, “Optimal load dispatch of community microgrid with deep learning based solar power and load forecasting,” Energy, vol. 171, pp. 1053–1065, 2019.
[18] V. Mnih, K. Kavukcuoglu, D. Silver, A. A. Rusu, J. Veness, M. G. Bellemare, A. Graves, M. Riedmiller, A. K. Fidjeland, G. Ostrovski et al., “Human-level control through deep reinforcement learning,” Nature, vol. 518, no. 7540, pp. 529–533, 2015.
[19] D. Silver, G. Lever, N. Heess, T. Degris, D. Wierstra, and M. Riedmiller, “Deterministic policy gradient algorithms,” in International Conference on Machine Learning. PMLR, 2014, pp. 387–395.
[20] Z. L. Li, C.-J. M. Liang, W. He, L. Zhu, W. Dai, J. Jiang, and G. Sun, “Metis: Robustly tuning tail latencies of cloud systems,” in 2018 USENIX Annual Technical Conference (USENIX ATC 18), 2018, pp. 981–992.
[21] Y. Hou, L. Liu, Q. Wei, X. Xu, and C. Chen, “A novel DDPG method with prioritized experience replay,” in 2017 IEEE International Conference on Systems, Man, and Cybernetics (SMC). IEEE, 2017, pp. 316–321.
[22] J. Wang, T. Niu, H. Lu, W. Yang, and P. Du, “A novel framework of reservoir computing for deterministic and probabilistic wind power forecasting,” IEEE Transactions on Sustainable Energy, vol. 11, no. 1, pp. 337–349, 2019.
[23] C. Li, G. Tang, X. Xue, A. Saeed, and X. Hu, “Short-term wind speed interval prediction based on ensemble GRU model,” IEEE Transactions on Sustainable Energy, vol. 11, no. 3, pp. 1370–1380, 2019.
[24] C. Kang, Q. Xia, and N. Xiang, “Sequence operation theory and its application in power system reliability evaluation,” Reliability Engineering & System Safety, vol. 78, no. 2, pp. 101–109, 2002.
[25] Y. Li, Z. Yang, G. Li, D. Zhao, and W. Tian, “Optimal scheduling of an isolated microgrid with battery storage considering load and renewable generation uncertainties,” IEEE Transactions on Industrial Electronics, vol. 66, no. 2, pp. 1565–1575, 2018.
[26] Y. Li, C. Wang, G. Li, and C. Chen, “Optimal scheduling of integrated demand response-enabled integrated energy systems with uncertain renewable generations: A Stackelberg game approach,” Energy Conversion and Management, vol. 235, p. 113996, 2021.
[27] P. B. L. Neto, O. R. Saavedra, and L. A. de Souza Ribeiro, “A dual-battery storage bank configuration for isolated microgrids based on renewable sources,” IEEE Transactions on Sustainable Energy, vol. 9, no. 4, pp. 1618–1626, 2018.
[28] Z. Wu, W. Gu, R. Wang, X. Yuan, and W. Liu, “Economic optimal schedule of CHP microgrid system using chance constrained programming and particle swarm optimization,” in 2011 IEEE Power and Energy Society General Meeting. IEEE, 2011, pp. 1–11.
