Article
Energy Demand Forecasting Using Deep Learning:
Applications for the French Grid
Alejandro J. del Real 1, *, Fernando Dorado 2 and Jaime Durán 2
1 Department of Systems and Automation, University of Seville, 41004 Seville, Spain
2 IDENER, 41300 Seville, Spain; [email protected] (F.D.); [email protected] (J.D.)
* Correspondence: [email protected]
Received: 8 March 2020; Accepted: 21 April 2020; Published: 3 May 2020
Abstract: This paper investigates the use of deep learning techniques in order to perform energy
demand forecasting. To this end, the authors propose a mixed architecture consisting of a convolutional
neural network (CNN) coupled with an artificial neural network (ANN), with the main objective of
taking advantage of the virtues of both structures: the regression capabilities of the artificial neural
network and the feature extraction capacities of the convolutional neural network. The proposed
structure was trained and then used in a real setting to provide a French energy demand forecast using
Action de Recherche Petite Echelle Grande Echelle (ARPEGE) forecasting weather data. The results
show that this approach outperforms the reference Réseau de Transport d’Electricité (RTE, French
transmission system operator) subscription-based service. Additionally, the proposed solution obtains
the highest performance score when compared with other alternatives, including Autoregressive
Integrated Moving Average (ARIMA) and traditional ANN models. This opens up the possibility
of achieving high-accuracy forecasting using widely accessible deep learning techniques through
open-source machine learning platforms.
Keywords: energy demand forecasting; deep learning; machine learning; convolutional neural
networks; artificial neural networks
1. Introduction
The forecasting of demand plays an essential role in the electric power industry. Thus, there are
a wide variety of methods for electricity demand prediction ranging from those of the short term
(minutes) to long term (weeks), while considering microscopic (individual consumer) to macroscopic
(country-level) aggregation levels. This paper is focused on macroscopic power forecasting in the
medium term (hours).
To date, researchers are in agreement that electrical demand arises from complex interactions
between multiple personal, corporate, and socio-economic factors [1]. All these sources make power
demand forecasting difficult. Indeed, an ideal model able to forecast the power demand with the
highest possible level of accuracy would require access to virtually infinite data sources in order to feed
such a model with all the relevant information. Unfortunately, both the unavailability of the data and
the associated computational burden mean that researchers investigate approximate models supplied
with partial input information.
Within this framework, the prediction of power consumption has been tackled from different
perspectives using different forecasting methodologies. Indeed, there is a rich state of the art of
methods which, according to the authors of [1], can be divided into the following main categories:
• Statistical models: Purely empirical models where inputs and outputs are correlated using
statistical inference methods, such as:
Figure 1. Comparison between traditional machine learning models (a) requiring manual feature extraction, and modern deep learning structures (b) which can automate all the feature and training process in an end-to-end learning structure.
In this paper, the authors focus the analysis on predicting the energy demand based on artificial intelligence models. Nevertheless, although modern deep learning techniques have attracted the attention of many researchers in a myriad of areas, many publications related to power demand forecasting use traditional machine learning approaches such as artificial neural networks.
As commented above, the use of ANNs in the energy sector has been widely researched. Thanks to their good generalization ability, ANNs have received considerable attention in smart grid forecasting and management. A comparison between the different methods of energy prediction using ANN is proposed in [4] by classifying these algorithms into two groups.
Energies 2020, 13, 2242 3 of 15
On the one hand, the first group consists of traditional feedforward neural networks with only
one output node to predict next-hour or next-day peak load, or with several output nodes to forecast
hourly load [5].
On the other hand, other authors opt for radial basis function networks [6], self-organizing
maps [7], and recurrent neural networks [8].
Lie et al. compared three forecasting techniques, i.e., fuzzy logic (FL), neural networks (NNs),
and autoregressive (AR) processes. They concluded that FL and NNs are more accurate than AR
models [9]. In 2020, Chen Li presented an ANN for a short-term load forecasting model in the smart
urban grids of Victoria and New South Wales in Australia.
Bo et al. proposed a combined energy forecasting mechanism composed of the back propagation
neural network, support vector machine, generalized regression neural network, and ARIMA [10].
Wen et al. explored a deep learning approach to identify active power fluctuations in real time based
on long short-term memory (LSTM) [11].
However, traditional ANN solutions have limited performance when there is a lack of large training
datasets, a significant number of inputs, or when solving computationally demanding problems [12],
which is precisely the case discussed in this paper.
Thus, the authors of this paper found a promising topic related to the application of modern
deep learning structures to the problem of power demand forecasting. More specifically, this paper
describes the novel use of a particular deep neural network structure composed of a convolutional
neural network (widely used in image classification) followed by an artificial neural network for the
forecasting of power demand with a limited number of information sources available.
The network structure of a CNN was first proposed by Fukushima in 1988 [13]. The use of CNNs
has several advantages over traditional ANNs, including being highly optimized for processing 2D
and 3D images and being effective in the learning and extraction of 2D image features [14]. Specifically,
this is a quite interesting application for the purposes of this paper, since the authors here aim to extract
relevant features from the temperature grid of France (as further explained in Section 2.1.2).
The technique used to locate important regions and extract relevant features from images is
referred to as visual saliency prediction. This is a challenging research topic, with a vast number of
computer vision and image processing applications.
Wang et al. [15] introduced a novel saliency detection algorithm which sequentially exploited
the local and global contexts. The local context was handled by a CNN model which assigned a local
saliency value to each pixel given the input of local images patches, while the global context was
handled by a feed-forward network.
In the field of energy prediction, some authors have studied the modeling of electricity consumption
in Poland using nighttime light images and deep neural networks [16].
In [17], an architecture known as DeepEnergy was proposed to predict energy demands using CNNs. There are two main processes in DeepEnergy: feature extraction and forecasting. The feature extraction is performed by three convolutional layers and three pooling layers, while the forecasting phase is handled by a fully connected structure.
Based on the conclusions and outcomes achieved in previous literature, the authors here
conceptualize their solution which, as described in the next sections, is an effective approach to
dealing with the power demand time series forecasting problem with multiple input variables,
complex nonlinear relationships, and missing data.
Furthermore, the proposed deep learning structure has been applied to the particular problem of
French power demand in a real-setting approach. The next section comprehensively describes the
materials and data sources used to this end, so other researchers can replicate and adapt the work of
this paper to other power demand forecasting applications. As shown later, the performance of this
approach is equal to (if not better than) that of the reference Réseau de Transport d’Electricité (RTE)
French power demand forecast subscription-based service. Moreover, the proposed model performs
better than existing approaches, as described in the Results section.
2. Materials and Methods
2.1. Data Analysis
2.1.1. Power Demand Data
For this paper, the historical data of French energy consumption were downloaded from the official RTE website [18], which provides data from 2012 to the present. A first analysis of these data is shown in Figure 2, which shows a clear seasonal pattern in the energy demand.
Figure 2. Monthly French energy demand for the period 2018–2019. Qx indicates the X data percentiles. The colored lines within the Q25 and Q75 quartile boxes represent the median (orange line) and the mean (dashed green line). Points below Q25 and above Q75 are shown as well.
The strong seasonal pattern in the energy demand is further backed by Figure 3, computed with the data of RTE, which depicts the correlation between energy consumption and temperature. As shown, an average variation of 1 °C during winter over the entire territory led to a variation of around 2500 MW in the peak consumption (equivalent to the average winter consumption of about 2 million homes) [18]. In the summer, the temperature gradient related to air conditioning was approximately 400 MW per °C.
2.1.2. Weather Forecast Data
As explained in [19], the meteorological parameters are the most important independent variables
and the main form of input information for the forecast of energy demand. Specifically, temperature
plays a fundamental role in the energy demand prediction, since it has a significant and direct effect on
energy consumption (please refer to Figure 3). Moreover, different weather parameters are correlated,
so the inclusion of more than one may cause multicollinearity [19]. Accordingly, for this paper,
temperature was the fundamental input used.
Figure 3. Correlation between energy consumption and temperature as provided by the Réseau de
Transport d’Electricité (RTE).
Figure 4. Locations of temperature forecasting with a resolution of 1° over France.
A key issue related to weather forecasting was the availability of the data, since the providers
released their predictions only at certain moments. In this case, the solution had to be able to
predict the French power demand for the day ahead (D+1) based on the weather forecasts at day D.
As depicted in Figure 5, the power demand forecast model was run at 08.00 every day (D), with
the most recent weather forecast information available (released at 00.00), and provided a prediction
of the energy demand during day D+1.
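This daily schedule can be made concrete with a short sketch (the helper name below is hypothetical, used only to illustrate the timing described above):

```python
from datetime import datetime, timedelta

def target_hours(run_time: datetime):
    """Given a model run at 08.00 on day D (using the weather forecast
    released at 00.00 of day D), return the 24 hourly timestamps of
    day D+1 for which the energy demand is predicted."""
    day_ahead = (run_time + timedelta(days=1)).replace(
        hour=0, minute=0, second=0, microsecond=0)
    return [day_ahead + timedelta(hours=h) for h in range(24)]

# A run at 08.00 on day D yields the 24 hourly targets of day D+1.
hours = target_hours(datetime(2019, 5, 10, 8, 0))
```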
Figure 5. Real setting of the energy demand forecasting problem. D: day.
2.2. Data Preparation
First, the historical datasets related to the French energy consumption (Section 2.1.1) and forecasted temperature (Section 2.1.2) were pre-processed to eliminate outliers, clean unwanted characters, and filter null data. Then, as is usual practice when training machine learning models, the resulting data were divided into three datasets: training, validation, and testing.
Although the historical French consumption data provided by RTE date back to 2012, the authors of this paper only had access to the ARPEGE historical weather forecast data in the period spanning from 1 October 2018 to 30 September 2019. Although a wider availability of historical weather forecast data would have benefitted the generalization capability of the resulting machine learning model, the data set available still covered a whole year, so the seasonal influence was fully captured. Additionally, the authors randomly extracted eight full days from the original data set in order to further test the generalization performance of the model (as depicted in Figure 6 and further discussed in Section 3). This way, the remaining data sets were randomly divided as follows:
• Training Dataset (80% of the data): The sample of data used to fit the model.
• Validation Dataset (10% of the data): The sample of data used to provide an unbiased evaluation of a model fit on the training dataset while tuning model hyperparameters.
• Test Dataset (10% of the data): The sample of data used to provide an unbiased evaluation of a final model fit on the training dataset.
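The split described above can be sketched in NumPy as follows (a minimal illustration with synthetic day indices; the paper does not specify the random seed or tooling actually used):

```python
import numpy as np

rng = np.random.default_rng(0)

# One year of data: 365 days of hourly samples.
n_days = 365
all_days = np.arange(n_days)

# Hold out eight random full days for the complementary generalization test.
test_days = rng.choice(all_days, size=8, replace=False)
remaining = np.setdiff1d(all_days, test_days)

# Randomly divide the remaining hourly samples: 80% train, 10% validation, 10% test.
samples = rng.permutation(np.repeat(remaining, 24))
n = len(samples)
train = samples[: int(0.8 * n)]
val = samples[int(0.8 * n): int(0.9 * n)]
test = samples[int(0.9 * n):]
```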
Figure 6. Division of the original dataset (365 days) into testing and training data. The testing data were used as a complementary means to further analyze the generalization performance of the resulting model. The remaining training data were divided as usual: 80% train, 10% validation, and 10% testing.
2.3. Deep Learning Architecture
The deep learning architecture used in this paper (as shown in Figure 7) resembles those structures widely used in image classification: a convolutional neural network followed by an artificial neural network. The novelty of this paper is not the deep neural network itself but its application to the macroscopic forecast of energy demand. In fact, the aforementioned deep learning architecture was originally thought to automatically infer features from an input image.
Figure 7. Deep learning structure composed of a convolutional neural network followed by an artificial neural network adapted to the energy demand forecasting problem.
For the applications of this paper, the convolutional network received the temperature forecasts from multiple locations within the area of interest (in this case, France) instead of an image. Still, the convolutional network extracted a “feature” of such input, which may be understood as a representative temperature of France automatically calculated attending to the individual contribution of each location to the aggregated energy consumption. For instance, the temperature locations close to large consumption sites (such as highly populated areas) would be automatically assigned a larger weight when compared to other less populated areas.
As also discussed in Section 1, the advantage of the proposed deep learning structure with respect to traditional (and less sophisticated) machine learning structures is that this feature extraction is implicit to the model, and thus there is no need to design the feature extraction step manually.
As shown in Figure 7, the artificial neural network receiving the “featured” temperature from the convolutional network was also fed with additional information found to highly influence the energy demand as well, namely:
• Week of the year: a number from 1 to 52.
• Hour: a number from 0 to 23.
• Day of the week: a number from 0 to 6.
• Holiday: true (1) or false (0).
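These calendar inputs can be derived from a timestamp as sketched below (the holiday set shown is a tiny illustrative sample; the paper does not specify which holiday calendar was used):

```python
from datetime import datetime

def calendar_features(ts: datetime,
                      holidays=frozenset({(1, 1), (7, 14), (12, 25)})):
    """Return (week_of_year, hour, day_of_week, is_holiday) as used as
    additional ANN inputs. The holiday set is a hypothetical sample of
    French public holidays given as (month, day) pairs."""
    week = min(ts.isocalendar()[1], 52)   # 1..52 (ISO week 53 folded into 52)
    hour = ts.hour                        # 0..23
    dow = ts.weekday()                    # 0..6 (Monday = 0)
    holiday = 1 if (ts.month, ts.day) in holidays else 0
    return week, hour, dow, holiday

# Bastille Day 2019 (a Sunday) at 15:00.
features = calendar_features(datetime(2019, 7, 14, 15, 0))
```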
2.3.1. Convolutional Neural Network
The main benefits of CNNs are the following:
• They use fewer parameters (weights) than fully connected networks.
• They are designed to be invariant in object position and distortion of the scene when used to
process images, which is a property shared when they are fed with other kinds of inputs as well.
• They can automatically learn and generalize features from the input domain.
Attending to these benefits, this paper used a CNN to extract a representative temperature of
the area of interest (France) from the historical temperature forecast data as explained before. For the
sake of providing an easy replication of the results by other researchers, the main features of the CNN
designed for this paper were as follows:
• A two-dimensional convolutional layer. This layer created a convolution kernel that was convolved with the layer input to produce a tensor of outputs. It was set with the following parameters:
The weights of all layers were initialized following a normal distribution with mean 0.1 and
standard deviation 0.05.
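As a self-contained sketch of this step (written in plain NumPy rather than the authors' actual deep learning framework, with a hypothetical 10 x 10 temperature grid and 3 x 3 kernel), a 2D convolution with weights drawn from the stated normal distribution looks like:

```python
import numpy as np

rng = np.random.default_rng(42)

def conv2d(grid, kernel):
    """Valid 2D convolution of a temperature grid with a single kernel,
    producing the tensor of outputs described above."""
    gh, gw = grid.shape
    kh, kw = kernel.shape
    out = np.empty((gh - kh + 1, gw - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(grid[i:i + kh, j:j + kw] * kernel)
    return out

# Weights initialized from a normal distribution (mean 0.1, std 0.05).
kernel = rng.normal(loc=0.1, scale=0.05, size=(3, 3))

# Hypothetical 1-degree-resolution temperature grid over France.
grid = rng.uniform(0.0, 25.0, size=(10, 10))
feature_map = conv2d(grid, kernel)  # shape (8, 8)
```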
and testing. Moreover, the training parameters were also optimized. To this effect, the different
models were trained, repeatedly changing the learning parameters (such as the learning rate) to find the
optimal ones.
Once all the results were obtained, the objective was to find the model with the least bias error
(error in the training set) as well as low validation and testing errors. Accordingly, model 5 of the table
below was selected.
Finally, L2 regularization was added to our model in order to reduce the difference between the
bias error and the validation/testing error. In addition, thanks to L2 regularization, the model was able
to better generalize using data that had never been seen.
A summary of the tests performed can be seen in Table 1 below:
Table 1. Summary of the results of the different structures. ANN: artificial neural network; CNN:
convolutional neural network.
Model 1 2 3 4 5
Layer 1 (CNN) - 64 64 64 64
Layer 2 (CNN) 24 24 24 24 24
Layer 1 (ANN) - - - - 256
Layer 2 (ANN) - - - 128 128
Layer 3 (ANN) - - 64 64 64
Layer 4 (ANN) 32 32 32 32 32
Layer 5 (ANN) 16 16 16 16 16
Layer 6 (ANN) 1 1 1 1 1
Train Error (%) 1.9548 1.2275 0.6532 0.4797 0.4929
Validation Error (%) 2.7721 2.6791 1.2307 0.9378 0.8603
Test Error 1 (%) 2.8185 3.0435 1.2415 0.9125 0.8843
Test Error 2 (%) 4.2818 4.1677 2.0604 1.6873 1.5378
Cross-Validation Error (%) 5.8827 5.3691 2.6341 2.0806 1.6621
2.4. Training
The training process of the proposed deep neural network was aimed at adjusting its internal
parameters (resembling mathematical regression), so the structure was able to correlate its output (the
French energy demand forecast) with respect to its inputs.
What separates deep learning from a traditional regression problem is the handling of the
generalization error, also known as the validation error. Here, the generalization error is defined as the
expected value of the error when the deep learning structure is fed with new input data which were not
shown during the training phase. Typically, the usual approach is to estimate the generalization error
by measuring its performance on the validation set of examples that were collected separately from the
training set. The factors determining how well a deep learning algorithm performs are its ability to:
The tradeoff of these factors results in a deep neural network structure that is either underfitted or
overfitted. In order to prevent overfitting, the usual approach is to update the learning algorithm to
encourage the network to keep the weights small. This is called weight regularization, and it can be
used as a general technique to reduce overfitting of the training dataset and improve the generalization
of the model.
In the model used in this paper, the authors used the so-called L2 regularization in order to reduce
the validation error. This regularization strategy drives the weights closer to the origin by adding
a regularization term to the objective function. L2 regularization adds the sum of the square of the
weights to the loss function [22].
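In sketch form (a generic illustration of the technique, not the authors' exact loss code), adding the L2 term to the objective looks like:

```python
import numpy as np

def l2_regularized_loss(y_true, y_pred, weights, lam=1e-3):
    """Mean absolute percentage error plus the L2 penalty: the sum of
    squared weights scaled by a regularization strength lam (the value
    of lam here is illustrative; the paper does not report it)."""
    mape = 100.0 * np.mean(np.abs((y_true - y_pred) / y_true))
    l2_penalty = lam * sum(np.sum(w ** 2) for w in weights)
    return mape + l2_penalty

loss = l2_regularized_loss(np.array([100.0, 200.0]),
                           np.array([110.0, 190.0]),
                           [np.array([1.0, 2.0])])
```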
The rest of the training parameters were selected as follows:
Energies 2020, 13, 2242 10 of 15
• Batch size: 100. The number of training examples in one forward/backward pass. The higher the
batch size, the more memory space needed.
• Epochs: 30,000. One forward pass and one backward pass of all the training examples.
• Learning rate: 0.001. Determines the step size at each iteration while moving toward a minimum
of a loss function.
• β1 parameter: 0.9. The exponential decay rate for the first moment estimates (momentum).
• β2 parameter: 0.99. The exponential decay rate for the second moment estimates (RMSprop).
• Loss function: Mean absolute percentage error.
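The β1 and β2 values above correspond to an Adam-style update; a minimal single-parameter sketch (illustrative only, not the actual training loop) is:

```python
import numpy as np

def adam_step(w, grad, m, v, t, lr=0.001, beta1=0.9, beta2=0.99, eps=1e-8):
    """One Adam update: beta1 decays the first-moment (momentum) estimate,
    beta2 the second-moment (RMSprop-style) estimate."""
    m = beta1 * m + (1 - beta1) * grad
    v = beta2 * v + (1 - beta2) * grad ** 2
    m_hat = m / (1 - beta1 ** t)        # bias correction
    v_hat = v / (1 - beta2 ** t)
    w = w - lr * m_hat / (np.sqrt(v_hat) + eps)
    return w, m, v

# Minimizing f(w) = w^2 (gradient 2w) from w = 5.0:
w, m, v = 5.0, 0.0, 0.0
for t in range(1, 201):
    w, m, v = adam_step(w, 2 * w, m, v, t)
```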
3. Results
Different metrics were computed within this paper in order to evaluate the performance of the
proposed solution. Specifically, mean absolute error (MAE), mean absolute percentage error (MAPE),
mean bias error (MBE), and mean bias percentage error (MBPE) were calculated. Their equations are
listed below:
\[ \mathrm{MAE}\ (\mathrm{MW}) = \frac{1}{n}\sum \left| y - \hat{y} \right|, \qquad (1) \]
\[ \mathrm{MAPE}\ (\%) = \frac{100}{n}\sum \left| \frac{y - \hat{y}}{y} \right|, \qquad (2) \]
\[ \mathrm{MBE}\ (\mathrm{MW}) = \frac{1}{n}\sum \left( y - \hat{y} \right), \qquad (3) \]
\[ \mathrm{MBPE}\ (\%) = \frac{100}{n}\sum \frac{y - \hat{y}}{y}, \qquad (4) \]
where y is the reference measure (in our case, the real energy demand) and ŷ is the estimated measure
(in our case, the forecasted energy demand).
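Equations (1)–(4) translate directly into code; a minimal sketch:

```python
import numpy as np

def metrics(y, y_hat):
    """Compute MAE (MW), MAPE (%), MBE (MW), and MBPE (%) as defined in
    Equations (1)-(4), with y the real and y_hat the forecasted demand."""
    err = y - y_hat
    mae = np.mean(np.abs(err))
    mape = 100.0 * np.mean(np.abs(err) / y)
    mbe = np.mean(err)
    mbpe = 100.0 * np.mean(err / y)
    return mae, mape, mbe, mbpe

mae, mape, mbe, mbpe = metrics(np.array([100.0, 200.0]),
                               np.array([90.0, 210.0]))
```

Note that MBE and MBPE keep the sign of the error, so over- and under-forecasts cancel out; this is why the bias metrics in Table 2 are much smaller than the absolute ones.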
Once the deep learning structure proposed in this paper was trained with the training data set
shown in Figure 6 and its output was tested against the real French energy demand, the authors
calculated the different metrics, as shown in Table 2. As an additional performance metric, the authors
calculated the metrics of the reference RTE energy demand forecast, which was included for comparison.
Table 2. Performance comparison metrics. MAE: mean absolute error; MAPE: mean absolute percentage
error; MBE: mean bias error; MBPE: mean bias percentage error.
Model                    MAE (MW)   MAPE (%)   MBE (MW)   MBPE (%)
Deep learning network    808.317    1.4934     21.7444    0.0231
RTE forecast service     812.707    1.4941     280.8350   0.4665
The absolute percentage error is also presented in the graphical form provided in Figure 8.
Figure 8. Absolute percentage error distribution provided by the deep learning structure proposed in this paper and the RTE subscription-based service.
Another interesting measure of the performance of the proposed structure is the absolute percentage error monthly distribution along a full year, as shown in Figure 9.
Figure 9. Absolute percentage error-specific monthly metrics over an entire year as provided by the proposed deep neural network.
In Tables 3 and 4, the monthly distributed results of the metrics from Equations (1)–(4) are also gathered.
Figure 10 depicts the results of testing the performance of the forecast provided by the proposed deep neural network on the eight full days extracted from the original data.
Finally, in order to compare the performance achieved by the approach proposed in this paper with that of existing solutions, a comparative study was performed. To this end, the CNN + ANN structure was fed with the temperature grid information, while the other methods, which were not specially designed for processing images, were fed with the average temperature values of France. The ARIMA algorithm received past energy demand as input and predicted future demand. The results of the experiment are provided in Table 5.
Table 5. Comparison between the proposed solution and the existing methods.
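As a hedged illustration of the autoregressive idea behind the ARIMA baseline, a plain least-squares AR(p) one-step forecaster can be written in a few lines. This is a simplified sketch (no differencing or moving-average terms, and the orders used in the paper's experiment are not restated here), not the authors' configuration:

```python
import numpy as np

def fit_ar(series, p):
    """Fit AR(p) coefficients by ordinary least squares (no intercept)."""
    series = np.asarray(series, dtype=float)
    # Row t holds the lagged window [y_{t-p}, ..., y_{t-1}]; target is y_t
    X = np.column_stack([series[i:len(series) - p + i] for i in range(p)])
    y = series[p:]
    coeffs, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coeffs

def forecast_next(series, coeffs):
    """One-step-ahead forecast from the last p observations."""
    p = len(coeffs)
    return float(np.dot(np.asarray(series, dtype=float)[-p:], coeffs))

# Toy demand series with a clean 24-hour periodic pattern
t = np.arange(200)
demand = 50000.0 + 3000.0 * np.sin(2 * np.pi * t / 24)

coeffs = fit_ar(demand, p=24)
print(forecast_next(demand, coeffs))  # close to the true next value
```

Because the toy series is exactly periodic with period 24, an AR(24) model recovers it almost perfectly; real demand series need the differencing and error terms that full ARIMA provides.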
As the main outcome of the analysis, it can be concluded that the approach achieving the best results was the deep learning structure presented in this paper, as it improved even on the RTE baseline values. Furthermore, it can be observed that the single ANN, which was supplied only with the average temperature, performed worse than the CNN + ANN method. This confirms that the CNN can extract the temperature features of France, providing relevant information to the machine learning algorithm and thus improving the results. Accordingly, the solution presented in this paper was able to outperform the existing methods, thanks to the processing and extraction of features from the French temperature grid performed by the CNN.
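The advantage attributed to the CNN above, extracting spatial features from the temperature grid rather than collapsing it to a single national average, can be illustrated with a bare-bones convolution and pooling pass in plain NumPy. This is a didactic sketch of the operation a convolutional layer performs, not the trained network from the paper; the grid size and filter weights here are arbitrary:

```python
import numpy as np

def conv2d(grid, kernel):
    """Valid 2D cross-correlation of a single-channel grid with one kernel."""
    kh, kw = kernel.shape
    gh, gw = grid.shape
    out = np.empty((gh - kh + 1, gw - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(grid[i:i + kh, j:j + kw] * kernel)
    return out

def max_pool(fmap, size=2):
    """Non-overlapping max pooling; trims edges that do not fit."""
    h, w = (fmap.shape[0] // size) * size, (fmap.shape[1] // size) * size
    return fmap[:h, :w].reshape(h // size, size, w // size, size).max(axis=(1, 3))

# Toy 8x8 "temperature grid" over a region, in degrees Celsius
rng = np.random.default_rng(1)
temps = 10.0 + 5.0 * rng.random((8, 8))

# An arbitrary kernel that responds to north-south temperature gradients
kernel = np.array([[ 1.0,  1.0,  1.0],
                   [ 0.0,  0.0,  0.0],
                   [-1.0, -1.0, -1.0]])

features = max_pool(conv2d(temps, kernel))
print(features.shape)  # (3, 3): a compact spatial feature map fed to the ANN
```

A trained CNN learns many such kernels jointly with the regression head, which is what lets it exploit regional temperature contrasts that an average temperature value discards.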
Figure 10. Performance of the forecast provided by the proposed deep neural network on the eight full days extracted from the original data. (Left column) Real energy consumption, neural network energy prediction, and energy prediction by the RTE model on a different full day in the Testing Set. (Right column) Absolute percentage error in energy prediction throughout the day by the neural network and the model proposed by RTE.
4. Discussion
In this paper, the authors presented the adaptation of a deep neural network structure commonly
used for image classification applied to the forecast of energy demand. In particular, the structure was
trained for the French energy grid.
The results show that the performance of the proposed structure competes with the results
provided by the RTE subscription-based reference service. Specifically, the overall MAPE metric of the
proposed approach delivers an error of 1.4934%, which is slightly better than the value of 1.4941%
obtained with the RTE forecast data.
In addition, a comparison between the proposed solution and existing methods was also performed.
As pointed out in the Results section, the suggested approach performed better than all the existing
methods that were tested. Specifically, the linear regression, regression tree, and support vector
regression (linear) approaches had a MAPE above 10%; support vector regression (polynomial) had a
MAPE of 9.2218%; and ARIMA and ANN had MAPEs slightly below 3%. Since the
MAPE achieved by the proposed structure was 1.4934%, it can be confirmed that the CNN + ANN
approach is better than the existing models.
When analyzed on a monthly basis, the errors were uniformly distributed throughout the year, apart from noticeable increments during the late autumn and winter seasons. This behavior is also in accordance with the reference RTE forecast data and may be due to the intermittency of the energy consumption profile observed when French temperatures are low.
The proposed deep neural network was also tested against eight full days randomly selected
from the original dataset in order to provide an additional measure of generalization performance.
On the one hand, as shown in Figure 10, the errors were uniformly distributed across the selected days. On the other hand, the predictions provided in this paper were quite similar to those of the reference RTE subscription service and were also aligned with the overall MAPE
metrics. These results indicate that the proposed neural network structure is well designed and trained,
and that it generalizes as expected.
The performance achieved in this paper is a promising result for those researchers within the
electrical energy industry requiring accurate energy demand forecasting at multiple levels (both
temporal and geographical). Despite the focus of this paper on the French macroscopic energy demand
problem, the flexibility of the proposed deep neural network and the wide availability of open platforms
for its design and training make the proposed approach an accessible and easy-to-implement project.
To further facilitate the replication of this paper by other researchers in this area, the authors have
included detailed information about the topology and design of the proposed structure.
Author Contributions: Conceptualization, A.J.d.R.; software, F.D.; validation, A.J.d.R. and J.D.; writing—original
draft preparation, A.J.d.R.; writing—review and editing, A.J.d.R., F.D. and J.D.; supervision, A.J.d.R. All authors
have read and agreed to the published version of the manuscript.
Funding: This research received no external funding.
Conflicts of Interest: The authors declare no conflict of interest.
References
1. Hu, H.; Wang, L.; Peng, L.; Zeng, Y.-R. Effective energy consumption forecasting using enhanced bagged
echo state network. Energy 2020, 197, 1167–1178. [CrossRef]
2. Oliveira, E.M.; Luiz, F.; Oliveira, C. Forecasting mid-long term electric energy consumption through bagging
ARIMA and exponential smoothing methods. Energy 2018, 144, 776–778. [CrossRef]
3. Wang, J.; Ma, Y.; Zhang, L.; Gao, R.X.; Wu, D. Deep learning for smart manufacturing: Methods and
applications. J. Manuf. Syst. 2018, 48, 144–156. [CrossRef]
4. Raza Khan, A.; Mahmood, A.; Safdar, A.A.; Khan, Z. Load forecasting, dynamic pricing and DSM in smart grid: A review. Renew. Sustain. Energy Rev. 2016, 54, 1311–1322.
5. Hippert, H.; Pedreira, C.; Souza, R. Neural networks for short-term load forecasting: A review and evaluation. IEEE Trans. Power Syst. 2001, 16, 44–55. [CrossRef]
6. Gonzalez-Romera, J.-M.; Carmona-Fernandez, M. Monthly electric demand forecasting based on trend extraction. IEEE Trans. Power Syst. 2006, 21, 1946–1953. [CrossRef]
7. Beccali, M.; Cellura, M.; Brano, L.; Marvuglia, V. Forecasting daily urban electric load profiles using artificial neural networks. Energy Convers. Manag. 2004, 45, 2879–2900. [CrossRef]
8. Srinivasan, D.; Lee, M.A. Survey of hybrid fuzzy neural approaches to electric load forecasting. In Proceedings of the IEEE International Conference on Systems, Man and Cybernetics. Intelligent Systems for the 21st Century, Vancouver, BC, Canada, 22–25 October 1995.
9. Liu, K.; Subbarayan, S.; Shoults, R.; Manry, M. Comparison of very short-term load forecasting techniques.
IEEE Trans. Power Syst. 1996, 11, 877–882. [CrossRef]
10. Bo, H.; Nie, Y.; Wang, J. Electric load forecasting use a novelty hybrid model on the basic of data preprocessing
technique and multi-objective optimization algorithm. IEEE Access 2020, 8, 13858–13874. [CrossRef]
11. Wen, S.; Wang, Y.; Tang, Y.; Xu, Y.; Li, P.; Zhao, T. Real-time identification of power fluctuations based on LSTM recurrent neural network: A case study on Singapore power system. IEEE Trans. Ind. Inform. 2019, 15, 5266–5275. [CrossRef]
12. Gu, J.; Wang, Z.; Kuen, J.; Ma, L.; Shahroudy, A.; Shuai, B.; Wang, X.; Wang, L.; Wang, G.; Cai, J.; et al. Recent advances in convolutional neural networks. Pattern Recognit. 2017. [CrossRef]
13. Fukushima, K. Neocognitron: A hierarchical neural network capable of visual pattern recognition. Neural Netw. 1988, 1, 119–130. [CrossRef]
14. Alom, M.Z.; Taha, T.M.; Yakopcic, C.; Westberg, S.; Sidike, P. A State-of-the-art survey on deep learning
theory and architectures. Electronics 2019, 8, 292. [CrossRef]
15. Wang, L.; Lu, H.; Ruan, X.; Yang, M.H. Deep networks for saliency detection via local estimation and global
search. In Proceedings of the Conference on Computer Vision and Pattern Recognition (CVPR), Boston, MA,
USA, 7–12 June 2015.
16. Jasiński, T. Modeling electricity consumption using nighttime light images and artificial neural networks.
Energy 2019, 179, 831–842. [CrossRef]
17. Kuo, P.-H.; Huang, C.-J. A high precision artificial neural networks model for short-term energy load
forecasting. Energies 2018, 11, 213. [CrossRef]
18. RTE. November 2014. Available online: http://clients.rte-france.com/lang/fr/visiteurs/vie/courbes_
methodologie.jsp (accessed on 20 December 2019).
19. Arenal Gómez, C. Modelo de Temperatura Para la Mejora de la Predicción de la Demanda Eléctrica: Aplicación al
Sistema Peninsular Español; Universidad Politécnica de Madrid: Madrid, Spain, 2016.
20. ARPEGE. Meteo France, Le Modele. 2019. Available online: https://donneespubliques.meteofrance.fr/client/
document/doc_arpege_pour-portail_20190827-_249.pdf (accessed on 3 February 2020).
21. Goodfellow, I.; Bengio, Y.; Courville, A. Optimization for training deep models. Deep Learning. 2017,
pp. 274–330. Available online: http://faculty.neu.edu.cn/yury/AAI/Textbook/DeepLearningBook.pdf
(accessed on 3 February 2020).
22. Brownlee, J. Machine Learning Mastery. Available online: https://machinelearningmastery.com/weightregularization- (accessed on 3 February 2020).
© 2020 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access
article distributed under the terms and conditions of the Creative Commons Attribution
(CC BY) license (http://creativecommons.org/licenses/by/4.0/).