A Deep Learning Framework For Building Energy Consumption Forecast

Keywords: Energy consumption; Deep learning; Buildings; Clustering; Convolutional neural network; Long short term memory

Abstract

Increasing global building energy demand, with its associated economic and environmental impact, heightens the need for reliable energy demand forecast models. This work presents kCNN-LSTM, a deep learning framework that operates on energy consumption data recorded at predefined intervals to provide accurate building energy consumption forecasts. kCNN-LSTM employs (i) k-means clustering to perform cluster analysis and understand the energy consumption patterns and trends; (ii) Convolutional Neural Networks (CNN) to extract complex features with non-linear interactions that affect energy consumption; and (iii) Long Short Term Memory (LSTM) neural networks to handle long-term dependencies by modeling temporal information in the time series data. The efficiency and applicability of kCNN-LSTM were demonstrated using real-time building energy consumption data acquired from a four-storeyed building at IIT-Bombay, India. The performance of kCNN-LSTM was compared with the k-means variants of state-of-the-art energy demand forecast models in terms of well-known quality metrics. The accurate energy demand forecasts provided by kCNN-LSTM, owing to its ability to learn the spatio-temporal dependencies in the energy consumption data, make it a suitable deep learning model for energy consumption forecast problems.
conditioning (HVAC) optimization, and fault diagnosis and detection. Accurate and reliable energy demand forecasts enable the utilities to plan resources and balance supply and demand, thereby ensuring the stability and security of the power grid and the reliability of service provisions [12]. Overestimation and underestimation of the energy demand both have a severe impact on economic and industrial development [13]. Accurate modeling and prediction of energy demand help in efficient energy management in smart buildings, accurate demand response strategies, electricity supply management, and context-aware control strategies [14]. Therefore, energy consumption forecast models have become an integral part of the Building Energy Management System (BEMS) to improve buildings' energy efficiency for a sustainable economy through a conservation-minded society, reasonable use of available energy resources, and an efficient national energy strategy [15]. However, the non-linear, dynamic, and complex nature of energy consumption data, along with the presence of trend, seasonal, and irregular patterns, and its dependence on various exogenous factors like climatic conditions, nature of the day, and socio-economic factors, makes accurate and reliable energy consumption forecasting an interesting research problem.

1.3. Machine learning based energy forecasting

Energy consumption forecast models in the literature fall into three categories. (i) Engineering methods, which use physical and thermodynamic laws and require complex building and environmental parameters; they are difficult and time-consuming to apply (examples: EnergyPlus, Ecotect) [2]. (ii) Statistical methods, which correlate energy consumption with relevant factors like climate data and occupancy; they lack accuracy and flexibility (examples: time series autoregressive models [14] and linear regression models [16,17]). (iii) Artificial intelligence methods, which learn consumption patterns from historical energy consumption data, i.e., discover the non-linear relationship between the input (historical data) and the output (target consumption) [18,19] (examples: artificial neural networks [20], support vector regression [4]). Among these, artificial intelligence approaches have become an active research hotspot due to their efficiency and flexibility over engineering and statistical methods (Table 1) [21,22].

From Table 1, it is evident that ANN and its variants have been widely explored and applied for energy consumption forecasting, since they are non-linear, self-adaptable, and can approximate any function, given
   to several factors like high dynamics in schedules, occupancy related status, operations, etc.
3. Moreover, the existing energy forecasting models follow static learning, where performance depends entirely on the historical energy consumption data. However, including recent observations alongside the historical data, with the help of a sliding window approach, would result in better forecast accuracy.
4. Further, the existing research works on LSTM based building energy consumption forecast models operate over static data (benchmark datasets) instead of real-time building operation data obtained from the BEMS.

What makes kCNN-LSTM different from the existing building energy consumption models? The distinctive features of kCNN-LSTM are:

1. Feature generation from timestamp: The considered building energy consumption data is an n × m dimensional dataset, where n denotes the rows (energy consumption records) and m the columns (m = 2: a timestamp in dd-mm-yyyy HH:MM:SS format and the energy consumption). As a data preprocessing step, seven features (day of the year, season, month, day of the week, hour of the day, minute of the hour, type of the day) were generated from the timestamp, which enables the learning model to gain better insight into the trend and seasonality of the energy consumption data.
2. Clustering algorithms: An academic building's energy consumption is quite complex and dynamic and does not exhibit an apparent trend and seasonality on initial analysis. In such cases, applying clustering algorithms to the energy consumption data before data modeling provides better insight into the trend and seasonal characterization of the data through the generated clusters.
3. Multi-input and multi-output sliding window: The application of a multi-input and multi-output sliding window to kCNN-LSTM provides robust and reliable forecasting by moving through the window of historical and recent energy consumption observations.
4. Static or live data: kCNN-LSTM has been implemented as an energy consumption forecast model in the BEMS designed for the Kanwal Rekhi School of Information and Technology (KReSIT), IIT-Bombay, India.

The key contributions of this paper are:

1. kCNN-LSTM, a deep learning framework, is presented to provide reliable and accurate building energy consumption forecasts.
2. As a data preprocessing step, seven timestamp based features were generated to enrich the energy consumption data recorded at regular time intervals into parameter-rich data.
3. The complex trend and seasonality in the energy consumption data are analyzed using the k-means clustering algorithm with the LB-Keough distance metric, which identifies the similarity between the time series (energy consumption data) of different months in the considered annual energy consumption data.
4. The spatio-temporal dependencies in the energy consumption data are learned and modeled by the convolutional neural networks and the long short term memory neural networks, respectively.
5. A multi-input and multi-output sliding window mechanism is deployed to provide accurate and reliable energy consumption forecasts.
6. Further, effective modeling of the higher order and non-linear dependencies in the energy consumption data enables kCNN-LSTM to provide accurate forecasts in real time for a long period without retraining.
7. The effectiveness of kCNN-LSTM for reliable building energy consumption forecasting is validated through a case study using the real-time building operational data acquired from the BEMS deployed at KReSIT, IIT-Bombay. Further, the performance of kCNN-LSTM over the k-means variants of existing energy demand forecast models was assessed in terms of MAE, MSE, MAPE, and RMSE for the considered year, weekdays, and weekends.

1.5. Organization

The paper is structured as follows. Section 2 introduces the basic principles of k-means clustering, energy consumption forecasting, LSTM neural networks, and convolutional neural networks. Section 3 presents the formulation of the energy consumption forecast problem and the proposed deep learning framework for forecasting the energy consumption of buildings. Section 4 provides a detailed analysis of the performance of kCNN-LSTM and state-of-the-art energy demand forecast models in terms of MAE, MSE, MAPE, and RMSE. Section 5 concludes the paper.

2. Preliminaries

This section presents the formulation of the energy consumption forecast problem, k-means clustering, long short term memory, and convolutional neural networks.

2.1. Energy consumption forecast problem

In a multi-featured building energy demand forecasting problem, the energy consumed by several components (e.g., plugs, lights, fans, air conditioners, servers, computers) and the environmental factors (humidity, temperature, etc.) are monitored and recorded by sensors (e.g., smart meters, temperature sensors) installed at various levels of the considered building. The energy consumed at timestamp t can be represented as in Eqn. (1):

$E_t = \left\{ e_t^1, e_t^2, \ldots, e_t^i, \ldots, e_t^n \right\}$    (1)

where $e_t^i$ is the energy consumption logged by the i-th sensor at timestamp t. This work uses a Multi-Input and Multi-Output (MIMO) sliding window approach for energy consumption forecasting to achieve better forecast accuracy. Let $\{Input_L, Output_L\} \in \mathbb{N}$ represent the sizes of the input and output windows. The total number of input and output windows can then be defined as $i_f = (S_n - Input_L - a)/Output_L$, where $S_n$ and $a$ denote the number of samples and output intervals, respectively.

The input window $S_{Input}$ of size $Input_L$ and the output window $S_{Output}$ of size $Output_L$ are represented as in Eqn. (2) and Eqn. (3) (Fig. 1):

$S_{Input} = \left\{ E_t, E_{t+1}, \ldots, E_{t+i_f} \right\}$    (2)

$S_{Output} = \left\{ \tilde{E}_t, \tilde{E}_{t+1}, \ldots, \tilde{E}_{t+i_f} \right\}$    (3)

In this work, CNN-LSTM is modeled as an energy forecaster, i.e., an approximation function f that relates $S_{Input}$ (Eqn. (2)) and $S_{Output}$ (Eqn. (3)):

$S_{Output} = f\left(S_{Input}\right), \quad f : P^{Input_L \times n} \to P^{Output_L \times n}$    (4)

Eqn. (4) states that, given an input window ($Input_L$), the model f learns to forecast the energy consumption values of the output window ($Output_L$) with minimal forecast error (Error), as defined in Eqn. (5).
$\mathrm{Error} = \frac{1}{n} \sum_{i=1}^{n} \left| e_t^i - \tilde{e}_t^i \right|$    (5)

where $e_t^i$ and $\tilde{e}_t^i$ are the real and forecasted energy consumption values of the i-th sensor at the t-th timestamp.
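To make the windowing concrete, the following sketch (a minimal illustration, not the authors' code) builds MIMO input/output window pairs from a consumption series; the 60-step window sizes mirror the configuration reported in Section 3.4.

```python
import numpy as np

def sliding_windows(series, input_len, output_len):
    """Pair each input window (Eqn. (2)) with the output window (Eqn. (3))
    that immediately follows it; windows advance by `output_len` steps."""
    X, y = [], []
    for start in range(0, len(series) - input_len - output_len + 1, output_len):
        X.append(series[start:start + input_len])
        y.append(series[start + input_len:start + input_len + output_len])
    return np.asarray(X), np.asarray(y)

# e.g., 60 past readings in, the next 60 readings out
demo = np.arange(500, dtype=float)
X, y = sliding_windows(demo, input_len=60, output_len=60)
print(X.shape, y.shape)   # (7, 60) (7, 60)
```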
2.2. k-means clustering

The k-means clustering algorithm is a widely used partitioning cluster analysis method. It is an iterative hill-climbing algorithm that groups m data points into k (a user-defined parameter) clusters so as to optimize the Sum of Squared Error (SSE), which measures intra-cluster similarity or inter-cluster dissimilarity. Each data point is assigned to one of the cluster centroids (initialized randomly) based on its minimum distance from the centroid. Next, each cluster centroid is updated as the mean of the data points present in its cluster. This procedure is repeated until the SSE (Eqn. (6)) between the cluster centroids and the data points is minimized:

$\mathrm{SSE} = \sum_{m=1}^{K} \sum_{DP_i \in C_m} D\left(DP_i, C_m\right)$    (6)

where $D(DP_i, C_m)$ is the chosen distance between data point $DP_i$ and centroid $C_m$.

The k-means algorithm can be applied to massive, high-dimensional, and non-linear time series data (sequential values measured at equal time intervals) to obtain consistent individual groups of time series for accurate predictions. In this case, clustering parameters like the distance measure and the cluster evaluation measure should be decided before clustering.
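A minimal numpy sketch of this assign/update loop is shown below. It uses the plain Euclidean distance, whereas the paper pairs k-means with the LB-Keough metric (Section 3.2.1), and it leaves the empty-cluster corner case unhandled.

```python
import numpy as np

def kmeans(points, k, n_iter=100, seed=0):
    """Group `points` (N x d array) into k clusters by minimizing the SSE (Eqn. (6))."""
    rng = np.random.default_rng(seed)
    centroids = points[rng.choice(len(points), size=k, replace=False)]
    for _ in range(n_iter):
        # assignment step: each point joins its nearest centroid
        dists = np.linalg.norm(points[:, None, :] - centroids[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # update step: each centroid moves to the mean of its members
        new_centroids = np.array([points[labels == m].mean(axis=0) for m in range(k)])
        if np.allclose(new_centroids, centroids):
            break
        centroids = new_centroids
    sse = ((points - centroids[labels]) ** 2).sum()   # Eqn. (6) with Euclidean D
    return labels, centroids, sse
```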
2.3. Long short term memory

Long Short Term Memory neural networks (LSTM), a special and improved architecture of Recurrent Neural Networks (RNN), employ gate units and 'self-connected memory cells' to extract the underlying complex temporal dependencies in long and short time series data, thereby addressing the 'vanishing gradient problem' of RNNs [42]. An LSTM consists of a memory block that determines the addition and deletion of information through three gates, namely the input gate ($i_t$), forget gate ($f_t$), and output gate ($o_t$) (Fig. 2) [43,44]. The memory cell in the memory block remembers temporal state information about the current and previous timesteps. The workflow of LSTM at timestep t is detailed as follows [45,46]:

i. Forget gate ($f_t$): Decides the information that should be discarded from the cell state, based on the last hidden state ($h_{t-1}$) and the new input ($x_t$), i.e., the output of the previous cell state and the input of the current cell state, respectively (Eqn. (7)):

$f_t = \sigma\left(W_f \cdot [h_{t-1}, x_t] + b_f\right)$    (7)

where $W_f$ and $b_f$ are the weight matrix and bias vector of the forget gate, respectively, and $\sigma$ is the logistic sigmoid function. The degree of information retention relies on the value of the forget gate, which lies in the range [0,1] ('0': forget all; '1': remember all).

ii. Input gate ($i_t$): Decides the information of the input ($x_t$) that should be stored in the cell state; the input gate ($i_t$) and the new candidate cell state ($\tilde{c}_t$) are updated per Eqn. (8) and Eqn. (9). The new cell state ($c_t$) is obtained by combining the previous cell state ($c_{t-1}$) and $\tilde{c}_t$ under the influence of the forget gate ($f_t$) and input gate ($i_t$) (Eqn. (10)):

$i_t = \sigma\left(W_i \cdot [h_{t-1}, x_t] + b_i\right)$    (8)

$\tilde{c}_t = \tanh\left(W_c \cdot [h_{t-1}, x_t] + b_c\right)$    (9)

$c_t = f_t * c_{t-1} + i_t * \tilde{c}_t$    (10)

where $W_i$ and $b_i$ are the weight matrix and bias vector of the input gate, respectively; $W_c$ and $b_c$ are the weight matrix and bias vector of the cell state, respectively; $*$ denotes point-wise multiplication; and $\tanh$ is the hyperbolic tangent function with range [-1,1].

iii. Output gate ($o_t$): Decides the information in the cell state ($c_t$) that should flow out as the output. Eqn. (11) evaluates which part of the cell state is to be exported, and Eqn. (12) computes the final output:

$o_t = \sigma\left(W_o \cdot [h_{t-1}, x_t] + b_o\right)$    (11)

$h_t = o_t \cdot \tanh(c_t)$    (12)

where $W_o$ and $b_o$ are the weight matrix and bias vector of the output gate, respectively. The activation functions ($\sigma(\cdot)$ and $\tanh(\cdot)$) used to express the non-linearity of the LSTM network are defined in Eqn. (13) and Eqn. (14):

$\sigma(x) = \frac{1}{1 + e^{-x}}$    (13)

$\tanh(x) = \frac{e^{x} - e^{-x}}{e^{x} + e^{-x}}$    (14)
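The gate equations translate directly into code. The numpy sketch below runs one LSTM time step per Eqns. (7)-(14); the assumed weight layout (each W maps the concatenated [h_{t-1}, x_t] vector to the hidden dimension) is one common convention, not necessarily the authors' exact implementation.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))           # Eqn. (13)

def lstm_step(x_t, h_prev, c_prev, W_f, b_f, W_i, b_i, W_c, b_c, W_o, b_o):
    z = np.concatenate([h_prev, x_t])         # [h_{t-1}, x_t]
    f_t = sigmoid(W_f @ z + b_f)              # forget gate,  Eqn. (7)
    i_t = sigmoid(W_i @ z + b_i)              # input gate,   Eqn. (8)
    c_tilde = np.tanh(W_c @ z + b_c)          # candidate,    Eqn. (9)
    c_t = f_t * c_prev + i_t * c_tilde        # cell state,   Eqn. (10)
    o_t = sigmoid(W_o @ z + b_o)              # output gate,  Eqn. (11)
    h_t = o_t * np.tanh(c_t)                  # hidden state, Eqn. (12)
    return h_t, c_t
```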
2.4. Convolutional neural networks

Yann LeCun et al. designed CNNs (ConvNets), a class of deep feed-forward neural networks that has proven its significance in extracting spatial features in time series data, image recognition, and classification tasks [47].

The overall architecture of a CNN comprises a convolutional layer, a pooling layer, and a fully connected layer to model complex data (Fig. 3) [48,49]. In general, a CNN has several hierarchies of convolutional and pooling layers, wherein several convolution runs extract the important features from the input data. In the convolutional layer, neurons of different layers of the network are locally connected through a weight sharing technique. The convolutional layer forms the core of the CNN, performing convolution and activation operations on the input data to create a feature map. A sliding window of the convolutional filter (a matrix whose size equals that of the convolutional kernel) moves across the horizontal and vertical directions of the 2-dimensional input data (Fig. 4(a)). The convolution operation is completed by the kernel through the construction of a feature map, a 2D representation generated by calculating the dot product of the convolution kernel and the convolution filter. The convolution and activation operations in the convolutional layer are defined in Eqn. (15) and Eqn. (16):

$C_{ij}^{n} = \mathrm{sum}\left(ck^{n} \otimes fm_{ij}\right) + b^{n}$    (15)

$Y^{n} = \delta\left(C^{n}\right)$    (16)

where $C_{ij}^{n}$ is the output of the convolution operations in the convolutional layer; $ck^{n}$ is the convolution kernel matrix; $fm_{ij}$ is the filter matrix; $\mathrm{sum}(*)$ adds all elements in $*$; $n$ indexes the n-th feature map; $i$ and $j$ are the numbers of steps of the convolution filter in the horizontal and vertical directions; $b^{n}$ is the bias; and $\delta(\cdot)$ is the activation function.

With the completion of convolution, the n feature maps serve as input to the pooling layer. Pooling reduces the dimension of the feature maps by removing redundant features without losing important information (Fig. 4(b)), thereby reducing the computational burden of the model by compressing the input feature map. The average, max, and sum pooling operations can be computed using Eqn. (17) – Eqn. (19), respectively:

$P_{ij}^{n} = \mathrm{AverageP}\left(pfm_{ij}\right)$    (17)

$P_{ij}^{n} = \mathrm{MaximumP}\left(pfm_{ij}\right)$    (18)

$P_{ij}^{n} = \mathrm{SumP}\left(pfm_{ij}\right)$    (19)

where $P_{ij}^{n}$ is the output of the pooling operation in the pooling layer; $i$ and $j$ are the numbers of steps of the pooling filter in the horizontal and vertical directions; $pfm_{ij}$ is the pooling filter matrix; $\mathrm{AverageP}(*)$ gives the average of all elements in $*$; $\mathrm{MaximumP}(*)$ gives the maximum element in $*$; and $\mathrm{SumP}(*)$ gives the sum of all elements in $*$. Finally, the flatten layer flattens the 2D data into a 1D vector representation, which is fed as input to the fully connected layer (Fig. 4(c)).
Fig. 4. (a) 2D convolution operation; (b) Pooling operation with pooling size = [2,2]; (c) Flatten operation [48].
The main idea of the fully connected layer is to connect the adjacent layers to integrate the features and provide the linear output for regression problems [50]. The calculation in the fully connected layer is defined in Eqn. (20):

$FC = \delta\left(W_{fc}\, x + b_{fc}\right)$    (20)

where $W_{fc}$ is the weight matrix; $FC$ is the output matrix of the fully connected layer; $b_{fc}$ is the bias; and $x$ is the input to the fully connected layer.
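As a minimal sketch of these building blocks, the following numpy code performs a single-feature-map 2D convolution with activation (Eqns. (15)-(16)) and non-overlapping max pooling (Eqn. (18)); stride 1, 'valid' (no) padding, and the tanh activation are assumptions standing in for choices the section leaves open.

```python
import numpy as np

def conv2d(x, kernel, bias=0.0, activation=np.tanh):
    """Slide `kernel` over 2D input `x`: sum of element-wise products plus
    bias (Eqn. (15)), followed by the activation (Eqn. (16))."""
    kh, kw = kernel.shape
    out = np.empty((x.shape[0] - kh + 1, x.shape[1] - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(x[i:i + kh, j:j + kw] * kernel) + bias
    return activation(out)

def max_pool(x, size=2):
    """Maximum of each non-overlapping size x size block (Eqn. (18))."""
    h, w = x.shape[0] // size, x.shape[1] // size
    return x[:h * size, :w * size].reshape(h, size, w, size).max(axis=(1, 3))
```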
Table 2
Preprocessed energy consumption dataset.
Year Day of the year Season Month Day of the week Hour of the day Minute of the hour Type of the day Energy Consumed
2017 1 3 1 1 0 0 1 256
2017 1 3 1 1 0 15 1 258
… … … … … … … … …
2019 334 3 11 49 23 45 0 426
3. kCNN-LSTM: proposed deep learning framework for building energy consumption forecast

Fig. 5 presents the architecture of kCNN-LSTM for forecasting the energy consumption of buildings. The overall workflow of kCNN-LSTM consists of (i) a data source and preprocessing layer, which transforms the raw energy consumption data into a compatible format; (ii) a data clustering and analytics layer, which clusters the data into different groups for trend analysis; (iii) a dataset generation layer, which generates the training, validation, and test datasets; and (iv) a model building and evaluation layer, which creates models to forecast energy consumption and evaluates their performance using various error metrics. The kCNN-LSTM energy consumption forecast model is then integrated with the BEMS to forecast building energy consumption for user-specified intervals (time slot, day, week, month, season, etc.).

3.1. Data source and preprocessing layer

The energy consumption data of the KReSIT building at IIT-Bombay contains more than 30 electricity-related features. Of these, this study considers the timestamp (dd-mm-yyyy HH:MM:SS) and energy consumption columns for processing, as they are available across different hierarchies of buildings. Since the yearly energy consumption data of 2018 is considered for analysis, the available per-second granularity data is aggregated to 15 min granularity. In total, there are 96 energy consumption values for each day in the selected year (Block 0: 00:00 to 00:15, Block 1: 00:15 to 00:30, …, Block 95: 23:45 to 00:00). For the techniques employed to handle missing values, outliers, and high-magnitude values, refer to Section 4.2. Further, a simple Python script generates seven features (day of the year (0–365), season (0: Summer; 1: Monsoon; 2: Winter), month, day of the week (0–52), hour of the day (0–24), minute of the hour (0–60), type of the day (0: Holiday; 1: Working day)) from the available timestamp, as sketched below. Table 2 shows the structure of the energy consumption data used for experimentation.
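The paper does not reproduce its preprocessing script, so the pandas sketch below derives the seven features of Table 2 under the paper's encodings (season 0-Summer/1-Monsoon/2-Winter; type of day 0-Holiday/1-Working day). The `timestamp` column name and the holiday list are placeholders, and a plain weekday index is used for "day of the week" even though the stated 0–52 range (and Table 2's value of 49) suggests a week-of-year style index in the actual data.

```python
import pandas as pd

def season_of(month):
    if 4 <= month <= 6:
        return 0   # summer: April-June
    if 7 <= month <= 9:
        return 1   # monsoon: July-September
    return 2       # winter: October-March

def add_timestamp_features(df, holidays=()):
    """Derive the seven timestamp features of Table 2 in place."""
    ts = pd.to_datetime(df["timestamp"], format="%d-%m-%Y %H:%M:%S")
    df["day_of_year"] = ts.dt.dayofyear
    df["season"] = ts.dt.month.map(season_of)
    df["month"] = ts.dt.month
    df["day_of_week"] = ts.dt.dayofweek          # simple weekday reading
    df["hour_of_day"] = ts.dt.hour
    df["minute_of_hour"] = ts.dt.minute
    # working day = not a weekend and not in the (hypothetical) holiday list
    is_holiday = ts.dt.normalize().isin(pd.to_datetime(list(holidays)))
    df["type_of_day"] = ((ts.dt.dayofweek < 5) & ~is_holiday).astype(int)
    return df
```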
3.2. Data clustering and analysis layer

Given an energy consumption dataset E with M objects, E = {T_1, T_2, …, T_M}, where each T_i is a time series, time series clustering partitions E into clusters C = {C_1, C_2, …, C_k} by grouping the series based on the similarity of their consumption trends ($E = \cup_{i=1}^{k} C_i$, with $C_i \cap C_j = \emptyset$ for $i \neq j$) [51]. The procedure for selecting the cluster parameters, such as the distance measure and evaluation measures, is detailed below.

3.2.1. Distance measure

Euclidean distance and Dynamic Time Warping (DTW) are the most widely used distance metrics for time series clustering, owing to the efficiency of the Euclidean distance (simple, fast, and parameter-free) and the effectiveness of DTW.

(i) Euclidean distance measure: It provides one-to-one matching between the timestamps of the time series T_1 = {t_11, t_12, …, t_1n} and T_2 = {t_21, t_22, …, t_2n}, as defined in Eqn. (21):

$\mathrm{Euclidean}(T_1, T_2) = \sqrt{\sum_{i=1}^{n} \left(t_{1i} - t_{2i}\right)^2}$    (21)

However, the Euclidean distance is not the best choice for time series data, since it depends on the domain and time series characteristics of the data and is unable to capture distortions in the time domain, i.e., it is sensitive to shifts in the time axis.

(ii) DTW distance measure [52]: DTW finds the optimal non-linear alignment between two time series by calculating their matching error as follows:

a. Consider two time series of the same length n, T_1 = {t_11, t_12, …, t_1n} and T_2 = {t_21, t_22, …, t_2n}, where t_1i and t_2i are the energy consumption values of time series T_1 and T_2 at the i-th timestamp.
b. Construct a cost matrix of dimension n × n, whose (i, j)-th element represents the Euclidean distance between t_1i and t_2j.
c. Find the path P, i.e., the optimal alignment between T_1 and T_2 through the constructed cost matrix that minimizes the cumulative distance (Eqn. (22)):

$P^{*} = \arg\min_{P} \sqrt{\sum_{k=1}^{K} p_k}$    (22)

where P = (p_1, p_2, …, p_K) and each element $p_k = (t_{1i} - t_{2j})^2$ denotes the squared distance between the aligned i-th and j-th data points of T_1 and T_2.

d. Find the optimal path using a recursive dynamic programming function.

DTW has a complexity of O(nm), where n and m are the lengths of the first and second time series, respectively.

(iii) Lower Bound Keough distance measure [53,54]: Due to its high complexity, the recursive application of DTW to long time series data is expensive. Imposing a locality constraint, using a threshold determined by the window size, together with the Lower Bound Keough (LB-Keough) distance metric provides a way to speed up DTW. LB-Keough exploits the fact that DTW uses global path constraints when comparing two time series. LB-Keough for two time series T_1 = {t_11, t_12, …, t_1n} and T_2 = {t_21, t_22, …, t_2n} is defined in Eqn. (23):

$\mathrm{LB\text{-}Keough}(T_1, T_2) = \sqrt{\sum_{i=1}^{N} \begin{cases} \left(T_{2i} - U_i\right)^2 & \text{if } T_{2i} > U_i \\ \left(T_{2i} - L_i\right)^2 & \text{if } T_{2i} < L_i \\ 0 & \text{otherwise} \end{cases}}$    (23)

where $U_i$ and $L_i$ are the upper and lower bounds of time series T_1, defined as $U_i = \max(T_{1,i-r} : T_{1,i+r})$ and $L_i = \min(T_{1,i-r} : T_{1,i+r})$; r depends on the type of path constraint used (e.g., Itakura parallelogram and Sakoe-Chiba band). The complexity of LB-Keough is O(m).

This work uses the k-means clustering algorithm with the LB-Keough distance metric to cluster the energy consumption patterns for trend analysis.
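The following sketch implements the LB-Keough bound of Eqn. (23), after the formulation popularized in Ref. [52]; `r` is the Sakoe-Chiba style window radius, and the sliced min/max realize $L_i$ and $U_i$.

```python
import numpy as np

def lb_keough(t1, t2, r):
    """Lower Bound Keough distance between equal-length series t1, t2 (Eqn. (23))."""
    total = 0.0
    for i, x in enumerate(t2):
        window = t1[max(0, i - r): i + r + 1]          # envelope around index i
        upper, lower = np.max(window), np.min(window)  # U_i and L_i
        if x > upper:
            total += (x - upper) ** 2
        elif x < lower:
            total += (x - lower) ** 2
    return np.sqrt(total)
```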
3.2.2. Optimal value of k

The primary and essential step in any unsupervised algorithm is to find the optimal number of clusters into which the data points are grouped. Several methods, like the Elbow method, average Silhouette, and gap statistics, have been proposed in the literature to identify the optimal number of clusters (k). This work employs the Elbow method for precise clustering results. The Elbow method plots the SSE obtained for different values of k through iterative runs of the k-means clustering algorithm. The basic idea is that as the value of k increases, the average distortion decreases, with fewer elements per cluster. The value of k at which the distortion decreases the most is the elbow point, which is chosen as the optimal k for the considered dataset.
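A sketch of this procedure follows; `kmeans` is the routine sketched in Section 2.2, and `month_profiles` (one consumption profile per month, as a 2D array) is a hypothetical stand-in for the clustered data.

```python
# Elbow heuristic: run k-means for several k, record the SSE (Eqn. (6)),
# and flag the k whose SSE drop relative to k-1 is largest; in practice
# the k-vs-SSE curve is plotted and the bend is read off visually.
sse = {k: kmeans(month_profiles, k)[2] for k in range(1, 11)}
drops = {k: sse[k - 1] - sse[k] for k in range(2, 11)}
elbow_k = max(drops, key=drops.get)   # k with the largest distortion drop
```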
3.3. Dataset generation layer

The energy consumption data of each cluster is partitioned into training ($E_{Train}$), validation ($E_{Val}$), and testing ($E_{Test}$) sets in the ratio 60:20:20 for the overall, weekday, weekend, and daily analyses, as sketched below.
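A positional split along these lines is shown below; splitting chronologically rather than at random is an assumption, in keeping with common time series practice.

```python
def train_val_test_split(X, y, ratios=(0.6, 0.2, 0.2)):
    """Split window pairs 60:20:20 by position, preserving temporal order."""
    n = len(X)
    i_train = int(n * ratios[0])
    i_val = i_train + int(n * ratios[1])
    return ((X[:i_train], y[:i_train]),
            (X[i_train:i_val], y[i_train:i_val]),
            (X[i_val:], y[i_val:]))
```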
3.4. Model building and evaluation layer

kCNN-LSTM, a building energy consumption forecast model, is designed to model the spatial correlation between the timestamp based generated variables and the temporal information in the irregular consumption patterns for better forecast accuracy. The model architecture of kCNN-LSTM consists of a convolutional layer, a pooling layer, an LSTM layer, and a fully connected layer. The convolutional and pooling layers of the CNN extract the spatial characteristics of the multivariate data and pass the identified features as input to the LSTM. The LSTM models the irregular trends in the energy consumption data based on the spatial features provided by the CNN. A fully connected layer then receives and decodes the output of the LSTM to provide the forecasted energy consumption data. The multi-input and multi-output sliding window mechanism is employed in such a way that kCNN-LSTM learns the input data for every block (15 min).

The input of size 60 × 7 is fed to the convolutional layer with 64 filters, a 2 × 1 kernel, 'same' padding, and ReLU activation. The output of the convolutional layer is passed to an LSTM layer with 64 units and 'tanh' activation. In the series of fully connected layers, the dense layers have 32 and 60 units to produce the forecast of the next 60 min of energy demand; a sketch of this architecture follows. The hyperparameters of each layer in kCNN-LSTM were fine-tuned using ISCOA, an enhanced variant of SCOA, to improve the learning speed and performance of the learning model. ISCOA uses a Haar wavelet based mutation operator to enhance the divergence behavior of the algorithm towards the global optimum. For more details on hyperparameter tuning, refer to [55].
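The stated layer configuration can be written down directly in Keras. The sketch below approximates the published architecture rather than reproducing the authors' tuned model: the pooling size, the dense-layer activation, and the optimizer are assumptions, and the ISCOA hyperparameter tuning is not reproduced.

```python
from tensorflow.keras import Sequential
from tensorflow.keras.layers import Conv1D, MaxPooling1D, LSTM, Dense

model = Sequential([
    Conv1D(64, kernel_size=2, padding="same", activation="relu",
           input_shape=(60, 7)),      # 60 timesteps x 7 timestamp features
    MaxPooling1D(pool_size=2),        # pooling size assumed
    LSTM(64, activation="tanh"),      # models the temporal dependencies
    Dense(32, activation="relu"),     # activation assumed
    Dense(60),                        # next 60 min of energy demand
])
model.compile(optimizer="adam", loss="mse")
model.summary()
```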
4. Case study

This section presents a detailed description of the considered energy consumption data, the data preprocessing techniques, and the evaluation metrics, followed by a detailed analysis of the performance of kCNN-LSTM over the k-means variants of the state-of-the-art building energy consumption forecast models.

4.1. Dataset

This work uses the energy consumption data of KReSIT, IIT-Bombay, India. The flow of electricity consumption data from sensing to sharing comprises several layers with physical and software components (Fig. 6). The functionalities of each layer are detailed below.

(i) Physical layer: It corresponds to the actual building environment for which electricity consumption is measured. KReSIT, IIT-Bombay is a four-storeyed academic building with three wings (A, B, and C) on each floor, consisting of smart classrooms, office rooms, auditoriums, lecture halls, research laboratories, and server rooms. The building is located in Mumbai, where the average temperature falls within 17–34 °C across three seasons, namely summer (April–June), monsoon (July–September), and winter (October–March). The energy and temperature profiles of KReSIT are monitored and controlled by its own BEMS, built for "minimal power consumption in an occupied room" and "zero consumption during zero occupancy".

(ii) Sensing layer: Three-phase smart meters deployed at various facilities of KReSIT record the various electrical parameters through the MODBUS protocol over an RS485 cable. This study uses the energy consumption variable, which captures the energy consumed by air conditioners, plugs, lights, and fans at per-second granularity in the considered physical space. The energy consumption data is queried from the smart meters by a Raspberry Pi at predefined intervals in a round-robin fashion. A Python script running on the Raspberry Pi collects the smart meter data and transmits it to the central server using the Message Queuing Telemetry Transport (MQTT) protocol. The sensor nodes communicate via serial or low-range RF links, and a gateway (NodeMCU and RPi) sends the sensor data to the central server for storage and processing.

(iii) Communication layer: MQTT, a fast and minimal-overhead transmission protocol, is used to publish the data from the smart meters to the remote ingestion servers. It uses the publish-subscribe model, which enables multiple subscribers to receive selected data streams by registering with the central broker.

(iv) Ingestion and storage layer: The data streams of the smart meters are published on topics of the form <channel>/<data type>/<sensor identifier>, where the channel specifies where the data is published, the data type is used by the ingestion engine to find the schema of the published data stream, and the sensor identifier corresponds to the globally unique ID assigned to each smart meter. The smart meter data is stored in a MySQL database and in CSV format for visualization and archival.

(v) Visualization layer: Grafana, an open source data visualization engine, is used to display the real-time electricity consumption of various appliances (ACs, plugs, lights, and fans), which enables consumers to visualize their energy consumption. The academic building dataset can be accessed from Ref. [56] for research purposes. The live power consumption data can be visualized as shown in Fig. 7.

Since the major objective of this research is to forecast the overall energy consumption of the considered building, the consumption data provided by the smart meter installed at the MAINS is used for the experiments. For experimental purposes, the energy consumption data from January 1, 2018 to December 31, 2018 at 15 min granularity was considered. For better visualization, each 15 min interval is mapped to a block number, i.e., Block 0: 00:00 to 00:15, Block 1: 00:15 to 00:30, …, Block 95: 23:45 to 00:00. Fig. 8 provides the statistical details of the considered energy consumption data.
Fig. 8. Statistics of KReSIT energy consumption data for 2018.

(ii) Outlier handling: Values greater than two standard deviations above the mean are capped (Eqn. (25)):

$f(x_i) = \begin{cases} \mathrm{avg}(x) + 2\,\mathrm{std}(x) & \text{if } x_i > \mathrm{avg}(x) + 2\,\mathrm{std}(x) \\ x_i & \text{otherwise} \end{cases}$    (25)

where x is the vector consisting of the $x_i$, avg(x) is the average value of x, and std(x) is the standard deviation of x.

(iii) Normalization: The min-max normalization technique is used to regularize the energy consumption data, to avoid inaccurate forecasts caused by high-magnitude values in the energy consumption data (Eqn. (26)):

$f(x_i) = \frac{x_i - \min(x)}{\max(x) - \min(x)}$    (26)

where min(x) and max(x) are the minimum and maximum values of x, respectively.

(iii) Mean absolute error: It measures the absolute difference between the actual and the forecasted values (Eqn. (29)):

$\mathrm{MAE} = \frac{1}{n} \sum_{i=1}^{n} \left| \hat{y}_i - y_i \right|$    (29)

(iv) Mean absolute percentage error: It measures the deviation of the forecasted value from the actual value (Eqn. (30)):

$\mathrm{MAPE} = \frac{1}{n} \sum_{i=1}^{n} \left| \frac{\hat{y}_i - y_i}{y_i} \right| \times 100$    (30)
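The preprocessing rules and error measures above reduce to a few lines of numpy, as sketched below; MSE and RMSE, which the paper also reports (their equations fall outside the extracted text), are included for completeness.

```python
import numpy as np

def cap_outliers(x):
    limit = x.mean() + 2 * x.std()                 # Eqn. (25)
    return np.where(x > limit, limit, x)

def min_max(x):
    return (x - x.min()) / (x.max() - x.min())     # Eqn. (26)

def mae(y_true, y_pred):
    return np.abs(y_pred - y_true).mean()          # Eqn. (29)

def mape(y_true, y_pred):
    return np.abs((y_pred - y_true) / y_true).mean() * 100   # Eqn. (30)

def mse(y_true, y_pred):
    return ((y_pred - y_true) ** 2).mean()

def rmse(y_true, y_pred):
    return np.sqrt(mse(y_true, y_pred))
```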
5. Results and discussions

The implementation of kCNN-LSTM and the experimental analysis are detailed below.
Table 3
LB-Keough distance matrix.
Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec
Jan 0 0.2573 0.4000 0.5253 0.7467 0.1313 0.3930 0.5967 0.2966 0.7075 0.3134 0.4040
Feb 0.9143 0 1.1917 0.2607 1.2914 0.2951 1.1596 0.9738 0.1183 0.5721 0.2396 1.0829
Mar 0.6175 0.2488 0 0.3578 0.2298 0.1580 0.2730 0.4172 0.3766 0.3179 0.5215 1.0556
Apr 1.2903 1.0702 1.3483 0 1.5387 1.0032 1.3159 1.1997 1.1093 0.8165 1.1244 1.2989
May 1.0323 0.4832 0.1398 0.4926 0 0.2995 0.5324 0.5502 0.6560 0.4026 0.7946 1.4301
Jun 0.7177 0.3172 0.7714 0.3668 0.8568 0 0.7803 0.6995 0.3958 0.5197 0.4220 0.9394
Jul 0.3790 0.0414 0.1022 0.3927 0.4171 0.1514 0 0.2753 0.1328 0.3566 0.2659 0.8025
Aug 0.4490 0.0510 0.1279 0.0746 0.2597 0.0475 0.1072 0 0.1659 0.0817 0.2027 0.6976
Sep 0.5486 0.1005 0.7276 0.3096 0.8489 0.1709 0.6779 0.5866 0 0.4288 0.1016 0.7053
Oct 0.7388 0.2619 0.7102 0.0632 0.9346 0.3679 0.6912 0.4949 0.4222 0 0.3892 0.8452
Nov 0.6150 0.0536 0.9079 0.2452 1.0596 0.2228 0.8744 0.7127 0.0116 0.4468 0 0.6693
Dec 0.0934 0.1955 0.4630 0.3906 0.8102 0.2701 0.4003 0.3970 0.1838 0.4857 0.1660 0
Fig. 10. Cluster analysis (a) Cluster 1 (b) Cluster 2 (c) Cluster 3.
Fig. 11. Energy consumption patterns in Cluster 1 for different days (a) Monday (b) Wednesday (c) Friday (d) Sunday.
Fig. 12. Energy consumption patterns in Cluster 2 for different days (a) Monday (b) Wednesday (c) Friday (d) Sunday.
Fig. 13. Energy consumption patterns in Cluster 3 for different days (a) Monday (b) Wednesday (c) Friday (d) Sunday.
Table 5. Cluster 1 – Performance analysis of the state-of-the-art machine learning and deep learning energy demand forecast models.
Table 6. Cluster 2 – Performance analysis of the state-of-the-art machine learning and deep learning energy demand forecast models.
Table 7. Cluster 3 – Performance analysis of the state-of-the-art machine learning and deep learning energy demand forecast models.
all days of the weekdays and weekend. The experimental design and validation are presented through daily analysis on selected days of the weekdays (Monday, Wednesday, and Friday) and weekends (Sunday) of the identified clusters. The analysis and results for the other days of the weekdays and weekends are provided in the Appendix (Figs. A1, A2 and A3; Tables A1, A2 and A3). To understand the fluctuations in the energy consumption demand of KReSIT and to provide an accurate demand forecast, the average energy consumption over the selected days across the months in each cluster was analyzed (Figs. 11, 12 and 13(a)-(d)). The daily analysis enables a better understanding of the energy consumption patterns and aids the deep learning model in providing better forecasts of the energy demand for the user-specified day, month, and interval.

(iii) Phase 3: Future energy demand forecast analysis

Finally, the performance of kCNN-LSTM for the energy consumption forecast problem was validated through a comparative analysis with the k-means clustering variants of ARIMA, Deep Belief Network (DBN), Multi-Layer Perceptron (MLP), CNN, and LSTM. The hyperparameters of kCNN-LSTM (momentum, dropout, weight decay, learning rate, strides, filters, etc.) were fine-tuned through the enhanced variant of the sine cosine optimization algorithm to achieve better forecast accuracy (Table 4). The hyperparameters of the contrast models were fine-tuned using a sequential grid search algorithm.

The above-stated architecture of kCNN-LSTM was implemented using the Keras framework with a TensorFlow backend. The average error values obtained from 30 independent runs were taken for the analysis of kCNN-LSTM against the k-means variants of the state-of-the-art energy demand forecast models. Tables 5–7 provide the average error values of the considered machine learning and deep learning models cluster-wise and for weekdays, weekends, and selected days of the week in the identified clusters.

From Tables 5–7, it is evident that kCNN-LSTM outperforms the k-means variants of the contrast models. The main reason is that, apart from learning the load trend characterization, kCNN-LSTM combines the benefit of the contextual features generated from the timestamp with the temporal information in the historical energy consumption data to achieve better forecast accuracy. The computation times of kCNN-LSTM and the considered models reveal that the proposed energy demand forecast model provides the best computational efficiency (Table 8).

Table 8. Computation time analysis (in seconds).

Models      Cluster 1   Cluster 2   Cluster 3
ARIMA       79          85          34
DBN         97          102         62
MLP         84          95          51
LSTM        88          110         45
CNN         81          105         43
CNN-LSTM    75          90          40

6. Conclusions

This work presented kCNN-LSTM, a deep learning framework for building energy consumption forecasting that couples k-means clustering with convolutional and long short term memory neural networks. Its hyperparameters were fine-tuned using ISCOA, an improved sine cosine optimization algorithm whose Haar wavelet based mutation operator helps the search avoid
premature convergence. A case study using the real-time building energy consumption data acquired from the BEMS deployed at the KReSIT building, IIT-Bombay, was presented. The performance of kCNN-LSTM was compared with the k-means variants of the state-of-the-art energy demand forecast models in terms of MSE, RMSE, MAE, and MAPE. The experimental results demonstrate the efficiency of the kCNN-LSTM model over the existing demand forecast models in providing accurate energy consumption demand forecasts. Therefore, the implementation of kCNN-LSTM at the electricity network and user level can aid in decision making, demand management programs, and energy efficiency efforts. The future directions of this research include (i) the application of kCNN-LSTM to residential buildings and (ii) an analysis of the performance of various optimization algorithms for the hyperparameter tuning of kCNN-LSTM.

Credit author statement

Nivethitha Somu: Conceptualization, Methodology, Investigation, Data curation, Writing (Original, Review and Editing), and Visualization. Gauthama Raman: Conceptualization, Software, Writing (Review and Editing). Krithi Ramamritham: Conceptualization, Validation, Resources, Writing (Review and Editing), Supervision, and Funding acquisition.

Appendix

Fig. A1. Energy consumption patterns in Cluster 1 for different days (a) Tuesday (b) Thursday (c) Saturday.
Fig. A2. Energy consumption patterns in Cluster 2 for different days (a) Tuesday (b) Thursday (c) Saturday.
Fig. A3. Energy consumption patterns in Cluster 3 for different days (a) Tuesday (b) Thursday (c) Saturday.

Table A1. Cluster 1 – Performance analysis of the state-of-the-art machine learning and deep learning energy demand forecast models (Tuesday, Thursday, and Saturday).

Day        Metric   ARIMA    DBN      MLP      LSTM     CNN      CNN-LSTM
Tuesday    MSE      0.0158   0.0113   0.0051   0.0076   0.0065   0.0045
           RMSE     0.1257   0.1065   0.0715   0.0870   0.0804   0.0668
           MAE      0.1066   0.0778   0.0573   0.0663   0.0563   0.0442
           MAPE     0.2674   0.2401   0.1429   0.1886   0.1620   0.1291
Thursday   MSE      0.0144   0.0178   0.0051   0.0008   0.0016   0.0005
           RMSE     0.1202   0.1335   0.0716   0.0277   0.0397   0.0218
           MAE      0.0960   0.1239   0.0540   0.0212   0.0333   0.0174
           MAPE     0.4005   0.3963   0.1297   0.0619   0.0967   0.0514
Saturday   MSE      0.0251   0.0122   0.0016   0.0008   0.0005   0.0005
           RMSE     0.1586   0.1104   0.0400   0.0276   0.0220   0.0213
           MAE      0.1397   0.0872   0.0349   0.0214   0.0170   0.0156
           MAPE     0.3280   0.2460   0.1034   0.0708   0.0556   0.0513

Table A2. Cluster 2 – Performance analysis of the state-of-the-art machine learning and deep learning energy demand forecast models (Tuesday, Thursday, and Saturday).

Day        Metric   ARIMA    DBN      MLP      LSTM     CNN      CNN-LSTM
Tuesday    MSE      0.0087   0.0166   0.0175   0.0089   0.0095   0.0003
           RMSE     0.0933   0.1287   0.1323   0.0944   0.0972   0.0192
           MAE      0.0566   0.1000   0.0904   0.0585   0.0600   0.0119
           MAPE     0.5184   0.4124   0.2506   0.2466   0.2558   0.1877
Thursday   MSE      0.1003   0.0237   0.0209   0.0210   0.0197   0.0189
           RMSE     0.1376   0.1538   0.1447   0.1451   0.1405   0.0192
           MAE      0.1119   0.1176   0.1100   0.1105   0.0968   0.0046
           MAPE     0.4184   0.4390   0.3958   0.3715   0.4048   0.2877
Saturday   MSE      0.0048   0.0067   0.0045   0.0064   0.0045   0.0035
           RMSE     0.0695   0.0816   0.0672   0.0802   0.0673   0.0141
           MAE      0.0489   0.0611   0.0471   0.0602   0.0480   0.0093
           MAPE     0.1766   0.1787   0.1299   0.1738   0.1484   0.0604
Table A3. Cluster 3 – Performance analysis of the state-of-the-art machine learning and deep learning energy demand forecast models (Tuesday, Thursday, and Saturday).

Day        Metric   ARIMA    DBN      MLP      LSTM     CNN      CNN-LSTM
Tuesday    MSE      0.1040   0.0463   0.0333   0.0298   0.2187   0.0222
           RMSE     0.2640   0.2151   0.1824   0.1726   0.1568   0.1489
           MAE      0.3250   0.1636   0.1300   0.1353   0.2152   0.1218
           MAPE     0.4844   0.3006   0.3458   0.2503   0.2404   0.2314
Thursday   MSE      0.2029   0.0429   0.0532   0.0234   0.1154   0.0126
           RMSE     0.1546   0.2070   0.2306   0.1531   0.1741   0.1123
           MAE      0.1212   0.1608   0.1671   0.1160   0.8047   0.0883
           MAPE     0.3397   0.3125   0.5654   0.2409   0.8758   0.2138
Saturday   MSE      0.0108   0.0209   0.0257   0.0130   0.1077   0.0081
           RMSE     0.1285   0.1445   0.1603   0.1139   0.1880   0.0900
           MAE      0.2159   0.1155   0.1328   0.0906   0.0681   0.0662
           MAPE     0.2617   0.2544   0.4789   0.1918   0.1574   0.1480
Declaration of competing interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Acknowledgments

The authors would like to thank the Indian Institute of Technology-Bombay (Institute Postdoctoral Fellowship AO/Admin-1/Rect/33/2019); the Ministry of Human Resource and Development, India – Impacting Research Innovation and Technology (IMPRINT-16MOPIMP002), New Delhi, India; Prof. Kannan Krithivasan, Dean, School of Education, SASTRA Deemed University, Tamil Nadu, India (TATA Realty-SASTRA Srinivasa Ramanujan Research Cell, India); and the SEIL members.
References

[1] Yang Y, Chen Y, Wang Y, Li C, Li L. Modelling a combined method based on ANFIS and neural network improved by DE algorithm: a case study for short-term electricity demand forecasting. Appl Soft Comput J 2016;49:663–75. https://doi.org/10.1016/j.asoc.2016.07.053.
[2] Bui DK, Nguyen TN, Ngo TD, Nguyen-Xuan H. An artificial neural network (ANN) expert system enhanced with the electromagnetism-based firefly algorithm (EFA) for predicting the energy consumption in buildings. Energy 2019:116370. https://doi.org/10.1016/j.energy.2019.116370.
[3] Wang W, Hong T, Xu X, Chen J, Liu Z, Xu N. Forecasting district-scale energy dynamics through integrating building network and long short-term memory learning algorithm. Appl Energy 2019;248:217–30. https://doi.org/10.1016/j.apenergy.2019.04.085.
[4] Jain RK, Smith KM, Culligan PJ, Taylor JE. Forecasting energy consumption of multi-family residential buildings using support vector regression: investigating the impact of temporal and spatial monitoring granularity on performance accuracy. Appl Energy 2014;123:168–78. https://doi.org/10.1016/j.apenergy.2014.02.057.
[5] Conti J, Holtberg P, Diefenderfer J, LaRose A, Turnure JT, Westfall L. International energy outlook 2016 with projections to 2040. 2016. https://doi.org/10.2172/1296780.
[6] Amber KP, Ahmad R, Aslam MW, Kousar A, Usman M, Khan MS. Intelligent techniques for forecasting electricity consumption of buildings. Energy 2018;157:886–93. https://doi.org/10.1016/j.energy.2018.05.155.
[7] Chou JS, Tran DS. Forecasting energy consumption time series using machine learning techniques based on usage patterns of residential householders. Energy 2018;165:709–26. https://doi.org/10.1016/j.energy.2018.09.144.
[8] Ocampo Batlle EA, Escobar Palacio JC, Silva Lora EE, Martínez Reyes AM, Melian Moreno M, Morejón MB. A methodology to estimate baseline energy use and quantify savings in electrical energy consumption in higher education institution buildings: case study, Federal University of Itajubá (UNIFEI). J Clean Prod 2019. https://doi.org/10.1016/j.jclepro.2019.118551.
[9] Neto AH, Fiorelli FAS. Comparison between detailed model simulation and artificial neural network for forecasting building energy consumption. Energy Build 2008;40:2169–76. https://doi.org/10.1016/j.enbuild.2008.06.013.
[10] Afzal M, Huang Q, Amin W, Umer K, Raza A, et al. Blockchain enabled distributed demand side management in community energy system with smart homes. IEEE Access 2020;8:37428–39.
[11] Global demand for energy will peak in 2030, says World Energy Council. The Guardian. n.d.
[12] Hassan S, Khosravi A, Jaafar J, Khanesar MA. A systematic design of interval type-2 fuzzy logic system using extreme learning machine for electricity load demand forecasting. Int J Electr Power Energy Syst 2016;82:1–10. https://doi.org/10.1016/j.ijepes.2016.03.001.
[13] Yaslan Y, Bican B. Empirical mode decomposition based denoising method with support vector regression for time series prediction: a case study for electricity load forecasting. Meas J Int Meas Confed 2017;103:52–61. https://doi.org/10.1016/j.measurement.2017.02.007.
[14] Deb C, Zhang F, Yang J, Lee SE, Shah KW. A review on time series forecasting techniques for building energy consumption. Renew Sustain Energy Rev 2017;74:902–24. https://doi.org/10.1016/j.rser.2017.02.085.
[15] Xiao J, Li Y, Xie L, Liu D, Huang J. A hybrid model based on selective ensemble for energy consumption forecasting in China. Energy 2018;159:534–46. https://doi.org/10.1016/j.energy.2018.06.161.
[16] Wei N, Li C, Peng X, Zeng F, Lu X. Conventional models and artificial intelligence-based models for energy consumption forecasting: a review. J Petrol Sci Eng 2019;181:106187. https://doi.org/10.1016/j.petrol.2019.106187.
[17] Ciulla G, D'Amico A. Building energy performance forecasting: a multiple linear regression approach. Appl Energy 2019;253:113500. https://doi.org/10.1016/j.apenergy.2019.113500.
[18] Liu T, Tan Z, Xu C, Chen H, Li Z. Study on deep reinforcement learning techniques for building energy consumption forecasting. Energy Build 2020;208:109675. https://doi.org/10.1016/j.enbuild.2019.109675.
[19] Tran D, Luong D, Chou J. Nature-inspired metaheuristic ensemble model for forecasting energy consumption in residential buildings. Energy 2019:116552. https://doi.org/10.1016/j.energy.2019.116552.
[20] Raza MQ, Khosravi A. A review on artificial intelligence based load demand forecasting techniques for smart grid and buildings. Renew Sustain Energy Rev 2015;50:1352–72. https://doi.org/10.1016/j.rser.2015.04.065.
[21] Mat Daut MA, Hassan MY, Abdullah H, Rahman HA, Abdullah MP, Hussin F. Building electrical energy consumption forecasting analysis using conventional and artificial intelligence methods: a review. Renew Sustain Energy Rev 2017;70:1108–18. https://doi.org/10.1016/j.rser.2016.12.015.
[22] Amasyali K, El-Gohary NM. A review of data-driven building energy consumption prediction studies. Renew Sustain Energy Rev 2018;81:1192–205. https://doi.org/10.1016/j.rser.2017.04.095.
[23] Fayaz M, Kim D. A prediction methodology of energy consumption based on deep extreme learning machine and comparative analysis in residential buildings. Electronics 2018;7. https://doi.org/10.3390/electronics7100222.
[24] Malik S, Kim DH. Prediction-learning algorithm for efficient energy consumption in smart buildings based on particle regeneration and velocity boost in particle swarm optimization neural networks. Energies 2018;11. https://doi.org/10.3390/en11051289.
[25] Bedi J, Toshniwal D. Empirical mode decomposition based deep learning for electricity demand forecasting. IEEE Access 2018;6:49144–56. https://doi.org/10.1109/ACCESS.2018.2867681.
[26] Li C, Ding Z, Yi J, Lv Y, Zhang G. Deep belief network based hybrid model for building energy consumption prediction. Energies 2018;11:1–26. https://doi.org/10.3390/en11010242.
[27] Moon J, Park J, Hwang E, Jun S. Forecasting power consumption for higher educational institutions based on machine learning. J Supercomput 2018;74:3778–800. https://doi.org/10.1007/s11227-017-2022-x.
[28] Touzani S, Granderson J, Fernandes S. Gradient boosting machine for modeling the energy consumption of commercial buildings. Energy Build 2018;158:1533–43. https://doi.org/10.1016/j.enbuild.2017.11.039.
[29] Rahman A, Srikumar V, Smith AD. Predicting electricity consumption for commercial and residential buildings using deep recurrent neural networks. Appl Energy 2018;212:372–85. https://doi.org/10.1016/j.apenergy.2017.12.051.
[30] Ye Z, Kim MK. Predicting electricity consumption in a building using an optimized back-propagation and Levenberg–Marquardt back-propagation neural network: case study of a shopping mall in China. Sustain Cities Soc 2018;42:176–83. https://doi.org/10.1016/j.scs.2018.05.050.
[31] Yuan T, Zhu N, Shi Y, Chang C, Yang K, Ding Y. Sample data selection method for improving the prediction accuracy of the heating energy consumption. Energy Build 2018;158:234–43. https://doi.org/10.1016/j.enbuild.2017.10.006.
[32] Divina F, Gilson A, Goméz-Vela F, Torres MG, Torres JF. Stacking ensemble learning for short-term electricity consumption forecasting. Energies 2018;11:1–31. https://doi.org/10.3390/en11040949.
[33] Ahmad T, Chen H. Utility companies strategy for short-term energy demand forecasting using machine learning based models. Sustain Cities Soc 2018;39:401–17. https://doi.org/10.1016/j.scs.2018.03.002.
[34] Goudarzi S, Anisi MH, Kama N, Doctor F, Soleymani SA, Sangaiah AK. Predictive modelling of building energy consumption based on a hybrid nature-inspired optimization algorithm. Energy Build 2019;196:83–93. https://doi.org/10.1016/j.enbuild.2019.05.031.
[35] Bedi J, Toshniwal D. Deep learning framework to forecast electricity demand. Appl Energy 2019;238:1312–26. https://doi.org/10.1016/j.apenergy.2019.01.113.
[36] Kim Y, Son H-g, Kim S. Short term electricity load forecasting for institutional buildings. Energy Rep 2019;5:1270–80. https://doi.org/10.1016/j.egyr.2019.08.086.
[37] Dagdougui H, Bagheri F, Le H, Dessaint L. Neural network model for short-term and very-short-term load forecasting in district buildings. Energy Build 2019;203. https://doi.org/10.1016/j.enbuild.2019.109408.
[38] Wen L, Zhou K, Yang S. Load demand forecasting of residential buildings using a deep learning model. Elec Power Syst Res 2020;179:106073. https://doi.org/10.1016/j.epsr.2019.106073.
[39] Wei Y, Zhang X, Shi Y, Xia L, Pan S, Wu J, et al. A review of data-driven approaches for prediction and classification of building energy consumption. Renew Sustain Energy Rev 2018;82:1027–47. https://doi.org/10.1016/j.rser.2017.09.108.
[40] Molina-Solana M, Ros M, Ruiz MD, Gómez-Romero J, Martin-Bautista MJ. Data science for building energy management: a review. Renew Sustain Energy Rev 2017;70:598–609. https://doi.org/10.1016/j.rser.2016.11.132.
[41] Kim TY, Cho SB. Predicting residential energy consumption using CNN-LSTM neural networks. Energy 2019;182:72–81. https://doi.org/10.1016/j.energy.2019.05.230.
[42] Hochreiter S, Schmidhuber J. Long short-term memory. Neural Comput 1997;9:1735–80. https://doi.org/10.1162/neco.1997.9.8.1735.
[43] Ghimire S, Deo RC, Raj N, Mi J. Deep solar radiation forecasting with convolutional neural network and long short-term memory network algorithms. Appl Energy 2019;253:113541. https://doi.org/10.1016/j.apenergy.2019.113541.
[44] Chen Y, Zhang S, Zhang W, Peng J, Cai Y. Multifactor spatio-temporal correlation model based on a combination of convolutional neural network and long short-term memory neural network for wind speed forecasting. Energy Convers Manag 2019;185:783–99. https://doi.org/10.1016/j.enconman.2019.02.018.
[45] Song X, Liu Y, Xue L, Wang J, Zhang J, Wang J, et al. Time-series well performance prediction based on Long Short-Term Memory (LSTM) neural network model. J Petrol Sci Eng 2019;186:106682. https://doi.org/10.1016/j.petrol.2019.106682.
[46] Bai Y, Zeng B, Li C, Zhang J. An ensemble long short-term memory neural network for hourly PM2.5 concentration forecasting. Chemosphere 2019;222:286–94. https://doi.org/10.1016/j.chemosphere.2019.01.121.
[47] LeCun Y, Bengio Y, Hinton G. Deep learning. Nature 2015;521:436–44. https://doi.org/10.1038/nature14539.
[48] Li J, Li X, He D. A directed acyclic graph network combined with CNN and LSTM for remaining useful life prediction. IEEE Access 2019;7:75464–75. https://doi.org/10.1109/ACCESS.2019.2919566.
[49] Lu X, Lin P, Cheng S, Lin Y, Chen Z, Wu L, et al. Fault diagnosis for photovoltaic array based on convolutional neural network and electrical time series graph. Energy Convers Manag 2019;196:950–65. https://doi.org/10.1016/j.enconman.2019.06.062.
[50] Sadaei HJ, de Lima e Silva PC, Guimarães FG, Lee MH. Short-term load forecasting by using a combined method of convolutional neural networks and fuzzy time series. Energy 2019;175:365–77. https://doi.org/10.1016/j.energy.2019.03.081.
[51] Aghabozorgi S, Ying Wah T, Herawan T, Jalab HA, Shaygan MA, Jalali A. A hybrid algorithm for clustering of time series data based on affinity search technique. Sci World J 2014;2014. https://doi.org/10.1155/2014/562194.
[52] Minnaar A. Time series classification and clustering with Python. Alex Minnaar's blog; 2014.
[53] Rath TM, Manmatha R. Lower-bounding of dynamic time warping distances for multivariate time series. 2002.
[54] Karydis I, Nanopoulos A, Papadopoulos AN, Manolopoulos Y. Evaluation of similarity searching methods for music data in P2P networks. Int J Bus Intell Data Min 2005;1:210–28. https://doi.org/10.1504/IJBIDM.2005.008363.
[55] Somu N, Gauthama Raman MR, Ramamritham K. A hybrid model for building energy consumption forecasting using long short term memory networks. Appl Energy 2020;261:114131. https://doi.org/10.1016/j.apenergy.2019.114131.
[56] Smart Energy Informatics Laboratory (SEIL). Academic building dataset. n.d. http://seil.cse.iitb.ac.in/datasets/ [accessed 4 December 2019].