An Evolutionary Deep Learning Model Based On EWKM, Random Forest Algorithm, SSA and BiLSTM For Building Energy Consumption Prediction
Energy
journal homepage: www.elsevier.com/locate/energy
ARTICLE INFO

Handling Editor: X Zhao

Keywords: Building energy consumption prediction; Deep learning; Bidirectional long short-term memory; Feature selection; Sparrow search algorithm

ABSTRACT

Accurate prediction of building energy consumption is crucial to the rational scheme of building energy. Combining the entropy-weighted K-means (EWKM) with the random forest (RF) method, a feature selection (EWKM-RF) method is proposed in this paper. Based on the proposed EWKM-RF method, the classification and feature selection of the energy consumption influencing factors can be achieved exclusively. Meanwhile, based on the EWKM-RF method and the bi-directional long-short-term memory neural network (BiLSTM) optimized by the sparrow search algorithm (SSA), an RF-SSA-BiLSTM prediction model for building energy consumption is established in this paper. As the weight, learning rate, and hidden layer node parameters of the BiLSTM neural network are optimized with the SSA, the constraints for manually adjusting parameters are avoided in the proposed prediction model. To examine the accuracy of the proposed model, energy consumption data of a civil public building in Dalian city are collected and tested. Results show the prediction error of RF-SSA-BiLSTM after feature selection is reduced by 24.55 % in high and low energy consumption months. Compared with RF-BiLSTM, RF-PSO-BiLSTM, and RF-CNN-BiLSTM, the RF-SSA-BiLSTM has strong robustness. The average MAE, RMSE and MAPE values of energy consumption prediction in the four seasons are 1.30, 1.63 and 0.02.
* Corresponding author. School of Civil Engineering and Architecture, Zhejiang Sci-Tech University, Hangzhou, 310018, China.
E-mail address: [email protected] (S. Shao).
https://doi.org/10.1016/j.energy.2023.129795
Received 4 July 2023; Received in revised form 26 September 2023; Accepted 25 November 2023
Available online 29 November 2023
0360-5442/© 2023 Elsevier Ltd. All rights reserved.
L. Lei et al. Energy 288 (2024) 129795

1. Introduction

According to the report of the International Energy Agency (IEA), the energy consumption of the construction industry accounts for more than 33 % of the total global energy consumption [1,2]. Among existing technologies, building energy consumption prediction is an effective means to improve energy efficiency, playing an important role in ensuring the smooth operation of building energy systems and formulating reasonable operation strategies for building managers [3–5].

The building energy consumption prediction methods are divided into two categories: physical modeling methods and data-driven methods [6–11]. Based on physical principles and engineering expertise, the physical modeling method analyzes building energy consumption by calculating the relationship between energy consumption influencing factors and building energy consumption. It has been widely used in common building energy simulation software, such as EnergyPlus, eQuest, DesT, and DOE-2 [12–15]. However, when detailed and accurate building and environmental input parameters cannot be obtained, the model performance will be poor [16]. In contrast, the data-driven method extracts hidden rules and patterns from a large number of historical energy consumption data, which has higher prediction accuracy and wider application scenarios. Therefore, data-driven methods are popular in the field of building energy consumption prediction.

Data-driven methods can be divided into statistical model methods [17–19] and machine learning methods [20–24]. In recent years, machine learning methods have become popular in building energy consumption predictions for high prediction accuracy and the ability to solve nonlinear problems. In big sample data sets and nonlinear data sets, the machine learning methods have higher prediction accuracy than the statistical methods. Compared with the shallow machine learning model, the deep learning model [25–30] has more layers of network structure and can extract more information when processing periodic energy consumption data. Nowadays, building energy consumption prediction models have developed from traditional physical models and statistical models to deep learning models. The deep learning model can predict energy consumption quickly and accurately, but there are a large number of hyperparameters in the model that need to be set. The setting of hyperparameters will have a direct impact on the results of the prediction model. The traditional trial and error method is less efficient in setting parameters, and so is the enumeration method. Introducing an intelligent optimization algorithm is an effective solution to solve the problem of hyperparameter setting in a prediction model. Wang et al. [31] used the improved artificial ecosystem-based optimization (AEO) algorithm to optimize the parameters of the time convolution network. Compared to traditional methods such as manual tuning and grid search, the prediction model has higher computational efficiency and prediction performance. Qiao et al. [32] used the improved Aquila Optimizer algorithm to optimize multiple parameters in the time convolution network to further improve the performance of the prediction model. Suo et al. [33] used an improved Chimp Optimization algorithm to optimize the learning rate and hidden layer of the Bi-directional Gated Recurrent Unit. The results show that the model has the best prediction performance in comparison with other benchmark models. In this study, a new sparrow search algorithm is used to optimize parameters in the energy consumption prediction model. Compared with other traditional optimization algorithms, this algorithm has the characteristics of fast convergence speed and strong global search ability.

In addition, building energy consumption data has strong periodicity and is affected by many characteristics, such as building envelope characteristics, meteorological characteristics, and occupant behavior characteristics. The influencing factors increase the computational burden of the prediction model and affect the accuracy of the prediction results. To solve this problem, feature selection is widely used in building energy consumption prediction research [34]. Qiao et al. [35] used EMD and Boruta feature selection (BFS) methods to select the features of the influencing factors. Results show that the proposed method can effectively identify the influencing factors with significant

out the important energy consumption influencing factors and reduce the dimension of the input parameters of the prediction model. Finally, the SSA is used to optimize the BiLSTM, and the SSA-BiLSTM building energy consumption prediction model is established. The prediction process is shown in Fig. 1.

2.1. Cluster analysis of building energy consumption influencing factors

EWKM is a fast and efficient high-dimensional data clustering method [37]. Compared with the hard subspace clustering method, the clustering accuracy has been improved, and it has better stability and flexibility. EWKM solves the problem that traditional clustering methods cannot reveal the potential relationship between influencing factors when dealing with energy consumption data by introducing Mahalanobis distance. After collecting the energy consumption influencing factors of n buildings, the data set $X = \{x_1, \ldots, x_j, \ldots, x_n\}$ is obtained, where $x_j = (x_{j1}, \ldots, x_{ji}, \ldots, x_{jm})$ is the m energy consumption influencing factors set of the jth building. The clustering process of building energy consumption influencing factors is: first determine the number of clusters K and the positive parameter γ, and set the initial weight of all influencing factors to 1/m. Then, the objective function F(W, Z, Λ) is calculated by updating the partition matrix W, cluster centers Z and dimension weights Λ. Finally, until F(W, Z, Λ) reaches the local minimum, the clustering results and the weight of the influencing factors are output. The expressions of the objective function F(W, Z, Λ) and the influence factor weight λki are as follows:

$$F(W, Z, \Lambda) = \sum_{k=1}^{K} \left[ \sum_{j=1}^{n} \sum_{i=1}^{m} \xi_{kj} \lambda_{ki} \left( z_{ki} - x_{ji} \right)^{2} + \tau \sum_{i=1}^{m} \lambda_{ki} \log_{10} \lambda_{ki} \right] \quad (1)$$

( )
Dki
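As a minimal sketch of how the objective in Eq. (1) can be evaluated, the Python function below computes F(W, Z, Λ) for given memberships, cluster centers, and factor weights. The function name `ewkm_objective`, the list-of-lists data layout, and the toy values are assumptions made for illustration and are not taken from the paper; the update rules for the memberships, centers, and weights are not shown.

```python
import math


def ewkm_objective(X, Z, xi, lam, tau):
    """Evaluate Eq. (1) for K clusters, n buildings and m influencing factors.

    X   : n x m data (one row per building)
    Z   : K x m cluster centers z_ki
    xi  : K x n memberships xi_kj (1 if building j belongs to cluster k, else 0)
    lam : K x m factor weights lambda_ki (each row sums to 1)
    tau : positive weight of the entropy term
    """
    total = 0.0
    for k in range(len(Z)):
        # Weighted within-cluster dispersion: sum_j sum_i xi_kj * lambda_ki * (z_ki - x_ji)^2
        dispersion = sum(
            xi[k][j] * lam[k][i] * (Z[k][i] - X[j][i]) ** 2
            for j in range(len(X))
            for i in range(len(Z[k]))
        )
        # Entropy term: sum_i lambda_ki * log10(lambda_ki), with 0 * log(0) taken as 0
        entropy = sum(w * math.log10(w) for w in lam[k] if w > 0.0)
        total += dispersion + tau * entropy
    return total


# Toy example (hypothetical values): 4 buildings, m = 2 factors, K = 2 clusters
X = [[0.0, 0.0], [0.0, 2.0], [4.0, 0.0], [4.0, 2.0]]
Z = [[0.0, 1.0], [4.0, 1.0]]
xi = [[1, 1, 0, 0], [0, 0, 1, 1]]
lam = [[0.5, 0.5], [0.5, 0.5]]  # initial weights 1/m
F = ewkm_objective(X, Z, xi, lam, tau=1.0)
print(F)
```

With uniform initial weights 1/m the entropy term is identical for every cluster, so at initialization differences between clusters come only from the weighted dispersion term.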
Fig. 3. Structure of the BiLSTM network.

$$\begin{cases} \overrightarrow{h}_t = \overrightarrow{\mathrm{LSTM}}\left(\overrightarrow{h}_{t-1}, x_t\right), & t \in [1, T] \\ \overleftarrow{h}_t = \overleftarrow{\mathrm{LSTM}}\left(\overleftarrow{h}_{t-1}, x_t\right), & t \in [T, 1] \end{cases} \quad (11)$$

$$\hat{y}_t = \mu \overrightarrow{h}_t + \nu \overleftarrow{h}_t \quad (12)$$

where $\overrightarrow{h}_t$ and $\overleftarrow{h}_t$ are the energy consumption output at the time t of the forward LSTM and the backward LSTM respectively. $\overrightarrow{h}_{t-1}$ and $\overleftarrow{h}_{t-1}$ are the energy consumption output of the forward LSTM and the backward LSTM at time t-1. The $\hat{y}_t$ is the energy consumption prediction output of the BiLSTM. The values of forward factor μ and backward factor ν are obtained by training the energy consumption data. The T is the length of the time series. The LSTM structure of each layer in the BiLSTM is shown in Fig. 4, including the forgetting gate, input gate, output gate, and unit state.
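The combination in Eqs. (11) and (12) can be illustrated with a small pure-Python sketch. It assumes a scalar LSTM cell (hidden size 1) with hand-picked weights, and fixed values for μ and ν; in the model described here these quantities are obtained by training, so every number and name below (`lstm_step`, `run_lstm`, `bilstm`, the parameter dictionary) is hypothetical. The backward pass simply runs the same recurrence over the reversed series and realigns its outputs to time order.

```python
import math


def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))


def lstm_step(x, h_prev, c_prev, p):
    """One step of a scalar LSTM cell with forget, input and output gates."""
    f = sigmoid(p["wf"] * x + p["uf"] * h_prev + p["bf"])    # forget gate
    i = sigmoid(p["wi"] * x + p["ui"] * h_prev + p["bi"])    # input gate
    o = sigmoid(p["wo"] * x + p["uo"] * h_prev + p["bo"])    # output gate
    g = math.tanh(p["wg"] * x + p["ug"] * h_prev + p["bg"])  # candidate state
    c = f * c_prev + i * g                                    # unit (cell) state
    h = o * math.tanh(c)                                      # hidden output
    return h, c


def run_lstm(xs, p):
    """Run the cell over a sequence and return the hidden output at each step."""
    h, c, outputs = 0.0, 0.0, []
    for x in xs:
        h, c = lstm_step(x, h, c, p)
        outputs.append(h)
    return outputs


def bilstm(xs, p_fwd, p_bwd, mu, nu):
    """Eq. (12): weighted sum of forward and backward hidden outputs."""
    h_fwd = run_lstm(xs, p_fwd)               # t = 1, ..., T
    h_bwd = run_lstm(xs[::-1], p_bwd)[::-1]   # t = T, ..., 1, realigned to time order
    return [mu * hf + nu * hb for hf, hb in zip(h_fwd, h_bwd)]


# Illustrative (untrained) parameters shared by both directions
params = dict(wf=0.5, uf=0.1, bf=0.0, wi=0.6, ui=0.2, bi=0.0,
              wo=0.4, uo=0.3, bo=0.0, wg=0.7, ug=0.1, bg=0.0)
series = [0.2, 0.5, 0.9, 0.5, 0.2]  # normalized toy load values
y_hat = bilstm(series, params, params, mu=0.5, nu=0.5)
print(y_hat)
```

Because the toy series is symmetric and both directions share the same parameters, the combined output is symmetric in time as well, which is a quick sanity check that the backward outputs were realigned correctly.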
Table 1
Description of variables from the original dataset.

| Category | No. | Variable | Abbreviation | Unit | Range |
|---|---|---|---|---|---|
| Meteorological parameter | 1 | Outdoor temperature | OT | °C | [26.86, 38.52] |
| | 2 | Relative humidity | RH | % | [34.1, 97.4] |
| | 3 | Wind speed | WS | m/s | [0.2, 1.2] |
| | 4 | Solar radiation | SR | W/m² | [55.6, 566.7] |
| Architectural information | 5 | Number of floors | NF | – | [4, 7] |
| | 6 | Gross floor area | GFA | m² | [3051, 9856] |
| | 7 | Building orientation | BO | – | [231, 285] |
| | 8 | Window wall ratio | WWR | % | [0.14, 0.26] |
| | 9 | Heat transfer coefficient of external walls | HTCW | W/(m²·K) | [0.6, 1.56] |
| | 10 | Shading coefficient | SC | – | [0.49, 0.96] |
| | 11 | Building length-to-width ratio | BLR | – | [1.5, 3.19] |
| | 12 | Heat transfer coefficient of roof | HTCR | W/(m²·K) | [0.5, 1] |
| Occupancy activity | 13 | Personnel density | PD | person/m² | [0.25, 1.2] |
| Building services and energy systems | 14 | Lighting power density | LPD | W/m² | [4.69, 15.9] |
| | 15 | Fresh air volume per capita | FAVPC | m³/(h·person) | [26, 52] |
| | 16 | COP | COP | – | [4.89, 6.75] |
| | 17 | Air supply temperature | AST | °C | [17.2, 26.1] |
| | 18 | Fan efficiency | FE | % | [0.55, 0.91] |
| | 19 | Water pump efficiency | WPE | % | [0.29, 0.91] |
| Indoor environment | 20 | Room temperature | RT | °C | [18.16, 26.75] |
nodes is [10, 50]. The RF-SSA-BiLSTM energy consumption prediction model is established. The main workflow of building energy consumption prediction is shown in Fig. 5.

3. Data collection

The measured data is divided into two parts. The relevant data collected from 100 civil public buildings in Dalian are applied to the cluster analysis and feature selection of energy consumption influencing factors. The energy consumption data of an experimental building in a university in Dalian are used to train and test the energy consumption prediction model.

3.1. Data collection for the selection of energy consumption influencing factors characteristics

In this study, data collection was conducted on 100 civil public buildings in Dalian. Using building electricity consumption as the building energy consumption value, a total of 100 sets of data on building electricity consumption and 20 building energy consumption influencing factors were collected. The time interval for data collection is 1 h. Twenty building energy consumption influencing factors were collected from five aspects: architectural information, meteorological parameter, occupancy activity, building services and energy systems, and indoor environment, as shown in Table 1. The architectural information is obtained according to the measured data of each civil public building, including the number of floors, gross floor area, building orientation, window wall ratio, heat transfer coefficient of external walls, shading coefficient, building length-to-width ratio, and heat transfer coefficient of the roof. The meteorological parameters were collected during the test period, including outdoor temperature, relative humidity, wind speed, and solar radiation. The building services and energy systems are measured and calculated, including lighting power density, fresh air volume per capita, COP, air supply temperature, fan efficiency, and water pump efficiency. The personnel density is based on investigation and calculation, and is used to represent the occupancy activity. The room temperature is measured during the collection period, and is used to represent the indoor environment.

3.2. Data collection of building energy consumption prediction

The electricity consumption data of an experimental building in a university in Dalian is collected as the test data of the energy consumption prediction model. The experimental building is a frame shear structure, with five floors above ground and one floor underground. The total area of the building is 9533 m², and the air conditioning area is 8580 m². From 1 January 2018 to 7 December 2018, the hourly energy consumption data of the experimental building were collected, and 8176 sets of valid data were obtained, as shown in Fig. 6 (a). Fig. 6 (b) shows the distribution of electricity consumption in different months of the experimental building. Power consumption is higher in summer and lower in spring. February, which includes the Chinese Spring Festival holiday, has the lowest electricity consumption because fewer people are on campus. Fig. 7 is the hourly meteorological data of the area where the experimental building is located. In the short-term prediction, the energy consumption data of the last week of the first month of each quarter is used to test the prediction model, and the remaining data is used as the training set. The ratio of the training set to the test set is about 3:1. In the

Fig. 6. Dalian experimental building electricity consumption: (a) Hourly energy consumption; (b) Monthly electricity consumption.
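One reading of the short-term split that is consistent with the stated ratio of about 3:1 is that, for each quarter, the hourly records of its first month are used and the last week of that month is held out for testing. The sketch below illustrates this reading only. It assumes calendar quarters starting in January, April, July, and October, takes "last week" to mean the final seven days of the month, and generates a complete hourly grid with no missing records; the function name `split_first_month` is hypothetical.

```python
import calendar
from datetime import datetime, timedelta


def split_first_month(year, month, test_days=7):
    """Split the hourly timestamps of one month into training and test parts.

    The final `test_days` days form the test set; the earlier days of the
    same month form the training set (assumed reading of the split).
    """
    n_days = calendar.monthrange(year, month)[1]
    start = datetime(year, month, 1)
    stamps = [start + timedelta(hours=h) for h in range(24 * n_days)]
    cutoff = start + timedelta(days=n_days - test_days)
    train = [t for t in stamps if t < cutoff]
    test = [t for t in stamps if t >= cutoff]
    return train, test


# First month of each calendar quarter in 2018 (assumed quarter definition)
splits = {month: split_first_month(2018, month) for month in (1, 4, 7, 10)}
for month, (train, test) in splits.items():
    print(month, len(train), len(test), round(len(train) / len(test), 2))
```

Under these assumptions each test set holds 168 hourly records, and the train-to-test ratio lies slightly above 3:1 for every quarter.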
Fig. 7. Meteorological parameters of Dalian in 2018: (a) hourly outdoor air temperature, (b) hourly relative humidity, (c) hourly solar horizontal radiation, (d)
hourly wind speed.
Fig. 12. RF-SSA-BiLSTM and SSA-BiLSTM short-term prediction results: (a) Low energy consumption season, (b) High energy consumption season, (c) 80–100 h in
high energy consumption season.
Fig. 13. The prediction errors of RF-SSA-BiLSTM and SSA-BiLSTM: (a) Low energy consumption season, (b) High energy consumption season.
4.3. Building energy consumption prediction

The annual electricity consumption data of an experimental building in a university in Dalian are collected in this study to conduct short-term prediction and medium-term prediction, respectively.

4.3.1. Short-term prediction of building energy consumption

4.3.1.1. Impact of feature selection on building energy consumption prediction. To evaluate the impact of the feature selection method used in this study on the performance of the building energy consumption prediction model, the SSA-BiLSTM prediction model with feature selection (i.e., RF-SSA-BiLSTM) and the SSA-BiLSTM prediction model without feature selection are used for short-term prediction of energy consumption. By comparing the prediction results of the low energy consumption season represented by spring and the high energy consumption season represented by summer, the prediction performance and reliability of the building energy consumption prediction model under different energy consumption levels are evaluated. The short-term energy consumption prediction results of the RF-SSA-BiLSTM model and the SSA-BiLSTM model are shown in Fig. 12. Fig. 12 (a) shows the prediction results of the two prediction models in spring, in which the trend of the energy consumption prediction curve is consistent with the actual energy consumption curve. Fig. 12 (b) shows the prediction results of the two prediction models in summer. At the peak and trough of energy consumption, the predicted energy consumption curve is poorly fitted with the actual energy consumption curve. To make the comparison between the prediction models clearer and more intuitive, the prediction results from the 80th hour to the 100th hour in summer are enlarged, as shown in Fig. 12 (c). It can be seen from Fig. 12 (c) that there is a significant deviation between the predicted value and the actual value of the peak energy consumption of the two prediction models from the 86th hour to the 90th hour. The prediction error between the prediction results of the two models and the actual energy consumption is shown in Fig. 13. Fig. 13 (a) is the prediction error curve in spring, and Fig. 13 (b) is the prediction error curve in summer. The prediction error of the RF-SSA-BiLSTM model is smaller than that of the SSA-BiLSTM model, and it is more stable as a whole, showing better calculation accuracy. The results show that the prediction model can capture the key influencing factors more accurately by feature selection of energy consumption influencing factors, avoid the interference of redundant influencing factors on the prediction model, achieve more stable prediction, and improve the reliability and accuracy of the prediction model.

To further evaluate the prediction performance of the RF-SSA-BiLSTM model and the SSA-BiLSTM model, the MAE, RMSE, and MAPE of the two models were calculated and compared in Table 2. In the low energy consumption season, the MAE, RMSE and MAPE indexes of the RF-SSA-BiLSTM model with the feature selection were 1.055, 1.270, and 0.021, respectively. Compared with the SSA-BiLSTM model without the feature selection, the three indexes decreased by about 36.1 %, 31.4 %, and 40 %, respectively. In the high energy consumption season, the MAE, RMSE, and MAPE indexes of the RF-SSA-BiLSTM prediction model were 1.544, 1.846, and 0.026, respectively. The MAE, RMSE, and MAPE indexes of the SSA-BiLSTM prediction model were 1.776, 2.237, and 0.028, respectively. The three indexes decreased by about 13.0 %, 17.5 %, and 7.1 %, respectively. Results show that the feature selection method can effectively simplify the energy consumption data samples. The RF-SSA-BiLSTM prediction model with feature selection performs better than the SSA-BiLSTM prediction model and predicts building energy consumption more accurately.

Table 2
Quantitative evaluation of the RF-SSA-BiLSTM model and SSA-BiLSTM model.
Model | Low energy consumption season | High energy consumption season

Table 3
Comparison of calculation time between RF-SSA-BiLSTM and SSA-BiLSTM.

| Model | The number of input impact factors | The time of calculation |
|---|---|---|
| RF-SSA-BiLSTM | 13 | 73.78 s |
| SSA-BiLSTM | 20 | 142.4 s |
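For readers who want to reproduce the three indexes, the sketch below uses the standard definitions of MAE, RMSE, and MAPE. It is an assumption that these match the definitions used in this study, since the formulas are not given in this part of the text; MAPE is returned as a fraction because the reported values are of the order of 0.02. The sample values are made up for illustration.

```python
import math


def mae(actual, predicted):
    """Mean absolute error."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def rmse(actual, predicted):
    """Root mean square error."""
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))


def mape(actual, predicted):
    """Mean absolute percentage error as a fraction (actual values must be nonzero)."""
    return sum(abs((a - p) / a) for a, p in zip(actual, predicted)) / len(actual)


# Hypothetical hourly consumption values, not data from the study
actual = [10.0, 20.0, 30.0]
predicted = [12.0, 18.0, 33.0]
scores = (mae(actual, predicted), rmse(actual, predicted), mape(actual, predicted))
print(scores)
```

MAE and RMSE carry the unit of the energy data, while MAPE is dimensionless, which is why the three indexes differ so much in magnitude in the tables.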
4.3.1.3. Performance comparison and analysis of optimization methods in prediction models. In the short-term prediction of building energy consumption, there are 8176 valid data sets, which are divided into test data and training data by quarter. The 13 influencing factors after feature selection are used as the input values of the RF-SSA-BiLSTM, the RF-

Table 4

| Model | Index | Spring | Summer | Autumn | Winter |
|---|---|---|---|---|---|
| RF-SSA-BiLSTM | MAE | 1.06 | 1.54 | 1.12 | 1.47 |
| | RMSE | 1.27 | 1.85 | 1.50 | 1.91 |
| | MAPE | 0.02 | 0.03 | 0.02 | 0.02 |
| RF-BiLSTM | MAE | 2.40 | 3.23 | 2.43 | 3.20 |
| | RMSE | 2.82 | 3.45 | 3.06 | 3.57 |
| | MAPE | 0.07 | 0.06 | 0.08 | 0.06 |
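The relative improvement of RF-SSA-BiLSTM over RF-BiLSTM can be checked directly from the seasonal values reported for the two models (Table 4). The short sketch below does this with the rounded table entries; the helper name `reduction_pct` and the dictionary layout are illustrative choices, and small differences from percentages computed on unrounded results are to be expected.

```python
SEASONS = ("Spring", "Summer", "Autumn", "Winter")

# Values transcribed from the table, in the season order given above
rf_ssa_bilstm = {
    "MAE": [1.06, 1.54, 1.12, 1.47],
    "RMSE": [1.27, 1.85, 1.50, 1.91],
    "MAPE": [0.02, 0.03, 0.02, 0.02],
}
rf_bilstm = {
    "MAE": [2.40, 3.23, 2.43, 3.20],
    "RMSE": [2.82, 3.45, 3.06, 3.57],
    "MAPE": [0.07, 0.06, 0.08, 0.06],
}


def reduction_pct(baseline, improved):
    """Relative reduction of an error index, in percent of the baseline."""
    return 100.0 * (baseline - improved) / baseline


reductions = {
    index: [reduction_pct(b, s) for b, s in zip(rf_bilstm[index], rf_ssa_bilstm[index])]
    for index in rf_ssa_bilstm
}
mean_mae_reduction = sum(reductions["MAE"]) / len(SEASONS)
best_mape_season = SEASONS[reductions["MAPE"].index(max(reductions["MAPE"]))]

for index, values in reductions.items():
    print(index, [round(v, 1) for v in values])
print(round(mean_mae_reduction, 1), best_mape_season)
```

On the rounded entries, the MAE reduction averages about 54 % over the four seasons and the largest MAPE reduction, 75 %, occurs in autumn.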
of the first month of each quarter. The prediction results are shown in Fig. 14.

The MAE, RMSE, and MAPE are used to evaluate the performance improvement of the BiLSTM neural network optimized by the SSA. The results of MAE, RMSE, and MAPE of the RF-SSA-BiLSTM and the RF-BiLSTM prediction models are compared in Table 4. It can be seen that the MAE index of RF-SSA-BiLSTM in the four seasons decreased by 54 % on average, the RMSE index decreased by between 46.3 % and 55 %, and the MAPE index decreased to varying degrees, especially in the autumn forecast, where it achieved a maximum reduction of 75 %. The prediction performance of the model optimized by the sparrow search algorithm is improved, and the prediction result is closer to the real energy consumption. Fig. 15 intuitively shows the index values of the RF-SSA-BiLSTM model and the RF-BiLSTM model. The area of the approximate diamond enclosed by the index values can be used to compare the prediction performance of the model in four quarters. As shown in Fig. 15 (a), the area enclosed by the MAE index of the RF-SSA-BiLSTM prediction model in four seasons is much smaller than that of the RF-BiLSTM model. The results show that the SSA improves the processing ability of the RF-SSA-BiLSTM model for complex building energy consumption data by adjusting the parameters of the RF-BiLSTM neural network, and the prediction accuracy is significantly improved. When dealing with the energy consumption data sets of different seasons, the burden of manual parameter adjustment is effectively reduced by optimizing the parameters, so that the model can achieve a higher accuracy level.

Fig. 15. Energy consumption prediction results of RF-SSA-BiLSTM and RF-BiLSTM: (a) MAE, (b) RMSE, (c) MAPE.

The calculation results of MAE, RMSE, and MAPE indexes of the RF-SSA-BiLSTM, the RF-PSO-BiLSTM, and the RF-CNN-BiLSTM prediction models are shown in Table 5 and Fig. 16. By comparing the MAE indexes of the RF-SSA-BiLSTM model and the RF-PSO-BiLSTM model in Table 5, it can be seen that in spring and autumn prediction, the MAE values of the RF-SSA-BiLSTM model are 1.06 and 1.12, which are 44.8 % and 49.5 % lower than those of the RF-PSO-BiLSTM model. In summer and winter prediction, the RMSE values of the RF-SSA-BiLSTM model are 1.85 and 1.91, which are 9.31 % and 16.96 % lower than those of the RF-PSO-BiLSTM model. The area enclosed by the three indicators of the RF-SSA-BiLSTM model in each season is smaller than that of the RF-PSO-BiLSTM model in Fig. 16, indicating that the RF-SSA-BiLSTM model has better prediction performance than the RF-PSO-BiLSTM model. Compared with the classical PSO algorithm, the SSA has a better effect on search accuracy, convergence speed, and stability. The strong global search ability of the SSA avoids the problem that the algorithm falls into the local optimum in the process of searching for BiLSTM parameters. Therefore, the RF-SSA-BiLSTM model can more accurately capture the characteristics and long-term trend of building energy consumption data than the RF-PSO-BiLSTM model, and achieve more accurate prediction.

By comparing the prediction performance of the RF-SSA-BiLSTM model and the RF-CNN-BiLSTM model in Table 5, it can be seen that the prediction accuracy of the RF-SSA-BiLSTM model and the RF-CNN-BiLSTM model is similar in summer. However, in the other three seasons, the RF-SSA-BiLSTM model has better prediction performance. Taking the autumn energy consumption prediction as an example, it can be seen from Table 5 that the RMSE index of the RF-SSA-BiLSTM model optimized by parameters is 1.5, while the RMSE index of the RF-CNN-BiLSTM model is 2.16. In addition, the area enclosed by the MAPE index of the RF-SSA-BiLSTM model in Fig. 16 (c) is smaller than that of the RF-CNN-BiLSTM model, indicating that the mean absolute percentage error between the predicted value and the true value of the RF-SSA-BiLSTM model in the four seasons is lower than that of the RF-CNN-BiLSTM model. The RF-SSA-BiLSTM model and the RF-CNN-BiLSTM model are two prediction models with different structures. Although the RF-CNN-BiLSTM model introduces a convolutional neural network as a feature extractor for building energy consumption influencing factors, in the process of feature extraction, it may lose the global information of energy consumption data, resulting in a decline in prediction performance. In contrast, the RF-SSA-BiLSTM model focuses on parameter adjustment and model fitting capabilities, focusing on the relationship between energy consumption data before and after the time series. Therefore, the performance of the RF-SSA-BiLSTM model is better than that of the RF-CNN-BiLSTM model, and it has better calculation accuracy.

Table 5
Prediction results of RF-SSA-BiLSTM, RF-PSO-BiLSTM and RF-CNN-BiLSTM.

| Model | Index | Spring | Summer | Autumn | Winter |
|---|---|---|---|---|---|
| RF-SSA-BiLSTM | MAE | 1.06 | 1.54 | 1.12 | 1.47 |
| | RMSE | 1.27 | 1.85 | 1.50 | 1.91 |
| | MAPE | 0.02 | 0.03 | 0.02 | 0.02 |
| RF-PSO-BiLSTM | MAE | 1.92 | 1.56 | 2.22 | 1.63 |
| | RMSE | 2.41 | 2.04 | 2.69 | 2.30 |
| | MAPE | 0.06 | 0.04 | 0.06 | 0.03 |
| RF-CNN-BiLSTM | MAE | 1.45 | 1.53 | 1.79 | 1.81 |
| | RMSE | 1.67 | 2.02 | 2.16 | 2.30 |
| | MAPE | 0.03 | 0.03 | 0.04 | 0.04 |

Fig. 16. Energy consumption prediction results of RF-SSA-BiLSTM, RF-PSO-BiLSTM and RF-CNN-BiLSTM: (a) MAE, (b) RMSE, (c) MAPE.

Fig. 18. Mid-term prediction results of building energy consumption: (a) Prediction results of the four models, (b) Absolute error of medium-term prediction.

4.3.1.4. Robustness analysis of prediction model. By comparing the accuracy of the prediction model under different data availability, the robustness of the RF-SSA-BiLSTM prediction model is evaluated. Robustness means that the prediction model can maintain stable and reliable prediction performance in the face of outliers or incomplete data. Fig. 17 shows the trend of prediction accuracy of the four prediction models under different data availability. When the availability of energy consumption data changes, the prediction results are always relatively stable, indicating that the prediction model is more robust. With the increased data availability, the CVRMSE of the RF-SSA-BiLSTM prediction model is stable at about 3 %. In contrast, the RF-BiLSTM model has the highest sensitivity to the amount of training data, and its CVRMSE maximum difference is about 2 %. In addition, the RF-CNN-BiLSTM and RF-PSO-BiLSTM models have similar robustness,
but the CVRMSE values of the two models are higher than those of the (1) EWKM clustering method coupled with a random forest algo
RF-SSA-BiLSTM model. In summary, when the availability of energy rithm can effectively select the important factors in building en
consumption data changes, the prediction model that optimizes the ergy consumption. In comparison with the SSA-BiLSTM model,
BiLSTM network parameters through the sparrow search algorithm can the prediction accuracy of the RF-SSA-BiLSTM model after
still obtain accurate and stable prediction results. Among the four feature selection is significantly improved. It shows that the
models, RF-SSA-BiLSTM has the best robustness. feature selection can improve the accuracy and reliability of the
prediction model and play an important role in building energy
4.3.2. Mid-term prediction of building energy consumption consumption prediction.
In the mid-term prediction of building energy consumption, there are (2) After optimizing the BiLSTM neural network with the SSA algo
341 sets of valid data, of which 311 sets are used for training and 30 sets rithm, the prediction performances of the model are effectively
are used for testing. To further analyze the performance of the proposed improved. The RF-SSA-BiLSTM model shows excellent prediction
RF-SSA-BiLSTM model in the medium-term prediction, the hourly en ability when processing data in different seasons. The proposed
ergy consumption data of an experimental building are processed into model avoids the burden of manual parameter adjustment, re
daily energy consumption data, and the June data of high energy con alizes automatic parameter adjustment, and improves the
sumption are selected for 30 days of medium-term prediction. The mid- generalization ability and practicability of the model.
term prediction results of the RF-SSA-BiLSTM model and RF-BiLSTM, (3) The RF-SSA-BiLSTM model can predict building energy con
RF-PSO-BiLSTM, and RF-CNN-BiLSTM models are shown in Fig. 18 sumption at different time scales. The proposed model not only
(a). The prediction results of the RF-SSA-BiLSTM model have the best provides strong support for decision-making and improves the
fitting degree with the actual energy consumption. The energy con efficiency of building energy consumption, but also reveals the
sumption prediction of the RF-BiLSTM model on the 4th, 12th, and 25th impact of building energy consumption influencing factors on
days has an obvious deviation from the actual situation. The prediction electricity consumption through cluster analysis and feature se
results of the RF-PSO-BiLSTM model on the 9th day have a large error lection. It helps managers identify and evaluate different influ
with the actual energy consumption, and the prediction results of other encing factors and take corresponding management measures to
days are acceptable. In the building energy consumption prediction from achieve sustainable building operation and management.
the 25th day to the 30th day, the four models all showed different de
grees of error fluctuation, but the error fluctuation of the RF-SSA- In future research, the decomposition techniques will be applied to
BiLSTM model was the smallest, and the predicted value was closer to process the data to optimize the energy consumption model. The
the actual energy consumption. To more clearly compare the prediction building energy consumption prediction performance for different
accuracy of the four models in the medium term of building energy geographical locations and different types of buildings under various
consumption, the needle diagram is used to analyze the prediction error feature selection methods also can be further explored.
of each model, as shown in Fig. 18 (b). The prediction error of the RF-
BiLSTM model on individual days is far greater than that of other models, and the fluctuation is large. In contrast, the prediction error of the RF-SSA-BiLSTM model changes little and the overall performance is relatively stable. The average relative errors of the RF-SSA-BiLSTM, RF-BiLSTM, RF-PSO-BiLSTM, and RF-CNN-BiLSTM models are 1.82 %, 3.23 %, 2.52 %, and 2.3 %, respectively. Results show that the RF-SSA-BiLSTM model has the highest prediction accuracy of the four models and is suitable for building energy consumption prediction.

In this study, EWKM and RF are coupled to realize the feature selection of building energy consumption influencing factors, and a prediction model that uses the sparrow search algorithm to optimize BiLSTM is established, which provides a novel approach for building energy consumption prediction. However, the universality and robustness of the proposed method need to be further verified for different geographical locations and different types of buildings.

5. Conclusions

The RF-SSA-BiLSTM hybrid model based on deep learning is proposed for building energy consumption prediction in this study. Firstly, the EWKM clustering algorithm is used to cluster the influencing factors of building energy consumption, while the DBI is used to determine the optimal number of clusters. Secondly, the feature importance ranking method of the random forest algorithm is used to select the energy consumption influencing factors. Finally, the SSA is used to optimize the BiLSTM neural network to predict building energy consumption in the short and medium term. Taking the annual electricity consumption of an experimental building in Dalian as the research object, the prediction performance of the model established in this paper is compared with that of the RF-BiLSTM, RF-PSO-BiLSTM, and RF-CNN-BiLSTM models. The following conclusions are drawn:

CRediT authorship contribution statement

Lei Lei: Funding acquisition, Project administration, Supervision, Conceptualization, Investigation, Data curation, Software, Validation, Writing - original draft, Resources, Formal analysis, Methodology. Suola Shao: Investigation, Supervision, Data curation, Validation, Writing - original draft. Lixia Liang: Investigation, Supervision.

Declaration of competing interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Data availability

Data will be made available on request.

Acknowledgments

This research was supported by Zhejiang Province “spearhead” “bellwether” research and development project “The key technology and equipment development for low-carbon buildings” (Grant No.: 2023C03153).

References

[1] Ye Y, Zuo W, Wang G. A comprehensive review of energy-related data for US commercial buildings. Energy Build 2019;186:126–37.
[2] Cheng YL, Lim MH, Hui KH. Impact of internet of things paradigm towards energy consumption prediction: a systematic literature review. Sustain Cities Soc 2022;78:103624.
[3] Yoshino H, Hong T, Nord N. IEA EBC annex 53: total energy use in buildings-Analysis and evaluation methods. Energy Build 2017;152:124–36.
[4] Wang Z, Srinivasan RS. A review of artificial intelligence based building energy use prediction: contrasting the capabilities of single and ensemble prediction models. Renew Sustain Energy Rev 2017;75:796–808.
[5] Karijadi I, Chou SY. A hybrid RF-LSTM based on CEEMDAN for improving the accuracy of building energy consumption prediction. Energy Build 2022;259:111908.
[6] Amasyali K, El-Gohary NM. A review of data-driven building energy consumption prediction studies. Renew Sustain Energy Rev 2018;81:1192–205.
[7] Tamer T, Dino IG, Akgül CM. Data-driven, long-term prediction of building performance under climate change: building energy demand and BIPV energy generation analysis across Turkey. Renew Sustain Energy Rev 2022;162:112396.
[8] Li X, Chen S, Li H, et al. A behavior-orientated prediction method for short-term energy consumption of air-conditioning systems in buildings blocks. Energy 2023;263:125940.
[9] Zhao D, Zhong M, Zhang X, et al. Energy consumption predicting model of VRV (Variable refrigerant volume) system in office buildings based on data mining. Energy 2016;102:660–8.
[10] Wang L, Kubichek R, Zhou X. Adaptive learning based data-driven models for predicting hourly building energy use. Energy Build 2018;159:454–61.
[11] Qiao Q, Yunusa-Kaltungo A, Edwards RE. Towards developing a systematic knowledge trend for building energy consumption prediction. J Build Eng 2021;35:101967.
[12] Crawley DB, Lawrie LK, Winkelmann FC, et al. EnergyPlus: creating a new-generation building energy simulation program. Energy Build 2001;33(4):319–31.
[13] Zhu Y. Applying computer-based simulation to energy auditing: a case study. Energy Build 2006;38(5):421–8.
[14] Yan D, Xia J, Tang W, et al. DeST-An integrated building simulation toolkit Part I: fundamentals//Building Simulation, vol. 1. Tsinghua Press; 2008. p. 95–110.
[15] Raoufi K, Wisthoff AK, DuPont BL, et al. A questionnaire-based methodology to assist non-experts in selecting sustainable engineering analysis methods and software tools. J Clean Prod 2019;229:528–41.
[16] Cuerda E, Guerra-Santin O, Sendra JJ, et al. Understanding the performance gap in energy retrofitting: measured input data for adjusting building simulation models. Energy Build 2020;209:109688.
[17] Fang T, Lahdelma R. Evaluation of a multiple linear regression model and SARIMA model in forecasting heat demand for district heating system. Appl Energy 2016;179:544–52.
[18] Zheng Z, Chen H, Luo X. A Kalman filter-based bottom-up approach for household short-term load forecast. Appl Energy 2019;250:882–94.
[19] Bourdeau M, Zhai XQ, Nefzaoui E, et al. Modeling and forecasting building energy consumption: a review of data-driven techniques. Sustain Cities Soc 2019;48:101533.
[20] Wei Y, Zhang X, Shi Y, et al. A review of data-driven approaches for prediction and classification of building energy consumption. Renew Sustain Energy Rev 2018;82:1027–47.
[21] Deb C, Lee SE, Santamouris M. Using artificial neural networks to assess HVAC related energy saving in retrofitted office buildings. Sol Energy 2018;163:32–44.
[22] Zhong H, Wang J, Jia H, et al. Vector field-based support vector regression for building energy consumption prediction. Appl Energy 2019;242:403–14.
[23] Shao M, Wang X, Bu Z, et al. Prediction of energy consumption in hotel buildings via support vector machines. Sustain Cities Soc 2020;57:102128.
[24] Faiq M, Tan KG, Liew CP, et al. Prediction of energy consumption in campus buildings using long short-term memory. Alex Eng J 2023;67:65–76.
[25] Rahman A, Srikumar V, Smith AD. Predicting electricity consumption for commercial and residential buildings using deep recurrent neural networks. Appl Energy 2018;212:372–85.
[26] Kim TY, Cho SB. Predicting residential energy consumption using CNN-LSTM neural networks. Energy 2019;182:72–81.
[27] Lei L, Chen W, Wu B, et al. A building energy consumption prediction model based on rough set theory and deep learning algorithms. Energy Build 2021;240:110886.
[28] Ding Z, Chen W, Hu T, et al. Evolutionary double attention-based long short-term memory model for building energy prediction: case study of a green building. Appl Energy 2021;288:116660.
[29] Shi X, Huang G, Hao X, et al. Sliding window and dual-channel CNN (SWDC-CNN): a novel method for synchronous prediction of coal and electricity consumption in cement calcination process. Appl Soft Comput 2022;129:109520.
[30] Jang J, Han J, Leigh SB. Prediction of heating energy consumption with operation pattern variables for non-residential buildings using LSTM networks. Energy Build 2022;255:111647.
[31] Wang Y, Zhang C, Fu Y, et al. Hybrid solar radiation forecasting model with temporal convolutional network using data decomposition and improved artificial ecosystem-based optimization algorithm. Energy 2023:128171.
[32] Qiao X, Peng T, Sun N, et al. Metaheuristic evolutionary deep learning model based on temporal convolutional network, improved aquila optimizer and random forest for rainfall-runoff simulation and multi-step runoff prediction. Expert Syst Appl 2023:120616.
[33] Suo L, Peng T, Song S, et al. Wind speed prediction by a swarm intelligence based deep learning model via signal decomposition and parameter optimization using improved chimp optimization algorithm. Energy 2023;276:127526.
[34] Wang Z, Xia L, Yuan H, et al. Principles, research status, and prospects of feature engineering for data-driven building energy prediction: a comprehensive review. J Build Eng 2022:105028.
[35] Qiao Q, Yunusa-Kaltungo A, Edwards RE. Developing a machine learning based building energy consumption prediction approach using limited data: Boruta feature selection and empirical mode decomposition. Energy Rep 2023;9:3643–60.
[36] Ding Y, Fan L, Liu X. Analysis of feature matrix in machine learning algorithms to predict energy consumption of public buildings. Energy Build 2021;249:111208.
[37] Jing L, Ng MK, Huang JZ. An entropy weighting k-means algorithm for subspace clustering of high-dimensional sparse data. IEEE Trans Knowl Data Eng 2007;19(8):1026–41.
[38] Breiman L. Random forests. Mach Learn 2001;45:5–32.
[39] Genuer R, Poggi JM, Tuleau-Malot C. Variable selection using random forests. Pattern Recogn Lett 2010;31(14):2225–36.
[40] Xue J, Shen B. A novel swarm intelligence optimization approach: sparrow search algorithm. Systems Science & Control Engineering 2020;8(1):22–34.
[41] Graves A, Schmidhuber J. Framewise phoneme classification with bidirectional LSTM and other neural network architectures. Neural Network 2005;18(5–6):602–10.