
Physical Energy and Data-Driven Models in Building Energy Prediction

The document reviews different approaches for building energy prediction including physical models, data-driven machine learning models, and hybrid models. Physical models use building simulation tools to model energy flows. Data-driven models apply machine learning algorithms like regression, random forest, and neural networks to predict energy usage from historical data. Hybrid models combine physical and data aspects. The review compares the methods and discusses their applications and limitations to guide future energy prediction research.

Energy Reports 8 (2022) 2656–2671

Contents lists available at ScienceDirect

Energy Reports
journal homepage: www.elsevier.com/locate/egyr

Review article

Physical energy and data-driven models in building energy prediction: A review
Yongbao Chen a,∗, Mingyue Guo b, Zhisen Chen b, Zhe Chen a,∗, Ying Ji c

a School of Energy and Power Engineering, University of Shanghai for Science and Technology, Shanghai 200093, China
b School of Mechanical and Energy Engineering, Tongji University, Shanghai 201804, China
c Faculty of Architecture, Civil and Transportation Engineering, Beijing University of Technology, Beijing 100124, China

Article info

Article history:
Received 3 June 2021
Received in revised form 15 January 2022
Accepted 19 January 2022
Available online 10 February 2022

Keywords:
Building energy modeling
Load prediction
Machine learning
Building energy simulation
Time series forecasting

Abstract

The difficulty in balancing energy supply and demand is increasing due to the growth of diversified and flexible building energy resources, particularly the rapid development of intermittent renewable energy being added into the power grid. The accuracy of building energy consumption prediction is of top priority for the electricity market management to ensure grid safety and reduce financial risks. The accuracy and speed of load prediction are fundamental prerequisites for different objectives such as long-term planning and short-term optimization of energy systems in buildings and the power grid. The past few decades have seen the impressive development of time series load forecasting models focusing on different domains and objectives. This paper presents an in-depth review and discussion of building energy prediction models. Three widely used prediction approaches, namely, building physical energy models (i.e., white box), data-driven models (i.e., black box), and hybrid models (i.e., grey box), were classified and introduced. The principles, advantages, limitations, and practical applications of each model were investigated. Based on this review, the research priorities and future directions in the domain of building energy prediction are highlighted. The conclusions drawn in this review could guide the future development of building energy prediction, and therefore facilitate the energy management and efficiency of buildings.

© 2022 The Authors. Published by Elsevier Ltd. This is an open access article under the CC BY license (http://creativecommons.org/licenses/by/4.0/).

Contents

1. Introduction
    1.1. Literature reviews
    1.2. Objectives and structure of the review
2. Building physical energy models – “white box”
    2.1. EnergyPlus
    2.2. TRNSYS
    2.3. Dymola
    2.4. Other tools
    2.5. Discussion on building physical energy models
3. Data-driven models using machine learning algorithms – “black box”
    3.1. Linear regression (LR)
    3.2. Support vector machine (SVM)
    3.3. Random forest (RF)
    3.4. Extreme gradient boosting (XGBoost) and lightGBM
    3.5. Artificial neural network (ANN)
    3.6. Recurrent neural network (RNN)
    3.7. Other models
    3.8. Discussion of data-driven models
4. Hybrid models – “grey box”
    4.1. Resistance–capacitance (RC) thermal network

∗ Corresponding authors.
E-mail addresses: [email protected] (Y. Chen), [email protected] (Z. Chen).

https://doi.org/10.1016/j.egyr.2022.01.162
2352-4847/© 2022 The Authors. Published by Elsevier Ltd. This is an open access article under the CC BY license (http://creativecommons.org/licenses/by/4.0/).
Y. Chen, M. Guo, Z. Chen et al. Energy Reports 8 (2022) 2656–2671

    4.2. Other models
    4.3. Discussion of hybrid models
5. Advantages and limitations of each prediction approach
6. Conclusion
Declaration of competing interest
Acknowledgments
Appendix
References

Nomenclature

AC Air conditioning
ANN Artificial neural networks
BIM Building information model
CV Coefficient of variation
CV-RMSE Coefficient of variation of root mean square error
DNN Deep neural networks
DR Demand response
ELM Extreme learning machine
EMS Energy management system
ENMIM Evolutionary neural machine inference model
GMM Gaussian mixture model
GPR Gaussian process regression
GRU Gated recurrent unit
HVAC Heating, ventilation, and air-conditioning
kNN k-nearest neighbors
LMSR Linear model using stepwise regression
LR Linear regression
LS-SVM Least-square support vector machine
LSTM Long short-term memory
LTLF Long-term load forecasting
MAE Mean absolute error
MAPE Mean absolute percentage error
MARS Multivariate adaptive regression splines
MLP Multilayer perceptron
NARM Nonlinear autoregressive model
RC Resistance capacitance
RF Random forest
RNN Recurrent neural networks
STLF Short-term load forecasting
SVM Support vector machine
SVR Support vector regression
XGBoost Extreme gradient boosting

1. Introduction

1.1. Literature reviews

The building sector is a major energy consumer in the world, accounting for 39% of the world’s total energy consumption according to a statistical study (Somu et al., 2020). To reduce building energy consumption, improve energy efficiency, and increase the proportion of renewable energy utilization, building energy prediction plays a critical role not only in building energy systems planning and optimization (Zhou and Zheng, 2020; Fan et al., 2017) but also in building renewable energy penetration (Salkuti, 2019; Ahmad et al., 2020). Buildings can be energy consumers and producers simultaneously. In this situation, the main challenge is to match the intermittent renewable energies with energy supply and demand management in place and time. Accurate and fast energy consumption prediction can help to achieve the goals of evaluating new building design alternatives and optimizing energy systems. For instance, in the design phase, the forecasting of building load is the basis of energy system selection, for example, the selection of the size and type of air conditioning (AC). In addition, with the rapid development of renewable energy, the application of energy management strategies such as demand response (DR) has been deemed to be a promising way to balance the power supply and demand in the grid (Chen et al., 2018, 2019). In the domain of building DR, a fair and accurate load baseline, which commonly predicts the hour-ahead load of DR, is the key factor for the stakeholders to determine whether to implement the DR program.

With the different needs in practical building programs, there are two types of common prediction models. One is short-term load forecasting (STLF) and the other is long-term load forecasting (LTLF). STLF aims to estimate the load of the next seconds up to the next two weeks, while LTLF focuses on months and longer periods (Hong and Fan, 2016). Fig. 1 shows the applications of STLF and LTLF. Commonly, DR and system operational optimization require fast computational iterations in the control algorithms; hence, STLF is suitable. For system planning and energy policy-making, energy supply and demand conditions in the future should also be considered, and thus, LTLF is usually implemented.

Regardless of the STLF or LTLF method, many efforts have been made in recent decades by numerous scientists and engineers to develop energy consumption prediction approaches. These efforts can be categorized into three types. First, the building physical energy model, also called “white box”, is based on detailed building parameters and heat balance equations. Commonly used building physical energy simulation tools such as EnergyPlus (U.S. Department of Energy, 2021), Dymola (Anon, 2021b,c), and TRNSYS (Anon, 2021d) are introduced in this paper. Second, the data-driven model, called “black box”, is based on historical operational big data and machine learning algorithms, which include support vector regression (SVR) (Chen et al., 2017), random forest (RF) (Dudek, 2015), extreme gradient boosting (XGBoost) (Butch, 2020), and artificial neural networks (ANN) (Abu-Et-Magd and Findla, 2003), among other techniques. Lastly, the hybrid model, called “grey box”, combines building physical information with historical data sources (Somu et al., 2020; Dong et al., 2016).

In the domain of the physical model, the zonal and nodal approaches have been reviewed by Foucquier et al. (2013), and we refer readers to their paper. Hence, in this study, we mainly focused on the advantages and disadvantages of other aspects of the simulation tools. Usually, the prediction accuracy of the physical models is higher than that of the statistical models (Mazzeo et al., 2020). However, developing detailed physical energy models for each building is a tiresome task. Therefore, a data-driven model is an alternative owing to the rapid development of big data technologies such as sub-metering and smart buildings, and it has gained increasing popularity in
recent years. Deep learning is a particularly successful example (Sun et al., 2020; Gassar et al., 2019). In the data-driven model domain, two main factors, i.e., feature importance and algorithm selection, were considered in the previous literature. The input feature variables, including external and internal factors, are the key elements for the prediction performance of the algorithms (Zhang and Wen, 2019a; Luo et al., 2020). Although the data-driven model has the merit of requiring less building information to develop the model, the prediction performance is unstable, especially when the model is applied to other building cases. In addition, hybrid models have been developed simultaneously to improve the prediction performance by integrating the advantages of physical and data-driven models.

Fig. 1. Classification and applications of STLF and LTLF.

There are several review papers about building energy prediction (Foucquier et al., 2013; Wang and Srinivasan, 2017a; Amasyali and El-Gohary, 2018). However, there are still two research gaps. First, most of the review papers focused on data-driven models using machine learning algorithms (i.e., “black box” model). Despite the importance of these review efforts, physical and hybrid models are also important and sufficiently well developed that they should be included and discussed further. Second, review studies that cover overall building energy consumption prediction research in terms of different prediction spans (i.e., STLF and LTLF) are still insufficient. Such a review is essential for building owners to select an appropriate prediction model. In contrast to the published review papers, the novelty of this paper is to elaborate on building energy prediction models based on prediction span. In the field of building energy prediction, the prediction span is diverse according to practical engineering needs. Different models perform differently across tasks with various prediction spans. In this review study, three types of methods in different prediction spans (i.e., STLF and LTLF) were investigated. The principle, advantages, limitations, and practical applications of each method were investigated. In summary, this paper paves the way for a better understanding of the methodology for building energy prediction.

1.2. Objectives and structure of the review

The goal of this paper is to provide a comprehensive review of building energy prediction approaches. The goals of this paper are fourfold: (1) presenting a systematic review (including physics-based, data-driven, and hybrid approaches) to facilitate the development of energy prediction models; (2) describing the key processes and tactics of each approach; (3) helping building owners select a suitable model in practical engineering; and (4) summarizing the widely used models at present and pointing out future directions of building energy prediction models. The paper is organized as follows. Section 2 elaborates on the physical building energy models by introducing and comparing different commonly used simulation tools. Section 3 studies the data-driven energy prediction models, and the hybrid methods are introduced in Section 4. The advantages and disadvantages of each approach are presented in Section 5, and the conclusions are drawn in Section 6.

2. Building physical energy models – “white box”

Building physical energy models, also called physical models, are based on heat and mass balance equations, which represent the dynamic thermal behavior of buildings. Three heat transfer modes (i.e., conduction, convection, and radiation) between the building envelope and its surroundings are considered in the heat balance analysis of physical energy models. Various commercial or open-source software products such as EnergyPlus, Dymola, TRNSYS, DOE-2, and Matlab are available for building energy modeling to construct and solve these equations conveniently (Harish and Kumar, 2016), though the cooling and heating load can be calculated manually. The heat and mass balance equations and the detailed steps for calculating building heating and cooling loads are described by Hensen and Lamberts (2012).

Understanding the overall physical characteristics of buildings is important for using these building simulation tools. The heat flow through the building envelope is determined not only by the temperature difference, thermal resistance, and surface area, but also by the thermal inertia effect of the thermal mass, which results in heat lag. In general, detailed building information is required to develop such models. Building envelope parameters, HVAC system settings, internal heat gains, equipment and occupancy schedules, thermal zones, location, and weather data are essential to construct a physical building energy model (Crawley et al., 2001). Zonal (Inard et al., 1996) and nodal (Zhai et al., 2011) approaches are two common methods for developing a physical model. These approaches are a fast and simple way to estimate the heat behavior of buildings (Foucquier et al., 2013). The rest of this section describes the modeling process, advantages and limitations, and applications of the commonly used software.
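The role of thermal resistance and thermal mass described above can be illustrated with a minimal sketch. It is not taken from any of the reviewed tools: it assumes a single lumped zone with one resistance R (in K/W, already divided by the envelope area) and one capacitance C (in J/K), governed by C dT/dt = (T_out − T)/R + Q, and integrates it with an explicit Euler step. The function names, parameter values, and the sinusoidal outdoor temperature are all hypothetical choices made for illustration.

```python
import math

DAY_S = 24 * 3600.0  # seconds per day


def simulate_1r1c(r_k_per_w, c_j_per_k, t_out_fn, t0_c, dt_s, n_steps, q_w=0.0):
    """Explicit-Euler integration of C*dT/dt = (T_out - T)/R + Q.

    Returns the indoor temperature at times 0, dt, 2*dt, ..., n_steps*dt.
    """
    temps = [t0_c]
    for k in range(n_steps):
        t_in = temps[-1]
        t_out = t_out_fn(k * dt_s)
        d_temp = ((t_out - t_in) / r_k_per_w + q_w) / c_j_per_k
        temps.append(t_in + dt_s * d_temp)
    return temps


def outdoor_temp(t_s):
    """Assumed outdoor temperature: mean 20 C, amplitude 8 K, peak at 15:00."""
    return 20.0 + 8.0 * math.cos(2.0 * math.pi * (t_s - 15 * 3600.0) / DAY_S)


# Hypothetical envelope values giving a time constant R*C of 6 hours.
R = 0.005     # K/W, lumped envelope resistance (assumed)
C = 4.32e6    # J/K, lumped thermal mass (assumed)
DT = 60.0     # s, integration step
N_DAYS = 10   # long enough for the start-up transient to die out

steps_per_day = int(DAY_S / DT)
temps = simulate_1r1c(R, C, outdoor_temp, 20.0, DT, N_DAYS * steps_per_day)

# Analyse the last simulated day; index i corresponds to time-of-day i*DT.
start = (N_DAYS - 1) * steps_per_day
last_day = temps[start:start + steps_per_day]
peak_index = max(range(steps_per_day), key=lambda i: last_day[i])

lag_h = peak_index * DT / 3600.0 - 15.0               # indoor peak delay in hours
indoor_amplitude = (max(last_day) - min(last_day)) / 2.0

print(f"indoor peak lags outdoor peak by about {lag_h:.2f} h")
print(f"indoor swing amplitude is about {indoor_amplitude:.2f} K (outdoor: 8 K)")
```

For a first-order model of this kind, the analytic lag is atan(ωRC)/ω with ω = 2π/(24 h), which is roughly 3.8 h for the assumed values, and the swing is attenuated by a factor of 1/√(1 + (ωRC)²). Simulation tools such as those discussed below solve far more detailed versions of the same balance, with many nodes, zones, and heat gains.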

2.1. EnergyPlus

EnergyPlus has been under development since 1997 and was first released in 2001 (Crawley et al., 2001). It is an open-source program (U.S. Department of Energy, 2021) and is considered a widespread and powerful energy simulation tool for buildings (Anon, 2006). In the Building Energy Software Tools Directory, EnergyPlus is listed as a recommended tool for energy simulation, building performance, heat and mass balance analysis, etc. (Anon, 2021e). EnergyPlus is a typical nodal approach software. The conduction transfer function and finite-difference algorithm are the two main methods used for the nodal approach, which can be regarded as a one-dimensional method. The main advantage of this nodal approach is that it can solve the heat function of a large time scale of building thermal performance within a short computation time. The modeling flow chart of EnergyPlus is shown in Fig. 2.

Fig. 2. Modeling flow chart of EnergyPlus.

Owing to the merits of fast simulation speed and precise energy consumption estimation, EnergyPlus is a popular tool for calculating and analyzing the energy consumption of various buildings and energy systems, particularly at large time scales such as annual and monthly simulations (Trcka and Hensen, 2010). Westphal and Lamberts (2005) presented a case study showing that the annual electricity consumption prediction was only 1% lower than the actual value. Neto and Fiorelli (2008) conducted a comparison between EnergyPlus and an artificial neural network (ANN) for predicting building energy consumption. EnergyPlus presented a prediction error range of ±13%, while the ANN algorithm showed a prediction error of ±10%. They also concluded that the schedules of lighting, equipment, and occupancy are the major sources of uncertainties in prediction. As the literature shows, the reported prediction errors differ greatly between specific cases. To improve the stability of this method, historical operational data from existing buildings are readily used. Fumo et al. (2010) used EnergyPlus benchmark models to estimate building energy consumption. In their study, a series of predetermined coefficients determined by electrical and fuel utility bills were considered.

In the modeling process, heating, ventilation, and air-conditioning (HVAC) systems are the most complicated and time-consuming components. To evaluate the energy consumption of HVAC systems, EnergyPlus has an energy management system (EMS) module, which can be used to control energy-related systems. Cetin et al. (2019) developed an EMS program to improve the simulation performance at a short time step (minute-level) of the on/off control of residential and small commercial direct expansion (DX) HVAC systems in EnergyPlus. More realistic results representing the on/off nature of the HVAC systems could be obtained. It is worth noting that EnergyPlus was originally developed for building envelope simulation, and therefore, establishing HVAC systems is troublesome and may cause problems in EnergyPlus (Anon, 2021f).

2.2. TRNSYS

Transient system simulation (TRNSYS) was developed by the Solar Energy Laboratory at the University of Wisconsin-Madison (Anon, 2021d). TRNSYS is a transient system simulation tool with a modular structure, in which specific components (or types) can be combined flexibly for many applications such as solar systems, buildings and HVAC systems, renewable energy systems, fuel cells, and cogeneration. TRNSYS is an application with a graphical user interface and is extremely flexible for developing custom components or types (Wetter and Christoph, 2006).

TRNSYS is reportedly regarded as a widely used tool for building energy systems modeling, particularly for solar energy systems and heat pumps. Chargui et al. (2012) investigated the heat performance of a geothermal heat pump system using the TRNSYS model. A dual-source heat pump (Type 20) was used to study the thermodynamic properties. Quesada et al. (2011) presented a dynamic model of a grid-connected photovoltaic (PV) system on TRNSYS. The results show that an accurate prediction of long-term energy performance can be realized. A comparison of TRNSYS, EnergyPlus, and IDA indoor climate and energy (IDA ICE) was conducted by Mazzeo et al. (2020). The results showed that EnergyPlus and IDA ICE are better than TRNSYS in predicting thermal behavior in the presence of phase change materials (PCM). However, in the absence of PCM, TRNSYS showed the highest prediction accuracy in the warm period, whereas IDA ICE achieved the best performance in the cooling period. TRNSYS can not only forecast energy consumption but also facilitate energy system design for energy optimization. Magnier and Haghighat

EnergyPlus, can simulate a single building with an acceptable


computational cost but it might not be suitable for a large block of
buildings. Kim et al. (2019) established a single reduced model for
assembling ten buildings on Dymola, calculating district heating
and cooling demand.

2.4. Other tools

Other building energy performance simulation tools such as


IDA ICE, DOE-2, and eQUEST. IDA ICE was developed at the De-
partment of Building Sciences in Stockholm. DOE-2 was released
in the early 1980s, and eQUEST is an advanced version of DOE-2.
Researchers generally used eQUEST (Xing et al., 2015; Ke et al.,
2013; Wang et al., 2015) and DOE-2 (Tuhus-Dubrow and Krarti,
2010; Siddharth et al., 2011) to calibrate energy consumption
previously. Software co-simulation is a new trend because this
approach can combine the advantages of two or more simulation
Fig. 3. Architecture of Dymola software (Dynamic Modeling Laboratory User
tools, and it designates the best simulation performance and the
Manual Volume 1, Dassault Systèmes A.B., 2017).
most computationally efficient approach for different sub-tasks.

2.5. Discussion on building physical energy models


(2010) built a building base model on TRNSYS to develop a
database that was used to train the ANN for optimization. The main advantage of the ‘‘white box’’ is that the relationship
between input and output is explainable. Correspondingly, the
2.3. Dymola disadvantage is that it is time-consuming and labor-intensive
to enter all the detailed building parameters, which might be
The dynamic modeling laboratory (Dymola) was initially de- a problem for many buildings in the design phase and some
signed in 1978 by Hiding Elmqvist in his doctoral dissertation existing buildings. A brief description of the different simulation
to build a structured model language for large continuous sys- tools is presented in Table 1. When selecting a tool to estimate the
tems (Elmqvist, 1978). The building energy modeling Dymola building energy performance, it is important to make the trade-
software is based on the Modelica language, which is an acausal, off between the prediction accuracy and computing time. Nageler
object-oriented, and equation-based language to conveniently et al. (2018) presented a comparison of four building energy sim-
model physical systems, including thermal, mechanical, electri- ulation tools, including Dymola, EnergyPlus, IDA ICE, and TRNSYS,
cal, and control systems (Anon, 2021c). The acausal modeling on a test-box. After comparing the room temperature of test-box
approach describes the components based on equations without with measured data, they found that the simulation results are
allocating input and output variables; therefore, the components relatively accurate with an average bias of −0.92, −2.18, −0.37,
are easy to establish and modify. With this feature, Modelica and −1.13 K for these respective four tools.
supports hierarchical model composition, truly reusable libraries, Model calibration is an integral step to ensure the accuracy of
connectors, and acausal connections, and relieves the users from the energy model after it has been developed. For the purpose
manually converting equations to a block diagram or assignment of making the simulation results meet the measured data well
statement. Dymola has a powerful graphic editor for developing enough, the main process of building energy model is to adjust
and running models. Additionally, it can conveniently interact the input parameters, such as the efficiency of the chiller, the
with external data. A typical architecture of Dymola is shown in occupancy schedules, and so on (Guo et al., 2021). Manual calibra-
Fig. 3. tion heavily relies on the user’s practical experience to tune the
Dymola is a relatively new tool in the domain of building en- key parameters in models. Automated calibration is based on an
ergy simulations. The Modelica Buildings library developed by Lawrence Berkeley National Laboratory is an open-source and widely used library with comprehensive building components and control systems, which is sufficient for different buildings and energy systems (Anon, 2021b). The packages and components in this library have been tested and validated using benchmark models (Nouidui et al., 2012), and the calculation time is comparable with that of TRNSYS (Wetter and Christoph, 2006). AixLib from RWTH Aachen University in Germany, BuildingSystems from UdK Berlin in Germany, and IDEAS from KU Leuven in Belgium are three other core Modelica libraries for energy design and operation of buildings under the IBPSA Project (Anon, 2021g). Kim et al. (2015) developed a physical building information model (BIM) on the Dymola platform using the object-based Modelica language to simulate energy consumption, and the Modelica building library was used in their study. Chen et al. (2019) built an office building model to estimate the HVAC load and total building energy consumption. In addition to describing the detailed HVAC systems, the internal thermal mass of the interior walls and furniture was taken into account; thus, the dynamic thermal balance of this system is much more realistic. Dymola, unlike

objective function or penalty function which is defined for matching simulation results with measured data, and the parameter settings are searched automatically (Gaurav et al., 2016). Most of these calibrations are deterministic and neglect the inherent uncertainties of the building energy model. Therefore, stochastic calibration methods such as the Bayesian approach have gained attention recently (Hou et al., 2021). The data sources and data pre-processing methods utilized in the calibration process were comprehensively discussed in papers (Adrian et al., 2019; Murphy et al., 2021; Chong et al., 2021).

3. Data-driven models using machine learning algorithms – “black box”

Compared with physical models, data-driven models do not require building thermal balance equations; therefore, less or no building physical information is needed. Data-driven models are based on historical data to deduce the hidden relationship between output (i.e., building energy consumption) and input variables (i.e., features such as weather, building information, occupant behaviors, and equipment schedules) using mathematical
Y. Chen, M. Guo, Z. Chen et al. Energy Reports 8 (2022) 2656–2671

Table 1
Brief description of different simulation tools.
Software tool | Modeling approach | Advantages | Disadvantages | Representative references
EnergyPlus | Causal | Requires small computation time; applicable to large-scale buildings; good at envelope modeling; friendly for beginners; free | Requires a significant amount of time, experience, and effort to enter the detailed parameters; some required parameters are not available; not good at HVAC systems | U.S. Department of Energy (2021), Trcka and Hensen (2010) and Fumo et al. (2010)
TRNSYS | Causal | Requires small computation time; flexibility and customization; modular design; good at solar energy systems; friendly for beginners | Not good at building physical model; chargeable | Anon (2021d), Quesada et al. (2011) and Jani et al. (2020)
Dymola | Acausal | Flexibility and customization; modular design; good at HVAC systems modeling; high reuse of components | Not friendly for beginners; relatively long computation time; chargeable | Anon (2021c), Hafner et al. (2014) and Violidakis et al. (2020)
DOE-2 | Causal | Good at building physics modeling; a traditional building simulation tool | Not good at energy systems; unfriendly user interface | Carriere et al. (1999) and Winkelmann and Selkowitz (1985)

methods. Data-driven methods are well suited to buildings without detailed physical parameters, such as buildings in the design phase. A general process of the machine learning prediction method is shown in Fig. 4. Widely used input variables include time-series features (e.g., day type, occupancy rate and schedule, operational schedule of equipment), meteorological conditions (e.g., temperature, humidity, solar radiation), and building physical parameters (e.g., the number of floors, wall area, glazing area, wall heat transfer coefficient). The output variables are generally heating/cooling loads and electricity consumption (Do and Cetin, 2018; Guo et al., 2018). Data-driven models have gained increasing interest in building energy prediction owing to their simplicity and flexibility (Wang and Srinivasan, 2017b). This section presents the promising “black box” methods, including linear regression (LR), support vector machine (SVM), extreme gradient boosting (XGBoost), random forest (RF), recurrent neural network (RNN), and artificial neural network (ANN).
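The workflow described above (assemble input features, split the data, fit a model, validate) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the load data are synthetic, the function names (`make_features`, `synthetic_load`, `fit_profile`) are hypothetical, and the "model" is a placeholder hourly profile rather than any algorithm from the reviewed studies. The 80/20 split and the MAPE, MAE, and CV-RMSE metrics are the ones mentioned later in this review; their formulas below are the commonly used definitions, which the review itself does not spell out.

```python
import math
from statistics import mean

def make_features(hour_index):
    """Build one input record (time-series + weather features) for an hourly step."""
    day = hour_index // 24
    hour = hour_index % 24
    weekend = 1 if day % 7 >= 5 else 0
    # Hypothetical outdoor temperature: daily sine wave plus a slow warming drift.
    temp = 22.0 + 6.0 * math.sin(2 * math.pi * (hour - 9) / 24) + 0.1 * day
    return {"hour": hour, "weekend": weekend, "temp": temp}

def synthetic_load(x):
    """Hypothetical 'measured' electricity load in kW (not real data)."""
    occupied = 9 <= x["hour"] < 18 and x["weekend"] == 0
    cooling = 2.0 * max(0.0, x["temp"] - 18.0)
    return 20.0 + (30.0 if occupied else 0.0) + cooling

def split_chronologically(records, train_fraction=0.8):
    """Keep time order: the first 80% trains, the last 20% validates."""
    cut = int(len(records) * train_fraction)
    return records[:cut], records[cut:]

def fit_profile(train):
    """Placeholder model: mean training load per (hour, weekend) pair."""
    groups = {}
    for x, y in train:
        groups.setdefault((x["hour"], x["weekend"]), []).append(y)
    return {key: mean(vals) for key, vals in groups.items()}

def mape(actual, predicted):
    """Mean absolute percentage error, in percent."""
    return 100.0 * mean(abs(a - p) / abs(a) for a, p in zip(actual, predicted))

def mae(actual, predicted):
    """Mean absolute error, in the unit of the load."""
    return mean(abs(a - p) for a, p in zip(actual, predicted))

def cv_rmse(actual, predicted):
    """Coefficient of variation of the RMSE, in percent of the mean actual load."""
    rmse = math.sqrt(mean((a - p) ** 2 for a, p in zip(actual, predicted)))
    return 100.0 * rmse / mean(actual)

# Four weeks of hourly records: (features, load) pairs.
records = [(x, synthetic_load(x)) for x in map(make_features, range(24 * 28))]
train, valid = split_chronologically(records)
model = fit_profile(train)
actual = [y for _, y in valid]
predicted = [model[(x["hour"], x["weekend"])] for x, _ in valid]
print(f"MAPE {mape(actual, predicted):.2f}%  MAE {mae(actual, predicted):.2f} kW  "
      f"CV-RMSE {cv_rmse(actual, predicted):.2f}%")
```

The placeholder profile ignores temperature, so the slow temperature drift in the synthetic data shows up as validation error; replacing `fit_profile` with any of the algorithms in this section would follow the same split-and-validate pattern.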

Fig. 4. Flow chart of machine learning model development and validation.

3.1. Linear regression (LR)

LR is the simplest machine learning algorithm for a data-mining beginner because no parameters need to be tuned. In addition, it requires fewer computing resources and therefore has a fast prediction speed. LR has been widely used owing to its simplicity and good prediction performance in many fields. Linear and non-linear regression are two regression methods. The principle of the regression is to establish the relationship between the output response variable y (i.e., label) and input explanatory variables x (i.e., feature variables). One of the most common regression models of LR is expressed in Eq. (1). Other types of regression models can be found in Fahrmeir et al. (2013).

$$ y = a_1 x_1 + a_2 x_2 + \cdots + a_i x_i + \cdots + a_n x_n + \varepsilon, \quad i \in [1, n] \tag{1} $$

where a_i is the regression coefficient of the explanatory variable x_i, ε is a random deviation or error term, and n is the dimension of the explanatory variables. For example, if the predicted output y is the electricity consumption of buildings, the feature variables could be ambient temperature, solar radiation, occupancy schedules, and the total heat transfer coefficient of walls.

Because of the strong correlation between building loads and outdoor air temperature, temperature is the most commonly chosen explanatory variable in regression models. Hagan and Behr (1989) established an LR model with time series and temperature as explanatory variables to predict the building electricity loads. According to the global energy forecasting competition (Hong et al., 2014), linear and non-linear regression models are still a popular option for energy prediction. LR models are simple and have a fast prediction speed. However, LR models can barely meet high-precision prediction requirements, especially for HVAC loads, which are influenced by non-linear and uncertain factors such as weather and schedules. The LR models have acceptable prediction performance for weather-insensitive loads such as lighting and equipment loads, but they lack the ability to accurately predict weather-sensitive loads such as HVAC loads (Chu et al., 2020).

3.2. Support vector machine (SVM)

SVM is a promising machine learning algorithm owing to its strong non-linear capabilities, capable of realizing classification and regression. SVM is commonly used for classification, while support vector regression (SVR) can be used to forecast building

loads. The main idea of SVM in regression is to introduce a kernel function, which is capable of nonlinearly mapping the input space into a high-dimensional feature space that formulates an optimized hyperplane to realize LR in the feature space (Vapnik, 2013). The SVR function is expressed in Eq. (2).

$$ f(x) = W^{T} \varphi(x) + b \tag{2} $$

where f(x) denotes the prediction outputs, W is the weight factor, b is the adjustable factor, and φ(x) is the map function of mapping the input space into a high-dimensional feature space. Fig. 5 shows the solution process of the SVR. A margin of tolerance ε is set, and the main goal is to maximize the margin to minimize prediction error.

Fig. 5. Schematic of the hyperplane.

Studies on SVM for building energy prediction have been widely reported in recent years (Chen and Tan, 2017; Son and Kim, 2015) because of the ability to solve non-linear regression problems. Chen et al. (2017) proposed a novel SVR model in which the ambient temperature of two hours ahead was chosen as the real input variable for short-term electrical load prediction. This innovation improves the prediction accuracy by reducing the weather-sensitive loads’ lagging effect from the building’s internal thermal inertia. The input variables are the ambient temperature and time series, which are adaptively used to build cases where only the weather information is permitted. Vrablecova et al. (2018) used SVR to forecast load using smart metering data from individual households. They concluded that SVR is not the best algorithm for individual households’ electricity forecasting, but it is a promising method for forecasting aggregated loads from single buildings and a cluster of buildings. SVR has been frequently reported in short-term load forecasting because of its prediction accuracy and speed in this field (Yang et al., 2019; He et al., 2017). Chen and Tan (2017) used SVR to forecast 24-h ahead hourly electric demand for a hotel and a mall. According to their study, the SVR model can complete the prediction procedure within 20 s, and the prediction errors are approximately 4.0% and 6.0% for the hotel and mall, respectively, which is applicable for real-time control of a building energy management system such as DR.

3.3. Random forest (RF)

RF is a supervised learning algorithm that uses a bagging (bootstrap aggregating) algorithm for regression. RF is based on a decision tree, and multiple trees are established to obtain average prediction results. The prediction process of the RF is shown in Fig. 6. Each decision tree is formed randomly with different features and training datasets, and they can be trained in parallel. In this way, the prediction accuracy is higher than that of a single decision tree. In addition, it overcomes the overfitting problem by establishing multiple decision trees, where each decision tree works on a random sample of the original dataset. Thus, the prediction results are less likely to be influenced by outliers, which are quite common in datasets. In the RF model, the number of trees and the depth of a tree are two key parameters that need to be tuned, and therefore the RF model requires fewer parameters to be set compared with other algorithms (Dudek, 2015; Lahouar and Slama, 2015; Moon et al., 2018).

Ahmad and Chen (2019) made a comparison between a non-linear autoregressive model (NARM), linear model using stepwise regression (LMSR), and RF for medium-term and long-term energy prediction. Ambient temperature and relative humidity ratio are the two main input variables in the models. For different seasons, they found that the RF model had a lower average error (MAPE: 2.64%) than the other two methods, i.e., LMSR (MAPE: 3.10%) and NARM (MAPE: 4.21%). In all these three models, the average MAPE was worse in summer and winter (summer: 3.97%; winter: 3.42%; spring: 3.00%; autumn: 2.87%). One reason is that AC loads are more complicated and difficult to forecast with a standalone algorithm in the summer and winter seasons. Therefore, decomposing the building loads into different types and predicting them individually is a promising way to improve the prediction performance (Wang et al., 2012; Ji et al., 2016).

In addition to the advantages of less overfitting and higher accuracy, RF can give the importance of features that are used in the training and testing process of the model. Feature importance is important for choosing the main features while skipping the weak ones to accelerate the computational process and ensure the prediction accuracy at the same time. Lahouar and Slama (2015) predicted the day-ahead building load based on the RF model. In their study, an expert feature selection strategy was adopted. The input feature variables included day type, temperature, and load of the previous day. They concluded that the order of importance is previous day load, day type, and temperature.

3.4. Extreme gradient boosting (XGBoost) and lightGBM

XGBoost is an ensemble learning algorithm that can solve many data-mining problems in a fast and accurate manner. Released on March 27, 2014, by Tianqi Chen, XGBoost is based on a gradient boosting algorithm and dominates the field of machine learning. It is a powerful algorithm, as most Kaggle competitions reported that it was the final winner (Butch, 2020; Anon, 2021a). XGBoost was designed using a gradient boosting algorithm, converting weak learners to a strong learner. It can produce better prediction outcomes by controlling the model complexity and reducing overfitting owing to its built-in regularization. XGBoost is a relatively novel and advanced algorithm that has not been widely studied in building energy prediction (Wang et al., 2020a). Unlike the RF algorithm, in which the multiple predictors are built in parallel, XGBoost adds the predictors sequentially. Nowadays, XGBoost can be easily implemented with the package in Python, R, Julia, and Scala (Anon, 2021h).

Wang et al. (2020a) studied the prediction characteristics of XGBoost on building thermal load prediction. In their models, five input variables including day of week, hour of day, holiday, temperature, and relative humidity were taken into consideration. They found that XGBoost (CV-RMSE: 21.1%) in shallow machine learning outperformed other machine learning algorithms such as SVM (CV-RMSE: 25.0%), RF (CV-RMSE: 23.7%), and LSTM (CV-RMSE: 31.9%) for long-term prediction. Because the correlation between input and output is less relevant when the prediction duration is long, algorithms such as LSTM (long-term: CV-RMSE

31.9%; short-term: CV-RMSE 20.2%) are not good for this long-term task. XGBoost is good at long-term prediction, as other studies have found. Lu et al. (2020) proposed a novel model that incorporates XGBoost (CEEMDAN-XGBoost) to predict the long-term energy consumption of an intake tower. The MAPE of the prediction results of the different methods are as follows: CEEMDAN-XGBoost: 4.85%, XGBoost: 8.06%, CEEMDAN-RF: 6.26%, and PSO-SVM: 7.92%.

Fig. 6. Principle of RF algorithm.

In building energy demand, the load demand of HVAC systems is the most difficult to estimate because of its nonlinear character. Lu and Meng (2020) found that XGBoost is the best model for forecasting AC energy use in residential buildings in Chongqing. For the cooling season of AC in buildings, they found that 11 input variables have a great influence on cooling energy use. These variables mainly included outdoor air temperature, running time of the AC, and temperature differences between indoor air and set-point, whereas no building physical and thermophysical variables such as window–wall ratio and total heat transfer coefficient of the envelope were considered. The prediction performance was not as good as the results in Wang et al. (2020a): XGBoost (CV-RMSE: 62%), RF (CV-RMSE: 64%), SVR (CV-RMSE: 64%), and ANN (CV-RMSE: 73%). Wang et al. (2019b) tested several popular models (i.e., XGBoost, RF, ANN, and SVR) to predict the heating energy consumption of a residential building in Tianjin, China. Six input features (i.e., outdoor dry bulb temperature, dew point temperature, outdoor relative humidity, wind speed, solar radiation, and hour of day) were used in their models. The CV-RMSE of the prediction results of the different models is as follows: average RF: 5.0%, XGBoost: 5.8%, SVR: 6.2%, and ANN: 7.0%.

In addition, lightGBM is a tree-based gradient boosting framework similar to XGBoost. It was first released on October 17, 2016 as a part of Microsoft’s Distributed Machine Learning Toolkit project (Anon, 2021i). It was designed to be fast and distributed with the advantages of faster training speed and higher efficiency, lower memory usage, supporting parallel and GPU learning, and capable of handling large-scale data. LightGBM uses histogram-based algorithms to bucket continuous features into discrete bins so that it can reduce communication cost and memory usage (Jin and Agrawal, 2003; Ke et al., 2017). Thus, lightGBM is a promising algorithm for energy prediction in massive data sources.

3.5. Artificial neural network (ANN)

ANN is a nonlinear statistical algorithm inspired by biological neural networks. It can deduce the complicated hidden relationship between inputs and outputs. The principle of a typical one-hidden-layer ANN is shown in Fig. 8. A typical ANN has three interconnected layers: input, middle (i.e., hidden), and output layers. Theoretically, the hidden layer consists of many sub-layers depending on the complexity and nature of the task (Mandal et al., 2006; Kiartzis et al., 1997).

In addition to the factor of prediction accuracy, computing time is another critical factor in evaluating the performance of a model. Generally, increasing the dimension of input features can improve prediction accuracy, but this strategy may also increase the computation cost, particularly for massive data processing. Ahmad et al. (2017) compared an ANN and RF for HVAC electricity consumption prediction of a hotel. Outdoor air dry-bulb temperature, outdoor air relative humidity, day of week, hour of day, occupancy schedule, and total rooms booked were considered as input variables. They found that the ANN model’s prediction results were slightly better (MAE: 9.18% vs. 9.31%) when the model used all variables (ten features) instead of only using the important variables (four features). In their study, the computation time was not provided; nevertheless, using the main input features in the model is a good way to optimize the prediction model in practice. Mena et al. (2014) predicted the short-term electricity demand of a bioclimatic building in Spain by using an ANN-based model. To avoid the use of unimportant variables in the model, input feature variable selection was implemented ahead of the data training and testing. The order of importance of the different input variables is as follows: solar radiation, outdoor temperature, wind speed, outdoor humidity, and wind direction, and a mean error of 11.48% has been realized.

3.6. Recurrent neural network (RNN)

Elman RNN, LSTM, and gated recurrent unit (GRU) are three common RNN algorithms. LSTM was designed for handling sequential data and was first introduced by Hochreiter and Schmidhuber in 1997 (Hochreiter and Schmidhuber, 1997). Compared with the traditional neural network, LSTM can pass the information from the last steps to the next time step (i.e., backpropagation). Fig. 7 shows this memory passing process. LSTM can be considered as an integration of many traditional neural networks. Based on this feature, LSTM is naturally suited to processing sequential data such as building load. It can solve complex and long-time-lag tasks that traditional RNN algorithms can barely solve. In the study (Wang et al., 2020a), LSTM performed better for short-term load prediction compared with LR, SVM, RF, and XGBoost.
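The memory-passing idea described above can be illustrated with a minimal sketch of a single recurrent unit. The example assumes the simplest Elman-style update, h_t = tanh(w_x·x_t + w_h·h_(t-1) + b), with scalar, hand-picked weights; these values and the function names are hypothetical, and the gates that distinguish LSTM and GRU are deliberately left out.

```python
import math

def elman_step(x_t, h_prev, w_x=0.5, w_h=0.8, b=0.0):
    """One recurrent step: the new hidden state mixes the current input
    with the hidden state carried over from the previous time step."""
    return math.tanh(w_x * x_t + w_h * h_prev + b)

def run_sequence(xs, h0=0.0):
    """Feed a sequence through the unit and return the hidden state at each step."""
    h = h0
    states = []
    for x in xs:
        h = elman_step(x, h)
        states.append(h)
    return states

# Two hypothetical normalized load sequences that differ only in the first value.
loads_a = [0.2, 0.4, 0.9, 0.3]
loads_b = [0.8, 0.4, 0.9, 0.3]
states_a = run_sequence(loads_a)
states_b = run_sequence(loads_b)

# The final hidden states differ although the last three inputs are identical:
# the earlier input is still "remembered" through the recurrent connection.
print(states_a[-1], states_b[-1])
```

A feed-forward network given only the last input would return the same output for both sequences, which is the practical difference the section points to for sequential data such as building load.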

Fig. 7. Process of backpropagation approach of RNN.

Fig. 8. Principle of typical ANN with one hidden layer.

3.7. Other models

In 1965, the concept of ensemble learning was introduced by Nilsson (1965). Compared with single models that use only one algorithm, the ensemble model consists of multiple algorithms. The ensemble model combines single algorithms and takes advantage of each algorithm to improve prediction accuracy. The framework of the ensemble model is shown in Fig. 9. The goal of the ensemble model is to minimize the prediction errors; hence, the base algorithms with high prediction accuracy have higher weights.

Fig. 9. Framework of an ensemble model.

Fan et al. (2014) developed an ensemble model integrating eight base algorithms for next-day building energy prediction. The eight base algorithms were RF, SVR, multiple LR, multi-layer perceptron, boosting tree, multivariate adaptive regression splines, k-nearest neighbors, and autoregressive integrated moving average, and the weights of these algorithms were 0.404, 0.315, 0.087, 0.076, 0.066, 0.023, 0.021, and 0.008, respectively. Duc-Hoc et al. (2020) proposed a new ensemble model called an evolutionary neural machine inference model that combined SVR and radial basis function ANN. Measured data from residential buildings were used to evaluate this ensemble model. In their study, the computing time of different models was logged. The computing times of ANN, SVM, DNN, and ENMIM were 5, 7, 300, and 600 s, respectively. Zhang et al. (2020) proposed a novel ensemble deep learning method for short-term building energy forecasting. They decomposed the data into a stable and a stochastic part. The stochastic part was estimated by using the ensemble model, which combines a novel deep belief network and extreme learning machine. Wang et al. (2020b) proposed a novel stacking model for building energy prediction. In their study, several widely used algorithms, including RF, XGBoost, SVR, and kNN models, were selected as the base models in the first layer. Then, the stacking method was used to ensemble each base model by cross-validation to boost the prediction performance.

Transfer learning aims to transfer knowledge learned from one task to other similar tasks, as shown in Fig. 10. It has been successfully applied in domains such as machinery fault diagnosis (Wu et al., 2020; Chuan et al., 2020) and image classification (Swati et al., 2019). It is worth noting that data sources should have similar features to apply this method to building energy prediction. Because the building data usually meet this criterion, it is an attractive idea to use the data in well-measured buildings to predict the energy consumption of other buildings with limited data (Qian et al., 2020).

The literature review shows that transfer learning can also be well integrated with other data-driven algorithms. Qian et al. (2020) studied the transfer learning model with SVR in short- and medium-term HVAC energy consumption. Gao et al. (2020) used the transfer learning model to predict the energy consumption of a building with poor data information. Ribeiro et al. (2018) proposed a transfer learning method for cross-building energy forecasting; the results showed that the prediction accuracy was

improved by 11.2% by using data from other buildings. A novel transfer learning-based methodology has been proposed for 24-h ahead building energy forecasting by Fan et al. (2020). Compared with standalone models, this new model could reduce prediction errors by 15% to 78%.

Fig. 10. Principle of transfer learning.

3.8. Discussion of data-driven models

Compared with “white box” models, although data-driven models do not require massive engineering efforts during the development process, the generalization of a well-tuned data-driven model is usually poor (Zhang and Wen, 2019a). Thus, scholars have focused on feature selection in almost all data-driven models to develop a more general model that can be used in different buildings (Luo et al., 2020). The input feature variables are commonly categorized into three types: external climate data, internal factors, and operation schedules of energy facilities (Li et al., 2009; Leung et al., 2012; Luo et al., 2019), and five mature feature selection methods were proposed in the paper (Sun et al., 2020). Considering that buildings and their energy systems are different in practice, the feature selection results are quite different from one to another. Based on a comprehensive literature review, the usage frequency of features in 25 core references is summarized in Fig. 11, and the detailed description is included in the Appendix. As shown in Fig. 11, outdoor dry bulb temperature, outdoor relative humidity, solar radiation, day of week, and hour of day are the five most frequently used input features in data-driven models. In addition to external climate data, the physical information about buildings such as the number of floors, wall area, and glazing area is used to improve prediction accuracy.

According to the abovementioned data-driven models, STLF and LTLF are two common prediction requests in building energy management. Some of them are good at short-term prediction, while some perform better in the long term. Fig. 12 shows the prediction preferences of the abovementioned algorithms for STLF and LTLF in 25 core references. As shown in Fig. 12, XGBoost is most often used for LTLF, while RNN is commonly employed for STLF; ANN and RF could be used for both time spans.

In addition to the hard work on feature engineering and model selection, load decomposition is a promising approach to improve the prediction accuracy. There are two methods for load decomposition. First, signal decomposition and transformation are the most widely used methods, such as Fourier analysis (He et al., 2020) and wavelet analysis (Alipour et al., 2020). Chu et al. (2020) decomposed the total building loads into a basic and a seasonal weather-sensitive part and obtained more accurate results. Second, sub-meter technology has been rapidly developing in recent years, which makes it possible to predict the load of HVAC systems, lighting, plugs, and other equipment separately. Fu et al. (2015) used sub-meter data to predict the electricity load of public buildings with high prediction performance. Compared with load decomposition, load clustering technology can classify homogeneous loads that have similar patterns into one group, which enhances the prediction performance by developing a single model for each cluster. An early load classification algorithm was proposed in the paper (Chen et al., 2021).

Model calibration is much easier for the data-driven models compared with the white box models mentioned above because the dataset has already been obtained before model development and has been split into training and validation datasets. The commonly used split rate is 80% for training and 20% for validation (Sun et al., 2020; Liu et al., 2020). A proper validation metric is the key to evaluating a model. The widely used evaluation metrics are MAPE, MAE, and CV-RMSE (Chen et al., 2021; Zhe et al., 2020).

4. Hybrid models – “grey box”

The modeling and calibration process of “white box” software is a huge challenge for building energy stakeholders. A large number of basic input parameters are required; thus, the modeling development on a physical software platform is time-consuming, and the economic cost of simulation is high. The “black box” models capture linear and nonlinear relevance between the input and output variables in an easy way. However, for such models, it usually takes enormous historical data and a long time to train the model and achieve accurate predictions under different conditions. To solve this dilemma, the “grey box” has been proposed. It uses a simplified physical model and easily accessible data to simulate building energy demand, thus combining the advantages of both the white and black boxes.

4.1. Resistance–capacitance (RC) thermal network

The RC model is a typical grey box model which was introduced as early as 1985 by Hassid (1985). He proposed two resistances and one capacitance (i.e., the 2R1C model) to represent the building envelope’s thermal performance of a multi-story building. The sketch diagram of a simplified building energy RC model is shown in Fig. 13.

Ji (2016) established a new RC-S (RC-submeter) model with building submeter measured data to predict the hourly cooling load in buildings. In this model, the heat transfer through the building envelope is calculated using the traditional 3R2C model, aiming at the heat stored inside the building thermal mass. This model was verified by simulation and actual operational data, and the prediction MAE was within 9.0%. Mohammad et al. (2020) developed a second-order RC grey box representing a case building structure for building heat demand simulation with uncertainty analysis, which is crucial to ensure the validity of simulation results in a “grey box” model. Murphy et al. (2021) developed a 2R2C grey box model to predict the dynamic internal air temperature. In this grey model, internal air mass and internal thermal masses were modeled as independent capacitors; external surfaces were modeled as one resistor while walls and roofs were modeled as the other resistor. More variants of xRyC models can be found in the paper (Li et al., 2021).

4.2. Other models

A hybrid model integrated with a simulation tool and data-driven algorithm can be another option, as shown in Fig. 14. It may generate better prediction results than the “white box” and “black box”. For instance, Dong et al. (2016) established a hybrid

model that combines data-driven and physics-based models to estimate the total energy consumption for a residential building. Compared with the other five data-driven algorithms, ANN, SVR, LS-SVM, GPR, and GMM, the 24-h-ahead prediction accuracy of this hybrid model is the best. Xu et al. (2012) developed a model coupling EnergyPlus with ANN to predict the energy consumption of a cluster of ten residential single-family houses. External empirical sources including experimental data and demographics were considered in the model.

Fig. 11. Frequency of input features of data-driven models in 25 core references.

Fig. 12. Frequency of use of different algorithms for STLF and LTLF in 25 core references.

Fig. 13. An example of building RC thermal model.

4.3. Discussion of hybrid models

Through the literature review, it appears that the hybrid models simplify the description of the building heat transfer process and balance the advantages and disadvantages of both physical and statistical methods. Thus, some of the parameters in the model are interpretable. The RC model is a typical and the most popular hybrid model, and it can separately represent the building physical parts and dynamic processes such as heat transfer through external envelope, zone air, zone internal mass, internal heat gains, and infiltration (Li et al., 2021). There are some applications of the hybrid model including heat dynamic analysis, building control and optimization, urban-scale energy modeling, and building grid integration. With the development of building energy simulation tools and cloud computing, the hybrid model could become more prevalent in the future, and promote the development and cooperation between physics-based and data-driven approaches.

5. Advantages and limitations of each prediction approach

A detailed description of each prediction approach was presented in the above sections. Physics-based, data-driven, and hybrid models have different properties that determine the model preference of users. A standard to select the best approach for different scenarios was proposed in the paper (Dawn

et al., 2015). Table 2 summarizes the advantages and disadvantages of these three approaches, which helps engineers select an appropriate model for their own needs.

Fig. 14. Scheme of a hybrid model combining building physics with the data-driven algorithm.

Table 2
Advantages and disadvantages of different prediction approaches.
Prediction approaches | Advantages | Disadvantages
“White box” models | The relationship between input and output is explainable; the parameters are easily modified; no historical data is required | It requires a significant amount of effort to input building information and parameters; computation cost is huge; it requires previous knowledge of thermal dynamics and software
“Black box” models | No specific expertise is required; model development period and computational time are short | It requires large amounts of historical data; it overfits easily; it is usually not explainable
“Grey box” models | The developed model is easy to generalize; less data is required; only bounds on physical parameters are required | It couples two distinct scientific domains; it is not easy to develop

6. Conclusion

In this paper, we presented an in-depth review of the main approaches applied to building energy prediction. These approaches were clustered into three groups including the most commonly used methods. First, the building physics-based simulation tools (i.e., white box) have been introduced. These tools can be divided into zonal and nodal approaches. Second, data-driven models (i.e., black box) have been reviewed. There are six main algorithms: linear regression (LR), support vector machine (SVM), random forest (RF), extreme gradient boosting (XGBoost), recurrent neural network (RNN), and artificial neural network (ANN). The last category is the hybrid model (i.e., grey box), which relies on both a physical model and a data-driven model. Through a well-rounded study of these three approaches, the following conclusions were drawn:

(1) It is better to use physical models for their reliability, interpretability and accuracy, although the modeling process is time-consuming and repetitive. Data-driven models are extremely useful when owners have sufficient historic data from existing buildings but no or insufficient information on the building design. Hybrid models make a great trade-off between physical and data-driven models, and they could be a better option when the required information is insufficient for the other two models.

Table A.1

Reference | Input features | Algorithms | Prediction span
Wang et al. (2020a) | Day of week; Hour of day; Holiday; Outdoor dry bulb temperature; Outdoor relative humidity | XGBoost; RF; SVR; LSTM | LTLF
Lu and Meng (2020) | Outdoor dry bulb temperature; Equipment schedule | XGBoost; RF; SVR; ANN | LTLF
Wang et al. (2019b) | Outdoor dry bulb temperature; Outdoor relative humidity; Wind speed; Solar radiation; Hour of day | RF; SVR; ANN | LTLF
Ahmad and Chen (2019) | Outdoor dry bulb temperature; Outdoor relative humidity | XGBoost; LR; NARM | LTLF
Lahouar and Slama (2015) | Hour of day; Outdoor dry bulb temperature | RF | STLF
Ahmad et al. (2017) | Outdoor dry bulb temperature; Outdoor relative humidity; Day of week; Hour of day; Occupancy schedule | RF; ANN | STLF
Leung et al. (2012) | Outdoor dry bulb temperature; Outdoor relative humidity; Solar radiation; Wind speed; Equipment schedule; Day of week; Holiday; Occupancy schedule | ANN | STLF
Mena et al. (2014) and Li et al. (2009) | Outdoor dry bulb temperature; Solar radiation | SVR; ANN | STLF
Chen et al. (2017) | Outdoor dry bulb temperature; Hour of day; Day of week | SVR | STLF
Kusiak et al. (2010) | Outdoor dry bulb temperature; Outdoor relative humidity | ANN | Both
Cai et al. (2019) | Outdoor dry bulb temperature; Outdoor relative humidity; Air pressure; Wind speed | RNN | STLF
Chammas et al. (2019) | Outdoor dry bulb temperature; Outdoor relative humidity; Equipment energy density; Day of week | MLP | STLF
Wei et al. (2019) | Indoor temperature; Indoor humidity; Indoor CO2; Occupancy schedule; Solar radiation; Outdoor dry bulb temperature; Outdoor relative humidity; Wind speed | ANN | STLF
Ding et al. (2018) | Outdoor dry bulb temperature; Solar radiation; Wind speed; Indoor temperature; Indoor humidity; Occupancy schedule; Equipment schedule; Thermal mass | MLP; SVR | STLF
Mun et al. (2019) | Indoor temperature; Indoor humidity | LR; RF; SVR | STLF
Wang et al. (2019a) | Occupancy schedule; Day of week | LSTM | STLF
Sala-Cardoso et al. (2018) | Occupancy schedule | RNN | STLF
Fan et al. (2014) | Day of week; Hour of day; Holiday; Outdoor dry bulb temperature; Outdoor relative humidity; Solar radiation; Wind speed | Ensemble models | STLF
Seyedzadeh et al. (2019) | Relative compactness; Surface area; Wall area; Roof area; Number of floors; Orientation; Glazing area; Outdoor dry bulb temperature; Outdoor relative humidity; Solar radiation | ANN; SVM; RF; XGBoost | LTLF
Wei et al. (2016) | Aspect ratio; Window–wall ratio; Number of floors; Orientation; Building scale | RF | LTLF
Tsanas and Xifara (2012) | Relative compactness; Surface area; Wall area; Roof area; Number of floors; Orientation; Glazing area | LR; RF | Both
Kumar et al. (2018b) | Aspect ratio; Relative compactness; Glazing area; Roof area; Surface area; Wall area; Orientation; Number of floors; Glazing area | ANN; SVR; RF | Both
Kumar et al. (2018a) | Outdoor dry bulb temperature; Outdoor relative humidity; Solar radiation; Roof area; Wall area; Relative compactness; Surface area; Number of floors; Orientation; Glazing area | ELM | STLF
Zhang and Wen (2019b) | Day of week; Hour of day; Holiday; Outdoor dry bulb temperature; Outdoor relative humidity; | MARS | STLF
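
As a rough illustration of how the weather and calendar inputs that recur in Table A.1 might be assembled into one row for a data-driven model, the sketch below builds a feature dictionary from a timestamp and three weather readings. The function name `build_features`, the key names, and the units are hypothetical and do not come from the reviewed studies. The sine/cosine encoding of the hour and the weekend flag are added here as common illustrative choices, not as methods reported in this review.

```python
from datetime import datetime
import math


def build_features(timestamp, dry_bulb_c, rel_humidity_pct, solar_w_m2):
    """Assemble one input row from weather readings and a timestamp.

    All names and units are illustrative assumptions.
    """
    hour = timestamp.hour
    day_of_week = timestamp.weekday()  # Monday = 0 ... Sunday = 6
    angle = 2 * math.pi * hour / 24    # position of the hour on a 24 h circle
    return {
        "dry_bulb_c": dry_bulb_c,              # outdoor dry bulb temperature
        "rel_humidity_pct": rel_humidity_pct,  # outdoor relative humidity
        "solar_w_m2": solar_w_m2,              # solar radiation
        "hour_of_day": hour,
        "day_of_week": day_of_week,
        # Assumed extras: cyclical hour encoding keeps 23:00 close to 00:00.
        "hour_sin": math.sin(angle),
        "hour_cos": math.cos(angle),
        "is_weekend": int(day_of_week >= 5),
    }


if __name__ == "__main__":
    row = build_features(datetime(2022, 1, 3, 6, 0), 5.0, 60.0, 0.0)
    print(row)
```

Rows built this way could then be passed to any of the algorithms listed in the table. Building attributes such as the number of floors or glazing area would be added as further keys in the same manner.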

(2) STLF and LTLF serve two different needs in building energy management projects. STLF is crucially important for control goals such as energy system optimization and DR, whereas LTLF is of great interest for long-term energy planning, such as system planning and energy policy formulation.
(3) Simulation tools such as TRNSYS and Dymola are good at modeling energy systems, while EnergyPlus is good at building envelope simulation. In addition, some data-driven models such as XGBoost are preferred for LTLF, while RNN is good at STLF, and ANN and RF can be used for both time spans.
(4) Feature selection is the most popular strategy for a data-driven model. Outdoor dry bulb temperature, outdoor relative humidity, solar radiation, day of week, and hour of day are the five most important and frequently used input features in data-driven models. In addition, the physical information of buildings, such as the number of floors, wall area, and glazing area, can be used to improve the prediction accuracy.

Declaration of competing interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Acknowledgments

This work was funded by the China Postdoctoral Science Foundation (No. 2020M681347) and the National Natural Science Foundation of China (No. 51908006).

Appendix

See Table A.1.

References

Abu-Et-Magd, M.A., Findla, R.D., 2003. A new approach using artificial neural network and time series models for short-term load forecasting. In: Canadian Conference on Electrical and Computer Engineering. Quebec, Canada.
Adrian, C., Weili, X., Song, C., Ngoc-Tri, N., 2019. Continuous-time Bayesian calibration of energy models using BIM and energy data. Energy Build. 194.
Ahmad, T., Chen, H., 2019. Nonlinear autoregressive and random forest approaches to forecasting electricity load for utility energy management systems. Sustain. Cities Soc. 45, 460–473.
Ahmad, M.W., Mourshed, M., Rezgui, Y., 2017. Trees vs neurons: Comparison between random forest and ANN for high-resolution prediction of building energy consumption. Energy Build. 147, 77–89.
Ahmad, T., Zhang, H., Yan, B., 2020. A review on renewable energy and electricity requirement forecasting models for smart grid and buildings. Sustain. Cities Soc. 55, 102052.
Alipour, M., Aghaei, J., Norouzi, M., Niknam, T., Hashemi, S., Lehtonen, M., 2020. A novel electrical net-load forecasting model based on deep neural networks and wavelet transform integration. Energy 205, 118106.
Amasyali, K., El-Gohary, N.M., 2018. A review of data-driven building energy consumption prediction studies. Renew. Sustain. Energy Rev. 81, 1192–1205.
Anon, 2006. On-Site Generation Simulation with EnergyPlus for Commercial Buildings. Microgrids at Berkeley Lab.
Anon, 2021a. Kaggle Competitions. https://www.kaggle.com/competitions.
Anon, 2021b. Open source library for building energy and control systems. Lawrence Berkeley National Laboratory. http://simulationresearch.lbl.gov/modelica/index.html.
Anon, 2021c. The introduction of Modelica language, Modelica association. https://www.modelica.org/modelicalanguage.
Anon, 2021d. A transient system simulation program. http://www.trnsys.com/.
Anon, 2021e. Building Energy Software Tools Directory, The United States Department of Energy. https://www.buildingenergysoftwaretools.com/.
Anon, 2021f. EnergyPlus Documentation - Tips & Tricks for Using EnergyPlus Insider secrets to Using EnergyPlus. https://energyplus.net/sites/default/files/docs/site_v8.3.0/Tips_and_Tricks_Using_EnergyPlus/Tips_and_Tricks_Using_EnergyPlus/index.html#hvac-sizing-equipment-simulation-and-controls.
Anon, 2021g. BIM/GIS and Modelica Framework for building and community energy system design and operation. https://ibpsa.github.io/project1/index.html.
Anon, 2021h. Get Started with XGBoost. https://xgboost.readthedocs.io/en/latest/get_started.html.
Anon, 2021i. Distributed machine learning toolkit - big data, big model, flexibility, efficiency. http://www.dmtk.io/.
Butch, Q., 2020. Machine Learning with Spark: Covers XGBoost, LightGBM, Spark NLP, Distributed Deep Learning with Keras, and more. Springer Science+Business Media New York, New York, NY 10004.
Cai, M., Pipattanasomporn, M., Rahman, S., 2019. Day-ahead building-level load forecasts using deep learning vs. traditional time-series techniques. Appl. Energy 236, 1078–1088.
Carriere, M., Schoenau, G.J., Besant, R.W., 1999. Investigation of some large building energy conservation opportunities using the DOE-2 model. Energy Convers. Manage. 40, 861–872.
Cetin, K.S., Fathollahzadeh, M.H., Kunwar, N., Huyen, D., Tabares-Velasco, P.C., 2019. Development and validation of an HVAC on/off controller in EnergyPlus for energy simulation of residential and small commercial buildings. Energy Build. 183, 467–483.
Chammas, M., Makhoul, A., Demerjian, J., 2019. An efficient data model for energy prediction using wireless sensors. Comput. Electr. Eng. 76, 249–257.
Chargui, R., Sammouda, H., Farhat, A., 2012. Geothermal heat pump in heating mode: Modeling and simulation on TRNSYS. Int. J. Refrig. 35, 1824–1832.
Chen, Z., Chen, Y., Xiao, T., Wang, H., Hou, P., 2021. A novel short-term load forecasting framework based on time-series clustering and early classification algorithm. Energy Build. 251.
Chen, Y., Chen, Z., Xu, P., Li, W., Sha, H., Yang, Z., et al., 2019. Quantification of electricity flexibility in demand response: Office building case study. Energy 188, 116054.
Chen, Y., Tan, H., 2017. Short-term prediction of electric demand in building sector via hybrid support vector regression. Appl. Energy 204, 1363–1374.
Chen, Y., Xu, P., Chu, Y., Li, W., Wu, Y., Ni, L., et al., 2017. Short-term electrical load forecasting using the support vector regression (SVR) model to calculate the demand response baseline for office buildings. Appl. Energy 195, 659–670.
Chen, Y., Xu, P., Gu, J., Schmidt, F., Li, W., 2018. Measures to improve energy demand flexibility in buildings for demand response (DR): A review. Energy Build. 177, 125–139.
Chong, A., Augenbroe, G., Yan, D., 2021. Occupancy data at different spatial resolutions: Building energy performance and model calibration. Appl. Energy 286.
Chu, Y., Xu, P., Li, M., Chen, Z., Chen, Z., Chen, Y., et al., 2020. Short-term metropolitan-scale electric load forecasting based on load decomposition and ensemble algorithms. Energy Build. 225, 110343.
Chuan, L., Shaohui, Z., Yi, Q., Edgar, E., 2020. A systematic review of deep transfer learning for machinery fault diagnosis. Neurocomputing 407, 121–135.
Crawley, D.B., Lawrie, L.K., Winkelmann, F.C., Buhl, W.F., Huang, Y.J., Pedersen, C.O., et al., 2001. Energyplus: creating a new-generation building energy simulation program. Energy Build. 33, 319–331.
Dawn, A., Nam, H.K., Joo-Ho, C., 2015. Practical options for selecting data-driven or physics-based prognostics algorithms with reviews. Reliab. Eng. Syst. Saf. 133.
Ding, Y., Zhang, Q., Yuan, T., Yang, K., 2018. Model input selection for building heating load prediction: A case study for an office building in tianjin. Energy Build. 159, 254–270.
Do, H., Cetin, K.S., 2018. Evaluation of the causes and impact of outliers on residential building energy use prediction using inverse modeling. Build. Environ. 138, 194–206.
Dong, B., Li, Z., Rahman, S.M.M., Vega, R., 2016. A hybrid model approach for forecasting future residential electricity consumption. Energy Build. 117, 341–351.
Duc-Hoc, T., Duc-Long, L., Chou, J., 2020. Nature-inspired metaheuristic ensemble model for forecasting energy consumption in residential buildings. Energy 191, 116552.
Dudek, G., 2015. Short-term load forecasting using random forests. In: Advances in Intelligent Systems and Computing.
Dynamic Modeling Laboratory User Manual Volume 1, Dassault Systèmes A.B., 2017. Dynamic Modeling Laboratory User Manual Volume 1. Dassault Systèmes AB.
Elmqvist, H., 1978. A Structured Model Language for Large Continuous Systems. Lund University.
Fahrmeir, L., Thomas, K., Stefan, L., Brian, M., 2013. Regression Models. Springer, Berlin, Heidelberg.
Fan, C., Sun, Y., Xiao, F., Ma, J., Lee, D., Wang, J., et al., 2020. Statistical investigations of transfer learning-based methodology for short-term building energy predictions. Appl. Energy 262, 114499.
Fan, C., Xiao, F., Wang, S., 2014. Development of prediction models for next-day building energy consumption and peak power demand using data mining techniques. Appl. Energy 127, 1–10.
Fan, C., Xiao, F., Zhao, Y., 2017. A short-term building cooling load prediction method using deep learning algorithms. Appl. Energy 195, 222–233.
Foucquier, A., Robert, S., Suard, F., Stephan, L., Jay, A., 2013. State of the art in building modelling and energy performances prediction: A review. Renew. Sustain. Energy Rev. 23, 272–288.
Fu, Y., Li, Z., Zhang, H., Xu, P., 2015. Using support vector machine to predict next day electricity load of public buildings with sub-metering devices. Procedia Eng.
Fumo, N., Mago, P., Luck, R., 2010. Methodology to estimate building energy consumption using EnergyPlus benchmark models. Energy Build. 42, 2331–2337.
Gao, Y., Ruan, Y., Fang, C., Yin, S., 2020. Deep learning and transfer learning models of energy consumption forecasting for a building with poor information data. Energy Build. 223, 110156.
Gassar, A.A.A., Yun, G.Y., Kim, S., 2019. Data-driven approach to prediction of residential energy consumption at urban scales in London. Energy 187, 115973.
Gaurav, C., Joshua, N., Jibonananda, S., Piljae, I., Zheng, O.N., Vishal, G., 2016. Evaluation of autotune calibration against manual calibration of building energy models. Appl. Energy 182.
Guo, J., Liu, R., Xia, T., Pouramini, S., 2021. Energy model calibration in an office building by an optimization-based method. Energy Rep. 7.
Guo, Y., Wang, J., Chen, H., Li, G., Liu, J., Xu, C., et al., 2018. Machine learning-based thermal response time ahead energy demand prediction for building heating systems. Appl. Energy 221, 16–27.
Hafner, I., Roessler, M., Heinzl, B., Koerner, A., Landsiedl, M., Breitenecker, F., 2014. Investigating communication and step-size behaviour for co-simulation of hybrid physical systems. J. Comput. Sci.-Neth. 5, 427–438.
Hagan, M.T., Behr, S.M., 1989. The time series approach to short-term load forecasting. Trans. Power Syst. PWRS-2.
Harish, V.S.K.V., Kumar, A., 2016. A review on modeling and simulation of building energy systems. Renew. Sustain. Energy Rev. 56, 1272–1292.
Hassid, S., 1985. A linear-model for passive solar calculations-evaluation of performance. Build. Environ. 20, 53–59.
He, Y., Liu, R., Li, H., Wang, S., Lu, X., 2017. Short-term power load probability density forecasting method using kernel-based support vector quantile regression and copula theory. Appl. Energy 185, 254–266.
He, F., Zhou, J., Mo, L., Feng, K., Liu, G., He, Z., 2020. Day-ahead short-term load probability density forecasting method with a decomposition-based quantile regression forest. Appl. Energy 262, 114396.


Hensen, J.L.M., Lamberts, R., 2012. Building Performance Simulation for Design and Operation: Taylor and Francis. CRC Press.
Hochreiter, S., Schmidhuber, J., 1997. Long short-term memory. Neural Comput. 9, 1735–1780.
Hong, T., Fan, S., 2016. Probabilistic electric load forecasting: A tutorial review. Int. J. Forecast. 32, 914–938.
Hong, T., Pinson, P., Fan, S., 2014. Global energy forecasting competition 2012. Int. J. Forecast. 30, 357–363.
Hou, D., Hassan, I.G., Wang, L., 2021. Review on building energy model calibration by Bayesian inference. Renew. Sustain. Energy Rev. 143.
Inard, C., Bouia, H., Dalicieux, P., 1996. Prediction of air temperature distribution in buildings with a zonal model. Energy Build. 24, 125–132.
Jani, D.B., Bhabhor, K., Dadi, M., Doshi, S., Jotaniya, P.V., Ravat, H., et al., 2020. A review on use of TRNSYS as simulation tool in performance prediction of desiccant cooling cycle. J. Therm. Anal. Calorim. 140, 2011–2031.
Ji, Y., 2016. FDD Method of HVAC System in Large Public Buildings with Submetering Electricity Data (in Chinese). Tongji University, Shanghai.
Ji, Y., Xu, P., Duan, P., Lu, X., 2016. Estimating hourly cooling load in commercial buildings using a thermal network model and electricity submetering data. Appl. Energy 169, 309–323.
Jin, R.M., Agrawal, G., 2003. Communication and memory efficient parallel decision tree construction. pp. 119–129.
Ke, G., Meng, Q., Finley, T., Wang, T., Chen, W., Ma, W., et al., 2017. LightGBM: A highly efficient gradient boosting decision tree. In: 31st Conference on Neural Information Processing Systems. Long Beach, CA, USA.
Ke, M., Yeh, C., Jian, J., 2013. Analysis of building energy consumption parameters and energy savings measurement and verification by applying eQUEST software. Energy Build. 61, 100–107.
Kiartzis, S.J., Zoumas, C.E., Theocharis, J.B., Bakirtzis, A.G., Petridis, V., 1997. Short-term load forecasting in an autonomous power system using artificial neural networks. IEEE Trans. Power Syst. 12, 1591–1596.
Kim, E., He, X., Roux, J., Johannes, K., Kuznik, F., 2019. Fast and accurate district heating and cooling energy demand and load calculations using reduced-order modelling. Appl. Energy 238, 963–971.
Kim, J.B., Jeong, W., Clayton, M.J., Haberl, J.S., Yan, W., 2015. Developing a physical BIM library for building thermal energy simulation. Autom. Constr. 50, 16–28.
Kumar, S., Pal, S.K., Singh, R.P., 2018a. Intra ELM variants ensemble based model to predict energy performance in residential buildings. Sustain. Energy Grids Netw. 16, 177–187.
Kumar, S., Pal, S.K., Singh, R.P., 2018b. A novel method based on extreme learning machine to predict heating and cooling load through design and structural attributes. Energy Build. 176, 275–286.
Kusiak, A., Li, M., Zhang, Z., 2010. A data-driven approach for steam load prediction in buildings. Appl. Energy 87, 925–933.
Lahouar, A., Slama, J.B.H., 2015. Day-ahead load forecast using random forest and expert input selection. Energy Convers. Manage. 103, 1040–1051.
Leung, M.C., Tse, N.C.F., Lai, L.L., Chow, T.T., 2012. The use of occupancy space electrical power demand in building cooling load prediction. Energy Build. 55, 151–163.
Li, Q., Meng, Q., Cai, J., Yoshino, H., Mochida, A., 2009. Applying support vector machine to predict hourly cooling load in the building. Appl. Energy 86, 2249–2256.
Li, Y., O’Neill, Z., Zhang, L., Chen, J., Im, P., DeGraw, J., 2021. Grey-box modeling and application for building energy simulations - A critical review. Renew. Sustain. Energy Rev. 146.
Liu, C., Sun, B., Zhang, C., Li, F., 2020. A hybrid prediction model for residential electricity consumption using holt-winters and extreme learning machine. Appl. Energy 275.
Lu, H., Cheng, F., Ma, X., Hu, G., 2020. Short-term prediction of building energy consumption employing an improved extreme gradient boosting model: A case study of an intake tower. Energy 203, 117756.
Lu, Y., Meng, L., 2020. A simplified prediction model for energy use of air conditioner in residential buildings based on monitoring data from the cloud platform. Sustain. Cities Soc. 60, 102194.
Luo, X.J., Lukumon, O.O., Anuoluwapo, O.A., Olugbenga, O.A., Hakeem, A.O., Ashraf, A., 2020. Feature extraction and genetic algorithm enhanced adaptive deep neural network for energy consumption prediction in buildings. Renew. Sustain. Energy Rev. 131, 109980.
Luo, X.J., Oyedele, L.O., Ajayi, A.O., Monyei, C.G., Akinade, O.O., Akanbi, L.A., 2019. Development of an IoT-based big data platform for day-ahead prediction of building heating and cooling demands. Adv. Eng. Inform. 41, 100926.
Magnier, L., Haghighat, F., 2010. Multiobjective optimization of building design using TRNSYS simulations, genetic algorithm, and artificial neural network. Build. Environ. 45, 739–746.
Mandal, P., Senjyu, T., Funabashi, T., 2006. Neural networks approach to forecast several hour ahead electricity prices and loads in deregulated market. Energy Convers. Manage. 47, 2128–2142.
Mazzeo, D., Matera, N., Cornaro, C., Oliveti, G., Romagnoni, P., De Santoli, L., 2020. Energyplus, IDA ice and TRNSYS predictive simulation accuracy for building thermal behaviour evaluation by using an experimental campaign in solar test boxes with and without a PCM module. Energy Build. 212, 109812.
Mena, R., Rodriguez, F., Castilla, M., Arahal, M.R., 2014. A prediction model based on neural networks for the energy consumption of a bioclimatic building. Energy Build. 82, 142–155.
Mohammad, H.S., Usman, A., Elen, I., James, O., 2020. A framework for uncertainty quantification in building heat demand simulations using reduced-order grey-box energy models. Appl. Energy 275, 115141.
Moon, J., Kim, Y., Son, M., Hwang, E., 2018. Hybrid short-term load forecasting scheme using random forest and multilayer perceptron. Energies 11, 328312.
Mun, S., Kwak, Y., Huh, J., 2019. A case-centered behavior analysis and operation prediction of AC use in residential buildings. Energy Build. 188, 137–148.
Murphy, M.D., O’Sullivan, P.D., Da Graça, G.C., O’Donovan, A., 2021. Development, calibration and validation of an internal air temperature model for a naturally ventilated nearly zero energy building: Comparison of model types and calibration methods. Energies 14.
Nageler, P., Schweiger, G., Pichler, M., Brandl, D., Mach, T., Heimrath, R., et al., 2018. Validation of dynamic building energy simulation tools based on a real test-box with thermally activated building systems (TABS). Energy Build. 168, 42–55.
Neto, A.H., Fiorelli, F.A.S., 2008. Comparison between detailed model simulation and artificial neural network for forecasting building energy consumption. Energy Build. 40, 2169–2176.
Nilsson, N.J., 1965. Learning Machines: Foundations of Trainable Pattern-Classifying Systems. McGraw-Hill, New York.
Nouidui, T.S., Phalak, K., Wangda, Z., Wetter, M., 2012. Validation and application of the room model of the modelica buildings library. In: Proceedings of the 9th International Modelica Conference. Munich, Germany.
Qian, F., Gao, W., Yang, Y., Yu, D., 2020. Potential analysis of the transfer learning model in short and medium-term forecasting of building HVAC energy consumption. Energy 193, 315–324.
Quesada, B., Sanchez, C., Canada, J., Royo, R., Paya, J., 2011. Experimental results and simulation with TRNSYS of a 7.2 kWp grid-connected photovoltaic system. Appl. Energy 88, 1772–1783.
Ribeiro, M., Grolinger, K., ElYamany, H.F., Higashino, W.A., Capretz, M.A.M., 2018. Transfer learning with seasonal and trend adjustment for cross-building energy forecasting. Energy Build. 165, 352–363.
Sala-Cardoso, E., Delgado-Prieto, M., Kampouropoulos, K., Romeral, L., 2018. Activity-aware HVAC power demand forecasting. Energy Build. 170, 15–24.
Salkuti, S.R., 2019. Day-ahead thermal and renewable power generation scheduling considering uncertainty. Renew. Energy 131, 956–965.
Seyedzadeh, S., Rahimian, F.P., Rastogi, P., Glesk, I., 2019. Tuning machine learning models for prediction of building energy loads. Sustain. Cities Soc. 47, 101484.
Siddharth, V., Ramakrishna, P.V., Geetha, T., Sivasubramaniam, A., 2011. Automatic generation of energy conservation measures in buildings using genetic algorithms. Energy Build. 43, 2718–2726.
Somu, N., Raman, G.M.R., Ramamritham, K., 2020. A hybrid model for building energy consumption forecasting using long short term memory networks. Appl. Energy 261, 114131.
Son, H., Kim, C., 2015. Forecasting short-term electricity demand in residential sector based on support vector regression and fuzzy-rough feature selection with particle swarm optimization. 118, pp. 1162–1168.
Sun, Y., Haghighat, F., Fung, B.C.M., 2020. A review of the state-of-the-art in data-driven approaches for building energy prediction. Energy Build. 221, 110022.
Swati, Z.N.K., Zhao, Q., Kabir, M., Ali, F., Ali, Z., Ahmed, S., et al., 2019. Brain tumor classification for MR images using transfer learning and fine-tuning. Comput. Med. Imaging Graph. 75, 34–46.
Trcka, M., Hensen, J.L.M., 2010. Overview of HVAC system simulation. Autom. Constr. 19, 93–99.
Tsanas, A., Xifara, A., 2012. Accurate quantitative estimation of energy performance of residential buildings using statistical machine learning tools. Energy Build. 49, 560–567.
Tuhus-Dubrow, D., Krarti, M., 2010. Genetic-algorithm based approach to optimize building envelope design for residential buildings. Build. Environ. 45, 1574–1581.
U.S. Department of Energy, Energy Efficiency and Renewable Energy Office, Building Technology Program. https://energyplus.net/downloads.
Vapnik, V., 2013. The Nature of Statistical Learning Theory. Springer, New York.
Violidakis, I., Zeneli, M., Atsonios, K., Strotos, G., Nikolopoulos, N., Karellas, S., 2020. Dynamic modelling of an ultra high temperature PCM with combined heat and electricity production for application at residential buildings. Energy Build. 222, 110067.
Vrablecova, P., Ezzeddine, A.B., Rozinajova, V., Sarik, S., Sangaiah, A.K., 2018. Smart grid load forecasting using online support vector regression. Comput. Electr. Eng. 65, 102–117.
Wang, C., Grozev, G., Seo, S., 2012. Decomposition and statistical analysis for regional electricity demand forecasting. Energy 41, 313–325.


Wang, Z., Hong, T., Piette, M.A., 2019a. Predicting plug loads with occupant count data through a deep learning approach. Energy 181, 29–42.
Wang, Z., Hong, T., Piette, M.A., 2020a. Building thermal load prediction through shallow machine learning and deep learning. Appl. Energy 263, 114683.
Wang, S., Liu, X., Gates, S., 2015. An introduction of new features for conventional and hybrid GSHP simulations in eQUEST 3.7. Energy Build. 105, 368–376.
Wang, R., Lu, S., Feng, W., 2020b. A novel improved model for building energy consumption prediction based on model integration. Appl. Energy 262, 114561.
Wang, R., Lu, S., Li, Q., 2019b. Multi-criteria comprehensive study on predictive algorithm of hourly heating energy consumption for residential buildings. Sustain. Cities Soc. 49, 101623.
Wang, Z., Srinivasan, R.S., 2017a. A review of artificial intelligence based building energy use prediction: Contrasting the capabilities of single and ensemble prediction models. Renew. Sustain. Energy Rev. 75, 796–808.
Wang, Z., Srinivasan, R.S., 2017b. A review of artificial intelligence based building energy use prediction: Contrasting the capabilities of single and ensemble prediction models. Renew. Sustain. Energy Rev. 75, 796–808.
Wei, L., Tian, W., Zup, J., Yang, Z., Liu, Y., Yang, S., 2016. Effects of building form on energy use for buildings in Cold Climate Regions. 146, pp. 181–188.
Wei, Y., Xia, L., Pan, S., Wu, J., Zhang, X., Han, M., et al., 2019. Prediction of occupancy level and energy consumption in office building using blind system identification and neural networks. Appl. Energy 240, 276–294.
Westphal, F.S., Lamberts, R., 2005. Building simulation calibration using sensitivity analysis. In: Proceedings of the 9th International IBPSA Conference. Montreal, Canada.
Wetter, M., Christoph, H., 2006. Modelica versus TRNSYS–a comparison between an equation-based and a procedural modeling language for building energy simulation. In: Proceedings of the 2nd SimBuild Conference. Cambridge, MA, USA.
Winkelmann, F.C., Selkowitz, S., 1985. Daylighting simulation in the DOE-2 building energy analysis program. Energy Build. 8, 271–286.
Wu, J., Zhao, Z., Sun, C., Yan, R., Chen, X., 2020. Few-shot transfer learning for intelligent fault diagnosis of machine. Measurement.
Xing, J., Ren, P., Ling, J., 2015. Analysis of energy efficiency retrofit scheme for hotel buildings using equest software: A case study from tianjin, China. Energy Build. 87, 14–24.
Xu, X., Taylor, J.E., Pisello, A.L., Culligan, P.J., 2012. The impact of place-based affiliation networks on energy conservation: An holistic model that integrates the influence of buildings, residents and the neighborhood context. Energy Build. 55, 637–646.
Yang, Y., Che, J., Deng, C., Li, L., 2019. Sequential grid approach based support vector regression for short-term electric load forecasting. Appl. Energy 238, 1010–1021.
Zhai, Z.J., Johnson, M., Krarti, M., 2011. Assessment of natural and hybrid ventilation models in whole-building energy simulations. Energy Build. 43, 2251–2261.
Zhang, G., Tian, C., Li, C., Zhang, J.J., Zuo, W., 2020. Accurate forecasting of building energy consumption via a novel ensembled deep learning method considering the cyclic feature. Energy 201, 117531.
Zhang, L., Wen, J., 2019a. A systematic feature selection procedure for short-term data-driven building energy forecasting model development. Energy Build. 183, 428–442.
Zhang, L., Wen, J., 2019b. A systematic feature selection procedure for short-term data-driven building energy forecasting model development. Energy Build. 183, 428–442.
Zhe, W., Tianzhen, H., Mary, A.P., 2020. Building thermal load prediction through shallow machine learning and deep learning. Appl. Energy 263.
Zhou, Y., Zheng, S., 2020. Machine-learning based hybrid demand-side controller for high-rise office buildings with high energy flexibilities. Appl. Energy 262, 114416.
