Data Driven Modeling For Energy Consumption Prediction in Smart Buildings
Abstract: Energy efficiency is in the interest of everyone, from individuals to governments, since it yields economical savings, reduces greenhouse gas emissions and alleviates energy poverty. Buildings are one of the largest consumers of primary energy and attaining their efficiency is, therefore, an important goal.

The Internet of Things currently provides vast amounts of data that can be used to extract knowledge of all kinds, including that regarding energy prediction.

This has motivated us to test whether the prior information on the physics of building heat transfer that is currently available is now redundant owing to the completeness of the data from the system.

We propose a machine learning approach and a grey-box model approach with which to test this hypothesis. The former is blind to the physics of the problem, while the latter is greatly influenced by it. The energy consumption prediction models were created with both approaches and then used to estimate energy consumption in a normal operation state and compare it with energy consumption when an energy efficiency campaign is run.

Our black-box method, which is based on a combination of statistical and machine learning models and on a time series structurization of the data, shows better prediction accuracy than the so-called grey-box methods that include basic physical equations. This shows that, in this field as in others, a data-driven approach can outperform more informed methods.

Keywords: data-driven models, black-box models, grey-box models, smart buildings, data analytics

I. INTRODUCTION

The energy consumed in buildings in developed countries comprises 20-40% of their total energy use, and it is above that of industry and transport in the EU and US [1], [2].

In order to mitigate climate change, the reduction of energy use together with the use of non-fossil energy sources such as the sun and wind is crucial. Furthermore, reducing energy consumption in buildings has to be done while always maintaining the levels necessary to ensure the comfort of the buildings' users, and while lowering costs in order not to increase fuel poverty. It would appear that the new sensorization of buildings may be one option by which to resolve these issues.

With regard to energy saving, energy management should be understood as the process of monitoring, controlling, and conserving energy when buildings are in normal use. The prediction of energy use in buildings is a powerful piece of information that is fundamental in concerns such as micro-grids, energy storage, demand analysis or energy feedback. New energy feedback systems involve the following steps:
• Metering and collecting energy consumption data;
• Analyzing the data in order to propose means of saving energy and then putting them into practice, and
• Tracking consumption in order to quantify the gains that will be obtained as a result of the proposed activity.

The series of methods and processes used to confront the third step, which is that of assessing the performance of energy efficiency interventions by quantifying the gains in efficiency, are commonly denoted as Evaluation, Measurement, and Verification (EM&V).

The evaluation of energy consumption against pre-intervention scenarios is currently in a state of development, and simple regression models, such as those proposed in the ASHRAE Guideline [3], may not be sufficient at present. Several researchers are, therefore, focusing on the development of EM&V methods with less naive perspectives [4].

Regression models typically need to be adjusted in an ad hoc manner in order to capture non-linear behavior, which arises from complex (physical) multi-variate interactions between ambient conditions, occupancy, building operating conditions and so forth [5].

Regression has always been the standard approach used to model the relationship between one outcome variable and several input variables, and this can be seen both from a white-box and a black-box point of view. This means that
demonstrated for the first time that resistor-capacitor (RC) networks can accurately represent the thermodynamics of buildings [20]. Since then, RC-networks have been used for this purpose. In the early years of building dynamic simulation, this was one of the few ways of representing the thermodynamics of buildings, but even today programs such as EnergyPlus include thermal networks in their codes [19].

In addition to building simulation, the grey-box modeling of buildings using RC-networks has, for the last two decades, also been used for Model Predictive Control (MPC). MPC has been used to govern the heating and cooling systems of normally large buildings in such a way that the controller can anticipate the needs of the building via the previous estimation of its thermodynamic features (which normally translate into response times and the conductivity of the thermal envelope) [21].

This has motivated research into ideal model topologies and methodologies for these models so as to ensure that they accurately represent the responses of buildings [22]. However, other works are now focusing on how these methods can be used for the characterization of the thermal envelope [23].

In those cases in which limited amounts of data are available and the information concerning the building architecture is partially known, grey models are suitable alternatives for the prediction of energy consumption [24].

III. METHODOLOGY

In this section we introduce a black-box model and a grey-box model with which to estimate the energy consumption of a building.

A. Inputs

Energy forecasting studies that use machine learning are usually intended to predict consumption a priori in order to manage and store a suitable amount of energy, taking into consideration the market prices and also the needs of the buildings. However, our approach is different in the sense that our goal is to quantify the energy savings relative to a baseline period as a result of certain experiments related to efficiency improvement.

This translates into a difference in the inputs that are available for the prediction. In other scenarios, data concerning energy consumption in previous hours and days is highly useful because it is strongly correlated with the later consumption [25]. However, we cannot use such data because, when this algorithm is applied to a building in which an intervention for energy reduction has taken place, the consumption is likely to be altered by the experiment.

Furthermore, in other scenarios, environmental data are not yet available (because they concern the future) and predictions have to be used. When applying predictions to EM&V, environmental and occupation variables are usually available and it is not necessary to predict them. The most commonly used weather information is outdoor dry-bulb air temperature and solar irradiance, but in our case temperature was seen to be sufficient.

B. Proposed models

Our interest lies in the weekly quantification of energy use. However, daily dynamics are useful since there are patterns that can be found depending on the day of the week.

Our model predicts daily energy consumption and then computes the metrics in an aggregated manner, signifying that the global quantification takes place on a weekly basis.

This data is then used to test three approaches:

1) Black-box: Daily aggregate consumption is used as output and we attempt to capture the relationship between daily average temperature and consumption by relating the time series composed of the hourly means of the temperature to the daily consumption.

We then use several machine learning models in order to assess which is the best one for our scenarios. The models are generated by following the steps shown below [26]:
• Clean and transform the data: select predictive variables such as temperature and day type, deleting outliers;
• Aggregate: compute daily consumption, create the time series with the input variables and represent the series in a lower dimension. That is, apply an hourly average or other representation and feature selection methods in order to serve as inputs for our models;
• Divide the dataset into train (75%) and test (25%);
• Validation method: 10-fold cross validation and 5 repetitions over the training data set in order to find the hyperparameters of each machine learning algorithm;
• Evaluate: apply the algorithm to the test dataset in order to obtain the metric for the model fitting.

To the best of our knowledge, our method differs from existing methods as regards the way in which the data is introduced. We relate the input as a time series with a single output. This has been previously done in scenarios in which consumption could be used as a predictor [27], i.e., in which the energy consumption is not altered by any intervention and the goal is simply to make a prediction of the energy a priori, with storage and production intentions. However, this is not possible for EM&V, as is our case, resulting in a new application of the method.

2) Grey-box: In order to make use of these models, the set of outputs and inputs has to be defined together with the topology of the system. The most common mathematical representation of lumped parameter models is the state-space representation. The general form for time-invariant models can be written as shown in Eq. 1

\[ \dot{x}(t) = A x(t) + B u(t), \qquad y(t) = C x(t) + D u(t) \tag{1} \]
where x is a vector concerning the states of the model, in our case the temperatures, \dot{x} is the derivative (rate of change) of the states, A is a characteristic matrix of the model, B defines the effect of the inputs in the model, and u are the inputs, in our case the outside temperature and electrical gains. In this formulation, y represents the variables that are measured, in our case electricity, C is the identity matrix, and D is zero in all cases for this work. Using this formulation, every time a solution has to be evaluated, the built-in GNU Octave function lsim was used.

Figure 1: Dual-mode RC network

The conditioning system is governed in our case by a thermostat with a timer that turns it on and off. For this reason, the RC-network that represents the building needs to change topology depending on the operation (on or off). We have considered a dual-mode RC-network such as the one shown in Fig. 1, previously introduced by Ramallo-González in [4]. These grey-box models have been largely used in the past for building energy simulation. The reader is referred to [22], [21] and [4] for quantitative evidence of the accuracy of this kind of model.

C. Black-box baseline models

Our proposed black-box models are compared with two baseline approaches documented in literature. The first is a regression-based electricity load model and the second is a purely machine learning model that uses the inputs only at the same timestamp as the output.

1) Time-of-week-and-temperature (TWT): This algorithm was introduced in [28]. We have chosen it because of its high accuracy, low complexity and low computational cost when compared with the results of other state-of-the-art methods used in a wide number of buildings [29].

In this work the predicted load is the sum of two terms: a "time of week effect" that allows each time of the week to have a different predicted load from the others, and a piecewise-continuous effect of temperature.

The central point of this algorithm is the creation of several coefficients that will be estimated using multiple ordinary least squares regression. There will be a coefficient for every "time of the week". A time of the week is a unique instant of a week (Monday 1am, Monday 2am, and so on). For example, there will be 24 × 5 = 120 (24 hours and 5 working days) coefficients in the case of hourly predictions or 4 × 24 × 5 = 480 in the case of 15-minute predictions.

In addition, some coefficients related to the temperature also need to be calculated. Basically, the range of outside temperature data is divided into 6 levels, each of which has a coefficient assigned to it.

2) Gaussian: A Gaussian process (GP) modelling framework for determining energy savings and uncertainty levels in evaluation, measurement and verification (EM&V) practice is presented in [5]. In that work, the authors compare a black-box approach to the regular linear regression techniques that are widely explored in literature and state that GP models can capture both complex nonlinear and multivariable interactions and the multi-resolution trends of energy behavior thanks to the Bayesian approach under which they are developed.

D. Model Accuracy Metrics

This work assesses model accuracy using two metrics: the mean absolute percentage error (MAPE) and the coefficient of variation of the root mean squared error (CVRMSE).

The MAPE metric has been used in a wide number of electricity prediction studies [30], [31]. It expresses the average absolute error as a percentage and is calculated as follows:

\[ \mathrm{MAPE} = \frac{1}{n} \sum_{i=1}^{n} \left| \frac{y_i - \hat{y}_i}{y_i} \right| \times 100, \]

where y_i is the real consumption, \hat{y}_i is the predicted consumption and n is the number of observations.

The CVRMSE has often been used in energy prediction studies [32] and is also used here. It evaluates how much the error varies with respect to the actual consumption mean and is calculated as follows:

\[ \mathrm{CVRMSE} = \frac{\sqrt{\frac{1}{n-1} \sum_{i=1}^{n} (y_i - \hat{y}_i)^2}}{\bar{y}} \times 100, \]

where \bar{y} is the mean of the actual consumption.

E. Savings Metrics

The IPMVP [13] and ASHRAE Guideline 14 [2] provide three methods with which to determine energy savings and uncertainty levels from energy efficiency measures. That which is most suitable for our approach is whole-building metering, since it compares the total energy demand or cost during the pre-experiment and post-experiment periods.

The output variable of the predictive baseline models is the metered pre-experiment energy use, energy_pre, and the corresponding environmental conditions, inputs_pre, are employed as the model inputs. The error in reported savings is proportional to the error in the baseline model forecasts.
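As a hedged illustration of whole-building savings quantification, the sketch below assumes that savings are computed as the baseline model's forecast for the experiment period minus the energy metered in that period. This formula, the function names and the sample values are illustrative assumptions and are not taken from the study.

```python
from typing import Sequence


def avoided_energy(baseline_forecast: Sequence[float],
                   metered: Sequence[float]) -> float:
    """Assumed savings definition: forecast of normal operation minus metered use."""
    if len(baseline_forecast) != len(metered):
        raise ValueError("forecast and metered series must have equal length")
    return sum(baseline_forecast) - sum(metered)


def savings_percent(baseline_forecast: Sequence[float],
                    metered: Sequence[float]) -> float:
    """Avoided energy expressed as a percentage of the forecast baseline."""
    total_forecast = sum(baseline_forecast)
    if total_forecast == 0:
        raise ValueError("baseline forecast must not sum to zero")
    return 100.0 * avoided_energy(baseline_forecast, metered) / total_forecast


# Hypothetical daily values in Wh for five days of an experiment week.
forecast_post = [2600.0, 2550.0, 2500.0, 2700.0, 2300.0]
metered_post = [2400.0, 2450.0, 2300.0, 2500.0, 2200.0]

saved_wh = avoided_energy(forecast_post, metered_post)
saved_pct = savings_percent(forecast_post, metered_post)
print(f"avoided energy: {saved_wh:.0f} Wh ({saved_pct:.2f} % of baseline)")
```

Under this assumed definition, any bias in the baseline forecast carries over directly into the reported savings, which is why the accuracy of the baseline model matters.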
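The two accuracy metrics of Section III-D can be sketched in a few lines of Python. This is a minimal sketch rather than the code used in the study: the function names, the sample values and the assumption that weeks are consecutive blocks of seven daily values are illustrative.

```python
import math
from typing import List, Sequence


def mape(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Mean absolute percentage error, in percent."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("series must be non-empty and of equal length")
    n = len(actual)
    return 100.0 / n * sum(abs((y - y_hat) / y)
                           for y, y_hat in zip(actual, predicted))


def cvrmse(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Coefficient of variation of the RMSE (n - 1 denominator), in percent."""
    if len(actual) != len(predicted) or len(actual) < 2:
        raise ValueError("need two equal-length series with at least 2 points")
    n = len(actual)
    mean_actual = sum(actual) / n
    squared_error = sum((y - y_hat) ** 2 for y, y_hat in zip(actual, predicted))
    return 100.0 * math.sqrt(squared_error / (n - 1)) / mean_actual


def weekly_totals(daily: Sequence[float], days_per_week: int = 7) -> List[float]:
    """Assumption: a week is a consecutive block of `days_per_week` daily values."""
    return [sum(daily[i:i + days_per_week])
            for i in range(0, len(daily), days_per_week)]


# Hypothetical consumption values in Wh.
real = [100.0, 200.0, 300.0]
pred = [110.0, 190.0, 300.0]
print(f"MAPE = {mape(real, pred):.2f} %, CVRMSE = {cvrmse(real, pred):.2f} %")
```

To obtain weekly metrics under the same assumption, both the real and the predicted daily series would first be passed through `weekly_totals` and the two functions applied to the aggregated values.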
IV. USE CASE

The reference building in which the proposed procedure has been carried out in order to generate accurate building models is the Chemistry Faculty of the University of Murcia, which is a building used as a pilot for the H2020 ENTROPY project [footnote 1].

1 http://entropy-project.eu/

The data used to build and train our baseline corresponds to 1 year's worth of data, from February 2016 to February 2017.

Figure 2: Daily temperature time series input and consumption output. Values in watt-hours.

A. Black-box approach

Our black-box methodology is highly versatile with respect to the input data, i.e., it allows the addition of variables with minimal effort. We create the method in a constructive manner by relating the 24 temperature values of each day with the energy consumption of the building (see Fig. 2).

Having introduced the daily temperature time series, we considered that the addition of a categorical variable indicating the season was irrelevant. The subject building has several features that are typical of educational buildings: the load on weekends is substantially lower than that of weekdays and there are also differences among week days (mainly Fridays). In these terms, we used analysis of variance (ANOVA) in order to determine whether there were differences between the consumption on the different days of the week (p-value = 0.001 < 0.05). After carrying out a post-hoc test we concluded that Fridays could be considered to behave differently to the other days of the week, which could be owing to lower occupation. Having attained this knowledge, we considered it necessary to add a dichotomous variable that indicates the kind of day of the week. Weekend and holiday consumption is estimated using the mean of previous weekends and holidays.

The algorithms that were found to be relevant for use within our black-box methodology are: Support Vector Regression (SVR), Regression Forest (RF) and Extreme Gradient Boosting (XGB).

SVR works in a similar fashion to Support Vector Machines (SVM). Whereas SVM is a classification technique, SVR fits the curve from which the training data deviate by no more than a small margin. More specifically, during classification the samples that are close to the margin are penalized even if they are correctly classified, whereas in the regression method an acceptable deviation margin of the samples from the prediction curve is set. The free hyperparameter of this model is C, the penalty parameter of the error term. C is the weight of how much the samples inside the margin contribute to the overall error.

Regression Forest is a type of ensemble learning method, in which a group of weak models combine to form a more powerful model. In Regression Forest, multiple regression trees are grown. In order to predict a new observation, each tree provides its own prediction, after which the average of the outputs is taken. The algorithm works by growing a tree from each random sample, drawn with replacement, of the training data. For each node, mtry variables are selected at random out of the number of inputs. The best split among these mtry variables is then used to split the node. The hyperparameter of Regression Forest that we will search for is the number of random variables that are taken into account in every split: mtry.

XGB is built on the principles of gradient boosting and is designed for speed and performance (from which the term "extreme" originates). Gradient Boosted Regression is a technique that generates a prediction by means of an ensemble of weak prediction models that, in our case, are decision trees. The concept is to sequentially build the model by fitting a weak prediction model on the weighted training data set, in which the higher weights are assigned to samples that were previously difficult to predict.

The free hyperparameters that are adjusted in this model are the maximum depth limit of the number of nodes in the tree, the minimum number of samples required to split an internal node and the learning rate by which the contribution of each tree is shrunk.

For the sake of comparison, we also employ the Gaussian Process to model the instantaneous consumption prediction using the current input values (mean daily temperature for daily consumption or mean hourly temperature for hourly consumption).

B. Grey-box approach

In the case of our grey-box methodology, a topology was selected that could adequately represent the building.

Once the system had been defined, an optimization algorithm was used to find the values that minimize the RMSE (10-minute intervals) of the simulated power consumption. To ensure that the data was used appropriately, the total electricity consumption was separated into an un-seasonal component and a seasonal one. The un-seasonal component was used as electric loads and the seasonal component was considered to be the heating and cooling load. The building is equipped with a boiler and radiator network that contribute to some of the heating loads. In order to take this effect into account, the preliminary estimation was made
using the summer period, from which the portion of heating provided by radiators was extracted. This ratio was used in the subsequent calculations.

The optimization method employed to find the parameters of the model was the simplex method. The termination criterion was to attain a change in the solution that was smaller than 0.01 in all the parameters.

C. Results

The prediction metrics are summarized in Table I. The first three methods (SVR, RF and XGB) belong to the group of black-box models and are blind to the physics of the problem. We also have the TWT, the GP and the grey-box model, the last of which contains information regarding the physical phenomenon in the model topology. As can be seen, the black-box models return the best results when compared both to the Gaussian method, which is applied in a more traditional manner (i.e., by relating the instantaneous consumption measurement with the instantaneous input measurements), and to our grey-box model approach.

Of the three black-box methods, Random Forest is that which stands out, since it attained a CVRMSE of 9% and 5% and a MAPE of 6% and 4.5% for the daily and weekly predictions respectively. We have plotted the predicted vs. the real daily consumption (measured in watt-hours) in Fig. 3 and the weekly consumption in Fig. 4.

Table I: Metrics (%)
Period  Metric   SVR   RF    XGB   TWT   Gauss  Grey
Daily   CVRMSE   12.4  9     11    14.9  17.45  33.57
Daily   MAPE     7.2   6     7.3   12.3  15.01  43.02
Weekly  CVRMSE   6.4   5     6.2   11.1  16.3   19.53
Weekly  MAPE     5.2   4.5   5.5   9.4   12.3   15.48

V. DISCUSSION

As mentioned previously, the objective of this paper is to evaluate whether the fact that the RC-networks in the grey-box inverse modeling contain basic information concerning the physical system (i.e. what the thermodynamics of the building should be like) has any advantages for the prediction of energy consumption for a smart building with measurement and verification purposes. They were, therefore, compared to our proposed state-of-the-art black-box method in which no prior high level information about the system is included (i.e. the models are blind to the physics of the problem) and which combines statistical analysis and machine learning techniques.

We have shown that by using the daily temperature time series we are able to capture people's behaviour better than if instantaneous values are used to predict consumption (Gaussian) or if we use a barrage of coefficients in linear regression to model each small part of the day (TWT).

This result could, in some respects, have been anticipated, since energy consumption is greatly dominated by human behavior. The physics of the building can capture the response of passive elements such as walls and windows, but behavioral patterns are out of its scope.

One drawback of baseline data-driven models is that they ignore the correlation between timestamps. TWT creates artificial features in order to model each moment independently. We understand that the input data are time series and we indirectly consider all the interactions that the building and temperature have as regards consumption. Moreover, the application of regression techniques has to be preceded by the assessment of the assumptions that have to be fulfilled: normality, little multi-collinearity, no autocorrelation and homoscedasticity. However, black-box models do not require such assumptions.

We have, therefore, created a methodology with which the combination of time series and current state variables is possible and that provides a wide variety of techniques related to time series that can be used to improve the results of the analysis.

VI. CONCLUSIONS AND FUTURE WORK

After applying all the baseline models, the grey-box approach and the black-box approach, we have seen that black-box models outperform the rest.

One future avenue of research consists of applying the same models to several buildings that share some characteristics. That is, they belong to the same environment: different faculties of a university (where students and professors behave similarly in all the buildings), several houses of a neighborhood or different commercial buildings in the same city.

Reducing the total time required for EM&V is the key to scaling the deployment of energy efficiency projects in general, and reducing overall costs [33]. It is for this reason that a transfer learning approach should be considered in future studies in order to reduce the quantity of data that needs to be collected to create a reasonable building model.

Moreover, a broader study on the means employed to introduce the time series into the algorithms should be carried out. This implies applying time series segmentation and representation techniques in order to discover more representative means of introducing the data, or also considering feature selection techniques.

Overall, we believe that the work reported in this paper has explored yet another discipline in which data-driven methods are outperforming traditional methods based on physics. The results once more demonstrate that data science can provide answers that are as good as or better than physically informed approaches.

ACKNOWLEDGEMENTS

This work has been partially funded by MINECO TIN2014-52099-R project (grant BES-2015-071956) and ERDF funds, by the European Commission through the
[Figure 3: real vs. predicted daily consumption (Wh) against datestamp, 201601 to 201702; the summer months are marked "SUMMER".]
[Figure 4: real vs. predicted weekly consumption against week number.]
[12] "Thermal energy system specialists," http://www.trnsys.com, accessed: 2017-09-30.
[13] D. B. Crawley, J. W. Hand, M. Kummert, and B. T. Griffith, "Contrasting the capabilities of building energy performance simulation programs," Building and Environment, vol. 43, no. 4, pp. 661–673, 2008.
[14] C. M. Bishop, Pattern recognition and machine learning. Springer, 2006.
[15] L. Breiman et al., "Statistical modeling: The two cultures (with comments and a rejoinder by the author)," Statistical Science, vol. 16, no. 3, pp. 199–231, 2001.
[16] M. Braun, H. Altan, and S. Beck, "Using regression analysis to predict the future energy consumption of a supermarket in the UK," Applied Energy, vol. 130, pp. 305–313, 2014.
[17] A. Ahmad, M. Hassan, M. Abdullah, H. Rahman, F. Hussin, H. Abdullah, and R. Saidur, "A review on applications of ANN and SVM for building electrical energy consumption forecasting," Renewable and Sustainable Energy Reviews, vol. 33, pp. 102–109, 2014.
[18] L. Ciabattoni, M. Grisostomi, G. Ippoliti, and S. Longhi, "Fuzzy logic home energy consumption modeling for residential photovoltaic plant sizing in the new Italian scenario," Energy, vol. 74, pp. 359–367, 2014.
[19] A. F. Handbook, "American society of heating, refrigerating and air-conditioning engineers," Inc.: Atlanta, GA, USA, 2017.
[20] G. Burnand, "The study of the thermal behaviour of structures by electrical analogy," British Journal of Applied Physics, vol. 3, no. 2, p. 50, 1952.
[27] Z. Ruijin, P. Yiqun, H. Zhizhong, and W. Qiujian, "Building energy use prediction using time series analysis," in Service-Oriented Computing and Applications (SOCA), 2013 IEEE 6th International Conference on. IEEE, 2013, pp. 309–313.
[28] J. L. Mathieu, P. N. Price, S. Kiliccote, and M. A. Piette, "Quantifying changes in building electricity use, with application to demand response," IEEE Transactions on Smart Grid, vol. 2, no. 3, pp. 507–518, 2011.
[29] J. Granderson, S. Touzani, C. Custodio, M. D. Sohn, D. Jump, and S. Fernandes, "Accuracy of automated measurement and verification (M&V) techniques for energy savings in commercial buildings," Applied Energy, vol. 173, pp. 296–308, 2016.
[30] C. Fan, F. Xiao, and S. Wang, "Development of prediction models for next-day building energy consumption and peak power demand using data mining techniques," Applied Energy, vol. 127, pp. 1–10, 2014.
[31] R. E. Edwards, J. New, and L. E. Parker, "Predicting future hourly residential electrical consumption: A machine learning case study," Energy and Buildings, vol. 49, pp. 591–603, 2012.
[32] F. L. Quilumba, W.-J. Lee, H. Huang, D. Y. Wang, and R. L. Szabados, "Using smart meter data to improve the accuracy of intraday load forecasting considering customer behavior similarities," IEEE Transactions on Smart Grid, vol. 6, no. 2, pp. 911–918, 2015.
[33] J. Granderson, P. N. Price, D. Jump, N. Addy, and M. D. Sohn, "Automated measurement and verification: Performance of public domain whole-building electric baseline models," Applied Energy, vol. 144, pp. 106–113, 2015.