Etp Final-Report
Describing and evaluating the process currently used by IES to carry out the calibration of BEMs,
identifying benefits, limitations and areas for improvement. Additionally, this project includes an
evaluation of the effort and value of different methods of estimating energy savings.
Document created by: Melanie Gines Cooke
Integrated Environmental Solutions Limited
International Sustainability Consulting Developers of the IES <Virtual Environment>
Abstract
List of Abbreviations
List of Figures
List of Graphs
List of Tables
1. Introduction
   1.1. Objectives
2. Methodology
3. Background
   3.1. Literature Review Concepts
      3.1.1. Day-typing
      3.1.2. Zone-typing
      3.1.3. Source Hierarchizing
      3.1.4. Version Control Software
      3.1.5. Electricity Plug-Load Data to obtain occupancy schedules
4. Context of Calibration within M&V
   4.1. M&V
      4.1.1. Option D – Calibrated Simulation
      4.1.2. Option C vs Option D
   4.2. M&V 2.0
5. Review of Metrics and Uncertainty
   5.1. Introduction to Metrics
   5.2. Background concepts
      5.2.1. Descriptive Statistics
      5.2.2. Distributions
      5.2.3. Comparing Simulated and Real Data
   5.3. Terms
   5.4. Metrics
      5.4.1. Coefficient of Determination (R²)
      5.4.2. Mean Bias Error (MBE)
      5.4.3. Normalised Mean Bias Error (N_MBE)
      5.4.4. Root Mean Squared Error (RMSE)
      5.4.5. Predicted Uncertainty (Error Range)
      5.4.6. Coefficient of Variation of the Root Mean Square Error (CV_RMSE)
      5.4.7. Range Normalised Root Mean Squared Error (RN_RMSE)
      5.4.8. Unscaled Mean Bounded Relative Absolute Error (UMBRAE)
   5.5. Total Uncertainty
   5.6. Uncertainty Limits when evaluating savings due to an ECM
6. Literature review opinions regarding Metrics and Uncertainty
7. Case studies
   7.1. Case Study 1 – Nursery Villa in Abu Dhabi, UAE using monthly bills
      7.1.1. Procedure
      7.1.2. Weather data
      7.1.3. Model Geometry and Zoning
      7.1.4. Model Inputs
      7.1.5. Calibration results
      7.1.6. IPMVP Option C vs Option D
   7.2. Case Study 2 – Office building in USA using sub-hourly data
      7.2.1. Procedure
      7.2.2. Weather data
      7.2.3. Calibration results
8. Process Evaluation
9. Conclusion
10. Implementation of findings by other IES members
Appendix A: Using Excel to generate a linear regression model for option C
Appendix B: Using Hone for optimisation-based calibration using monthly data
Appendix C: Overview of the Calibration methodology by Raftery et al (2011)
References
Abstract
Calibration involves using real measured data from a building in order to improve the performance of
a simulation model of that building.
With the increase in the quantity, quality and accessibility of building data due to advances in
technology, the applications, accuracy and efficiency of calibration are growing rapidly.
Applications of calibration being implemented currently can be classified in two main areas:
- Measurement and Verification (M&V) of savings achieved through the implementation of an
Energy Conservation Measure (ECM).
- Prediction or targeting of future savings.
Researchers in this area agree that a defined methodology is lacking and that its
existence would ensure improved accuracy, completeness, consistency, relevance and transparency.
The present study brings together all the information that IES has been using for different calibration
projects into one single document, identifying areas where there is room for improvement and
streamlining the process.
The key elements of the applied methodology are:
- An extensive literature review, helping the researcher gain a broader understanding of the
range of applications, the different approaches used in the past and the ways in which the
accuracy of a calibrated model can be assessed.
- The use of two case studies with different characteristics in order to implement the identified
improvements.
- The analysis of the cost/benefit balance of the different approaches identified.
It is believed that this project will add positive value to the Company, as it will contribute to the
efficiency and accuracy of future calibration projects.
List of Abbreviations
List of Figures
List of Tables
1. Introduction
Reddy (2006) describes calibration as the procedure of changing the inputs of a Building Energy Model
(BEM) so that its outputs closely match real measured data from that building.
Applications of calibration can be classified in two main areas:
- Measurement and Verification (M&V) of achieved savings. BEMs can be used to evaluate the
impact of ECMs that are already installed, including interactive effects between different
ECMs and between ECMs and the rest of the facility. Fanell et al (2017) state that “There is a
need for greater transparency when making decisions about the most appropriate M&V
strategy.” This study will demonstrate that using a calibrated BEM can provide better
certainty during M&V than the other options established in the International Performance
Measurement and Verification Protocol (IPMVP).
- Prediction of future savings. This includes predicting the impact of potential ECMs before their
installation, fault detection, and Monitoring and Targeting (M&T) – using existing/past data to
set future targets for energy reduction.
Abundant research has been carried out on the “performance gap” between building energy models
and the real energy consumption of buildings, proving how difficult it is to get models right even
when detailed data is available.
1.1. Objectives
Taking into account the above, the objectives of this study are:
- Evaluate and assess the procedure developed by the company for the calibration of BEMs.
- Contextualise calibration within M&V (achieved savings) and within potential savings estimation
(future savings).
- Evaluate the balance between certainty and effort. (Effort/cost will mainly be evaluated as
time; although computational costs could be relevant in practice, the case studies will be simple
cases in order to eliminate that parameter from the equation.)
- Evaluate the different outcomes and results, and analyse the “value for money” of each of
them. Identify the optimal way to charge clients for a calibration project, considering time and
uncertainty level.
- Identify areas where the process could be optimised within the construction sector.
2. Methodology
Working alongside the Research and Development (R&D) and Consultancy Teams of IES and using a
number of their projects, the following tasks were carried out in order to achieve the objectives of
the research:
2) Development of calibration for two case studies with different granularity, applying the
appropriate metrics identified previously. This includes descriptions of the required
information and breakdown of the required time.
4) Identification of areas where the process can be improved and where a reduction of the cost
of calibrating a model can be achieved for specific applications.
3. Background
According to Reddy (2006), trial and error is the method that has historically been used for the
calibration of BEMs. It requires an experienced engineer to change inputs one by one and observe
their influence on the outputs, making it a time-consuming process that depends heavily on the
personal judgment of the person carrying it out.
New methods have been used recently; however, there are no existing guidelines on the steps to
follow to calibrate a BEM using detailed simulation programs. This is noted in numerous research
papers, such as:
- Raftery et al (2011): “Despite the numerous published case studies, there is still no accepted
standard method for calibrating a model.”
- Gallagher et al (2018): “Lack of a rigid, prescribed, analytical calculation process.”
In his investigation of the history of calibration, Reddy (2006) identified that monthly bills were
used initially, then instantaneous or short-term data, and finally hourly or sub-hourly data for an
entire year. This improvement in the quantity of available building and energy data has been
possible in the past decade due to the advancement of Advanced Metering Infrastructure (AMI),
Building Management Systems (BMS) and the Internet of Things (IoT).
According to ASHRAE Research Project Report 1702-RP (2018), which assesses the practicality of
ASHRAE 14, whole-building calibrated simulations are used only for scientific research and not for
practical industry applications: achieving such models requires an experienced engineer with
knowledge of the building and its characteristics, making them very costly. However, the current
report contains improvements to the process that contribute to the possibility of its use in
practical industry cases.
Although IES-VE is not the most frequently used tool in existing calibration research, there are
examples such as Parker et al (2012), who used IES for model calibration in the assessment of
retrofits. The current research will show that using different IES tools appropriately and in
conjunction results in reliable and accurate calibrated models.
ASHRAE 14 provides information on how to use a calibrated model but no guidelines or instructions
on how to actually calibrate one.
3.1. Literature Review Concepts
During the literature review for this study, a few concepts introduced by different researchers
were identified as relevant to the current research. These concepts are described in this section.
3.1.1. Day-typing
Currently, the data used for the calibration of BEMs can be available in different forms: monthly,
daily, hourly or sub-hourly.
Reddy (2006) introduced the concept of day-typing, in which hourly data is condensed to represent
typical days, making data manipulation simpler. However, this method is not considered to be
frequently needed by the Company, as its proprietary software (Ergon and SCAN) provides ways of
manipulating a year of sub-hourly data with ease.
3.1.2. Zone-typing
Raftery et al (2011) identified that in numerous case studies, the zoning strategy used for the
different spaces is not reported. As this parameter significantly affects results, reporting the
strategy used is considered crucial.
Raftery et al (2011) suggest that for the vast majority of cases, the standard “core plus four
perimeter zones per floor” strategy will have a significantly negative impact on the calibration of
a model, and that more detailed zoning should be carried out.
3.1.3. Source Hierarchizing
Raftery et al (2011) also describe a method that helps obtain a more reliable and accurate
calibration. They suggest designing a hierarchy of the different parameters depending on the source
from which the information was obtained. In order of priority, their suggestion is:
1. Data-logged measurements
2. Short term measurements
3. Observation from site survey
4. Interview with the operator
5. Operation manuals
6. Commissioning documents
7. Benchmarks
8. Standards, specifications and guidelines
9. Design stage information
3.1.4. Version Control Software
Another concept that could improve the calibration process, introduced by Raftery et al (2011), is
using version control software to keep a record of every change made to the different inputs of the
model and the result obtained from each change.
This is especially useful when the process has to be reviewed by someone other than the person who
calibrated the model, or when returning to the process at a later date.
3.1.5. Electricity Plug-Load Data to obtain occupancy schedules
Kin et al (2017) found that occupancy schedules can be extracted from hourly electricity (plug-load)
data. Therefore, in a building with no sub-metering, using the plug-load data can greatly help the
model calibration. They used people-counting sensors in the buildings to verify their findings.
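As a rough illustration of the idea (not the authors' verified method), a plug-load profile can be rescaled into a 0 to 1 occupancy fraction; the base-load assumption and the function name below are this sketch's own:

```python
def occupancy_fraction(plug_load_kw, base_load_kw=None):
    """Estimate a 0-1 occupancy schedule from hourly plug-load readings.

    Assumes load above the overnight base load scales with occupancy.
    """
    if base_load_kw is None:
        base_load_kw = min(plug_load_kw)  # overnight / unoccupied load
    span = max(plug_load_kw) - base_load_kw
    if span == 0:
        return [0.0 for _ in plug_load_kw]  # flat profile: no signal
    return [max(0.0, (kw - base_load_kw) / span) for kw in plug_load_kw]
```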
4. Context of Calibration within M&V
M&V is the process of determining the amount of savings obtained through an Energy Conservation
Measure (ECM). This process involves planning, measuring, collecting data and analysing it.
An M&V methodology should be: accurate, complete, conservative, consistent, relevant and
transparent.
Numerous documents have been created by different institutions worldwide to establish guidelines
and regulations for M&V. The two main ones, referenced throughout this document, are IPMVP and
ASHRAE 14. The main difference between them is that while IPMVP outlines four options for M&V
(A, B, C & D), ASHRAE 14 outlines three approaches which relate to and support IPMVP’s options,
excluding Option A. Additionally, ASHRAE 14 provides thresholds for the uncertainty levels that
should be achieved under each approach. Other complementary documents include the M&V Guidelines
for the Federal Energy Management Program (FEMP) and the Uniform Methods Project (UMP), both
prepared by the Department of Energy (DOE) of the United States.
M&V options/approaches

| IPMVP                                             | ASHRAE 14             |
|---------------------------------------------------|-----------------------|
| Retrofit Isolation: A – Key Parameter Measurement | –                     |
| Retrofit Isolation: B – All Parameter Measurement | Retrofit Isolation    |
| Whole Facility: C – Mathematical Model            | Whole Facility        |
| D – Calibrated Simulation                         | Calibrated Simulation |

Table 1 M&V options
IPMVP defines Option D as a calibrated simulation model but provides no guidance on how to apply
this approach. ASHRAE 14, on the other hand, does provide requirements for an acceptable
whole-building computer simulation model.
4.1. M&V
For the M&V of an ECM, three periods are necessary: Baseline, implementation and reporting. IPMVP
identifies four options for M&V (A, B, C & D).
Options A & B involve retrofit isolation, while C & D take the whole facility into account.
Option C uses a linear regression model, typically based on heating and/or cooling degree days as
the independent variable (and sometimes other independent variables, such as the percentage of
occupied time).
Option D uses a calibrated model. This is a 3D model made in a computer software which requires a
series of inputs (e.g. occupancy profiles, internal gains, construction properties, etc.) that are
optimised in order for the outputs to match a series of energy bills or readings provided by the client.
Table 2 below defines the capabilities of the different options described in IPMVP.
| Capability                                        | A | B | C | D |
|---------------------------------------------------|---|---|---|---|
| Assess individual retrofits                       | X | X |   | X |
| Assess an entire facility                         |   |   | X | X |
| <10% savings of utility meter                     | X | X |   | X |
| Industrial                                        | X | X | X |   |
| Unclear variables’ significance                   | X | X |   | X |
| Complex interactive effects                       |   |   | X | X |
| Future baseline adjustments expected              |   |   | X | X |
| Long term assessment                              | X | X | X | X |
| No baseline energy data                           |   |   |   | X |
| Need of simulation software, skill and experience |   |   |   | X |

Table 2 Capabilities of the different M&V options by IPMVP
4.1.1. Option D – Calibrated Simulation
This section includes a methodology defined with the aid of guidelines from ASHRAE 14.
2) Report simulation tool: for the purpose of this study, the IES-VE software will be used and
referenced throughout, as it falls into the category of computer-based programs for building
energy use analysis capable of modelling the following (compliance established by ASHRAE 14;
IPMVP does not establish any requirements for “Option D” models):
- 8760 hours yearly
- Thermal mass effects
- Occupancy and operating variation profiles
- Performance and efficiency of HVAC components and other mechanical equipment
- Set Points for individual zones
- Real weather data
- ECMs
3) Report input data: provide a full copy of the data entered into the model, specifying whether
each item is known, assumed, fixed or optimised. The source of the data should be included.
Input data includes:
- Building geometry
- Envelope characteristics
- Equipment size and efficiency
- Internal conditions (space temperatures, equipment operating hours, set points,
occupancy profiles, etc.)
Figure 3 below, extracted from BS ISO 17741 (a standard based on IPMVP), describes the calibration
procedure. As can be seen in step C), the word “assume” is used, reinforcing the fact that existing
guidelines are too vague.
4.1.2. Option C vs Option D
Option C calculates savings from pre- and post-ECM whole-facility energy use measurements,
typically using regression analysis to adjust the baseline to post-ECM conditions.
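The Option C mechanics amount to an ordinary least-squares fit plus an avoided-energy calculation. A minimal single-variable sketch (function names are illustrative; real projects often use more independent variables):

```python
def fit_option_c(degree_days, energy_kwh):
    """Ordinary least-squares fit: energy = intercept + slope * degree_days."""
    n = len(degree_days)
    mean_x = sum(degree_days) / n
    mean_y = sum(energy_kwh) / n
    sxx = sum((x - mean_x) ** 2 for x in degree_days)
    sxy = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(degree_days, energy_kwh))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope  # (intercept, slope)

def avoided_energy(intercept, slope, degree_days, measured_kwh):
    """Savings = adjusted-baseline prediction minus measured post-ECM use."""
    return (intercept + slope * degree_days) - measured_kwh
```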
Option D uses the same measurement boundaries as Option C, but uses a calibrated simulation model
to generate the post-ECM baseline.
The advantage of Option D over Option C within M&V is that an individual ECM can be assessed even
when more than one has been implemented at the same time. Additionally, the interactions among the
ECMs themselves and with the different parameters of the building can be understood.
Outside M&V, Option D also has clear advantages over Option C as it can be used:
- For an ECM plan (estimating savings without any physical work being done).
- To prove savings relative to a standard.
- For Monitoring and Targeting (M&T).
This research is intended to provide insight into the effort needed to create a baseline model and
the accuracy that can be achieved using the two whole-facility options established in IPMVP (i.e.
C & D).
4.2. M&V 2.0
Gallagher et al (2018) explain that M&V 2.0 differs from M&V in that it uses large data sets and
automated advanced analytics to streamline and scale the process.
The same paper highlights that AMI systems (which collect large quantities of data) can
significantly improve the accuracy of M&V. However, their potential is not commonly maximised due
to the lack of a central data repository and the need for skilled professionals for data cleaning
(pre-processing techniques) and baseline modelling.
Franconi et al (2017) identified that using new technologies, savings can be determined in near-real
time to benefit a range of stakeholders and provide a baseline consistency across applications. For
M&V 2.0 new technologies are emerging, affecting M&V by:
- Increasing data granularity (by means of high-resolution smart meters, communicating smart
thermostats, and non-intrusive load sub-metering devices).
- Enabling high speed processing of large quantities of data (by means of automated analytics)
therefore enabling ongoing monitoring and estimating of energy savings in near-real time.
5. Review of Metrics and Uncertainty
5.1. Introduction to Metrics
Statistics are used for the collection, analysis, interpretation and presentation of numerical
data. In building energy modelling, statistics can be used for:
There are eight statistical tools identified in this research that can be useful for evaluating the
performance of a whole-building energy simulation model.
These have been extracted from various sources, including the Uncertainty Assessment for IPMVP,
ASHRAE 14-2014, Chakraborty & Elzarka (2017) (who identified that over the last two decades the
metrics used have been R², RMSE, CVRMSE and MBE, and who introduce RN_RMSE) and Chen et al (2017)
(who discuss the benefits of UMBRAE over other time-series forecasting metrics).
It is important to note that the first six metrics are outlined by IPMVP and ASHRAE 14. However, the
last two are scale-independent metrics which have been identified during this research as being truly
informative measures for model evaluation.
Some basic statistics concepts are explained in the following section in order to aid the reader with
the understanding of the metrics mentioned above.
5.2. Background concepts
5.2.1. Descriptive Statistics
Mean ( x̄ ) = Sum of all values divided by the number of values:

x̄ = (x₁ + x₂ + … + xₙ) / n
Variance ( s² ) = Variation from the mean. (As some data points deviate negatively and others
positively, the formula uses squared values to avoid the deviations cancelling each other out.)

s² = Σ(x − x̄)² / (n − 1)
Standard deviation ( s ) = Square root of the variance, meaning it can be directly compared with
the measured values.

s = √s²
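The three definitions above, sketched in Python (sample statistics, keeping the n − 1 denominator from the variance formula; the function name is illustrative):

```python
import math

def descriptive_stats(values):
    """Return sample mean, variance (n - 1 denominator) and standard deviation."""
    n = len(values)
    mean = sum(values) / n
    variance = sum((x - mean) ** 2 for x in values) / (n - 1)
    return mean, variance, math.sqrt(variance)
```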
5.2.2. Distributions
Probability distribution = There are different types; the area under the curve always equals 1. For
M&V it is assumed that distributions are always normal.
Normal distribution = Probability distribution, bell shaped, defined by its mean and a standard
deviation. Area under the curve = probability of obtaining a value in that range when randomly picking
one value = Confidence level.
Standard normal distribution = Symmetrical around zero. “x” axis shows the number of standard
deviations.
(Figure source: https://quantra.quantinsti.com/glossary/Standard-Normal-Distribution)
When reporting predicted savings, the confidence level (CL) and confidence interval (CI) are
reported together, e.g. there is a 95% confidence (level) that annual savings will be 55 MWh plus
or minus 10 MWh (interval).
T-distribution
In M&V, the number of data points can be low (e.g. when dealing with only monthly utility bills).
In such cases (n < 30), it is more appropriate to use the t-distribution rather than the normal
distribution.
t-value = Found in tables in statistics textbooks. It defines the relationship between CI and CL in
a t-distribution, depending on the degrees of freedom (DF). The t-value is multiplied by the
standard interval to determine the new CI at the desired CL.
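This lookup-and-scale step can be sketched as follows (the table holds the standard two-sided 95% t-values from statistical tables; names are this sketch's own):

```python
# Two-sided t-values at a 95% confidence level, from standard t-tables,
# keyed by degrees of freedom (DF).
T_95 = {5: 2.571, 10: 2.228, 11: 2.201, 20: 2.086, 30: 2.042}

def interval_half_width(std_error, degrees_of_freedom, t_table=T_95):
    """CI half-width = t * standard error; t replaces z for small samples."""
    return t_table[degrees_of_freedom] * std_error
```

For twelve monthly bills fitted with one regressor (DF = 11, say), the 95% t-value of 2.201 is noticeably wider than the normal-distribution 1.96.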
5.2.3. Comparing Simulated and Real Data
Residual errors = The differences between the modelled/predicted energy and the actual energy. A
model of energy use (regression or calibrated model) will typically not match the actual energy
consumption of the real building exactly.
5.3. Terms
This section describes the symbols used in the formulas of the following section.
n = number of data points
k = 1, 2, …, n
Yk = Baseline (measured) values
Ŷk = Modelled values
μ = Mean of the baseline data
p = Number of parameters in the model
t = t-statistic, found in the t-distribution table in statistics textbooks. The value is selected
by knowing:
It is important to note that in the following section, “regression model” refers to models
classified under Option C of the IPMVP and “calibrated simulation model” refers to Option D of the
IPMVP.
5.4. Metrics
5.4.1. Coefficient of Determination (R²):
This metric can be used to assess the accuracy of the model. It indicates how much of the variation
in y is explained by the regression or calibrated simulation model (“goodness of fit” or “degree of
correlation”).
R² = (Explained variation in y) / (Total variation in y) = 1 − [ Σ (Yk − Ŷk)² / Σ (Yk − μ)² ],
with both sums over k = 1 … n
R² is a value between 0 and 1. Normally, the higher the better; an acceptable value depends on the
context, though industry tends to accept coefficient values above 0.75.
R² should only be used as an initial check. The variance of y should also be observed (a
scatterplot can help illustrate this).
It is important to note the major flaw of R2 which is its dependency on the slope of the model. Higher
slopes have a tendency to show a higher value of R2 even though the variance of y is the same. For
this reason it is important to use other metrics in conjunction with R² when assessing a model.
This flaw is illustrated in the graphs below, where two lines with the same variance have very
different R² values: the line with the higher slope shows a significantly higher R² than the one
with the lower slope.
Adjusted R² = Used for regression models; it penalises the addition of irrelevant variables to the
model.

Adjusted R² = 1 − [ (1 − R²) × (N − 1) / (N − p − 1) ]
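Both coefficients, as defined above, in Python (function names are illustrative):

```python
def r_squared(baseline, modelled):
    """R^2 = 1 - (residual sum of squares / total sum of squares)."""
    mu = sum(baseline) / len(baseline)
    ss_res = sum((y - yhat) ** 2 for y, yhat in zip(baseline, modelled))
    ss_tot = sum((y - mu) ** 2 for y in baseline)
    return 1 - ss_res / ss_tot

def adjusted_r_squared(r2, n, p):
    """Penalise R^2 for the number of explanatory variables p."""
    return 1 - (1 - r2) * (n - 1) / (n - p - 1)
```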
5.4.2. Mean Bias Error (MBE):
This metric provides an indication of the bias of the modelled data relative to the baseline data:
it can indicate whether the modelled values tend, overall, to be higher or lower than the baseline
values.
This metric can return a positive or a negative value. A positive MBE indicates that the modelled
data under-predicts the baseline data, and a negative MBE indicates that it over-predicts the
baseline data. The closer the value is to zero, the better. However, as it is not normalised, the
scale of the data needs to be understood before the number can be interpreted.
5.4.3. Normalised Mean Bias Error (N_MBE):
This metric, similarly to the MBE, indicates the bias of the modelled data relative to the baseline
data. However, N_MBE provides a value as a percentage, which can therefore be understood without
knowledge of the scale of the data.
As with the MBE, the major flaw of N_MBE is offsetting errors: positive and negative residuals can
cancel each other out.
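The report states no formula here; the sketch below uses the common definitions MBE = Σ(Yk − Ŷk) / n and N_MBE = MBE / μ × 100 (some guidelines use n − p in the denominator instead), matching the sign convention described above:

```python
def mbe(baseline, modelled):
    """Mean bias error: average of the (measured - modelled) residuals."""
    residuals = [y - yhat for y, yhat in zip(baseline, modelled)]
    return sum(residuals) / len(residuals)

def nmbe_percent(baseline, modelled):
    """MBE normalised by the baseline mean, as a percentage."""
    mu = sum(baseline) / len(baseline)
    return mbe(baseline, modelled) / mu * 100
```

The offsetting flaw is easy to see: residuals of +2 and −2 give an MBE of zero even though neither point matches.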
5.4.4. Root Mean Squared Error (RMSE):
This metric illustrates how well the simulated data points agree with the baseline data. It
represents the sample standard deviation of the differences between modelled and baseline values.
RMSE is also called the Standard Deviation of Residuals.
RMSE = √[ Σ (Yk − Ŷk)² / (n − p − 1) ], with the sum over k = 1 … n
The value of RMSE can range from 0 to ∞; a lower value indicates better performance of the
simulation model.
The major flaw of this metric is that it is scale dependent, meaning the size of the data needs to
be understood before the value of RMSE can be interpreted correctly. For example, a good RMSE for a
model of a small house would have to be much smaller than that of a model of an entire airport
(e.g. 1,000 MWh can be a good RMSE for a building such as an airport but terrible for a single
dwelling).
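The formula above in Python, keeping the n − p − 1 denominator (the function name is illustrative):

```python
import math

def rmse(baseline, modelled, p=1):
    """Root mean squared error with the n - p - 1 denominator."""
    n = len(baseline)
    ss = sum((y - yhat) ** 2 for y, yhat in zip(baseline, modelled))
    return math.sqrt(ss / (n - p - 1))
```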
5.4.5. Predicted Uncertainty (Error Range):
This metric provides two values that indicate the limits of the range of uncertainty.
For a 68% confidence level, which is commonly accepted by industry, t = 1. For a different
confidence level, t should be obtained from the t-statistics table.
5.4.6. Coefficient of Variation of the Root Mean Square Error (CV_RMSE):
This metric normalises the RMSE by the mean of the baseline data; in other words, it eliminates the
dependency of the RMSE on the scale of the data by expressing the RMSE as a proportion of the mean.
CVRMSE is expressed as a percentage.
CVRMSE (%) = (RMSE / μ) × 100 = ( √[ Σ (Yk − Ŷk)² / (n − p − 1) ] / μ ) × 100,
with the sum over k = 1 … n
Being a scale independent metric, CVRMSE can help compare the “fitness” of different models.
CVRMSE is useful when different calibration periods differ only in a multiplicative way (as shown in
the formula below).
RMSE / μ = (m × RMSE) / (m × μ)
This means that as long as the two periods have the same mean and the same variance, the CVRMSE is
a useful metric for comparing the results of a model; this almost never happens.
Hence, CVRMSE has a flaw: it cannot be compared across periods when they differ in an additive way
(as shown in the formula below).
RMSE / μ ≠ RMSE / (a + μ)
This means the same model can “pass” the calibration criteria simply because the energy demand of
the building increases, as opposed to the model itself improving. For continuous M&V projects
(dashboards), this means the same model might look calibrated one week but not the next.
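The additive flaw can be shown numerically: adding a constant to both series leaves the RMSE unchanged but shrinks the CVRMSE (a sketch; names are illustrative):

```python
import math

def cv_rmse_percent(baseline, modelled, p=1):
    """CVRMSE = RMSE / baseline mean * 100."""
    n = len(baseline)
    ss = sum((y - yhat) ** 2 for y, yhat in zip(baseline, modelled))
    rmse = math.sqrt(ss / (n - p - 1))
    return rmse / (sum(baseline) / n) * 100
```

Shifting a baseline of [1, 2, 3, 4] and its model by +10 leaves the residuals identical but cuts the CVRMSE by a factor of five, which is exactly the loophole described above.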
5.4.7. Range Normalised Root Mean Squared Error (RN_RMSE):
This metric normalises the RMSE by the range of the baseline data; in other words, it eliminates
the dependency of the RMSE on the scale of the data by expressing the RMSE as a proportion of the
range. RN_RMSE is expressed as a percentage.
RN_RMSE (%) = (RMSE / range(Y)) × 100 = ( √[ Σ (Yk − Ŷk)² / n ] / (max(Y) − min(Y)) ) × 100,
with the sum over k = 1 … n
RN_RMSE solves the flaw that CVRMSE has, as changes in the mean of the energy demand (represented by "a") do not impact the metric.

$$\mathrm{RN\_RMSE} = \frac{RMSE}{\max(Y) - \min(Y)} = \frac{RMSE}{\left(\max(Y) + a\right) - \left(\min(Y) + a\right)}$$
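This additive-shift property can be checked numerically. The values below are invented for illustration, and RMSE is computed with the divide-by-n form used in the RN_RMSE definition:

```python
import math

def rmse(y, yhat):
    # Divide-by-n form, matching the RN_RMSE definition above.
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(y, yhat)) / len(y))

def cvrmse_pct(y, yhat):
    return rmse(y, yhat) / (sum(y) / len(y)) * 100

def rn_rmse_pct(y, yhat):
    return rmse(y, yhat) / (max(y) - min(y)) * 100

baseline = [100.0, 120.0, 140.0, 110.0]   # invented monthly demand
model    = [102.0, 118.0, 143.0, 108.0]

a = 500.0  # additive shift: the whole demand profile rises by a constant
shifted_baseline = [v + a for v in baseline]
shifted_model    = [v + a for v in model]

# CVRMSE drops because the mean grew, even though the errors are identical;
# RN_RMSE is unchanged because the range and the errors are unchanged.
print(cvrmse_pct(baseline, model), cvrmse_pct(shifted_baseline, shifted_model))
print(rn_rmse_pct(baseline, model), rn_rmse_pct(shifted_baseline, shifted_model))
```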
Chakraborty and Elzarka (2017) suggest that RN_RMSE together with R2 can not only provide more accurate estimates of model performance but also allow fair comparison between models developed on different datasets (or simulation periods). They suggest RN_RMSE should be the first metric used for validation of BEMs, complemented by R2 as the second.
The paper shows an example of how the same model (visually speaking) can pass or fail calibration criteria just by modifying the ratio between mean and variance. However, RN_RMSE remains constant, as the aim of this metric is to assess the model itself regardless of the magnitude of the prediction.
It is shown that RN_RMSE can successfully normalise multiplicative as well as additive differences between datasets. Also, RN_RMSE does not suffer from overestimation and underestimation problems. It is suggested that R2 be used along with RN_RMSE to identify any existing nonlinear relationship between the modelled results and the baseline data.
Therefore, it is suggested that this metric is added to the calibration evaluation metrics list especially
for continuous calibration projects.
5.4.8. Unscaled Mean Bounded Relative Absolute Error (UMBRAE):
Chen et al. (2017) proposed a new accuracy measure for time series forecasting. In their literature review they identified commonly used measures, outlining the issues that each of these has, and suggested a measure called UMBRAE, which combines the best properties of the identified metrics while addressing their common issues. It is computed in two steps:

$$\mathrm{MBRAE} = \frac{1}{n}\sum_{k=1}^{n}\frac{\left|Y_k - \hat{Y}_k\right|}{\left|Y_k - \hat{Y}_k\right| + \left|Y_k - \hat{Y}_k^{*}\right|}$$

where $\hat{Y}_k^{*}$ is the prediction of a benchmark model, and

$$\mathrm{UMBRAE} = \frac{\mathrm{MBRAE}}{1 - \mathrm{MBRAE}}$$
UMBRAE reports the performance of a calibrated model with respect to the performance of a benchmark model: a value below 1 means the model outperforms the benchmark.
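A sketch of the calculation, assuming benchmark predictions are available at each time step (the guard for an everywhere-exact benchmark is an implementation choice, not from the paper):

```python
def umbrae(measured, model, benchmark):
    """Unscaled Mean Bounded Relative Absolute Error (Chen et al., 2017):
    each error is bounded by comparing it with a benchmark model's error
    at the same point; the mean is then 'unscaled' so that values below 1
    mean the model outperforms the benchmark."""
    ratios = []
    for y, yhat, ystar in zip(measured, model, benchmark):
        e, e_star = abs(y - yhat), abs(y - ystar)
        ratios.append(0.0 if e + e_star == 0 else e / (e + e_star))
    mbrae = sum(ratios) / len(ratios)
    if mbrae == 1.0:
        return float("inf")  # benchmark exact everywhere, model never is
    return mbrae / (1 - mbrae)
```

For example, UMBRAE = 0.38 would correspond to the model performing roughly 62% better than the benchmark.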
M&V is not an exact science. Adequate consideration of uncertainty enables rational M&V budget
decisions. Maintaining accuracy and reporting uncertainty are critical steps. There are three main
sources of uncertainty: sampling, measuring and modelling.
Not all errors can be quantified. Hence, when reporting uncertainty, a combination of qualitative and quantitative statements can be used.
The multiple sources of uncertainty (sampling, measuring and modelling) can be combined to give a total uncertainty. With independent components, the following formula applies.
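For independent components, the standard combination is a root-sum-of-squares; the form below is stated as an assumption, since the report's own formula is not shown:

```latex
U_{\text{total}} = \sqrt{U_{\text{sampling}}^{2} + U_{\text{measuring}}^{2} + U_{\text{modelling}}^{2}}
```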
Category                                    Factors
Scenario uncertainty                        Outdoor weather conditions; building usage/occupancy schedule
Building physical/operational uncertainty   Building envelope properties; internal gains; HVAC systems; operation and control settings
Model inadequacy                            Modelling assumptions; simplification in the model algorithm; ignored phenomena in the algorithm
Observation error                           Metered data accuracy
Table 3 Heo et al (2012)
According to the IPMVP, acceptable accuracy of the model means that the distance between the Adjusted Baseline Energy and the Reporting-period Measured Energy is greater than two times the standard error (the standard error is assumed to be at a 68% confidence level, i.e. simply ±RMSE).
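A minimal sketch of this acceptance check, assuming the standard error is taken as the RMSE (the function name and units are illustrative):

```python
def savings_are_significant(adjusted_baseline, reported, rmse):
    """IPMVP-style validity check: the reported savings (adjusted baseline
    minus measured reporting-period energy) should exceed twice the
    standard error, taken here as the RMSE (i.e. the 68% level)."""
    savings = adjusted_baseline - reported
    return abs(savings) > 2 * rmse

# e.g. 1000 kWh adjusted baseline, 900 kWh measured, RMSE of 40 kWh:
# |100| > 80, so the savings are distinguishable from model noise.
```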
The IPMVP also recommends that whole-building savings quantification be applied in cases where the expected savings are greater than ten percent, and where at least twelve months of pre- and post-retrofit data are available.
Gallagher et al (2018) designed the following graph to represent the model prediction performance
requirements under varying fractional savings.
Graph 6 Gallagher et al (2018)
6. Literature review opinions regarding Metrics and Uncertainty
Chakraborty and Elzarka (2017) identified that in the last two decades the metrics used have been R2, RMSE, CVRMSE, and MBE. They state the properties that a statistical performance metric should ideally have. Unfortunately, as illustrated in their paper, none of the widely used statistical metrics (MBE, RMSE, CVRMSE and R2) satisfy all of these criteria.
According to ASHRAE 14 (Section 6.3.3.4.2.2), the accuracy of a calibration model should be:
Royapoor and Roskilly (2015) suggest that when hourly annual data is available, the ASHRAE 14 specifications of MBE ±10% and CVRMSE < 30% can be brought down to ±5% and < 20%. When hourly data is not available, the ASHRAE 14 limits are realistic.
Chen et al. (2017) state that an accuracy measure should provide an informative and clear summary of the error distribution. It is commonly accepted that no single accuracy measure is best, and that no single measure is superior to the others on all of the following criteria at the same time:
- Reliability
- Construct validity
- Computational complexity
- Outlier protection
- Scale independency
- Sensitivity to changes
- Interpretability
According to Gallagher et al., NMBE is an indication of overall bias in a regression model. It quantifies the tendency of a model to over- or under-estimate across a series of values. In contrast to the CV(RMSE), the NMBE is independent of time, and hence overall positive bias can cancel out negative bias. The use of both metrics in conjunction with each other allows for a true insight into model performance.
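A toy illustration (invented numbers, not from Gallagher et al.) of why the two metrics are used together: errors that cancel in NMBE are still visible in CV(RMSE):

```python
import math

def nmbe_pct(measured, predicted):
    """Normalised Mean Bias Error: signed errors, so over- and
    under-prediction can cancel each other out."""
    n = len(measured)
    mean = sum(measured) / n
    return sum(y - yhat for y, yhat in zip(measured, predicted)) / (n * mean) * 100

def cvrmse_pct(measured, predicted):
    """CV(RMSE) with the divide-by-n form: squared errors never cancel."""
    n = len(measured)
    mean = sum(measured) / n
    rmse = math.sqrt(sum((y - yhat) ** 2 for y, yhat in zip(measured, predicted)) / n)
    return rmse / mean * 100

measured = [100.0, 100.0, 100.0, 100.0]
scattered = [80.0, 120.0, 80.0, 120.0]   # large errors that exactly cancel

print(nmbe_pct(measured, scattered))     # 0.0  -> looks unbiased
print(cvrmse_pct(measured, scattered))   # 20.0 -> reveals the scatter
```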
7. Case studies
In order to implement the concepts and metrics identified, two case studies were used. The major
difference between these is the granularity of the data available.
7.1. Case Study 1 – Nursery Villa in Abu Dhabi, UAE using monthly bills
The purpose of the calibration of this model was to later evaluate savings due to potential ECMs as
well as identify current savings due to already existing ECMs.
7.1.1. Procedure
The following steps were undertaken for this case study:
This case study had optimal weather data availability, as it had weather data from a station on the actual Villa itself. This is a rare situation: normally, when simulating a building, the weather file used comes from a station within a 30 km range and therefore does not capture the microclimate which impacts the building. This makes a significant difference. A study done for this particular case shows that the dry bulb temperature reached over 1 °C higher at the Villa (red line) compared to the rural Bateen Airport in Abu Dhabi (orange line), and the wet bulb temperature reached over 3 °C higher at the Villa (green) than at the airport (blue).
This study proves the importance of accurate weather data when calibrating a simulation model: for the nursery model, this difference in the weather data caused a 7% difference in the annual energy consumption output of the simulation.
The model geometry was imported from the architect's drawings done in Rhino.
Figure 8 Nursery VE
Figure 9 Nursery Rhino
7.1.4. Model Inputs
Construction data was based on a UPC study of typical Abu Dhabi buildings, a survey conducted by Abu Dhabi Municipality, and a report by the Executive Affairs Authority.
The cooling set-point was also obtained from the UPC study (22 °C). However, as recommended by Allesina et al. (2018), this was also calibrated within a reasonable range (±2 °C).
Parameters known with confidence (such as the roof properties, as these were measured by the client) were left constant; some others were calibrated using Hone, and the rest were calibrated manually.
The following table shows the parameters inputted in the final calibrated model:

Parameter                             Value          Calibration range   Method
Internal Gains
  Miscellaneous sens. (light+equip.)  10 W/m²        10.0-50.0           Hone
  People sensible                     80 W/person
  People latent                       60 W/person
Air exchanges
  Infiltration                        0.21 ach       0.1-0.4             Hone
Constructions
  Roof U-value                        1.01           FIXED
  Windows U-value                     2.93           2.0-4.0             Manually
  Ground/exposed floor U-value        0.22           FIXED
  External wall U-value               1.98           1.5-2.5             Manually
This data (the cooling operation profile) wasn't available and thus was generated using reverse engineering, surveys, and manual observation of the trend.
Cooling Operation Profile 2017                              AC on for Regression Model 2
Month   Weekday       Friday        Saturday                % of hours AC on
Jan     6:00-18:00    6:00-18:00    6:00-18:00              50%
Feb     6:00-18:00    6:00-18:00    6:00-18:00              50%
Mar     6:00-18:00    Off cont.     Off cont.               35%
Apr     6:00-18:00    Off cont.     Off cont.               35%
May     6:00-18:00    Off cont.     Off cont.               35%
Jun     6:00-18:00    Off cont.     Off cont.               35%
Jul     6:00-18:00    6:00-18:00    Off cont.               43%
Aug     On cont.      On cont.      On cont.                100%
Sept    On cont.      On cont.      On cont.                100%
Oct     On cont.      On cont.      On cont.                100%
Nov     6:00-18:00    6:00-18:00    6:00-18:00              50%
Dec     6:00-18:00    6:00-18:00    6:00-18:00              50%
As shown above, the percentage of hours was extracted from the VE model inputs. However, it is important to note that, in contrast with the calibrated VE model, Regression model 2 does not take into account at what time of day the AC is on, just the percentage of time that the AC is on.
The following graph illustrates the monthly electricity demand during 2017 in both the real case and
the calibrated simulation model.
RMSE 640.3124
CVRMSE 7%
R2 98%
N_MBE -2%
RN_RMSE 5%
As can be seen, both CVRMSE and N_MBE are under the limits that ASHRAE considers acceptable for a calibrated model, indicating this would comply with ASHRAE standards.
An overfitting test was also done using the first three months of 2018, both to check for overfitting and to check whether the values fall inside the uncertainty range.
The following graph shows the real and the calibrated data as well as the interval/range for the three
first months in 2018 for a confidence level of 95%.
[Graph: real vs calibrated monthly electricity demand with the 95% uncertainty range, Nov-16 to Apr-18]
Comparison of accuracy between regression models and calibrated model
Two regression models were carried out in order to compare their accuracy with that of the calibrated model:
Regression model 1) Cooling Degree Days (CDD) with base = 24 °C
Regression model 2) CDD + AC-on percentages
The two regression models were generated using Excel, one using CDD and the second using CDD and occupancy schedules. The results are shown in the graph below. A comparison is done of the accuracy of the regression models (option C) against the calibrated model (option D).
[Graph: real vs modelled monthly electricity demand for the regression models and calibrated model, Nov-16 to Apr-18, with the testing period highlighted]
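Regression model 1 is an ordinary least-squares fit of monthly electricity against CDD. A self-contained sketch with hypothetical data follows (the actual fit was done in Excel, and these numbers are invented for illustration):

```python
def fit_cdd_regression(cdd, kwh):
    """Ordinary least-squares fit of kwh = a + b * cdd, i.e. the
    single-variable form of Regression model 1."""
    n = len(cdd)
    mean_x = sum(cdd) / n
    mean_y = sum(kwh) / n
    sxx = sum((x - mean_x) ** 2 for x in cdd)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(cdd, kwh))
    b = sxy / sxx
    a = mean_y - b * mean_x
    return a, b

# Hypothetical monthly data: cooling degree days (base 24 degC) vs kWh billed.
cdd = [50.0, 80.0, 200.0, 350.0, 420.0, 300.0]
kwh = [2500.0, 3100.0, 5400.0, 8300.0, 9700.0, 7400.0]
a, b = fit_cdd_regression(cdd, kwh)
predicted = [a + b * x for x in cdd]
```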
UMBRAE was used to compare the accuracy of the different models, the results obtained were:
For the nursery, it was assumed that the independent variables have no error (meter uncertainty is 0), there is no sampling error, and the only uncertain part of the prediction is the model itself.
7.2. Case Study 2 – Office building in USA using sub-hourly data
This BEM was created for LEED and is now being used as an asset for the building owners in the post
occupancy phase for M&V.
7.2.1. Procedure
The building is located in the US, in a climate requiring mostly cooling to be provided for space
conditioning. The data was obtained from Weather Analytics, which has a 30km granularity.
The following values represent the different metrics used to evaluate the Rev 0 model.
RMSE 90.22393867
CV_RMSE 66%
R2 -23%
N_MBE -13%
RN_RMSE* 22%
As is clear from the results, the model is not yet calibrated, as both CVRMSE and RN_RMSE are well outside the limits established as acceptable by ASHRAE 14.
The following values represent the different metrics used to evaluate the Rev 1 model.
RMSE 37.49003727
CV_RMSE 27%
R2 79%
N_MBE -2%
RN_RMSE* 9%
These results now show compliance within the ASHRAE 14 limits and therefore it is considered to be
a Hybrid Calibrated model.
"Hybrid calibrated model" is a term introduced by the Company (IES) to define a model that has used time-series data from SCAN to make it more accurate, mixing real data and virtual data.
UMBRAE
UMBRAE was also carried out in this case study to compare Rev 0 to Rev 1.
The calibrated model (Rev 1) performs 62% better than the benchmark model (Rev 0).
8. Process Evaluation
The following table explains the method by which the different processes were rated/evaluated, including two key parameters: effort and strategic value.
Strategic Value:
  5  Fairly unique
  4  Major differentiation
  3  Good differentiation
  1  It can only be used for M&V, not for M&T or for ECM saving prediction.
Total 6
VE model calibration using Monthly Bills manually changing parameters (trial and error)
Total 6
Total 8
It is important to note that for this process, the time spent cleaning data to obtain relevant and
correctly named information to input into the model is not taken into account. However, this is an
area where significant research is being done by IES in order to use machine learning for the
automation of this process.
9. Conclusion
During this research it was identified that, for an efficient calibration of a BEM, it is critical that the purpose of the calibration is very clear from the beginning (i.e. what the calibrated model is going to be used for). There have been numerous cases where the purpose was not identified early enough in the process and a lot of time was wasted trying to achieve things that were not needed.
This research has aided the Company by bringing together and explaining in detail all the statistical
metrics that can be useful in order to improve the accuracy of a calibrated model in a comprehensive
document (including new metrics that have never been used before in the Company such as RN_RMSE
or UMBRAE). This will therefore be useful for future projects where calibration is needed and certain
levels of accuracy are required.
Other innovative solutions have been implemented during this research that can be useful to a more
efficient process, such as the generation of an occupancy schedule using available electricity data and
reverse engineering.
Other areas of improvement identified during this research and that will be addressed by the Company
in the near future include:
- The inclusion of more parameters in the IES Hone tool (such as U-values) for a more efficient optimisation of all parameters influencing the model outputs.
- The integration of a version control tool in the IES-VE software for a more effective exchange of data between stakeholders.
- The automation of data exchange between Hone and the VE for a more rapid completion of the calibration.
- The integration of a metrics calculator within the Ergon/SCAN tool, allowing for model accuracy evaluation without the need to use external tools such as Excel.
10. Implementation of findings by other IES members
A third case study was carried out by a member of the IES consultancy team after this research in
order to examine the value of the findings.
- Identification of quantity, quality and granularity of the data available. In this case, this was
time series annual electricity consumption.
- Derivation of the operational profile from the electricity profile as per the findings described
in Section 3.5.1.
- Incorporation of internal gains by using NCM standards for primary schools. This included
the data source hierarchizing mentioned in Section 3.1.3.
- Simulation of the BEM using the VE.
- Establishing of a range of variability for the parameters affecting electricity consumption and
inputting this into Hone.
- Running Hone in order to identify optimised values for the unknown parameters.
- Evaluation of the model accuracy using the metrics described in Section 5. Comparing the
calibrated model with the standard model.
The consultant who carried out this project reported that the time invested was one day (8 hours) of work.
Graphs 12 and 13 below illustrate the difference between the model before and after calibration.
An acceptable level of accuracy was reached once the model was calibrated. The consultant found
that the methods identified during this research that were implemented allowed for a more efficient
calibration. Additionally, the areas of improvement that have been suggested for integration within
the IES tools were described by the consultant as potentially very beneficial, describing the reduction
in time that could have been achieved for this Case Study if those had been in place.
11. Reflection from Academic Mentor
This study has been undertaken by a student researcher at Heriot Watt University, working in
conjunction with IES (Integrated Environmental Solutions) Ltd. The work was facilitated by the
Energy Technology Partnership under the Student Engagement Project (STEP) scheme.
The work was undertaken between June and September 2018. During this period, the student
worked semi-independently, co-locating her study between the university and the industry partner.
The student met with the industry supervisor regularly, and with the academic supervisor monthly
throughout the project.
The resulting report is a high-quality, rigorous piece of work which effectively meets the stated aims
of the project. The research, whilst directly benefitting the industry partner as evidenced elsewhere,
also provides a valuable contribution to our understanding of model calibration. This remains an
essential part of our understanding of modelling energy use in buildings, and our efforts to close the
‘performance gap’.
The project increased the connections between the industry partner and the University in the
following ways:
- Two physical meetings between university and industry management, one at the partner HQ and one at the university;
- Significant remote communications including telephone, Skype and email contact (relating to this project), and a recent meeting at a national conference;
- Further discussion relating to software-curriculum integration, and university staff training on the software;
- Development of a potential curriculum-engaged future research proposal, working across Heriot-Watt's global campuses, currently under investigation.
The ETP-funded research project has achieved the stated aims of the project and also effectively
catalysed a growing relationship between the Industry and University partners. The student has
worked extremely effectively and the quality of the work is to be congratulated.
Appendix A: Using Excel to generate a linear regression model for option C:
If the Analysis ToolPak is not installed on your PC, in Excel go to File > Options > Add-ins > Manage > Excel Add-Ins > Go > Analysis ToolPak > OK.
Appendix B: Calibrating a model using Elements and Hone:
This methodology includes the use of the Elements and Hone software. Both come with the VE; therefore, if you have the VE installed on your PC, these will also be installed. Use the start menu to find them.
1) Create a text file (.txt) that includes the REAL DATA (e.g. monthly electricity or gas bills) in an identical format to the following. Save this file in the parametric_batch folder of your model.
2) Open your model in the VE. Open ApacheSim, select a full-year calibration period and create an ".apr" file (by clicking the "Add to Queue" button followed by "Save & Exit"). Ensure the adequate weather file and simulation outputs are selected.
The APR file will automatically be saved in the parametric_batch folder of your model.
b) Add a “Rule” (this means add a parameter that you would like to optimise – you can add
as many as you desire).
c) Select a Rule Type, Parameter Range and Grouping Type.
Parameter Range:
- It can be scale (dependent on the unit already established in the model; this will multiply the value in the model by a "scale factor") or absolute (unit provided by Elements; this will overwrite what was established previously in the model).
- For Hone, only the Lower and Upper limits are relevant.
- Simulations and Step size can be ignored (these are used by the Parametric Tool; Hone takes random sampling).
d) Update parameters and variables by clicking on the green arrow and check that the parameters to the left are highlighted in green (if highlighted in red, the parameters were not found in the model).
e) Go to the Variables tab and select the variable you would like to improve (the one you are calibrating against real data). Right-click on it and select whether you want to improve the total, maximum, minimum or mean.
f) Save the RULE file in the parametric_batch file folder of your model. When you Update
parameters and variables, this will automatically save a PARA and a VARS File on your
parametric_batch file folder.
4) Open Hone. Open project. Open the MOOP file provided alongside this document.
c) Click on “Import”. This will import the PARA and VARS files.
d) Select the objectives. These must always be TWO different metrics. Currently there is the option to select between CVRMSE, NMBE or R2. Traditionally, CVRMSE plus one of the other two would be used.
*RN_RMSE can potentially be added to this in order to allow for successful normalization
of multiplicative and additive differences between datasets.
e) These were automatically imported from the PARA file. You can edit them manually here.
g) Select the number of generations that you want to run (you can stop the simulations once you identify that you have reached an adequate level, i.e. new generations are no longer reducing the metrics).
Results will be shown on the Output log to the right, and an Excel Spreadsheet with the output
values from the latest generation will appear automatically in the parametric_batch file folder of
your model.
Appendix C: Overview of the Calibration methodology by Raftery et al 2011
References
Allesina, G., Mussatti, E., Ferrari, F. and Muscio, A. (2018). A calibration methodology for building
dynamic models based on data collected through survey and billings. Energy and Buildings, 158,
pp.406-416.
BS ISO 17741:2016 General technical rules for measurement, calculation and verification of energy
savings of projects
BS ISO 50006:2014 Energy management systems — Measuring energy performance using energy
baselines (EnB) and energy performance indicators (EnPI) — General principles and guidance
Chakraborty, D. and Elzarka, H. (2017). Performance testing of energy models: are we using the right
statistical metrics?. Journal of Building Performance Simulation, 11(4), pp.433-448.
Chen, C., Twycross, J. and Garibaldi, J. (2017). A new accuracy measure based on bounded relative
error for time series forecasting. PLOS ONE, 12(3), p.e0174202.
FEMP. Federal Energy Management Program, M&V Guidelines: Measurement and Verification for
Federal Energy Projects Version 3.0; U.S. Department of Energy Federal Energy Management
Program: Washington, DC, USA, 2008.
Fennell, PJ; Ruyssevelt, PAR; Smith, AZP; (2017) Exploring the Commercial Implications of
Measurement and Verification Choices in Energy Performance Contracting Using Stochastic
Building Simulation. In: O'Neill, Z, (ed.) Proceedings of Building Simulation 2017. International
Building Performance Simulation Association (IBPSA): San Francisco, CA, USA.
Franconi, E., Gee, M., Goldberg, M., Granderson, J., Guiterman, T., Li, M., & Smith, B. (2017). The
Status and Promise of Advanced M&V: An Overview of “M&V 2.0” Methods, Tools, and
Applications. Lawrence Berkeley National Laboratory.
Gallagher, C., Leahy, K., O’Donovan, P., Bruton, K. and O’Sullivan, D. (2018). Development and
application of a machine learning supported methodology for measurement and verification
(M&V) 2.0. Energy and Buildings, 167, pp.8-22.
Kim, Y., Heidarinejad, M., Dahlhausen, M. and Srebric, J. (2017). Building energy model calibration
with schedules derived from electricity use data. Applied Energy, 190, pp.997-1007.
Raftery, P., Keane, M., O’Donnell, J., (2011) Calibrating whole building energy models: an evidence-
based methodology, Energy Build. 43 (9) 2356–2364.
Royapoor, M. and Roskilly, T. (2015). Building model calibration using energy and environmental
data. Energy and Buildings, 94, pp.109-120.
Reddy, T.A. (2006). Literature review on calibration of building energy simulation programs: uses, problems, procedures, uncertainty, and tools. ASHRAE Transactions, 112.