Physically Consistent Neural Networks For Building

This document presents a novel architecture for Physically Consistent Neural Networks (PCNN) aimed at improving building thermal modeling by incorporating physical consistency into neural networks. The PCNN architecture requires only past operational data and outperforms traditional physics-based models, achieving up to 40% better accuracy while maintaining expressiveness and reducing overfitting. The study highlights the importance of integrating prior knowledge of physical laws into neural networks to enhance their generalization capabilities in building energy modeling.
Physically Consistent Neural Networks for building thermal modeling: theory and analysis

Di Natale L. (a,b,∗), Svetozarevic B. (a), Heer P. (a) and Jones C.N. (b)
(a) Urban Energy Systems Laboratory, Swiss Federal Laboratories for Materials Science and Technology (Empa), 8600 Dübendorf, Switzerland
(b) Laboratoire d’Automatique, Swiss Federal Institute of Technology Lausanne (EPFL), 1015 Lausanne, Switzerland

ARTICLE INFO

Keywords: Neural Networks; Physical consistency; Prior knowledge; Building models; Deep Learning

arXiv:2112.03212v3 [cs.LG] 11 Jul 2022

ABSTRACT

Due to their high energy intensity, buildings play a major role in the current worldwide energy transition. Building models are ubiquitous since they are needed at each stage of the life of buildings, i.e. for design, retrofitting, and control operations. Classical white-box models, based on physical equations, are bound to follow the laws of physics but the specific design of their underlying structure might hinder their expressiveness and hence their accuracy. On the other hand, black-box models are better suited to capture nonlinear building dynamics and thus can often achieve better accuracy, but they require a lot of data and might not follow the laws of physics, a problem that is particularly common for neural network (NN) models. To counter this known generalization issue, physics-informed NNs have recently been introduced, where researchers introduce prior knowledge in the structure of NNs to ground them in known underlying physical laws and avoid classical NN generalization issues.
In this work, we present a novel physics-informed NN architecture, dubbed Physically Consistent NN (PCNN), which only requires past operational data and no engineering overhead, including prior knowledge in a linear module running in parallel to a classical NN. We formally prove that such networks are physically consistent – by design and even on unseen data – with respect to different control inputs and temperatures outside and in neighboring zones. We demonstrate their performance on a case study, where the PCNN attains an accuracy up to 40% better than a classical physics-based resistance-capacitance model on 3-day long prediction horizons. Furthermore, despite their constrained structure, PCNNs attain similar performance to classical NNs on the validation data, overfitting the training data less and retaining high expressiveness to tackle the generalization issue.

1. Introduction

Buildings consume 30% of global end-use energy, producing 28% of the world’s energy-related Greenhouse Gas (GHG) emissions according to the IEA [1], and those proportions rise to 40% of the total energy usage and 36% of the total GHG emissions in the European Union (EU) [2]. Space heating and cooling have a major impact, with heating alone being responsible for 64% of household energy consumption in the EU [3]. To follow the Paris Agreement pledges to limit global warming to well below 2 °C [4], there is thus a need to decrease the energy intensity of the building sector.

1.1. The importance of building models
There are three technology-driven ways to attain this decarbonization objective: through better designs, retrofits, or improved operations of buildings. In all cases, models play a central role, either to find the best design [5], the most effective refurbishment [6], or to learn intelligent controllers to replace poorly performing rule-based controllers [7].
Modeling buildings is a challenging task in general since inside temperatures, air quality, or visual comfort, among others, all depend on highly stochastic exogenous factors mainly driven by the weather and the behavior of the occupants [8–11]. Additionally, the introduction of solar panels, heat pumps, battery storage, electric vehicles, and other new technologies makes it harder to model building operations as a whole and calls for scalable and flexible methods [12].
Most of the existing models are focused on commercial buildings [8] and study single-step predictors [13, 14], which work adequately for energy consumption predictions in design or retrofitting applications. On the other hand, in this work, we design control-oriented thermal models for a residential case study that could, for example, be used to learn Reinforcement Learning (RL) control policies. This calls for short-term multi-step temperature predictions, to be able to minimize energy consumption while maintaining the comfort of the occupants. Indeed, while RL can generally be applied in a model-free fashion, the data-inefficiency of RL algorithms [15] and the slow dynamics of buildings often require agents to be trained over thousands of days of data [16, 17], which is not feasible in practice.
While classically engineered physics-based models still dominate the field, researchers recently started to leverage the growing amount of available data to design data-driven building models. Such models generally perform better, are more flexible, and rely on less technical knowledge, but require a lot of past data to be trained on and lack generalization guarantees outside of the training data [13]. This is particularly true for models based on Neural Networks (NNs), which can be very data-inefficient and fail when they are fed new inputs they were not trained on [18], which is known as their generalization issue. In particular, there are no guarantees that a classical NN follows the underlying physics, e.g. the laws of thermodynamics in the case of thermal modeling. This is however critical for control-oriented applications, e.g. to ensure the RL agents trained on these models capture the impact of heating and cooling correctly.

∗ Corresponding author: [email protected] (L. Di Natale)
ORCID (s): 0000-0002-3295-412X (D.N. L.)

Di Natale et al.: Preprint submitted to Elsevier Page 1 of 20



Nomenclature

PCNN variables
$D$          Unforced dynamics
$E$          Energy accumulator
$P$          Power
$Q^{sun}$    Solar gains
$T$          Temperature of the modeled zone
$T^{neigh}$  Temperature of the neighboring zone
$T^{out}$    Outside temperature
$T^{w}$      Water temperature of the heating system
$a$          Heating effect scaling parameter
$b$          Heat losses to the outside scaling parameter
$c$          Heat losses to the neighboring zone scaling parameter
$d$          Cooling effect scaling parameter
$\dot{m}$    Water mass flow rate in a radiator
$u$          Control inputs
$x$          Inputs to the black-box module

Grey-box model variables
$\xi$        Disturbance model
$u$          Controllable inputs
$w_1, w_2$   Uncontrollable inputs
$z$          State of the system

1.2. The generalization issue of neural networks
Originally spotted by Szegedy et al. [19], the generalization issue of NNs led to the field of adversarial examples, where researchers aim to find input perturbations that fool NNs, showing how brittle their predictions can be [20, 21], even when only little noise is applied to the input.
To circumvent this generalization issue, researchers often rely on better sets of data that cover the entire spectrum of inputs and allow NNs to react to any situation. This requires vast amounts of resources and is only possible in fields where a significant amount of data is available, such as for tasks related to natural language processing [22] or images [23]. Additionally, to ensure some level of generalization, practitioners typically separate the data into training and validation sets, the former being used to train the network and the latter to assess its performance on unseen data to avoid overfitting the training data [24]. However, classical NNs cannot be robust to input modifications that do not exist anywhere in the data set.
In the case of building thermal models, even if several years of data are available, one will always face an input coverage problem. Indeed, buildings are usually inhabited and operated in a typical fashion to maintain a comfortable temperature – heating when it gets cold in winter and cooling when it gets hot in summer. Most data sets are hence inherently incomplete and we cannot hope to learn robust NNs that grasp the effect of heating in summer, for example. When predicting the evolution of the temperature over long horizons of several days, classical NNs might therefore fail to capture the underlying physics, i.e. the impact of heating and cooling on the temperature. This is illustrated in Figure 1, where one can compare the temperature predictions of a classical physics-based resistance-capacitance (RC) model, a classical Long Short-Term Memory network (LSTM), and a Physically Consistent NN (PCNN) proposed in this work under different heating and cooling power inputs. Interestingly, the LSTM achieves higher accuracy than both other models on the training data, overfitting it, but clearly fails to capture the impact of heating and cooling.

1.3. Introducing physics-based prior knowledge
In general, classical NNs suffer from underspecification, as reported in a large-scale study from Google [25]. As a countermeasure, we should find ways to include prior knowledge, typically about the underlying laws of physics, into NNs to facilitate their training and improve their performance. This trend already began several years ago with the emergence of physics-guided machine learning [26] and the creation of specific network structures that represent known physical systems [18, 27, 28]. In such NNs, physics can, for example, be introduced directly in the structure of the network or through custom loss functions, among others [29]. In this paper, we refer to these models as Physics-informed Neural Networks (PiNNs).
To the best of the authors’ knowledge, Drgoňa et al. [30] were the first to use PiNNs as control-oriented building models, but they did not provide theoretical guarantees of their models following the underlying physics, except for the hard-encoded dissipativity. Furthermore, the performance of their models, which remarkably work in the multi-zone setting, was not benchmarked against classical methods. Concurrently with our work, Gokhale et al. developed another PiNN structure for control-oriented building modeling, but they modified the loss function of their NNs and not their architecture [31], contrary to PCNNs. Finally, while not relying on PiNNs, we want to mention here the recent work of Bünning et al. on physics-inspired linear regression for buildings [32], which is philosophically related to the general efforts to introduce physical priors in otherwise black-box models.

1.4. Contribution
To tackle the aforementioned generalization issues of classical NNs, we introduce a novel PiNN architecture, dubbed PCNN, which includes existing knowledge on the physics of the system at its core, with an application to building zone temperature modeling. The introduction of prior knowledge essentially works as an inductive bias, such that PCNNs do not need to learn everything from data, but only what we cannot easily characterize a priori.


[Figure 1 plot: four stacked panels sharing a time axis (ticks from 2 Mar to 5 Mar). The first three panels show the predicted temperature (°C, roughly 19 to 31) for the classical RC model, the classical LSTM, and the PCNN (ours); the fourth panel shows the power input (kW). Legend: no power, significantly heating, slightly cooling, slightly heating.]
Figure 1: Temperature predictions of the RC model and the proposed PCNN detailed and analyzed in Section 5 compared to a
classical LSTM under different control inputs. The grey-shaded areas represent the span of the RC model predictions to provide a
visual comparison with both black-box methods. While the LSTM presents a lower training error than the PCNN (see Section 5),
indicating a good fit to the data, it does not capture the impact of the different heating/cooling powers applied to the system, e.g.
predicting higher temperatures when cooling is on than when heating is. The specific structure of PCNNs introduced in Section 3,
on the other hand, allows them to retain physical consistency, similarly to classical physics-based models, while improving the
prediction accuracy (see Section 5.2).
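
The qualitative check behind Figure 1 can be phrased as a simple ordering test. The following is a minimal sketch, not the evaluation procedure of this work: it assumes that all scenarios share the same initial state and exogenous inputs and that each applies a constant power level, so that more power should never yield a lower temperature. The function name `ordering_violations` and the numbers in both dictionaries are hypothetical and made up for illustration; they are not read from Figure 1.

```python
def ordering_violations(scenarios):
    """Find scenario pairs that contradict the expected effect of power.

    scenarios maps a name to (constant_power_kw, final_temperature_c).
    A pair (low, high) is reported when `high` applies strictly more power
    than `low` but ends at a strictly lower temperature.
    """
    names = sorted(scenarios, key=lambda name: scenarios[name][0])
    violations = []
    for i, low in enumerate(names):
        for high in names[i + 1:]:
            p_low, t_low = scenarios[low]
            p_high, t_high = scenarios[high]
            if p_high > p_low and t_high < t_low:
                violations.append((low, high))
    return violations


# Made-up values for illustration only.
consistent = {
    "slightly cooling": (-0.5, 21.0),
    "no power": (0.0, 22.0),
    "slightly heating": (0.5, 23.0),
    "significantly heating": (1.5, 26.0),
}
inconsistent = dict(consistent)
inconsistent["slightly cooling"] = (-0.5, 24.0)  # cooling ends warmer than heating

print(ordering_violations(consistent))
print(ordering_violations(inconsistent))
```

A model whose predictions produce a non-empty list under these assumptions shows the kind of behavior described above for the LSTM, where cooling leads to higher predicted temperatures than heating.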

While PCNNs model unforced temperature dynamics (1) with classical NNs, they treat parts of the inputs separately: the power input to the zone and the heat losses to the environment and neighboring zones are processed in parallel by a linear module inspired by classical physics-based RC models. This module ensures the positive correlation between power inputs and zone temperatures while forcing heat losses to be proportional to the corresponding temperature gradients to provide physically consistent predictions. This solves parts of the generalization issue of NN building models and makes PCNNs well-suited for control applications. The key however is that, unlike in classical physics-based models, no engineering effort is required to design and identify the parameters of PCNNs: we only need access to past data, and PCNNs are then trained in an end-to-end fashion to learn all the parameters simultaneously. Furthermore, we show that PCNNs achieve better accuracy than a baseline RC model on

(1) Throughout this work, unforced dynamics represent the temperature evolution in the zone when no heating or cooling is applied and heat losses are neglected.


a case study. Moreover, they attain a precision on par with classical LSTMs on the validation data, despite performing worse on the training data. This shows that PCNNs do not lose much expressiveness due to their constrained architecture and have less tendency to overfit the training data. The main contributions of this work can be summarized as follows:

• PCNNs, novel PiNNs, are introduced and applied to zone temperature modeling.
• The physical consistency of PCNNs with respect to control inputs and exogenous temperatures (2) is formally proven.
• PCNNs perform comparably to LSTMs on the validation data and overfit the training data less.
• PCNNs attain better performance than classical RC models on a case study while avoiding any engineering overhead.

(2) We refer to the temperature outside and in neighboring zones together as exogenous temperatures in this work.

The rest of the paper is structured as follows. We start with a brief overview of the building modeling and PiNN literature in Section 2. We then describe what it means to be physically consistent with respect to a given input, present the PCNN architecture and formally prove its physical consistency in Section 3. The case study is described in Section 4, and we then compare the performance of this method against a classical RC model and LSTMs and provide a graphical interpretation of its physical consistency in Section 5. Finally, we discuss the potential and limitations of PCNNs in Section 6 and Section 7 concludes the paper.

2. Background

This section presents an overview of the existing literature on building models, which can be broadly classified into three categories: physics-based, black-box, and hybrid methods, as pictured in Figure 2. While the latter can generally be further broken down into grey-box approaches and PiNNs, only grey-box modeling was previously applied to buildings, with the exception of [30, 31]. For this reason, we propose a more general overview of PiNNs in Section 2.3.2.
Due to the vast literature on building modeling, we only provide a short summary of the strengths and weaknesses of the different techniques, and more details can be found in dedicated reviews, such as [8, 13, 33–38].

[Figure 2 diagram: four columns labeled Physics-based, Grey-box, PiNN, and Black-box, with Grey-box and PiNN grouped as hybrid methods; the labels in the columns include Physics, Data, RC model, EnergyPlus, NNs, and SVMs.]
Figure 2: Structural differences between the different methods.

2.1. Physics-based building models
Also known as white-box or first principle models, physics-based models rely on Ordinary Differential Equations (ODEs), such as convection, radiation, or conduction equations, to describe building thermal dynamics. These methods dominated the field early on, when the lack of available data hindered the development of data-driven models [33].
Since they are grounded in first principles, a natural advantage of these approaches is the interpretability of the solutions [33]. Additionally, this gives them interesting generalization capabilities outside of the training data [36]. On the other hand, however, due to the complexity of detailed thermal models, assumptions and simplifications have to be made, such as in the choice of the ODEs, which can limit the accuracy of physics-based models [9]. Moreover, the more precision is desired, the more knowledge and time are required to design the model and find the corresponding parameters, typically concerning the building envelope and the HVAC system, which might introduce uncertainty [11, 39].
To simplify the design of physics-based models, various detailed simulation tools were developed, such as EnergyPlus, Modelica, TRNSYS, or IDA ICE [40–42]. While these models can attain good accuracy and respect the underlying physical laws, they are notoriously hard to calibrate [43–45], entail considerable development and implementation costs to find and detail all the required parameters [46], and suffer from a high computational burden at run-time [47].

2.2. Black-box building models
As opposed to physics-based methods, black-box (or data-driven) models do not rely on first principles but derive patterns from historical operational data. The most widely used methods rely on Multiple Linear Regression (MLR), Support Vector Regression (SVR), NNs, and ensembles, apart from the classical Autoregressive Integrated Moving Average (ARIMA) models, as reviewed by Bourdeau et al. [37]. Black-box models are generally easier to use than physics-based ones since no expert knowledge is required at the design stage, but often lack generalization guarantees outside of the data they are trained on [13, 33]. Furthermore, they need historical data as input, sometimes in large amounts, to achieve satisfactory accuracy [13], and the data additionally has to be exciting enough, i.e. it has to cover the different operating conditions of the building, which is not trivial, as discussed in Section 1. The subsequent data imbalance issue can, for example, be tackled through the creation of sub-models, as in Zhang et al. [48]. Moreover, black-box models are sensitive to the choice of features – or feature extraction methods – used as model inputs [9]. On the other


hand, an advantage of data-driven methods is their flexibility, as they can be scaled to large systems in a more straightforward manner than physics-based methods [49]. Additionally, they are generally easier to transfer from one building to another since similar model architectures can be used and all the parameters are learned from data.
Very recently, as a consequence of the growing amount of available data, Deep Learning (DL) has started to be applied to building modeling [37]. For example, recurrent NNs (RNNs) were shown to provide better accuracy than feedforward NNs for the prediction of energy consumption [50]. In another study, a specific gated Convolutional NN (CNN) was shown to outperform RNNs and the classical Seasonal ARIMAX model on day-ahead multistep hourly predictions of the electricity consumption [14]. Due to the nonconvexity of classical NN-based models, which makes them hard to use in optimization procedures, researchers also used specific control-oriented models, such as Input Convex NN (ICNN), to model building dynamics [51].

2.3. Hybrid methods
Hybrid methods combine physics-based knowledge with existing data to have the best of both worlds. Note that some researchers use the term “hybrid methods” for a different workflow, in which they first build a physics-based model and then fit a black-box model to it to accelerate the inference procedure at run-time, such as [47, 52]; this is out of the scope of this overview and hence not covered here.

2.3.1. Grey-box building models
In grey-box modeling, one generally starts from simplified physics-based equations and uses data-driven methods to identify the model parameters [10, 12, 46] and/or learn an unknown disturbance model on top of it [53]. The simplified base model requires less expert knowledge and time to be designed than pure physics-based models but still allows one to retain the interpretability of physics-based models. Furthermore, this basis includes physical knowledge in the model, so that less information has to be learned from data compared to pure black-box models, which in turn implies that less historical data is required to fit such models [34].
Typical grey-box models start with linear state-space models and identify their parameters from data, even if some nonlinearities are not well captured by this approach [49]. Due to the difficulty of finding good parameters in general, low-complexity RC models usually perform better, with models with one or two capacitances usually being selected [46, 54, 55]. Higher-order models furthermore entail more complexity and hinder the generalization capability of grey-box models, which also advocates in favor of low-complexity frameworks [56]. As a partial solution, a feature assessment framework to test the flexibility, scalability, and interoperability of grey-box models and select the right model characteristics was proposed by Shamsi et al. [12]. In essence, grey-box approaches hence allow for a trade-off between the accuracy and the complexity of building models [56].
Due to the effectiveness of low-order RC models, we hence rely on linear first-order RC modeling techniques inspired by Bünning et al. [57] and simplified versions of Maasoumy et al. [58–60] to construct the PCNNs proposed in this work.

2.3.2. Physics-informed neural networks
While early DL applications used classical feedforward NNs, researchers soon realized how transferring prior knowledge to NNs could be beneficial. Among the success stories, one can find the CNN and RNN families, specially designed to capture spatial invariance [61] and temporal dependencies [62] in the data, respectively.
In recent years, a new field emerged in the Machine Learning community to tackle the generalization issue of neural networks and create new NN architectures bound to follow given physical laws, such as Hamiltonian NNs [28] or Lagrangian NNs [27], later generalized by Djeumou et al. [18]. In parallel, PiNN architectures flourished, pioneered by the physics-guided NNs of Karpatne et al. [26, 63] and the more general physics-informed Deep Learning (DL) framework originally proposed by Raissi et al. [64–66]. Since then, various methods to include prior knowledge in NNs have been proposed, several of which can be found in [29], where the authors tried to classify them.
Methodologically, the PCNNs proposed in this work are close to the physics-interpretable shallow NNs, where the inputs are also processed by two modules in parallel, one to retain physical exactness when possible and one to capture nonlinearities through a shallow NN [67]. Also related in spirit to the PCNN architecture, Hu et al. introduced a specific learning pipeline, where the output of the forward NN is fed back through a physics-inspired NN structure to reconstruct the input and hence ensure the forward process retains physical consistency [68].
Finally, two recent works applied PiNNs to create control-oriented building models [30, 31]. Drgoňa et al. replaced the state, input, disturbance, and output matrices of classical linear models with four NNs and leveraged known physical rules to enforce constraints on them [30]. They additionally used the Perron-Frobenius theorem to enforce the stability and dissipativity of the system by bounding the eigenvalues of all the NNs. On the other hand, Gokhale et al. relied on a more classical PiNN approach with the introduction of a new physics-inspired loss term to guide the learning towards physically meaningful solutions without modifying the NN architecture [31]. However, neither of these works provides physical consistency guarantees, unlike the PCNN architecture presented in this work.

3. Methods

This section first defines a notion of physical consistency and then details the novel PCNN structure proposed in this work, where the effect of the control inputs and the heat losses to the environment and neighboring zones are separated from the unforced temperature dynamics. Finally, we formally prove the physical consistency of PCNNs with respect to control inputs and exogenous temperatures.


3.1. Respecting the underlying physical laws
Throughout this work, we define a model as being physically consistent with respect to a given input when any change in this input leads to a change of the output that follows the underlying physical laws. In our case, for example, we need models that are physically consistent with respect to control inputs to ensure that turning the heating on leads to higher zone temperatures than when heating is off, and vice versa for cooling. Mathematically, we can express this requirement as follows for a zone with power input $P \in \mathbb{R}$ at time step $j$ and temperature prediction $T \in \mathbb{R}$ at time step $k$:

$$\frac{\partial T_k}{\partial P_j} > 0 \quad \forall\, 0 \le j < k, \tag{1}$$

where we consider $P < 0$ in the cooling case by convention throughout this paper. We can similarly define physical consistency with respect to the outside temperature $T^{out} \in \mathbb{R}$, the temperature in a neighboring zone $T^{neigh} \in \mathbb{R}$, and the solar gains $Q^{sun} \in \mathbb{R}$ as follows:

$$\frac{\partial T_k}{\partial T^{out}_j} > 0 \quad \forall\, 0 \le j < k, \tag{2}$$

$$\frac{\partial T_k}{\partial T^{neigh}_j} > 0 \quad \forall\, 0 \le j < k, \tag{3}$$

$$\frac{\partial T_k}{\partial Q^{sun}_j} > 0 \quad \forall\, 0 \le j < k, \tag{4}$$

since higher exogenous temperatures or solar gains all lead to increased zone temperatures.

3.2. Physically consistent neural networks
The proposed PCNN architecture is sketched in Figure 3 for one time step $k$, and we apply it recursively over the prediction horizon. The temperature of the zone $T$ is computed as the sum of two latent variables evolving through time: the unforced dynamics $D \in \mathbb{R}$, and the energy accumulator $E \in \mathbb{R}$, which includes prior knowledge about thermal dynamics. Mathematically, we thus have:

$$T_{k+1} = D_{k+1} + E_{k+1}. \tag{5}$$

Remark 1 (Extension to several neighboring zones). While we describe and analyze the case with a single neighboring zone throughout this paper, it is straightforward to extend PCNNs to model a zone connected to several other zones by adding further energy loss terms with their corresponding scaling constants.

3.2.1. Linear physics-inspired module
Firstly, the energy accumulator $E$ is positively influenced by the power input to the zone $g(u) \in \mathbb{R}$, which depends on the control input $u \in \mathbb{R}^m$, e.g. the opening pattern of radiator valves. The power input is scaled by a constant $a$ in the heating case and $d$ in the cooling case to represent its effect on the air mass in the room. Note that the power input is negative in the cooling case by convention, so that cooling lowers the energy accumulated in $E$, as expected.
Secondly, from the laws of thermodynamics, we know that the modeled zone loses energy through heat transfers to the environment and the neighboring zone. We hence subtract these effects, which are proportional to the corresponding temperature gradients, i.e. the differences between the zone temperature and the outside temperature $T^{out}$ and between the zone temperature and the temperature in the neighboring zone $T^{neigh}$, scaled by the parameters $b$ and $c$, respectively, which are learned from data. Mathematically, in the heating case, we can hence write the evolution of the physics-inspired module as follows:

$$E_{k+1} = E_k + a\,g(u_k) - b\,(T_k - T^{out}_k) - c\,(T_k - T^{neigh}_k), \tag{6}$$

with $E_0 = 0$. In the cooling case, one simply needs to exchange the parameter $a$ with $d$. As can readily be seen, Equation (6) is heavily inspired by classical first-order RC building models, found e.g. in Bünning et al. [57] and detailed in Appendix A.1. The main difference with the generic RC model in Equation (22) is that the proposed PCNN architecture allows us to treat nonlinear solar and additional unknown heat gains using neural networks or other nonlinear functions in $D$ instead of relying on engineered linear solutions. Furthermore, the physics-inspired parameters $a$, $b$, $c$, and $d$ are learned from data simultaneously with the parameters of the black-box module described below (see Section 3.2.3).

Remark 2 (Design of $g$). In some cases, we can directly control the heating or cooling power input to the zone, i.e. $g(u) = u$. When this is not possible, e.g. when $u$ controls the opening of the valves in radiators, we need to process the controllable inputs into power inputs through some function $g$. This function might be engineered, for example as $g(u) = u\,\dot{m}\,(T^{w} - T)$ in the case of a radiator, with $\dot{m}$ the mass flow and $T^{w}$ the temperature of the water in the pipes, or it could be learned from data, e.g. using NNs. This learned function should be strictly monotonically increasing with $g(0) = 0$, i.e. no energy is consumed when there is no control input, $g(u) < 0$ when cooling is on, and $g(u) > 0$ when heating is applied. Importantly, since everything is trained together in an end-to-end fashion (see Section 3.2.3), $g$ can seamlessly be learned in parallel to the other parameters.

Remark 3 (Coupling between $D$ and $E$). Note that since $T_k = D_k + E_k$, the nonlinear black-box module $D$ influences the evolution of the energy accumulator $E$ in Equation (6), which is one of the main differences with classical grey-box techniques, where the physics-based and black-box modules are usually completely separated. This furthermore requires learning the parameters $a$, $b$, $c$, and $d$ simultaneously with the NNs in $D$, as presented in Section 3.2.3. Further details on the differences between classical grey-box approaches and PCNNs are discussed in Section 6.1.

3.2.2. Black-box module
Running in parallel to the linear module, the nonlinear black-box module processes all inputs not treated in $E$, such as solar gains and time information, gathered in $x \in \mathbb{R}^n$, to capture the unforced temperature dynamics, i.e. when no heating or cooling is applied and heat losses are neglected.


[Figure 3 (diagram): the inputs from time step $k$ ($x_k$, $D_k$, $E_k$, $T_k$, $T^{out}_k$, $T^{neigh}_k$, $u_k$) are mapped to $D_{k+1}$, $E_{k+1}$, and $T_{k+1}$ through the functions $f$ and $g$ and the scaling constants $a$, $b$, $c$, and $d$.]

Figure 3: The proposed PCNN architecture used recursively at each time step. The control inputs $u$, transformed into power inputs by the function $g$, and the losses to the environment $b(T - T^{out})$ and neighboring zone $c(T - T^{neigh})$ all influence an energy accumulator $E$, which accumulates or dissipates energy at each time step. Here, the separation between red and blue lines signals a different treatment of the power inputs in the heating and cooling case, respectively, since they are scaled by different constants $a$ and $d$. The accumulated energy is then added to the unforced dynamics $D$, modeled by a residual NN that takes all the features apart from $u$, $T^{out}$, and $T^{neigh}$ (gathered in $x$) as input, to get the final zone temperature prediction $T$.
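The data flow described in the caption of Figure 3 can be sketched as a single recursive step. This is only a minimal illustration, not the authors' implementation: the function name `pcnn_step`, its argument order, and the numerical values in the usage lines are assumptions, and `f` and `g` are passed as plain callables instead of neural networks.

```python
from typing import Callable, Sequence, Tuple


def pcnn_step(
    D: float, E: float, x: Sequence[float], u: float,
    T_out: float, T_neigh: float,
    f: Callable[[float, Sequence[float]], float],
    g: Callable[[float], float],
    a: float, b: float, c: float, d: float,
) -> Tuple[float, float, float]:
    """Advance both latent states by one time step and return (D, E, T)."""
    T = D + E                  # current zone temperature: sum of both latent states
    D_next = D + f(D, x)       # residual update of the unforced dynamics
    power = g(u)               # power input, negative when cooling by convention
    # Heating is scaled by a, cooling by d (red and blue paths in Figure 3).
    gain = a * power if power >= 0 else d * power
    # Losses to the environment and to the neighboring zone drain the accumulator.
    E_next = E + gain - b * (T - T_out) - c * (T - T_neigh)
    return D_next, E_next, D_next + E_next


# Hypothetical usage: no black-box contribution (f = 0) and direct power control.
D1, E1, T1 = pcnn_step(
    20.0, 0.0, [0.0], 1.0, 0.0, 20.0,
    f=lambda D, x: 0.0, g=lambda u: u,
    a=0.125, b=0.0025, c=0.001, d=0.1,
)
```

Keeping `f` and `g` as arguments mirrors the modular structure of the figure: either can be swapped without touching the linear accumulator.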

This can typically be modeled using residual NNs, which leads to the following expression:

$$D_{k+1} = D_k + f(D_k, x_k), \qquad D_0 = T(t_0), \tag{7}$$

where $T(t_0)$ is the measured temperature at the beginning of the prediction horizon and $f$ is a potentially highly nonlinear function, typically based on recurrent neural networks. Remarkably, the unforced dynamics $D$ are independent of power inputs and heat losses by design, which will allow us to prove the physical consistency of PCNNs with respect to these inputs in Section 3.3.

Remark 4 (Design of $f$). While $f$ is composed of an encoder-LSTM-decoder structure in our case (see Section 4.2), any NN architecture, and even functions that do not contain NNs, can be used without affecting the physical consistency of the predictions. Nonetheless, due to the sequential nature of temperature dynamics and the expressiveness of NNs, we suspect RNNs to be a good choice in general.

3.2.3. Training procedure
Importantly, PCNNs do not require any engineering or knowledge about the building structure or parameters beyond connectivity information, i.e. which zones are adjacent to the modeled one. Given a data set of measurements $\mathcal{D} = \{\tilde{x}(t), T(t)\}_{t=1}^{N}$, where $\tilde{x} = \{x, u, T^{out}, T^{neigh}\}$ and $N$ is the number of data points, we can directly optimize all the parameters of both the physics-inspired and black-box modules together.
To that end, we first construct a set of $S$ time series $\mathcal{D}_S = \{\tilde{x}^{(s)}(t), T^{(s)}(t)\}_{s=1}^{S}$, each consisting of consecutive data points from $\mathcal{D}$. Note that these sequences might overlap in practice to increase the data efficiency of the proposed method. We then minimize the Mean Squared Error (MSE) between the PCNN predictions $T_{k+1}(\tilde{x}^{(s)}_{0:k})$, recursively computed from past inputs $\tilde{x}^{(s)}_{0:k}$ up to time $k$, and the true measurements $T^{(s)}(k+1)$ for each time series $s$ over the predefined prediction horizon $H$:

$$\frac{1}{S} \sum_{s=1}^{S} \left[ \frac{1}{H} \sum_{k=0}^{H-1} \left( T_{k+1}(\tilde{x}^{(s)}_{0:k}) - T^{(s)}(k+1) \right)^2 \right]. \tag{8}$$

In this work, we parametrize $f$ using RNNs and rely on the standard automatic Backpropagation Through Time (BPTT) algorithm [69] and the PyTorch library [70] to solve this optimization problem.
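The construction of overlapping sequences and the multi-step loss of Equation (8) can be sketched in plain Python. This is an assumption-laden illustration: `make_windows`, `multi_step_mse`, and the `rollout` callable (which stands in for the recursive PCNN prediction) are hypothetical names, and the sketch only evaluates the loss, whereas the paper minimizes it with BPTT in PyTorch.

```python
from typing import Callable, List, Sequence


def make_windows(series: Sequence, horizon: int, stride: int = 1) -> List[Sequence]:
    """Cut one long record into (possibly overlapping) sequences of horizon + 1 points."""
    return [series[i:i + horizon + 1]
            for i in range(0, len(series) - horizon, stride)]


def multi_step_mse(
    rollout: Callable[[Sequence, float], List[float]],
    inputs: Sequence[Sequence],
    temps: Sequence[Sequence[float]],
) -> float:
    """Equation (8): mean over sequences of the mean squared error over the horizon."""
    total = 0.0
    for x_seq, T_seq in zip(inputs, temps):
        H = len(T_seq) - 1
        # H recursive predictions, started from the first measured temperature.
        preds = rollout(x_seq[:H], T_seq[0])
        total += sum((p - t) ** 2 for p, t in zip(preds, T_seq[1:])) / H
    return total / len(inputs)


# Hypothetical usage with a persistence "model" that always predicts T(t_0).
temperatures = [20.0, 21.0, 22.0, 23.0]
T_windows = make_windows(temperatures, horizon=2)
x_windows = make_windows([[0.0]] * len(temperatures), horizon=2)
loss = multi_step_mse(lambda x_seq, T0: [T0] * len(x_seq), x_windows, T_windows)
```

A stride smaller than the horizon yields overlapping sequences, which is the data-efficiency device mentioned in the text.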


3.3. PCNNs follow physical laws by design
Plugging Equations (6)-(7) into Equation (5), we get:

$$T_{k+1} = T_k + f(D_k, x_k) + a g(u_k) - b(T_k - T^{out}_k) - c(T_k - T^{neigh}_k) \tag{9}$$

Applying Equation (9) recursively, as detailed in Appendix B.2, one can express the temperature prediction of the PCNNs at any future time step $i$ as follows:

$$T_{k+i} = (1-b-c)^{i} T_k + \sum_{j=1}^{i} (1-b-c)^{j-1} \left[ f(D_{k+i-j}, x_{k+i-j}) + a g(u_{k+i-j}) + b T^{out}_{k+i-j} + c T^{neigh}_{k+i-j} \right] \tag{10}$$

Here, it is important to note that $D_{m+1} = D_m + f(D_m, x_m)$ is independent of the variables $u$, $T^{out}$, and $T^{neigh}$ at any step $m$: it solely depends on the other inputs in $x$, so we do not need to explicitly write the recursion out.
In the case when we can directly control the power input to the zone, i.e. $g(u) = u$, we can now formally assess the physical consistency of PCNNs with respect to control inputs and exogenous temperatures since we get the following partial derivatives:

$$\frac{\partial T_{k+i}}{\partial u_{k+i-j}} = (1-b-c)^{j-1} a \quad \forall j = 1, \dots, i. \tag{11}$$

$$\frac{\partial T_{k+i}}{\partial T^{out}_{k+i-j}} = (1-b-c)^{j-1} b \quad \forall j = 1, \dots, i. \tag{12}$$

$$\frac{\partial T_{k+i}}{\partial T^{neigh}_{k+i-j}} = (1-b-c)^{j-1} c \quad \forall j = 1, \dots, i. \tag{13}$$

Remarkably, these derivatives take the same form in a classical RC model, as shown in Equation (29), Appendix B.1. PCNNs hence satisfy the physical consistency criteria of Equations (1)-(3) as long as the conditions below hold:

$$a, b, c > 0, \qquad 1 - b - c > 0 \tag{14}$$

This is the case for real systems since $a$, $b$, and $c$ are small positive physical constants, i.e. inverses of resistances and capacitances. Moreover, it gives us simple verification criteria to ensure that the learned PCNN stays physically consistent, as these conditions could easily be enforced during the training of the models, even though it was not needed in our experiments³.
Note that even when we do not have access to the power input directly and have to process the control inputs through an engineered or learned function $g$, we still get:

$$\frac{\partial T_{k+i}}{\partial g(u_{k+i-j})} = (1-b-c)^{j-1} a \quad \forall j = 1, \dots, i, \tag{15}$$

which remains positive under the same conditions. This ensures that any change in the power input, as computed by a function $g$, still yields the expected physically consistent outcome on the zone temperature. As long as $g$ satisfies the conditions in Remark 2, we furthermore observe that:

$$\frac{\partial T_{k+i}}{\partial u_{k+i-j}} = \frac{\partial T_{k+i}}{\partial g(u_{k+i-j})} \frac{\partial g(u_{k+i-j})}{\partial u_{k+i-j}} = (1-b-c)^{j-1} a \frac{\partial g(u_{k+i-j})}{\partial u_{k+i-j}}, \tag{16}$$

which remains positive, hence satisfying Equation (1), as long as the conditions in Equation (14) hold since $g$ is defined as a monotonically increasing function.

Remark 5 (Condition on $d$). Replacing $a$ by $d$ throughout Equations (9)-(16) yields similar conditions for the cooling case. In particular, we require $d > 0$ to retain physical consistency, in addition to the conditions in Equation (14).

³ The values learned by PCNNs in practice have orders of magnitude $10^{-1}$-$10^{-2}$ for $a$ and $d$ and $10^{-3}$-$10^{-4}$ for $b$ and $c$.

4. Case study
In this work, we take advantage of NEST, a vertically integrated district located in Duebendorf, Switzerland, and pictured in Figure 4 [71]. NEST is composed of several residential and office units, and we focus our attention on the "Urban Mining and Recycling" (UMAR) unit, where more than three years of data is available to assess the quality of our models.

Figure 4: NEST building, Duebendorf, and the UMAR unit circled in white © Zooey Braun, Stuttgart.

4.1. UMAR
UMAR is an apartment composed of two bedrooms, with a living room in between them, and two small bathrooms. We model the temperature of one of the bedrooms throughout this work. All the rooms are equipped with radiant heating/cooling panels in the ceiling and controlled by opening and closing valves to let hot or cold water flow through them depending on the season. Since individual room power consumption measurements are not available, we approximate


them by disaggregating the total consumption of UMAR using the design mass flows and the amount of time the valves in each room are open. Apart from the temperature and power consumption of the rooms, we also use data about the solar irradiation and the ambient temperature on-site. Details on the data preprocessing can be found in Appendix C.
Since both bathrooms are much smaller and have significantly less heating/cooling power than the bedrooms and the living room, we assume that the heat transfers between the former and the latter are negligible compared to the other heat transfers. In other words, we do not consider the bathrooms as distinct zones and only include the living room as neighboring zone of the modeled bedroom.

| Model | Seed | Training loss | Validation loss |
|-------|------|---------------|-----------------|
| LSTMs | 0    | 0.57          | 2.28            |
| LSTMs | 1    | 0.57          | 1.92            |
| LSTMs | 2    | 1.14          | 2.30            |
| LSTMs | Mean | 0.76          | 2.17            |
| PCNNs | 0    | 1.83          | 1.93            |
| PCNNs | 1    | 1.85          | 1.65            |
| PCNNs | 2    | 2.06          | 1.75            |
| PCNNs | Mean | 1.91          | 1.78            |

Table 1: Comparison of training and validation losses for three classical LSTMs and PCNNs, scaled by $10^3$ (full table in Appendix E).
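The disaggregation of the total UMAR consumption mentioned in Section 4.1 is not spelled out in the text. One plausible reading, sketched below purely as an assumption, is a proportional split in which each room receives a share of the total power weighted by its design mass flow times the fraction of time its valves were open. The function name, room labels, and numbers are all hypothetical.

```python
from typing import Dict


def disaggregate_power(
    total_power: float,
    design_flow: Dict[str, float],
    open_fraction: Dict[str, float],
) -> Dict[str, float]:
    """Split a measured total power across rooms in proportion to flow * valve opening."""
    # Assumed weighting: design mass flow times the share of the interval with open valves.
    weights = {room: design_flow[room] * open_fraction[room] for room in design_flow}
    norm = sum(weights.values())
    if norm == 0.0:
        # All valves closed: no room receives any power.
        return {room: 0.0 for room in design_flow}
    return {room: total_power * w / norm for room, w in weights.items()}


# Hypothetical usage for one 15-minute interval.
shares = disaggregate_power(
    total_power=3.0,
    design_flow={"bedroom_1": 1.0, "living_room": 2.0, "bedroom_2": 1.0},
    open_fraction={"bedroom_1": 1.0, "living_room": 0.5, "bedroom_2": 0.0},
)
```

By construction the room shares add up to the measured total, so the approximation redistributes energy without creating or losing any.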

4.2. Implementation details
In our case, the inputs $x$ comprise the raw solar irradiation and time information, i.e. the day of the week, the month of the year, and the current time of the day, and we assume direct control over the power input, i.e. $g(u) = u$. All the models are trained to predict the temperature of the zone over a horizon of three days with time steps of 15 minutes, and the data was split into a training and a validation set. For our experiments, we chose to design $f$ with an encoder-LSTM-decoder structure, where both the encoder and decoder have two layers of 128 units and the LSTM is composed of two layers of size 512. The code of the PCNNs can be found on GitLab⁴ and further implementation details in Appendix D.
One critical implementation point is the initialization of the parameters $a$, $b$, $c$, and $d$ of the PCNN in Figure 3. Indeed, as they are inspired by the known physics of buildings, they must correspond to meaningful values. Furthermore, due to the recurrent use of these parameters to modify the state of the energy accumulator along the prediction horizon, wrong values would have a large impact on the quality of the model and the PCNN might get stuck in a local minimum. In practice, we saw that using rules of thumb to initialize those parameters to plausible values using our prior knowledge and then letting the PCNN modify them during the back-propagation procedure led to good results, as presented in Section 5.2. We thus use our intuition about how UMAR behaves to define initial values such that:

• For $a$ and $d$: The temperature in the zone rises/drops by 1 °C in 2 h when the maximal heating/cooling power is applied.
• For $b$ and $c$: The temperature drops by 1.5 °C in 6 h when the exogenous temperature is 25 °C lower.

These rules of thumb can be derived from historical data, for example by looking at how much time it generally takes for the temperature to rise by 1 °C when the zone is heated at full power for $a$, or to drop by 1.5 °C when it is 25 °C colder outside and heating is off for $b$. Similar investigations will give plausible initial values for $c$ and $d$. Since these parameters turn out to be very small in practice, we learn their inverse for numerical stability (see Appendix D).

⁴ https://gitlab.nccr-automation.ch/loris.dinatale/pcnn.

5. Results
In this section, we analyze the performance of the PCNN that obtained the best error on the validation set of the case study and compare it to a physics-based RC model baseline. Since more complex structures often do not improve the accuracy of RC models, as discussed in Section 2.3.1, we chose a 2R1C model with two heat sources, the heating/cooling power and the solar gains (see Appendix A). This simple architecture furthermore has the advantage of having a very similar form to the physics-based module $E$ in PCNNs (see Appendix B), which allows us to assess the impact of the NNs in $D$ on the model performance. Note that the RC model has a sampling time of 1 min, so we keep the power input fixed over intervals of 15 min when we compare its predictions with those of the PCNN.

5.1. Improving the generalization issue of NNs
While we only discuss one PCNN in depth throughout this section, a broader analysis can be found in Appendix E, where we trained PCNNs with different random seeds and on the other bedroom in UMAR for comparison. We obtained consistent results in both cases, showing the robustness and the flexibility of the approach, respectively.
We additionally performed an ablation study where we removed the physics-inspired prior $E$ and used only the classical encoder-LSTMs-decoder framework in $f$ to predict the temperature evolution, concatenating all the available features as inputs (hence losing any physical consistency guarantee). Interestingly, as presented in Table 1, while PCNNs could not attain the performance of classical LSTMs on the training data because their structure is constrained to follow the underlying physical laws, they obtained lower errors on the validation set. This confirms that PCNNs solve part of the generalization issue of classical NNs, having a smaller tendency to overfit the training data but retaining enough expressiveness to perform well on new data.
In the rest of this section, however, we only investigate in detail the accuracy of the best PCNN compared to the RC model since LSTMs were found to be physically inconsistent in our experiments, as pictured in Figure 1, a critical issue for control-oriented thermal models.
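The overfitting argument of Section 5.1 can be made concrete with the per-seed losses reported in Table 1. The ratio of mean validation loss to mean training loss used below is an illustrative indicator chosen for this sketch, not a metric defined in the paper.

```python
from statistics import mean

# Per-seed losses from Table 1 (scaled by 10^3).
losses = {
    "LSTM": {"train": [0.57, 0.57, 1.14], "val": [2.28, 1.92, 2.30]},
    "PCNN": {"train": [1.83, 1.85, 2.06], "val": [1.93, 1.65, 1.75]},
}


def generalization_ratio(train, val):
    """Mean validation loss divided by mean training loss (1.0 means no gap)."""
    return mean(val) / mean(train)


ratios = {
    name: generalization_ratio(values["train"], values["val"])
    for name, values in losses.items()
}
```

With these numbers the LSTM validation loss is almost three times its training loss, while the PCNN ratio stays slightly below one, which is the pattern the text attributes to the constrained structure.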


[Figure 5 (plot): absolute error (°C, from 0.0 to 3.0) against hours ahead (1 h, 12 h, 24 h, 48 h, 72 h) for the RC model and the PCNN.]

Figure 5: Mean and standard deviation of the error at each time step of the prediction horizon for both the RC model in blue and the PCNN in green, where the statistics were computed from almost 2000 predictions from the validation set.

| Hours ahead | RC model    | PCNN (ours) |
|-------------|-------------|-------------|
| 1 h         | **0.19 °C** | 0.31 °C     |
| 6 h         | 0.58 °C     | **0.55 °C** |
| 12 h        | 0.78 °C     | **0.66 °C** |
| 24 h        | 0.93 °C     | **0.77 °C** |
| 48 h        | 1.30 °C     | **0.88 °C** |
| 72 h        | 1.48 °C     | **0.88 °C** |

Table 2: Comparison of the MAE of the two models over the prediction horizon.

5.2. Performance analysis
Since predicting the evolution of the temperature for several time steps entails a recursive use of the architecture in Figure 3, we leverage the ability of LSTMs to handle long sequences of data to minimize the error over the entire horizon. On the other hand, RC models are usually fitted over a single step, leading to error propagation, as pictured in Figure 5, where we plotted the average Absolute Error (AE) and one standard deviation for both models over almost 2000 possibly overlapping 3-day long sequences of data from the validation set of the PCNN, i.e. unseen data. Note that while the RC model used as baseline in this work is not optimal, it was nonetheless tuned to obtain good accuracy, with an average error below 1 °C after 24 h.
One can observe the PCNN providing better predictions than the RC model in general, which is supported by the average AE reported at key points along the horizon in Table 2. In particular, the PCNN is able to keep a good accuracy even on long horizons, with an error more than 40% lower than the RC model after three days. On the other hand, it presents slightly higher errors at the beginning of the horizon because of the warm start that is implemented (Appendix D): since they first predict past data (the last 3 h), PCNNs might indeed start the actual prediction horizon at a temperature different from the true one. Nonetheless, since we observed that the warm start benefited the overall performance of PCNNs during our experiments, we kept it in the final implementations.
To investigate the Mean Absolute Errors (MAEs) obtained by both models on each sequence of data, we also provide the corresponding histograms and scatter plot in Figure 6. In general, one can see the PCNN dominating the RC model: there are only a few sequences where its error is significantly larger than that of the RC model, represented by points over the black diagonal line in the right plot. On the other hand, towards the lower and right side of this figure, we find data sequences where the PCNN presents a significantly better accuracy than the RC model. This is confirmed by looking at the error distributions of both models in the left plot, with the errors of the PCNN (green) clustered below 1 °C and almost always below 2 °C while the errors of the RC model (blue) are much more spread out. This indicates that the PCNN is robust with respect to different inputs, even on unseen data.
Altogether, we can hence conclude that the PCNN is less prone to extreme errors and keeps the majority of errors lower than the given RC baseline, which demonstrates its robustness and effectiveness. Remarkably, all the results were obtained on over three years of data, hence under various weather conditions and during all the seasons, which also suggests that exogenous variables do not impact the quality of the model much.

5.3. Empirical analysis of the physical consistency
With the physical consistency of the models formally proven in Section 3.3, we can now visualize its impact empirically. Note that the main point of this analysis is to show that the PCNN retains physical consistency even on unseen data, i.e. data from the validation set and with various engineered power inputs that do not exist in the data, avoiding the classical generalization issue of NNs detailed in Section 1. To that end, we take an input sequence from the validation set and compare the temperature predictions of both the RC model and the PCNN when:

• the original and true power inputs are applied (blue),
• only the first half of the power inputs are used (red),
• only the second half of the input is applied (orange),
• no power is used (black), hereafter named uncontrolled,

where we separate the power inputs in half with respect to their magnitudes, i.e. so that both the red and orange control sequences apply roughly the same total power. One such experiment is summarized in Figure 7 for a heating case, where we also added the ground truth in dashed blue as reference for both models.
Firstly, comparing the blue predictions with the dashed ground truth, we see both models performing well, exemplifying the results discussed in Section 5.2. In particular, the proposed PCNN is able to grasp the general trends to match the ground truth despite the large amount of heating power applied and the temperature rising to more than 30 °C, something unusual in a real setting and hence not well covered by the training data.
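The four control scenarios listed above can be reproduced on a toy example. The sketch below is not the authors' experiment: it uses only the linear recursion of Equation (9) with $g(u) = u$ and the black-box term $f$ set to zero, and the helper names, the power sequence, and all parameter values are assumptions. It also compares a finite-difference sensitivity with the closed form of Equation (11).

```python
from typing import List, Sequence


def split_by_energy(u: Sequence[float]) -> int:
    """Index at which the cumulative power first reaches half of the total."""
    half, running = sum(u) / 2.0, 0.0
    for i, value in enumerate(u):
        running += value
        if running >= half:
            return i + 1
    return len(u)


def rollout(T0: float, u: Sequence[float], T_out: float, T_neigh: float,
            a: float, b: float, c: float) -> List[float]:
    """Equation (9) with g(u) = u and the black-box term f set to zero."""
    temps, T = [], T0
    for u_k in u:
        T = T + a * u_k - b * (T - T_out) - c * (T - T_neigh)
        temps.append(T)
    return temps


# Hypothetical heating sequence and parameters satisfying Equation (14).
u_full = [0.0, 1.0, 1.0, 0.0, 0.0, 1.0, 1.0, 0.0]
cut = split_by_energy(u_full)
scenarios = {
    "full": u_full,
    "first_half": u_full[:cut] + [0.0] * (len(u_full) - cut),
    "second_half": [0.0] * cut + u_full[cut:],
    "uncontrolled": [0.0] * len(u_full),
}
params = dict(T_out=5.0, T_neigh=20.0, a=0.125, b=0.0025, c=0.001)
preds = {name: rollout(21.0, u, **params) for name, u in scenarios.items()}

# Finite-difference sensitivity of the last prediction to the input at index 2,
# compared with (1 - b - c)^(j - 1) * a from Equation (11).
eps = 1e-3
bumped = list(u_full)
bumped[2] += eps
fd = (rollout(21.0, bumped, **params)[-1] - preds["full"][-1]) / eps
closed_form = (1 - params["b"] - params["c"]) ** (len(u_full) - 1 - 2) * params["a"]
```

In this linear setting every heated scenario stays at or above the uncontrolled one at each step, and the two half sequences separate from their reference curves exactly when their power inputs start to differ.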


[Figure 6 (plots): (a) histograms of the frequency of the Mean Absolute Error (°C) for the RC model and the PCNN, with the 0.5- and 0.9-quantiles marked; (b) PCNN MAE (°C) against RC model MAE (°C).]

(a) Distribution of the MAE of both models over the test sequences, with the 50% and 90% quantiles marked in red and black, respectively.
(b) Scatter plot of the MAEs of both models on each test sequence, with the black diagonal line representing equal performance.
Figure 6: Comparison of the MAE of both the PCNN and RC model over almost 2000 predictions of three days, taken from the unseen validation data of the PCNN.

Furthermore, looking at the three other predictions, for which we do not have a ground truth anymore, both models again show similar behaviors. This is the visual consequence of the physical consistency proven in Section 3.3, with the red predictions deviating from the blue ones at the same point in time for both models: as soon as we stop heating the room, we get lower temperatures. Similarly, the orange predictions deviate from the uncontrolled dynamics at the same points in time for both models. Finally, looking at the uncontrolled predictions, one can observe smoother patterns for the PCNNs due to the unforced base dynamics being captured by LSTMs instead of the more aggressive linear regression at the core of the RC model.
To get a better visualization of the behavior of both models with respect to the different control inputs, we can subtract the uncontrolled predictions from the other curves. The result is pictured in Figure 8 and allows us to assess the impact of the three different control sequences on the final predictions. As expected, both models still exhibit similar behaviors, with predictions diverging from the baseline as soon as heating is turned on. On the other hand, when heating is off, the gap with the baseline slowly closes because the higher inside temperature leads to higher energy losses to the environment and the neighboring zone. Note that the impact of the neighboring room is hard to distinguish in that plot since it is an order of magnitude smaller than the losses to the outside.

6. Discussion
In this section, we briefly discuss the main differences between PCNNs and classical grey-box models and then mention potential applications of PCNNs, leveraging their physical consistency, and some hurdles that still need clarification.

6.1. Contrasting PCNNs with grey-box models
Since PCNNs are heavily inspired by classical RC models, we can derive them as a specific form of grey-box models, as detailed in Appendix F, written as follows:

$$\begin{aligned} \xi_{k+1} &= \xi_k + m(\xi_k, w_{2,k}) \\ z_{k+1} &= A z_k + B_u g(u_k) + B_{w_1} w_{1,k} + B_d \xi_k + \xi_{k+1} \end{aligned} \tag{17}$$

where $z$ represents the state of the system, $u$ the controllable inputs, $w_1$ uncontrollable ones, and $\xi$ is a disturbance model computed as a function $m$ of the rest of the uncontrollable inputs $w_2$. The main structural difference between Equation (17) and classical grey-box formulations is the impact of the disturbance $\xi$, which appears both with and without a lag of one in the state update function. Traditional approaches generally first ignore the unknown disturbance $\xi$ to identify $A$, $B_u$, and $B_{w_1}$, and then fit a disturbance model to the residuals, e.g. using Gaussian Processes [72].
Despite the similarity with classical grey-box models, PCNNs are fundamentally different both in terms of philosophy and training procedure. Firstly, the linear evolution of the state $z$ captures the main dynamics of grey-box models, including the impact of control inputs, and the nonlinear disturbance $\xi$ corrects them to match the data. On the other hand, in PCNNs, the main (unforced) dynamics $D$ are processed by nonlinear NNs, while the linear energy accumulator $E$ adjusts the predictions according to the controllable inputs and known disturbances, i.e. heat losses.
Secondly, contrary to classical techniques modeling the disturbance $\xi$ as a separate process, all the parameters of PCNNs are trained simultaneously over the entire prediction horizon (PCNNs are multi-step-ahead models) and in an end-to-end fashion to capture dependencies between $D$ and $E$, leveraging automatic BPTT.

6.2. Potential of PCNNs
As discussed, the good accuracy and physical consistency of PCNNs make them natural candidates for control-oriented zone temperature models. They could however also be used or integrated into Digital Twins (DTs), a fast-growing field that suffers from two problems that PCNNs could solve.

Di Natale et al.: Preprint submitted to Elsevier Page 11 of 20


[Figure 7 plot: three stacked panels over time, 11 to 13 March. Panels "RC model" and "PCNN": temperature (°C), axis from 19 to 31. Panel "Power input": power (kW), axis from 0.0 to 2.0. Legend: No power, Full power input, First half, Second half.]
Figure 7: Comparison between the RC model (top) and the PCNN (middle) given the bottom heating control sequence, over three days. In blue, one can assess the precision of both models compared to the ground truth (dashed), where the full control sequence was used. Then, red and orange show the result when only the first half of the control input, respectively the second one, is used. Finally, the black uncontrolled dynamics reflect the case when no power is used, and we shaded the span of the RC model predictions in the middle plot as reference.

[Figure 8 plot: two stacked panels over time, 11 to 13 March. Panels "RC model" and "PCNN": temperature difference (°C), axis from 0 to 6. Legend: Full power input, First half, Second half.]
Figure 8: Difference between each control input and the black baseline (no energy) in Figure 7, for the physics-based model (top) and the proposed black-box structure (middle).

Firstly, if historical data is available, they could reduce the effort and knowledge required to calibrate DTs. Secondly, even when the calibration is successful, DTs often remain slow at run-time due to the level of detail included, something that can be improved by training PCNNs to imitate their behavior and subsequently accelerate the inference procedure.
PCNNs could also be used in retrofitting operations, albeit restricted to the renewal of energy systems. Indeed, since PCNNs are robust to changes in the control inputs, one could assess the possibilities arising from different power systems, e.g. the impact of adding or subtracting some heating capacity. Combined with an intelligent controller, one could for example anticipate the potential energy savings of the new system and balance them with the installation costs to compute the return on investment of the operation.
Overall, we thus see good potential for PCNNs in the field of building modeling and beyond. Indeed, while we present a specific case study on zone temperature models in this work, it is noteworthy that the structure of PCNNs, with a physics-informed module in parallel to a black-box one, is very general and flexible. If more information about
the physics of the system is known or required, the physics-informed module can be expanded to incorporate it. For example, if PCNNs are used to analyze the impact of solar gains, then 𝐸 should include a detailed model of this process on top of the current formulation. Thanks to the modularity of PCNNs, the rest of the pipeline of Figure 3 would not be modified, apart from possible changes in which inputs are contained in 𝑥. Typically, if a detailed model of the solar gains is included in 𝐸, then the solar irradiation does not need to be processed in 𝐷 anymore. In general, any system with similar underlying physical processes, i.e. systems that can accumulate and dissipate energy, might be modeled with PCNNs, adjusting the physics-informed module.

6.3. Limits of our application
Firstly, it is important to note that the proposed structure does not fully solve the generalization issue of NNs: PCNNs are only physically consistent with respect to control inputs and exogenous temperatures, i.e. they satisfy the conditions in Equations (1)-(3). Should other inputs in 𝑥 vary, we cannot guarantee the robustness of the model anymore. In particular, the current version of PCNNs does not meet the condition in Equation (4). This is left for future work for two main reasons. First, one usually does not have direct access to solar gains through the windows of the zone to model, but only to the horizontal irradiation measurement, which has a nonlinear effect on the zone temperature. Second, Equation (4) is not strictly needed in control-oriented models, the main target of this work. Indeed, modifying the heating or cooling power inside a zone only impacts its temperature, and hence heat transfers indirectly, but the solar gains always have the same impact under any control input.
Secondly, it would be of interest to investigate the tuning of the various parameters, both to optimize the black-box 𝑓 and, in particular, to understand how to initialize and learn meaningful values for 𝑎, 𝑏, 𝑐, and 𝑑. In this work, we initialized them using prior knowledge and rules of thumb, but it remains unclear how and why they get to their final values, which can differ depending on the random seed used during training. Furthermore, we observed that the quality of the solution can vary significantly if they are initialized to unrealistic values, i.e. PCNNs do not always recover physically consistent parameters from data. Consequently, it might be useful to add constraints to ensure they retain meaningful values throughout training, i.e. they continuously meet the conditions in Equation (14). In practice, however, we did not have that issue, with the parameters only slowly changing around their physics-informed initial value and thus staying physically consistent throughout our experiments.
Lastly, one should keep in mind that the RC baseline used in this paper, albeit tuned to obtain satisfying performance, still remains a low complexity model. While our PCNN was able to get better accuracy than this RC model, it might be possible to find better physics-based models. Nonetheless, PCNNs seem very competitive and avoid any need for engineering, which makes them attractive in general.

7. Conclusion
In this work, we presented a novel neural network architecture, dubbed PCNN, with an application to building zone temperature modeling. By treating some input variables separately from the main NN in a physics-informed module, PCNNs include prior knowledge in their structure. They are hence able to capture parts of the underlying physics while leveraging the accuracy of NNs to attain significant performance improvement over classical physics-based models without any engineering overhead.
A key advantage of PCNNs over existing NN-based thermal models is that we could formally prove that their temperature predictions remain physically consistent with respect to any control inputs and exogenous temperatures and over the entire prediction horizon. Furthermore, grounding PCNNs in the underlying physics allowed us to mitigate the usual generalization issue of classical NN frameworks.
These results were confirmed by our experiments on a bedroom temperature modeling case study, in which PCNNs generally obtained better results than classical LSTMs over the validation data, even when they were performing worse during the training phase, strongly indicating that they do not suffer from generalization issues as much as classical frameworks. Additionally, PCNNs clearly outperformed the RC model baseline while following the underlying physical laws, reducing the error by more than 40% at the end of the 3-day long prediction horizon.
Since PCNNs are solely based on data and do not require any engineering, they are very flexible and easy to use. This makes them interesting for different applications in the field of building modeling and beyond. To complete the analysis of their potential, it would be of interest to assess the sensitivity of PCNNs with respect to the key parameters of the physics-informed module, which should have links with physical quantities, and the amount of data required to attain satisfactory performance. In future work, we plan to focus our research on extending the current architecture to the multi-zone setting and use PCNNs in various control schemes to learn intelligent temperature controllers.

Acknowledgements
This research was supported by the Swiss National Science Foundation under NCCR Automation, grant agreement 51NF40_180545.

Declaration of competing interests
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

CRediT authorship contribution statement
Di Natale L.: Conceptualization, Methodology, Software, Validation, Formal analysis, Data Curation, Visualization, Writing - Original Draft. Svetozarevic B.: Conceptualization, Methodology, Writing - Review & Editing,
Supervision. Heer P.: Writing - Review & Editing, Resources, Funding acquisition. Jones C.N.: Conceptualization, Methodology, Writing - Review & Editing, Supervision.

Appendices
A. RC building model
A.1. General RC models
In general, we can describe the thermal dynamics of a room with the following ordinary differential equation (ODE):

\[
C \frac{dT}{dt} = \frac{dQ^{heat}}{dt} + \frac{dQ^{irr}}{dt} + \frac{dQ^{occ}}{dt} + \sum \frac{dQ^{cond}}{dt} + \sum \frac{dQ^{conv}}{dt}, \tag{18}
\]

where $T$ is the temperature, $C$ the heat capacitance of the air mass, and the $Q$ terms represent heat flows from the heating/cooling system (negative values represent cooling energy), the solar irradiation, the occupants, heat conduction, and heat convection, respectively. Both sums are taken over the number of surfaces adjacent to the measured volume of air.
In this work, we consider conductive and convective transfer together in two heat transfers: one to represent transfer to the neighboring zone (assuming there is only one) and the other to gather losses to the environment, both being proportional to the corresponding temperature gradient. Additionally, we process the horizontal solar irradiation data to reflect the solar gains through the windows as follows:

\[
Q^{irr}(t) = \frac{\sin(\theta - \theta_0)\cos(\phi)}{\sin(\phi)}\, I(t), \tag{19}
\]

where $I$ is the irradiation measured by the sensor, $\phi$ the altitude and $\theta$ the azimuth of the sun, and $\theta_0$ accounts for the orientation of the window (as counter clock-wise rotation from a north-south-aligned surface facing east). Altogether, we can then rewrite the thermal dynamics as:

\[
\frac{dT}{dt} = \frac{1}{C}\frac{dQ^{heat}}{dt} + \frac{\epsilon}{C}\frac{dQ^{irr}}{dt} + \frac{\eta}{C}\frac{dQ^{rest}}{dt} - \frac{1}{C R_{out}}\frac{d(T - T^{out})}{dt} - \frac{1}{C R_{neigh}}\frac{d(T - T^{neigh})}{dt}, \tag{20}
\]

with $\epsilon$ representing the lumped permissivity of the windows and exterior walls, $R_{out}$ and $R_{neigh}$ the thermal resistance of the walls adjacent to the outside, respectively the neighboring zone, and $T^{out}$ and $T^{neigh}$ the temperature outside, respectively in the neighboring zone. We then group all the other heat gains in $Q^{rest}$, scaled by a parameter $\eta$, and discretize this ODE with the Euler forward method and the time step $\Delta t = 1$ min, yielding:

\[
T_{k+1} = T_k + \Delta t \left[ \frac{1}{C} Q^{heat}_k + \frac{\epsilon}{C} Q^{irr}_k + \frac{\eta}{C} Q^{rest}_k - \frac{1}{C R_{out}} (T_k - T^{out}_k) - \frac{1}{C R_{neigh}} (T_k - T^{neigh}_k) \right] \tag{21}
\]

Grouping the constants together and defining new parameters $a$, $b$, $c$, $e_1$ and $e_2$, we can reformulate it as follows:

\[
T_{k+1} = T_k + a Q^{heat}_k - b (T_k - T^{out}_k) - c (T_k - T^{neigh}_k) + e_1 Q^{irr}_k + e_2 Q^{rest}_k \tag{22}
\]

A.2. Baseline RC model
In this work, to create a simple RC model to use as a comparison baseline, we assume no knowledge of the occupants and other heat gains and discard the corresponding term $e_2 Q^{rest}_k$. Rewriting Equation (22), we get:

\[
\begin{aligned}
T_{k+1} - T_k &= \begin{bmatrix} Q^{heat}_k \\ -(T_k - T^{out}_k) \\ -(T_k - T^{neigh}_k) \\ Q^{irr}_k \end{bmatrix}^T \begin{bmatrix} a \\ b \\ c \\ e_1 \end{bmatrix} \\
\Delta T_{k+1} &= y_k^T p,
\end{aligned}
\tag{23}
\]

where $\Delta T$ represents the temperature difference, $y$ groups the factors influencing it, and $p$ the unknown parameters. Doing this for every time step, we can create matrices of data, grouping all the temperature differences in matrix $X$ and the external factors in $Y$:

\[
\begin{aligned}
\begin{bmatrix} \Delta T_1 \\ \vdots \\ \Delta T_N \end{bmatrix} &= \begin{bmatrix} y_1^T \\ \vdots \\ y_N^T \end{bmatrix} p \\
X &= Y p
\end{aligned}
\tag{24}
\]

Finally, we can use Least Squares to identify the parameters:

\[
\begin{aligned}
Y^T X &= Y^T Y p \\
p &= (Y^T Y)^{-1} Y^T X
\end{aligned}
\tag{25}
\]

B. Mathematical derivations
B.1. Physics-based predictions
We can rewrite the predictions of the RC model from Equation (22), without the discarded term $e_2 Q^{rest}_k$ and writing $e$ for $e_1$, as follows:

\[
T_{k+1} = (1 - b - c) T_k + a Q^{heat}_k + b T^{out}_k + c T^{neigh}_k + e Q^{irr}_k. \tag{26}
\]
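
As a numerical sanity check of Appendix A.2 and Equation (26), the following Python sketch simulates the one-step update of Equation (26) and then recovers the parameters with the least-squares estimate of Equation (25). The parameter values, the synthetic input ranges, and the helper name `rc_step` are assumptions made for illustration only and are not taken from this work. The sketch calls `numpy.linalg.lstsq` instead of forming $(Y^T Y)^{-1}$ explicitly, which yields the same solution when $Y$ has full column rank.

```python
import numpy as np

# Hypothetical parameter values, chosen only for illustration (not from the paper).
a, b, c, e = 0.02, 0.004, 0.001, 0.0005
p_true = np.array([a, b, c, e])

# Synthetic inputs (assumed ranges, not measured data).
rng = np.random.default_rng(0)
N = 500
q_heat = rng.uniform(0.0, 2.0, N)      # heating power
t_out = rng.uniform(-5.0, 10.0, N)     # outside temperature
t_neigh = rng.uniform(19.0, 23.0, N)   # neighboring zone temperature
q_irr = rng.uniform(0.0, 300.0, N)     # processed solar irradiation


def rc_step(t_k, k):
    """One-step temperature prediction of Equation (26)."""
    return ((1 - b - c) * t_k + a * q_heat[k] + b * t_out[k]
            + c * t_neigh[k] + e * q_irr[k])


# Simulate the zone temperature by applying Equation (26) at every step.
T = np.empty(N + 1)
T[0] = 21.0
for k in range(N):
    T[k + 1] = rc_step(T[k], k)

# Build X (temperature differences) and Y (regressors) as in Equations (23)-(24).
X = T[1:] - T[:-1]
Y = np.column_stack([q_heat, -(T[:-1] - t_out), -(T[:-1] - t_neigh), q_irr])

# Least-squares estimate of Equation (25), solved without an explicit inverse.
p_hat, *_ = np.linalg.lstsq(Y, X, rcond=None)
print(p_hat)
```

Because the synthetic data here is noise-free, the estimate should match the chosen parameters up to numerical precision; with measured data, the residual would also absorb the discarded heat gains.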

Applying this transformation recursively yields the following two-steps-ahead temperature predictions:

$$
\begin{aligned}
T_{k+2} &= (1-b-c)\,T_{k+1} + a\,Q^{heat}_{k+1} + b\,T^{out}_{k+1} + c\,T^{neigh}_{k+1} + e\,Q^{irr}_{k+1} \\
&= (1-b-c)\left[(1-b-c)\,T_k + a\,Q^{heat}_k + b\,T^{out}_k + c\,T^{neigh}_k + e\,Q^{irr}_k\right] \\
&\quad + a\,Q^{heat}_{k+1} + b\,T^{out}_{k+1} + c\,T^{neigh}_{k+1} + e\,Q^{irr}_{k+1} \\
&= (1-b-c)^2\,T_k + (1-b-c)\left[a\,Q^{heat}_k + b\,T^{out}_k + c\,T^{neigh}_k + e\,Q^{irr}_k\right] \\
&\quad + a\,Q^{heat}_{k+1} + b\,T^{out}_{k+1} + c\,T^{neigh}_{k+1} + e\,Q^{irr}_{k+1}
\end{aligned}
\tag{27}
$$

This leads to the general formula below for the temperature prediction $i$ time steps ahead:

$$
T_{k+i} = (1-b-c)^i\,T_k + \sum_{j=1}^{i} (1-b-c)^{(j-1)} \left[a\,Q^{heat}_{k+i-j} + b\,T^{out}_{k+i-j} + c\,T^{neigh}_{k+i-j} + e\,Q^{irr}_{k+i-j}\right]
\tag{28}
$$

Remarkably, this model is known to follow the laws of physics by design, i.e. it satisfies Equations (1)-(4) as long as all the parameters $a$, $b$, $c$, and $e$ are small and positive. This is true for real systems since they represent small positive physical constants, i.e. inverses of resistances and capacitances.

B.2. Black-box predictions
PCNN predictions from Equation (9) can be rewritten as follows:

$$
T_{k+1} = (1-b-c)\,T_k + f(D_k, x_k) + a\,g(u_k) + b\,T^{out}_k + c\,T^{neigh}_k
\tag{29}
$$

When this formula is applied recursively, which is what the model does in practice, we get:

$$
\begin{aligned}
T_{k+2} &= (1-b-c)\,T_{k+1} + f(D_{k+1}, x_{k+1}) + a\,g(u_{k+1}) + b\,T^{out}_{k+1} + c\,T^{neigh}_{k+1} \\
&= (1-b-c)\left[(1-b-c)\,T_k + f(D_k, x_k) + a\,g(u_k) + b\,T^{out}_k + c\,T^{neigh}_k\right] \\
&\quad + f(D_{k+1}, x_{k+1}) + a\,g(u_{k+1}) + b\,T^{out}_{k+1} + c\,T^{neigh}_{k+1}
\end{aligned}
\tag{30}
$$

Note that $D_{k+1} = D_k + f(D_k, x_k)$ is independent from all the other variables $u$, $T^{out}$, and $T^{neigh}$. We can thus keep the recursion as is, only noting that at each time step $D$ goes through the neural network function $f$, so that we end up with a nested application of $f$ to the inputs $x$. Crucially, however, these terms are not impacted by changes of the control input, and we can write temperature predictions from the model as follows:

$$
\begin{aligned}
T_{k+2} &= (1-b-c)^2\,T_k + (1-b-c)\left[f(D_k, x_k) + a\,g(u_k) + b\,T^{out}_k + c\,T^{neigh}_k\right] \\
&\quad + f(D_{k+1}, x_{k+1}) + a\,g(u_{k+1}) + b\,T^{out}_{k+1} + c\,T^{neigh}_{k+1}
\end{aligned}
\tag{31}
$$

Similarly to the physics-based case, this leads to the following general formula for any future predictions:

$$
T_{k+i} = (1-b-c)^i\,T_k + \sum_{j=1}^{i} (1-b-c)^{(j-1)} \left[f(D_{k+i-j}, x_{k+i-j}) + a\,g(u_{k+i-j}) + b\,T^{out}_{k+i-j} + c\,T^{neigh}_{k+i-j}\right]
\tag{32}
$$

C. Data preprocessing

C.1. NEST data
Data from all the sensors in NEST is sampled and stored once per minute. Concerning the solar irradiation data, we delete constant streaks of more than 20 h that indicate a fault of the sensor (where deleting refers to setting the values to NaN) and clip the measurement at 0 since it cannot be negative. For the outside temperature, we delete constant streaks of more than 30 min. Both measurements are then smoothed with a Gaussian filter with $\sigma = 2$. For power inputs, we delete constant streaks of more than 1 day and smooth the measurements with a Gaussian filter with $\sigma = 1$. Finally, the temperature measurements in both the room of interest and the neighboring one are smoothed with $\sigma = 5$.
Before using the created data, we linearly interpolate all the missing values when less than 30 min of information is missing. When we use it to train and test PCNNs, the data is subsampled to 15-minute intervals through averaging.

C.2. Individual room energy consumption
As mentioned in Section 4, UMAR has a single power meter and we need to disaggregate this global measurement $P^{tot}$ into individual consumption for each room. To that end, we use the design mass flow $\dot{m}^i$ of room $i$, which is known from technical construction sheets. At each time step $t$, we then approximate the power consumed by each room, $P^i$, as follows:

$$
P^i_t = \frac{u^i_t\,\dot{m}^i_t}{\sum_k u^k_t\,\dot{m}^k_t}\,P^{tot}_t,
\tag{33}
$$

where $u^i$ is the amount of time the valves are opened and the sum runs over all five rooms in UMAR, indexed by $k$. In words, we approximate the individual energy consumption of each room to be proportional to the amount of water flowing through its ceiling panels.
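As a minimal sketch of the proportional split in Equation (33), the total power can be shared between rooms in proportion to valve opening time multiplied by design mass flow. The function and argument names (`disaggregate_power`, `valve_open`, `design_flow`) are hypothetical and do not come from the original implementation, the numbers are illustrative rather than measured, and returning zeros when no valve is open is an assumption made here, since Equation (33) is undefined in that case.

```python
def disaggregate_power(p_total, valve_open, design_flow):
    """Split p_total across rooms proportionally to u_i * m_i (Equation (33))."""
    weights = [u * m for u, m in zip(valve_open, design_flow)]
    total = sum(weights)
    if total == 0:
        # Assumption: with no water flowing anywhere, attribute no power to any room.
        return [0.0] * len(weights)
    return [p_total * w / total for w in weights]


# Illustrative values for five rooms at one time step (not measured data).
valve_open = [1.0, 0.5, 0.0, 1.0, 0.5]    # fraction of the time step each valve is open
design_flow = [2.0, 2.0, 1.0, 1.0, 2.0]   # design mass flows, arbitrary units
shares = disaggregate_power(10.0, valve_open, design_flow)
```

By construction the individual shares add up to the measured total, and a room with a closed valve is attributed no consumption.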

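The cleaning steps of Appendix C.1 can be sketched in plain Python as follows. This is an illustrative sketch rather than the pipeline used for the paper: the function names are hypothetical, the Gaussian smoothing step is omitted, and letting a 15-minute block that still contains NaN average to NaN is an assumption.

```python
import math

NAN = float("nan")


def mask_constant_streaks(values, max_len):
    """Set runs of identical values longer than max_len samples to NaN."""
    out = list(values)
    start, n = 0, len(values)
    while start < n:
        end = start
        while end + 1 < n and values[end + 1] == values[start]:
            end += 1
        if end - start + 1 > max_len:
            for i in range(start, end + 1):
                out[i] = NAN
        start = end + 1
    return out


def clip_at_zero(values):
    """Clip negative readings to 0 while leaving NaN untouched."""
    return [v if math.isnan(v) else max(v, 0.0) for v in values]


def interpolate_short_gaps(values, max_gap):
    """Linearly fill interior NaN gaps shorter than max_gap samples."""
    out = list(values)
    n, i = len(out), 0
    while i < n:
        if not math.isnan(out[i]):
            i += 1
            continue
        j = i
        while j < n and math.isnan(out[j]):
            j += 1
        gap = j - i
        if i > 0 and j < n and gap < max_gap:
            left, right = out[i - 1], out[j]
            for k in range(i, j):
                out[k] = left + (right - left) * (k - i + 1) / (gap + 1)
        i = j
    return out


def block_mean(values, block):
    """Average consecutive blocks of `block` samples (NaN propagates)."""
    return [sum(values[s:s + block]) / block
            for s in range(0, len(values) - block + 1, block)]


def preprocess_irradiance(minute_values):
    """Chain the steps with the thresholds quoted for solar irradiation."""
    cleaned = mask_constant_streaks(minute_values, max_len=20 * 60)  # 20 h streaks
    cleaned = clip_at_zero(cleaned)
    cleaned = interpolate_short_gaps(cleaned, max_gap=30)            # under 30 min
    return block_mean(cleaned, block=15)                             # 15 min averages
```

With one-minute sampling, the thresholds translate directly into sample counts, which is why `preprocess_irradiance` uses 1200, 30, and 15.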

D. Implementation details
The month and time of day variables are represented by sine and cosine functions to introduce periodicity, so that, for example, the last month of the year has a value close to the first. Mathematically, two variables are created:

$$
t^{sin}_m = \sin\left(\frac{m}{12}\,2\pi\right), \qquad t^{cos}_m = \cos\left(\frac{m}{12}\,2\pi\right),
\tag{34}
$$

where the months $m$ are labeled linearly and in order from 1 to 12. The same processing is done for the time step during the day, replacing the factor 12 in Equation (34) by 96, the number of 15 min intervals in one day.
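The effect of this encoding can be checked with a short sketch. The helper names (`cyclic_encode`, `gap`) are hypothetical, and indexing the 15-minute steps of the day from 0 to 95 is an assumption, since only the period of 96 is stated above.

```python
import math


def cyclic_encode(value, period):
    """Return (sin, cos) of value mapped onto a circle with the given period."""
    angle = 2.0 * math.pi * value / period
    return math.sin(angle), math.cos(angle)


def gap(p, q):
    """Euclidean distance between two encoded points."""
    return math.hypot(p[0] - q[0], p[1] - q[1])


# Months labeled 1..12 as in Equation (34).
months = {m: cyclic_encode(m, 12) for m in range(1, 13)}
dec_to_jan = gap(months[12], months[1])
jan_to_feb = gap(months[1], months[2])
dec_to_jun = gap(months[12], months[6])

# Assumption: time steps within a day are indexed 0..95.
last_step = cyclic_encode(95, 96)
first_step = cyclic_encode(0, 96)
```

December ends up exactly as close to January as January is to February, whereas a linear label would place them eleven units apart.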
We let the initial hidden and cell state of the LSTM be learned during training and additionally give the model a warm start of 3 h, i.e. the PCNN first predicts the last 12 time steps in the past, where we feed the true temperatures back to the network to initialize all the internal states, before predicting the temperature over the given horizon. We train the PCNN to minimize the Mean Square Error (MSE) of the predictions over a horizon of 3 days with 15-minute time steps, and use the Adam optimizer with a decreasing learning rate of $0.001/\sqrt{h}$ at epoch $h$. We create sequences of data using sliding windows of minimum 12 h and maximum 3 days, with a stride of 1 h. We then separate both the heating and cooling season in training and validation data with an 80-20% split to ensure a fair partition of heating and cooling cases in the training and validation sets. Finally, the data is normalized between 0.1 and 0.9.

[Figure 9 plot: absolute error (°C) against hours ahead (1 h, 6 h, 12 h, 24 h, 48 h, 72 h)]
Figure 9: MAE of six PCNNs with different random seeds at six chosen prediction steps in grey and the average in green, where the statistics were computed from almost 2000 predictions from the validation set.

Parameter   Starting value   Learned value
a           2                2.01
b           1.5              1.50
c           1.5              1.51
d           2                1.97

Table 3
Comparison between the initial and learned values of the PCNN parameters, in degrees Celsius. For $a$ and $d$, it represents how many degrees are gained in 4 h when heating/cooling at full power, while for $b$ and $c$ it represents how many degrees are lost through heat transfer in 6 h when the exogenous temperature is 25 °C lower.

Since $a$, $b$, $c$, and $d$ are very small values that could be unstable during training, hence leading to physically inconsistent parameters, we rewrite:
$$
s = s_0\,\tilde{s}, \qquad \forall s \in \{a, b, c, d\}
$$

where $s_0$ is the initial value of the parameter. We initialize $\tilde{s} = 1$ and let the backpropagation algorithm modify this much more stable value instead.

E. Additional results

E.1. Robustness of PCNNs
To further analyze the robustness of the PCNN discussed in Section 5, we trained five other networks with the same structure but different random seeds. As pictured in Figure 9, all six models provide similar accuracy over the validation set, except at the beginning of the horizon. Two out of the six PCNNs trained indeed showed oscillatory behavior on the first prediction steps, leading to higher errors.

E.2. Learned parameters
To complete the analysis of the PCNN presented in Section 5, we also display the final values of the parameters $a$, $b$, $c$, and $d$ in Table 3. Overall, we see that the parameters do not change much, and the same conclusion was drawn for the other PCNNs trained during our experiments. Out of the six PCNNs plotted in Figure 9, only two modified the values substantially, and even then by at most 10% to 15%; they correspond to the two models showing the worst performance overall.

Seed   Training loss   Validation loss
0      1.82            2.42
1      1.66            2.44
2      1.58            2.52
3      1.66            2.54
4      1.66            2.39
Mean   1.68            2.46

Table 4
Training and validation losses of five PCNNs on the other bedroom in UMAR, scaled by $10^3$.

E.3. Flexibility of PCNNs
Additionally, to test the flexibility of our approach, we trained five PCNNs on the other bedroom in UMAR, again with five different random seeds. As can be observed in Figure 10, the models again arrive at a similar accuracy to what was obtained for the first bedroom (see Figure 9), except towards the end of the prediction horizon, where the error is 20% to 40% higher. Nonetheless, the performance of PCNNs is comparable for both rooms, which is particularly interesting since no engineering was required to transfer the model between them: we used the same architecture for both bedrooms, simply changing the training and validation data sets. The training and validation errors displayed in Table 4 confirm these conclusions.


[Figure 10 plot: absolute error (°C) against hours ahead (1 h, 6 h, 12 h, 24 h, 48 h, 72 h)]
Figure 10: MAE on the other bedroom in UMAR at key time steps of the prediction horizon for the PCNN with five different random seeds, where the statistics were computed from almost 2000 predictions from the validation set.

[Figure 11 plot: absolute error (°C) against hours ahead (1 h, 6 h, 12 h, 24 h, 48 h, 72 h)]
Figure 11: MAE at key time steps of the prediction horizon for the classical encoder-LSTMs-decoder network with five different random seeds in grey and the average in green, where the statistics were computed from almost 2000 predictions from the validation set.

E.4. Comparison to classical LSTMs
Finally, to analyze the impact of the prior knowledge inclusion in $E$, we performed a small ablation experiment by training a classical black-box LSTM network, i.e. only training $D$ with all the inputs in $x$, including the power and exogenous temperatures, again with five different random seeds. For comparison purposes, the training and validation losses can be found in Table 5 and the error propagation over the validation set in Figure 11, where one can observe similar or worse performance compared to the proposed PCNNs in Figure 9. This is a strong indication that PCNNs do not lose much expressiveness, even if we constrain the structure to follow some given laws. On the contrary, the linear physics-informed module inside PCNNs seems to give them useful information, as they are able to beat the performance of classical unconstrained LSTMs.

         Seed   Training loss   Validation loss
LSTMs    0      0.57            2.28
         1      0.57            1.92
         2      1.14            2.30
         3      0.97            2.22
         4      1.00            1.77
         Mean   0.85            2.10
PCNNs    0      1.83            1.93
         1      1.85            1.65
         2      2.06            1.75
         3      2.28            1.73
         4      1.90            1.97
         5      2.38            1.66
         Mean   2.05            1.78

Table 5
Comparison of the training and validation losses for five classical LSTMs and six PCNNs, scaled by $10^3$.

F. Deriving PCNNs from classical grey-box models
Classical grey-box models generally start from a linear RC model of the following form [73]:

$$
\dot{z}(t) = A z(t) + B q(t)
\tag{35}
$$

where $z$ is the state of the system⁵ and $q$ captures various heat fluxes like heating/cooling inputs, heat losses to the environment and neighboring zones, or heat gains from solar irradiation. After using Euler's discretization, we obtain the following discrete-time linear model:

$$
z_{k+1} = A z_k + B q_k
\tag{36}
$$

Using system identification, one can then identify the parameters $A$ and $B$ of this model, yielding a grey-box model.
Traditionally, researchers separate the various heat fluxes in $q$ between controllable and uncontrollable variables $v$ and $w$, respectively. The former generally captures power inputs in building models, but it could be extended to include blind controls for example. On the other hand, $w$ captures the effect of the sun, the occupants, and other disturbances that cannot be controlled. In the past, both types of heat fluxes have usually been separated linearly, leading to models of the form [73]:

$$
z_{k+1} = A z_k + B_v v_k + B_w w_k
\tag{37}
$$

However, while some exogenous factors in $w$ do indeed have a linear impact on the zone temperatures, others are much harder to capture, such as the solar gains or the effect of the occupants. One way to capture more complex phenomena is to introduce bilinear terms coupling $v$ and $w$ in Equation (37), as in Sturzenegger et al. [73] for example. The natural disadvantage arising from such additional coupling terms in control applications is that the subsequent optimization of $v$ gets more complicated.

⁵ We adopt the unconventional notation $z$ for the state to avoid confusion with the NN inputs $x$ in PCNNs.


On the other hand, PCNNs effectively separate the uncontrollable variables into two sets $w_1$ and $w_2$ using prior knowledge on the laws of physics in buildings. The former groups variables known to have a linear impact on the zone temperature, namely the temperature outside and in the neighboring zones. $w_2$ then gathers the other inputs with nonlinear effects, which are computed in a separate disturbance process $\xi$ based on an unknown residual function $m$, which yields a process similar to:

$$
\begin{aligned}
\xi_{k+1} &= \xi_k + m(\xi_k, w_{2,k}) \\
z_{k+1} &= A z_k + B_u\,g(u_k) + B_{w_1} w_{1,k} + \xi_{k+1}
\end{aligned}
\tag{38}
$$

Note that we introduced a more general form of the controllable inputs $v = g(u)$ since the power inputs $v$ are, for example, not directly controllable in general. We hence represent them by a function $g(u)$ where $u$ is the true control variable, typically the opening of the valves in the case of radiators. Remarkably, the model presented in Equation (38) still retains a linear structure with respect to power inputs $v = g(u)$, which is very well-suited for control applications.
However, PCNNs have a slightly different structure: the disturbance model $\xi$ influences the state of the system both with and without a lag of one time step, as can be observed in Equation (5) since $E_{k+1}$ depends on $D_k$. Altogether, we can thus rewrite the equations of the PCNN as follows:

$$
\begin{aligned}
\xi_{k+1} &= \xi_k + m(\xi_k, w_{2,k}) \\
z_{k+1} &= A z_k + B_u\,g(u_k) + B_{w_1} w_{1,k} + B_d\,\xi_k + \xi_{k+1}
\end{aligned}
\tag{39}
$$

One can verify that Equations (9) and (39) are equivalent, with:

$$
\begin{aligned}
\xi &= D, & z &= T, \\
w_1 &= \begin{bmatrix} T^{out} & T^{neigh} \end{bmatrix}^{\top}, & w_2 &= x, \\
A &= 1 - b - c, & B_u &= a, \\
B_{w_1} &= \begin{bmatrix} -b & -c \end{bmatrix}, & B_d &= -b - c
\end{aligned}
$$
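The practical benefit of the structure in Equation (39) for control can be illustrated numerically: because the disturbance process never sees the power inputs, the predicted state is affine in $v = g(u)$. The sketch below rolls out Equation (39) for a scalar state. Every number in it is a placeholder chosen for illustration, not an identified value, the residual function `m` is a hypothetical stand-in for the neural network, and the power inputs are passed directly as `v_seq` rather than through $g$.

```python
import math


def rollout(z0, xi0, v_seq, w1_seq, w2_seq, A, B_u, B_w1, B_d, m):
    """Roll out Equation (39) for a scalar state z, taking v = g(u) as given."""
    z, xi = z0, xi0
    trajectory = []
    for v, w1, w2 in zip(v_seq, w1_seq, w2_seq):
        xi_next = xi + m(xi, w2)                          # disturbance process
        linear_w1 = sum(b * w for b, w in zip(B_w1, w1))  # linear exogenous terms
        z = A * z + B_u * v + linear_w1 + B_d * xi + xi_next
        xi = xi_next
        trajectory.append(z)
    return trajectory


# Placeholder coefficients and residual function (assumptions for illustration).
params = dict(A=0.9, B_u=0.05, B_w1=(0.06, 0.04), B_d=-0.1,
              m=lambda xi, w2: 0.1 * math.tanh(w2 - xi))

w1_seq = [(5.0, 21.0), (4.0, 21.5), (3.0, 22.0), (3.5, 22.0)]  # (T_out, T_neigh)
w2_seq = [0.2, 0.8, 0.5, 0.1]                                  # nonlinear inputs
v_seq = [1.0, 0.0, 2.0, 1.5]                                   # power inputs

base = rollout(20.0, 0.0, [0.0] * 4, w1_seq, w2_seq, **params)
once = rollout(20.0, 0.0, v_seq, w1_seq, w2_seq, **params)
twice = rollout(20.0, 0.0, [2 * v for v in v_seq], w1_seq, w2_seq, **params)

# Doubling the power inputs doubles their contribution to every prediction.
affine = all(math.isclose(t - b, 2 * (o - b), abs_tol=1e-9)
             for b, o, t in zip(base, once, twice))
```

The check holds for any choice of `m`, which is why the nonlinear disturbance model does not complicate the subsequent optimization of the power inputs.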
